Part 2: Some Questions Answered

A couple of things happened last Saturday and in the days before that. As you are part of this group, I think it’s only fair that you at least understand what has been going on over the past few days.

Hence, I’ll try to fill you in on what’s been going on in the form of questions that you might have.

  1. What is the difference between the 5-15 page report and the 10-slide PowerPoint?

    Sean asked this question some nights ago in our tutorial group chat (group 5). According to the TA (Kai Xiang):

    "simply speaking, the report is the report (some ideas you can use here)

    the presentation is like a presentation to present your results (like an Ureca project or something)"

    – Kai Xiang

    And this is what Professor Wilson mentioned:

    "Think of [the PowerPoint] as a pitch to highlight the key points of your work. In case the report is poorly written, but the ideas are good. That’s a lifeline."

    – Professor Wilson

    So, the presentation is like the “business” end of our work. It highlights the key parts of our introduction, our analyses, our results, and our discussion in a manner similar to how an SBS seminar speaker would present them.

    The report, by contrast, delves into the nitty-gritty of our work. The background of the data, our rationale for doing what we did, the code, the analyses, the discussions, and the graphs generated during exploratory data analysis or differential gene expression analysis should all be included in the report.

  2. What exactly do you mean by "the applications of the data and what [K]evin created in R didn’t really make much sense"?

    In the link Sean provided to the “Mice” dataset, the count matrix from the researchers’ sequencing experiments was present, but their RNA sequencing data were not. This isn’t necessarily a bad thing, as we can still use the count matrix to perform differential gene expression analysis, but it is very restrictive nonetheless (i.e., the less we can do, the less we have to write about).

    I even consulted Kai Xiang in a private conversation on Sunday morning, and he didn’t sound too convinced:

    "you’d want to check the data description of your data

    Using [the count matrix] to classify genes sounds weird to me"

    – Kai Xiang

    Granted, I had already done a fair amount of work on the data (several machine learning models, differential gene expression analysis, and even GO term analysis), but it doesn’t make sense to pigeonhole ideas into random contexts, only to end up shooting ourselves in the foot when we begin writing the report or making the PowerPoint presentation.

    So yes, we chose to focus on the cancer data that Sean found, as it appeared the most workable out of everything we have so far. “Cancer” might sound like a rather generic topic to focus on, but just because something is generic doesn’t mean that it’s bad. And that’s why we are having the meeting on Monday afternoon: to discuss ideas and points to focus on in the report and the presentation.

  3. What if this “cancer” idea goes south as well?

    I have some backup ideas in mind, and I’ve already done the analyses for them. So rest assured: if this idea sinks like the Titanic, I have a lifeboat on deck for all of us.

  4. Actually, why did you choose to abandon the “rice” idea? Sean said that the “data was not really viable,” but what does he mean by that?

    If you open the rice.csv file in RStudio, you will see 12 (I believe) samples. Sean and I initially thought that the data contained 12 biological replicates. As it turns out, there were only four biological replicates; the researchers merely collected RNA sequencing data from those same replicates over the course of a week.

    I initially proposed trying our hand at time-series analysis as the data had a time element to it, but I could not think of a good rationale for going down this route.

    Hence, we decided to abandon the data altogether and look for datasets elsewhere.

  5. You sent out this guide on \(\LaTeX\) you made some time ago - do you expect us to pick up \(\LaTeX\) and use it to write the report?

    No, I don’t. I made the guide under the assumption that somebody might be interested enough in \(\LaTeX\) to want to check it out in their own time.

    That said, if you are happy using Google Docs or Microsoft Word to write and format your report, then so be it. I originally proposed \(\LaTeX\) to give our report a more “polished” feel, but if we can achieve a similar “finished” quality in our report some other way, then I’m happy either way.

  6. Why are there broken links in this document?

    Rest assured, this was not done out of bad intentions. I initially wanted to include other chapters and/or links to other resources in this document to explain certain components of my workflow, but I did not have the time to add them.

    So, after Monday, I will actively work on these links so that you guys will have a better understanding of my workflow and ideas.