RNA-seq analysis workshop
1
Differential gene expression (DGE) analysis overview
1.1
Review of the dataset
1.2
Setting up
1.2.1
Loading libraries
1.2.2
Loading data
1.2.3
Viewing data
1.3
Differential gene expression analysis overview
1.3.1
RNA-seq count distribution
1.3.2
Modeling count data
1.3.3
Improving mean estimates (i.e. reducing variance) with biological replicates
1.3.4
Differential expression analysis workflow
2
Count normalization
2.1
Normalization
2.1.1
Common normalization methods
2.1.2
RPKM/FPKM (not recommended)
2.1.3
DESeq2-normalized counts: Median of ratios method
2.2
Geometric Mean
2.3
Count normalization of Mov10 dataset
2.3.1
1. Match the metadata and counts data
2.3.2
2. Create DESeq2 object
2.3.3
3. Generate the Mov10 normalized counts
3
Quality Control
3.1
Sample-level QC
3.1.1
Principal Component Analysis (PCA)
3.1.2
Hierarchical Clustering Heatmap
3.2
Gene-level QC
3.3
Mov10 quality assessment and exploratory analysis using DESeq2
3.3.1
Transform normalized counts using the rlog transformation
3.3.2
Principal Component Analysis (PCA)
3.3.3
Hierarchical Clustering
4
DGE analysis workflow
4.1
Running DESeq2
4.1.1
Set the Design Formula
4.1.2
MOV10 Differential Expression Analysis
4.2
DESeq2 differential gene expression analysis workflow
4.2.1
Step 1: Estimate size factors
4.2.2
Step 2: Estimate gene-wise dispersion
4.2.3
Step 3: Fit Curve to Gene-Wise Dispersion Estimates
4.2.4
Step 4: Shrinking Gene-Wise Dispersion Estimates Toward the Global Trend
4.2.5
MOV10 Differential Expression Analysis: Exploring Dispersion Estimates
5
Model and hypothesis testing
5.1
Fitting the Generalized Linear Model for Each Gene
5.2
Shrunken Log2 Fold Changes (LFC)
5.3
Statistical test for LFC estimates: Wald test
5.3.1
MOV10 DE Analysis: Contrasts and Wald Tests
5.3.2
MOV10 DE Analysis: Control vs. Knockdown
5.3.3
MOV10 DE Analysis: Control versus Knockdown
5.4
Summarizing Results
5.4.1
Extracting Significant Differentially Expressed Genes
5.4.2
MOV10 Knockdown Analysis: Control vs. Knockdown
6
Visualizing RNA-seq results
6.0.1
Plotting significant DE genes
6.0.2
Heatmap
6.0.3
Volcano plot
7
Summary of DGE workflow
8
Functional analysis of RNAseq data
8.0.1
clusterProfiler
8.0.2
Gene set enrichment analysis (GSEA)
8.0.3
Other tools and resources
9
Codebook answers
9.1
DGE analysis overview
9.1.1
Setting up
9.1.2
DGE analysis workflow
9.2
Count normalization
9.2.1
Normalization
9.2.2
Count normalization of Mov10 dataset
9.3
DGE QC analysis
9.4
DGE analysis workflow
9.4.1
Running DESeq2
9.4.2
Set the Design Formula
9.4.3
DESeq2 differential gene expression analysis workflow
9.5
Model fitting
9.5.1
Generalized Linear Model fit for each gene
9.5.2
Summarizing results
9.6
Visualizing RNA-seq results
9.6.1
Plotting significant DE genes
9.6.2
Heatmap
9.7
Summary of differential expression analysis workflow
9.7.1
1. Import data into dds object:
9.7.2
2. Exploratory data analysis (PCA & hierarchical clustering) - identifying outliers and sources of variation in the data:
9.7.3
3. Run DESeq2:
9.7.4
4. Check the fit of the dispersion estimates:
9.7.5
5. Create contrasts to perform Wald testing on the shrunken log2 fold changes between specific conditions:
9.7.6
6. Output significant results:
9.7.7
7. Visualize results: volcano plots, heatmaps, normalized counts plots of top genes, etc.
9.7.8
8. Make sure to output the versions of all tools used in the DE analysis:
9.8
Functional analysis
9.8.1
Over-representation analysis
9.8.2
clusterProfiler
9.8.3
Gene set enrichment analysis (GSEA)
9.8.4
Other tools and resources
The published version of this module can be found on my bookdown site, RNA-seq-analysis.