Chapter 3 Thale Cress Gene Expression

Thale Cress [1]

In Computer Lab 8B we will analyse genomic data from the plant Arabidopsis thaliana (thale cress).

Note that the analysis procedure would generally be the same if we were analysing gene expression in human data, or in data from other plants or animals.

3.1 Thale Cress gene data

In Computer Lab 8B we will assess RNA-Seq gene expression data for different time points over the germination and post-germination period of thale cress seeds. This data, collected by Narsai et al. (2017), is publicly available on the NCBI Gene Expression Omnibus website [2].

As you might expect, the expression of many genes changes as the seeds germinate.

Let’s take a look at the characteristics of this data set.

Note: We have conducted some initial data cleaning and preparation, to make this data more accessible.

##           Chr X24hSL_1 X24hSL_2 X24hSL_3 X48hSL_1 X48hSL_2 X48hSL_3
## AT1G01010   1      282      136      315      646      622      610
## AT1G01020   1     1199      830     1341      768      769      888
## AT1G01030   1      264       79      267      266      218      333
## AT1G01040   1     1594      416      905     1640     1497     1893
## AT1G01050   1     4650     2976     4684     5350     5385     6000
## AT1G01060   1     8464     3007     8813     5066     5098     5923

Here, we note that:

  • Each row refers to a different gene.
  • The Chr column tells us which chromosome the gene is on.
  • The remaining columns contain the gene read counts for the different time points and replicates.
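The structure described above can be explored with a few base R commands. This is only a minimal sketch: the object name `genes` is hypothetical, and the data frame is rebuilt by hand from the six rows printed above rather than loaded from the full data set used in the lab.

```r
# Hypothetical object name; values copied from the six rows printed above.
genes <- data.frame(
  Chr      = c(1, 1, 1, 1, 1, 1),
  X24hSL_1 = c(282, 1199, 264, 1594, 4650, 8464),
  X24hSL_2 = c(136, 830, 79, 416, 2976, 3007),
  X24hSL_3 = c(315, 1341, 267, 905, 4684, 8813),
  X48hSL_1 = c(646, 768, 266, 1640, 5350, 5066),
  X48hSL_2 = c(622, 769, 218, 1497, 5385, 5098),
  X48hSL_3 = c(610, 888, 333, 1893, 6000, 5923),
  row.names = c("AT1G01010", "AT1G01020", "AT1G01030",
                "AT1G01040", "AT1G01050", "AT1G01060")
)

dim(genes)        # number of genes (rows) and number of columns
rownames(genes)   # gene identifiers, one per row
table(genes$Chr)  # how many of these genes lie on each chromosome

# Keep only the read-count columns by dropping Chr
counts <- genes[, colnames(genes) != "Chr"]
sample_totals <- colSums(counts)  # total reads per sample, for these six genes only
sample_totals
```

Separating the read counts from the Chr annotation early on keeps later numerical summaries from accidentally including the chromosome number.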

We have data for two time points, denoted X24hSL and X48hSL. These correspond, respectively, to thale cress seeds 24 hours and 48 hours after exposure to sunlight, following a stratification process (whereby the seeds are encouraged to germinate).

For each time point, we have recordings for three replicates, hence the suffixes _1, _2 and _3 appended to the X24hSL and X48hSL column names.
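The naming scheme just described can be unpacked in R by splitting each column name at the underscore. The sketch below is illustrative only: the names `samples`, `time_point`, `rep_id`, `design` and `gene_means` are hypothetical, and the counts for gene AT1G01010 are copied from the table shown earlier. The raw means are for illustration and are not a normalised measure of expression.

```r
# Sample names as they appear in the table header
samples <- c("X24hSL_1", "X24hSL_2", "X24hSL_3",
             "X48hSL_1", "X48hSL_2", "X48hSL_3")

# Text before the final underscore is the time point;
# the number after it is the replicate
time_point <- sub("_[0-9]+$", "", samples)
rep_id     <- as.integer(sub("^.*_", "", samples))

design <- data.frame(sample = samples,
                     time_point = time_point,
                     replicate = rep_id)
design
table(design$time_point)  # number of replicates at each time point

# Read counts for one gene (AT1G01010), in the same order as `samples`
at1g01010 <- c(282, 136, 315, 646, 622, 610)

# Mean raw read count per time point for this gene
gene_means <- tapply(at1g01010, time_point, mean)
gene_means
```

A small table like `design`, with one row per sample, is a convenient way to keep track of which columns belong to which time point.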


Narsai, R., Q. Gouil, D. Secco, A. Srivastava, Y. V. Karpievitch, L. C. Liew, R. Lister, M. G. Lewsey, and J. Whelan. 2017. “Extensive Transcriptomic and Epigenomic Remodelling Occurs During Arabidopsis Thaliana Germination.” Genome Biology 18 (172): 1–18.

  1. “Thale cress (Arabidopsis thaliana)” by Deanster1983 who’s on and off is licensed under CC BY-ND 2.0

  2. SuperSeries accession number GSE94459