Chapter 1 Introduction
Information in living organisms flows along the Central Dogma at different scales, from the individual and the population to the community and the ecosystem. Metabolomics (i.e., the profiling and quantitation of metabolites) is a relatively new field of “omics” studies. Unlike other omics studies, metabolomics focuses on small molecules (molecular weight below 1500 Da), which have a much lower mass than polypeptides and are usually observed as singly or doubly charged ions. The figure below shows the position of metabolomics among the “omics” studies (B. Dunn et al. 2011).

Figure 1.1: The complex interactions of functional levels in biological systems.
Metabolomics studies typically employ GC-MS (Theodoridis et al. 2012; Beale et al. 2018), GC×GC-MS (Tian et al. 2016a), LC-MS (Gika et al. 2014), LC-MS/MS (Begou et al. 2017a), IM-MS (Levy et al. 2019), infrared ion spectroscopy (Martens et al. 2017) or NMR (B. Dunn et al. 2011) to measure metabolites. For analytical methods, see this review (A. Zhang et al. 2012). The overall technical progress of metabolomics (2012-2018) is summarized here (Miggiels et al. 2019). However, this workflow only covers mass spectrometry based metabolomics, i.e. XC-MS based research.
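Because metabolites are small and usually carry one or two charges, the m/z value measured by the mass spectrometer stays close to the neutral mass (single charge) or about half of it (double charge). The following is a minimal sketch of that arithmetic, assuming protonated ions in positive mode. The function name `mz_protonated` is hypothetical and only for illustration, and glucose (monoisotopic mass 180.06339 Da) is an example compound not taken from the text above.

```r
# Mass of a proton in Da
proton_mass <- 1.007276

# m/z of a protonated ion [M + nH]n+ (hypothetical helper for illustration)
mz_protonated <- function(neutral_mass, charge = 1) {
  (neutral_mass + charge * proton_mass) / charge
}

glucose <- 180.06339          # monoisotopic mass of C6H12O6 in Da
mz_protonated(glucose, 1)     # singly charged ion [M+H]+
mz_protonated(glucose, 2)     # doubly charged ion [M+2H]2+
```

Other adducts (for example sodium adducts or deprotonated ions in negative mode) follow the same pattern with a different mass shift.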
1.1 History
1.1.1 History of Mass Spectrometry
A historical commentary on mass spectrometry is available here (Yates III 2011). In brief:
- 1913, Sir Joseph John Thomson “Rays of Positive Electricity and Their Application to Chemical Analyses.”

Figure 1.2: Sir Joseph John Thomson “Rays of Positive Electricity and Their Application to Chemical Analyses.”
- The petroleum industry brought mass spectrometry from physics to chemistry
- The first commercial mass spectrometer, from Consolidated Engineering Corp, was used to analyze simple gas mixtures from petroleum
- In World War II, the U.S. used mass spectrometers to separate and enrich uranium isotopes in the Manhattan Project
- The U.S. also used mass spectrometers to analyze organic compounds during wartime, which extended the applications of mass spectrometry
- 1946, TOF, William E. Stephens
- 1970s, quadrupole mass analyzer
- 1970s, R. Graham Cooks developed mass-analyzed ion kinetic energy spectrometry (MIKES), enabling MRM analysis in multi-stage mass spectrometry
- 1980s, MALDI revived TOF and mass spectrometry moved into biological applications
- 1990s, Orbitrap mass spectrometry
- 2010s, aperture coding mass spectrometry
1.1.2 History of Metabolomics
For an overview, see this report (Baker 2011). According to this book section (Kusonmano, Vongsangnak, and Chumnanpuen 2016a), the timeline is as follows:

Figure 1.3: Metabolomics timeline during pre- and post-metabolomics era
- 2000-1500 BC, traditional Chinese doctors began to evaluate the glucose level in the urine of diabetic patients using ants
- 300 BC, in ancient Egypt and Greece, the taste of urine was used to diagnose human diseases
- 1913, Joseph John Thomson and Francis William Aston, mass spectrometry
- 1946, Felix Bloch and Edward Purcell, nuclear magnetic resonance
- late 1960s, chromatographic separation techniques
- 1971, Pauling’s research team, “Quantitative Analysis of Urine Vapor and Breath by Gas–Liquid Partition Chromatography”
- Willmitzer and his research team, a pioneering group in metabolomics, promoted the field and its potential applications from agriculture to medicine and other related areas of the biological sciences
- 2007, the Human Metabolome Project, consisting of databases of approximately 2500 metabolites, 1200 drugs, and 3500 food components
- post-metabolomics era, high-throughput analytical techniques
1.1.3 Definition
Metabolomics is a comprehensive analysis that identifies and quantifies both known and unknown compounds in an unbiased way. Metabolic fingerprinting aims at fast classification of samples based on metabolite data, without identifying or quantifying the metabolites. Metabolite profiling requires a pre-defined list of metabolites to be quantified (Madsen, Lundstedt, and Trygg 2010a).
The terms targeted and untargeted metabolomics are also used in publications. In targeted metabolomics, the majority of the molecules within a biosynthetic pathway or a defined group of related metabolites are determined. Sometimes the analysis of a broad collection of known metabolites is also referred to as targeted analysis. Untargeted analysis aims to detect, without bias, all possible metabolites in the samples of interest. A similar term, non-targeted analysis/screening, describes essentially the same kind of study or workflow.
1.2 Reviews and tutorials
Some useful reviews and tutorials related to this workflow can be found in the following papers or directly online:
1.2.1 Workflow
These papers (González-Riano et al. 2020; Pezzatti et al. 2020; X. Liu et al. 2019; Barnes et al. 2016a; Cajka and Fiehn 2016; Gika et al. 2014; Theodoridis et al. 2012; X. Lu and Xu 2008; Fiehn 2002) are recommended for general metabolomics-related topics.
- For targeted metabolomics, see these reviews (Griffiths et al. 2010; W. Lu, Bennett, and Rabinowitz 2008; Weljie et al. 2006; Yuan et al. 2012; Zhou and Yin 2016; Begou et al. 2017b).
1.2.2 Data analysis
Start with these papers (Barnes et al. 2016b; Kusonmano, Vongsangnak, and Chumnanpuen 2016b; Madsen, Lundstedt, and Trygg 2010b; Uppal et al. 2016a; Alonso, Marsal, and Julià 2015) to get the concepts and issues of data analysis in metabolomics. This paper (Gromski et al. 2015) can then be used as a step-by-step tutorial.
For annotation, this paper (Domingo-Almenara, Montenegro-Burke, Benton, et al. 2018a) is a well-organized review.
For databases used in metabolomics, see this review (Vinaixa et al. 2016).
For metabolomics software, see this series of yearly reviews (Misra and van der Hooft 2016; Misra, Fahrmann, and Grapov 2017; Misra 2018).
For open source software, these reviews (Spicer et al. 2017; Dryden et al. 2017) are a good start.
For DIA or DDA metabolomics, see these papers (Fenaille et al. 2017a; Bilbao et al. 2015a).
Here are the slides for a metabolomics data analysis workshop, which I have presented twice, at UWaterloo and UC Irvine.
1.2.3 Application
For metabolomics in environmental research or the exposome, see these papers (Matich et al. 2019; Tang et al. 2020; Warth et al. 2017; Bundy, Davey, and Viant 2009).
For toxicology, see this paper (Mark R. Viant et al. 2019).
For drug discovery and precision medicine, see this piece (Wishart 2016).
For food chemistry, see this paper (Castro-Puyana et al. 2017); for livestock, this paper (Goldansaz et al. 2017); and for nutrition, these papers (Allam-Ndoul et al. 2016; Jones, Park, and Ziegler 2012; Müller and Bosy-Westphal 2020).
For disease-related metabolomics, see these papers on oncology (Spratlin, Serkova, and Eckhardt 2009) and cardiovascular disease (Cheng et al. 2017). This paper (Kennedy et al. 2018) covers metabolomics-related clinical research.
For plant science, see these papers (Lloyd W. Sumner, Mendes, and Dixon 2003; Jorge, Mata, and António 2016a; Hansen and Lee 2018a).
For single-cell metabolomics analysis, see these papers (Fessenden 2016; Zenobi 2013; Ali et al. 2019; Hansen and Lee 2018b).
For gut microbiota, see this paper (Smirnov et al. 2016).
1.2.4 Challenge
General challenges for metabolomics studies are discussed here (Schymanski and Williams 2017; Uppal et al. 2016b; Schrimpe-Rutledge et al. 2016; Wolfender et al. 2015).
For reproducible research, see these papers (Verhoeven, Giera, and Mayboroda 2020; Mangul et al. 2019; Wallach, Boyack, and Ioannidis 2018; Hites and Jobst 2018; Considine et al. 2017; Sarpe and Schriemer 2017).
Issues related to quantitative metabolomics are discussed here (Kapoore and Vaidyanathan 2016; Jorge, Mata, and António 2016b).
For quality control issues, see these papers (Dudzik et al. 2018; Siskos et al. 2017; Lloyd W. Sumner et al. 2007).
1.3 Trends in Metabolomics
library(gtrendsR)
# Google Trends interest in "metabolomics" in Canada and the U.S.
res <- gtrends(c("metabolomics", "metabolomics"), geo = c("CA", "US"))
plot(res)

library(rentrez)
# Count the PubMed records matching a search term for each year
papers_by_year <- function(years, search_term) {
  sapply(years, function(y) {
    entrez_search(db = "pubmed", term = search_term, mindate = y, maxdate = y, retmax = 0)$count
  })
}
years <- 1987:2018
# Empty search term: intended as the total number of records per year
total_papers <- papers_by_year(years, "")
omics <- c("genomic", "epigenomic", "metagenomic", "proteomic", "transcriptomic", "metabolomics", "exposome", "pharmacogenomic", "connectomic")
trend_data <- sapply(omics, function(t) papers_by_year(years, t))
# Share of all records per year (the plot below uses the raw counts)
trend_props <- trend_data / total_papers

library(reshape)
library(ggplot2)
trend_df <- melt(data.frame(years, trend_data), id.vars = "years")
p <- ggplot(trend_df, aes(years, value, colour = variable))
p + geom_line(size = 1) + scale_y_log10("number of papers")
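The code above also divides the per-term counts by the yearly totals. Below is a minimal offline sketch of how that division behaves in R. The counts are hypothetical numbers invented for illustration only (they are not real PubMed results), and the short `years` range is likewise an assumption to keep the example small.

```r
# Hypothetical counts, invented for illustration (not real PubMed numbers)
years <- 2016:2018
total_papers <- c(1000, 1200, 1500)
trend_data <- cbind(genomic = c(100, 120, 150),
                    metabolomics = c(10, 24, 45))

# Dividing a matrix by a vector recycles the vector down each column,
# so every row (year) is divided by the total for that year.
trend_props <- trend_data / total_papers
rownames(trend_props) <- years
print(trend_props)
```

To plot shares instead of raw counts, `trend_props` could be passed to `melt()` in place of `trend_data`, with the axis label adjusted accordingly.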