1 Starting a meta-analysis

1.2 Article screening

After you identify the set of candidate articles for your meta-analysis, it’s time to sort through all of the articles and determine which will be included. Your first step should be to delineate your inclusion criteria. These are a set of criteria that each article must meet before it is included in your meta-analysis.

To help shape the inclusion criteria, we recommend adhering to the Population, Treatment, Control, Outcome (PTCO, sometimes called PICO) framework (Huang, Lin, and Demner-Fushman 2006). First, make a rule about the population that an article must focus on for you to include it in the meta-analysis. For example, will you include only studies that investigate invasive crayfish? Can these studies come from populations all over the world or only in the US? Next, identify the treatment you will focus on in your meta-analysis. This criterion is called treatment because the framework was developed for the medical sciences, but we can think of the treatment group as the experimental sites. For example, the treatment sites may receive extra nutrient input or an increased amount of deer herbivory. Then you should develop a rule about the design of the control sites. For example, they may lack a certain nutrient input or experience lower levels of deer herbivory. Lastly, a rule about the focal outcome will help define the quantitative metrics you will eventually extract from each article. Following the examples above, we may want to include only articles that measure dissolved soil organic carbon or the abundance of a certain flowering plant species.
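The four PTCO rules can be recorded as simple yes/no decisions per article. The snippet below is a minimal sketch of that idea, not a prescribed workflow: the `Article` class, its field names, and the `screen` function are hypothetical names introduced here for illustration, and the example record is made up.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Article:
    """Hypothetical record of a reviewer's PTCO decisions for one article."""
    title: str
    population_ok: bool  # focuses on the population named in your rule
    treatment_ok: bool   # includes the treatment (experimental) sites you need
    control_ok: bool     # includes suitable control sites
    outcome_ok: bool     # reports the focal outcome


def screen(article: Article) -> Tuple[bool, List[str]]:
    """Return (include?, names of the PTCO criteria the article failed)."""
    checks = {
        "population": article.population_ok,
        "treatment": article.treatment_ok,
        "control": article.control_ok,
        "outcome": article.outcome_ok,
    }
    failed = [name for name, ok in checks.items() if not ok]
    return (len(failed) == 0, failed)


# Made-up example: an article that lacks usable control sites.
example = Article(
    title="Hypothetical crayfish study",
    population_ok=True,
    treatment_ok=True,
    control_ok=False,
    outcome_ok=True,
)
include, reasons = screen(example)
print(include, reasons)
```

Recording which criterion failed, rather than only the include/exclude decision, makes it easier to report the most common reasons for exclusion later on.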

After deciding on your inclusion criteria, it’s time to sort the articles and determine whether you’ll include or exclude each one based on your PTCO criteria (see above). We recommend using a Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart. This flowchart allows for easy visualization of the major steps involved in sorting through articles. We recommend sorting through articles in 4 distinct steps. First, identify and remove any duplicate articles that you picked up during your systematic search. Duplicate articles are common if you searched in multiple databases. If you use a reference manager like Zotero, finding duplicates will be a breeze, and all of the sorting steps will be more organized.
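If you export your search results to a spreadsheet instead of (or in addition to) a reference manager, duplicate removal can be sketched in a few lines. This is an illustration under stated assumptions: each record is a dictionary with a `title` and an optional `doi` key, and the records, DOIs, and database names below are invented. Two records are treated as duplicates when either their DOI or their normalized title matches.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each DOI or normalized title."""
    seen = set()
    unique = []
    for rec in records:
        keys = {"title:" + normalize_title(rec["title"])}
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            keys.add("doi:" + doi)
        # A record is new only if none of its keys has been seen before.
        if seen.isdisjoint(keys):
            unique.append(rec)
        seen.update(keys)
    return unique


# Invented records, as if exported from two different databases.
records = [
    {"title": "Invasive crayfish reduce macroinvertebrate richness",
     "doi": "10.1234/example.1", "source": "Database A"},
    {"title": "Invasive Crayfish Reduce Macroinvertebrate Richness.",
     "doi": None, "source": "Database B"},
    {"title": "Deer herbivory and understory plants",
     "doi": "10.1234/example.2", "source": "Database A"},
]

unique = deduplicate(records)
print(len(records) - len(unique), "duplicate(s) removed")
```

Exact matching on normalized titles will miss duplicates whose titles differ by more than case or punctuation, so a manual check is still worthwhile.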

Then move through the articles, always checking for consistency with your pre-defined inclusion criteria. Screen first by title, then by abstract, and finally read the full text of each article and decide whether to include or exclude it. We recommend keeping track of the articles you exclude at each step using a reference manager like Zotero (mentioned above and free) or EndNote (great interface, but expensive). A PRISMA diagram is also an easy way to convey your screening process to readers (Moher et al. 2009). Below is an example of a PRISMA flowchart with green boxes that show the number of articles retained in each screening step and red boxes that show the number of articles excluded in each step. If you want to make your own PRISMA flowchart, we recommend either PowerPoint or the R package DiagrammeR (Iannone 2018).


Figure 1.3: An example of a PRISMA flowchart showing 3 screening steps that researchers might use when deciding which publications they will use in a meta-analysis. The flowchart shows how you might choose to screen articles by titles, then abstracts, and finally the full text.
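Before drawing the flowchart, it can help to tally the numbers that go into each box. The sketch below computes, for each screening step, how many articles were screened, excluded, and retained. The function name `prisma_counts` and all of the counts are made up for illustration and do not come from the example database in this book.

```python
def prisma_counts(n_identified, n_duplicates, excluded_by_stage):
    """Return one (stage, screened, excluded, retained) row per screening step.

    excluded_by_stage is an ordered list of (stage name, number excluded).
    """
    rows = []
    remaining = n_identified
    for stage, n_excluded in [("duplicates", n_duplicates)] + list(excluded_by_stage):
        if n_excluded > remaining:
            raise ValueError(f"cannot exclude {n_excluded} of {remaining} at '{stage}'")
        retained = remaining - n_excluded
        rows.append((stage, remaining, n_excluded, retained))
        remaining = retained
    return rows


# Made-up counts for a search that returned 500 records.
rows = prisma_counts(
    n_identified=500,
    n_duplicates=40,
    excluded_by_stage=[("title", 310), ("abstract", 90), ("full text", 25)],
)
for stage, screened, excluded, retained in rows:
    print(f"{stage:>10}: screened={screened} excluded={excluded} retained={retained}")
```

The retained count from the final row is the number of articles that move on to data extraction.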

Applying this to our example data

To create the inclusion criteria for our example meta-analysis, we used the PTCO framework described above. In the table below, I provide a shorthand name for each criterion in one column and a more detailed rationale for that criterion in the next column. As in all steps of the meta-analysis, the goal is to provide enough detail that someone could repeat every step of the analysis if they wanted to.

| Criteria | Rationale |
|---|---|
| Original research (Population) | Include only primary published research |
| Invaded sites (Treatment) | Articles must include sites with invasive species. If studies measure impacts of multiple invasive species, we count each species separately in our database |
| Sites without invasive species (Control) | Experiments must have control sites without invasive species or with species at very low densities |
| Measurements of species richness (Outcome) | Articles must provide mean measurement of richness at invaded and control sites (\(\bar{X}\)) as well as sample size (N) and measure of variation (s, \(s^2\), or SE) |
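Because articles may report variation as s, \(s^2\), or SE, it is convenient to convert everything to a standard deviation when you record it. The sketch below assumes that s is the sample standard deviation, \(s^2\) is the sample variance, and SE is the standard error of the mean, so that \(s = \sqrt{s^2}\) and \(s = SE \times \sqrt{N}\). The function name `to_sd` and the labels `"sd"`, `"var"`, and `"se"` are hypothetical.

```python
import math


def to_sd(value, kind, n=None):
    """Convert a reported measure of variation to a standard deviation.

    kind is "sd" (standard deviation), "var" (variance), or
    "se" (standard error of the mean, which also requires the sample size n).
    """
    if value < 0:
        raise ValueError("a measure of variation cannot be negative")
    if kind == "sd":
        return float(value)
    if kind == "var":
        return math.sqrt(value)
    if kind == "se":
        if n is None or n < 1:
            raise ValueError("converting SE to SD requires a sample size n >= 1")
        return value * math.sqrt(n)
    raise ValueError(f"unknown kind: {kind!r}")


# Made-up values: a variance of 4.0, and an SE of 0.5 from 16 samples.
print(to_sd(4.0, "var"))
print(to_sd(0.5, "se", n=16))
```

If an article reports some other measure, such as a confidence interval, this sketch does not apply and the conversion needs to be worked out separately.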

Should you decide to use the PRISMA flowchart to organize your article screening process, here’s a template of what the flowchart can include, using example data from this book. At each screening step (green box), we indicate the number of articles screened. We use red boxes to highlight the number of articles that we exclude at each screening step, along with the most common reasons for exclusion.


Figure 1.4: Example PRISMA flowchart for the invasive tree database we are using in this book.

Note that you can modify any of the boxes to suit the needs of your screening protocol. We excluded 30 articles because they lacked any measurements of species richness. If your meta-analysis doesn’t focus on species richness, then you will likely have a different box to represent the outcome you need to conduct your meta-analysis.

1.3 Data extraction

Extracting data for a meta-analysis is one of the most challenging steps outlined in this book. Planning your data extraction spreadsheet in advance and then keeping it organized will eventually pay dividends (in :) not $) as your work becomes more complex.

By the data extraction phase, you will have already identified the set of candidate articles that match all of the inclusion criteria for your meta-analysis. Now it’s time to sift through all of the papers in your database and pull relevant information from each article into your data extraction spreadsheet. To do this, you of course need a data extraction spreadsheet.

Setting up your data extraction spreadsheet
In my experience, data extraction spreadsheets are typically created in either Microsoft Excel or Google Sheets. The benefit of Google Sheets is that you can share and collaborate on the document more seamlessly than in Excel.

Below, you see a typical example of how I set up my data extraction spreadsheet. Just as you establish search terms ahead of time to avoid injecting your own biases into which articles are included in the review, it’s typical to outline as much of the data extraction spreadsheet as possible before actually extracting the data. In the header row, you see I am collecting some data about each article (e.g., lastname, publicationyear, impactfactor) as well as some data that is specific to the research that is the focus of the article (invasivespecies and latinname). I also assign a unique ID to each article with the format randomnumber-firstauthor-publicationyear, and I rename each PDF with its assigned unique ID. This helps me create a type of relational database where I’m always able to quickly search through my reference manager to find the article in question. You’ll notice that we haven’t even gotten to the data needed for a meta-analysis yet, and there is already a decent amount of information we are pulling from each article.


Figure 1.5: A small selection of a data extraction spreadsheet that includes headings such as ‘author last name’, ‘publication year’ and ‘invasive species common name’.
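The unique ID format described above (randomnumber-firstauthor-publicationyear) is easy to generate consistently. The sketch below is one possible way to do it. The four-digit random number, the lowercasing, and the removal of hyphens and spaces from the author name are assumptions made here for illustration and are not part of the format described in the text.

```python
import random
import re


def make_article_id(first_author_lastname, publication_year, existing_ids=(), rng=random):
    """Build an ID of the form randomnumber-firstauthor-publicationyear.

    Assumptions: the random number has four digits, and the author name is
    lowercased and stripped of anything that is not a letter so that the
    hyphen is only ever used as a separator.
    """
    author = re.sub(r"[\W\d_]+", "", first_author_lastname).lower()
    while True:
        candidate = f"{rng.randint(1000, 9999)}-{author}-{publication_year}"
        if candidate not in existing_ids:  # avoid reusing an ID already assigned
            return candidate


article_id = make_article_id("Crystal-Ornelas", 2020)
pdf_name = article_id + ".pdf"  # rename the PDF to match the spreadsheet row
print(article_id, pdf_name)
```

Because the number is random, the printed ID will differ from run to run. What matters is that the same ID appears in the spreadsheet row and in the PDF file name.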

The data extraction spreadsheet shown above has a total of 43 columns each characterizing a different element or piece of data from the manuscript. Without delving too much into what all of those columns are, you may, especially for ecological meta-analyses, be interested in whether other variables may moderate the main effect you are investigating. For example, we could collect data on variables like:

  • ecological impact of invasive species
  • country where impact occurred
  • trophic position
  • sampling frequency
  • study length (in days)

The list goes on, but these are a few examples of variables that, while not the main focus of our meta-analysis, can be incorporated into models to try and account for some of the variation we see in the data.

Now it’s time to extract the data that you need for a meta-analysis. I am focusing on some common metrics to extract (mean, standard deviation [SD], and sample size [N]), but there are many others that I will not address here (e.g., Pearson’s r). The data we extract here will be useful for calculating two meta-analytic effect sizes commonly used in ecology: Hedges’ g and the response ratio.

So in order to collect these data, we create separate columns for the mean, SD, and N as you can see below. You can certainly deviate from the naming conventions in the image below, but make sure they are descriptive enough so that you know which data are from your control sites and which are from your treatment (or experimental) sites.


Figure 1.6: Another example of a data extraction spreadsheet that shows the numerical data we typically extract for a meta-analysis including mean and standard deviation from the control and experimental sites.
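To see why the mean, SD, and N are the values worth extracting, here is a minimal sketch of how they feed into the two effect sizes mentioned above. It uses commonly cited textbook formulas: Hedges’ g as the standardized mean difference with a small-sample correction, and the response ratio on the natural-log scale. The variance of g uses one common large-sample approximation. The function names and the example numbers are made up, and dedicated meta-analysis software may use slightly different variants.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, approximate variance of g) for treatment vs. control."""
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / pooled_sd        # standardized mean difference
    j = 1 - 3 / (4 * df - 1)                 # small-sample correction factor
    g = j * d
    var_g = (n_t + n_c) / (n_t * n_c) + g**2 / (2 * (n_t + n_c))
    return g, var_g


def log_response_ratio(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (ln(mean_t / mean_c), approximate variance). Means must be positive."""
    if mean_t <= 0 or mean_c <= 0:
        raise ValueError("the log response ratio requires positive means")
    lnrr = math.log(mean_t / mean_c)
    var_lnrr = sd_t**2 / (n_t * mean_t**2) + sd_c**2 / (n_c * mean_c**2)
    return lnrr, var_lnrr


# Made-up richness data: invaded (treatment) vs. uninvaded (control) sites.
g, var_g = hedges_g(mean_t=8, sd_t=2, n_t=10, mean_c=12, sd_c=2, n_c=10)
lnrr, var_lnrr = log_response_ratio(mean_t=8, sd_t=2, n_t=10, mean_c=12, sd_c=2, n_c=10)
print(round(g, 3), round(var_g, 3))
print(round(lnrr, 3), round(var_lnrr, 4))
```

Both values are negative in this made-up example because mean richness is lower at the treatment sites than at the control sites.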

The best-case scenario is that when you go to extract data from an article, you find the mean, SD, and sample size reported directly.

What happens if I can’t find the data within an article?

1. Download the supplemental information. This includes data repositories.
2. Extract the data from a figure.
3. Contact the author.

Applying this to our example data


Crystal-Ornelas, Robert, Jeffrey A. Brown, Rafael E. Valentin, Caroline Beardsley, and Julie L. Lockwood. n.d. “Meta-Analysis Shows That Overabundant Deer (Cervidae) Populations Consistently Decrease Average Species Abundance and Richness of Forest Birds.”

Crystal-Ornelas, Robert, and Julie L. Lockwood. 2020b. “The ‘Known Unknowns’ of Invasive Species Impact Measurement.” Biological Invasions 22 (4): 1513–25. https://doi.org/10.1007/s10530-020-02200-0.

Gough, David, Sandy Oliver, and James Thomas, eds. 2012. An Introduction to Systematic Reviews. London ; Thousand Oaks, Calif: SAGE.

Huang, Xiaoli, Jimmy Lin, and Dina Demner-Fushman. 2006. “Evaluation of PICO as a Knowledge Representation for Clinical Questions.” AMIA Annual Symposium Proceedings 2006: 359–63. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1839740/.

Iannone, R. 2018. “DiagrammeR: Graph/Network Visualization.” R Package 1 (0).

Koricheva, Julia, Jessica Gurevitch, and Kerrie Mengersen. 2013. Handbook of Meta-Analysis in Ecology and Evolution. Princeton University Press.

Moher, David, Alessandro Liberati, Jennifer Tetzlaff, Douglas G. Altman, and Prisma Group. 2009. “Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement.” PLoS Medicine 6 (7): e1000097. https://doi.org/10/bq3jpc.

Wang Wei, J., B. P. Lee, and L. Bing Wen. 2016. “Citizen Science and the Urban Ecology of Birds and Butterflies - A Systematic Review.” PLoS ONE 11 (6): e0156425. https://doi.org/10/gbnb34.