3.1 Data preparation in Excel

To conduct meta-analyses in R, you need to have your study data prepared. In the following, we describe the preferred way to structure your dataset to facilitate the import into RStudio. We do this for two types of data: “raw” effect size data and pre-calculated effect size data. Usually, data imported into RStudio is stored in Excel spreadsheets first. We recommend storing your data there, because this makes the import very easy.

What (not) to do in your Excel spreadsheet

  • It is very important how you name the columns of your spreadsheet. If you name the columns adequately in Excel, you save a lot of time because your data does not have to be transformed in RStudio later on. “Naming” a column simply means writing the name of the variable into the first row of that column; RStudio will then automatically detect it as the column name.

  • It doesn’t matter how columns are ordered in your Excel spreadsheet. They just have to be labeled correctly.

  • There’s also no need to format the columns in any way. Typing the column name into the first row of your spreadsheet is enough.

  • It is also important to know that the import can distort special characters such as ä, ü, ö, á, é, ê, etc. So be sure to replace them with “normal” letters before you proceed.

  • Make sure that your Excel file only contains one sheet. If you want to import Excel files with more than one sheet, you can have a look at the xlsx package.

  • If you have one or several empty rows or columns which used to contain data, make sure to delete those rows/columns completely, because RStudio could otherwise treat them as containing data and import them as well.
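
Two of the points above (leftover empty rows/columns and special characters) can also be checked after the import. The following is a minimal sketch in base R, not the import procedure itself; the data frame `dat`, the vector `authors`, and all values are made up for illustration and simply stand in for an imported spreadsheet.

```r
# Hypothetical stand-in for a freshly imported spreadsheet (made-up values)
dat <- data.frame(
  Author = c("Study A", "Study B", NA),
  Me     = c(10.2, 8.7, NA),
  Empty  = c(NA, NA, NA),
  stringsAsFactors = FALSE
)

# Rows and columns that contain nothing but missing values
empty_rows <- which(rowSums(!is.na(dat)) == 0)
empty_cols <- names(dat)[colSums(!is.na(dat)) == 0]

# Keep only rows and columns that hold real data
dat_clean <- dat[setdiff(seq_len(nrow(dat)), empty_rows),
                 setdiff(names(dat), empty_cols),
                 drop = FALSE]

# Flag study labels containing characters outside plain ASCII (e.g. umlauts)
authors <- c("Mueller et al., 2015", "M\u00fcller et al., 2015")
has_special <- vapply(
  authors,
  function(s) any(utf8ToInt(s) > 127L),
  logical(1),
  USE.NAMES = FALSE
)
```

Here `empty_rows` and `empty_cols` point to the third row and the `Empty` column, and `has_special` flags only the second label. Fixing such problems in the Excel file itself, as described above, remains the simpler route.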

3.1.1 Setting the columns of the Excel spreadsheet (raw effect size data)

3.1.1.1 “Standard” effect size data (M, SD, N)

For a “standard” meta-analysis which uses the mean, standard deviation, and sample size from both groups in a study, the following information is needed for every study.

  • The names of the individual studies, so that they can be easily identified later on. Usually, the first author and publication year of a study are used for this (e.g. “Ebert et al., 2018”). The names must be unique for each study.
  • The mean of both the intervention and the control group at the same assessment point.
  • The standard deviation of both the intervention and the control group at the same assessment point.
  • The number of participants (\(N\)) in each group of the trial.
  • If you want to have a look at differences between various study subgroups later on, you also need a subgroup code for each study which signifies to which subgroup it belongs. For example, if a study was conducted in children, you might give it the subgroup code “children”.

Here is how you should name the data columns in the Excel spreadsheet containing your meta-analysis data:

Column | Description
Author | The column for the study label (i.e., the first author)
Me | The mean of the experimental/intervention group
Se | The standard deviation of the experimental/intervention group
Mc | The mean of the control group
Sc | The standard deviation of the control group
Ne | The number of participants in the experimental/intervention group
Nc | The number of participants in the control group
Subgroup | The label for one of your subgroup codes. It is not that important how you name this column, so you can give it a more informative name (e.g. population). In this column, each study should be given a subgroup code, which must be spelled exactly the same for every study in that subgroup, including upper/lowercase letters. You can also include more than one subgroup column with different subgroup codings, but each column name has to be unique.
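
Once a spreadsheet with this layout has been imported, the naming rules can be verified with a few lines of base R. This is a sketch under assumptions: the data frame `dat` and its values are invented for illustration, and the checks are our own suggestion rather than a required step.

```r
# Made-up example rows; in practice this data frame comes from your spreadsheet
dat <- data.frame(
  Author   = c("Study A, 2015", "Study B, 2017", "Study C, 2018"),
  Me       = c(12.4, 10.1, 14.8),
  Se       = c(4.2, 3.9, 5.1),
  Mc       = c(15.0, 12.3, 15.2),
  Sc       = c(4.5, 4.1, 4.9),
  Ne       = c(50, 42, 61),
  Nc       = c(48, 45, 60),
  Subgroup = c("children", "adults", "Adults"),
  stringsAsFactors = FALSE
)

# Column names are case-sensitive, so "me" would not count as "Me"
required     <- c("Author", "Me", "Se", "Mc", "Sc", "Ne", "Nc")
missing_cols <- setdiff(required, names(dat))

# Study labels must be unique
duplicated_labels <- dat$Author[duplicated(dat$Author)]

# Subgroup codes that differ only in upper/lowercase count as different groups
codes <- unique(dat$Subgroup)
lower <- tolower(codes)
case_clashes <- codes[duplicated(lower) | duplicated(lower, fromLast = TRUE)]
```

In this example, `missing_cols` and `duplicated_labels` are empty, while `case_clashes` contains “adults” and “Adults”, which would be treated as two separate subgroups until the spelling is harmonized.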



3.1.1.2 Event rate data

For a meta-analysis of event rates, which uses the number of events and sample size from both groups in a study, the following information is needed for every study.

  • The names of the individual studies, so that they can be easily identified later on. Usually, the first author and publication year of a study are used for this (e.g. “Ebert et al., 2018”). The names must be unique for each study.
  • The number of events in both the intervention and the control group at the same assessment point.
  • The number of participants (\(N\)) in each group of the trial.
  • If you want to have a look at differences between various study subgroups later on, you also need a subgroup code for each study which signifies to which subgroup it belongs. For example, if a study was conducted in children, you might give it the subgroup code “children”.

Here is how you should name the data columns in your Excel spreadsheet:

Column | Description
Author | The column for the study label (i.e., the first author)
Ee | Number of events in the experimental treatment arm
Ne | Number of participants in the experimental treatment arm
Ec | Number of events in the control arm
Nc | Number of participants in the control arm
Subgroup | The label for one of your subgroup codes. It is not that important how you name this column, so you can give it a more informative name (e.g. population). In this column, each study should be given a subgroup code, which must be spelled exactly the same for every study in that subgroup, including upper/lowercase letters. You can also include more than one subgroup column with different subgroup codings, but each column name has to be unique.
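
Event data lends itself to a simple plausibility check: the number of events in an arm can never exceed the number of participants in that arm. The sketch below uses base R only; `dat` and its values are hypothetical, with one deliberately implausible row.

```r
# Made-up event data; the third study has more events than participants
dat <- data.frame(
  Author = c("Study A, 2015", "Study B, 2017", "Study C, 2018"),
  Ee     = c(12, 8, 30),
  Ne     = c(50, 42, 25),
  Ec     = c(20, 15, 18),
  Nc     = c(48, 45, 60),
  stringsAsFactors = FALSE
)

# All four columns should hold non-negative whole numbers
count_cols <- c("Ee", "Ne", "Ec", "Nc")
is_count <- vapply(
  dat[count_cols],
  function(x) all(x >= 0 & x == round(x)),
  logical(1)
)

# Studies in which events exceed participants in either arm
implausible <- dat$Author[dat$Ee > dat$Ne | dat$Ec > dat$Nc]
```

Here `is_count` is `TRUE` for all four columns, and `implausible` returns “Study C, 2018”, which usually indicates that two cells were swapped during data entry.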



3.1.1.3 Incidence rate data

For a meta-analysis of incidence rates, which uses the number of events and person-time at risk from both groups in a study, the following information is needed for every study.

  • The names of the individual studies, so that they can be easily identified later on. Usually, the first author and publication year of a study are used for this (e.g. “Ebert et al., 2018”). The names must be unique for each study.
  • The number of events in both the intervention and the control group at the same assessment point.
  • The person-time at risk in each group of the trial.
  • If you want to have a look at differences between various study subgroups later on, you also need a subgroup code for each study which signifies to which subgroup it belongs. For example, if a study was conducted in children, you might give it the subgroup code “children”.

Here is how you should name the data columns in your Excel spreadsheet:

Column | Description
Author | The column for the study label (i.e., the first author)
event.e | Number of events in the experimental treatment arm
time.e | The person-time at risk in the experimental treatment arm
event.c | Number of events in the control arm
time.c | The person-time at risk in the control arm
Subgroup | The label for one of your subgroup codes. It is not that important how you name this column, so you can give it a more informative name (e.g. population). In this column, each study should be given a subgroup code, which must be spelled exactly the same for every study in that subgroup, including upper/lowercase letters. You can also include more than one subgroup column with different subgroup codings, but each column name has to be unique.
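
To make the role of the person-time columns concrete: the crude incidence rate of an arm is its number of events divided by its person-time at risk. The following base R sketch uses invented values and assumes that person-time is recorded in the same unit (for example person-years) in every study; the names `rate.e` and `rate.c` are our own and not required columns.

```r
# Made-up incidence data; person-time is assumed to be in person-years
dat <- data.frame(
  Author  = c("Study A, 2015", "Study B, 2017"),
  event.e = c(10, 6),
  time.e  = c(200, 150),
  event.c = c(18, 9),
  time.c  = c(180, 150),
  stringsAsFactors = FALSE
)

# Person-time must be strictly positive in both arms
time_ok <- all(dat$time.e > 0 & dat$time.c > 0)

# Crude incidence rates (events per person-year) for each arm
rate.e <- dat$event.e / dat$time.e
rate.c <- dat$event.c / dat$time.c
```

For the first study this gives 10 / 200 = 0.05 events per person-year in the experimental arm and 18 / 180 = 0.1 in the control arm. These rates are only a sanity check; the spreadsheet itself should contain the raw event counts and person-time.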

3.1.2 Setting the columns of the Excel spreadsheet (pre-calculated effect size data)

If you have already calculated the effect sizes for each study on your own, for example using Comprehensive Meta-Analysis, RevMan, or one of the effect size calculators we present in this guide, there is another way to prepare your data which makes things a little easier. In this case, you only have to include the following columns:

Column | Description
Author | The column for the study label (i.e., the first author)
TE | The calculated effect size of the study (either Cohen’s d or Hedges’ g, or some other form of effect size)
seTE | The standard error (SE) of the calculated effect
Subgroup | The label for one of your subgroup codes. It is not that important how you name this column, so you can give it a more informative name (e.g. population). In this column, each study should be given a subgroup code, which must be spelled exactly the same for every study in that subgroup, including upper/lowercase letters. You can also include more than one subgroup column with different subgroup codings, but each column name has to be unique.
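
With pre-calculated effect sizes, a problem worth checking for is that TE or seTE arrives in R as text instead of numbers, for instance when a cell uses a decimal comma. The sketch below, in base R with made-up values, shows one way to spot this; `dat` is a hypothetical stand-in for the imported spreadsheet.

```r
# Made-up data; the decimal comma in "0,15" turns seTE into a text column
dat <- data.frame(
  Author = c("Study A, 2015", "Study B, 2017", "Study C, 2018"),
  TE     = c(0.45, 0.31, 0.62),
  seTE   = c("0.12", "0,15", "0.20"),
  stringsAsFactors = FALSE
)

# Both columns should be numeric after the import
numeric_ok <- vapply(dat[c("TE", "seTE")], is.numeric, logical(1))

# Entries that cannot be read as numbers become NA when converted
se_num   <- suppressWarnings(as.numeric(dat$seTE))
bad_rows <- dat$Author[is.na(se_num) | se_num <= 0]
```

In this example `numeric_ok` is `TRUE` for `TE` and `FALSE` for `seTE`, and `bad_rows` identifies “Study B, 2017” as the entry to correct in the Excel file. A standard error must also be larger than zero, which the last line checks at the same time.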



