17 Effect Size Calculation & Conversion

A problem meta-analysts frequently face is that suitable “raw” effect size data cannot be extracted from all included studies. Most functions in the {meta} package, such as metacont (Chapter 4.2.2) or metabin (Chapter 4.2.3.1), can only be used when complete raw effect size data is available.

In practice, this often leads to difficulties. Some published articles, particularly older ones, do not report results in a way that allows us to extract the needed (raw) effect size data. It is not uncommon to find that a study reports the results of a $t$-test, one-way ANOVA, or $\chi^2$-test, but not the group-wise means and standard deviations, or the number of events in the study conditions, that we need for our meta-analysis.

The good news is that we can sometimes convert reported information into the desired effect size format. This makes it possible to include affected studies in a meta-analysis with pre-calculated data (Chapter 4.2.1) using metagen. For example, we can convert the results of a two-sample $t$-test to a standardized mean difference and its standard error, and then use metagen to perform a meta-analysis of pre-calculated SMDs. The {esc} package (Lüdecke 2019) provides several helpful functions which allow us to perform such conversions directly in R.

17.1 Mean & Standard Error

When calculating SMDs or Hedges’ $g$ from the mean and standard error, we can make use of the fact that the standard deviation can be recovered from the standard error of a mean by multiplying the standard error with the square root of the sample size (Thalheimer and Cook 2002):

\[\begin{equation} \text{SD} =\text{SE}\sqrt{n} \tag{17.1} \end{equation}\]

We can calculate the SMD or Hedges’ $g$ using the esc_mean_se function. Here is an example:

library(esc)

esc_mean_se(grp1m = 8.5,   # mean of group 1
            grp1se = 1.5,  # standard error of group 1
            grp1n = 50,    # sample size of group 1
            grp2m = 11,    # mean of group 2
            grp2se = 1.8,  # standard error of group 2
            grp2n = 60,    # sample size of group 2
            es.type = "d") # convert to SMD; use "g" for Hedges' g

## 
## Effect Size Calculation for Meta Analysis
## 
##      Conversion: mean and se to effect size d
##     Effect Size:  -0.2012
##  Standard Error:   0.1920
##        Variance:   0.0369
##        Lower CI:  -0.5774
##        Upper CI:   0.1751
##          Weight:  27.1366
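
To see what Equation 17.1 does in practice, the following base R sketch converts the standard errors from the example above back to standard deviations and then computes an SMD by hand. The pooled standard deviation formula and all variable names are assumptions made for illustration; {esc} may use slightly different internal formulas, so the hand-calculated value (roughly -0.20) need not match the esc_mean_se output to the last digit.

```r
# Values taken from the esc_mean_se example above
m1 <- 8.5; se1 <- 1.5; n1 <- 50
m2 <- 11;  se2 <- 1.8; n2 <- 60

# Equation 17.1: recover the standard deviation from the standard error
sd1 <- se1 * sqrt(n1)
sd2 <- se2 * sqrt(n2)

# Pooled SD of two independent groups (assumed textbook formula)
sd_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))

# Standardized mean difference
smd <- (m1 - m2) / sd_pooled
round(smd, 2)
```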

17.2 Regression Coefficients

It is possible to calculate SMDs, Hedges’ $g$ or a correlation $r$ from standardized or unstandardized regression coefficients (Lipsey and Wilson 2001, Appendix B). For unstandardized coefficients, we can use the esc_B function in {esc}. Here is an example:

library(esc)

esc_B(b = 3.3,       # unstandardized regression coefficient
      sdy = 5,       # standard deviation of predicted variable y
      grp1n = 100,   # sample size of the first group
      grp2n = 150,   # sample size of the second group
      es.type = "d") # convert to SMD; use "g" for Hedges' g

## 
## Effect Size Calculation for Meta Analysis
## 
##      Conversion: unstandardized regression coefficient to effect size d
##     Effect Size:   0.6962
##  Standard Error:   0.1328
##        Variance:   0.0176
##        Lower CI:   0.4359
##        Upper CI:   0.9565
##          Weight:  56.7018
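
The conversion above can also be traced by hand. The sketch below assumes the approach described by Lipsey and Wilson (2001), in which the pooled standard deviation is reconstructed from the standard deviation of $y$ and the coefficient itself; this formula is not shown in this chapter and is stated here as an assumption, not as the exact {esc} implementation. With the values of the example, it reproduces the effect size printed above.

```r
# Values taken from the esc_B example above
b   <- 3.3   # unstandardized regression coefficient
sdy <- 5     # standard deviation of y
n1  <- 100
n2  <- 150
N   <- n1 + n2

# Assumed formula: remove the between-group part of the variance of y
sd_pooled <- sqrt((sdy^2 * (N - 1) - b^2 * (n1 * n2 / N)) / (N - 2))

# SMD as the coefficient divided by the pooled SD
smd <- b / sd_pooled
round(smd, 4)
```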

esc_B(b = 2.9,       # unstandardized regression coefficient
      sdy = 4,       # standard deviation of the predicted variable y
      grp1n = 50,    # sample size of the first group
      grp2n = 50,    # sample size of the second group
      es.type = "r") # convert to correlation

## Effect Size Calculation for Meta Analysis
## 
##      Conversion: unstandardized regression coefficient 
##                  to effect size correlation
##     Effect Size:   0.3611
##  Standard Error:   0.1031
##        Variance:   0.0106
##        Lower CI:   0.1743
##        Upper CI:   0.5229
##          Weight:  94.0238
##      Fisher's z:   0.3782
##       Lower CIz:   0.1761
##       Upper CIz:   0.5803

Standardized regression coefficients can be transformed using esc_beta.

esc_beta(beta = 0.32,   # standardized regression coefficient
         sdy = 5,       # standard deviation of the predicted variable y
         grp1n = 100,   # sample size of the first group
         grp2n = 150,   # sample size of the second group
         es.type = "d") # convert to SMD; use "g" for Hedges' g

## 
## Effect Size Calculation for Meta Analysis
## 
##      Conversion: standardized regression coefficient to effect size d
##     Effect Size:   0.6867
##  Standard Error:   0.1327
##        Variance:   0.0176
##        Lower CI:   0.4266
##        Upper CI:   0.9468
##          Weight:  56.7867

esc_beta(beta = 0.37,   # standardized regression coefficient
         sdy = 4,       # standard deviation of predicted variable y
         grp1n = 50,    # sample size of the first group
         grp2n = 50,    # sample size of the second group
         es.type = "r") # convert to correlation

## Effect Size Calculation for Meta Analysis
## 
##      Conversion: standardized regression coefficient 
##                  to effect size correlation
##     Effect Size:   0.3668
##  Standard Error:   0.1033
##        Variance:   0.0107
##        Lower CI:   0.1803
##        Upper CI:   0.5278
##          Weight:  93.7884
##      Fisher's z:   0.3847
##       Lower CIz:   0.1823
##       Upper CIz:   0.5871

Please note that using regression coefficients in meta-analysis can be tricky, because we assume that the same model has been used in all studies. This is particularly problematic if coefficients are extracted from multiple regression models, because studies may have controlled for different covariates in their models, which means that the $b$ values are not directly comparable.

17.3 Correlations

For equally sized groups ($n_1=n_2$), we can use the following formula to derive the SMD from the point-biserial correlation (Lipsey and Wilson 2001, chap. 3).

\[\begin{equation} r_{pb} = \frac{\text{SMD}}{\sqrt{\text{SMD}^2+4}} ~~~~~~~~ \text{SMD}=\frac{2r_{pb}}{\sqrt{1-r^2_{pb}}} \tag{17.2} \end{equation}\]

A different formula has to be used for unequally sized groups (Aaron, Kromrey, and Ferron 1998):

\[\begin{align} r_{pb} &= \frac{\text{SMD}}{\sqrt{\text{SMD}^2+\dfrac{(N^2-2N)}{n_1n_2}}} \notag \\ \text{SMD} &= \dfrac{r_{pb}}{\sqrt{(1-r^2_{pb})\left(\frac{n_1}{N}\times\left(1-\frac{n_1}{N}\right)\right)}} \tag{17.3} \end{align}\]

To convert $r_{pb}$ to an SMD or Hedges’ $g$, we can use the esc_rpb function.

library(esc)

esc_rpb(r = 0.25,      # point-biserial correlation
        grp1n = 99,    # sample size of group 1
        grp2n = 120,   # sample size of group 2
        es.type = "d") # convert to SMD; use "g" for Hedges' g

## 
## Effect Size Calculation for Meta Analysis
## 
##      Conversion: point-biserial r to effect size d
##     Effect Size:   0.5188
##  Standard Error:   0.1380
##        Variance:   0.0190
##        Lower CI:   0.2483
##        Upper CI:   0.7893
##          Weight:  52.4967
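
As a cross-check, the second line of Equation 17.3 can be evaluated directly in base R. This is a minimal sketch with illustrative variable names; it uses the same input values as the esc_rpb call above and also shows what Equation 17.2 would give if the groups were treated as equally sized.

```r
# Values taken from the esc_rpb example above
r_pb <- 0.25
n1   <- 99
n2   <- 120
N    <- n1 + n2
p    <- n1 / N   # proportion of the sample in group 1

# Equation 17.3 (unequal group sizes)
smd_unequal <- r_pb / sqrt((1 - r_pb^2) * (p * (1 - p)))

# Equation 17.2 (only appropriate if n1 = n2)
smd_equal <- 2 * r_pb / sqrt(1 - r_pb^2)

round(c(unequal = smd_unequal, equal = smd_equal), 4)
```

The two results are close here because the group sizes are similar; the gap widens as the groups become more unbalanced.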

17.4 One-Way ANOVAs

We can also derive the SMD from the $F$-value of a one-way ANOVA with two groups. Such ANOVAs can be identified by looking at the degrees of freedom. In a one-way ANOVA with two groups, the degrees of freedom should always start with 1 (e.g. $F_{\text{1,147}}$=5.31).

The formula used for the transformation looks like this (based on Rosnow and Rosenthal 1996; Rosnow, Rosenthal, and Rubin 2000; see Thalheimer and Cook 2002):

\[\begin{equation} \text{SMD} = \sqrt{ F\left(\frac{n_1+n_2}{n_1 n_2}\right)\left(\frac{n_1+n_2}{n_1+n_2-2}\right)} \tag{17.4} \end{equation}\]

To calculate the SMD or Hedges’ $g$ from $F$-values, we can use the esc_f function. Here is an example:

esc_f(f = 5.04,      # F value of the one-way anova
      grp1n = 519,   # sample size of group 1 
      grp2n = 528,   # sample size of group 2
      es.type = "g") # convert to Hedges' g; use "d" for SMD

## 
## Effect Size Calculation for Meta Analysis
## 
##      Conversion: F-value (one-way-Anova) to effect size Hedges' g
##     Effect Size:   0.1387
##  Standard Error:   0.0619
##        Variance:   0.0038
##        Lower CI:   0.0174
##        Upper CI:   0.2600
##          Weight: 261.1022
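
For illustration, Equation 17.4 can be applied by hand with the values used above. This is a base R sketch with illustrative variable names. Note that it returns an uncorrected SMD, while the esc_f call above requested Hedges' $g$, and that {esc} may use a slightly different formula internally, so small differences in the last digits are to be expected.

```r
# Values taken from the esc_f example above
f  <- 5.04
n1 <- 519
n2 <- 528

# Equation 17.4
smd <- sqrt(f * ((n1 + n2) / (n1 * n2)) * ((n1 + n2) / (n1 + n2 - 2)))
round(smd, 3)
```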

17.5 Two-Sample $t$-Tests

An effect size expressed as a standardized mean difference can also be derived from an independent two-sample $t$-test value, using the following formula (Rosnow, Rosenthal, and Rubin 2000; Thalheimer and Cook 2002):

\[\begin{equation} \text{SMD} = \frac {t(n_1+n_2)}{\sqrt{(n_1+n_2-2)(n_1n_2)}} \tag{17.5} \end{equation}\]

In R, we can calculate the SMD or Hedges’ $g$ from a $t$-value using the esc_t function. Here is an example:

esc_t(t = 3.3,       # t-value
      grp1n = 100,   # sample size of group 1
      grp2n = 150,   # sample size of group 2
      es.type = "d") # convert to SMD; use "g" for Hedges' g

## 
## Effect Size Calculation for Meta Analysis
## 
##      Conversion: t-value to effect size d
##     Effect Size:   0.4260
##  Standard Error:   0.1305
##        Variance:   0.0170
##        Lower CI:   0.1703
##        Upper CI:   0.6818
##          Weight:  58.7211
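
Equation 17.5 is easy to evaluate by hand, as the base R sketch below shows (variable names are illustrative). It also demonstrates that, for two groups, plugging $F = t^2$ into Equation 17.4 leads to the same value. The result can differ slightly from the esc_t output above, since {esc} may use a marginally different variant of the formula.

```r
# Values taken from the esc_t example above
t_val <- 3.3
n1    <- 100
n2    <- 150

# Equation 17.5
smd_t <- t_val * (n1 + n2) / sqrt((n1 + n2 - 2) * (n1 * n2))

# Equation 17.4 with F = t^2 (two-group case)
smd_f <- sqrt(t_val^2 * ((n1 + n2) / (n1 * n2)) * ((n1 + n2) / (n1 + n2 - 2)))

round(c(from_t = smd_t, from_F = smd_f), 4)
```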

17.6 $p$-Values

At times, studies only report the effect size (e.g. a value of Cohen’s $d$), the $p$-value of that effect, and nothing more. Yet, to pool results in a meta-analysis, we need a measure of the precision of the effect size, preferably the standard error.

In such cases, we must estimate the standard error from the $p$-value of the effect size. This is possible for effect sizes based on differences (i.e. SMDs), or ratios (i.e. risk or odds ratios), using the formulas by Altman and Bland (2011). These formulas are implemented in the se.from.p function in R.

The “se.from.p” Function

The se.from.p function is included in the {dmetar} package. Once {dmetar} is installed and loaded on your computer, the function is ready to be used. If you did not install {dmetar}, follow these instructions:

Access the source code of the function online.
Let R “learn” the function by copying and pasting the source code in its entirety into the console (bottom left pane of RStudio), and then hit “Enter”.

Assuming a study with $N=$ 71 participants, reporting an effect size of $d=$ 0.71 for which $p=$ 0.013, we can calculate the standard error like this:

library(dmetar)

se.from.p(0.71,
          p = 0.013,
          N = 71,
          effect.size.type = "difference")

##   EffectSize StandardError StandardDeviation  LLCI  ULCI
## 1       0.71         0.286             2.410 0.149 1.270

For a study with $N=$ 200 participants reporting an effect size of OR = 0.91 with $p=$ 0.38, the standard error is calculated this way:

library(magrittr) # for pipe

se.from.p(0.91, p = 0.38, N = 200,
          effect.size.type = "ratio") %>% t()

##                        [,1]
## logEffectSize        -0.094
## logStandardError      0.105
## logStandardDeviation  1.498
## logLLCI              -0.302
## logULCI               0.113
## EffectSize            0.910
## LLCI                  0.739
## ULCI                  1.120

When effect.size.type = "ratio", the function automatically also calculates the log-transformed effect size and standard error, which are needed to use the metagen function (Chapter 4.2.1).
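
To make the calculation more transparent, here is a minimal base R sketch of the Altman and Bland (2011) approach: the $p$-value is first converted to an approximate $z$-value, and the standard error is then obtained by dividing the (log-transformed, for ratios) effect size by $z$. The helper name se_from_p_sketch is hypothetical, and the sketch is not the {dmetar} source code; it omits the confidence intervals and standard deviation returned by se.from.p.

```r
# Hypothetical helper: standard error from an effect size and its p-value
se_from_p_sketch <- function(es, p, type = c("difference", "ratio")) {
  type <- match.arg(type)
  # Approximate z-value corresponding to a two-sided p-value
  z <- -0.862 + sqrt(0.743 - 2.404 * log(p))
  # Ratios are analyzed on the log scale
  est <- if (type == "ratio") log(es) else es
  abs(est / z)
}

se_d  <- se_from_p_sketch(0.71, p = 0.013, type = "difference")
se_or <- se_from_p_sketch(0.91, p = 0.38, type = "ratio")  # on the log scale
round(c(se_d = se_d, se_log_or = se_or), 3)
```

For the first example, this gives a standard error of about 0.286, in line with the se.from.p output above; the log-scale value for the odds ratio is close to 0.106.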

17.7 $\chi^2$ Tests

To convert a $\chi^2$ statistic to an odds ratio, the esc_chisq function can be used (assuming that d.f. = 1; e.g. $\chi^2_1$ = 8.7). Here is an example:

esc_chisq(chisq = 7.9,        # chi-squared value
          totaln = 100,       # total sample size
          es.type = "cox.or") # convert to odds ratio

## 
## Effect Size Calculation for Meta Analysis
## 
##      Conversion: chi-squared-value to effect size Cox odds ratios
##     Effect Size:   2.6287
##  Standard Error:   0.3440
##        Variance:   0.1183
##        Lower CI:   1.3394
##        Upper CI:   5.1589
##          Weight:   8.4502

17.8 Number Needed To Treat

Effect sizes such as Cohen’s $d$ or Hedges’ $g$ are often difficult to interpret from a practical standpoint. Imagine that we found an intervention effect of $g=$ 0.35 in our meta-analysis. How can we communicate what such an effect means to patients, public officials, medical professionals, or other stakeholders?

To make it easier for others to understand the results, meta-analyses also often report the number needed to treat (NNT). This measure is most commonly used in medical research. It signifies how many additional patients must receive the treatment under study to prevent one additional negative event (e.g. relapse) or achieve one additional positive event (e.g. symptom remission, response). If NNT = 3, for example, we can say that three individuals must receive the treatment to avoid one additional relapse case; or that three patients must be treated to achieve one additional case of reliable symptom remission, depending on the research question.

When we are dealing with binary effect size data, calculation of NNTs is relatively easy. The formula looks like this:

\[\begin{equation} \text{NNT} = (p_{e_{\text{treat}}}-p_{e_{\text{control}}})^{-1} \tag{17.6} \end{equation}\]

In this formula, $p_{e_{\text{treat}}}$ and $p_{e_{\text{control}}}$ are the proportions of participants who experienced the event in the treatment and control group, respectively. These proportions are identical to the “risks” used to calculate the risk ratio (Chapter 3.3.2.1), and are also known as the experimental group event rate (EER) and control group event rate (CER). Given its formula, the NNT can also be described as the inverse of the (absolute) risk difference.
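
A short numerical illustration of Equation 17.6, using made-up event rates (both values below are hypothetical and not taken from any study):

```r
# Hypothetical event rates (assumptions for illustration only)
eer <- 0.20  # proportion with relapse in the treatment group
cer <- 0.30  # proportion with relapse in the control group

# Equation 17.6, using the absolute risk difference
nnt_value <- 1 / abs(eer - cer)
nnt_value
```

With these rates, ten patients would have to be treated to prevent one additional relapse.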

Converting standardized mean differences or Hedges’ $g$ to a NNT is more complicated. There are two commonly used methods:

The method by Kraemer and Kupfer (2006), which calculates the NNT from an area under the curve (AUC), defined as the probability that a patient in the treatment group has an outcome preferable to the one in the control group. This method allows us to calculate the NNT directly from an SMD or $g$ without any extra information.
The method by Furukawa and Leucht (2011), which calculates NNT values from SMDs using the CER, or a reasonable estimate thereof. Furukawa’s method has been shown to be superior in estimating the true NNT value compared to the Kraemer & Kupfer method (Furukawa and Leucht 2011). If we can make reasonable estimates of the CER, Furukawa’s method should therefore always be preferred.

When we use risk or odds ratios as effect size measures, NNTs can be calculated directly from {meta} objects using the nnt function. After running our meta-analysis using metabin (Chapter 4.2.3.1), we only have to plug the results into the nnt function. Here is an example:

library(meta)
data(Olkin1995)

# Run meta-analysis with binary effect size data
m.b <- metabin(ev.exp, n.exp, ev.cont, n.cont, 
               data = Olkin1995,
               sm = "RR")
nnt(m.b)

## Number needed to treat (common effect model): 
## 
##      RR    p.c    NNTB             95%-CI
##  0.7728 0.1440 30.5677 [26.1222; 37.2386]
##  0.7728 0.3750 11.7383 [10.0312; 14.3001]
## 
## Number needed to treat (random effects model): 
## 
##      RR    p.c    NNTB             95%-CI
##  0.7694 0.1440 30.1139 [24.0662; 41.3519]
##  0.7694 0.3750 11.5641 [ 9.2417; 15.8796]

The nnt function provides the number needed to treat for different assumed CERs, which are shown in the p.c column. By default, these values are derived from the control group event rates in our data set, and each row of the output corresponds to one of them. The NNT based on a typical (e.g. mean) CER is the one that is usually reported.
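
The rows of this output can be reproduced approximately by hand. The sketch below assumes that the event rate in the treatment group is obtained by multiplying the assumed CER with the pooled risk ratio, and then applies Equation 17.6. It uses the rounded values printed in the first row of the common effect model output, so the result only agrees with the printed NNT up to rounding.

```r
# Rounded values from the first row of the common effect model output
rr  <- 0.7728  # pooled risk ratio
cer <- 0.1440  # assumed control group event rate (p.c)

# Implied event rate under treatment (assumption: EER = RR * CER)
eer <- rr * cer

# Equation 17.6, expressed as a number needed to treat to benefit
nntb <- 1 / (cer - eer)
round(nntb, 1)
```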

It is also possible to use nnt with metagen models, as long as the summary measure sm is either "RR" or "OR". For such models, we also need to specify the assumed CER in the p.c argument in nnt. Here is an example using the m.gen_bin meta-analysis object we created in Chapter 4.2.3.1.5:

# Also show fixed-effect model results
m.gen_bin <- update(m.gen_bin, fixed = TRUE)

## Warning: Use argument 'common' instead of 'fixed' (deprecated).

nnt(m.gen_bin, 
    p.c = 0.1) # Use a CER of 0.1

## Number needed to treat (common effect model): 
## 
##      RR    p.c   NNTH            95%-CI
##  2.0319 0.1000 9.6906 [8.2116; 11.6058]
## 
## Number needed to treat (random effects model): 
## 
##      RR    p.c   NNTH            95%-CI
##  2.0218 0.1000 9.7870 [6.4761; 16.4843]

Standardized mean differences or Hedges’ $g$ can be converted to the NNT using the NNT function in {dmetar}.

The “NNT” Function

If you did not install {dmetar}, follow these instructions:

Access the source code of the NNT function online.
Let R “learn” the function by copying and pasting the source code in its entirety into the console (bottom left pane of RStudio), and then hit “Enter”.

To use the Kraemer & Kupfer method, we only have to provide the NNT function with an effect size (SMD or $g$). Furukawa’s method is automatically used as soon as a CER value is supplied.

NNT(d = 0.245)

## Kraemer & Kupfer method used. 
## [1] 7.270711

NNT(d = 0.245, CER = 0.35)

## Furukawa & Leucht method used. 
## [1] 10.61533
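
Both results can be retraced with a few lines of base R. The sketch below is based on commonly cited formulations of the two methods and is an assumption on our part, not the {dmetar} source code: for Kraemer & Kupfer, the AUC is taken to be $\Phi(d/\sqrt{2})$ and the NNT is $1/(2 \times \text{AUC} - 1)$; for Furukawa & Leucht, the event rate in the treatment group is derived by shifting the normal quantile of the CER by $d$.

```r
d   <- 0.245
cer <- 0.35

# Kraemer & Kupfer: NNT from the area under the curve
auc    <- pnorm(d / sqrt(2))
nnt_kk <- 1 / (2 * auc - 1)

# Furukawa & Leucht: NNT from the SMD and an assumed CER
eer    <- pnorm(d + qnorm(cer))
nnt_fl <- 1 / (eer - cer)

round(c(kraemer_kupfer = nnt_kk, furukawa_leucht = nnt_fl), 2)
```

With these inputs, the sketch gives values of about 7.27 and 10.62, in line with the NNT output above.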

A Number to be Treated with Care: Criticism of the NNT

While common, the use of NNTs to communicate the results of clinical trials is not uncontroversial. Criticisms include that lay people often misunderstand NNTs, despite their reputation as an “intuitive” alternative to other effect size measures (Christensen and Kristiansen 2006), and that researchers often calculate them incorrectly (Mendes, Alves, and Batel-Marques 2017).

Furthermore, it is not possible to calculate reliable standard errors (and confidence intervals) of NNTs, which means that they cannot be used in meta-analyses (Hutton 2010). It is only possible to convert results to the NNT after pooling has been conducted using another effect size measure.

17.9 Multi-Arm Studies

To avoid unit-of-analysis errors (Chapter 3.5.2), it is sometimes necessary to pool the mean and standard deviation of two or more trial arms before calculating a (standardized) mean difference. To pool continuous effect size data of two groups, we can use these equations:

\[\begin{align} n_{\text{pooled}} &= n_1 + n_2 \\ m_{\text{pooled}} &= \frac{n_1m_1+n_2m_2}{n_1+n_2} \\ SD_{\text{pooled}} &= \sqrt{\frac{(n_1-1)SD^{2}_{1}+ (n_2-1)SD^{2}_{2}+\frac{n_1n_2}{n_1+n_2}(m^{2}_1+m^{2}_2-2m_1m_2)} {n_1+n_2-1}} \end{align}\]

We can apply these formulas in R using the pool.groups function.

The “pool.groups” Function

The pool.groups function is included in the {dmetar} package. Once {dmetar} is installed and loaded on your computer, the function is ready to be used. If you did not install {dmetar}, follow these instructions:

Access the source code of the function online.
Let R “learn” the function by copying and pasting the source code in its entirety into the console (bottom left pane of RStudio), and then hit “Enter”.

Here is an example:

library(dmetar)

pool.groups(n1 = 50,   # sample size group 1
            n2 = 50,   # sample size group 2
            m1 = 3.5,  # mean group 1
            m2 = 4,    # mean group 2
            sd1 = 3,   # sd group 1
            sd2 = 3.8) # sd group 2

##   Mpooled SDpooled Npooled
## 1    3.75 3.415369     100
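
For readers who want to see the pooling equations at work without {dmetar}, the following base R sketch applies them directly to the same input values. Variable names are illustrative.

```r
# Values taken from the pool.groups example above
n1 <- 50;  n2 <- 50
m1 <- 3.5; m2 <- 4
sd1 <- 3;  sd2 <- 3.8

# Pooled sample size and mean
n_pooled <- n1 + n2
m_pooled <- (n1 * m1 + n2 * m2) / (n1 + n2)

# Pooled standard deviation: within-group variation plus
# the variation caused by the difference between the group means
sd_pooled <- sqrt(
  ((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2 +
     (n1 * n2 / (n1 + n2)) * (m1^2 + m2^2 - 2 * m1 * m2)) /
    (n1 + n2 - 1)
)

c(Mpooled = m_pooled, SDpooled = sd_pooled, Npooled = n_pooled)
```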

17.10 Aggregation of Effect Sizes

The aggregate function in {metafor} can be used to aggregate several dependent, pre-calculated effect sizes into one estimate, for example because they are part of the same study or cluster. This is one way to avoid the unit-of-analysis error (see Chapter 3.5.2), but it requires us to assume a value for the within-study correlation, which is typically unknown. Another (and often preferable) way to deal with effect size dependencies is to use (correlated) hierarchical models, which are illustrated in Chapter 10.

In this example, we aggregate effect sizes of the Chernobyl data set (see Chapter 10.2), so that each study only provides one effect size:

library(metafor)
library(dmetar)
data("Chernobyl")

# Convert 'Chernobyl' data to 'escalc' object
Chernobyl <- escalc(yi = z,           # Effect size
                    sei = se.z,       # Standard error
                    data = Chernobyl)

# Aggregate effect sizes on study level
# We assume a correlation of rho=0.6
Chernobyl.agg <- aggregate(Chernobyl, 
                           cluster = author,
                           rho = 0.6)

# Show aggregated results
Chernobyl.agg[,c("author", "yi", "vi")]

##                       author     yi     vi 
## 1 Aghajanyan & Suskov (2009) 0.2415 0.0079 
## 2     Alexanin et al. (2010) 1.3659 0.0012 
## 3             Bochkov (1993) 0.2081 0.0014 
## 4      Dubrova et al. (1996) 0.3068 0.0132 
## 5      Dubrova et al. (1997) 0.4453 0.0110
## [...]

Please note that aggregate returns the aggregated effect sizes yi as well as their variance vi, the square root of which is the standard error.

\[\tag*{$\blacksquare$}\]


Doing Meta-Analysis in R: A Hands-on Guide