Chapter 14 Meta-analysis

Results from a single study are not sufficient to detect or exclude a relevant effect of a new therapy. Meta-analysis describes statistical techniques to summarize the effect estimates obtained from several independent studies on the same research question (Schmid, Stijnen, and White 2021). The studies included in a meta-analysis are usually the result of a systematic review (Egger, Higgins, and Smith 2022). There are also narrative systematic reviews where a subsequent meta-analysis is not possible.

14.1 Systematic review

A systematic review aims to identify relevant studies from a literature review. This is usually done with a literature search with pre-defined search terms in a database such as MEDLINE. Results from unpublished studies may also be included, if available from platforms such as clinicaltrials.gov. The goal of a subsequent meta-analysis is to summarize the results from a number of clinical studies on the same research question and to integrate the findings. A systematic review with subsequent meta-analysis proceeds as follows:

  • Systematically review the available evidence.
  • Provide quantitative summaries of the results from each study.
  • Combine these results across studies, if appropriate (meta-analysis).
  • Provide summary effect estimate.

Meta-analysis is the basis of evidence-based medicine (EBM).

The studies entering a systematic review should be sufficiently homogeneous regarding in- and exclusion criteria and should use the same effect measure. Typical effect measures are the (standardized) difference in means \(\theta\) between treatment groups for continuous outcomes. For binary/time to event outcomes, relative treatment effects are preferred such as relative risk \(\mbox{RR}\), odds ratio \(\mbox{OR}\) and hazard ratio \(\mbox{HR}\). These are usually considered on a log-scale, so the relevant effect estimate is \(\theta=\log(\mbox{RR})\), \(\theta=\log(\mbox{OR})\), \(\theta=\log(\mbox{HR})\).

Example 14.1 Consider nine RCTs about incidence of preeclampsia. The reported effect measure is an odds ratio for diuretic vs. control, see Table 14.1.

Table 14.1: Results of nine RCTs about incidence of preeclampsia comparing diuretic vs. control.
Study Diuretic Control OR
Weseley 11% (14/131) 10% (14/136) 1.04
Flowers 5% (21/385) 13% (17/134) 0.40
Menzies 25% (14/57) 50% (24/48) 0.33
Fallis 16% (6/38) 45% (18/40) 0.23
Cuadros 1% (12/1011) 5% (35/760) 0.25
Landesman 10% (138/1370) 13% (175/1336) 0.74
Krans 3% (15/506) 4% (20/524) 0.77
Tervila 6% (6/108) 2% (2/103) 2.97
Campbell 42% (65/153) 39% (40/102) 1.14

Figure 14.1 shows a graphical summary of the log odds ratios with 95% CIs. This kind of graphical summary is called a forest plot.

Forest plot for the estimated effects with 95\% confidence intervals in the nine studies from Example \@ref(exm:preeclampsia).

Figure 14.1: Forest plot for the estimated effects with 95% confidence intervals in the nine studies from Example 14.1.

14.2 Meta-analysis

We set the following notation for a meta-analysis over \(i=1,\ldots, n\) trials:

  • \(\theta_i\), \(\hat \theta_i\) are the true and estimated study-specific treatment effects,
  • \(s_i = \mathop{\mathrm{se}}(\hat \theta_i)\) is the standard error of \(\hat \theta_i\),
  • \(v_i = s_i^2\) is the variance of \(\hat \theta_i\) and
  • \(w_i=1/v_i\) is the (estimated) precision of \(\hat \theta_i\).

There are two different types of analyses.

14.2.1 Fixed effect model

A fixed effect model is based on the homogeneity assumption: \(\theta_i = \theta\) for all \(i\). The estimate \(\hat \theta\) of the overall treatment effect \(\theta\) is then a weighted average of study-specific estimates \(\hat \theta_i\) with inverse variance weights \(w_i=1/v_i\):

\[\begin{equation*} \hat \theta = \frac{\sum w_i \hat \theta_i}{\sum w_i} \mbox{ with } \mathop{\mathrm{se}}(\hat \theta) = 1/\sqrt{\sum w_i}. \end{equation*}\]

The overall effect is usually presented in the forest plot as in Figure 14.2 for the preeclampsia Example 14.1. This figure is generated with the following R code:

library(meta)
meta1 <- metabin(event.e = treatedPre, n.e = treatedTot, 
                 event.c = controlPre, n.c = controlTot, 
                 sm = "OR" , method = "Inverse", studlab = study)
forest(meta1, comb.fixed = TRUE, comb.random = TRUE) 
Results with forest plot from package meta for a fixed effect analysis in Example \@ref(exm:preeclampsia)

Figure 14.2: Results with forest plot from package meta for a fixed effect analysis in Example 14.1

14.2.2 Cochran’s test for heterogeneity

Under the homogeneity assumption, we have

\[\begin{equation*} Q = \sum w_i (\hat \theta_i - \hat \theta)^2 \mathrel{\overset{a}{\thicksim}}\chi^2_{n-1} \end{equation*}\]

This can be used to calculate the \(p\)-value of Cochran’s test for heterogeneity. But the test “lacks power” and its usefulness in the context of meta-analysis is limited.

Example 14.2 For the preeclampsia data (Example 14.1), this test yields \(Q = 27.3\) at \(n-1=8\) degrees of freedom (\(p=0.00064\)). There is strong evidence for heterogeneity between studies. A fixed effect model is questionable.

14.2.3 Random effects model

A random effects model assumes that the \(\theta_i\)’s come from a normal distribution with mean \(\theta^*\) and heterogeneity variance \(\tau^2\):

\[\begin{equation*} \hat \theta_i \,\vert\,\theta_i \sim \mathop{\mathrm{N}}(\theta_i, v_i) \quad \mbox{ and } \quad \theta_i \sim \mathop{\mathrm{N}}(\theta^*, \tau^2), \end{equation*}\]

so marginally

\[\begin{equation*} \hat \theta_i \sim \mathop{\mathrm{N}}(\theta^*, v_i + \tau^2). \end{equation*}\]

The overall effect estimate \(\hat \theta^*\) and its standard error are now:

\[\begin{equation*} \hat \theta^* = \frac{\sum w_i^* \hat \theta_i}{\sum w_i^*}, \quad \mathop{\mathrm{se}}(\hat \theta^*) = {1}/{\sqrt{\sum w_i^*}}, \end{equation*}\]

with weights \(w_i^* = {1}/{(v_i + \tau^2)}\). Compared to the fixed effect model, the CIs for the overall effect will become wider and large studies will obtain less weight.

14.2.3.1 Estimate of heterogeneity variance and Higgins’ \(I^2\)

The moment estimator of the heterogeneity variance compares the \(Q\)-statistic to its expectation \(n-1\) under homogeneity. Truncation at zero and appropriate scaling gives

\[\begin{equation*} \hat \tau^2 = \left\{Q-(n-1)\right\}_{+} / \left\{\textstyle\sum w_i - \textstyle\sum w_i^2/\textstyle\sum w_i\right\} \end{equation*}\]

The standard DerSimonian-Laird approach plugs \(\hat \tau^2\) in \(w_i^* = {1}/{(v_i + \tau^2)}\), the alternative Hartung-Knapp adjustment accounts for the uncertainty of \(\hat \tau^2\).

Easier to interpret is Higgins’ \(I^2\), the percentage of variance that is attributable to study heterogeneity. The value of \(I^2\) will somewhat depend on the method used to estimate \(\tau^2\), although in practice this aspect can be neglected.

Figure 14.2 shows also the results of a random effects analysis, together with the estimated Higgins’ \(I^2\) and heterogeneity variance \(\tau^2\), in the preeclampsia Example 14.1.

14.3 Reporting Bias

Reporting bias occurs when the publication of research results depends on their nature and direction. Sources of reporting bias are:

  • Publication bias: Failure to publish due to negative or null findings (on the side of the researchers or referees/editors/journals).
  • Outcome reporting bias: Selective reporting of outcomes (due to changes in research plan).
  • Selective citation of positive results.

Consequently, there is danger of false conclusions and patient harm.

14.3.1 Funnel plot

A funnel plot is a scatter plot of a measure of study size, usually the (reversed) standard error, against the estimated treatment effects from individual studies. See the funnel plot for the preeclampsia data as an example in Figure 14.3. The smaller the study size, the wider the spread of the treatment effects and vice versa. If there is no bias, the point cloud has the form of a funnel (symmetrical). If there is bias, it is asymmetrical. However, it is an explorative tool and there is no quantitative information on the amount or the source of the bias. Funnel plots and tools for meta-analysis in R are provided by packages meta and rmeta.

Funnel plot for preeclampsia data from Example \@ref(exm:preeclampsia)

Figure 14.3: Funnel plot for preeclampsia data from Example 14.1

Example 14.3 Passive smoking: Consider a meta-analysis of 37 studies of the effect of environmental tobacco smoke on the risk of lung cancer in lifetime non-smokers (Hackshaw 1998). Spouses of smokers and non-smokers have been compared in terms of log relative risk. Are the results shown in a funnel plot in Figure 14.4 affected by publication bias?

Funnel plot of results in the meta-analysis about risk of lung cancer from passive smoking by @hackshaw1998.

Figure 14.4: Funnel plot of results in the meta-analysis about risk of lung cancer from passive smoking by Hackshaw (1998).

Smaller studies tend to show greater effects, so there is some asymmetry. However, only visual inspection is not enough to decide if it is real bias or chance.

14.3.2 The trim-and-fill method

Based on key assumptions that

  • publication bias is the reason of funnel plot asymmetry and
  • studies with negative findings are suppressed,

the idea of the trim-and-fill method is to:

  1. Trim-off the “asymmetric” side of a funnel plot, after estimating the number of studies in this group.
  2. Use the symmetric remainder to estimate the “true center”.
  3. Impute trimmed studies and their missing “counterparts” around the center.
  4. Estimate \(\theta\) and its variance based on the “filled” funnel plot.

Example 14.4 Figure 14.5 illustrates the trim-and-fill method for 11 studies of the effect of using gangliosides in reducing case fatality and disability in acute ischaemic stroke.

Figure 14.5: Trim-and-fill method in Example 14.4. The black circles represent the 11 observed studies whereas the red studies on the left are imputed, based on symmetrically ‘mirroring’ the trimmed studies on the right.

Trim-and-fill with R:

library(meta) # load the library
gs <- as.matrix(read.table("data/ganglioside.txt", header = TRUE))
tf_gs <- trimfill(x = gs[,1], seTE = gs[,2], 
                  ma.fixed = FALSE, type = "L", silent = TRUE)
summary(tf_gs)
##                               95%-CI %W(random)
## 1          -0.2000 [-1.0036; 0.6036]        7.4
## 2          -0.0700 [-0.4228; 0.2828]       38.3
## 3           0.0400 [-0.5480; 0.6280]       13.8
## 4           0.1600 [-0.8788; 1.1988]        4.4
## 5           0.2100 [-0.7896; 1.2096]        4.8
## 6           0.2700 [-0.3768; 0.9168]       11.4
## 7           0.5300 [-0.9204; 1.9804]        2.3
## 8           0.5600 [-1.5568; 2.6768]        1.1
## 9           0.8000 [-0.4152; 2.0152]        3.2
## 10          1.0800 [-0.2136; 2.3736]        2.9
## 11          2.1100 [-0.9279; 5.1479]        0.5
## Filled: 7  -0.5005 [-1.9509; 0.9498]        2.3
## Filled: 8  -0.5305 [-2.6473; 1.5862]        1.1
## Filled: 9  -0.7705 [-1.9857; 0.4446]        3.2
## Filled: 10 -1.0505 [-2.3441; 0.2430]        2.9
## Filled: 11 -2.0805 [-5.1185; 0.9574]        0.5
## 
## Number of studies: k = 16 (with 5 added studies)
## 
##                                        95%-CI    z p-value
## Random effects model 0.0147 [-0.2037; 0.2332] 0.13  0.8948
## 
## Quantifying heterogeneity:
##  tau^2 < 0.0001 [0.0000; 0.8626]; tau = 0.0022 [0.0000; 0.9288]
##  I^2 = 0.0% [0.0%; 52.3%]; H = 1.00 [1.00; 1.45]
## 
## Test of heterogeneity:
##      Q d.f. p-value
##  14.88   15  0.4604
## 
## Details on meta-analytical method:
## - Inverse variance method
## - Restricted maximum-likelihood estimator for tau^2
## - Q-Profile method for confidence interval of tau^2 and tau
## - Trim-and-fill method to adjust for funnel plot asymmetry (L-estimator)

Example 14.5 Data from the passive smoking example is illustrated in Figure 14.6. Results from the trim-and-fill method are shown in Figure 14.7. The summary effect with (in red) and without (in black) the results from the imputed studies is shown at the bottom.

Funnel plot in the passive smoking Example \@ref(exm:smoking).

Figure 14.6: Funnel plot in the passive smoking Example 14.5.

Model Number of studies OR 95% CI
Observed 37 1.24 (1.13, 1.37)
{ Filled} 44 1.19 (1.08, 1.31)

Figure 14.7: Trim-and-fill method in the passive smoking Example 14.5.

14.3.3 Tests for funnel plot asymmetry

If visual inspection of a funnel plot is not enough, perform a statistical test. Popular methods are: Rank correlation method or Egger’s test (weighted regression).

14.3.3.1 Begg’s rank correlation test

Begg’s rank correlation test computes standardized treatment estimates

\[\begin{equation*} \theta_i^*=\frac{\theta_i-\hat{\theta}}{\sqrt{v_i^*}}, \end{equation*}\]

where \(\hat{\theta}\) is the fixed-effect estimate of the summary effect and \(v_i^*=v_i - 1/(\sum v_j^{-1})\) is the variance of \(\theta_i-\hat{\theta}\). Then, it tests the null hypothesis that Kendell’s rank correlation (Kendell’s \(\tau\)) between \(\theta_i^*\) and \(v_i^*\) is zero. This test may have low power if the number of studies is small.

A rank correlation test can be performed using R as shown below for the passive smoking example:

head(ps)
##     lnRR selnRR
## 1  0.231  0.100
## 2 -0.030  0.112
## 3 -0.236  0.127
## 4  0.166  0.137
## 5  0.182  0.150
## 6  0.501  0.180
metabias(ps$lnRR, ps$selnRR, method="Begg")
## Rank correlation test of funnel plot asymmetry
## 
## Test result: z = 1.27, p-value = 0.2044
## Bias estimate: 97.0000 (SE = 76.4395)
## 
## Reference: Begg & Mazumdar (1993), Biometrics

14.3.3.2 Egger’s weighted regression

The idea is to perform weighted regression of \(\hat \theta_i\) on \(s_i\),

\[\begin{equation*} \hat \theta_i= a + b \cdot s_i + \mbox{error}, \end{equation*}\]

with weights \(w_i\). The slope \(b\) is a measure of bias.

Example 14.6 Egger’s weighted regression can be performed using R as shown below for the passive smoking example:

theta <- ps$lnRR
s <- ps$selnRR
egger1 <- lm(theta ~ s, weights = 1/s^2)
% latex table generated in R 4.3.3 by xtable 1.8-4 package % Tue Oct 15 15:51:27 2024
Coefficient 95%-confidence interval \(p\)-value
Intercept 0.009 from -0.16 to 0.18 0.92
s 0.89 from 0.13 to 1.66 0.024

Figure 14.8 shows a visualization of Egger’s weighted regression in the passive smoking example.

Visualization of Egger's weighted regression in the passive smoking example.

Figure 14.8: Visualization of Egger’s weighted regression in the passive smoking example.

Egger’s test in R:

metabias(ps$lnRR, ps$selnRR, method="Egger")
## Linear regression test of funnel plot asymmetry
## 
## Test result: t = 2.37, df = 35, p-value = 0.0236
## Bias estimate: 0.8915 (SE = 0.3768)
## 
## Details:
## - multiplicative residual heterogeneity variance (tau^2 = 1.1704)
## - predictor: standard error
## - weight:    inverse variance
## - reference: Egger et al. (1997), BMJ

14.3.3.3 Limit meta-analysis

The aim of a limit meta-analysis is to compute a bias-adjusted overall effect estimate by extrapolating the weighted regression to \(s_i \to 0\). \begin{example}

A limit meta-analysis can be performed in R using the function limitmeta() from the package metasens:

psMeta <- metagen(TE=ps$lnRR, seTE=ps$selnRR)
library(metasens)
limitmeta(psMeta)
## Result of limit meta-analysis:
## 
##  Random effects model                   95%-CI    z
##     Adjusted estimate 0.0982 [-0.0574; 0.2538] 1.24
##   Unadjusted estimate 0.2183 [ 0.1222; 0.3143] 4.45
##      pval
##    0.2160
##  < 0.0001
## 
## Quantifying heterogeneity:
## tau^2 = 0.0220; I^2 = 24.2% [0.0%; 49.8%]; G^2 = 95.0%
## 
## Test of heterogeneity:
##      Q d.f. p-value
##  47.52   36  0.0949
## 
## Test of small-study effects:
##   Q-Q' d.f. p-value
##   6.55    1  0.0105
## 
## Test of residual heterogeneity beyond small-study effects:
##     Q' d.f. p-value
##  40.96   35  0.2252
## 
## Details on adjustment method:
## - expectation (beta0)

14.4 Additional references

M. Bland (2015) (Chapter 17) and J. N. S. Matthews (2006)(Chapter 12)

References

Bland, Martin. 2015. An Introduction to Medical Statistics. Fourth. Oxford University Press.
Egger, M., J. P. T. Higgins, and G. D. Smith. 2022. Systematic Reviews in Health Research – Meta-Analysis in Context. Third. Hoboken NJ, USA: John Wiley & Sons Ltd.
Hackshaw, AK. 1998. “Lung Cancer and Passive Smoking.” Statistical Methods in Medical Research 7 (2): 119–36.
Matthews, John N. S. 2006. Introduction to Randomized Controlled Clinical Trials. Second. Chapman & Hall/CRC.
Schmid, C. H., T. Stijnen, and I. R. White. 2021. Handbook of Meta-Analysis. Boca Raton FL, USA: CRC Press.