Chapter 11 Network Meta-Analysis in R

Often when performing a meta-analysis on the effectiveness of certain interventions, we are less interested in whether one particular intervention is effective (e.g., because it is quite well established that the intervention is efficacious) than in whether one intervention is more or less effective than another type of intervention for some condition or population. Yet, once we are interested in head-to-head comparisons between two treatments, we often face the problem that only very few, if any, randomized controlled trials have compared the effects of two interventions directly. This makes it very hard, if not impossible, to conduct conventional meta-analyses to answer questions on the comparative effects of two or more interventions for one indication or outcome (e.g., different types of psychotherapy for major depression).

Nevertheless, while direct comparisons between two or more interventions may not exist, it is often the case that such interventions were evaluated in separate randomized controlled trials using the same control group (e.g., waitlist control groups, or placebos). This means that we do have indirect comparisons of the effects of different interventions, because they were compared to the same control condition. Multiple-treatments meta-analysis (MTM) is an extension of conventional meta-analysis which allows us to incorporate such indirect comparisons, and thus the simultaneous analysis of several interventions.

These meta-analysis methods are also referred to as network meta-analyses (Dias et al. 2013) or mixed-treatment comparison meta-analyses (Valkenhoef et al. 2012), because such methods allow for multiple direct and indirect intervention comparisons to be integrated into our analysis, which can be formalized as a “network” of comparisons. Network meta-analysis is a “hot” research topic, and in the last decade, its methodology has been increasingly picked up by applied researchers in the medical field (Schwarzer, Carpenter, and Rücker 2015; Efthimiou et al. 2016). However, network meta-analysis also comes with additional challenges and potential pitfalls, particularly in terms of heterogeneity or network inconsistency (Salanti et al. 2014; Schwarzer, Carpenter, and Rücker 2015). Therefore, it is very important to first discuss the core components and assumptions of network meta-analytical models before we proceed. The underpinnings of network meta-analysis can be a little abstract at times; we will therefore go through the essential parts in small steps to get a better understanding of the idea behind network meta-analysis models.

The idea behind network meta-analysis

First, we have to understand what meta-analysts mean when they talk about a “network” of treatments. Let us first consider a simple pairwise comparison between two conditions. The example we present here is no different from the kind of data we discussed in Chapter 4, where we showed how to perform a conventional meta-analysis. Let us assume we have a randomized controlled trial \(i\), which compared the effect of one treatment A (e.g., Cognitive Behavioral Therapy for depression) to another condition B (e.g., a waitlist control group). We can illustrate this comparison in a graphical way like this:

The form in which the treatment comparison is displayed here is called a graph. Graphs are structures used to model how different objects relate to each other, and there is an entire subfield of mathematics devoted to them: graph theory. The graph has two core components. The first component is the two blue points (so-called nodes), which represent the two conditions \(A\) and \(B\) in trial \(i\). The second component is the line connecting the two nodes, which is called an edge. This edge represents how \(A\) and \(B\) relate to each other. In our case, the interpretation of this line is quite straightforward: we can describe the relationship between \(A\) and \(B\) as the effect size \(\hat\theta_{i,A,B}\) we observe for the comparison between \(A\) and \(B\). This effect size can be expressed through different metrics, such as the standardized mean difference (SMD), Hedges’ g, Odds Ratio, Incidence Rate Ratio, and so forth, depending on the context.

Now let us proceed by assuming that we also have data of another study \(j\). In this trial, the condition \(B\) (which we imagined to be a waitlist control group) was also included. But instead of using treatment \(A\), like in the first study, this study used another treatment \(C\) (e.g., psychodynamic therapy), which was compared to \(B\). We can add this information to our graph:

This creates our first small network. We can see now that we have two effect size estimates in this network: \(\hat\theta_{i,A,B}\), comparing \(A\) to \(B\), and \(\hat\theta_{j,C,B}\), the comparison between \(C\) and \(B\). Since we took both of these effect size estimates from actual comparisons which were made in “real” randomized trials, we call such information direct evidence. Thus, we can formalize these effect sizes as \(\hat\theta_{B,A}^{direct}\) and \(\hat\theta_{B,C}^{direct}\). Notice how \(B\) comes first in this notation because we determine this condition to be our reference condition, since both effect size estimates contain this condition.

In this first graph, all nodes (conditions) are either directly or indirectly connected. The \(B\) condition (our waitlist control group) is directly connected to all other nodes, i.e., it takes only one “step” on the graph to get from \(B\) to the other nodes \(A\) and \(C\): \(B \rightarrow A, B \rightarrow C\). \(A\) and \(C\) both have only one direct connection, and they both connect to \(B\): \(A \rightarrow B\) and \(C \rightarrow B\). However, there is an indirect connection between \(A\) and \(C\), where \(B\) serves as the link, or “bridge”, between the two conditions: \(A \rightarrow B \rightarrow C\). This indirect connection means that we have indirect evidence for the relationship between \(A\) and \(C\), which we can infer from the information the entire network provides us with:
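The same network can also be written down in code. The following is only a minimal sketch in base R: it stores the two direct comparisons in an adjacency matrix and checks which connections exist. The object names (`adj`, `two_step`, and so on) are illustrative assumptions and do not come from any package.

```r
# Treatments (nodes) in the example network
treatments <- c("A", "B", "C")

# Adjacency matrix: 1 means the two conditions were compared directly in a trial
adj <- matrix(0, nrow = 3, ncol = 3,
              dimnames = list(treatments, treatments))
adj["A", "B"] <- adj["B", "A"] <- 1  # trial i: A vs. B
adj["C", "B"] <- adj["B", "C"] <- 1  # trial j: C vs. B

# Is there a direct connection between A and C (one step on the graph)?
direct_AC <- adj["A", "C"] == 1

# Is there an indirect connection via one intermediate node (two steps)?
# Entry [A, C] of adj %*% adj counts the two-step paths from A to C.
two_step <- adj %*% adj
indirect_AC <- two_step["A", "C"] > 0

direct_AC    # FALSE: no trial compared A and C
indirect_AC  # TRUE: A and C are linked through B
```

In this small example, the only two-step path from \(A\) to \(C\) runs through \(B\), which is exactly the “bridge” described above.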

Using the information from our direct evidence, we can therefore calculate the indirect evidence \(\hat\theta_{A,C}^{indirect}\), the effect size between \(A\) and \(C\) (e.g., Cognitive-Behavioral Therapy and Psychodynamic Therapy) like this:

\[\begin{align} \tag{1} \hat\theta_{A,C}^{indirect} = \hat\theta_{B,A}^{direct} - \hat\theta_{B,C}^{direct} \end{align}\]

This is a crucial component of network meta-analytical models. This equation effectively lets us calculate an estimate of the effect size of a comparison, even if the two conditions were never directly compared in an RCT. Network meta-analysis in general means that we can combine both direct and indirect evidence in one model to estimate the effect sizes resulting from several treatment comparisons. This also means that even if there is direct evidence for a specific comparison (e.g., \(A-B\)), we can add information from indirect evidence to further strengthen our model and make our effect size estimates even more precise (thus the name mixed-treatment comparison meta-analysis). The example we gave you here should illustrate the great strengths of network meta-analytical models:

  • They allow us to pool all available information in a set of connected studies in one analysis. Imagine how we would usually deal in pairwise meta-analysis with trials comparing different treatments to, say, a placebo. We would have to pool each comparison (e.g., treatment \(A\) compared to a placebo, treatment \(B\) compared to a placebo, treatment \(A\) compared to treatment \(B\)) in a separate meta-analysis.
  • They allow us to incorporate indirect evidence in a network, which we have to discard in conventional meta-analysis. Usually in pairwise meta-analysis, we can only pool direct evidence from comparisons which were actually conducted and reported in randomized controlled trials.
  • If all assumptions are met, and results are conclusive enough, network meta-analyses allow us to draw cogent conclusions concerning which type of treatment may be more or less preferable for a specific target population under study.

Of course, all of this sounds intriguing, but there are some important limitations we have to consider here. First, we should look at how the variance of the indirect effect size estimate is calculated:

\[\begin{align} \tag{2} V_{A,C}^{indirect} = V_{B,A}^{direct} + V_{B,C}^{direct} \end{align}\]

As you can see, to calculate the variance of the indirect comparison, we add up the variances of the direct comparisons. This means that an effect size estimated from indirect evidence will always have a greater variance, and thus a lower precision, than either of the direct estimates it is derived from (Dias et al. 2018). This, of course, makes quite a lot of sense: we have a higher confidence in effect size estimates which were actually observed (because researchers actually performed a study using this comparison), and thus give them a higher weight, compared to effect size estimates derived from indirect evidence.
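To make equations \((1)\) and \((2)\) more tangible, here is a small worked sketch in base R. All numbers are hypothetical values chosen for illustration (they are not taken from any real trial), and the confidence interval assumes a normal approximation.

```r
# Hypothetical direct evidence (SMDs with B as the reference condition)
theta_BA <- -0.50; se_BA <- 0.15   # trial i: A vs. B
theta_BC <- -0.30; se_BC <- 0.20   # trial j: C vs. B

# Equation (1): indirect estimate for the comparison of A and C
theta_AC_indirect <- theta_BA - theta_BC

# Equation (2): the variances of the two direct estimates add up
var_AC_indirect <- se_BA^2 + se_BC^2
se_AC_indirect  <- sqrt(var_AC_indirect)

# 95% confidence interval based on a normal approximation
ci_AC <- theta_AC_indirect + c(-1, 1) * qnorm(0.975) * se_AC_indirect

theta_AC_indirect  # -0.2
se_AC_indirect     # 0.25
ci_AC
```

Note that the standard error of the indirect estimate (0.25) is larger than the standard error of either direct estimate (0.15 and 0.20), which is the loss of precision described above.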

Furthermore, an essential point is that equation \((1)\) only holds if a core assumption of network meta-analysis is met: the assumption of transitivity, or statistically speaking, network consistency (Efthimiou et al. 2016). We will explain what this means in the following, and why this assumption is important.

Transitivity and Consistency

Although network meta-analysis is certainly a valuable extension of the meta-analytical arsenal, the validity of this method has not remained uncontested. Most of the criticism of network meta-analysis revolves around, as you might have guessed, the use of indirect evidence, especially when direct evidence for a comparison is actually available (Edwards et al. 2009; Ioannidis 2006). The key issue addressed here is that while participants in a randomized controlled trial (which we use as direct evidence in network meta-analysis) are randomly allocated to one of the treatment conditions (e.g., \(A\) and \(B\)), the treatment conditions themselves (\(A, B, ...\)) were not randomly selected in the trials included in our network (Edwards et al. 2009). This is of course quite logical, since researchers are not required to determine which conditions they compare in their trial through, say, a dice roll before they are allowed to roll out their study. However, the fact that the selected treatment comparisons in our study pool will hardly ever follow a random pattern across trials does not constitute a problem for network meta-analytical models per se (Dias et al. 2018). In fact, what is required for equations \((1)\) and \((2)\) to hold is the following: the selection, or non-selection, of a specific comparison in a specific trial must be unrelated to the true (relative) effect size of that comparison (Dias et al. 2013). This statement is very abstract, so let us elaborate on it a little.

This requirement is derived from the transitivity assumption of network meta-analyses. There is disagreement about whether this is an additional assumption of network meta-analysis, or simply an extension of the assumptions of standard pairwise meta-analysis; this disagreement may also be partly caused by an inconsistent usage of terms in the literature (Dias et al. 2018; Efthimiou et al. 2016; Song et al. 2009; Lu and Ades 2009). The transitivity assumption’s core tenet is that we can combine direct evidence (e.g. from the comparisons \(A-B\) and \(C-B\)) to create indirect evidence about a related comparison (e.g. \(A-C\)), as we have already expressed in formula \((1)\) above (Efthimiou et al. 2016).

The assumption of transitivity also relates to, or is derived from, the exchangeability assumption we described in our chapter on the random-effects model. This assumption presupposes that an effect size \(\hat\theta_i\) of a comparison \(i\) is randomly drawn from an “overarching” distribution of true effect sizes, the mean of which can be estimated. Translating this assumption to our scenario, we can think about network meta-analytical models as consisting of a set of \(K\) trials which each contain all possible \(M\) treatment comparisons (e.g., \(A-B\), \(A-C\), \(B-C\), and so forth), but in which some of the treatment comparisons have been “deleted”, and are thus “missing” in some trials. The reason for this, of course, is that studies in practice do not assess all possible treatment options for a specific condition, but only two or three (Dias et al. 2018). The key assumption here is that the relative effect of a comparison, e.g., \(A-B\), is exchangeable between trials, no matter if a trial actually assessed this comparison or if this comparison is “missing”. The assumption of exchangeability thus basically means that the effect size \(\hat\theta\) of a specific comparison (e.g., \(A-B\)) must stem from a random draw from the same overarching distribution of effect sizes, no matter if this effect size is derived through direct or indirect evidence.

The assumption of transitivity may be violated when covariates, or effect modifiers (such as the age of the studied populations, or the treatment intensity), are not evenly distributed across trials reporting data on, for example, \(A-B\) and \(C-B\) comparisons (Song et al. 2009). Transitivity as such cannot be tested statistically, but the risk of violating this assumption may be attenuated by only including studies for which the population, methodology and studied target condition are as similar as possible (Salanti et al. 2014).
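A small numerical sketch in base R may help to show why unevenly distributed effect modifiers are a problem. The “true” effects below are entirely hypothetical and were chosen only for illustration; we assume that age modifies the effect of both treatments relative to \(B\).

```r
# Hypothetical true effects relative to B, by population (effect modifier: age)
true_BA <- c(young = -0.6, old = -0.3)  # A vs. B
true_BC <- c(young = -0.4, old = -0.2)  # C vs. B

# True effect of A vs. C within each population (equation 1 applied per group)
true_AC <- true_BA - true_BC            # young: -0.2, old: -0.1

# Scenario 1: A-B trials enrolled young samples, C-B trials enrolled old samples
indirect_mixed <- true_BA[["young"]] - true_BC[["old"]]      # -0.4

# Scenario 2: both sets of trials enrolled comparable (young) samples
indirect_matched <- true_BA[["young"]] - true_BC[["young"]]  # -0.2

true_AC
indirect_mixed
indirect_matched
```

In the first scenario, the indirect estimate (-0.4) matches the true \(A-C\) effect in neither population, even though every single trial is unbiased. In the second scenario, where the effect modifier is balanced across comparisons, the indirect estimate recovers the true effect for that population.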

The statistical manifestation of transitivity has been referred to as consistency (Efthimiou et al. 2016; Cipriani et al. 2013). Consistency means that the direct evidence in a network for the effect size between two treatments (e.g. \(A\) and \(B\)) does not differ from the indirect evidence calculated for that same comparison (Schwarzer, Carpenter, and Rücker 2015):

\[\theta_{A,B}^{indirect} = \theta_{A,B}^{direct}\]

Several methods have been proposed to evaluate inconsistency in network meta-analysis models, including net heat plots (Krahn, Binder, and König 2013) and node splitting (Dias et al. 2010). We will describe these methods in further detail in the following two subchapters where we explain how to perform a network meta-analysis in R.
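As a rough numerical illustration of what consistency means, the sketch below compares a direct and an indirect estimate of the same comparison using a simple z-test on their difference. This is a deliberately simplified check under stated assumptions (hypothetical numbers, and direct and indirect estimates treated as independent); it is not an implementation of net heat plots or node splitting.

```r
# Hypothetical estimates for the same comparison (A vs. B)
theta_direct   <- -0.45; se_direct   <- 0.12  # from trials comparing A and B
theta_indirect <- -0.20; se_indirect <- 0.25  # derived via a common comparator

# Difference between direct and indirect evidence
diff_est <- theta_direct - theta_indirect

# Assuming independence, the variance of the difference is the sum of variances
se_diff <- sqrt(se_direct^2 + se_indirect^2)

# z-test: under consistency, the expected difference is zero
z <- diff_est / se_diff
p <- 2 * pnorm(-abs(z))

c(difference = diff_est, se = se_diff, z = z, p = p)
```

A large absolute z-value (small p-value) would suggest that direct and indirect evidence disagree. With few studies, however, such a test has low power, so a non-significant result should not be read as proof of consistency.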

Above, we described the basic theory and assumptions of network meta-analysis models. We illustrated these properties using a simple network with three nodes and edges. In practice, however, the number of treatments we as meta-analysts want to include in a network meta-analysis may be much higher, resulting in much more complex networks, which may look more like this:

However, with an increasing number of treatments \(S\) in our network, the number of (direct and indirect) pairwise comparisons \(C\) we have to estimate skyrockets (Dias et al. 2018):
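For a network of \(S\) treatments, the number of distinct treatment pairs is \(C = S(S-1)/2\), so it grows quadratically with \(S\). A short base R sketch makes this concrete; the values of \(S\) are arbitrary and chosen only for illustration.

```r
# Number of treatments in the network (arbitrary illustration values)
S <- c(3, 5, 10, 20)

# Number of distinct pairwise comparisons: C = S * (S - 1) / 2
n_comparisons <- choose(S, 2)

data.frame(treatments = S, comparisons = n_comparisons)
# 3 treatments -> 3 comparisons, 5 -> 10, 10 -> 45, 20 -> 190
```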

We therefore need a computational model which allows us to efficiently pool all available network data in a coherent and internally consistent manner. Several statistical approaches have been developed for network meta-analysis (Efthimiou et al. 2016). In the following subchapters, we will discuss two major approaches, a frequentist as well as a Bayesian hierarchical model, and how they can be implemented in R.

Which modeling approach should I use?

The good news is that while network meta-analysis models may differ in their statistical approach, they should produce the same results when the sample size is large (Shim et al. 2019), and none of them is more or less valid than the other. You can therefore safely choose one or the other approach, depending on which one you find more intuitive, or depending on the functionality of the package which implements it (Efthimiou et al. 2016). One asset of frequentist models is that this approach is very common, and used for most applications in the statistical world. This means that many people might understand the results this method produces more easily. However, the frequentist network meta-analysis package netmeta, which we will present in the following, does not yet provide a straightforward way to conduct meta-regression, while this is possible using the Bayesian approach.

In practice, a useful strategy may also be to choose one approach as the main analysis, and then perform the other approach as a sensitivity analysis (e.g. Cipriani et al. 2018). This makes it possible to compare where the two methods come to the same conclusion (which may indicate that these specific results are robust), and where they diverge.


Dias, Sofia, Alex J Sutton, AE Ades, and Nicky J Welton. 2013. “Evidence Synthesis for Decision Making 2: A Generalized Linear Modeling Framework for Pairwise and Network Meta-Analysis of Randomized Controlled Trials.” Medical Decision Making 33 (5). Sage Publications: 607–17.

Valkenhoef, Gert van, Guobing Lu, Bert de Brock, Hans Hillege, AE Ades, and Nicky J Welton. 2012. “Automating Network Meta-Analysis.” Research Synthesis Methods 3 (4). Wiley Online Library: 285–99.

Schwarzer, Guido, James R Carpenter, and Gerta Rücker. 2015. Meta-Analysis with R. Springer.

Efthimiou, Orestis, Thomas PA Debray, Gert van Valkenhoef, Sven Trelle, Klea Panayidou, Karel GM Moons, Johannes B Reitsma, Aijing Shang, Georgia Salanti, and GetReal Methods Review Group. 2016. “GetReal in Network Meta-Analysis: A Review of the Methodology.” Research Synthesis Methods 7 (3). Wiley Online Library: 236–63.

Salanti, Georgia, Cinzia Del Giovane, Anna Chaimani, Deborah M Caldwell, and Julian PT Higgins. 2014. “Evaluating the Quality of Evidence from a Network Meta-Analysis.” PloS One 9 (7). Public Library of Science: e99682.

Dias, Sofia, AE Ades, Nicky J Welton, Jeroen P Jansen, and Alexander J Sutton. 2018. Network Meta-Analysis for Decision-Making. Wiley.

Edwards, SJ, MJ Clarke, S Wordsworth, and J Borrill. 2009. “Indirect Comparisons of Treatments Based on Systematic Reviews of Randomised Controlled Trials.” International Journal of Clinical Practice 63 (6). Wiley Online Library: 841–54.

Ioannidis, John PA. 2006. “Indirect Comparisons: The Mesh and Mess of Clinical Trials.” The Lancet 368 (9546). Elsevier: 1470–2.

Song, Fujian, Yoon K Loke, Tanya Walsh, Anne-Marie Glenny, Alison J Eastwood, and Douglas G Altman. 2009. “Methodological Problems in the Use of Indirect Comparisons for Evaluating Healthcare Interventions: Survey of Published Systematic Reviews.” BMJ 338. British Medical Journal Publishing Group: b1147.

Lu, Guobing, and AE Ades. 2009. “Modeling Between-Trial Variance Structure in Mixed Treatment Comparisons.” Biostatistics 10 (4). Oxford University Press: 792–805.

Cipriani, Andrea, Julian PT Higgins, John R Geddes, and Georgia Salanti. 2013. “Conceptual and Technical Challenges in Network Meta-Analysis.” Annals of Internal Medicine 159 (2). Am Coll Physicians: 130–37.

Krahn, Ulrike, Harald Binder, and Jochem König. 2013. “A Graphical Tool for Locating Inconsistency in Network Meta-Analyses.” BMC Medical Research Methodology 13 (1). BioMed Central: 35.

Dias, S, NJ Welton, DM Caldwell, and AE Ades. 2010. “Checking Consistency in Mixed Treatment Comparison Meta-Analysis.” Statistics in Medicine 29 (7-8). Wiley Online Library: 932–44.

Shim, Sung Ryul, Seong-Jang Kim, Jonghoo Lee, and Gerta Rücker. 2019. “Network Meta-Analysis: Application and Practice Using R Software.” Epidemiology and Health 41. Korean Society of Epidemiology: e2019013.

Cipriani, Andrea, Toshi A Furukawa, Georgia Salanti, Anna Chaimani, Lauren Z Atkinson, Yusuke Ogawa, Stefan Leucht, et al. 2018. “Comparative Efficacy and Acceptability of 21 Antidepressant Drugs for the Acute Treatment of Adults with Major Depressive Disorder: A Systematic Review and Network Meta-Analysis.” Focus 16 (4). Am Psychiatric Assoc: 420–29.