14.2 Multivariate Meta-Analysis

Now it is time to delve into our first worked example of a meta-analytic SEM using R. We will begin with the SEM-based approach to multivariate meta-analysis, which we have not covered before. In a multivariate meta-analysis, each study contributes more than one effect size at the same time. This can be helpful when we are studying a problem or condition for which there are several main outcomes, not just one. For example, for some type of treatment, there could be two outcomes which are both deemed important in the literature, and which are therefore assessed in most studies (Schwarzer, Carpenter, and Rücker 2015). In a multivariate meta-analysis, we can estimate the effect sizes for both outcomes simultaneously in one model. By taking the correlation between the two outcomes into account, we can also determine if studies with a high effect size on one outcome also have higher effect sizes on the other outcome, or if there is no association, or even a negative one.

It is of note here that multivariate meta-analysis can also be performed outside a SEM framework (Schwarzer, Carpenter, and Rücker 2015). Nevertheless, as an introduction, we want to show you how multivariate meta-analysis can be performed from a SEM perspective. In this and the following examples, we will work with metaSEM, a magnificent package for meta-analytic SEM developed by Mike Cheung (M. W.-L. Cheung 2015). To begin, as always, install the metaSEM package and load it from your library.
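A minimal setup chunk (assuming the package has not been installed yet) looks like this:

```r
# Install the metaSEM package (this is only needed once)
install.packages("metaSEM")

# Load the package
library(metaSEM)
```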


For my multivariate meta-analysis, I will use the dat.da dataset. This is a fictitious dataset (adapted from the wvs94a data in metaSEM) containing \(k=37\) studies examining the effects of psychotherapy on symptoms of Depression and Anxiety, expressed as the Standardized Mean Difference (SMD). The dataset can be downloaded here.

Let’s have a look at the data first.
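One way to do this (assuming the dataset has already been loaded into our environment as dat.da) is to print its first few rows with head():

```r
# Show the first rows of the dataset
head(dat.da)
```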


As we can see, the dataset contains the effect sizes on both Depression and Anxiety, along with the variances \(v_k\) for both effect sizes. There is also a column called Covariance in which the covariance between Depression and Anxiety reported in each study is stored.

A common problem is that the covariance or correlation between two outcomes is not reported in original studies. If this is the case, we have to estimate the covariance using a reasonable assumption about the correlation between the outcomes. Let us assume for a moment that we do not know the covariance in each study. How can we estimate it? A good approach is to look for previous literature reporting the correlation between the two outcomes, optimally in the same kind of context we are synthesizing. Let us say we found in the literature that Depression and Anxiety are highly correlated in post-tests of psychotherapy trials, with \(r_{D,A} \approx 0.6\). We could then approximate the covariance for each study \(k\) by using \(cov_k = SD_{D_{k}} \times SD_{A_{k}} \times r_{D,A}\). We can do this in R like this:

estimated.cov <- sqrt(dat.da$Depression_var) * sqrt(dat.da$Anxiety_var) * 0.6

Where we take the square root because \(SD = \sqrt{Var}\).

14.2.1 Specifying the Model

To specify the model for the multivariate meta-analysis, we do not have to follow the TSSEM procedure programmatically, nor do we have to specify the RAM matrices. For such a relatively simple model, we can use the meta() function in metaSEM, which pools the studies and specifies and fits the model all in one step.

Here are the most important parameters we have to specify for the meta() function.

| Parameter | Description |
| --- | --- |
| y | The columns of our dataset which contain the effect size data. In a multivariate meta-analysis, we have to combine the effect size columns we want to include using cbind(). |
| v | The columns of our dataset which contain the variances of the effect sizes. In a multivariate meta-analysis, we have to combine the variance columns we want to include using cbind(). We also have to include the column containing the covariance between the effect sizes. The structure of the cbind() call is cbind(variance_1, covariance, variance_2). |
| data | The dataset in which the effect sizes and their variances are stored. |

I will save my meta-analysis model as m.mv. My code then looks like this:

m.mv <- meta(y = cbind(Depression, Anxiety), 
             v = cbind(Depression_var, Covariance, Anxiety_var),
             data = dat.da)

To get the results, I have to plug the m.mv object into the summary() function.
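The call looks like this:

```r
# Print the results of the multivariate meta-analysis
summary(m.mv)
```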

meta(y = cbind(Depression, Anxiety), v = cbind(Depression_var, 
    Covariance, Anxiety_var), data = dat.da)

95% confidence intervals: z statistic approximation
              Estimate   Std.Error      lbound      ubound z value  Pr(>|z|)    
Intercept1  0.50321391  0.01466266  0.47447563  0.53195220 34.3194 < 2.2e-16 ***
Intercept2  0.45553567  0.01748527  0.42126518  0.48980617 26.0525 < 2.2e-16 ***
Tau2_1_1    0.00459325  0.00186286  0.00094210  0.00824439  2.4657  0.013675 *  
Tau2_2_1    0.00287809  0.00167184 -0.00039865  0.00615483  1.7215  0.085158 .  
Tau2_2_2    0.00783290  0.00254639  0.00284206  0.01282374  3.0761  0.002097 ** 
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Q statistic on the homogeneity of effect sizes: 223.8427
Degrees of freedom of the Q statistic: 72
P value of the Q statistic: 0

Heterogeneity indices (based on the estimated Tau2):
Intercept1: I2 (Q statistic)   0.6079
Intercept2: I2 (Q statistic)   0.7220

Number of studies (or clusters): 37
Number of observed statistics: 74
Number of estimated parameters: 5
Degrees of freedom: 69
-2 log likelihood: -140.7221 
OpenMx status1: 0 ("0" or "1": The optimization is considered fine.
Other values may indicate problems.)

14.2.2 Evaluating the Output

Given that the SEM model is fitted using the Maximum Likelihood algorithm, the first thing we always do is check the OpenMx status right at the end of the output. Maximum Likelihood is an optimization procedure in which parameters are changed iteratively until the optimal solution for the data at hand is found. However, especially with more complex models, it can happen that this optimum is not reached even after many iterations; the Maximum Likelihood algorithm will then stop and output the parameter values it has approximated so far. Those values will then very likely be incorrect and should not be trusted. The OpenMx status for my model is 0, which indicates that the Maximum Likelihood estimation worked fine. Had the status been anything other than 0 or 1, however, it would have been necessary for me to rerun the fitting process with this code:
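For this, metaSEM provides the rerun() function, which refits a model using the estimates from the previous run as starting values:

```r
# Refit the model to resolve estimation problems
m.mv <- rerun(m.mv)
```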


In the output, the two pooled effect sizes are shown as Intercept1 and Intercept2. The effect sizes are numbered by the order in which we inserted them into our call to meta(). We can see that the pooled effect sizes are \(SMD_{D} = 0.50\) for Depression and \(SMD_{A} = 0.46\) for Anxiety. Both effect sizes are significant. Under Heterogeneity indices, we can also see the values of \(I^2\) (see Chapter 6), which are \(I^{2}_D = 61\%\) and \(I^{2}_A = 72\%\), indicating substantial between-study heterogeneity.

The direct estimates of the between-study heterogeneity \(\tau^2\) are also provided. We see that there are not two but three estimates. To understand what this means, we can extract the "random" values from the m.mv object first.

tau.coefs <- coef(m.mv, select = "random")

Then, we can use the vec2symMat() function on the coefficients to create a matrix. We then give the matrix rows and columns the names of our variables, Depression and Anxiety.

tc.mat <- metaSEM::vec2symMat(tau.coefs)
dimnames(tc.mat)[[1]] <- dimnames(tc.mat)[[2]] <- c("Depression", "Anxiety")

##             Depression     Anxiety
## Depression 0.004593246 0.002878092
## Anxiety    0.002878092 0.007832899

We now understand better what the three \(\tau^2\) values mean: the diagonal elements of the matrix represent the between-study variance (heterogeneity) of each outcome, while the off-diagonal elements contain the estimated covariance between Depression and Anxiety. Given that a covariance is just an unstandardized correlation, we can also calculate the correlations using the cov2cor() function.
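Applying cov2cor() (from base R's stats package) to the tc.mat matrix we created above:

```r
# Standardize the covariance matrix to obtain correlations
cov2cor(tc.mat)
```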

##            Depression   Anxiety
## Depression  1.0000000 0.4798257
## Anxiety     0.4798257 1.0000000

We see that, quite logically, the diagonal elements of the matrix are 1. The correlation between effect sizes on Depression and effect sizes on Anxiety is \(r_{D,A} = 0.48\). This is an interesting finding: it shows that there is a positive association between a treatment’s effect on Depression and its effect on Anxiety. Treatments which have high effects on Depression are very likely to have high effects on Anxiety too.

It is of note that the confidence intervals presented in the output are so-called Wald-type intervals. Such Wald-type intervals have been found to be prone to inaccuracy, especially in small samples (DiCiccio and Efron 1996). It may thus be valuable to construct confidence intervals in another way, using so-called likelihood-based confidence intervals. We can obtain these CIs by rerunning the meta() function and additionally specifying intervals.type = "LB".

m.mv <- meta(y = cbind(Depression, Anxiety), 
             v = cbind(Depression_var, Covariance, Anxiety_var),
             data = dat.da, 
             intervals.type = "LB")

We have already seen that the output for our m.mv object contains estimates of the between-study heterogeneity \(\tau^2\). If we recall our knowledge of the meta-analysis model, we can therefore conclude that the model we just fitted is a random-effects model. The meta() function uses a random-effects model by default, because the fixed-effect model is simply a special case of the random-effects model in which \(\tau^2 = 0\). From the \(I^2\) values in our output, we can conclude that the random-effects model is indeed indicated. However, if we wanted to fit a fixed-effect model, we could redo the analysis, this time adding the parameter RE.constraints = matrix(0, nrow=2, ncol=2), which creates a matrix of zeros with the same structure as the tc.mat matrix, constraining all \(\tau^2\) values to zero:

m.mv <- meta(y = cbind(Depression, Anxiety), 
             v = cbind(Depression_var, Covariance, Anxiety_var),
             data = dat.da, 
             RE.constraints = matrix(0, nrow=2, ncol=2))

14.2.3 Plotting the Model

To plot the multivariate meta-analysis model, we can simply use the plot() function. I will make some additional specifications here to change the appearance of the plot slightly. If you want to see all styling options, you can paste ?metaSEM::plot.meta into the Console and then hit Enter.

plot(m.mv,
     axis.labels = c("Depression", "Anxiety"), 
     randeff.ellipse.col = "black",
     univariate.arrows.col = "black", 
     univariate.polygon.col = "blue")

Let us go through the elements of the plot one by one:

  • We see that the plot has two axes: the x-axis, displaying the effect size on Depression, and the y-axis displaying the effect size on Anxiety.
  • We also see the pooled effect and its 95% CI for both outcomes, symbolized by the blue diamond and the black arrows.
  • In the middle of the plot, the pooled effect for both variables is represented by another blue diamond.
  • The red ellipse represents the 95% confidence interval ellipse for our pooled effect sizes.
  • The black ellipse represents the space in which we expect 95% of all studies to fall based on the random-effects model.
  • The black dots represent the individual studies, and the dashed lines are the 95% CIs of the individual studies.



Schwarzer, Guido, James R. Carpenter, and Gerta Rücker. 2015. Meta-Analysis with R. Springer.

Cheung, Mike W.-L. 2015. “metaSEM: An R Package for Meta-Analysis Using Structural Equation Modeling.” Frontiers in Psychology 5 (1521). https://doi.org/10.3389/fpsyg.2014.01521.

DiCiccio, Thomas J., and Bradley Efron. 1996. “Bootstrap Confidence Intervals.” Statistical Science, 189–212.