Chapter 8 Meta-Regression

Conceptually, meta-regression does not differ much from a subgroup analysis. In fact, a subgroup analysis with more than two groups is nothing more than a meta-regression with a categorical predictor. However, meta-regression also allows us to use continuous variables as predictors and to check whether they are associated with differences in effect size.

The idea behind meta-regression

You may have already performed regressions with regular data in which participants or patients are the unit of analysis. In a typical meta-analysis, we do not have access to the individual data of each participant, only to the aggregated effect sizes, which is why meta-regression predictors have to be measured at the study level. This also means that although we analyze participant samples much larger than those of single studies, it is still quite likely that we do not have enough data for a meta-regression to be sensible. In Chapter 7, we told you that subgroup analyses make little sense when \(k<10\). For meta-regression, Borenstein and colleagues (Borenstein et al. 2011) recommend at least ten studies per covariate, although this should not be seen as an iron-clad rule.

In a conventional regression, we want to predict an outcome \(y\) using \(n\) covariates \(x_1, ..., x_n\) with regression coefficients \(\beta_1, ..., \beta_n\). A standard regression equation therefore looks like this:

\[y=\beta_0 + \beta_1x_1 + ...+\beta_nx_n\]

In a meta-regression, we want to estimate the effect size \(\theta\) for different values of the predictor(s), so our regression looks like this:

\[\hat \theta_k = \theta + \beta_1x_{1k} + ... + \beta_nx_{nk} + \epsilon_k + \zeta_k\]

You may have noticed that when estimating the effect size \(\hat\theta_k\) of a study \(k\) with our regression model, there are two extra terms in the equation, \(\epsilon_k\) and \(\zeta_k\). The same terms can also be found in the equation for the random-effects-model in Chapter 4.2. The two terms signify two types of independent errors which cause our regression prediction to be imperfect. The first, \(\epsilon_k\), is the sampling error through which the observed effect size of a study deviates from its “true” effect. The second, \(\zeta_k\), denotes that even the true effect size of the study is only sampled from an overarching distribution of effect sizes (see the chapter on the Random-Effects Model for more details). In a fixed-effect model, we assume that all studies actually share the same true effect size and that the between-study heterogeneity \(\tau^2 = 0\). In this case, we do not consider \(\zeta_k\) in our equation, but only \(\epsilon_k\).

As the equation above includes fixed effects (the \(\beta\) coefficients) as well as random effects (\(\zeta_k\)), the model used in meta-regression is often called a mixed-effects-model. Mathematically, this model is identical to the mixed-effects-model we described in Chapter 7 where we explained how subgroup analyses work.
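To make the mixed-effects model above more concrete, here is a minimal sketch of how its fixed effects could be estimated: a weighted least squares fit with one covariate, where each study is weighted by the inverse of its sampling variance plus \(\tau^2\). All data are hypothetical, and \(\tau^2\) is simply assumed to be known here (in practice it is estimated, e.g. by restricted maximum likelihood).

```python
# Sketch of a mixed-effects meta-regression with one covariate.
# Hypothetical data; tau^2 is assumed known for simplicity.

def meta_regression(y, v, x, tau2):
    """Weighted least squares with weights w_k = 1 / (v_k + tau^2).

    Returns (intercept, slope) for the model y_k = theta + beta * x_k.
    """
    w = [1.0 / (vk + tau2) for vk in v]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    # Solve the 2x2 normal equations for intercept and slope.
    det = sw * swxx - swx * swx
    intercept = (swxx * swy - swx * swxy) / det
    slope = (sw * swxy - swx * swy) / det
    return intercept, slope

# Hypothetical studies: effect size, sampling variance, study-level covariate.
y = [0.30, 0.45, 0.50, 0.70]
v = [0.05, 0.04, 0.06, 0.05]
x = [1.0, 2.0, 3.0, 4.0]
theta, beta = meta_regression(y, v, x, tau2=0.02)
```

In real analyses you would use dedicated software (such as the `metafor` package in R) rather than this hand-rolled fit, but the weighting logic is the same: more precise studies pull the regression line more strongly.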

Indeed, as mentioned above, a subgroup analysis is nothing other than a meta-regression with a categorical predictor. For the meta-regression, such subgroups are dummy-coded, e.g.

\[ D_k = \begin{cases} 0: & ACT \\ 1: & CBT \end{cases} \]

\[\hat \theta_k = \theta + \beta x_{k} + D_k \gamma + \epsilon_k + \zeta_k\]

In this case, we assume the same regression line, which is simply “shifted” up or down for the different subgroups \(D_k\).
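This “shift” is easy to see numerically. The sketch below plugs assumed coefficient values (purely illustrative, not estimates) into the dummy-coded model: both subgroups share the same slope, and the dummy coefficient \(\gamma\) moves the whole line up for one of them.

```python
# Illustration of the dummy-coded subgroup model.
# theta, beta and gamma are assumed values, not fitted estimates.
theta, beta, gamma = 0.20, 0.10, 0.15  # intercept, slope, subgroup shift

def predict(x, condition):
    # D_k = 0 for "ACT" and 1 for "CBT": CBT studies use the same slope,
    # but their regression line is shifted up by gamma.
    d = {"ACT": 0, "CBT": 1}[condition]
    return theta + beta * x + gamma * d

act = predict(2.0, "ACT")  # 0.20 + 0.10*2          = 0.40
cbt = predict(2.0, "CBT")  # 0.20 + 0.10*2 + 0.15   = 0.55
```

At every value of \(x\), the difference between the two lines is exactly \(\gamma\), which is what Figure 8.1 visualizes.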


Figure 8.1: Visualisation of a Meta-Regression with dummy-coded categorial predictors



Assessing the fit of a regression model

To evaluate the statistical significance of a predictor, a t-test of its \(\beta\)-weight is performed.

\[ t=\frac{\beta}{SE_{\beta}}\]

This test provides a \(p\)-value telling us whether a variable significantly predicts effect size differences in our regression model. When we fit a regression model, our aim is to find a model which explains as much as possible of the variability in effect sizes we find in our data.
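As a small sketch of this test, the code below computes the test statistic \(\beta / SE_{\beta}\) for assumed values of \(\beta\) and its standard error. To stay dependency-free, it uses the normal approximation for the \(p\)-value (a z-test rather than a t-test; with a reasonable number of studies the two are close, and a z-test is also a common software default).

```python
import math

def wald_test(beta, se):
    """Test statistic beta/SE and a two-sided p-value.

    Uses the normal approximation (z-test) instead of the t-distribution
    to avoid external dependencies; illustrative values only.
    """
    z = beta / se
    # Standard-normal survival function via the error function.
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p

z, p = wald_test(beta=0.25, se=0.10)  # z = 2.5
```

With \(\beta = 0.25\) and \(SE_{\beta} = 0.10\), the statistic is 2.5 and the two-sided \(p\)-value falls below 0.05, so such a predictor would be deemed significant.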

In conventional regression, \(R^2\) is commonly used to quantify the goodness of fit of a model as a percentage (0-100%). As this measure is widely used and many researchers know how to interpret it, we can also calculate an \(R^2\) analog for meta-regression using this formula:

\[R^2=\frac{\hat\tau^2_{REM}-\hat\tau^2_{MEM}}{\hat\tau^2_{REM}}\]

where \(\hat\tau^2_{REM}\) is the estimated total heterogeneity based on the random-effects-model and \(\hat\tau^2_{MEM}\) the heterogeneity left unexplained by our mixed-effects regression model.
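The formula is a one-liner in code. The sketch below uses hypothetical \(\hat\tau^2\) values and truncates the result at zero, since sampling error can make \(\hat\tau^2_{MEM}\) exceed \(\hat\tau^2_{REM}\), which would otherwise yield a negative (meaningless) \(R^2\).

```python
# R^2 analog: proportion of between-study heterogeneity explained
# by the covariates. tau^2 values below are hypothetical.

def r2_analog(tau2_rem, tau2_mem):
    # Truncate at 0: estimation error can make tau2_mem > tau2_rem.
    return max(0.0, (tau2_rem - tau2_mem) / tau2_rem)

r2 = r2_analog(tau2_rem=0.08, tau2_mem=0.02)  # 0.75, i.e. 75% explained
```

In this hypothetical case, adding the covariates reduces \(\hat\tau^2\) from 0.08 to 0.02, so the model explains 75% of the between-study heterogeneity.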




References

Borenstein, Michael, Larry V Hedges, Julian PT Higgins, and Hannah R Rothstein. 2011. Introduction to Meta-Analysis. John Wiley & Sons.
