# Chapter 8 Meta-Regression

Conceptually, **Meta-Regression** does not differ much from a **subgroup analysis**. In fact, subgroup analyses with more than two groups are nothing more than a meta-regression with categorical predictors. However, meta-regression also allows us to use **continuous data** as predictors and check whether these variables are associated with effect size differences.

**The idea behind meta-regression**

You may have already performed regressions on regular data where participants or patients are the **unit of analysis**. In typical meta-analyses, we do not have the individual data for each participant available, but only the **aggregated effects**, which is why we have to perform meta-regressions with predictors on a **study level**. This also means that while we conduct analyses on participant samples much larger than usual for single studies, it is still very likely that **we do not have enough data for a meta-regression to be sensible**. In Chapter 7, we told you that subgroup analyses make no sense when \(k<10\). For **meta-regression**, Borenstein and colleagues (Borenstein et al. 2011) recommend **at least ten studies for each covariate**, although this should not be seen as an iron-clad rule.

In a conventional regression, we want to predict an outcome \(y\) from \(n\) covariates \(x_1, \dots, x_n\), each with its own regression coefficient \(\beta_i\), plus an intercept \(\beta_0\). A standard regression equation therefore looks like this:

\[y=\beta_0 + \beta_1x_1 + ...+\beta_nx_n\]

In a meta-regression, we want to estimate the **effect size** \(\theta\) for different values of the predictor(s), so our regression looks like this:

\[\hat \theta_k = \theta + \beta_1x_{1k} + ... + \beta_nx_{nk} + \epsilon_k + \zeta_k\]

You might have noticed that when estimating the effect size \(\hat\theta_k\) of a study \(k\) in our regression model, there are two **extra terms in the equation**, \(\epsilon_k\) and \(\zeta_k\). The same terms can also be found in the equation for the random-effects-model in Chapter 4.2. The two terms signify two types of **independent errors** which cause our regression prediction to be **imperfect**. The first one, \(\epsilon_k\), is the sampling error through which the observed effect size of the study deviates from its “true” effect. The second one, \(\zeta_k\), denotes that even the true effect size of the study is only sampled from **an overarching distribution of effect sizes** (see the chapter on the Random-Effects Model for more details). In a **fixed-effect model**, we assume that all studies actually share the **same true effect size** and that the **between-study heterogeneity** \(\tau^2 = 0\). In this case, we do not consider \(\zeta_k\) in our equation, but only \(\epsilon_k\).
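To make the role of the two error terms more concrete, here is a minimal Python sketch of a meta-regression with a single predictor, fitted by weighted least squares. It is an illustration under stated assumptions, not the procedure of any particular package: the function name `fit_meta_regression` and the example numbers are hypothetical, and \(\tau^2\) is treated as known, whereas in practice it has to be estimated from the data. Each study is weighted by the inverse of the sum of its sampling variance (the variance of \(\epsilon_k\)) and \(\tau^2\) (the variance of \(\zeta_k\)); setting `tau2=0` corresponds to the fixed-effect case.

```python
def fit_meta_regression(effects, variances, x, tau2=0.0):
    """Weighted least squares fit of theta_hat_k = theta + beta * x_k.

    effects   : observed effect sizes, one per study
    variances : sampling variances of the studies (variance of epsilon_k)
    x         : study-level predictor values
    tau2      : between-study heterogeneity (variance of zeta_k);
                assumed known here, 0 gives the fixed-effect case
    Returns (theta, beta, se_beta).
    """
    # Inverse-variance weights: both error sources reduce a study's weight.
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)

    # Weighted means of predictor and effect size.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Weighted sums of squares and cross-products.
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, effects))

    beta = sxy / sxx                 # slope of the regression line
    theta = y_bar - beta * x_bar     # intercept
    se_beta = (1.0 / sxx) ** 0.5     # model-based standard error of the slope
    return theta, beta, se_beta


# Hypothetical data: four studies, predictor could be e.g. treatment length.
effects = [0.2, 0.4, 0.6, 0.8]
variances = [0.04, 0.04, 0.04, 0.04]
x = [1, 2, 3, 4]

print(fit_meta_regression(effects, variances, x))             # fixed-effect
print(fit_meta_regression(effects, variances, x, tau2=0.06))  # with zeta_k
```

With equal sampling variances, adding \(\tau^2\) leaves the estimated slope unchanged but increases its standard error, which reflects the additional uncertainty introduced by \(\zeta_k\).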

As the equation above includes **fixed effects** (the \(\beta\) coefficients) as well as **random effects** (\(\zeta_k\)), the model used in meta-regression is often called **a mixed-effects-model**. Mathematically, this model is identical to the **mixed-effects-model** we described in Chapter 7 where we explained how **subgroup analyses** work.

Indeed, as mentioned above, **subgroup analyses** are nothing other than a **meta-regression** with a **categorical predictor**. For meta-regression, these subgroups are then **dummy-coded**, e.g.

\[ D_k = \left\{\begin{array}{ll}0: & ACT \\ 1: & CBT \end{array}\right.\]

\[\hat \theta_k = \theta + \beta x_{k} + D_k \gamma + \epsilon_k + \zeta_k\]

In this case, we assume the same **regression line**, which is simply “shifted” **up or down for the different subgroups** \(D_k\).
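Written out for the two values of \(D_k\), and leaving the error terms aside, the model predicts the following effect sizes. This is only a restatement of the equation above with \(D_k\) substituted:

\[
\begin{array}{ll}
D_k = 0 \text{ (ACT)}: & \theta + \beta x_k \\
D_k = 1 \text{ (CBT)}: & (\theta + \gamma) + \beta x_k
\end{array}
\]

Both subgroups share the slope \(\beta\); only the intercept differs, by \(\gamma\). A positive \(\gamma\) shifts the line for the subgroup coded 1 upwards, a negative \(\gamma\) shifts it downwards.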

**Assessing the fit of a regression model**

To evaluate the **statistical significance of a predictor**, a **t-test** of its \(\beta\)-weight is performed.

\[ t=\frac{\beta}{SE_{\beta}}\]

This provides a \(p\)-value telling us whether a variable significantly predicts effect size differences in our regression model. If we fit a regression model, our aim is to find a model **which explains as much as possible of the variability in effect sizes** we find in our data.
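The test can be sketched in a few lines of Python. The function names and the example numbers below are hypothetical. The degrees of freedom `df` are passed in by the caller, because the text above does not fix them; which value is appropriate depends on the number of studies and on the test variant used. The two-sided \(p\)-value is obtained here by numerically integrating the density of the \(t\) distribution with Simpson's rule, simply to keep the example free of external libraries.

```python
import math


def t_statistic(beta, se_beta):
    """t = beta / SE_beta, as in the formula above."""
    return beta / se_beta


def t_density(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + u * u / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * P(0 < T < |t|).

    The integral over [0, |t|] is approximated with Simpson's rule;
    steps must be even.
    """
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_density(0.0, df) + t_density(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


# Hypothetical predictor: beta = 0.2 with a standard error of 0.08,
# tested against a t distribution with 10 degrees of freedom (assumed).
t = t_statistic(0.2, 0.08)
print(t, two_sided_p(t, df=10))
```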

In conventional regression, \(R^2\) is commonly used to quantify the **goodness of fit** of our model in percent (0-100%). As this measure is commonly used, and many researchers know how to interpret it, we can also calculate an \(R^2\) analog for meta-regression using this formula:

\[R^2=\frac{\hat\tau^2_{REM}-\hat\tau^2_{MEM}}{\hat\tau^2_{REM}}\]

Here, \(\hat\tau^2_{REM}\) is the estimated total heterogeneity based on the random-effects-model, and \(\hat\tau^2_{MEM}\) is the heterogeneity that remains in our mixed-effects regression model.
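As a small worked illustration of this formula, the following Python sketch computes the \(R^2\) analog from two heterogeneity estimates. The function name `r2_analog` and the example values are hypothetical; the two \(\hat\tau^2\) values are assumed to come from a random-effects model without predictors and from the meta-regression model fitted to the same studies.

```python
def r2_analog(tau2_rem, tau2_mem):
    """R^2 analog: share of the heterogeneity in the random-effects model
    (tau2_rem) that is no longer present in the mixed-effects model
    (tau2_mem)."""
    if tau2_rem <= 0:
        # Without heterogeneity there is nothing to explain and the
        # ratio is undefined.
        raise ValueError("tau2_rem must be greater than 0")
    return (tau2_rem - tau2_mem) / tau2_rem


# Hypothetical estimates: tau^2 drops from 0.08 to 0.02 after adding
# the predictor, so about three quarters of the heterogeneity is explained.
print(round(r2_analog(0.08, 0.02) * 100, 1), "%")
```

Because both \(\hat\tau^2\) values are estimates, \(\hat\tau^2_{MEM}\) can turn out larger than \(\hat\tau^2_{REM}\), in which case the formula yields a negative value; this is often reported as 0.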

### References

Borenstein, Michael, Larry V Hedges, Julian PT Higgins, and Hannah R Rothstein. 2011. *Introduction to Meta-Analysis*. John Wiley & Sons.