Chapter 7 Subgroup Analyses

In Chapter 6, we discussed in depth why between-study heterogeneity is such an important issue when we are interpreting the results of our meta-analysis, and how we can explore sources of heterogeneity using outlier and influence analyses.

Another source of between-study heterogeneity, which makes our pooled effect size estimate less precise, can be slight differences in study design or intervention components between the studies. For example, in a meta-analysis on the effects of cognitive behavioral therapy (CBT) for depression in university students, some studies may have delivered the intervention in a group setting, while others delivered the therapy to each student individually. In the same example, studies may also have used different criteria to determine whether a student suffers from depression (e.g., diagnostic interviews versus self-report questionnaires).

Many other differences of this sort are possible, and it seems plausible that such study differences may also be associated with differences in the overall effect.

In subgroup analyses, we therefore look at different subgroups of studies within our meta-analysis and try to determine whether their effects differ.

The idea behind subgroup analyses

Boiled down, every subgroup analysis consists of two parts: (1) pooling the effect of each subgroup, and (2) comparing the effects of the subgroups (Borenstein and Higgins 2013).

1. Pooling the effect of each subgroup

This part is rather straightforward, because the same criteria apply here as for a simple meta-analysis without subgroups (see Chapter 4 and Chapter 4.2).

  • If you assume that all studies in a subgroup stem from the same population and share one true effect, you may use the fixed-effect model. As we mention in Chapter 4, many doubt that this assumption is ever true in psychological and medical research, even when we partition our studies into subgroups.
  • The alternative, therefore, is to use a random-effects model, which assumes that the studies within a subgroup are drawn from a universe of populations whose mean effect we want to estimate.
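To make the first step concrete, here is a minimal Python sketch of pooling within each subgroup. It is only an illustration under stated assumptions: the effect sizes and variances are made-up numbers, the function names `pool_fixed` and `pool_random` are our own, and the between-study variance is estimated with the DerSimonian-Laird method, which is one common choice and not one prescribed by the text above.

```python
import math

def pool_fixed(effects, variances):
    """Inverse-variance pooled effect and its variance (fixed-effect model)."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, 1.0 / total

def pool_random(effects, variances):
    """Random-effects pooling; tau^2 via DerSimonian-Laird (an assumption here)."""
    k = len(effects)
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed, _ = pool_fixed(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w ** 2 for w in weights) / total
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Re-pool with tau^2 added to every study's sampling variance
    pooled, var = pool_fixed(effects, [v + tau2 for v in variances])
    return pooled, var, tau2

# Hypothetical data: (effect sizes, sampling variances) for each subgroup
subgroups = {
    "group": ([0.30, 0.45, 0.25, 0.50], [0.04, 0.05, 0.03, 0.06]),
    "individual": ([0.60, 0.75, 0.55], [0.05, 0.04, 0.06]),
}

for name, (y, v) in subgroups.items():
    est, var, tau2 = pool_random(y, v)
    print(f"{name}: pooled = {est:.3f}, SE = {math.sqrt(var):.3f}, tau^2 = {tau2:.3f}")
```

Each subgroup is pooled separately here, so each gets its own estimate of the between-study variance. Whether to estimate that variance separately or to share one estimate across subgroups is a modeling decision that this sketch does not settle.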

2. Comparing the effects of the subgroups

After we have calculated the pooled effect for each subgroup, we can compare the sizes of the subgroup effects. However, to know whether a difference is in fact significant and/or meaningful, we have to calculate the standard error of the difference between subgroup effect sizes, \(SE_{diff}\), which lets us calculate confidence intervals and conduct significance tests. There are two ways to calculate \(SE_{diff}\), and they are based on different assumptions.

  • Fixed-effects (plural) model: The fixed-effects model for subgroup comparisons is appropriate when we are only interested in the subgroups at hand (Borenstein and Higgins 2013). This is the case when the subgroups we chose to examine were not randomly “chosen”, but represent fixed levels of a characteristic we want to examine. Sex is such a characteristic, as its two subgroups female and male were not randomly chosen, but are the two subgroups that sex (in its “classical” conception) has. The same applies, for example, if we examine whether studies in patients with clinical depression versus subclinical depression yield different effects. Borenstein and Higgins (2013) argue that the fixed-effects (plural) model may be the only plausible model for most analyses in medical research, prevention, and other fields.

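As this model assumes that no further sampling error is introduced at the subgroup level (because subgroups were not randomly sampled, but are fixed), \(SE_{diff}\) depends only on the variances of the pooled effects of subgroups \(A\) and \(B\), \(V_A\) and \(V_B\). The variance of the difference, \(V_{Diff}\), is their sum, and \(SE_{diff}\) is the square root of \(V_{Diff}\).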

\[V_{Diff}=V_A + V_B\]

The fixed-effects (plural) model can be used to test differences in the pooled effects between subgroups, while the pooling within the subgroups is still conducted using a random-effects model. Such a combination is sometimes called a mixed-effects model. We will show you how to use this model in R in the next chapter.
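As a rough illustration of the fixed-effects (plural) comparison, the following Python sketch takes two pooled subgroup effects and their variances, computes \(V_{Diff}=V_A+V_B\), and derives a z-test and confidence interval for the difference. The function name `compare_fixed` and the input numbers are hypothetical, and the use of a normal approximation for the test is our assumption.

```python
import math
from statistics import NormalDist

def compare_fixed(est_a, var_a, est_b, var_b, level=0.95):
    """Compare two pooled subgroup effects under the fixed-effects (plural) model."""
    diff = est_a - est_b
    v_diff = var_a + var_b            # V_Diff = V_A + V_B
    se_diff = math.sqrt(v_diff)       # SE_diff is the square root of V_Diff
    z = diff / se_diff
    # Two-sided p-value, assuming the difference is approximately normal
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (diff - crit * se_diff, diff + crit * se_diff)
    return {"diff": diff, "se_diff": se_diff, "z": z, "p": p, "ci": ci}

# Hypothetical pooled effects and variances for subgroups A and B
result = compare_fixed(est_a=0.62, var_a=0.016, est_b=0.35, var_b=0.011)
print(result)
```

If the pooled effects and variances passed in come from random-effects pooling within each subgroup, this corresponds to the mixed-effects combination described above.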

  • Random-effects model: The random-effects model for between-subgroup effects is appropriate when the subgroups we use were randomly sampled from a population of subgroups. For example, we could be interested in whether the effect of an intervention varies by region, and look at studies from five different countries (e.g., Netherlands, USA, Australia, China, Argentina). The variable “region” has many different potential subgroups (countries), from which we randomly selected five. This means that we introduced a new sampling error, which we have to account for by using the random-effects model for between-subgroup comparisons.

The (simplified) formula for the estimation of \(V_{Diff}\) using this model therefore looks like this:

\[V_{Diff}=V_A + V_B + \frac{\hat T^2_G}{m} \]

Here, \(\hat T^2_G\) is the estimated variance between the subgroups, and \(m\) is the number of subgroups.
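The effect of the extra term can be seen in a small Python sketch that simply evaluates the simplified formula. The function name `v_diff_random` and all input values are hypothetical, and the sketch takes \(\hat T^2_G\) as given instead of estimating it.

```python
import math

def v_diff_random(var_a, var_b, tau2_g, m):
    """Simplified V_Diff under the random-effects model for subgroup comparisons."""
    if m < 2:
        raise ValueError("a comparison needs at least two subgroups")
    return var_a + var_b + tau2_g / m

# Hypothetical values: pooled-effect variances, between-subgroup variance, 5 subgroups
var_a, var_b, tau2_g, m = 0.016, 0.011, 0.05, 5

se_fixed = math.sqrt(var_a + var_b)
se_random = math.sqrt(v_diff_random(var_a, var_b, tau2_g, m))
print(f"SE_diff, fixed-effects (plural): {se_fixed:.3f}")
print(f"SE_diff, random-effects:         {se_random:.3f}")
```

Because the added term cannot be negative, the standard error under this model is never smaller than under the fixed-effects (plural) model, so confidence intervals for the difference become wider when the between-subgroup variance is greater than zero.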

Be aware that subgroup analyses should always be based on an informed, a priori decision about which subgroup differences might be practically relevant and would yield new information on relevant research questions in your field. It is also good practice to specify your subgroup analyses before you run the analysis, and to list them in the registration of your meta-analysis.

It is also important to keep in mind that the capability of subgroup analyses to detect meaningful differences between studies is often limited. Subgroup analyses need sufficient power, so it makes no sense to compare two or more subgroups when the total number of studies in your meta-analysis is smaller than \(k=10\) (Higgins and Thompson 2004).


Borenstein, Michael, and Julian PT Higgins. 2013. “Meta-Analysis and Subgroups.” Prevention Science 14 (2). Springer: 134–43.

Higgins, Julian PT, and Simon G Thompson. 2004. “Controlling the Risk of Spurious Findings from Meta-Regression.” Statistics in Medicine 23 (11). Wiley Online Library: 1663–82.