13.2 Forest Plots for Bayesian Meta-Analysis

As you have seen before, Bayesian models allow us to extract their sampled posterior distribution, which can be extremely helpful to directly assess the probability of specific values given our model. We can exploit this feature of Bayesian models to create enhanced Forest Plots which are both very informative and pleasing to the eye (see blog posts by A. Salomon Kurz and Matti Vuorre for more information). Unfortunately, there is currently no maintained “pre-packaged” function to create such forest plots, but it is possible to build one oneself using functions of the tidybayes package. So, let us first load this package along with a few other ones before we proceed.

library(tidybayes)
library(dplyr)
library(ggplot2)
library(ggridges)
library(glue)
library(stringr)
library(forcats)

Before we can generate the plot, we have to prepare the data. In particular, we need to extract the posterior distribution for each study individually (since forest plots also depict the specific effect size of each study). To achieve this, we can use the spread_draws function in the tidybayes package. The function needs three arguments as input: our fitted brms model, the random-effects factor by which the results should be indexed, and the parameter we want to extract (here b_Intercept, since we want to extract the fixed term, the effect size). Using the pipe operator %>%, we can directly manipulate the output; using the mutate function in dplyr, we calculate the actual effect size of each study by adding the pooled effect size b_Intercept to the estimated deviation for each study. We save the result as study.draws.

study.draws <- spread_draws(m.brm, r_Author[Author,], b_Intercept) %>% 
  mutate(b_Intercept = r_Author + b_Intercept)

Next, we want to generate the distribution of the pooled effect in a similar way (since in forest plots, the summary effect is usally displayed in the last row). We therefore slightly adapt the code from before, dropping the second argument to only get the pooled effect. The call to mutate only adds an extra column called “Author”. We save the result as pooled.effect.draws.

pooled.effect.draws <- spread_draws(m.brm, b_Intercept) %>% 
  mutate(Author = "Pooled Effect")

Next, we bind study.draws and pooled.effect.draws together in one data frame. We then start a pipe again, calling ungroup first, and then using mutate to (1) clean the study labels (i.e., replace dots with spaces), and (2) reorder the study factor levels by effect size. This is the data we will need for plotting, so we save it as forest.data.

forest.data <- bind_rows(study.draws, pooled.effect.draws) %>% 
   ungroup() %>%
   mutate(Author = str_replace_all(Author, "[.]", " ")) %>% 
   mutate(Author = reorder(Author, b_Intercept))

Lastly, we also need summarized data (the mean and credibility interval) of each study. To do this, we use our newly generated forest.data dataset, group it by Author, and then use the mean_qi function to calculate the values. We save the output as forest.data.summary.

forest.data.summary <- group_by(forest.data, Author) %>% 
  mean_qi(b_Intercept)

\[~\]

Here is the complete code for you to copy, paste, and adapt:

study.draws <- spread_draws(m.brm, r_Author[Author,], b_Intercept) %>% 
  mutate(b_Intercept = r_Author + b_Intercept)

pooled.effect.draws <- spread_draws(m.brm, b_Intercept) %>% 
  mutate(Author = "Pooled Effect")

forest.data <- bind_rows(study.draws, pooled.effect.draws) %>% 
   ungroup() %>%
   mutate(Author = str_replace_all(Author, "[.]", " ")) %>% 
   mutate(Author = reorder(Author, b_Intercept))

forest.data.summary <- group_by(forest.data, Author) %>% 
  mean_qi(b_Intercept)

\[~\]

We are now ready to generate the forest plot using the ggplot2 package. The code to generate the plot looks like this:

ggplot(aes(b_Intercept, relevel(Author, "Pooled Effect", after = Inf)), 
       data = forest.data) +
  geom_vline(xintercept = fixef(m.brm)[1, 1], color = "grey", size = 1) +
  geom_vline(xintercept = fixef(m.brm)[1, 3:4], color = "grey", linetype = 2) +
  geom_vline(xintercept = 0, color = "black", size = 1) +
  geom_density_ridges(fill = "blue", rel_min_height = 0.01, col = NA, scale = 1,
                      alpha = 0.8) +
  geom_pointintervalh(data = forest.data.summary, size = 1) +
  geom_text(data = mutate_if(forest.data.summary, is.numeric, round, 2),
    aes(label = glue("{b_Intercept} [{.lower}, {.upper}]"), x = Inf), hjust = "inward") +
  labs(x = "Standardized Mean Difference",
       y = element_blank()) +
  theme_minimal()

Looks good!

One thing is very important to mention here: the effect sizes displayed in the forest plot do not represent the observed effect sizes (i.e., the effect sizes we find reported in the original studies), but the estimate of the “true” effect size (\(\theta_k\)) of a study based on the Bayesian model. The effect sizes shown in the forest plot are equivalent to the study-wise estimates we saw when extracting the random effects using ranef(m.brm) (except that these were centered around the pooled effect size).



banner