## 6.4 GOSH Plot Analysis

In the previous subchapter, we explored the robustness of our meta-analysis results using influence analyses and the leave-one-out method. An even more sophisticated way to explore the patterns of effect sizes and heterogeneity in our data is the so-called **Graphic Display of Heterogeneity** (**GOSH**) plot (Olkin, Dahabreh, and Trikalinos 2012). For these plots, we fit the same meta-analysis model to **all possible subsets** of our included studies. In contrast to the leave-one-out method, we therefore fit not only \(K-1\) models, but models for all \(2^{K}-1\) possible study combinations. This means that creating GOSH plots can become quite computationally intensive when the number of included studies is large. The *R* implementation we cover here therefore only fits a maximum of 1 million randomly selected models.
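To get a feeling for how quickly the number of subsets grows, we can calculate \(2^{K}-1\) for a few values of \(K\) directly in *R* (a quick illustration, not part of the GOSH workflow itself):

```r
# Number of non-empty study subsets for K included studies
K <- c(10, 15, 20, 25)
2^K - 1
#> [1]     1023    32767  1048575 33554431
```

With 25 studies, there are already more than 33 million possible subsets, far above the 1 million cap mentioned above.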

Once the models are calculated, we can plot them, displaying the **pooled effect size** on the **x-axis** and the **between-study heterogeneity** on the **y-axis**. This allows us to look for specific patterns, for example subclusters with different effect sizes. Such patterns would indicate that there is in fact more than one “population” of effect sizes in our data, warranting a subgroup analysis. If the effect sizes in our sample are homogeneous, the GOSH plot should form a **symmetric distribution with one peak**. To generate GOSH plots, we can use the `gosh()` function in the `metafor` package. If you have not installed the package yet, install it and then load it from your library.

`library(metafor)`

I will generate plots for my `m.hksj` meta-analysis object. Before we can generate the plot, we have to **“transform”** this object created by the `meta` package into a `metafor` meta-analysis object which can be used by the `gosh()` function. To do this, I can simply use the data stored in `m.hksj`. The function to perform a meta-analysis in `metafor` is called `rma()`. I only have to provide the function with the effect sizes (`TE`), standard errors (`seTE`), and the between-study heterogeneity estimator (`method.tau`) stored in `m.hksj`. Because I want to replicate the results completely, I also use the Knapp-Hartung adjustment in `rma()` by setting `test = "knha"`.

```
m.rma <- rma(yi = m.hksj$TE,
             sei = m.hksj$seTE,
             method = m.hksj$method.tau,
             test = "knha")
```

Please note that if you use the fixed-effect model, you have to set `method = "FE"`.
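As a sketch, a fixed-effect version of the `rma()` call above could look like this (assuming the same `m.hksj` object; the `test = "knha"` argument is dropped, since the Knapp-Hartung adjustment only applies to random-effects models):

```r
# Fixed-effect model: method = "FE" replaces the
# between-study heterogeneity estimator
m.rma.fe <- rma(yi = m.hksj$TE,
                sei = m.hksj$seTE,
                method = "FE")
```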

**We can then use this object to generate the GOSH plot object.** Depending on the number of studies in your analysis, this can take some time, up to a few hours. I save the result as `dat.gosh`.

`dat.gosh <- gosh(m.rma)`
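Because computing all subset models can take hours, it can be worth saving the result to disk so it does not have to be recomputed in a later session (the file name here is just an example):

```r
# Save the (computationally expensive) GOSH results to disk
saveRDS(dat.gosh, file = "dat_gosh.rds")

# Restore the object in a later session
dat.gosh <- readRDS("dat_gosh.rds")
```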

I can then display the GOSH plot by plugging the `dat.gosh` object into the `plot()` function.

`plot(dat.gosh, alpha = 0.1, col = "blue")`
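If you already suspect that a specific study drives one of the clusters, `metafor`’s plot method for `gosh` objects also accepts an `out` argument, which colors all subsets containing that study differently (using study 3 here is only an example):

```r
# Highlight all subsets that contain study 3
plot(dat.gosh, out = 3)
```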

Interestingly, we see that there are **two patterns** in our data: one with lower effect sizes and lower heterogeneity, and one with higher effects, but considerable between-study heterogeneity: it seems like there are **two subclusters in our data**.

Now that we know the **effect size-heterogeneity** patterns in our data, the really important question of course is: which studies cause those patterns, and may thus belong to which subcluster? To answer this question, we have developed the `gosh.diagnostics` function. This function uses **three clustering (also known as unsupervised machine learning) algorithms** to detect clusters in the GOSH plot data and to automatically determine which studies contribute the most to them. The function is part of the `dmetar` package. If you have the package installed already, you have to load it into your library first.

`library(dmetar)`

If you don’t want to use the `dmetar` package, you can find the source code for this function here. In this case, *R* doesn’t know this function yet, so we have to let *R* learn it by **copying and pasting** the code **in its entirety** into the **console** in the bottom left pane of RStudio, and then hitting **Enter ⏎**. The function then requires the `dplyr`, `cluster`, `mvtnorm`, `factoextra`, `fpc`, `cowplot`, `reshape2`, and `flexmix` packages to be installed and loaded in your library to function properly. We will use the default settings for the function here, but many more are available (see the documentation). We simply plug the `dat.gosh` object into the function. Again, as there is a lot of data to process, the function may take some time to finish.
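If you use the standalone version of the function, the required packages can be installed (if missing) and loaded in one step; a minimal sketch:

```r
# Packages required by the standalone gosh.diagnostics function
pkgs <- c("dplyr", "cluster", "mvtnorm", "factoextra",
          "fpc", "cowplot", "reshape2", "flexmix")

# Install any packages that are not yet available, then load all of them
missing <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing) > 0) install.packages(missing)
invisible(lapply(pkgs, library, character.only = TRUE))
```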

`gosh.diagnostics(dat.gosh)`

```
Perform Clustering...
[==============================================================================================] DONE
Number of k-means clusters used: 3
Number of DBSCAN clusters detected: 3
Number of GMM clusters detected: 2
Identification of potential outliers
---------------------------------
Studies identified as potential outliers:
- K-means: 3 16
- DBSCAN: 3 10 16
- Gaussian Mixture Model: 3 16
```

We see that the **three algorithms**, \(k\)-means, DBSCAN and the Gaussian Mixture Model, have detected **three studies** which might potentially contribute to the **cluster imbalance**: study 3, study 10, and study 16. These studies seem to nearly fully account for the second high-effect size, high-heterogeneity cluster we found.

**Let us see what happens if we rerun the meta-analysis, this time removing these three studies.**

```
metagen(TE,
        seTE,
        data = madata,
        studlab = paste(Author),
        comb.fixed = FALSE,
        comb.random = TRUE,
        method.tau = "SJ",
        hakn = TRUE,
        prediction = TRUE,
        sm = "SMD",
        exclude = c(3, 10, 16))
```

```
## SMD 95%-CI %W(random) exclude
## Call et al. 0.7091 [ 0.1979; 1.2203] 4.6
## Cavanagh et al. 0.3549 [-0.0300; 0.7397] 7.1
## DanitzOrsillo 1.7912 [ 1.1139; 2.4685] 0.0 *
## de Vibe et al. 0.1825 [-0.0484; 0.4133] 13.2
## Frazier et al. 0.4219 [ 0.1380; 0.7057] 10.6
## Frogeli et al. 0.6300 [ 0.2458; 1.0142] 7.2
## Gallego et al. 0.7249 [ 0.2846; 1.1652] 5.8
## Hazlett-Stevens & Oren 0.5287 [ 0.1162; 0.9412] 6.5
## Hintz et al. 0.2840 [-0.0453; 0.6133] 8.8
## Kang et al. 1.2751 [ 0.6142; 1.9360] 0.0 *
## Kuhlmann et al. 0.1036 [-0.2781; 0.4853] 7.2
## Lever Taylor et al. 0.3884 [-0.0639; 0.8407] 5.6
## Phang et al. 0.5407 [ 0.0619; 1.0196] 5.1
## Rasanen et al. 0.4262 [-0.0794; 0.9317] 4.7
## Ratanasiripong 0.5154 [-0.1731; 1.2039] 2.8
## Shapiro et al. 1.4797 [ 0.8618; 2.0977] 0.0 *
## SongLindquist 0.6126 [ 0.1683; 1.0569] 5.8
## Warnecke et al. 0.6000 [ 0.1120; 1.0880] 5.0
##
## Number of studies combined: k = 15
##
## SMD 95%-CI t p-value
## Random effects model 0.4299 [0.3234; 0.5364] 8.66 < 0.0001
## Prediction interval [0.1428; 0.7169]
##
## Quantifying heterogeneity:
## tau^2 = 0.0152; H = 1.00 [1.00; 1.44]; I^2 = 0.0% [0.0%; 51.8%]
##
## Test of heterogeneity:
## Q d.f. p-value
## 13.48 14 0.4890
##
## Details on meta-analytical method:
## - Inverse variance method
## - Sidik-Jonkman estimator for tau^2
## - Hartung-Knapp adjustment for random effects model
```

We see that the between-study heterogeneity has **dropped to** \(0\%\), indicating that the studies we are now analyzing **stem from one homogeneous (sub)population**, the lower cluster in our GOSH plot.

### References

Olkin, Ingram, Issa J Dahabreh, and Thomas A Trikalinos. 2012. “GOSH–a Graphical Display of Study Heterogeneity.” *Research Synthesis Methods* 3 (3). Wiley Online Library: 214–23.