Chapter 3 Applications

3.1 Application I: Individual-level data

The dataset used here is a subset of data taken from ‘MetaboAnalyst’ (Xia et al. 2009) example datasets in the statistical meta-analysis section. These data are from a study investigating biomarkers of adenocarcinoma in blood and this dataset consists of a total of 14 metabolites (N = 14) and 172 samples collected from two studies (K = 2). We will use `MetaHD’ for combining summary estimates obtained from these multiple metabolomic studies.

This real dataset is available in the `MetaHD’ package and can be loaded using the following command.

# Read the data
example_data <- realdata

As explained in Section 3.1 `MetaHDInput’ function can then be used to calculate the observed effect sizes and their associated variances and within-study covariance matrices as shown below.

input_data <- MetaHDInput(example_data)

Y <- input_data$Y
Slist <- input_data$Slist

Y
##               met1      met2       met3      met4        met5         met6
## study1 -0.08509765 0.0693374 0.21205246 0.1555140 -0.07131115 -0.010729262
## study2 -0.19103975 0.1985993 0.05266387 0.3042082 -0.08507508 -0.002200277
##               met7        met8        met9      met10      met11       met12
## study1 -0.06351313  0.07239534 -0.02854481 -0.1874413 0.07670815 -0.04170964
## study2 -0.02427915 -0.08497286 -0.10315078 -0.1196099 0.07105905  0.21213759
##            met13       met14
## study1 0.3378766 -0.20944830
## study2 0.2136121 -0.03345043
head(Slist[[1]])
##               met1         met2          met3         met4         met5
## met1  0.0194438538 0.0005017922 -0.0025219163 -0.002766591  0.004992537
## met2  0.0005017922 0.0161291933  0.0042058671  0.019843118  0.002535913
## met3 -0.0025219163 0.0042058671  0.0078052190  0.007030307 -0.001285968
## met4 -0.0027665912 0.0198431182  0.0070303073  0.074406890 -0.003247293
## met5  0.0049925375 0.0025359133 -0.0012859676 -0.003247293  0.007725915
## met6  0.0036952818 0.0035899796  0.0007582538  0.004244171  0.003155352
##              met6          met7          met8        met9        met10
## met1 0.0036952818  0.0036649961  0.0014738763 0.002466527  0.006057970
## met2 0.0035899796  0.0023004763  0.0009291760 0.003908964  0.001622693
## met3 0.0007582538 -0.0002967142  0.0010746462 0.001839553 -0.002939468
## met4 0.0042441714  0.0033677211  0.0053704235 0.004464203 -0.001121570
## met5 0.0031553517  0.0026034856 -0.0006582206 0.002925312  0.005845677
## met6 0.0035383608  0.0025416742  0.0003542152 0.003045991  0.003098343
##            met11        met12        met13        met14
## met1 0.003742686 0.0058389366 -0.017656900  0.006189749
## met2 0.006128897 0.0057661483 -0.015584915  0.002615291
## met3 0.002551313 0.0003375944  0.001663327 -0.001850528
## met4 0.005545238 0.0033374745 -0.012707585  0.001019416
## met5 0.004305489 0.0060072612 -0.020351263  0.004988612
## met6 0.003080440 0.0039539767 -0.012411644  0.003099160

MetaHD model can now be fitted as follows:

# Multivariate high dimensional meta analyis. 
model1 <- MetaHD(Y, Slist, method = "reml", bscov = "unstructured")
model1$estimate
##  [1] -0.1295064336  0.1390727748  0.1186529890  0.2030849313 -0.0405892469
##  [6] -0.0001319413 -0.0338595880 -0.0052455116 -0.0474149096 -0.1239637300
## [11]  0.0946834953  0.1106047982  0.1908312201 -0.1083826617
model1$pVal
##       met1       met2       met3       met4       met5       met6       met7 
## 0.09918387 0.08598070 0.09625876 0.25815737 0.36333498 0.99449676 0.28099443 
##       met8       met9      met10      met11      met12      met13      met14 
## 0.92768049 0.41526530 0.01380831 0.03063691 0.34974577 0.01362111 0.12480495

When both within and between study correlations are set to zero, MetaHD reduces to the usual random effects model analysis of individual metabolites (i.e. univariate meta-analyses). Additionally, when the between-study variances are also set to zero, then MetaHD reduces to a fixed-effects model univariate meta-analysis. We can perform univariate random-effects and univariate fixed-effects meta-analysis in MetaHD as follows:

diag_Slist <- list()
K <- nrow(Y)
for (k in 1:K) {
  diag_Slist[[k]] <- diag(diag(Slist[[k]]))
}
# Univariate random-effects meta-analysis
model2 <- MetaHD(Y, diag_Slist, method = "reml", bscov = "diag")
model2$estimate
##  [1] -0.152812327  0.137090057  0.125448886  0.236485102 -0.079158201
##  [6] -0.003400281 -0.041614290 -0.005823006 -0.067093550 -0.145743651
## [11]  0.073429537  0.097636180  0.224770248 -0.115441898
model2$pVal
##  [1] 0.06809548 0.11762486 0.11412429 0.19892382 0.16958255 0.87887530
##  [7] 0.28159217 0.94100531 0.29722212 0.01244124 0.16187645 0.43956056
## [13] 0.00950358 0.18855174
# Univariate fixed-effects meta-analysis
model3 <- MetaHD(Y, diag_Slist, method = "fixed")
model3$estimate
##  [1] -0.152812327  0.137090057  0.119263841  0.236485102 -0.079158213
##  [6] -0.003400281 -0.041614290 -0.004927685 -0.067093550 -0.145743651
## [11]  0.073429398  0.139309196  0.224769883 -0.102735148
model3$pVal
##  [1] 0.068095484 0.117624861 0.036764299 0.198923825 0.169579632 0.878875296
##  [7] 0.281592165 0.914730005 0.297222122 0.012441239 0.161809425 0.011347289
## [13] 0.009502382 0.034970100

When missing data are available, we can use MetaHD to handle missing values by setting the argument “impute.na = TRUE”. This replaces the missing effect sizes and variances by zero and a large constant value respectively, so that the missing outcome is allocated a lower weight in the meta-analysis.

3.2 Application II: Summary-level data

In this example, we use a simulated dataset as described by (Liyanage et al. 2024). Here we have prepared separate data frames for observed effect sizes and within-study covariance matrices in the ‘simdata.1’ dataset as follows..

#Observed effect sizes
Observed_effects <- simdata.1$Y
head(Observed_effects)
##         metabolite_1 metabolite_2 metabolite_3 metabolite_4 metabolite_5
## study_1  0.231755235   -0.9377271    -3.306843   0.28142576   -0.5547623
## study_2  0.305408170   -1.2421302    -3.547565   0.66254064   -0.3162607
## study_3 -0.006523691   -1.6762418    -3.631345   0.68023162   -0.1846545
## study_4  0.283789796   -1.2533024    -3.545591   0.08533545   -1.0310925
## study_5 -0.748073950   -2.5502749    -4.386687  -1.08561122   -1.7532039
## study_6 -0.805069877   -2.2028554    -3.684388  -0.42021699   -1.3547757
##         metabolite_6 metabolite_7 metabolite_8 metabolite_9 metabolite_10
## study_1     2.822603    -1.268180     2.593808   -0.3936202    -0.6887844
## study_2     2.660019    -1.434220     2.558514   -0.2623882    -0.0245477
## study_3     2.421983    -1.085327     3.099015   -0.2417435     0.3493618
## study_4     2.245691    -1.448434     2.231622   -0.6266361    -0.4590747
## study_5     1.245282    -2.156295     1.503325   -1.4695383    -1.3845496
## study_6     2.730196    -1.792440     1.812767   -0.4075956    -0.9052131
##         metabolite_11 metabolite_12 metabolite_13 metabolite_14 metabolite_15
## study_1    1.25723073    -0.1669118     1.4819259      2.067361    -0.3992064
## study_2    1.50515960    -0.1288290     1.6942841      1.696545    -0.5495592
## study_3    1.02398696    -0.4566538     1.2831249      2.635289    -0.6367507
## study_4    1.01206691    -0.4866803     1.2184663      2.340882    -0.8044879
## study_5    0.05619328    -1.0199653     0.6820576      0.597100    -1.3620158
## study_6    0.11930101    -0.9310851     1.1849203      1.426632    -1.2488842
##         metabolite_16 metabolite_17 metabolite_18 metabolite_19 metabolite_20
## study_1      2.035879     1.6594976     1.4785488      2.025255     1.6325979
## study_2      1.757267     1.5286110     1.4712913      2.068347     1.7831811
## study_3      2.208511     1.2437205     1.7993070      1.877336     1.0564191
## study_4      1.971764     0.6971900     1.3849356      2.472191     1.4691468
## study_5      1.162320     0.5003652     0.3460756      1.408792     0.6553110
## study_6      1.383772     1.3982598     0.8289669      1.587887     0.7490209
##         metabolite_21 metabolite_22 metabolite_23 metabolite_24 metabolite_25
## study_1   -0.31458426    0.24703204      3.177385     1.5929012  -0.005622678
## study_2   -0.59096083   -0.18343696      2.512141     1.2888813   0.139087964
## study_3   -0.24805374   -0.08827114      3.071959     0.7913664   0.724998374
## study_4   -0.08455623   -0.18155226      2.534846     1.0723134  -0.258109359
## study_5   -1.20477797   -1.12187511      1.513707     0.2390821  -1.276384331
## study_6   -0.84545412   -0.58702267      2.308492     0.3502428  -0.838241155
##         metabolite_26 metabolite_27 metabolite_28 metabolite_29 metabolite_30
## study_1     -2.648980    -0.8436625     -0.904995      3.653576      5.543804
## study_2      1.797236    -0.4594717     -1.428674      3.077779      5.607509
## study_3     -3.313190    -1.0387204     -1.281120      3.045154      5.590184
## study_4     -3.528483    -0.8659648     -1.185520      4.045979      5.222089
## study_5     -4.118482    -2.0616843     -2.203919      2.531608      5.210625
## study_6     -3.853180    -0.9640716     -1.899273      3.069246      5.242326
#Covariance matrices 
Slist<-simdata.1$Slist
#E.g., for the first study,
head(Slist[[1]])
##              metabolite_1 metabolite_2 metabolite_3 metabolite_4 metabolite_5
## metabolite_1    0.2051971    0.2360390    0.1451389    0.2538377    0.2159263
## metabolite_2    0.2360390    0.3914241    0.2185096    0.2999849    0.2919311
## metabolite_3    0.1451389    0.2185096    0.1429655    0.2067593    0.1722842
## metabolite_4    0.2538377    0.2999849    0.2067593    0.3705385    0.2909224
## metabolite_5    0.2159263    0.2919311    0.1722842    0.2909224    0.3120427
## metabolite_6    0.3196974    0.3981609    0.2568697    0.4433588    0.3949941
##              metabolite_6 metabolite_7 metabolite_8 metabolite_9 metabolite_10
## metabolite_1    0.3196974    0.1667637    0.2076705   0.08103280    0.11133998
## metabolite_2    0.3981609    0.2405715    0.2527721   0.12311977    0.15263329
## metabolite_3    0.2568697    0.1591633    0.1504863   0.07664409    0.09779216
## metabolite_4    0.4433588    0.2554422    0.2650113   0.11012275    0.15222817
## metabolite_5    0.3949941    0.2104929    0.2520871   0.09973932    0.14457419
## metabolite_6    0.5994060    0.3069871    0.3358156   0.13793029    0.19313481
##              metabolite_11 metabolite_12 metabolite_13 metabolite_14
## metabolite_1     0.1579164     0.2122468     0.1870629     0.2068235
## metabolite_2     0.2037668     0.2946624     0.2341660     0.3079808
## metabolite_3     0.1147814     0.1891951     0.1460312     0.1852916
## metabolite_4     0.1918580     0.2831199     0.2571253     0.2892108
## metabolite_5     0.1842593     0.2537483     0.2475945     0.2650591
## metabolite_6     0.2576985     0.3589582     0.3438236     0.3801845
##              metabolite_15 metabolite_16 metabolite_17 metabolite_18
## metabolite_1     0.2090655     0.1991912    0.11785788     0.2532525
## metabolite_2     0.2931316     0.2709879    0.16230953     0.2834488
## metabolite_3     0.1698101     0.1665578    0.09688847     0.1828497
## metabolite_4     0.2479776     0.2599880    0.15284705     0.3259542
## metabolite_5     0.2702558     0.2520948    0.13001982     0.3052326
## metabolite_6     0.3523785     0.3548082    0.19915187     0.4357530
##              metabolite_19 metabolite_20 metabolite_21 metabolite_22
## metabolite_1    0.11496091     0.1299937     0.1487927     0.2316234
## metabolite_2    0.14725054     0.1725120     0.2002373     0.3218278
## metabolite_3    0.09486947     0.1084088     0.1201728     0.1890557
## metabolite_4    0.15936720     0.1823390     0.1959123     0.2944990
## metabolite_5    0.13993093     0.1637614     0.1944129     0.2788441
## metabolite_6    0.20217217     0.2420832     0.2604766     0.4135079
##              metabolite_23 metabolite_24 metabolite_25 metabolite_26
## metabolite_1     0.2108364     0.2635543     0.2442990     0.2897116
## metabolite_2     0.3002607     0.3586782     0.3305478     0.3275086
## metabolite_3     0.1783501     0.2206520     0.1977199     0.2085774
## metabolite_4     0.2774928     0.3359515     0.2996849     0.3678765
## metabolite_5     0.2557165     0.3522710     0.2888747     0.3518957
## metabolite_6     0.3512251     0.4720034     0.3845858     0.4956480
##              metabolite_27 metabolite_28 metabolite_29 metabolite_30
## metabolite_1     0.1090710     0.2335329     0.1896202     0.1358149
## metabolite_2     0.1323510     0.2954209     0.2487014     0.2008984
## metabolite_3     0.0808657     0.1835814     0.1600222     0.1192158
## metabolite_4     0.1407640     0.2946390     0.2564001     0.1816158
## metabolite_5     0.1330913     0.2770655     0.2196668     0.1739497
## metabolite_6     0.1689191     0.3889943     0.3132906     0.2299339
#Number of studies
K <- nrow(Observed_effects)
#Number of metabolites
N<-ncol(Observed_effects)

Now, we can carry out the meta-analysis using MetaHD as follows.

## Multivariate random-effects meta analyis
model_MetaHD <- MetaHD(Y=Observed_effects,
                  Slist=Slist,
                  method = "reml",
                  bscov = "unstructured")
print(model_MetaHD$estimate) 
##  [1] -0.1653862 -1.9799547 -3.7332264 -0.3679426 -1.2794924  2.2045394
##  [7] -1.8353620  1.9753857 -0.7671977 -0.9399812  0.6337954 -0.8684518
## [13]  1.0743271  1.3247917 -1.2438876  1.4045603  1.2003022  0.8966010
## [19]  1.7545069  1.1164275 -0.8513849 -0.6180309  2.4147547  0.5117378
## [25] -0.8791242 -3.6862738 -1.2616082 -1.7269992  3.0219733  5.5122717
# TRUE VALUES ARE: 
true_values<-c(-0.1675687 , -1.9658875 ,-3.7501346, -0.3722893, -1.2669714,  2.1815949, -1.8274545,  2.0032794, -0.7985332 ,-0.9362461 , 0.6539242 ,-0.8254938,  1.1240729,  1.3267165, -1.2057946,1.3967554,  1.1916929 , 0.9041837,  1.7934879,  1.1444330, -0.8233060, -0.5886543,  2.4371479,  0.4882229, -0.8903039, -3.6956073, -1.2576506, -1.7221614,  3.0298406,  5.4704779)
print(true_values)
##  [1] -0.1675687 -1.9658875 -3.7501346 -0.3722893 -1.2669714  2.1815949
##  [7] -1.8274545  2.0032794 -0.7985332 -0.9362461  0.6539242 -0.8254938
## [13]  1.1240729  1.3267165 -1.2057946  1.3967554  1.1916929  0.9041837
## [19]  1.7934879  1.1444330 -0.8233060 -0.5886543  2.4371479  0.4882229
## [25] -0.8903039 -3.6956073 -1.2576506 -1.7221614  3.0298406  5.4704779

In cases where the within-study correlations are not available, these could be estimated from the data as follows:

# If within-study correlations are not available, use only the variances and enter these values in a K x N matrix
Smat <- matrix(0, nrow = K, ncol = N)
for (i in 1:K) {
  Smat[i, ] <- diag(Slist[[i]])
}

model_MetaHD_est.wscor<- MetaHD(Observed_effects ,
                 Smat,
                 method = "reml",
                 est.wscor = TRUE)

Univariate random-effects meta analysis can be carried out as follows:

#Create matrices containing variances
diag_Slist <- list()
for (k in 1:K) {
  diag_Slist[[k]] <- diag(diag(Slist[[k]]))
}

# Univariate random-effects meta-analysis
model_random <- MetaHD(Observed_effects ,
                 diag_Slist,
                 method = "reml",
                 bscov = "diag")

Univariate fixed-effects meta analysis can be carried out as follows:

model_fixed <- MetaHD(Observed_effects ,
                      diag_Slist,
                      method = "fixed")

To compare the estimated values with the true values:

# Create a data frame of the differences
df <- data.frame(
  Model = factor(rep(c("MetaHD", "MetaHD_est.wscor", "Random", "Fixed"), 
                     each = N),
                 levels=c("MetaHD", "MetaHD_est.wscor", "Random", "Fixed")),
  Difference = c( true_values- model_MetaHD$estimate, 
                 true_values- model_MetaHD_est.wscor$estimate,
                 true_values -model_random$estimate,
                 true_values -model_fixed$estimate))

# Create the boxplots
ggplot(df, aes(x = Model, y = Difference, fill = Model)) +
  geom_boxplot() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_fill_manual(values = c("darkgreen", "red", "blue", "purple")) +
  labs(title = "Differences between true and estimated values",
       x = "Model",
       y = "Difference (True value - Estiamted values)")

References

Liyanage, Jayamini, Luke Prendergast, Robert Staudte, and Alysha De Livera. 2024. “MetaHD: A Multivariate Meta Analysis Model for Metabolomics Data.” Submitted for Review (2024)- Article Available on Request to Authors.
Xia, Jianguo, Nick Psychogios, Nelson Young, and David S Wishart. 2009. “MetaboAnalyst: A Web Server for Metabolomic Data Analysis and Interpretation.” Nucleic Acids Research 37 (suppl_2): W652–60.