Chapter 3 Applications

3.1 Application I: Individual-level data

The dataset used here consists of a total of 14 metabolites (N = 14) and two groups (Liyanage et al. 2024). The 12 samples in the two groups had been extracted using the same method in a controlled experiment but run separately in two different instruments: Liquid Chromatography-Mass Spectrometry (LC/MS) and Gas Chromatography-Mass Spectrometry (GC/MS), leading to two separate datasets. Thus, in this example, each separate dataset from each instrument forms a different ‘study’ (\(K = 2\)), which can be integrated using meta-analysis methods. We will use `MetaHD’ for combining summary estimates obtained from these multiple metabolomic studies.

This GC-LC dataset is available in the `MetaHD’ package and can be loaded using the following command.

# Read the data
GCLC_data <- realdata$all

As explained in Section 3.1 `MetaHDInput’ function can then be used to calculate the observed effect sizes and their associated variances and within-study covariance matrices as shown below.

input_data <- MetaHDInput(GCLC_data)

Y <- input_data$Y
Slist <- input_data$Slist

Y
##           met1      met2         met3        met4        met5       met6
## GC -0.12216161 0.2408217  0.532996014 -0.06407204 -0.46280293 0.39930861
## LC  0.08730353 0.1074026 -0.002691788 -0.06451353 -0.02644625 0.08896313
##           met7       met8        met9      met10     met11     met12     met13
## GC -0.03582016  0.3558101 -0.33696133 -0.5623149 0.3591404 0.6540438 0.5257092
## LC  0.08684274 -0.1067551 -0.07022999  0.6563386 0.8378621 0.4823036 0.6567706
##         met14
## GC -0.6275364
## LC  0.5955336
head(Slist[[1]])
##             met1          met2        met3        met4          met5
## met1  0.66816536  0.0342995006 -0.13023460 0.042994441  0.0410169309
## met2  0.03429950  0.0219976435  0.04471369 0.007700712 -0.0003259779
## met3 -0.13023460  0.0447136868  0.99714813 0.028561626 -0.0153806902
## met4  0.04299444  0.0077007122  0.02856163 0.027139572  0.0142878276
## met5  0.04101693 -0.0003259779 -0.01538069 0.014287828  0.0456121062
## met6  0.06753777  0.0191195280  0.07398638 0.015472822  0.0042490167
##             met6       met7       met8         met9        met10       met11
## met1 0.067537772 0.06644299 0.19414892  0.042673455 -0.001689788 0.047069456
## met2 0.019119528 0.01098242 0.02419960  0.001167182 -0.006991485 0.022944009
## met3 0.073986377 0.02910402 0.02191034 -0.013459772 -0.029697299 0.142481398
## met4 0.015472822 0.01730561 0.03445267  0.012254275  0.005002414 0.022998932
## met5 0.004249017 0.01644365 0.03498665  0.019199078  0.015381516 0.009998191
## met6 0.073516461 0.02205751 0.05878542  0.005020851 -0.010286200 0.044605360
##              met12        met13      met14
## met1 -0.0200050944  0.005789995 0.07014588
## met2  0.0075920184  0.011643333 0.01405622
## met3  0.0777879265  0.080134515 0.08293142
## met4 -0.0003712348  0.006499256 0.03996221
## met5 -0.0099810545 -0.003704279 0.04585254
## met6  0.0151338588  0.021938095 0.01626860

MetaHD model can now be fitted as follows:

# Multivariate high dimensional meta analyis. 
model1 <- MetaHD(Y, Slist, method = "reml", bscov = "unstructured")
model1$estimate
##  [1]  0.075866139  0.102629970 -0.008317808 -0.049244565 -0.089172625
##  [6]  0.088244078  0.059983684 -0.092853540 -0.070839356  0.054232031
## [11]  0.556648094  0.428719822  0.540946192  0.298754741
model1$pVal
##         met1         met2         met3         met4         met5         met6 
## 2.464928e-03 1.681099e-01 9.279186e-01 3.846982e-01 4.302016e-01 3.140623e-01 
##         met7         met8         met9        met10        met11        met12 
## 4.237150e-01 5.205543e-02 2.005473e-01 9.049333e-01 4.354933e-02 2.102174e-03 
##        met13        met14 
## 1.611432e-05 3.884817e-01

When both within and between study correlations are set to zero, MetaHD reduces to the usual random effects model analysis of individual metabolites (i.e. univariate meta-analyses). Additionally, when the between-study variances are also set to zero, then MetaHD reduces to a fixed-effects model univariate meta-analysis. We can perform univariate random-effects and univariate fixed-effects meta-analysis in MetaHD as follows:

diag_Slist <- list()
K <- nrow(Y)
for (k in 1:K) {
  diag_Slist[[k]] <- diag(diag(Slist[[k]]))
}
# Univariate random-effects meta-analysis
model2 <- MetaHD(Y, diag_Slist, method = "reml", bscov = "diag")
model2$estimate
##  [1]  0.086997460  0.154229024  0.003813175 -0.064452935 -0.198813436
##  [6]  0.145163312  0.063695130 -0.104116978 -0.152139261 -0.045908911
## [11]  0.687629243  0.611359068  0.634467616  0.095250338
model2$pVal
##  [1] 5.365076e-03 7.921502e-02 9.723565e-01 2.909377e-01 3.513227e-01
##  [6] 2.245657e-01 4.851326e-01 4.208121e-02 2.162800e-01 9.392322e-01
## [11] 1.964117e-03 1.895315e-04 4.440892e-16 8.741418e-01
# Univariate fixed-effects meta-analysis
model3 <- MetaHD(Y, diag_Slist, method = "fixed")
model3$estimate
##  [1]  0.086997997  0.154228997  0.003810829 -0.064452935 -0.074409448
##  [6]  0.132726314  0.063695809 -0.104119067 -0.100490404 -0.431983086
## [11]  0.726651494  0.611359399  0.634468201  0.426236574
model3$pVal
##  [1] 5.324401e-03 7.921483e-02 9.723685e-01 2.909362e-01 2.933119e-01
##  [6] 1.923816e-01 4.851192e-01 4.199529e-02 9.265717e-02 9.209603e-03
## [11] 1.630553e-05 1.895119e-04 4.440892e-16 4.421984e-02

When missing data are available, we can use MetaHD to handle missing values by setting the argument “impute.na = TRUE”. This replaces the missing effect sizes and variances by zero and a large constant value respectively, so that the missing outcome is allocated a lower weight in the meta-analysis.

3.2 Application II: Summary-level data

In this example, we use a simulated dataset as described by (Liyanage et al. 2024). Here we have prepared separate data frames for observed effect sizes and within-study covariance matrices in the ‘simdata.1’ dataset as follows..

#Observed effect sizes
Observed_effects <- simdata.1$Y
head(Observed_effects)
##         metabolite_1 metabolite_2 metabolite_3 metabolite_4 metabolite_5
## study_1  0.231755235   -0.9377271    -3.306843   0.28142576   -0.5547623
## study_2  0.305408170   -1.2421302    -3.547565   0.66254064   -0.3162607
## study_3 -0.006523691   -1.6762418    -3.631345   0.68023162   -0.1846545
## study_4  0.283789796   -1.2533024    -3.545591   0.08533545   -1.0310925
## study_5 -0.748073950   -2.5502749    -4.386687  -1.08561122   -1.7532039
## study_6 -0.805069877   -2.2028554    -3.684388  -0.42021699   -1.3547757
##         metabolite_6 metabolite_7 metabolite_8 metabolite_9 metabolite_10
## study_1     2.822603    -1.268180     2.593808   -0.3936202    -0.6887844
## study_2     2.660019    -1.434220     2.558514   -0.2623882    -0.0245477
## study_3     2.421983    -1.085327     3.099015   -0.2417435     0.3493618
## study_4     2.245691    -1.448434     2.231622   -0.6266361    -0.4590747
## study_5     1.245282    -2.156295     1.503325   -1.4695383    -1.3845496
## study_6     2.730196    -1.792440     1.812767   -0.4075956    -0.9052131
##         metabolite_11 metabolite_12 metabolite_13 metabolite_14 metabolite_15
## study_1    1.25723073    -0.1669118     1.4819259      2.067361    -0.3992064
## study_2    1.50515960    -0.1288290     1.6942841      1.696545    -0.5495592
## study_3    1.02398696    -0.4566538     1.2831249      2.635289    -0.6367507
## study_4    1.01206691    -0.4866803     1.2184663      2.340882    -0.8044879
## study_5    0.05619328    -1.0199653     0.6820576      0.597100    -1.3620158
## study_6    0.11930101    -0.9310851     1.1849203      1.426632    -1.2488842
##         metabolite_16 metabolite_17 metabolite_18 metabolite_19 metabolite_20
## study_1      2.035879     1.6594976     1.4785488      2.025255     1.6325979
## study_2      1.757267     1.5286110     1.4712913      2.068347     1.7831811
## study_3      2.208511     1.2437205     1.7993070      1.877336     1.0564191
## study_4      1.971764     0.6971900     1.3849356      2.472191     1.4691468
## study_5      1.162320     0.5003652     0.3460756      1.408792     0.6553110
## study_6      1.383772     1.3982598     0.8289669      1.587887     0.7490209
##         metabolite_21 metabolite_22 metabolite_23 metabolite_24 metabolite_25
## study_1   -0.31458426    0.24703204      3.177385     1.5929012  -0.005622678
## study_2   -0.59096083   -0.18343696      2.512141     1.2888813   0.139087964
## study_3   -0.24805374   -0.08827114      3.071959     0.7913664   0.724998374
## study_4   -0.08455623   -0.18155226      2.534846     1.0723134  -0.258109359
## study_5   -1.20477797   -1.12187511      1.513707     0.2390821  -1.276384331
## study_6   -0.84545412   -0.58702267      2.308492     0.3502428  -0.838241155
##         metabolite_26 metabolite_27 metabolite_28 metabolite_29 metabolite_30
## study_1     -2.648980    -0.8436625     -0.904995      3.653576      5.543804
## study_2      1.797236    -0.4594717     -1.428674      3.077779      5.607509
## study_3     -3.313190    -1.0387204     -1.281120      3.045154      5.590184
## study_4     -3.528483    -0.8659648     -1.185520      4.045979      5.222089
## study_5     -4.118482    -2.0616843     -2.203919      2.531608      5.210625
## study_6     -3.853180    -0.9640716     -1.899273      3.069246      5.242326
#Covariance matrices 
Slist<-simdata.1$Slist
#E.g., for the first study,
head(Slist[[1]])
##              metabolite_1 metabolite_2 metabolite_3 metabolite_4 metabolite_5
## metabolite_1    0.2051971    0.2360390    0.1451389    0.2538377    0.2159263
## metabolite_2    0.2360390    0.3914241    0.2185096    0.2999849    0.2919311
## metabolite_3    0.1451389    0.2185096    0.1429655    0.2067593    0.1722842
## metabolite_4    0.2538377    0.2999849    0.2067593    0.3705385    0.2909224
## metabolite_5    0.2159263    0.2919311    0.1722842    0.2909224    0.3120427
## metabolite_6    0.3196974    0.3981609    0.2568697    0.4433588    0.3949941
##              metabolite_6 metabolite_7 metabolite_8 metabolite_9 metabolite_10
## metabolite_1    0.3196974    0.1667637    0.2076705   0.08103280    0.11133998
## metabolite_2    0.3981609    0.2405715    0.2527721   0.12311977    0.15263329
## metabolite_3    0.2568697    0.1591633    0.1504863   0.07664409    0.09779216
## metabolite_4    0.4433588    0.2554422    0.2650113   0.11012275    0.15222817
## metabolite_5    0.3949941    0.2104929    0.2520871   0.09973932    0.14457419
## metabolite_6    0.5994060    0.3069871    0.3358156   0.13793029    0.19313481
##              metabolite_11 metabolite_12 metabolite_13 metabolite_14
## metabolite_1     0.1579164     0.2122468     0.1870629     0.2068235
## metabolite_2     0.2037668     0.2946624     0.2341660     0.3079808
## metabolite_3     0.1147814     0.1891951     0.1460312     0.1852916
## metabolite_4     0.1918580     0.2831199     0.2571253     0.2892108
## metabolite_5     0.1842593     0.2537483     0.2475945     0.2650591
## metabolite_6     0.2576985     0.3589582     0.3438236     0.3801845
##              metabolite_15 metabolite_16 metabolite_17 metabolite_18
## metabolite_1     0.2090655     0.1991912    0.11785788     0.2532525
## metabolite_2     0.2931316     0.2709879    0.16230953     0.2834488
## metabolite_3     0.1698101     0.1665578    0.09688847     0.1828497
## metabolite_4     0.2479776     0.2599880    0.15284705     0.3259542
## metabolite_5     0.2702558     0.2520948    0.13001982     0.3052326
## metabolite_6     0.3523785     0.3548082    0.19915187     0.4357530
##              metabolite_19 metabolite_20 metabolite_21 metabolite_22
## metabolite_1    0.11496091     0.1299937     0.1487927     0.2316234
## metabolite_2    0.14725054     0.1725120     0.2002373     0.3218278
## metabolite_3    0.09486947     0.1084088     0.1201728     0.1890557
## metabolite_4    0.15936720     0.1823390     0.1959123     0.2944990
## metabolite_5    0.13993093     0.1637614     0.1944129     0.2788441
## metabolite_6    0.20217217     0.2420832     0.2604766     0.4135079
##              metabolite_23 metabolite_24 metabolite_25 metabolite_26
## metabolite_1     0.2108364     0.2635543     0.2442990     0.2897116
## metabolite_2     0.3002607     0.3586782     0.3305478     0.3275086
## metabolite_3     0.1783501     0.2206520     0.1977199     0.2085774
## metabolite_4     0.2774928     0.3359515     0.2996849     0.3678765
## metabolite_5     0.2557165     0.3522710     0.2888747     0.3518957
## metabolite_6     0.3512251     0.4720034     0.3845858     0.4956480
##              metabolite_27 metabolite_28 metabolite_29 metabolite_30
## metabolite_1     0.1090710     0.2335329     0.1896202     0.1358149
## metabolite_2     0.1323510     0.2954209     0.2487014     0.2008984
## metabolite_3     0.0808657     0.1835814     0.1600222     0.1192158
## metabolite_4     0.1407640     0.2946390     0.2564001     0.1816158
## metabolite_5     0.1330913     0.2770655     0.2196668     0.1739497
## metabolite_6     0.1689191     0.3889943     0.3132906     0.2299339
#Number of studies
K <- nrow(Observed_effects)
#Number of metabolites
N<-ncol(Observed_effects)

Now, we can carry out the meta-analysis using MetaHD as follows.

## Multivariate random-effects meta analyis
model_MetaHD <- MetaHD(Y=Observed_effects,
                  Slist=Slist,
                  method = "reml",
                  bscov = "unstructured")
print(model_MetaHD$estimate) 
##  [1] -0.1653862 -1.9799547 -3.7332264 -0.3679426 -1.2794924  2.2045394
##  [7] -1.8353620  1.9753857 -0.7671977 -0.9399812  0.6337954 -0.8684518
## [13]  1.0743271  1.3247917 -1.2438876  1.4045603  1.2003022  0.8966010
## [19]  1.7545069  1.1164275 -0.8513849 -0.6180309  2.4147547  0.5117378
## [25] -0.8791242 -3.6862738 -1.2616082 -1.7269992  3.0219733  5.5122717
# TRUE VALUES ARE: 
true_values<-c(-0.1675687 , -1.9658875 ,-3.7501346, -0.3722893, -1.2669714,  2.1815949, -1.8274545,  2.0032794, -0.7985332 ,-0.9362461 , 0.6539242 ,-0.8254938,  1.1240729,  1.3267165, -1.2057946,1.3967554,  1.1916929 , 0.9041837,  1.7934879,  1.1444330, -0.8233060, -0.5886543,  2.4371479,  0.4882229, -0.8903039, -3.6956073, -1.2576506, -1.7221614,  3.0298406,  5.4704779)
print(true_values)
##  [1] -0.1675687 -1.9658875 -3.7501346 -0.3722893 -1.2669714  2.1815949
##  [7] -1.8274545  2.0032794 -0.7985332 -0.9362461  0.6539242 -0.8254938
## [13]  1.1240729  1.3267165 -1.2057946  1.3967554  1.1916929  0.9041837
## [19]  1.7934879  1.1444330 -0.8233060 -0.5886543  2.4371479  0.4882229
## [25] -0.8903039 -3.6956073 -1.2576506 -1.7221614  3.0298406  5.4704779

In cases where the within-study correlations are not available, these could be estimated from the data as follows:

# If within-study correlations are not available, use only the variances and enter these values in a K x N matrix
Smat <- matrix(0, nrow = K, ncol = N)
for (i in 1:K) {
  Smat[i, ] <- diag(Slist[[i]])
}

model_MetaHD_est.wscor<- MetaHD(Observed_effects ,
                 Smat,
                 method = "reml",
                 est.wscor = TRUE)

Univariate random-effects meta analysis can be carried out as follows:

#Create matrices containing variances
diag_Slist <- list()
for (k in 1:K) {
  diag_Slist[[k]] <- diag(diag(Slist[[k]]))
}

# Univariate random-effects meta-analysis
model_random <- MetaHD(Observed_effects ,
                 diag_Slist,
                 method = "reml",
                 bscov = "diag")

Univariate fixed-effects meta analysis can be carried out as follows:

model_fixed <- MetaHD(Observed_effects ,
                      diag_Slist,
                      method = "fixed")

To compare the estimated values with the true values:

# Create a data frame of the differences
df <- data.frame(
  Model = factor(rep(c("MetaHD", "MetaHD_est.wscor", "Random", "Fixed"), 
                     each = N),
                 levels=c("MetaHD", "MetaHD_est.wscor", "Random", "Fixed")),
  Difference = c( true_values- model_MetaHD$estimate, 
                 true_values- model_MetaHD_est.wscor$estimate,
                 true_values -model_random$estimate,
                 true_values -model_fixed$estimate))

# Create the boxplots
ggplot(df, aes(x = Model, y = Difference, fill = Model)) +
  geom_boxplot() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_fill_manual(values = c("darkgreen", "red", "blue", "purple")) +
  labs(title = "Differences between true and estimated values",
       x = "Model",
       y = "Difference (True value - Estiamted values)")

References

Liyanage, Jayamini, Luke Prendergast, Robert Staudte, and Alysha De Livera. 2024. “MetaHD: A Multivariate Meta Analysis Model for Metabolomics Data.” Submitted for Review (2024)- Article Available on Request to Authors.