4.1 Unsupervised summary scores

An intuitive way to address this question is to create one or more summary scores that capture individual levels of exposure to the mixture, thus reducing the number of covariates to be evaluated. A common example of this approach comes from investigators working on phthalates. In this context, analyses are often hampered by the extreme correlation between metabolites of Di(2-ethylhexyl)phthalate (DEHP), and researchers commonly summarize this information into a molar sum of DEHP metabolites. Li et al. (2019) write, for example, “we calculated the molar sum of DEHP metabolites (ΣDEHP) by dividing each metabolite concentration by its molecular weight and then summing: ΣDEHP=[MEHP (μg/L)×(1/278.34 (g/mol))]+[MEHHP (μg/L) × (1/294.34 (g/mol))] + [MEOHP (μg/L) × (1/292.33 (g/mol))] + [MECPP (μg/L) × (1/308.33 (g/mol))].” Note that, with this approach, the score targets a selected subset of exposures (the highly correlated cluster creating problems), and the other phthalate metabolites are included in the model without any transformation.
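
The molar sum quoted above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from Li et al. (2019): the function name `molar_sum_dehp` and the example concentrations are hypothetical, while the molecular weights are those given in the quotation. With concentrations in μg/L and molecular weights in g/mol, the result is in μmol/L.

```python
# Molecular weights (g/mol) of the four DEHP metabolites, as quoted
# from Li et al. (2019) in the text above.
MOLECULAR_WEIGHTS = {
    "MEHP": 278.34,
    "MEHHP": 294.34,
    "MEOHP": 292.33,
    "MECPP": 308.33,
}


def molar_sum_dehp(concentrations):
    """Return the molar sum of DEHP metabolites (umol/L).

    `concentrations` maps each metabolite name to its concentration
    in ug/L. The function name and input layout are illustrative
    assumptions, not part of the cited paper.
    """
    missing = set(MOLECULAR_WEIGHTS) - set(concentrations)
    if missing:
        raise ValueError(f"missing metabolites: {sorted(missing)}")
    # Divide each concentration by its molecular weight, then sum.
    return sum(
        concentrations[name] / weight
        for name, weight in MOLECULAR_WEIGHTS.items()
    )


# Hypothetical concentrations (ug/L) for a single participant.
participant = {"MEHP": 2.1, "MEHHP": 15.4, "MEOHP": 9.8, "MECPP": 24.0}
sum_dehp = molar_sum_dehp(participant)
print(f"Molar sum of DEHP metabolites: {sum_dehp:.4f} umol/L")
```

The resulting score would then replace the four individual metabolites in the regression model, while the remaining phthalate metabolites enter the model as they are.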

Another common approach is to use components derived from PCA, as described in Section 2. PCA identifies continuous covariates that summarize the variability of the mixture exposure. Including these derived components in a regression model, in place of the original exposures, has the great advantage of removing collinearity among the exposure terms, as the components are uncorrelated by construction. On the other hand, the validity of this approach depends heavily on whether the obtained components have a clear biological interpretation. An application of this approach in environmental epidemiology can be found in Souter et al. (2020).
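
As a minimal sketch of this idea, the Python code below simulates four highly correlated exposures, extracts the first two principal components from the standardized data, and uses them as covariates in a linear regression. Everything here is an illustrative assumption: the simulated data, the helper `pca_scores`, and the choice of two components do not come from Souter et al. (2020) or from the analyses presented in this text.

```python
import numpy as np


def pca_scores(X, n_components):
    """Return component scores, loadings and explained-variance ratios.

    Exposures are standardized first, so the PCA is effectively run
    on the correlation matrix. Hypothetical helper for illustration.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal directions (loadings).
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components]
    scores = Z @ loadings.T
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, loadings, explained[:n_components]


# Simulated data: four exposures driven by one shared source,
# which makes them highly correlated with each other.
rng = np.random.default_rng(0)
n = 500
shared = rng.normal(size=(n, 1))
X = shared + 0.3 * rng.normal(size=(n, 4))
y = 0.5 * X.sum(axis=1) + rng.normal(size=n)

scores, loadings, explained = pca_scores(X, n_components=2)

# The component scores are uncorrelated by construction.
score_corr = np.corrcoef(scores, rowvar=False)

# Regress the outcome on an intercept plus the component scores.
design = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

print("Explained variance ratios:", np.round(explained, 3))
print("Correlation between components:", np.round(score_corr[0, 1], 6))
print("Regression coefficients:", np.round(coef, 3))
```

The coefficients refer to the components rather than to the original exposures, so they can only be interpreted through the loadings, which is exactly where the question of biological interpretation arises.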


Li, Ming-Chieh, Lidia Mínguez-Alarcón, Andrea Bellavia, Paige L Williams, Tamarra James-Todd, Russ Hauser, Jorge E Chavarro, and Yu-Han Chiu. 2019. “Serum Beta-Carotene Modifies the Association Between Phthalate Mixtures and Insulin Resistance: The National Health and Nutrition Examination Survey 2003–2006.” Environmental Research 178: 108729.
Souter, Irene, Andrea Bellavia, Paige L Williams, TIM Korevaar, John D Meeker, Joseph M Braun, Ralph A de Poortere, et al. 2020. “Urinary Concentrations of Phthalate Metabolite Mixtures in Relation to Serum Biomarkers of Thyroid Function and Autoimmunity Among Women from a Fertility Center.” Environmental Health Perspectives 128 (6): 067007.