1.3 Broad classification(s) of statistical approaches

Over the last few years, several papers have reviewed the existing literature on statistical methods for mixtures and proposed different criteria for classifying them. Recommended readings include Hamra and Buckley (2018), Stafoggia et al. (2017), Gibson et al. (2019), and Lazarevic et al. (2019). Two simple and relevant classification criteria are the following:

  1. Supervised vs unsupervised procedures

This first distinction refers to whether or not the mixture is evaluated by taking into account its association with a given outcome of interest. As we will discuss in Section 2, before evaluating the effects of our exposures on health outcomes it is important to carefully assess the features of the mixture, especially when it is composed of a large number of components, by investigating its correlation structure and identifying subgroups or clusters of exposures. To this end, we turn to unsupervised techniques, such as principal component analysis, that focus directly on characterizing the complex mixture of exposures without any reference to a given outcome of interest. Supervised techniques, on the other hand, attempt to account for the complex nature of the exposures while investigating a given mixture-outcome association.
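As a minimal sketch of this distinction, the Python snippet below simulates a hypothetical mixture of five exposures arranged in two correlated clusters. The data, the variable names, and the effect size are illustrative assumptions and are not taken from any of the cited studies. The unsupervised step uses only the exposure matrix `X` (correlation matrix and principal components), while the supervised step also brings in the outcome `y`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical mixture: two latent sources generate two clusters of exposures.
source_a = rng.normal(size=n)
source_b = rng.normal(size=n)
X = np.column_stack([
    source_a + 0.3 * rng.normal(size=n),  # cluster A
    source_a + 0.3 * rng.normal(size=n),  # cluster A
    source_a + 0.3 * rng.normal(size=n),  # cluster A
    source_b + 0.3 * rng.normal(size=n),  # cluster B
    source_b + 0.3 * rng.normal(size=n),  # cluster B
])

# Hypothetical outcome, used only in the supervised step.
y = 0.5 * X[:, 0] + rng.normal(size=n)

# Standardize each exposure (mean 0, standard deviation 1).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# --- Unsupervised: characterize the mixture using X only ---
corr = np.corrcoef(Z, rowvar=False)        # 5 x 5 correlation matrix
eigvals, eigvecs = np.linalg.eigh(corr)    # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]          # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained = eigvals / eigvals.sum()        # share of variance per component

# --- Supervised: relate the exposures to the outcome y ---
design = np.column_stack([np.ones(n), Z])  # intercept + standardized exposures
beta, *_ = np.linalg.lstsq(design, y, rcond=None)

print("Correlation matrix:\n", np.round(corr, 2))
print("Variance explained by each component:", np.round(explained, 2))
print("Regression coefficients (intercept first):", np.round(beta, 2))
```

In this simulated setting the correlation matrix and the leading components reveal the two clusters without ever looking at `y`. The regression, by contrast, is only meaningful relative to the chosen outcome, and its coefficients are likely to be unstable within a cluster because the exposures there are highly correlated.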

  2. Data reduction vs variable selection techniques

The common goal of all the approaches that we will discuss is to reduce the complexity of the data so that mixture-outcome associations can be assessed while losing as little information as possible. This is broadly done in two ways: by summarizing the original exposures into fewer covariates, or by selecting targeted elements of the mixture. We use the term “data reduction approaches” for techniques that reduce the dimension of the mixture by generating new variables (scores, components, indexes). Methodologies that instead select specific elements of the mixture and evaluate them directly with respect to the outcome can be defined as “variable selection approaches”.
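The two strategies can be contrasted with a small sketch, again on simulated data. Everything here is an illustrative assumption: the six exposures, the outcome model, the penalty value, and the helper names `soft_threshold` and `lasso_cd`. The first principal component stands in for a data reduction approach, and a lasso fitted by a simple coordinate descent loop stands in for a variable selection approach. Neither is meant as the only or preferred method in its class.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 1000, 6

# Hypothetical mixture: six exposures sharing one common source (modest correlation).
shared = rng.normal(size=(n, 1))
X = 0.5 * shared + rng.normal(size=(n, p))
Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardized exposures

# Hypothetical outcome: only exposures 0 and 3 have an effect.
y = 1.0 * Z[:, 0] + 0.8 * Z[:, 3] + rng.normal(size=n)
y = y - y.mean()

# --- Data reduction: replace the p exposures with one new variable ---
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
pc1 = Vt[0]
if pc1.sum() < 0:          # the sign of a component is arbitrary; fix it
    pc1 = -pc1
score = Z @ pc1            # first principal component score
slope, intercept = np.polyfit(score, y, 1)


# --- Variable selection: keep only some of the original exposures ---
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty, returning 0 if it crosses zero."""
    return np.sign(value) * max(abs(value) - penalty, 0.0)


def lasso_cd(Z, y, penalty, n_iter=200):
    """Lasso via cyclic coordinate descent.

    Minimizes (1 / (2n)) * ||y - Z b||^2 + penalty * ||b||_1.
    """
    n_obs, n_vars = Z.shape
    beta = np.zeros(n_vars)
    col_scale = (Z ** 2).sum(axis=0) / n_obs
    residual = y.copy()                        # y - Z @ beta with beta = 0
    for _ in range(n_iter):
        for j in range(n_vars):
            residual += Z[:, j] * beta[j]      # remove variable j's contribution
            rho = Z[:, j] @ residual / n_obs
            beta[j] = soft_threshold(rho, penalty) / col_scale[j]
            residual -= Z[:, j] * beta[j]      # add the updated contribution back
    return beta


beta_lasso = lasso_cd(Z, y, penalty=0.25)
selected = np.flatnonzero(beta_lasso != 0.0)

print("Slope of outcome on the summary score:", round(float(slope), 2))
print("Lasso coefficients:", np.round(beta_lasso, 2))
print("Selected exposures (column indices):", selected.tolist())
```

The summary score yields a single association for the mixture as a whole but does not say which exposures drive it. The penalized fit points to individual exposures, at the cost of shrinking their coefficients and of depending on the chosen penalty, which in practice would typically be tuned (for example by cross-validation) rather than fixed by hand as it is here.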

References

Gibson, Elizabeth A, Yanelli Nunez, Ahlam Abuawad, Ami R Zota, Stefano Renzetti, Katrina L Devick, Chris Gennings, Jeff Goldsmith, Brent A Coull, and Marianthi-Anna Kioumourtzoglou. 2019. “An Overview of Methods to Address Distinct Research Questions on Environmental Mixtures: An Application to Persistent Organic Pollutants and Leukocyte Telomere Length.” Environmental Health 18 (1): 1–16.
Hamra, Ghassan B, and Jessie P Buckley. 2018. “Environmental Exposure Mixtures: Questions and Methods to Address Them.” Current Epidemiology Reports 5 (2): 160–65.
Lazarevic, Nina, Adrian G Barnett, Peter D Sly, and Luke D Knibbs. 2019. “Statistical Methodology in Studies of Prenatal Exposure to Mixtures of Endocrine-Disrupting Chemicals: A Review of Existing Approaches and New Alternatives.” Environmental Health Perspectives 127 (2): 026001.
Stafoggia, Massimo, Susanne Breitner, Regina Hampel, and Xavier Basagaña. 2017. “Statistical Approaches to Address Multi-Pollutant Mixtures and Multiple Exposures: The State of the Science.” Current Environmental Health Reports 4 (4): 481–90.