Presentation

This document presents the steps used to preprocess the biomarker data in Sepages. To date, the process has been applied to urinary phenol and phthalate levels measured during pregnancy and to cytokine data. It consists mainly of:

  • imputation of data below the different detection limits using the fill-in method (Helsel 1990)
  • correction for protocol variables using the method of Mortamais et al. (2012)
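
To make the imputation step more concrete, here is a minimal sketch of one common reading of the fill-in approach: fit a normal distribution to the log-transformed values by censored maximum likelihood, then replace each value below the LOD with a random draw from the fitted distribution truncated at log(LOD). Several things here are assumptions and not taken from the Sepages code: a single LOD per compound, non-detects identified as values strictly below that LOD, a log-normal model, and a crude coordinate-search optimiser. The function names `censored_loglik`, `fit_censored_normal` and `fill_in` are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def censored_loglik(mu, sigma, detected_logs, n_censored, log_lod):
    """Log-likelihood of a normal model on the log scale, left-censored at log_lod."""
    ll = 0.0
    for x in detected_logs:
        z = (x - mu) / sigma
        ll += -math.log(sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z
    # Each non-detect contributes P(X < log_lod); guard against log(0).
    p_below = NormalDist(mu, sigma).cdf(log_lod)
    return ll + n_censored * math.log(max(p_below, 1e-300))


def fit_censored_normal(detected_logs, n_censored, log_lod, n_iter=60):
    """Crude maximum-likelihood fit: coordinate search with shrinking steps.

    Assumes at least two distinct detected values.
    """
    mu, sigma = mean(detected_logs), stdev(detected_logs)
    step = sigma
    best = censored_loglik(mu, sigma, detected_logs, n_censored, log_lod)
    for _ in range(n_iter):
        improved = False
        for d_mu, d_sigma in ((step, 0), (-step, 0), (0, step), (0, -step)):
            cand_mu, cand_sigma = mu + d_mu, sigma + d_sigma
            if cand_sigma <= 0:
                continue
            ll = censored_loglik(cand_mu, cand_sigma, detected_logs, n_censored, log_lod)
            if ll > best:
                mu, sigma, best, improved = cand_mu, cand_sigma, ll, True
        if not improved:
            step /= 2  # no better neighbour: refine the search
    return mu, sigma


def fill_in(values, lod, seed=0):
    """Return log-scale values; those below lod are replaced by truncated draws."""
    rng = random.Random(seed)
    log_lod = math.log(lod)
    detected_logs = [math.log(v) for v in values if v >= lod]
    n_censored = sum(v < lod for v in values)
    mu, sigma = fit_censored_normal(detected_logs, n_censored, log_lod)
    dist = NormalDist(mu, sigma)
    p_below = dist.cdf(log_lod)
    log_val_i = []
    for v in values:
        if v >= lod:
            log_val_i.append(math.log(v))
        else:
            # Inverse-CDF draw restricted to the region below log(LOD).
            u = max(rng.random(), 1e-12) * p_below
            log_val_i.append(dist.inv_cdf(u))
    return log_val_i


# Illustrative data only: two readings fall below a hypothetical LOD of 0.3.
example = [0.1, 0.2, 0.5, 0.8, 1.2, 2.0, 3.5, 5.0, 0.4, 1.0]
print(fill_in(example, lod=0.3, seed=1))
```

Drawing from the truncated fitted distribution, instead of substituting a single constant, keeps some variability among the imputed values; the fixed seed only makes the illustration reproducible.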

We will use the example of the Sepages pregnancy phenols data.

Here is an overview of the process and the different variables created:

val_crude is processed according to the share of samples in which the compound is detected:

  • >30% detected: val_crude -> log transform -> log_val -> impute values below LOD with fill-in method -> log_val_i, then:
      • log_val_i -> exponentiate -> val_i
      • log_val_i -> correct for protocol vars -> log_val_i_cor -> exponentiate -> val_i_cor
  • <=30% detected: val_crude -> categorise -> val_cat

Figure 0.1: Sepages data preprocessing flow-chart
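
The flow-chart can also be read as a small routine that produces the named variables in order. The following is only a sketch under several assumptions that are not given in this document: a single LOD per compound, one numeric protocol variable with a chosen reference value, a correction that removes the fitted linear effect of that variable on the log scale (a simplified stand-in for the regression-residual approach of Mortamais et al. 2012), and a two-level "detected" / "not detected" coding for `val_cat`. The default imputer `substitute` replaces non-detects with LOD/√2 and is explicitly not the fill-in method; it is only there so the example runs on its own. All function names are hypothetical.

```python
import math


def route(val_crude, lod, threshold=0.30):
    """Choose the flow-chart branch from the share of samples at or above the LOD."""
    detected = sum(v >= lod for v in val_crude) / len(val_crude)
    return "impute" if detected > threshold else "categorise"


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (single covariate)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def correct(log_val_i, protocol_var, reference):
    """Shift each log value to the level expected at the reference condition."""
    beta = ols_slope(protocol_var, log_val_i)
    return [y - beta * (z - reference) for y, z in zip(log_val_i, protocol_var)]


def substitute(log_val, log_lod):
    """Stand-in imputer (NOT the fill-in method): None becomes log(LOD / sqrt(2))."""
    return [log_lod - 0.5 * math.log(2) if v is None else v for v in log_val]


def preprocess(val_crude, lod, protocol_var, reference, impute=substitute):
    """Return the variables of the flow-chart for one compound."""
    if route(val_crude, lod) == "categorise":
        # Hypothetical two-level coding; the actual categories are not stated here.
        return {"val_cat": ["detected" if v >= lod else "not detected" for v in val_crude]}
    # Non-detects are left as None on the log scale until they are imputed.
    log_val = [math.log(v) if v >= lod else None for v in val_crude]
    log_val_i = impute(log_val, math.log(lod))
    log_val_i_cor = correct(log_val_i, protocol_var, reference)
    return {
        "log_val": log_val,
        "log_val_i": log_val_i,
        "val_i": [math.exp(v) for v in log_val_i],
        "log_val_i_cor": log_val_i_cor,
        "val_i_cor": [math.exp(v) for v in log_val_i_cor],
    }


# Illustrative data only: six samples and one hypothetical protocol variable.
result = preprocess(
    [0.1, 0.5, 0.8, 1.2, 2.0, 3.5],
    lod=0.3,
    protocol_var=[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    reference=3.0,
)
print(sorted(result))
```

Passing the imputer as an argument keeps the routing logic separate from the imputation method, so a fill-in implementation can be plugged in without changing the rest of the routine.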