Chapter 4 Calibration procedure

To perform the geographic assignment of samples of unknown origins (i.e. assignment samples), you need to produce an isoscape that describes how the isotope composition of source samples vary in space (see section 3). If the assignment samples are of different nature of the source samples (e.g. the source samples are precipitation water and the assignment samples are some animals), you will also need to apply a calibration procedure. The goal of the calibration procedure is to convert the isotope values of the assignment samples into their source isotope value equivalent so that they can directly be mapped onto the isoscape.

To perform such calibration procedure, we need to fit a statistical model characterizing the relationship between the isotopic composition of the source and assignment samples. We call such a statistical model the calibration model and the fitted model the calibration fit. In other publications, the latter is also referred to as the calibration function, calibration curve, or transfer function.

4.1 Preparing the data

To fit the calibration model, we need isotope measurements from samples of known origin (i.e. the calibration samples) and the isotopic composition of the environment at such origin locations. When assignment samples are animals, the usual practice is to derive the calibration data from sedentary individuals. That way the sampling location and the origin location are identical. Further, because most empiricists do not actually collect environmental samples at the locations of the calibration samples, IsoriX will infer the source isotope data corresponding to each calibration sample from the isoscape build in section 3.

Obtaining good data for the calibration fit is very important for obtaining a precise assignments. You should aim for data as close as possible to the following ideal situation:

  1. the data are from many calibration samples covering the entire assignment area. Thus your data should span the whole range of each covariate considered in the models, which should limit risks associated with extrapolation when using the fitted calibration model during the geographic assignment.

  2. the data correspond to many repeated measurements per location, which allows for a reliable estimation of the inter-sample variance in isotopic composition. Such variance captures the effect of temporal variation in source isotope values at each site.

  3. the timing of the sampling events for the calibration samples matches the overall sampling period of the source isotope data.

  4. the data are collected from samples directly comparable to those that need to be assigned; such as organisms from the same species and the same physiological state. Indeed, the relationship between the isotopic composition of an organism and that of its environment is known to be influenced by such factors.

As an example, we will use here the calibration dataset CalibDataBat that is provided with IsoriX. This dataset contains stable isotope ratios of the non-exchangeable portion of hydrogen in fur keratin of non-migratory bats sampled across Europe:

site_ID long lat elev sample_ID species sample_value
site_41 18.87101 46.19747 90 1 4 -71.8
site_41 18.87101 46.19747 90 2 4 -103.4
site_41 18.87101 46.19747 90 3 4 -68.9
site_41 18.87101 46.19747 90 4 4 -103.3
site_67 19.10489 48.53405 522 5 4 -92.3
site_67 19.10489 48.53405 522 6 4 -102.0

For your own applications, make sure that the dataset you use has the same structure. That is, the calibration dataset must contain a column site_ID of class factor describing the sampling site, the columns containing all covariates required for the geostatistical models to predict the isoscape values at these locations (so here long, lat, and elev), and a column sample_value containing the measured isotope values of the calibration samples. An important constraint to keep in mind is that only measurements on multiple calibration samples per location can capture variation between different samples exposed to the same isotope environment. Therefore, IsoriX needs multiple calibration samples for several (not necessarily all) sampling locations. That means that some levels of site_ID must be repeated on several rows. If laboratory measurements are replicated, we recommend to consider the mean measurement value for each sampling unit (here for each bat) instead of considering multiple rows for a given calibration sample in order to avoid confounding biological replicates from technical ones. If you do so, note however that this implies the technical variance to be neglected.

4.2 Fitting the model

With the calibration data at hand, we can thus proceed to fit the calibration model, which will be later used during the geographic assignment to convert the isotope values of the assignment samples into their source isotope value equivalent.

For fitting the calibration model, we use the function calibfit():

CalibBats <- calibfit(data = CalibDataBat, isofit = EuropeFit)

To get the result of the calibration fit, you can simply type the name of the object we just created, or plot it:

## Fixed effect estimates of the calibration fit 
## sample_value = intercept + slope * mean_source_value + corrMatrix(1|site_ID) + slope^2 * (1|site_ID) + Error 
##            intercept (+/- SE) = -32.22 +/- 6.18 
##            slope     (+/- SE) = 0.91 +/- 0.10 
## [for more information, use summary()]

The calibration model that is fitted is a particular linear mixed-effects model (LMM). The particularity lies in the calibration model controlling for the uncertainty of the isoscape predictions. The function calibfit() not only considers the prediction variance associated with each prediction from the mean model (which varies spatially) but also the prediction covariances between the different predictions of the isotope values across the isoscape. Moreover, the variation between calibration samples is captured by the residual variance of the calibration fit.

A strong assumption made while fitting the calibration model is that the relationship between the isotope values of the calibration samples and those of the environment are linearly related. For the moment, we offer no alternative to this assumption.