Climate is one of the most important factors influencing the geographic ranges of species. Each species is adapted to a particular set of climate conditions, which is referred to as its climate niche. The fundamental niche is the hypothetical range of conditions in which a species could persist independently of other species. The realized niche is the narrow range of conditions in which a species can persist when interactions with other species, such as predation and competition, are also taken into account. The realized niche can be characterized by observing where a species is present and absent over large areas and modeling the association between species occurrence and climate variables measured where the species is present and absent (Elith and Leathwick 2009).
Because of the changes in Earth’s climate that are occurring as a result of anthropogenic greenhouse gas emissions, the locations that are most climatically favorable for different species will shift over time. These changes have the potential to radically alter local communities and ecosystems. There is concern that some species may be threatened if they are not able to migrate fast enough to keep up with shifting climate patterns or become disjunct from other mutualistic species such as pollinators. This chapter will explore these issues by modeling the relationships between tree species and climate in mountains of the Pacific Northwest and analyzing how species distributions may change under a future climate change scenario.
Three new packages will be used in this chapter. The spatstat package (Baddeley, Turner, and Rubak 2022) implements methods for the manipulation and analysis of point pattern data. The dismo package (Hijmans et al. 2022) contains a suite of tools for species distribution modeling. The ROCR package (Sing et al. 2005) is used to generate receiver operating characteristics curves and other graphs and statistics to assess prediction accuracy.
library(tidyverse) library(sf) library(terra) library(colorspace) library(ggspatial) library(dismo) library(spatstat) library(ROCR) source("rasterdf.R")
The study will encompass the Cascade Mountain Range in Washington State. This area encompasses three ecoregions: the Cascades, East Cascades, and North Cascades. These polygons are read in from a shapefile. Two additional datasets are read in with presence and absence points for subalpine fir (Abies lasiocarpa) and Douglas-fir (Pseudotsuga menziesii). Note that Douglas is a proper name, and Douglas-fir is not a true fir in the genus Abies, which is why the common name of the species is capitalized and hyphenated. The data are from forest inventory plots monitored by the U.S. Forest Service and other land management agencies. These data files were derived from the archived datasets provided by Charney et al. (2021).
<- st_read("wacascades.shp", quiet = TRUE) wacascades <- read_csv("abla.csv") abla <- read_csv("psme.csv")psme
psme data frames contain columns for longitude and latitude (
Y) along with a column of binary (1/0) data for species presence/absence. These data frames are converted into
sf objects and assigned the correct coordinate system (geographic with NAD83 datum).
<- st_as_sf(abla, abla_pts coords = c("X", "Y")) st_crs(abla_pts) <- 4326 <- st_as_sf(psme, psme_pts coords = c("X", "Y")) st_crs(psme_pts) <- 4326
The map of presence-absence points shows that subalpine fir is restricted to relatively high elevations near the Cascade crest (Figure 12.1).
ggplot() + geom_sf(data = abla_pts, aes(color = as.character(abla)), size = 0.25) + geom_sf(data = wacascades, fill = NA) + scale_color_manual(name = "Subalpine Fir", values = c("gray", "darkgreen"), labels = c("Absent", "Present")) + annotation_scale(location = 'br') + theme_void()
In contrast, Douglas-fir is distributed more widely at lower elevations and along the eastern edge of the Cascades (Figure 12.2).
ggplot() + geom_sf(data = psme_pts, aes(color = as.character(psme)), size = 0.25) + geom_sf(data = wacascades, fill = NA) + scale_color_manual(name = "Douglas-fir", values = c("gray", "darkgreen"), labels = c("Absent", "Present")) + annotation_scale(location = 'br') + theme_void()
One of the challenges with visualizing the points is that the overall density of plots is variable. In general, there is a higher density of plots on public lands such as National Forests and a lower density on private lands. One way to try to see the occurrence patterns more clearly is to compute kernel-weighted local means. This technique is a type of point-to-raster conversion in which each raster cell in the output is assigned a mean of the local presence/absence values, where points located closer to the cell are given a higher weight than those more distant.
To make this transformation, the point data must be in a projected coordinate system instead of a geographic coordinate system. The Cascades county shapefile as well as the subalpine fir and Dougas-fir point datasets are all reprojected into UTM zone 10 north.
<- st_transform(wacascades, 32610) boundary_utm <- st_transform(abla_pts, 32610) abla_utm <- st_transform(psme_pts, 32610)psme_utm
The spatstat package contains a function for generating kernel-weighted means. However, it cannot directly read the data as
sf objects. They must first be converted to spatstat
ppp objects using the
as.ppp() function. The column of ones and zeroes is automatically read in and stored as the “marks” for the point pattern object. Then the
as.owin() function is used to create an
owin (observation window) object and assign it to the
<- as.ppp(abla_utm) abla_ppp class(abla_ppp) ##  "ppp" Window(abla_ppp) <- as.owin(boundary_utm) abla_ppp## Marked planar point pattern: 4356 points ## marks are numeric, of storage type 'double' ## window: polygonal boundary ## enclosing rectangle: [511678.3, 760623.9] x [5046278, ## 5432695] units
Smooth.ppp() function can now be used to compute kernel-weighted means. The
sigma argument specifies the
bandwidth of the kernel, which can be conceptualized as the approximate radius of the local window used to compute the means. The
eps argument specifies the grid cell size of the output grid. The output is a spatstat
im object, which can be converted to a terra
SpatRaster object. The coordinate reference system information is lost in the conversion, so the same CRS as the original
abla_ppp object is assigned to the new raster dataset.
<- Smooth.ppp(abla_ppp, abla_im sigma=10000, eps=c(1000, 1000)) class(abla_im) ##  "im" <- rast(abla_im) abla_grid crs(abla_grid) <-"epsg:32610"
The output provides a smoothed estimate of the proportion of plots with subalpine fir (Figure 12.3).
<- rasterdf(abla_grid) abla_df ggplot(data = abla_df) + geom_raster(aes(x = x, y = y, fill = value)) + scale_fill_gradient(name = "Subalpine Fir", low = "lightyellow", high = "darkgreen", na.value = NA) + geom_sf(data = boundary_utm, fill = NA) + annotation_scale(location = 'br') + coord_sf(expand = F) + theme_void()
The same procedure is repeated for Douglas-fir to generate a smoothed raster of kernel-weighted means.
<- as.ppp(psme_utm) psme_ppp Window(psme_ppp) <- as.owin(boundary_utm) <- Smooth.ppp(psme_ppp, psme_im sigma=10000, eps=c(1000, 1000)) <- rast(psme_im) psme_grid crs(psme_grid) <-"epsg:32610"
Mapping the smoothed occurrence data helps to show that Douglas-fir is broadly distributed in the southwestern portion of Washington Cascades (Figure 12.4), which is not as clear when looking at the individual points (Figure 12.2).
<- rasterdf(psme_grid) psme_df ggplot(data = psme_df) + geom_raster(aes(x = x, y = y, fill = value)) + scale_fill_gradient(name = "Douglas-fir", low = "lightyellow", high = "darkgreen", na.value = NA) + geom_sf(data = boundary_utm, fill = NA) + annotation_scale(location = 'br') + coord_sf(expand = F) + theme_void()
The WorldClim dataset will be used to analyze relationships between tree species and climate. Worldclim is a downscaled global climate dataset available as a 1000 m grid that captures fine-scale variation associated with elevation (Fick and Hijmans 2017). Thus, it is well-suited for rugged terrain like the Washington Cascades. The basic WorClim variables include monthly temperature and precipitation summarized as monthly climatologies over a 30-year period. These monthly data have been used to generate a set of bioclimatic variables that are more suitable as predictors in species distribution models (Table 12.1).
<- rast("wc2.1_30s_bio_washington.tif") wcbio nlyr(wcbio) ##  19 names(wcbio) ##  "wc2.1_30s_bio_washington_1" ##  "wc2.1_30s_bio_washington_2" ##  "wc2.1_30s_bio_washington_3" ##  "wc2.1_30s_bio_washington_4" ##  "wc2.1_30s_bio_washington_5" ##  "wc2.1_30s_bio_washington_6" ##  "wc2.1_30s_bio_washington_7" ##  "wc2.1_30s_bio_washington_8" ##  "wc2.1_30s_bio_washington_9" ##  "wc2.1_30s_bio_washington_10" ##  "wc2.1_30s_bio_washington_11" ##  "wc2.1_30s_bio_washington_12" ##  "wc2.1_30s_bio_washington_13" ##  "wc2.1_30s_bio_washington_14" ##  "wc2.1_30s_bio_washington_15" ##  "wc2.1_30s_bio_washington_16" ##  "wc2.1_30s_bio_washington_17" ##  "wc2.1_30s_bio_washington_18" ##  "wc2.1_30s_bio_washington_19"
The default layer names are replaced by a set of abbreviated codes.
<- paste0("bio", c(1, 10:19, 2:9)) wcbnames names(wcbio) <- wcbnames wcbnames##  "bio1" "bio10" "bio11" "bio12" "bio13" "bio14" "bio15" ##  "bio16" "bio17" "bio18" "bio19" "bio2" "bio3" "bio4" ##  "bio5" "bio6" "bio7" "bio8" "bio9"
|bio1||Annual Mean Temperature|
|bio2||Mean Diurnal Range|
|bio3||Isothermality (BIO2/BIO7) (×100)|
|bio5||Max Temperature of Warmest Month|
|bio6||Min Temperature of Coldest Month|
|bio7||Temperature Annual Range|
|bio8||Mean Temperature of Wettest Quarter|
|bio9||Mean Temperature of Driest Quarter|
|bio10||Mean Temperature of Warmest Quarter|
|bio11||Mean Temperature of Coldest Quarter|
|bio13||Precipitation of Wettest Month|
|bio14||Precipitation of Driest Month|
|bio16||Precipitation of Wettest Quarter|
|bio17||Precipitation of Driest Quarter|
|bio18||Precipitation of Warmest Quarter|
|bio19||Precipitation of Coldest Quarter|
The Washington Cascades ecoregion boundaries are reprojected into a geographic coordinate system with WGS84 datum to match the coordinate reference system of the plot data and the WorldClim data. These polygons are then used to crop and mask the Washington Cascades from the larger Worldclim dataset.
<- st_transform(wacascades, st_crs(wcbio)) boundary_wgs84 <- crop(wcbio, vect(boundary_wgs84)) wcbio_crop <- mask(wcbio_crop, vect(boundary_wgs84))wcbio_msk
The map of maximum temperature during the warmest month of the year highlights the effects of elevation, with the highest temperatures at the fringes of the mountain range and in the larger river valleys. The lowest temperatures occur along the Cascade crest and at the peaks of the large volcanoes (Figure 12.5).
<- rasterdf(wcbio_msk[["bio5"]]) mtwm_df ggplot(data = mtwm_df) + geom_raster(aes(x = x, y = y, fill = value)) + scale_fill_gradient(name = "Temperature (\u00B0C)", low = "lightyellow", high = "darkred", na.value = NA) + geom_sf(data = boundary_wgs84, fill = NA) + annotation_scale(location = 'br') + coord_sf(expand = F) + theme_void()
The map of precipitation during the wettest month shows the interactions of moist maritime air with topography, with the highest values occurring west of the Cascade Crest and the lowest values in the rain shadow on the eastern slopes (Figure 12.6).
<- rasterdf(wcbio_msk[["bio13"]]) mtwm_df ggplot(data = mtwm_df) + geom_raster(aes(x = x, y = y, fill = value)) + scale_fill_gradient(name = "Precipitation (mm)", low = "lightblue", high = "darkblue", na.value = NA) + geom_sf(data = boundary_wgs84, fill = NA) + annotation_scale(location = 'br') + coord_sf(expand = F) + theme_void()
To prepare the data for modeling, the
extract() function is used to obtain climate variables from
wcbio_mask for every plot location in the subalpine fir dataset.
<- extract(wcbio_msk, vect(abla_pts)) %>% abla_bio bind_cols(abla_pts) %>% as.data.frame() names(abla_bio) ##  "ID" "bio1" "bio10" "bio11" "bio12" ##  "bio13" "bio14" "bio15" "bio16" "bio17" ##  "bio18" "bio19" "bio2" "bio3" "bio4" ##  "bio5" "bio6" "bio7" "bio8" "bio9" ##  "abla" "geometry"
To assess the predictive accuracy of the model, a subset of the data needs to be held out from the calibration step and reserved for validation. The data can be split using the
sample_frac() function from the dplyr package, where the
size argument is the proportion of observations in the sample. There is a similar function called
sample_n() that can be used to sample a specific number of observations. The
anti_join() function is then used to select the validation points as all the observations in
abla_bio that are not present in
set.seed(22003) <- abla_bio %>% abla_train sample_frac(size = 0.7) <- abla_bio %>% abla_val anti_join(abla_train, by = "ID")
Boosted regression trees are used to model the relationships between presence/absence of subalpine fir and the 19 bioclimatic variables. BRT is a machine learning method that combines regression trees, which relate species occurrence to a set of climate variables using a series of recursive binary splits, with boosting, an ensemble approach that combines many simple models to improve predictive accuracy (Elith, Leathwick, and Hastie 2008). BRT models can account for non-linear relationships with climate as well as interactions between two or more climate variables. In addition to generating predictions, information about the relative importance of climate predictors can be derived, and climate effects can be visualized using partial regression plots.
BRT models are run using the
gbm.step() function from the dismo package. A data frame containing training data is provided along with the indices of the columns to use as the predictor variables (
gbm.x) and the response variable (
family = "bernoulli" argument indicates that the model will treat the response as a binomial (presence/absence) variable. The
bag.fraction arguments control aspects of the gradient boosting algorithm that is used to fit the model. Consult Elith, Leathwick, and Hastie (2008) for more details on these parameters as well as recommended approaches for parameter selection. Here, a set of values that have been determined to yield good predictions with these datasets is used. Other arguments control whether plots and output to the console are automatically produced.
<- gbm.step(data = abla_train, abla_mod gbm.x = 2:20, gbm.y = 21, family = "bernoulli", tree.complexity = 3, learning.rate = 0.01, bag.fraction = 0.5, plot.main = FALSE, verbose = FALSE, silent = TRUE)
After running the model, the summary function can be used to extract information about the relative importance of the predictor variables. The relative importance is based on the number of times a variable is selected for splitting the regression trees weighted by the resulting improvement to the BRT model. The importance values are then rescaled so they sum to 100. For subalpine fir, the mean annual temperature (bio1) has a stronger influence in the model than any other bioclimatic variable followed by mean temperature of the warmest quarter (bio10) mean temperature of the driest quarter (bio9), and precipitation of the warmest quarter (bio18).
<- summary(abla_mod, plotit = FALSE) abla_imp abla_imp## var rel.inf ## bio1 bio1 36.0562317 ## bio10 bio10 8.0557059 ## bio9 bio9 6.2651729 ## bio18 bio18 6.0591309 ## bio3 bio3 5.9187074 ## bio17 bio17 5.1232150 ## bio15 bio15 4.4587608 ## bio14 bio14 4.4537895 ## bio8 bio8 4.2289912 ## bio11 bio11 3.9730128 ## bio4 bio4 3.0952993 ## bio5 bio5 2.5050354 ## bio6 bio6 1.9115928 ## bio12 bio12 1.9019258 ## bio2 bio2 1.7437088 ## bio7 bio7 1.3484148 ## bio16 bio16 1.0302154 ## bio13 bio13 0.9892763 ## bio19 bio19 0.8818131
Partial residual plots show the nonparametric relationships between subalpine fir occurrence and the predictor variables. They can be generated using the
gbm.plot() function from the dismo package. Here, the partial plots for the four most important predictors are shown in a 2 x 2 layout.
Over the range of temperatures in the study area, subalpine fir occurs at the lowest mean annual temperatures (bio1) and declines monotonically with increasing temperature (Figure 12.7). Subalpine fir occurrence also decreases slightly with mean temperature of the warmest quarter (bio10) and precipitation in the warmest quarter (bio18) and is slightly higher at intermediate values of mean temperature during the driest quarter (bio9).
gbm.plot(abla_mod, n.plots = 4, write.title = FALSE, plot.layout = c(2, 2))
The predicted probability of subalpine fir occurrence can be mapped using the terra prediction function. The
SpatRaster object containing the predictor variables is the first argument, followed by the model object. The
type = "response" argument is also necessary to transform the predictions to the 0-1 scale, and
na.rm = TRUE suppresses predictions outside of the study area where there are no data.
<- predict(object = wcbio_msk, abla_cur model = abla_mod, type = "response", na.rm = TRUE)
Compared to the smoothed map (Figure 12.3), the predicted map based on the WorldClim bioclimatic variables shows much more local variability associated with topography (Figure 12.8). In particular, the species is predicted to be most abundant on high ridges at the Cascade crest and eastward. The circular rings are the upper elevation limits on large volcanoes such as Mount Rainier and Mount Adams.
<- rasterdf(abla_cur) abla_cur_df ggplot(data = abla_cur_df) + geom_raster(aes(x = x, y = y, fill = value)) + scale_fill_gradient(name = "Subalpine Fir", low = "lightyellow", high = "darkgreen", na.value = NA) + geom_sf(data = boundary_wgs84, fill = NA) + annotation_scale(location = 'br') + coord_sf(expand = F) + theme_void()
The data preparation and modeling steps are repeated for Douglas-fir.
set.seed(22004) <- extract(wcbio_msk, vect(psme_pts)) %>% psme_bio bind_cols(psme_pts) %>% as.data.frame() <- psme_bio %>% psme_train sample_frac(size = 0.7) <- psme_bio %>% psme_val anti_join(abla_train, by = "ID") <- gbm.step(data=psme_train, psme_mod gbm.x = 2:20, gbm.y = 21, family = "bernoulli", tree.complexity = 3, learning.rate = 0.01, bag.fraction = 0.5, plot.main = FALSE, verbose = FALSE, silent = TRUE)
The variable importance ranking is different for Douglas-fir compared to Subalpine Fir. The two most important variables, maximum temperature of the warmest month (bio5) and mean temperature of the warmest quarter (bio10) have considerably higher relative importance than the rest of the variables.
<- summary(psme_mod, plotit = FALSE) psme_imp psme_imp## var rel.inf ## bio5 bio5 19.3455300 ## bio10 bio10 13.5939889 ## bio14 bio14 7.3627926 ## bio17 bio17 7.0770029 ## bio18 bio18 6.0665693 ## bio4 bio4 5.4373063 ## bio15 bio15 5.2734444 ## bio3 bio3 4.9268249 ## bio9 bio9 4.1113149 ## bio1 bio1 3.9882327 ## bio2 bio2 3.7682976 ## bio6 bio6 3.4319576 ## bio12 bio12 3.0354558 ## bio11 bio11 2.9986297 ## bio8 bio8 2.8962037 ## bio7 bio7 2.8731305 ## bio16 bio16 1.4739840 ## bio19 bio19 1.3695103 ## bio13 bio13 0.9698239
The relationships between Douglas-fir and these two variables are unimodal, with species occurrence peaking across a range of optimal temperatures and decreasing at cooler and warmer temperatures (Figure 12.9). Douglas-fir also has unimodal relationships with precipitation in the driest month (bio14) and precipitation in the driest quarter (bio17).
gbm.plot(psme_mod, n.plots = 4, write.title = FALSE, plot.layout = c(2, 2))
The predicted occurrence map shows that Douglas-fir is widespread throughout the southern part of the Washington Cascades but is confined to lower-elevation valleys in the northern part of the region (Figure 12.10).
<- predict(object = wcbio_msk, psme_cur model = psme_mod, type = "response", na.rm = TRUE) <- rasterdf(psme_cur) psme_cur_df ggplot(data = psme_cur_df) + geom_raster(aes(x = x, y = y, fill = value)) + scale_fill_gradient(name = "Douglas-fir", low = "lightyellow", high = "darkgreen", na.value = NA) + geom_sf(data = boundary_wgs84, fill = NA) + annotation_scale(location = 'br') + coord_sf(expand = F) + theme_void()
To understand how accurate the species distribution model predictions are, the
predict() function is used to generate predictions based on the validation dataset. In this example, the
predict() method for
gbm objects is invoked, requiring different arguments than the previous examples, which used the
predict() method for
SpatRaster objects. The first argument is the
gbm model, and the
newdata argument is used to specify the
abla_val data frame, which was created earlier when the full dataset was split into training and validation sets. The vector of predicted values is combined with the observed values from the
abla_val data frame, and the
prediction() function from the ROCR package is used to generate a
prediction object that can be used for computing accuracy statistics.
<- predict(abla_mod, abla_pred newdata = abla_val, type = "response") <- prediction(abla_pred, abla_val$abla)abla_predobs
Although the observations are binary, the output of the model is a continuous probability between zero and one. The receiver operating characteristic curve (ROC) is commonly used to evaluate the predictive capability of such a model. The ROC is generated by using the predicted probability to classify presence or absence over a range of cutoff values and exploring the tradeoff between predictions of presence and absence. At the extreme, if all predicted probabilities less than one are assumed to be positive, then all points are classified as present. In this case, the model will correctly predict all the locations where subalpine fir is actually present but fail to predict any of the points where it is actually absent. The reverse is true if zero is used as the threshold and all points are classified as absent.
The ROC curve describes the changes in true positive rates and false positive rates over all possible cutoff values. The
performance() function in the ROCR package is used to generate a
performance object that includes these two measures.
= performance(abla_predobs, abla_roc measure = "tpr", x.measure = "fpr") class(abla_roc) ##  "performance" ## attr(,"package") ##  "ROCR"
performance object could be visualized using a generic
plot() method, but this example will show how to extract the underlying data and generate an ROC curve with
ggplot(). The ROCR package uses the
S4 object system. Details of
S4 are outside the scope of this book, but at the most basic level, these objects require new functions to visualize and extract their components, which are stored in “slots”.
slotNames(abla_roc) ##  "x.name" "y.name" "alpha.name" ##  "x.values" "y.values" "alpha.values"
slot() function can be used to access the slots in an
S4 object. The
y.values contain the false positive rates and the true positive rates for the ROC curve. These data are stored as lists with a single element. Double brackets are used to subscript these list elements and return numeric vectors, which are combined into a data frame.
<- slot(abla_roc, "x.values")[] abla_fpr <- slot(abla_roc, "y.values")[] abla_tpr <- data.frame(abla_fpr, abla_tpr)abla_aucplot
This data frame can be used with
ggplot() to graph the ROC curve. As the false positive rate (equal to one minus the true negative rate) decreases, the true positive rate also decreases (Figure 12.11).
ggplot(data = abla_aucplot) + geom_line(aes(x = abla_fpr, y = abla_tpr), col = "red") + labs(x = "False Positive Rate", y = "True Positive Rate") + geom_abline(slope = 1, intercept = 0) + scale_x_continuous(expand = c(0.005, 0)) + scale_y_continuous(expand = c(0.005, 0)) + coord_fixed() + theme_bw()
Another way to visualize these relationships is to graph the overall accuracy, true positive rate, and true negative rate as a function of the cutoff used for prediction. The
performance() function is used to create
performance objects containing these three statistics. The values are then extracted and combined into a long-format data frame.
= performance(abla_predobs, abla_all measure = "acc") = performance(abla_predobs, abla_pos measure = "tpr") = performance(abla_predobs, abla_neg measure = "tnr") <- slot(abla_all, "x.values")[] cutoff <- slot(abla_all, "y.values")[] totacc <- slot(abla_pos, "y.values")[] posacc <- slot(abla_neg, "y.values")[] negacc <- data.frame(cutoff, abla_accplot totacc, posacc,%>% negacc) pivot_longer(cols = one_of("totacc", "posacc", "negacc"), values_to = "accval", names_to = "accstat")
The results suggest that a cutoff value around 0.25 would be effective for classifying presence or absence based on the predictions (Figure 12.12) . The overall accuracy is close to the maximum, and the true negative and true positive rates are balanced and relatively high.
ggplot(data = abla_accplot) + geom_line(aes(x = cutoff, y = accval, col = accstat)) + labs(x = "Classification Cutoff", y = "Classification Accuracy", color = "Accuracy Statistic") + scale_color_discrete(labels = c("True Negative Rate", "True Positive Rate", "Overall Accuracy")) + scale_x_continuous(expand = c(0.005, 0)) + scale_y_continuous(expand = c(0.005, 0)) + coord_fixed() + theme_bw()
The area under the ROC curve, often abbreviated as the AUROC or just the AUC, is frequently used as an index of the predictive power of a probabilistic model for binary data like species presence/absence. Values range from a minimum of 0.5 to a maximum of 1.0. The AUC statistic can be calculated using the
<- performance(abla_predobs, measure = "auc") abla_aucval slot(abla_aucval, "y.values")[] ##  0.910495
The AUC for the subalpine fir model is 0.91, which is relatively high value and indicates that predictions are accurate over a range of cutoff values.
The WorldClim dataset also includes projected future climate grids based on the outputs of general circulation models (GCMs) from the Coupled Model Intercomparison Project (CMIP) version 6. This example uses a projection for 2061-2080 from the Max Planck Institute Earth System Model (MPI-ESM1.2). It is based on RCP4.5 global forcing pathway, which assumes that carbon dioxide (CO2) emissions will start to decline before 2045 and reach approximately half of their 2050 levels by 2100. The variables are the same bioclimatic indices that were computed for the historical climatology. The data are read into a raster object, assigned layer names, and cropped and masked to the Washington Cascades study area.
<- rast("wc2.1_30s_bioc_MPI-ESM1-2-HR_ssp245_2061-2080_wa.tif") wcproj nlyr(wcproj) ##  19 <- paste0("bio", 1:19) wcbnames names(wcproj) <- wcbnames <- crop(wcproj, vect(boundary_wgs84)) wcproj_crop <- mask(wcproj_crop, vect(boundary_wgs84))wcproj_msk
predict() function is used to generate predictions of subalpine fir distribution under the future climate scenario. The
object argument is used to specify the raster with projected climate variables. This raster must have the same layer names as the original variables that were used to train the model. The output,
abla_proj, is combined with the predictions under current conditions,
abla_cur, to create a multi-layer raster.
<- predict(object = wcproj_msk, abla_proj model = abla_mod, type = "response", na.rm = TRUE) <- c(abla_cur, abla_proj) abla_chg names(abla_chg) <- c("Current", "Future")
To display the predictions as binary presence or absence outcomes, the probabilities are classified using a cutoff value of 0.25.
<- ifel(abla_chg > 0.25, 1, 0)abla_clas
Comparing these two maps shows that the projected range of subalpine fir is considerably smaller under a future, warmer climate, with the species restricted to the highest elevations in the Washington Cascades (Figure 12.13).
<- rasterdf(abla_clas) abla_clas_df ggplot(data = abla_clas_df) + geom_raster(aes(x = x, y = y, fill = as.character(value))) + scale_fill_manual(name = "Subalpine Fir", values = c("lightyellow", "darkgreen"), labels = c("absent", "present"), na.translate = FALSE) + facet_wrap(~ variable) + geom_sf(data = boundary_wgs84, fill = NA) + coord_sf(expand = F) + theme_void()
In addition to generating maps, the data frame generated by the
rasterdf() function can also be used to create other types of graphs based on the values in a raster dataset. In this example, histograms of current and future mean annual temperatures show that the distribution of temperatures will be considerably higher under the future climate scenario (Figure 12.14).
<- c(wcbio_msk[["bio1"]], wcproj_msk[["bio1"]]) annmean names(annmean) <- c("Current", "Future") <- rasterdf(annmean) annmean_df ggplot(data = annmean_df) + geom_histogram(aes(x = value), bins = 20) + labs(x = "Mean Annual Temperature (\u00B0C)", y = "Count of observations") + facet_wrap(~ variable, ncol = 1) + theme_bw()
Carry out an accuracy assessment of the species distribution model for Douglas-fir and compare the results to those obtained for subalpine fir.
Predict the future distribution of Douglas-fir using the climate projections and compare the results with the projected future distribution of subalpine fir.
Generate a map that shows where the distribution of subalpine fir is projected to expand, contract, and remain the same under future climate projections. Then generate a similar map for Douglas-fir and compare the two.