17 Day 17 (March 26)

17.1 Announcements

How this class will end…
- Last lecture will be on April 30.
- Last in-class workday will be on May 2
- Portfolio will be due Friday May 3 (more details to come)
- Final presentations will occur between Tuesday, April 30 and Thursday, May 9
- Final project peer review is due May 3
- Final project is due May 9
In-class workday on Thursday
- Please come prepared to work on the class project and any unfinished activities
Journal reflections
- How do you know which distribution to choose and when?
- With so many models being presented in class, what is the proper method for model checking and evaluation?
  - Elevation example (R code)
Additional prompt added to activity 2. Please include this in your activity 2!

17.2 Spatio-temporal models for disease data

Example

The data are from Enders et al. (2018) and are available on the Dryad Digital Repository.

library(sf)
library(sp)
library(raster)

url <- "https://www.dropbox.com/scl/fi/9ymxt900s77uq50ca6dgc/Enders-et-al.-2018-data.csv?rlkey=0rxjwleenhgu0gvzow5p0x9xf&dl=1"
df1 <- read.csv(url)
df1 <- df1[,c(1,4,5,8,9,10)] # Keep only the data on bird cherry-oat aphid

head(df1)

##   BCOA BYDV.totalpos.BCOA BCOA.totaltested year      long      lat
## 1   12                  2               10 2014 -95.16269 37.86238
## 2    1                  0                1 2014 -95.28463 38.29669
## 3    2                  0                2 2014 -95.33038 39.59482
## 4    0                 NA                0 2014 -95.32098 39.50696
## 5    8                  0                8 2014 -98.55469 38.48455
## 6    1                  0                1 2014 -98.84792 38.32772

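# Download the U.S. state boundaries shapefile from census.gov and extract Kansas
download.file("http://www2.census.gov/geo/tiger/GENZ2015/shp/cb_2015_us_state_20m.zip", destfile = "states.zip")
unzip("states.zip")
sf.us <- st_read("cb_2015_us_state_20m.shp",quiet = TRUE)
# Select Kansas by name rather than by row/column position (originally sf.us[48,6])
sf.kansas <- sf.us[sf.us$NAME == "Kansas", "NAME"]
# Convert from sf to an sp object for plotting with the points below
sf.kansas <- as(sf.kansas, 'Spatial')
#plot(sf.kansas,main="",col="white")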

# Make a SpatialPointsDataFrame of the sampling locations
# BYDV.prop is the proportion of tested aphids that were BYDV positive
# (NA/NaN where no aphids were tested)
pts.sample <- data.frame(long = df1$long, lat = df1$lat,
                         count = df1$BCOA,
                         BYDV.pos = df1$BYDV.totalpos.BCOA,
                         BYDV.tot = df1$BCOA.totaltested,
                         BYDV.prop = df1$BYDV.totalpos.BCOA/df1$BCOA.totaltested)
coordinates(pts.sample) <- ~ long + lat
proj4string(pts.sample) <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")


# Plot counts of Bird cherry-oat aphid
par(mar=c(5.1, 4.1, 4.1, 8.1), xpd=TRUE)
plot(sf.kansas,main="Abundance of Bird Cherry-oat Aphid")
points(pts.sample[,1],col=rgb(0.4,0.8,0.5,0.9),pch=ifelse(pts.sample$count>0,20,4),cex=pts.sample$count/50+0.5)
legend("right",inset=c(-0.25,0),legend = c(0,1,10,20,40,60), bty = "n", text.col = "black", 
   pch=c(4,20,20,20,20,20), cex=1.3,pt.cex=c(0,1,10,20,40,60)/50+0.5,col=rgb(0.4,0.8,0.5,0.9))

# Plot the proportion of tested bird cherry-oat aphids infected with BYDV
# (locations with an NA proportion, e.g., where no aphids were tested, are drawn as crosses)
par(mar=c(5.1, 4.1, 4.1, 8.1), xpd=TRUE)
plot(sf.kansas, main="Proportion infected")
has.prop <- !is.na(pts.sample$BYDV.prop)
points(pts.sample[,4],col=rgb(0.8,0.5,0.4,0.9),pch=ifelse(has.prop,20,4),cex=ifelse(has.prop,pts.sample$BYDV.prop,0)/0.5+0.5)
legend("right",inset=c(-0.25,0),legend = c("NA",0,0.25,0.50,0.75,1.00), bty = "n", text.col = "black", pch=c(4,20,20,20,20,20), cex=1.3,pt.cex=c(0,0,0.25,0.50,0.75,1.00)/0.5+0.5,col=rgb(0.8,0.5,0.4,0.9))
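The maps above pool all sampling dates, but `df1` also carries a `year` column, so a quick per-year summary is a natural first look at the temporal side of the data. The sketch below is only an illustration: `toy` is a hypothetical data frame that retypes the six rows printed by `head(df1)`, and `by.year` is a name introduced here. With the full data, `df1` would take the place of `toy`.

```r
# Toy data: the six rows shown by head(df1), retyped so this block runs on its own
toy <- data.frame(
  BCOA               = c(12, 1, 2, 0, 8, 1),
  BYDV.totalpos.BCOA = c(2, 0, 0, NA, 0, 0),
  BCOA.totaltested   = c(10, 1, 2, 0, 8, 1),
  year               = rep(2014, 6)
)

# Totals per year; na.rm = TRUE skips sites where no aphids were tested
pos <- tapply(toy$BYDV.totalpos.BCOA, toy$year, sum, na.rm = TRUE)
tot <- tapply(toy$BCOA.totaltested, toy$year, sum)

by.year <- data.frame(
  year       = as.integer(names(tot)),
  mean.count = as.vector(tapply(toy$BCOA, toy$year, mean)),  # mean aphid count per site
  BYDV.pos   = as.vector(pos),
  BYDV.tot   = as.vector(tot),
  prop       = as.vector(pos / tot)  # pooled proportion infected, not a mean of site proportions
)
by.year
```

Pooling positives and totals before dividing weights each site by the number of aphids tested, which avoids giving a site with one tested aphid the same influence as a site with ten.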

Study goals
- Make accurate predictions of vector abundance and the probability of BYDV infection at times and locations where data were not collected.
- Understand the environmental factors (e.g., temperature) that influence vector abundance and the probability of BYDV infection.
Auxiliary data
- Weather/climate data
- Land cover data
- Crop data
Model choice: dynamic vs. descriptive spatio-temporal model?
- Class discussion
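To make the two options concrete for the discussion, here is one possible way to write each for the infection data. This is a sketch under assumptions, not the model chosen in class: $y_i$ is the number of BYDV-positive aphids out of $N_i$ tested at location $\mathbf{s}_i$ and time $t_i$, $\mathbf{x}_i$ is a vector of covariates, and $\eta$ is a spatio-temporal random effect. All symbols are introduced here for illustration.

```latex
% Data model (shared by both options)
y_i \sim \text{Binomial}(N_i, p_i), \qquad
\text{logit}(p_i) = \mathbf{x}_i^{\prime}\boldsymbol{\beta} + \eta(\mathbf{s}_i, t_i)

% Descriptive: dependence is specified through a space-time covariance function
\boldsymbol{\eta} \sim \text{N}\!\left(\mathbf{0}, \boldsymbol{\Sigma}_{\boldsymbol{\theta}}\right), \qquad
[\boldsymbol{\Sigma}_{\boldsymbol{\theta}}]_{ij} = C_{\boldsymbol{\theta}}\!\left(\mathbf{s}_i, t_i; \mathbf{s}_j, t_j\right)

% Dynamic: the spatial process evolves forward in time
\boldsymbol{\eta}_t = \mathbf{M}\,\boldsymbol{\eta}_{t-1} + \boldsymbol{\varepsilon}_t, \qquad
\boldsymbol{\varepsilon}_t \sim \text{N}(\mathbf{0}, \mathbf{Q})
```

In the descriptive form, the modeling effort goes into choosing the covariance function $C_{\boldsymbol{\theta}}$. In the dynamic form, it goes into the evolution operator $\mathbf{M}$, which can encode ideas about how the process spreads or persists from one time step to the next.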
Next steps
- Write out the statistical model(s)
- Determine how we will evaluate the model(s)
- Determine how we will make inference
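As a starting point for these steps, the following is a minimal sketch of a non-spatial baseline: a Poisson GLM for abundance and a binomial GLM for infection, each scored on held-out locations. It assumes simulated data, not the Enders et al. data. The data frame `sim`, the covariate `temp`, the second year, and all coefficient values are hypothetical. The baseline ignores spatial and temporal dependence, so it serves only as a reference that a spatio-temporal model should beat.

```r
set.seed(17)

# Simulated stand-in for the aphid data (hypothetical values throughout)
n <- 200
sim <- data.frame(
  long = runif(n, -102, -94.6),
  lat  = runif(n, 37, 40),
  year = sample(2014:2015, n, replace = TRUE),
  temp = rnorm(n, mean = 20, sd = 3)  # hypothetical environmental covariate
)
sim$count  <- rpois(n, lambda = exp(0.5 + 0.05 * sim$temp))
sim$tested <- pmin(sim$count, 10)     # at most 10 aphids tested per site
sim$pos    <- rbinom(n, size = sim$tested, prob = plogis(-2 + 0.05 * sim$temp))

# Hold out a random 25% of locations for evaluation
test.id <- sample(n, size = n / 4)
train   <- sim[-test.id, ]
test    <- sim[test.id, ]

# Abundance model: Poisson GLM with log link
m.count <- glm(count ~ temp + factor(year), family = poisson, data = train)

# Infection model: binomial GLM, using only sites where aphids were tested
m.inf <- glm(cbind(pos, tested - pos) ~ temp + factor(year),
             family = binomial, data = train[train$tested > 0, ])

# Evaluation: negative log predictive score on held-out data (lower is better)
pred.count  <- predict(m.count, newdata = test, type = "response")
score.count <- -sum(dpois(test$count, lambda = pred.count, log = TRUE))

test.inf  <- test[test$tested > 0, ]
pred.p    <- predict(m.inf, newdata = test.inf, type = "response")
score.inf <- -sum(dbinom(test.inf$pos, size = test.inf$tested,
                         prob = pred.p, log = TRUE))

# Inference: coefficient estimates and standard errors
summary(m.count)$coefficients
summary(m.inf)$coefficients
```

Holding out locations at random checks interpolation only. To check forecasting, one option is to hold out an entire year instead.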