11 Day 11 (February 27)

11.1 Announcements

I have to leave at 9:15 today!
- I will have office hours today in Dickens Hall room 109 from 2-3pm.

11.2 Extreme precipitation in Kansas

On September 3, 2018 there was an extreme precipitation event that resulted in flooding in Manhattan, KS and the surrounding areas. If you would like to know more about this, check out this link and this video.
What we may need to learn
- How to use R as a geographic information system
- New general tools from statistics
  - Gaussian process
- How to use the hierarchical modeling framework to describe Kriging
  - Hierarchical Bayesian model vs. “empirical” hierarchical model
- Specialized language used in spatial statistics (e.g., range, nugget, variogram)

11.3 Gaussian process

See bottom of pg. 139 in Wikle et al. (2019)
A Gaussian process is a probability distribution over functions
If the function is observed at a finite number of points or “locations,” then the vector of values follows a multivariate normal distribution.

11.3.1 Multivariate normal distribution

Multivariate normal distribution
\(\boldsymbol{\eta}\sim\text{N}(\mathbf{0},\sigma^{2}\mathbf{R})\)
- Definitions
Correlation matrix – A positive semi-definite matrix whose elements are the correlation between observations
Correlation function – A function that describes the correlation between observations
Example correlation matrices
Compound symmetry \[\mathbf{R}=\left[\begin{array}{cccccc} 1 & 1 & 0 & 0 & 0 & 0\\ 1 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 1 & 0 & 0\\ 0 & 0 & 1 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 1\\ 0 & 0 & 0 & 0 & 1 & 1 \end{array}\right]\]
- AR(1) \[\mathbf{R(\phi})=\left[\begin{array}{ccccc} 1 & \phi^{1} & \phi^{2} & \cdots & \phi^{n-1}\\ \phi^{1} & 1 & \phi^{1} & \cdots & \phi^{n-2}\\ \phi^{2} & \phi^{1} & 1 & \vdots & \phi^{n-3}\\ \vdots & \vdots & \vdots & \ddots & \ddots\\ \phi^{n-1} & \phi^{n-2} & \phi^{n-3} & \ddots & 1 \end{array}\right]\:\]
  - Example simulating from \(\boldsymbol{\eta}\sim\text{N}(\mathbf{0},\sigma^{2}\mathbf{R})\) in R
```
  n <- 200
  x <- 1:n
  I <- diag(1, n)
  sigma2 <- 1

  library(MASS)
  set.seed(2034)
  eta <- mvrnorm(1, rep(0, n), sigma2 * I)
  cor(eta[1:(n - 1)], eta[2:n])
```
```
## [1] -0.06408623
```
```
  acf(eta)
```
```
  par(mfrow = c(2, 1))
  par(mar = c(0, 2, 0, 0), oma = c(5.1, 3.1, 2.1, 2.1))
  plot(x, eta, typ = "l", xlab = "", xaxt = "n")
  abline(a = 0, b = 0)

  n <- 200
  x <- 1:n # Must be equally spaced
  phi <- 0.7
  R <- phi^abs(outer(x, x, "-"))
  sigma2 <- 1
  set.seed(1330)
  eta <- mvrnorm(1, rep(0, n), sigma2 * R)
  cor(eta[1:(n - 1)], eta[2:n])
```
```
## [1] 0.7016481
```
```
  plot(x, eta, typ = "l")
  abline(a = 0, b = 0)
```
```
  acf(eta)
```
- Example correlation functions
- Gaussian correlation function \[r_{ij}(\phi)=e^{-\frac{d_{ij}^{2}}{\phi}}\] where \(d_{ij}\) is the “distance” between locations i and j (note that \(d_{ij}=0\) for \(i=j\)) and \(r_{ij}(\phi)\) is the element in the \(i^{\textrm{th}}\) row and \(j^{\textrm{th}}\) column of \(\mathbf{R}(\phi)\).
```
  library(fields)
  n <- 200
  x <- 1:n
  phi <- 40
  d <- rdist(x)
  R <- exp(-d^2/phi)
  sigma2 <- 1

  set.seed(4673)
  eta <- mvrnorm(1, rep(0, n), sigma2 * R)
  plot(x, eta, typ = "l")
  abline(a = 0, b = 0)
```
```
  cor(eta[1:(n-1)], eta[2:n])
```
```
## [1] 0.9717508
```

Linear correlation function \[r_{ij}(\phi)=\begin{cases} 1-\frac{d_{ij}}{\phi} &\text{if}\ d_{ij}<0\\ 0 &\text{if}\ d_{ij}>0 \end{cases}\]

  library(fields)
  n <- 200
  x <- 1:n
  phi <- 40
  d <- rdist(x)
  R <- ifelse(d<phi,1-d/phi,0)
  sigma2 <- 1

  set.seed(4803)
  eta <- mvrnorm(1, rep(0, n), sigma2 * R)
  plot(x, eta, typ = "l")
  abline(a = 0, b = 0)

  cor(eta[1:(n-1)], eta[2:n])

## [1] 0.9778878

11.4 Linear models for correlated errors

Motivating example

ISIT <- read.table("https://www.dropbox.com/scl/fi/n5ku9crpsr384p9z2rc8b/bioluminescence.txt?rlkey=airzs2ukql4g0hqxjr72re53j&dl=1", header = TRUE)

Sources16 <- ISIT$Sources[ISIT$Station == 16]
Depth16 <- ISIT$SampleDepth[ISIT$Station == 16]
Sources16 <- Sources16[order(Depth16)]
Depth16 <- sort(Depth16)
plot(Depth16, Sources16, las = 1, ylim = c(0, 65), col = rgb(0, 0, 0, 0.25),
pch = 19,ylab="Density of bioluminescence",xlab="Depth in meters")

Linear model with correlated errors (aka Kriging)
- \(\mathbf{\textbf{y}}=\mathbf{X}\boldsymbol{\beta}+\boldsymbol{\eta}+\boldsymbol{\varepsilon}\) where \(\boldsymbol{\eta}\sim\text{N}(\mathbf{0},\sigma_{\eta}^{2}\mathbf{R}(\phi))\) and \(\boldsymbol{\varepsilon}\sim\text{N}(\mathbf{0},\sigma_{\varepsilon}^{2}\mathbf{I})\)
- The model above is the same as \(\mathbf{y}\sim\text{N}(\mathbf{X}\boldsymbol{\beta},\sigma_{\eta}^{2}\mathbf{R}(\phi)+\sigma^{2}\mathbf{I})\)
- Estimation
  - Regression coefficients \[\hat{\boldsymbol{\beta}}=(\mathbf{X}^{\prime}\mathbf{V}^{-1}\mathbf{X})^{-1}\mathbf{X}^{\prime}\mathbf{V}^{-1}\mathbf{y}\] where \(\mathbf{V=}\sigma_{\eta}^{2}\mathbf{R}(\phi)+\sigma_{\varepsilon}^{2}\mathbf{I}\)
  - Numerical maximization to estimate \(\sigma_{\eta}^{2}\), \(\sigma^{2}\), and \(\phi\)
- Error terms \[\hat{\boldsymbol{\eta}}=\hat{\sigma}_{\eta}^{2}\mathbf{R}(\hat{\phi})(\hat{\sigma}_{\varepsilon}^{2}\mathbf{I}+\hat{\sigma}_{\eta}^{2}\mathbf{R}(\hat{\phi}))^{-1}(\mathbf{y}-\mathbf{X}\hat{\boldsymbol{\beta}})\] \[\hat{\boldsymbol{\varepsilon}} = \mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}} - \hat{\boldsymbol{\eta}}\]

Bioluminescence example

Intercept only linear model

ISIT <- read.table("https://www.dropbox.com/scl/fi/n5ku9crpsr384p9z2rc8b/bioluminescence.txt?rlkey=airzs2ukql4g0hqxjr72re53j&dl=1", header = TRUE)

Sources16 <- ISIT$Sources[ISIT$Station == 16]
Depth16 <- ISIT$SampleDepth[ISIT$Station == 16]
Sources16 <- Sources16[order(Depth16)]
Depth16 <- sort(Depth16)

m1 <- lm(Sources16~1)
e.hat.m1 <- residuals(m1)

plot(Depth16, e.hat.m1, 
 xlab="Depth (m)",ylab=expression(hat(epsilon)),ylim=c(-15,60),las=1,col=rgb(0,0,0,0.25))
abline(a=0,b=0)

Intercept only linear model with spatial random effect

library(nlme)
m2 <- gls(Sources16~1,correlation=corGaus(form=~Depth16,nugget=TRUE,value=c(100,0.05),fixed=FALSE),method="ML")

summary(m2)

## Generalized least squares fit by maximum likelihood
##   Model: Sources16 ~ 1 
##   Data: NULL 
##        AIC      BIC    logLik
##   296.3003 304.0276 -144.1502
## 
## Correlation Structure: Gaussian spatial correlation
##  Formula: ~Depth16 
##  Parameter estimate(s):
##        range       nugget 
## 610.73574394   0.01300607 
## 
## Coefficients:
##                Value Std.Error  t-value p-value
## (Intercept) 17.73824  8.981483 1.974978  0.0538
## 
## Standardized residuals:
##        Min         Q1        Med         Q3        Max 
## -0.8937320 -0.8247052 -0.6866517  0.3991341  1.7952883 
## 
## Residual standard error: 19.84738 
## Degrees of freedom: 51 total; 50 residual

# estimated range & nugget parameters from gls
phi <- exp(m2$model[1]$corStruct[1])^2      
nugget <- 1/(1+exp(-m2$model[1]$corStruct[2])) 


X <- rep(1,length(Sources16))
y <- Sources16

sigma2.eta <- (m2$sigma^2)*(1-nugget)
R <- exp(-as.matrix(dist(Depth16,diag=TRUE,upper=TRUE))^2/phi)                                                                                      
sigma2.e <- (m2$sigma^2)*nugget  
I <- diag(1,length(y))
V <- sigma2.eta*R + sigma2.e*I                       
beta <- solve(t(X)%*%solve(V)%*%X)%*%t(X)%*%solve(V)%*%y 

eta <- sigma2.eta*R%*%solve(V)%*%(y-X%*%beta)

par(mar=c(5.1,4.1,2.1,2.1),oma=c(0,0,0,0))  # reset plot margins
par(mfrow=c(1,2))  # plot layout
par(pch=19)  # plotting symbol

# plot predictions for second-order spatial model
plot(Depth16, Sources16, 
 xlab="Depth (m)",ylab="Density of bioluminescence",ylim=c(-15,60),las=1,col=rgb(0,0,0,0.25))
lines(Depth16,X%*%beta + eta,col="black")
lines(Depth16,X%*%beta,col="purple")

e.hat.m2 <- y - (X%*%beta + eta)
plot(Depth16, e.hat.m2, 
 xlab="Depth (m)",ylab=expression(hat(epsilon)),las=1,col=rgb(0,0,0,0.25))

Homework assignment!
- Go back to the R code for the bioluminescence example and allow the expected value of bioluminescence density to depend on depth (i.e., use a linear model with a slope parameter instead of just an intercept)

11.5 Extreme precipitation in Kansas

Live demonstration using what we have learned so far (Download R code here)