6 Day 6 (February 8)
6.1 Announcements
- Journal reflection
- Example projects
- Misconceptions about spatio-temporal data
- Misconceptions about spatio-temporal models
6.2 Mathematical model review
- Mathematical models are deterministic equations that describe the relationship between input variables and an output variable
- Common types of mathematical models used for spatio-temporal statistics
- Linear equations
- Scalar form: \(\mu=\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\ldots+\beta_{p}x_{p}\)
- Vector form: \(\boldsymbol{\mu}=\beta_{0}+\beta_{1}\mathbf{x}_{1}+\beta_{2}\mathbf{x}_{2}+\ldots+\beta_{p}\mathbf{x}_{p}\)
- Matrix form: \(\boldsymbol{\mu}=\mathbf{X}\boldsymbol{\beta}\)
- Non-linear equations
- Scalar form: \(\mu = e^{\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\ldots+\beta_{p}x_{p}}\)
- Difference equations
- Scalar form: \(\mu_{t+1} = \phi\mu_{t}\)
- Differential equations
- Scalar form: \(\frac{d\mu(t)}{dt}=\gamma\mu(t)\)
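The model types listed above can be evaluated numerically in a few lines of R. The following is a minimal sketch, not part of the original notes: the covariates are simulated, and the values of `beta`, `phi`, `gamma_rate`, and `mu0` are illustrative assumptions rather than estimates from any data set.

```r
# Illustrative values only; none of these come from the course data
set.seed(1)
n <- 5
X <- cbind(1, rnorm(n), rnorm(n))   # design matrix: intercept column plus two covariates
beta <- c(0.5, 1.0, -2.0)           # hypothetical coefficients (beta_0, beta_1, beta_2)

# Linear equation, matrix form: mu = X beta
mu_matrix <- as.vector(X %*% beta)

# Same linear model in vector form: mu = beta_0 + beta_1 x_1 + beta_2 x_2
mu_vector <- beta[1] + beta[2] * X[, 2] + beta[3] * X[, 3]

# Non-linear equation: mu = exp(beta_0 + beta_1 x_1 + beta_2 x_2)
mu_exp <- exp(mu_matrix)

# Difference equation: mu_{t+1} = phi * mu_t, iterated for n_steps time steps
phi <- 1.04
mu0 <- 20
n_steps <- 10
mu_diff <- numeric(n_steps + 1)
mu_diff[1] <- mu0
for (t in 1:n_steps) {
  mu_diff[t + 1] <- phi * mu_diff[t]
}

# Differential equation: d mu(t) / dt = gamma * mu(t)
# has the analytical solution mu(t) = mu(0) * exp(gamma * t)
gamma_rate <- log(phi)              # chosen so both models agree at integer times
mu_ode <- mu0 * exp(gamma_rate * (0:n_steps))
```

With `gamma_rate = log(phi)`, the difference equation and the differential equation give the same values at integer times, which is one way to see how the discrete-time and continuous-time growth models are related.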
6.3 Summary and comments
- Probability distributions and mathematical models are the building blocks for most (parametric) statistical models
- Agent-based simulation models are also widely used, but they are rarely fit to data using statistical approaches (Epstein and Axtell 1996; Heard et al. 2015)
6.3.1 Motivating data example
Data set
url <- "https://www.dropbox.com/scl/fi/kxzc8fmomkigcrxyjjmdp/Butler-et-al.-Table-1.csv?rlkey=61ey3q1jc4rsor257uy4080j3&dl=1"
df1 <- read.csv(url)
plot(df1$Winter, df1$N,
     xlab = "Year", ylab = "Population count (z)",
     xlim = c(1940, 2020), ylim = c(0, 300),
     type = "b", cex = 0.8, pch = 20,
     col = rgb(0.7, 0.7, 0.7, 0.9))
We want to build a statistical model that enables
- Predictions and forecasts of the true population size
- Statistical inference on the date when the population will be larger than 1000 individuals
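To see how a mathematical model turns into a statement about a date, consider the exponential growth solution \(\mu(t)=\mu(t_{0})e^{\gamma(t-t_{0})}\). Setting \(\mu(t)\) equal to a threshold and solving for \(t\) gives \(t=t_{0}+\log(\text{threshold}/\mu(t_{0}))/\gamma\). The sketch below only illustrates that calculation. The values of `t0`, `mu_t0`, and `gamma_rate` are hypothetical placeholders, not estimates from the whooping crane data; a statistical model would replace them with a distribution for the parameters, and therefore a distribution for the date.

```r
# Hypothetical placeholder values; NOT estimated from the whooping crane data
t0 <- 2000          # reference year (assumed)
mu_t0 <- 100        # expected population size in year t0 (assumed)
gamma_rate <- 0.04  # per-year growth rate (assumed)
threshold <- 1000   # population size of interest

# Solve mu_t0 * exp(gamma_rate * (t - t0)) = threshold for t
t_cross <- t0 + log(threshold / mu_t0) / gamma_rate

# Check the solution by plugging it back into the growth curve
mu_at_cross <- mu_t0 * exp(gamma_rate * (t_cross - t0))
```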
Points to consider
- Whooping cranes are counted from an airplane (could some individuals be missed?)
- Aggregation of a spatio-temporal point pattern?
- Are there any existing models that could work for these data?
- Anything else?
Live example (R code available for download)
6.4 Hierarchical models
- During this course we will implement many models using the hierarchical framework
- Hierarchical models are pretty common (e.g., mixed models, kriging, most Bayesian models)
- Today is a crash course on hierarchical and Bayesian statistical models
- Study technical note 1.1 on pg. 13 of Spatio-temporal statistics with R
- The Bayesian hierarchical modeling framework
\[\text{Data model:} \;\;[\mathbf{z}|\mathbf{y},\boldsymbol{\theta}_{D}]\]
\[\text{Process model:} \;\;[\mathbf{y}|\boldsymbol{\theta}_{P}]\]
\[\text{Parameter model:} \;\;[\boldsymbol{\theta}]\]
- Given a Bayesian hierarchical model we want the following:
- The posterior distribution of the parameters \([\boldsymbol{\theta}|\mathbf{z}]\)
- The posterior predictive distribution \([\mathbf{z}_{\text{pred}}|\mathbf{z}]\)
- Using Bayes’ theorem…
\[[\boldsymbol{\theta}|\mathbf{z}]=\int\frac{[\mathbf{z}|\mathbf{y},\boldsymbol{\theta}][\mathbf{y}|\boldsymbol{\theta}][\boldsymbol{\theta}]}{\int\int[\mathbf{z}|\mathbf{y},\boldsymbol{\theta}][\mathbf{y}|\boldsymbol{\theta}][\boldsymbol{\theta}]\,d\mathbf{y}\,d\boldsymbol{\theta}}\,d\mathbf{y}\]
\[[\mathbf{z}_{\text{pred}}|\mathbf{z}]=\int\int[\mathbf{z}_{\text{pred}}|\mathbf{y},\boldsymbol{\theta}][\mathbf{y}|\boldsymbol{\theta}][\boldsymbol{\theta}|\mathbf{z}]\,d\mathbf{y}\,d\boldsymbol{\theta}\]
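For a small enough model, the two integrals above can be approximated directly on a grid, which may help make the notation concrete. The sketch below assumes a toy hierarchical model that is not from the notes: a binomial data model with a known detection probability `p`, a Poisson process model, and a gamma parameter model. The observed count `z`, the value of `p`, and the prior settings are all made up for illustration.

```r
# Assumed toy hierarchical model (for illustration only):
#   Data model:      z | y      ~ Binomial(y, p), detection probability p treated as known
#   Process model:   y | lambda ~ Poisson(lambda)
#   Parameter model: lambda     ~ Gamma(shape = 2, rate = 0.02)
z <- 45                                  # made-up observed count
p <- 0.9                                 # made-up detection probability
lambda_grid <- seq(0.5, 200, by = 0.5)   # grid of parameter values
y_grid <- z:400                          # true size y cannot be smaller than the count z

# Inner integral over y: sum_y [z | y][y | lambda], for every lambda on the grid
lik <- sapply(lambda_grid, function(lam) {
  sum(dbinom(z, size = y_grid, prob = p) * dpois(y_grid, lam))
})

# Numerator of Bayes' theorem, then normalize on the grid (the denominator)
prior <- dgamma(lambda_grid, shape = 2, rate = 0.02)
post <- lik * prior
post <- post / sum(post)                 # posterior probabilities over lambda_grid

# Posterior predictive by Monte Carlo:
# draw lambda from [lambda | z], then y from [y | lambda], then z_pred from [z_pred | y]
set.seed(764)
n_draws <- 5000
lambda_draw <- sample(lambda_grid, n_draws, replace = TRUE, prob = post)
y_draw <- rpois(n_draws, lambda_draw)
z_pred <- rbinom(n_draws, size = y_draw, prob = p)
```

In this toy model the sum over `y` has a closed form (a binomially thinned Poisson count is Poisson with mean `lambda * p`), so `lik` can be checked against `dpois(z, lambda_grid * p)`. Grid approximation becomes impractical as the number of parameters and latent variables grows, which is why sampling-based methods are typically used for realistic hierarchical models.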