6 Day 6 (February 8)

6.1 Announcements

  • Journal reflection
    • Example projects
    • Misconceptions about spatio-temporal data
    • Misconceptions about spatio-temporal models

6.2 Mathematical model review

  • Mathematical models are deterministic equations that describe the relationship between input variables and an output variable
  • Common types of mathematical models used for spatio-temporal statistics
    • Linear equations
      • Scalar form: \(\mu=\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\ldots+\beta_{p}x_{p}\)
      • Vector form: \(\boldsymbol{\mu}=\beta_{0}+\beta_{1}\mathbf{x}_{1}+\beta_{2}\mathbf{x}_{2}+\ldots+\beta_{p}\mathbf{x}_{p}\)
      • Matrix form: \(\boldsymbol{\mu}=\mathbf{X}\boldsymbol{\beta}\)
    • Non-linear equations
      • Scalar form: \(\mu = e^{\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\ldots+\beta_{p}x_{p}}\)
    • Difference equations
      • Scalar form: \(\mu_{t+1} = \phi\mu_{t}\)
    • Differential equations
      • Scalar form: \(\frac{d\mu(t)}{dt}=\gamma\mu(t)\)
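
The four model types above can be evaluated directly in R. The following is a minimal sketch; the inputs, coefficients, and starting value are hypothetical numbers chosen only for illustration, and `gamma_rate` is set to `log(phi)` so that the difference and differential equations trace the same curve at integer times.

```r
# Hypothetical design matrix (first column is the intercept) and coefficients
X <- cbind(1, c(0.5, 1.0, 1.5), c(2, 1, 4))
beta <- c(1, 2, -0.5)

# Linear model, matrix form: mu = X beta
mu_matrix <- as.vector(X %*% beta)

# Same model, vector form: mu = beta0 + beta1 * x1 + beta2 * x2
mu_vector <- beta[1] + beta[2] * X[, 2] + beta[3] * X[, 3]

# Non-linear model: mu = exp(beta0 + beta1 * x1 + beta2 * x2)
mu_exp <- exp(mu_matrix)

# Difference equation: mu_{t+1} = phi * mu_t, iterated from a starting value
phi <- 1.04
n_steps <- 10
mu_diff <- numeric(n_steps)
mu_diff[1] <- 20
for (i in 1:(n_steps - 1)) {
  mu_diff[i + 1] <- phi * mu_diff[i]
}

# Differential equation: d mu(t) / dt = gamma * mu(t)
# has the closed-form solution mu(t) = mu(0) * exp(gamma * t)
gamma_rate <- log(phi)
mu_ode <- 20 * exp(gamma_rate * (0:(n_steps - 1)))
```

With `gamma_rate = log(phi)`, `mu_diff` and `mu_ode` agree at every integer time step, which shows how the discrete-time and continuous-time growth models are related.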

6.3 Summary and comments

  • Probability distributions and mathematical models are the building blocks for most (parametric) statistical models
  • Agent-based simulation models are also widely used but are rarely combined with statistical approaches (Epstein and Axtell 1996; Heard et al. 2015)

6.3.1 Motivating data example

  • Whooping cranes

  • Data set

    # Whooping crane winter population counts (Butler et al., Table 1)
    url <- "https://www.dropbox.com/scl/fi/kxzc8fmomkigcrxyjjmdp/Butler-et-al.-Table-1.csv?rlkey=61ey3q1jc4rsor257uy4080j3&dl=1"
    df1 <- read.csv(url)
    
    # Plot the observed counts (z) against year
    plot(df1$Winter, df1$N, xlab = "Year", ylab = "Population count (z)",
         xlim = c(1940, 2020), ylim = c(0, 300), type = "b", cex = 0.8,
         pch = 20, col = rgb(0.7, 0.7, 0.7, 0.9))

  • We want to build a statistical model that enables

    • Predictions and forecasts of the true population size
    • Statistical inference on the date when the population will be larger than 1000 individuals
  • Points to consider

    • Whooping cranes are counted from an airplane (could some individuals be missed?)
    • Aggregation of a spatio-temporal point pattern?
    • Are there any existing models that could work for these data?
    • Anything else?
  • Live example (R code available for download)
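
As a purely deterministic starting point for the second goal (the date when the population exceeds 1000 individuals), one could solve an exponential growth model for the crossing time. The sketch below assumes the model \(\mu(t)=\mu(0)e^{\gamma t}\); the values of `y0`, `year0`, and `gamma_rate` are hypothetical and are not estimates from the whooping crane data. A statistical model is still needed to attach uncertainty to this date.

```r
# Hypothetical values, not estimates from the whooping crane data
y0 <- 20            # population size in the starting year
year0 <- 1940       # starting year
gamma_rate <- 0.04  # per-year growth rate

# Solve y0 * exp(gamma_rate * (year - year0)) = 1000 for year
year_cross <- year0 + log(1000 / y0) / gamma_rate

# Check: the model evaluated at the crossing year equals the threshold
y_at_cross <- y0 * exp(gamma_rate * (year_cross - year0))
```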

6.4 Hierarchical models

  • During this course we will implement many models using the hierarchical framework
    • Hierarchical models are pretty common (e.g., mixed models, kriging, most Bayesian models)
    • Today is a crash course on hierarchical and Bayesian statistical models
    • Study technical note 1.1 on pg. 13 of Spatio-temporal statistics with R
  • The Bayesian hierarchical modeling framework

\[\text{Data model:} \;\;[\mathbf{z}|\mathbf{y},\boldsymbol{\theta}_{D}]\]

\[\text{Process model:} \;\;[\mathbf{y}|\boldsymbol{\theta}_{P}]\]

\[\text{Parameter model:} \;\;[\boldsymbol{\theta}]\]
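
One way to build intuition for the three levels is to simulate from them, from the parameter model down to the data. The sketch below is only an illustration loosely inspired by the whooping crane example: the distributions (normal and beta priors, Poisson process model, binomial data model) and all numeric values are assumptions, not the model used in the live example.

```r
set.seed(1)
years <- 1940:2020
n <- length(years)

# Parameter model [theta]: draw a growth rate and a detection probability
# (both priors are hypothetical)
gamma_rate <- rnorm(1, mean = 0.04, sd = 0.005)  # process parameter (theta_P)
p_detect <- rbeta(1, 18, 2)                      # data parameter (theta_D)

# Process model [y | theta_P]: true population size varies around
# an exponential growth curve with a hypothetical starting size of 20
lambda <- 20 * exp(gamma_rate * (years - years[1]))
y <- rpois(n, lambda)

# Data model [z | y, theta_D]: each individual is counted with
# probability p_detect, so the count can never exceed the true size
z <- rbinom(n, size = y, prob = p_detect)
```

Plotting `y` and `z` against `years` shows how an imperfect survey produces counts that sit at or below the true population size.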

  • Given a Bayesian hierarchical model we want the following:
    • The posterior distribution of the parameters \([\boldsymbol{\theta}|\mathbf{z}]\)
    • The posterior predictive distribution \([\mathbf{z}_{\text{pred}}|\mathbf{z}]\)
  • Using Bayes’ theorem…

\[[\boldsymbol{\theta}|\mathbf{z}]=\int\frac{[\mathbf{z}|\mathbf{y},\boldsymbol{\theta}][\mathbf{y}|\boldsymbol{\theta}][\boldsymbol{\theta}]}{\int\int[\mathbf{z}|\mathbf{y},\boldsymbol{\theta}][\mathbf{y}|\boldsymbol{\theta}][\boldsymbol{\theta}]\,d\mathbf{y}\,d\boldsymbol{\theta}}\,d\mathbf{y}\]

\[[\mathbf{z}_{\text{pred}}|\mathbf{z}]=\int\int[\mathbf{z}_{\text{pred}}|\mathbf{y},\boldsymbol{\theta}][\mathbf{y}|\boldsymbol{\theta}][\boldsymbol{\theta}|\mathbf{z}]\,d\mathbf{y}\,d\boldsymbol{\theta}\]
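
The integrals in the posterior predictive distribution rarely need to be solved by hand; they can be approximated by drawing from each distribution in turn (composition sampling). The sketch below uses a toy normal-normal hierarchical model with known variances, chosen only because its posterior is available in closed form for comparison. The data `z`, the variances, and the prior are all hypothetical.

```r
set.seed(2)

# Toy model (all values hypothetical):
#   data model:      z_i | y_i   ~ N(y_i, s2_D)
#   process model:   y_i | theta ~ N(theta, s2_P)
#   parameter model: theta       ~ N(m0, v0)
s2_D <- 1
s2_P <- 4
m0 <- 0
v0 <- 100
z <- c(9.1, 11.4, 10.2, 12.0, 8.7)
n <- length(z)

# Integrating out y gives z_i | theta ~ N(theta, s2_D + s2_P),
# so the posterior [theta | z] is normal with variance v_n and mean m_n
s2 <- s2_D + s2_P
v_n <- 1 / (1 / v0 + n / s2)
m_n <- v_n * (m0 / v0 + sum(z) / s2)

# Composition sampling for [z_pred | z]:
# draw theta from [theta | z], then y from [y | theta], then z_pred from [z_pred | y]
K <- 100000
theta_draw <- rnorm(K, mean = m_n, sd = sqrt(v_n))
y_draw <- rnorm(K, mean = theta_draw, sd = sqrt(s2_P))
z_pred <- rnorm(K, mean = y_draw, sd = sqrt(s2_D))

# Closed-form posterior predictive mean and variance for comparison
pred_mean <- m_n
pred_var <- v_n + s2_P + s2_D
```

Here `mean(z_pred)` and `var(z_pred)` should be close to `pred_mean` and `pred_var`. For models without a closed-form posterior, the draws of `theta_draw` would instead come from an algorithm such as MCMC, but the remaining two sampling steps are unchanged.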