5 Day 5 (February 4)

5.1 Announcements

  • Please read (and re-read) Ch. 3 and 4 in BBM2L book.

  • Selected questions/clarifications from journals

    • How to choose/select a distribution
      • Definition of a model
      • Combining data and models/assumptions gives us predictions/forecasts/inference.
    • Sample size questions
      • n = Inf
      • n = 0
      • Power analysis
      • Anxiety/statistical therapy
      • Adaptive designs
  • Good reading from The American Statistician link

5.2 Building our first statistical model

  • The backstory
  • Building a statistical model using a likelihood-based (classical) approach
    • Specify (write out) the likelihood
    • Select an approach to estimate unknown parameters (e.g., maximum likelihood)
    • Quantify uncertainty in unknown parameters (e.g., using normal approximation, see here)
  • Building a statistical model using a Bayesian approach
    • Specify (write out) the likelihood/data model
    • Specify the parameter model (or prior) including hyper-parameters
    • Select an approach to obtain the posterior distribution
      • Analytically (i.e., pencil and paper)
      • Simulation-based (e.g., MCMC methods such as Metropolis-Hastings, importance sampling, ABC)
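
The steps in both lists can be sketched on a toy problem. The example below is only an illustration under stated assumptions: the data (`y` successes out of `n` trials), the binomial likelihood, and the Beta(1, 1) prior are hypothetical choices and are not taken from the lecture.

```r
# Hypothetical data (not from the notes): y successes out of n trials
y <- 7
n <- 20

# --- Likelihood-based (classical) approach ---
# 1. Specify the likelihood: y ~ Binomial(n, p); use the negative log-likelihood
nll <- function(p) -dbinom(y, size = n, prob = p, log = TRUE)

# 2. Estimate the unknown parameter p by maximum likelihood
p_hat <- optimize(nll, interval = c(0.001, 0.999), tol = 1e-10)$minimum

# 3. Quantify uncertainty using the normal approximation (Wald interval)
se_hat <- sqrt(p_hat * (1 - p_hat) / n)
wald_ci <- p_hat + c(-1, 1) * qnorm(0.975) * se_hat

# --- Bayesian approach ---
# 1. Data model: same binomial likelihood as above
# 2. Parameter model (prior): p ~ Beta(a, b), with assumed hyper-parameters a = b = 1
a <- 1
b <- 1

# 3. Posterior obtained analytically: p | y ~ Beta(a + y, b + n - y)
post_mean <- (a + y) / (a + b + n)
cred_int <- qbeta(c(0.025, 0.975), a + y, b + n - y)

list(mle = p_hat, wald_ci = wald_ci, post_mean = post_mean, cred_int = cred_int)
```

The binomial/beta pair is used here only because the posterior is available with pencil and paper; for most models the last step would need a simulation-based method instead.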

5.3 Numerical Integration

  • Why do we need integrals to do Bayesian statistics?

    • Example using Bayes' theorem to estimate the prevalence rate of rabies
    • Why it is important to keep track of what we are calculating (i.e., clarity in what is being estimated)
  • Numerical approximation vs. analytical solutions

  • Definition of a definite integral: $\int_a^b f(z)\,dz = \lim_{Q \to \infty} \sum_{q=1}^{Q} \Delta q\, f(z_q)$ where $\Delta q = \frac{b-a}{Q}$ and $z_q = a + \frac{2q-1}{2}\Delta q$.

  • Riemann approximation (midpoint rule): $\int_a^b f(z)\,dz \approx \sum_{q=1}^{Q} \Delta q\, f(z_q)$ where $\Delta q = \frac{b-a}{Q}$ and $z_q = a + \frac{2q-1}{2}\Delta q$.

  • Using a similar approach in R (adaptive quadrature)

    fn <- function(y){dnorm(y,0,1)}
    integrate(f=fn,lower=-4,upper=4,subdivisions=10)
    ## 0.9999367 with absolute error < 4.8e-12
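
The midpoint rule defined above can also be coded directly. This is a minimal sketch: the function name `midpoint_rule` and the choices Q = 10 and Q = 1000 are assumptions made for illustration, and the result is compared against `integrate()` for the same standard normal density.

```r
# Midpoint (Riemann) rule: sum of Delta_q * f(z_q) over Q equal-width subintervals
# (midpoint_rule is a hypothetical helper name, not part of base R)
midpoint_rule <- function(f, a, b, Q) {
  delta <- (b - a) / Q                       # width of each subinterval, (b - a) / Q
  z <- a + (2 * seq_len(Q) - 1) / 2 * delta  # midpoints z_q = a + (2q - 1)/2 * delta
  sum(delta * f(z))
}

fn <- function(y) dnorm(y, 0, 1)

approx_10 <- midpoint_rule(fn, a = -4, b = 4, Q = 10)
approx_1000 <- midpoint_rule(fn, a = -4, b = 4, Q = 1000)
reference <- integrate(fn, lower = -4, upper = 4)$value

# Compare the two midpoint approximations with the adaptive quadrature result
c(Q10 = approx_10, Q1000 = approx_1000, integrate = reference)
```

Increasing Q shrinks the width of each subinterval, so the sum moves toward the limit in the definition of the definite integral.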

5.4 Monte Carlo Integration

  • Deterministic vs stochastic methods to approximate integrals

    • Stochastic (Monte Carlo) methods work well for high-dimensional multiple integrals
    • Stochastic methods are easy to program
  • Monte Carlo integration

    • $E(g(y)) = \int g(y)[y|\theta]\,dy \approx \frac{1}{Q}\sum_{q=1}^{Q} g(y_q)$
    • Examples:
    1. $E(y) = \int y \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2\sigma^2}(y-\mu)^2}\,dy$
    y <- rnorm(n = 10^6, mean = 2, sd = 3)
    mean(y)
    ## [1] 1.99951
    2. $E((y-\mu)^2) = \int (y-\mu)^2 \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2\sigma^2}(y-\mu)^2}\,dy$
    y <- rnorm(n = 10^6, mean = 2, sd = 3)
    mean((y - 2)^2)
    ## [1] 8.99337
    3. $E\left(\frac{1}{y}\right) = \int \frac{1}{y} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2\sigma^2}(y-\mu)^2}\,dy$
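    # Caution: E(1/y) does not exist for a normal y (the density is positive at 0),
    # so this Monte Carlo average does not settle down and changes from run to run
    y <- rnorm(n = 10^6, mean = 2, sd = 4)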
    mean(1/y)
    ## [1] 0.7251141
  • Questions about activity 2?

  • Live example using bat and coin data/model
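
Returning to the Monte Carlo integration formula in this section, the sketch below approximates the same integral that `integrate()` evaluated in 5.3, which allows a direct comparison of a stochastic and a deterministic method. The indicator function `g`, the seed, and the number of draws `Q` are assumptions chosen for illustration.

```r
set.seed(1)  # arbitrary seed, used only so the run is reproducible
Q <- 10^6
y <- rnorm(n = Q, mean = 0, sd = 1)

# g(y) = 1 if -4 < y < 4 and 0 otherwise, so E(g(y)) equals the
# integral of the standard normal density over (-4, 4)
g <- as.numeric(y > -4 & y < 4)
mc_est <- mean(g)

# Monte Carlo standard error of the estimate: sd(g) / sqrt(Q)
mc_se <- sd(g) / sqrt(Q)

c(estimate = mc_est, std_error = mc_se)
```

The estimate should land close to the adaptive quadrature value of 0.9999367, and the standard error gives a rough sense of how much it would vary under a different seed.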