13 Day 13 (June 21)

13.1 Announcements

  • Live demonstration of assignment 3?

13.2 Understanding confidence intervals using synthetic data

  • What should we expect from a confidence interval?

    • What is the statistic?
    • Which quantities are random and which are fixed?
    • How often should a 95% CI covers the true value of \(\beta\)?
  • How do we know if a CI performs as advertised?

    • Derive the CI theoretically
    • Evaluate the interval estimator using Monte Carlo simulation
  • Monte Carlo simulation to evaluate a CI

    library(latex2exp)
    library(plotrix)
    
    beta.truth <- c(3,1) # True value of beta
    sigma2.truth <- 1000     # True value of sigma2
    
    
    n <- 50 # Sample size for each simulated data set
    x <- seq(1,100,length.out = n) # Covariate 
    X <- model.matrix (~x) # Design matrix 
    
    q.sims <- 10^2  # How many data sets should I simulate?
    sim.results <- matrix(,nrow=q.sims,ncol=3)
    colnames(sim.results) <- c("beta_1","lower.limit","upper.limit")
    
    set.seed(9409)
    for(i in 1:q.sims){
      y <- X%*%beta.truth + rnorm(n,0,sqrt(sigma2.truth)) # Generate data
      m1 <- lm(y~x)
      sim.results[i,1:3] <- c(coef(m1)[2],confint(m1)[2,])
    }
    
    # Mean of beta1
    mean(sim.results[,1])
    ## [1] 1.013225
    # Plot confidence interval
    library(plotrix)
    plotCI(c(1:q.sims),sim.results[,1],li=sim.results[,2],ui=sim.results[,3],
    sfrac=0.005,pch=20,xlab="Simulation data set",ylab=expression(beta[1]))
    abline(a=beta.truth[2],b=0,col="gold",lwd=3)

    # Calculate the coverage probability
    covered <- ifelse(sim.results[,2] < beta.truth[2] & sim.results[,3] > beta.truth[2],1,0)
    mean(covered)
    ## [1] 0.94