36 Day 36 (July 26)
36.2 The Bayesian Linear Model
The model
- Distribution of the data: “\(\mathbf{y}\) given \(\boldsymbol{\beta}\) and \(\sigma_{\varepsilon}^{2}\)” \[\mathbf{y}|\boldsymbol{\beta},\sigma_{\varepsilon}^{2}\sim\text{N}(\mathbf{X}\boldsymbol{\beta},\sigma_{\varepsilon}^{2}\mathbf{I})\] This is also sometimes called the data model.
- Distribution of the parameters: \(\boldsymbol{\beta}\) and \(\sigma_{\varepsilon}^{2}\) \[\boldsymbol{\beta}\sim\text{N}(\mathbf{0},\sigma_{\beta}^{2}\mathbf{I})\] \[\sigma_{\varepsilon}^{2}\sim\text{inverse gamma}(q,r)\]
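To make the generative model concrete, here is a minimal simulation sketch in R; the sample size, coefficients, and variances below are hypothetical choices for illustration only, not values used in the example later in this section.

set.seed(2021)                          # For reproducibility (hypothetical example)
n <- 100                                # Number of observations
X <- cbind(1, rnorm(n))                 # Design matrix: intercept plus one covariate
beta <- c(2, 0.5)                       # Regression coefficients (hypothetical)
sigma2.e <- 0.25                        # Error variance (hypothetical)
y <- X %*% beta + rnorm(n, mean = 0, sd = sqrt(sigma2.e)) # y|beta,sigma2 ~ N(X beta, sigma2 I)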
Computations and statistical inference
- Using Bayes' rule (Bayes 1763), we can obtain the joint posterior distribution
\[[\boldsymbol{\beta},\sigma_{\varepsilon}^{2}|\mathbf{y}]=\frac{[\mathbf{y}|\boldsymbol{\beta},\sigma_{\varepsilon}^{2}][\boldsymbol{\beta}][\sigma_{\varepsilon}^{2}]}{\int\int [\mathbf{y}|\boldsymbol{\beta},\sigma_{\varepsilon}^{2}][\boldsymbol{\beta}][\sigma_{\varepsilon}^{2}]d\boldsymbol{\beta}d\sigma_{\varepsilon}^{2}}\]
- Statistical inference about a parameter is obtained from its marginal posterior distribution \[[\boldsymbol{\beta}|\mathbf{y}]=\int[\boldsymbol{\beta},\sigma_{\varepsilon}^{2}|\mathbf{y}]d\sigma_{\varepsilon}^{2}\] \[[\sigma_{\varepsilon}^{2}|\mathbf{y}]=\int[\boldsymbol{\beta},\sigma_{\varepsilon}^{2}|\mathbf{y}]d\boldsymbol{\beta}\]
- In practice it is difficult to find closed-form solutions for the marginal posterior distributions, but it is easy to find closed-form solutions for the conditional posterior distributions.
- Full-conditional distributions: \(\boldsymbol{\beta}\) and \(\sigma_{\varepsilon}^{2}\). Each follows from Bayes' rule with the other parameter held fixed, e.g., \([\boldsymbol{\beta}|\sigma_{\varepsilon}^{2},\mathbf{y}]\propto[\mathbf{y}|\boldsymbol{\beta},\sigma_{\varepsilon}^{2}][\boldsymbol{\beta}]\). \[\boldsymbol{\beta}|\sigma_{\varepsilon}^{2},\mathbf{y}\sim\text{N}\left(\left(\mathbf{X}^{\prime}\mathbf{X}+\frac{1}{\sigma_{\beta}^{2}}\mathbf{I}\right)^{-1}\mathbf{X}^{\prime}\mathbf{y},\;\sigma_{\varepsilon}^{2}\left(\mathbf{X}^{\prime}\mathbf{X}+\frac{1}{\sigma_{\beta}^{2}}\mathbf{I}\right)^{-1}\right)\] \[\sigma_{\varepsilon}^{2}|\boldsymbol{\beta},\mathbf{y}\sim\text{inverse gamma}\left(q+\frac{n}{2},\;\left(\frac{1}{r}+\frac{1}{2}(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})^{\prime}(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})\right)^{-1}\right)\]
- Gibbs sampler for the Bayesian linear model
- Write out on the whiteboard
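- A summary sketch of the steps (in case the whiteboard notes are unavailable):
  - Pick a starting value \(\sigma_{\varepsilon}^{2(0)}\) and set the hyperparameters \(\sigma_{\beta}^{2}\), \(q\), and \(r\).
  - For \(m=1,\ldots,M\): draw \(\boldsymbol{\beta}^{(m)}\) from \([\boldsymbol{\beta}|\sigma_{\varepsilon}^{2(m-1)},\mathbf{y}]\), then draw \(\sigma_{\varepsilon}^{2(m)}\) from \([\sigma_{\varepsilon}^{2}|\boldsymbol{\beta}^{(m)},\mathbf{y}]\), using the full-conditional distributions above.
  - The draws \((\boldsymbol{\beta}^{(m)},\sigma_{\varepsilon}^{2(m)})\) form a Markov chain whose stationary distribution is the joint posterior, so (after discarding any burn-in) each column of saved draws approximates the corresponding marginal posterior.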
Example: Mauna Loa Observatory carbon dioxide data
<- "ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_annmean_mlo.txt" url <- read.table(url,header=FALSE) df colnames(df) <- c("year","meanCO2","unc") plot(x = df$year, y = df$meanCO2, xlab = "Year", ylab = expression("Mean annual "*CO[2]*" concentration"),col="black",pch=20)
- Gibbs sampler
# Preliminary steps
library(mvnfast) # Required package to sample from multivariate normal distribution
## 
## Attaching package: 'mvnfast'
## The following objects are masked from 'package:mgcv':
## 
##     dmvn, rmvn
y <- df$meanCO2                        # Response variable
X <- model.matrix(~year, data = df)    # Design matrix
n <- length(y)                         # Number of observations
p <- 2                                 # Dimensions of beta
m.draws <- 5000                        # Number of MCMC samples to draw
samples <- as.data.frame(matrix(, m.draws, 3)) # Samples of the parameters we will save
colnames(samples) <- c(colnames(X), "sigma2")
sigma2.e <- 1                          # Starting value for sigma2
sigma2.beta <- 10^9                    # Prior variance for beta
q <- 2.01                              # Inverse gamma prior with E() = 1/(r*(q-1)) and Var() = 1/(r^2*(q-1)^2*(q-2))
r <- 1

# MCMC algorithm
for(m in 1:m.draws){

  # Sample beta from full-conditional distribution
  H <- solve(t(X) %*% X + diag(1/sigma2.beta, p))
  h <- t(X) %*% y
  beta <- t(rmvn(1, H %*% h, sigma2.e * H))

  # Sample sigma2 from full-conditional distribution
  q.temp <- q + n/2
  r.temp <- (1/r + t(y - X %*% beta) %*% (y - X %*% beta)/2)^-1
  sigma2.e <- 1/rgamma(1, q.temp, , r.temp)

  # Save samples of beta and sigma2
  samples[m, ] <- c(beta, sigma2.e)
}

# Look at histograms of posterior distributions
par(mfrow = c(3, 1), mar = c(5, 6, 1, 1))
hist(samples[, 1], xlab = expression(beta[0]*"|"*bold(y)), ylab = expression("["*beta[0]*"|"*bold(y)*"]"), freq = FALSE, col = "grey", main = "", breaks = 30)
hist(samples[, 2], xlab = expression(beta[1]*"|"*bold(y)), ylab = expression("["*beta[1]*"|"*bold(y)*"]"), freq = FALSE, col = "grey", main = "", breaks = 30)
hist(samples[, 3], xlab = expression(sigma^2*"|"*bold(y)), ylab = expression("["*sigma^2*"|"*bold(y)*"]"), freq = FALSE, col = "grey", main = "", breaks = 30)
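Beyond histograms, the saved draws can be summarized directly; for example, an equal-tailed 95% credible interval for each parameter is given by the 2.5% and 97.5% sample quantiles of the corresponding column. A minimal sketch using the samples object above:

# Equal-tailed 95% credible intervals from the saved MCMC samples
apply(samples, 2, quantile, probs = c(0.025, 0.975))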
# Mean of the posterior distribution
colMeans(samples[, 1:2])
##  (Intercept)         year 
## -2882.979210     1.628374
# Compare to results obtained from lm function
m1 <- lm(y ~ X - 1)
coef(m1)
## X(Intercept)        Xyear 
## -2881.953419     1.627857
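Because the prior on \(\boldsymbol{\beta}\) is very diffuse (\(\sigma_{\beta}^{2}=10^{9}\)), the posterior means should be nearly identical to the least-squares estimates; the small discrepancies above are attributable to Monte Carlo error from the finite number of draws.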
- Using the R package MCMCpack
library(MCMCpack)
samples <- MCMCregress(y ~ X - 1, burnin = 0, mcmc = 5000,
                       b0 = 0, B0 = 1/sigma2.beta, c0 = q, d0 = 1/r)
samples <- samples[1:5000, ] # Extract samples from the posterior for beta and sigma2 from MCMCregress output

# Look at histograms of posterior distributions
par(mfrow = c(3, 1), mar = c(5, 6, 1, 1))
hist(samples[, 1], xlab = expression(beta[0]*"|"*bold(y)), ylab = expression("["*beta[0]*"|"*bold(y)*"]"), freq = FALSE, col = "grey", main = "", breaks = 30)
hist(samples[, 2], xlab = expression(beta[1]*"|"*bold(y)), ylab = expression("["*beta[1]*"|"*bold(y)*"]"), freq = FALSE, col = "grey", main = "", breaks = 30)
hist(samples[, 3], xlab = expression(sigma^2*"|"*bold(y)), ylab = expression("["*sigma^2*"|"*bold(y)*"]"), freq = FALSE, col = "grey", main = "", breaks = 30)
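MCMCregress() returns its draws as an mcmc object from the coda package (loaded along with MCMCpack), so standard MCMC diagnostics are available. A quick convergence check, not in the original notes, might look like this; as.mcmc() re-coerces the object in case the subsetting above dropped the mcmc class:

# Trace and density plots for a quick convergence check
plot(as.mcmc(samples))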
# Mean of the posterior distribution
colMeans(samples[, 1:2])
## X(Intercept)        Xyear 
## -2881.026502     1.627389
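Note that the posterior means from MCMCregress() agree with the hand-coded Gibbs sampler and with lm() to within a few decimal places, a useful sanity check on our implementation.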