9.3 Poisson model
We can use same ideas as in Section 9.2 to perform inference in longitudinal/panel datasets where the dependent variable takes non-negative integers. Let’s assume that yit∼P(λit) where log(λit)=y∗it such that y_{it}^*\sim N(\boldsymbol{x}_{it}^{\top}\boldsymbol{\beta}+\boldsymbol{w}_{it}^{\top}\boldsymbol{b}_i,\sigma^2). We can augment the model with the latent variable y_{it}^{*}, and again use a Metropolis-within-Gibbs algorithm to perform inference in this model.
The posterior distribution of this model is \begin{align*} \pi(\boldsymbol{\beta},\sigma^2, \boldsymbol{b}_i, \boldsymbol{D}, \boldsymbol{y}^*\mid \boldsymbol{y}, \boldsymbol{X}, \boldsymbol{W})&\propto \prod_{i=1}^N \prod_{t=1}^{T_i}\left\{\lambda_{it}^{y_{it}}\exp\left\{-\lambda_{it}\right\}\right.\\ &\left.\times (\sigma^2)^{-1}\exp\left\{-\frac{1}{2\sigma^2}(y_{it}^*-\boldsymbol{x}_{it}^{\top}\boldsymbol{\beta}-\boldsymbol{w}_{it}^{\top}\boldsymbol{b}_i)^{\top}(y_{it}^*-\boldsymbol{x}_{it}^{\top}\boldsymbol{\beta}-\boldsymbol{w}_{it}^{\top}\boldsymbol{b}_i)\right\}\right\}\\ &\times \exp\left\{-\frac{1}{2}(\boldsymbol{\beta}-\boldsymbol{\beta}_0)^{\top}\boldsymbol{B}_0^{-1}(\boldsymbol{\beta}-\boldsymbol{\beta}_0)\right\}\\ &\times \exp\left\{-\frac{1}{2}\sum_{i=1}^N \boldsymbol{b}_i^{\top}\boldsymbol{D}^{-1}\boldsymbol{b}_i\right\}\\ &\times (\sigma^2)^{-\alpha_0-1}\exp\left\{-\frac{\delta_0}{\sigma^2}\right\}\\ &\times |\boldsymbol{D}|^{-(d_0+K_2+1)/2}\exp\left\{-\frac{1}{2}tr(d_0\boldsymbol{D}_0\boldsymbol{D}^{-1})\right\}. \end{align*}
We can get samples of y_{it}^* from a normal distribution with mean equal to \boldsymbol{x}_{it}^{\top}\boldsymbol{\beta}+\boldsymbol{w}_{it}^{\top}\boldsymbol{b}_i and variance \sigma^2, and use these samples to get \lambda_{it}=\exp(y_{it}^*), y_{it}^{*c}=y_{it}^{*}+\epsilon_{it}, where \epsilon_{it}\sim\mathcal{N}(0,v^2), v is a tuning parameter to get good acceptance rates, and \lambda_{it}^c=\exp(y_{it}^{*c}). The acceptance rate of the Metropolis-Hastings algorithm is \begin{align*} \alpha=\min\left(1,\frac{ \lambda_{it}^{cy_{it}}\exp(-\lambda_{it}^c)\times\exp\left\{-\frac{1}{2\sigma^2}(y_{it}^{c*}-\boldsymbol{x}_{it}^{\top}\boldsymbol{\beta}-\boldsymbol{w}_{it}^{\top}\boldsymbol{b}_i)^{\top}(y_{it}^{c*}-\boldsymbol{x}_{it}^{\top}\boldsymbol{\beta}-\boldsymbol{w}_{it}^{\top}\boldsymbol{b}_i)\right\}}{\lambda_{it}^{y_{it}}\exp(-\lambda_{it})\times\exp\left\{-\frac{1}{2\sigma^2}(y_{it}^{*}-\boldsymbol{x}_{it}^{\top}\boldsymbol{\beta}-\boldsymbol{w}_{it}^{\top}\boldsymbol{b}_i)^{\top}(y_{it}^{*}-\boldsymbol{x}_{it}^{\top}\boldsymbol{\beta}-\boldsymbol{w}_{it}^{\top}\boldsymbol{b}_i)\right\}}\right). \end{align*}
In addition, we should use the posterior conditional distributions from Section 9.1 to complete the algorithm getting samples of \boldsymbol{\beta}, \sigma^2, \boldsymbol{b}_i and \boldsymbol{D} replacing y_{it} by {y}_{it}^*.
We should take into account for doing predictions that \mathbb{E}[{\lambda}_{it}]=\exp\left\{\boldsymbol{x}_{it}^{\top}\boldsymbol{\beta}+\boldsymbol{w}_{it}^{\top}\boldsymbol{b}_i+0.5\sigma^2\right\} (Diggle et al. 2002).
Example: Simulation exercise
Let’s perform a simulation exercise to assess the performance of the hierarchical longitudinal Poisson model. The point of departure is to assume that y_{it}^* = \beta_0 + \beta_1 x_{it1} + \beta_2 x_{it2} + \beta_3 x_{it3} + b_i + w_{it1} b_{i1}, where x_{itk} \sim N(0,1) for k = 1, 2, 3, w_{it1} \sim N(0,1), b_i \sim N(0, 0.7^{1/2}), b_{i1} \sim N(0, 0.6^{1/2}), and \boldsymbol{\beta} = [0.5 \ 0.4 \ 0.6 \ -0.6]^{\top}, with i = 1, 2, \dots, 50. Additionally, y_{it} \sim P(\lambda_{it}), where \lambda_{it} = \exp(y_{it}^*). The sample size is 1000 in an unbalanced panel structure.
We set the priors as \boldsymbol{\beta}_0 = \boldsymbol{0}_4, \boldsymbol{B}_0 = \boldsymbol{I}_4, \alpha_0 = \delta_0 = 0.001, d_0 = 2, and \boldsymbol{D}_0 = \boldsymbol{I}_2. The number of MCMC iterations, burn-in, and thinning parameters are 15,000, 5,000, and 10, respectively.
The following code shows how to perform inference in the hierarchical longitudinal Poisson model programming the Metropolis-within-Gibbs sampler.
rm(list = ls()); set.seed(010101)
NT <- 1000; N <- 50
id <- c(1:N, sample(1:N, NT - N,replace=TRUE))
x1 <- rnorm(NT); x2 <- rnorm(NT); x3 <- rnorm(NT)
X <- cbind(1, x1, x2, x3)
K1 <- dim(X)[2]; w1 <- rnorm(NT)
W <- cbind(1, w1); K2 <- dim(W)[2]
B <- c(0.5, 0.4, 0.6, -0.6)
D <- c(0.7, 0.6); sig2 <- 0.1
b1 <- rnorm(N, 0, sd = D[1]^0.5)
b2 <- rnorm(N, 0, sd = D[2]^0.5)
b <- cbind(b1, b2)
yl <- NULL
for(i in 1:NT){
ylmeani <- X[i,]%*%B + W[i,]%*%b[id[i],]
yli <- rnorm(1, ylmeani, sig2^0.5)
yl <- c(yl, yli)
}
lambdait <- exp(yl); y <- rpois(NT, lambdait)
Data <- as.data.frame(cbind(y, x1, x2, x3, w1, id))
mcmc <- 15000; burnin <- 5000; thin <- 10; tot <- mcmc + burnin
b0 <- rep(0, K1); B0 <- diag(K1); B0i <- solve(B0)
r0 <- K2; R0 <- diag(K2); a0 <- 0.001; d0 <- 0.001
LatentMHV1 <- function(tuning, Beta, bs, sig2){
ylhat <- rep(0, NT)
accept <- NULL
for(i in 1:NT){
ids <- which(id == i)
yi <- y[i]
ylhatmeani <- X[i,]%*%Beta + W[i,]%*%bs[id[i],]
ylhati <- rnorm(1, ylhatmeani, sd = sig2^0.5)
lambdahati <- exp(ylhati)
ei <- rnorm(1, 0, sd = tuning)
ylpropi <- ylhati + ei
lambdapropi <- exp(ylpropi)
logPosthati <- sum(dpois(yi, lambdahati, log = TRUE) + dnorm(ylhati, ylhatmeani, sig2^0.5, log = TRUE))
logPostpropi <- sum(dpois(yi, lambdapropi, log = TRUE) + dnorm(ylpropi, ylhatmeani, sig2^0.5, log = TRUE))
alphai <- min(1, exp(logPostpropi - logPosthati))
ui <- runif(1)
if(ui <= alphai){
ylhati <- ylpropi; accepti <- 1
}else{
ylhati <- ylhati; accepti <- 0
}
ylhat[i] <- ylhati
accept <- c(accept, accepti)
}
res <- list(ylhat = ylhat, accept = mean(accept))
return(res)
}
PostBeta <- function(D, ylhat, sig2){
XVX <- matrix(0, K1, K1); XVy <- matrix(0, K1, 1)
for(i in 1:N){
ids <- which(id == i); Ti <- length(ids)
Wi <- W[ids, ]
Vi <- diag(Ti)*sig2 + Wi%*%D%*%t(Wi)
ViInv <- solve(Vi); Xi <- X[ids, ]
XVXi <- t(Xi)%*%ViInv%*%Xi
XVX <- XVX + XVXi
yi <- ylhat[ids]
XVyi <- t(Xi)%*%ViInv%*%yi
XVy <- XVy + XVyi
}
Bn <- solve(B0i + XVX); bn <- Bn%*%(B0i%*%b0 + XVy)
Beta <- MASS::mvrnorm(1, bn, Bn)
return(Beta)
}
Postb <- function(Beta, D, ylhat, sig2){
Di <- solve(D); bis <- matrix(0, N, K2)
for(i in 1:N){
ids <- which(id == i)
Wi <- W[ids, ]; Xi <- X[ids, ]
yi <- ylhat[ids]
Wtei <- sig2^(-1)*t(Wi)%*%(yi - Xi%*%Beta)
Bni <- solve(sig2^(-1)*t(Wi)%*%Wi + Di)
bni <- Bni%*%Wtei
bi <- MASS::mvrnorm(1, bni, Bni)
bis[i, ] <- bi
}
return(bis)
}
PostD <- function(bs){
rn <- r0 + N; btb <- matrix(0, K2, K2)
for(i in 1:N){
bsi <- bs[i, ]; btbi <- bsi%*%t(bsi)
btb <- btb + btbi
}
Rn <- d0*R0 + btb
Sigma <- MCMCpack::riwish(v = rn, S = Rn)
return(Sigma)
}
PostSig2 <- function(Beta, bs, ylhat){
an <- a0 + 0.5*NT; ete <- 0
for(i in 1:N){
ids <- which(id == i)
Xi <- X[ids, ]
yi <- ylhat[ids]
Wi <- W[ids, ]
ei <- yi - Xi%*%Beta - Wi%*%bs[i, ]
etei <- t(ei)%*%ei
ete <- ete + etei
}
dn <- d0 + 0.5*ete
sig2 <- MCMCpack::rinvgamma(1, shape = an, scale = dn)
return(sig2)
}
PostBetas <- matrix(0, tot, K1); PostDs <- matrix(0, tot, K2*(K2+1)/2)
Postbs <- array(0, c(N, K2, tot)); PostSig2s <- rep(0, tot)
Accepts <- rep(NULL, tot)
RegPois <- glm(y ~ X - 1, family = poisson(link = "log"))
SumPois <- summary(RegPois)
Beta <- SumPois[["coefficients"]][,1]
sig2 <- sum(SumPois[["deviance.resid"]]^2)/SumPois[["df.residual"]]
D <- diag(K2); bs1 <- rnorm(N, 0, sd = D[1,1]^0.5)
bs2 <- rnorm(N, 0, sd = D[2,2]^0.5); bs <- cbind(bs1, bs2)
tuning <- 0.1; ropt <- 0.44
tunepariter <- seq(round(tot/10, 0), tot, round(tot/10, 0)); l <- 1
pb <- winProgressBar(title = "progress bar", min = 0, max = tot, width = 300)
for(s in 1:tot){
LatY <- LatentMHV1(tuning = tuning, Beta = Beta, bs = bs, sig2 = sig2)
ylhat <- LatY[["ylhat"]]
bs <- Postb(Beta = Beta, D = D, ylhat=ylhat, sig2 = sig2)
D <- PostD(bs = bs)
Beta <- PostBeta(D = D, ylhat = ylhat, sig2 = sig2)
sig2 <- PostSig2(Beta = Beta, bs = bs, ylhat = ylhat)
PostBetas[s,] <- Beta
PostDs[s,] <- matrixcalc::vech(D)
Postbs[, , s] <- bs; PostSig2s[s] <- sig2
AcceptRate <- LatY[["accept"]]
Accepts[s] <- AcceptRate
if(AcceptRate > ropt){
tuning = tuning*(2-(1-AcceptRate)/(1-ropt))
}else{
tuning = tuning/(2-AcceptRate/ropt)
}
if(s == tunepariter[l]){
print(AcceptRate); l <- l + 1
}
setWinProgressBar(pb, s, title=paste( round(s/tot*100, 0),"% done"))
}
## [1] 0.434
## [1] 0.447
## [1] 0.448
## [1] 0.451
## [1] 0.439
## [1] 0.46
## [1] 0.44
## [1] 0.409
## [1] 0.477
## [1] 0.479
## NULL
keep <- seq((burnin+1), tot, thin)
Bs <- PostBetas[keep,]; Ds <- PostDs[keep,]
bs <- Postbs[, , keep]; sig2s <- PostSig2s[keep]
summary(coda::mcmc(Bs))
##
## Iterations = 1:1500
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1500
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## [1,] 0.5169 0.20869 0.005388 0.004732
## [2,] 0.3610 0.06215 0.001605 0.002049
## [3,] 0.5450 0.06383 0.001648 0.001999
## [4,] -0.5713 0.06544 0.001690 0.001903
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## var1 0.1038 0.3803 0.5199 0.6534 0.9259
## var2 0.2432 0.3166 0.3608 0.4003 0.4796
## var3 0.4213 0.5017 0.5453 0.5885 0.6682
## var4 -0.7038 -0.6149 -0.5729 -0.5269 -0.4459
##
## Iterations = 1:1500
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1500
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## [1,] 0.5965 0.16852 0.004351 0.005085
## [2,] -0.1323 0.09418 0.002432 0.003072
## [3,] 0.2885 0.09854 0.002544 0.003397
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## var1 0.3331 0.4788 0.5732 0.69316 0.99135
## var2 -0.3354 -0.1926 -0.1277 -0.06692 0.03674
## var3 0.1252 0.2182 0.2780 0.34731 0.51055
We can see that all 95% credible intervals encompass the population parameters of the fixed effects, the posterior medians are relatively near the population values. However, we do not get good posterior estimates of the covariance matrix of the random effects as the 95% credible intervals do not encompass the second element of the diagonal of this matrix. In addition, the posterior draws of this algorithm over-estimates the over-dispersion parameter.
We can perform inference for the hierarchical longitudinal Poisson model in our GUI using the following Algorithm. Our GUI is based on the MCMChpoisson command from the MCMCpack package.53
Algorithm: Hierarchical Longitudinal Poisson Model
Select Hierarchical Longitudinal Model on the top panel
Select Poisson model using the left radio button
Upload the dataset selecting first if there is a header in the file and the kind of separator in the csv file of the dataset (comma, semicolon, or tab). Then, use the Browse button under the Choose File legend
Select MCMC iterations, burn-in, and thinning parameters using the Range sliders
Write down the formula of the fixed effects equation in the Main Equation: Fixed Effects box. This formula must be written using the syntax of the formula command of R software. This equation includes the intercept by default, so do not include it in the equation
Write down the formula of the random effects equation in the Main Equation: Random Effects box without writing the dependent variable, that is, starting the equation with the tilde (
~
) symbol. This formula must be written using the syntax of the formula command of R software. This equation includes the intercept by default, so do not include it in the equation. If there are just random intercepts, do not write anything in this boxWrite down the name of the grouping variable, that is, the variable that indicates the cross-sectional units
Set the hyperparameters of the fixed effects: mean vector, covariance matrix, shape, and scale parameters. This step is not necessary as by default our GUI uses non-informative priors
Set the hyperparameters of the random effects: degrees of freedom and scale matrix of the inverse Wishart distribution. This step is not necessary as by default our GUI uses non-informative priors
Click the Go! button
Analyze results
Download posterior chains and diagnostic plots using the Download Posterior Chains and Download Posterior Graphs buttons
References
At the time of writing this book there was an issue with the function MCMChpoisson from MCMCpack. We contact the maintainer, but users may have issues running this algorithm or running this function directly in R.↩︎