Chapter 28 A Generalized Linear Model for Poisson Response Data

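For all $i = 1, \ldots, n$, $y_i \sim \text{Poisson}(\lambda_i)$ independently, with $\log(\lambda_i) = x_i^\top \beta$. The Poisson log likelihood is $$\ell(\beta \mid y) = \sum_{i=1}^n \left[ y_i x_i^\top \beta - \exp(x_i^\top \beta) - \log(y_i!) \right].$$ The log likelihood $\ell(\beta \mid y)$ can be maximized using Fisher's scoring method to obtain the MLE $\hat\beta$.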
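Fisher scoring for this model can be sketched in a few lines of R. The sketch below is illustrative only: the data are simulated (the sample size, coefficients, and starting value are assumptions, not from the text), and the update uses the standard log-link expressions for the score, $X^\top (y - \lambda)$, and the expected information, $X^\top \mathrm{diag}(\lambda) X$. The result is compared with `glm()`, which fits the same model by iteratively reweighted least squares.

```r
# Simulated data (assumed for illustration; not from the text)
set.seed(1)
n = 200
x = runif(n)
y = rpois(n, lambda = exp(0.5 + 1.2 * x))
X = cbind(1, x)                          # design matrix with an intercept column

# Fisher scoring: beta_new = beta + I(beta)^{-1} U(beta)
beta = c(log(mean(y)), 0)                # starting value (an arbitrary but safe choice)
for (iter in 1:25) {
  lambda = as.vector(exp(X %*% beta))    # current fitted means
  score  = crossprod(X, y - lambda)      # U(beta) = X'(y - lambda)
  info   = crossprod(X, lambda * X)      # I(beta) = X' diag(lambda) X
  step   = as.vector(solve(info, score))
  beta   = beta + step
  if (max(abs(step)) < 1e-10) break      # stop once the update is negligible
}

# Compare with the built-in fit
fit = glm(y ~ x, family = poisson(link = "log"))
cbind(fisher = beta, glm = coef(fit))
```

The two columns should agree to several decimal places; small differences reflect the different convergence tolerances.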

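Let $\lambda = \exp(x^\top \beta)$ and $\tilde\lambda = \exp(\tilde{x}^\top \beta)$, where $\tilde{x} = [x_1, \ldots, x_{j-1}, x_j + 1, x_{j+1}, \ldots, x_p]^\top$. Then $\tilde\lambda / \lambda = \exp(\beta_j)$. This means that, with all other explanatory variables held constant, the mean response at $x_j + 1$ is $\exp(\beta_j)$ times the mean response at $x_j$.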
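This multiplicative interpretation can be checked numerically. The sketch below uses simulated data (the seed, sample size, and true coefficients are assumptions for illustration) and compares the ratio of fitted means at two covariate values one unit apart with the exponentiated slope estimate.

```r
# Simulated data (assumed for illustration; not from the text)
set.seed(2)
x = rnorm(100)
y = rpois(100, lambda = exp(0.3 + 0.7 * x))
o = glm(y ~ x, family = poisson(link = "log"))

# Fitted mean response at x = 1 and at x = 2 (one unit apart)
mu = predict(o, newdata = data.frame(x = c(1, 2)), type = "response")
ratio = unname(mu[2] / mu[1])

# The ratio of fitted means equals exp(slope estimate)
exp_beta = unname(exp(coef(o)["x"]))
c(ratio = ratio, exp_beta = exp_beta)
```

The same ratio is obtained for any pair of values one unit apart, because the log link makes covariate effects multiplicative on the mean scale.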

# fit the Poisson GLM with log link (y and x are assumed to exist in the workspace)
o = glm(y ~ x, family = poisson(link = "log"))
summary(o)

# likelihood ratio test 
anova(o, test = "Chisq")
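To make explicit what the likelihood ratio test computes, the sketch below reproduces it by hand for a single covariate. The data are simulated (an assumption for illustration), and the intercept-only model `o0` is a name introduced here. The test statistic is the drop in deviance between the reduced and full models, referred to a chi-square distribution with degrees of freedom equal to the number of added coefficients.

```r
# Simulated data (assumed for illustration; not from the text)
set.seed(5)
x = runif(120)
y = rpois(120, lambda = exp(0.5 + x))

o0 = glm(y ~ 1, family = poisson(link = "log"))   # reduced model (intercept only)
o  = glm(y ~ x, family = poisson(link = "log"))   # full model

# Likelihood ratio statistic = drop in deviance = 2 * (logLik(o) - logLik(o0))
lrt = deviance(o0) - deviance(o)
df  = df.residual(o0) - df.residual(o)
p   = pchisq(lrt, df = df, lower.tail = FALSE)

# Compare with the sequential analysis of deviance table
tab = anova(o, test = "Chisq")
c(lrt = lrt, df = df, p = p)
tab
```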

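Lack of Fit: Under the saturated model, the fitted means are $\hat\lambda_i = y_i$. Writing $\hat\lambda_i = \exp(x_i^\top \hat\beta)$ for the fitted means of the Poisson GLM, the likelihood ratio statistic for testing the Poisson GLM as the reduced model vs. the saturated model as the full model is $$2 \sum_{i=1}^n \left[ y_i \log\left( \frac{y_i}{\hat\lambda_i} \right) - (y_i - \hat\lambda_i) \right],$$ which is the Deviance Statistic for the Poisson case.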
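The deviance statistic can be computed directly from its definition and compared with R's `deviance()`. The sketch below uses simulated data (an assumption for illustration) and the usual convention that a term with a zero count contributes $0 \cdot \log 0 = 0$. The final line is an informal lack-of-fit check that refers the deviance to a chi-square distribution with $n - p$ degrees of freedom; this approximation is generally considered reasonable only when the fitted means are not too small, so treat the p-value as a rough guide.

```r
# Simulated data (assumed for illustration; not from the text)
set.seed(3)
x = runif(150, 0, 2)
y = rpois(150, lambda = exp(1 + 0.8 * x))
o = glm(y ~ x, family = poisson(link = "log"))
lam = fitted(o)                                   # fitted means lambda-hat

# y * log(y / lambda-hat), with the convention 0 * log(0) = 0
ylogy = ifelse(y == 0, 0, y * log(y / lam))
dev_hand = 2 * sum(ylogy - (y - lam))
c(by_hand = dev_hand, deviance = deviance(o))

# Informal lack-of-fit check against a chi-square with n - p degrees of freedom
pchisq(deviance(o), df = df.residual(o), lower.tail = FALSE)
```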

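The deviance residuals are given by $$d_i = \operatorname{sign}(y_i - \hat\lambda_i) \sqrt{2 \left[ y_i \log\left( \frac{y_i}{\hat\lambda_i} \right) - (y_i - \hat\lambda_i) \right]}.$$ The Pearson Chi-square statistic is $$X^2 = \sum_{i=1}^n \left( \frac{y_i - \hat{E}(y_i)}{\sqrt{\widehat{\mathrm{Var}}(y_i)}} \right)^2 = \sum_{i=1}^n \left( \frac{y_i - \hat\lambda_i}{\sqrt{\hat\lambda_i}} \right)^2.$$ The Pearson residual is $r_i = (y_i - \hat\lambda_i) / \sqrt{\hat\lambda_i}$.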

d = resid(o, type = "deviance")
r = resid(o, type = "pearson")
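As a check on the definitions, both kinds of residuals can be computed by hand and compared with the output of `resid()`. The data below are simulated (an assumption for illustration), and the names `d_hand` and `r_hand` are introduced here only for the comparison.

```r
# Simulated data (assumed for illustration; not from the text)
set.seed(6)
x = runif(100, 0, 2)
y = rpois(100, lambda = exp(0.2 + 0.9 * x))
o = glm(y ~ x, family = poisson(link = "log"))
lam = fitted(o)

# Deviance residuals: sign(y - lam) * sqrt(2 * [y log(y / lam) - (y - lam)])
ylogy  = ifelse(y == 0, 0, y * log(y / lam))      # convention: 0 * log(0) = 0
d2     = 2 * (ylogy - (y - lam))
d_hand = sign(y - lam) * sqrt(pmax(d2, 0))        # pmax guards against rounding below 0

# Pearson residuals: (y - lam) / sqrt(lam)
r_hand = (y - lam) / sqrt(lam)

all.equal(unname(d_hand), unname(resid(o, type = "deviance")))
all.equal(unname(r_hand), unname(resid(o, type = "pearson")))

# Sums of squares give the deviance and the Pearson chi-square statistic
c(deviance = sum(d_hand^2), pearson_X2 = sum(r_hand^2))
```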

For the Poisson case, $\mathrm{Var}(y) = E(y) = \lambda$, so the variance is completely determined by the mean. If either the Deviance Statistic or the Pearson Chi-Square Statistic suggests a lack of fit that cannot be explained by other reasons (e.g., a poor model for the mean or a few extreme outliers), overdispersion, meaning more variability than the Poisson model allows, may be the problem.

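Quasi-likelihood: Suppose $\mathrm{Var}(y_i) = \phi \lambda_i$ for some unknown dispersion parameter $\phi > 1$. Then $\phi$ can be estimated by $\hat\phi = \sum_{i=1}^n d_i^2 / (n - p)$ or $\hat\phi = \sum_{i=1}^n r_i^2 / (n - p)$, where $d_i$ and $r_i$ are the deviance and Pearson residuals and $p$ is the number of regression coefficients.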

  • The estimated variance of $\hat\beta$ is multiplied by $\hat\phi$.
  • For Wald type inferences, the standard normal null distribution is replaced by a $t$ distribution with $n - p$ degrees of freedom.
  • Any test statistic $T$ that was assumed $\chi^2_q$ under $H_0$ is replaced with $T / (q \hat\phi)$ and compared to an $F$ distribution with $q$ and $n - p$ degrees of freedom.

# estimates of the dispersion parameter
deviance(o)/df.residual(o)

sum(r^2)/df.residual(o)
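One way to carry out these quasi-likelihood adjustments in R is to refit with `family = quasipoisson`, which uses the Pearson-based estimate of the dispersion. The sketch below is illustrative: the overdispersed counts are simulated from a negative binomial distribution (an assumption, chosen only to produce extra-Poisson variation), and `oq` is a name introduced here. It checks that the quasi-Poisson standard errors equal the Poisson standard errors multiplied by the square root of the estimated dispersion.

```r
# Simulated overdispersed counts (assumed for illustration; not from the text)
set.seed(4)
n = 200
x = runif(n)
y = rnbinom(n, size = 2, mu = exp(1 + x))

o  = glm(y ~ x, family = poisson(link = "log"))
oq = glm(y ~ x, family = quasipoisson(link = "log"))

# Pearson-based estimate of the dispersion parameter
phi_hat = sum(resid(o, type = "pearson")^2) / df.residual(o)

# Standard errors: Poisson, Poisson scaled by sqrt(phi_hat), and quasi-Poisson
se_pois  = sqrt(diag(vcov(o)))
se_quasi = sqrt(diag(vcov(oq)))
cbind(se_pois, se_scaled = sqrt(phi_hat) * se_pois, se_quasi)

summary(oq)             # Wald tests now use t with n - p degrees of freedom
anova(oq, test = "F")   # F test in place of the chi-square test
```

The coefficient estimates are identical in the two fits; only the standard errors and the reference distributions change.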