# Chapter 28 A Generalized Linear Model for Poisson Response Data

For all $$i = 1, \ldots, n$$, $$y_i \sim \text{Poisson}(\lambda_i)$$ with $$\log(\lambda_i) = x_i'\beta$$. The Poisson log likelihood is $\ell(\beta\mid y) = \sum_{i=1}^n \left[y_i x_i'\beta - \exp(x_i'\beta) - \log(y_i!) \right].$ The log likelihood $$\ell(\beta\mid y)$$ can be maximized using Fisher's scoring method to obtain the MLE $$\hat\beta$$.
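Fisher scoring iterates $\beta \leftarrow \beta + \mathcal{I}(\beta)^{-1} s(\beta)$, where for the log link the score is $s(\beta) = X'(y - \lambda)$ and the expected information is $\mathcal{I}(\beta) = X'\,\text{diag}(\lambda)\,X$. A minimal sketch in Python/NumPy (the simulated `X`, `y`, and `beta_true` are ours, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data for illustration (not from the notes): intercept + 2 covariates
n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([0.5, 0.3, -0.2])
y = rng.poisson(np.exp(X @ beta_true))

# Fisher scoring: beta_new = beta + I(beta)^{-1} score(beta).
# For the Poisson log link, score = X'(y - lambda), info = X' diag(lambda) X.
beta = np.zeros(p)
for _ in range(25):
    lam = np.exp(X @ beta)
    score = X.T @ (y - lam)
    info = X.T @ (lam[:, None] * X)
    step = np.linalg.solve(info, score)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break

print(beta)  # the MLE; should be near beta_true for this sample size
```

At convergence the score is (numerically) zero, which is the defining property of the MLE.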

Let $$\lambda = \exp(x'\beta)$$ and $$\tilde{\lambda} = \exp(\tilde{x}'\beta)$$, where $$\tilde{x} = [x_1, \ldots, x_{j-1}, x_{j} + 1, x_{j+1}, \ldots, x_p ]'$$. Then $$\tilde{\lambda}/\lambda = \exp(\beta_j)$$: with all other explanatory variables held constant, the mean response at $$x_j + 1$$ is $$\exp(\beta_j)$$ times the mean response at $$x_j$$.
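This multiplicative interpretation is easy to check numerically. A small sketch in Python (the coefficients `beta`, covariate vector `x`, and index `j` are made up for illustration):

```python
import numpy as np

# Hypothetical coefficient vector and covariate values
beta = np.array([0.4, -0.7, 1.2])
x = np.array([1.0, 2.0, 0.5])

j = 1                      # bump the j-th covariate by one unit
x_tilde = x.copy()
x_tilde[j] += 1.0

lam = np.exp(x @ beta)           # mean response at x
lam_tilde = np.exp(x_tilde @ beta)  # mean response at x with x_j + 1

# The ratio equals exp(beta_j), no matter what the other covariates are.
print(lam_tilde / lam, np.exp(beta[j]))
```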

o = glm(y ~ x, family = poisson(link = "log"))
summary(o)

# likelihood ratio test
anova(o, test = "Chisq")

Lack of Fit: Under the saturated model, the MLE of $$\lambda_i$$ is $$y_i$$. The likelihood ratio statistic for testing the Poisson GLM as the reduced model vs. the saturated model as the full model is $2 \sum_{i = 1}^n \left[y_i\log\left(\frac{y_i}{\hat\lambda_i}\right) - (y_i - \hat\lambda_i) \right],$ which is the deviance statistic for the Poisson case.

The deviance residuals are given by $d_i \equiv \text{sign}(y_i - \hat\lambda_i)\sqrt{2\left[y_i\log\left(\frac{y_i}{\hat\lambda_i} \right) - (y_i - \hat\lambda_i)\right]}.$ Pearson's chi-square statistic is $$X^2 = \sum_{i=1}^{n}\left(\frac{y_{i}-\widehat{E}\left(y_{i}\right)}{\sqrt{\widehat{\operatorname{Var}}\left(y_{i}\right)}}\right)^{2} = \sum_{i=1}^n \left(\frac{y_i - \hat\lambda_i}{\sqrt{\hat\lambda_i}}\right)^2$$. The Pearson residual is $$r_i = (y_i - \hat\lambda_i)/\sqrt{\hat\lambda_i}$$.

d = resid(o, type = "deviance")
r = resid(o, type = "pearson")
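Both kinds of residuals can also be computed directly from the formulas. A sketch in Python for an intercept-only fit, where the fitted mean is simply $\bar y$ (the simulated counts are ours, for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.poisson(5.0, size=50).astype(float)
lam_hat = np.full(50, y.mean())  # fitted means for an intercept-only Poisson fit

# Deviance residuals: sign(y - lam) * sqrt(2[y log(y/lam) - (y - lam)]),
# with the convention y log(y/lam) = 0 when y = 0.
with np.errstate(divide="ignore", invalid="ignore"):
    term = np.where(y > 0, y * np.log(y / lam_hat), 0.0)
d = np.sign(y - lam_hat) * np.sqrt(2.0 * (term - (y - lam_hat)))

# Pearson residuals
r = (y - lam_hat) / np.sqrt(lam_hat)

deviance = np.sum(d**2)  # the deviance statistic
X2 = np.sum(r**2)        # Pearson's chi-square statistic
print(deviance, X2)
```

Note that the sum of squared deviance residuals reproduces the deviance statistic above, and the sum of squared Pearson residuals reproduces $X^2$.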

For the Poisson case, $$\text{Var}(y) = E(y) = \lambda$$, so the variance is completely determined by the mean. If either the deviance statistic or the Pearson chi-square statistic suggests a lack of fit that cannot be explained by other reasons (e.g., a poor model for the mean or a few extreme outliers), overdispersion may be the problem.

Quasi-likelihood: Suppose $$Var(y_i) = \phi \lambda_i$$ for some unknown dispersion parameter $$\phi > 1$$. $$\phi$$ can be estimated by $$\hat\phi = \sum_{i = 1}^n d_i^2/(n-p)$$ (deviance-based) or $$\hat\phi = \sum_{i=1}^n r_i^2/(n-p)$$ (Pearson-based).

• The estimated variance of $$\hat\beta$$ is multiplied by $$\hat\phi$$.
• For Wald type inferences, the standard normal null distribution is replaced by $$t$$ with $$n-p$$ degrees of freedom.
• Any test statistic $$T$$ that was assumed to be $$\chi_q^2$$ under $$H_0$$ is replaced with $$T/(q\hat\phi)$$ and compared to an $$F$$ distribution with $$q$$ and $$n-p$$ degrees of freedom.
# estimates of the dispersion parameter
deviance(o)/df.residual(o)

sum(r^2)/df.residual(o)
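Putting the pieces together, a sketch of quasi-likelihood inference in Python/NumPy on deliberately overdispersed counts (the negative binomial simulation and all variable names are ours, for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Overdispersed counts: negative binomial with mean lam_true, Var > mean.
n, p = 300, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
lam_true = np.exp(0.8 + 0.4 * X[:, 1])
y = rng.negative_binomial(2, 2 / (2 + lam_true))  # mean lam_true

# Fit the Poisson GLM by Fisher scoring (mean model is still correct).
beta = np.zeros(p)
for _ in range(50):
    lam = np.exp(X @ beta)
    beta = beta + np.linalg.solve(X.T @ (lam[:, None] * X), X.T @ (y - lam))

lam = np.exp(X @ beta)
r = (y - lam) / np.sqrt(lam)       # Pearson residuals
phi_hat = np.sum(r**2) / (n - p)   # Pearson-based dispersion estimate

# Estimated variance of beta-hat is inflated by phi_hat;
# Wald inference uses t with n - p degrees of freedom.
cov = phi_hat * np.linalg.inv(X.T @ (lam[:, None] * X))
se = np.sqrt(np.diag(cov))
t_stat = beta / se
p_val = 2 * stats.t.sf(np.abs(t_stat), df=n - p)
print(phi_hat, se, p_val)
```

Because the simulated variance exceeds the mean, `phi_hat` comes out above 1, and the quasi-likelihood standard errors are wider than the naive Poisson ones by the factor $\sqrt{\hat\phi}$.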