# Chapter 26 A Generalized Linear Model for Bernoulli Response Data

Example: for each $$i = 1, \ldots, n$$, $$y_i \sim \text{Bernoulli}(\pi_i)$$ with $$\pi_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}$$, and $$y_1, \ldots, y_n$$ are independent. This model is called a logistic regression model.
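This model can be sketched by simulating from it. The sketch below uses hypothetical coefficients `beta0`, `beta1` and a single covariate (these names and values are assumptions for illustration, not from the text): each $$\pi_i$$ is computed from the linear predictor, and each $$y_i$$ is an independent Bernoulli draw.

```python
import math
import random

def expit(eta):
    """Inverse logit: pi = exp(eta) / (1 + exp(eta))."""
    return math.exp(eta) / (1.0 + math.exp(eta))

random.seed(0)

# Hypothetical coefficients beta = (intercept, slope) and one covariate.
beta0, beta1 = -1.0, 2.0
x = [random.uniform(-2, 2) for _ in range(5)]

# pi_i = exp(x_i' beta) / (1 + exp(x_i' beta))
pi = [expit(beta0 + beta1 * xi) for xi in x]

# Independent Bernoulli(pi_i) responses.
y = [1 if random.random() < p else 0 for p in pi]

for xi, p, yi in zip(x, pi, y):
    print(f"x = {xi:+.2f}  pi = {p:.3f}  y = {yi}")
```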

The function $$g(\pi) = \log(\frac{\pi}{1-\pi})$$ is called the logit function, and $$\log(\frac{\pi}{1-\pi})$$ is called the log odds.

Note that $$g(\pi_i) = x_i'\beta$$. In GLM terminology, the logit is called the link function. Note that in a GLM the mean of $$y_i$$ itself need not be a linear function of $$\beta$$; it is the link function applied to the mean that is linear in $$\beta$$. Here are some other link functions used for Bernoulli response data:

• probit: $$\Phi^{-1}(\pi) = x'\beta$$
• complementary log-log (cloglog in R): $$\log(-\log(1-\pi)) = x'\beta$$
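To compare these links, the sketch below evaluates each *inverse* link, which maps a linear predictor $$\eta = x'\beta$$ back to a probability. The function names (`inv_logit`, etc.) are illustrative choices, not from the text; the probit inverse uses the standard normal CDF via `math.erf`.

```python
import math

def inv_logit(eta):
    # logit link: pi = exp(eta) / (1 + exp(eta))
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    # probit link: pi = Phi(eta), the standard normal CDF
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def inv_cloglog(eta):
    # complementary log-log link: log(-log(1 - pi)) = eta  =>  pi = 1 - exp(-exp(eta))
    return 1.0 - math.exp(-math.exp(eta))

for eta in (-2.0, 0.0, 2.0):
    print(f"eta = {eta:+.1f}: logit {inv_logit(eta):.3f}  "
          f"probit {inv_probit(eta):.3f}  cloglog {inv_cloglog(eta):.3f}")
```

Note that the logit and probit inverse links are symmetric about $$\eta = 0$$ (where $$\pi = 0.5$$), while the complementary log-log link is asymmetric.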

For GLMs, Fisher's scoring method is typically used to obtain the MLE of $$\beta$$, denoted $$\hat\beta$$. Fisher's scoring method is a variation of the Newton-Raphson algorithm in which the Hessian matrix (the matrix of second partial derivatives of the log-likelihood) is replaced by its expected value, the negative of the Fisher information matrix.
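A minimal sketch of Fisher scoring for a simple logistic regression (intercept plus one covariate) follows. For the logit link the expected and observed Hessians coincide, so this is also Newton-Raphson. The update is $$\beta^{(t+1)} = \beta^{(t)} + I(\beta^{(t)})^{-1} s(\beta^{(t)})$$ with score $$s = X'(y - \pi)$$ and information $$I = X'WX$$, $$W = \mathrm{diag}(\pi_i(1-\pi_i))$$. The toy data and function names are assumptions for illustration.

```python
import math

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def fisher_scoring(x, y, n_iter=25):
    """Fit pi_i = expit(b0 + b1*x_i) by Fisher scoring (= Newton-Raphson
    for the canonical logit link)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        pi = [expit(b0 + b1 * xi) for xi in x]
        w = [p * (1.0 - p) for p in pi]
        # Score vector X'(y - pi) for design matrix X = [1, x].
        s0 = sum(yi - p for yi, p in zip(y, pi))
        s1 = sum(xi * (yi - p) for xi, yi, p in zip(x, y, pi))
        # Fisher information X'WX (2x2), W = diag(pi_i(1 - pi_i)).
        i00 = sum(w)
        i01 = sum(wi * xi for wi, xi in zip(w, x))
        i11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = i00 * i11 - i01 * i01
        # Solve I * step = score and update beta.
        b0 += (i11 * s0 - i01 * s1) / det
        b1 += (i00 * s1 - i01 * s0) / det
    return b0, b1

# Toy data: success probability increases with x (not perfectly separated,
# so the MLE is finite).
x = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
y = [0, 0, 1, 0, 1, 1, 1]
b0, b1 = fisher_scoring(x, y)
print(f"beta_hat = ({b0:.3f}, {b1:.3f})")
```

At convergence the score is zero, which is how one can check that $$\hat\beta$$ is in fact a stationary point of the log-likelihood.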

For sufficiently large samples, $$\hat\beta$$ is approximately normal with mean $$\beta$$ and a variance-covariance matrix that can be estimated by the inverse of the Fisher information matrix evaluated at $$\hat\beta$$, i.e. $$\hat\beta \sim N(\beta, I^{-1}(\beta))$$ approximately.

The odds ratio: if $$\tilde{\pi}$$ denotes the success probability when the $$j$$th explanatory variable is increased by one unit (with all other explanatory variables held constant), then $$\frac{\tilde{\pi}}{1-\tilde{\pi}} \Big/ \frac{\pi}{1-\pi} = \exp(\beta_j)$$. This can be interpreted as follows: a one-unit increase in the $$j$$th explanatory variable is associated with a multiplicative change in the odds of success by the factor $$\exp(\beta_j)$$.
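The identity above can be verified numerically. In the sketch below, `beta_j` is a hypothetical coefficient and `rest` stands for the contribution of the intercept and all other (held-constant) terms to the linear predictor; both are made-up values for illustration.

```python
import math

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def odds(pi):
    return pi / (1.0 - pi)

# Hypothetical coefficient for the j-th explanatory variable, and the
# rest of the linear predictor held constant.
beta_j = 0.7
rest, xj = -0.3, 1.0

pi = expit(rest + beta_j * xj)                # P(success) at x_j
pi_tilde = expit(rest + beta_j * (xj + 1.0))  # P(success) at x_j + 1

ratio = odds(pi_tilde) / odds(pi)
print(f"odds ratio = {ratio:.4f}, exp(beta_j) = {math.exp(beta_j):.4f}")
```

The ratio equals $$\exp(\beta_j)$$ no matter what value `rest` takes, which is exactly the "all other variables held constant" part of the interpretation.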

If $$(L_j, U_j)$$ is a $$100(1-\alpha)\%$$ confidence interval for $$\beta_j$$, then $$(\exp(L_j), \exp(U_j))$$ is a $$100(1-\alpha)\%$$ confidence interval for $$\exp(\beta_j)$$. Similarly, if $$(L, U)$$ is a $$100(1-\alpha)\%$$ confidence interval for the linear predictor $$x'\beta$$, then a $$100(1-\alpha)\%$$ CI for $$\pi$$ is $$\left(\frac{1}{1+\exp(-L)},\frac{1}{1+\exp(-U)}\right)$$.
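Both transformations can be sketched as follows. The point estimate, standard error, and linear-predictor interval below are made-up values for illustration; the endpoints are simply pushed through $$\exp(\cdot)$$ and the inverse logit, which is valid because both maps are monotone increasing.

```python
import math

# Hypothetical 95% Wald interval for beta_j: beta_j_hat +/- 1.96 * SE.
beta_j_hat, se = 0.80, 0.25
L_j = beta_j_hat - 1.96 * se
U_j = beta_j_hat + 1.96 * se

# CI for the odds ratio exp(beta_j): exponentiate the endpoints.
or_ci = (math.exp(L_j), math.exp(U_j))

# CI for pi at a given x: transform a CI (L, U) for the linear
# predictor x'beta through the inverse logit. Hypothetical values:
L, U = -0.4, 1.1
pi_ci = (1.0 / (1.0 + math.exp(-L)), 1.0 / (1.0 + math.exp(-U)))

print(f"95% CI for exp(beta_j): ({or_ci[0]:.3f}, {or_ci[1]:.3f})")
print(f"95% CI for pi:          ({pi_ci[0]:.3f}, {pi_ci[1]:.3f})")
```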