Chapter 26 A Generalized Linear Model for Bernoulli Response Data

Example: for each $i = 1, \dots, n$, $y_i \sim \text{Bernoulli}(\pi_i)$ with $\pi_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}$, and $y_1, \dots, y_n$ are independent. This model is called a logistic regression model.
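As a quick sketch, the mean function above can be computed directly. The values of $x_i$ and $\beta$ below are illustrative only, not from the text:

```python
import math

def inverse_logit(eta):
    """Map the linear predictor eta = x_i' beta to pi_i in (0, 1)."""
    return math.exp(eta) / (1.0 + math.exp(eta))

# Hypothetical design row and coefficient vector (illustrative only).
x_i = [1.0, 2.0]    # intercept and one covariate
beta = [-1.0, 0.5]  # beta_0, beta_1

eta = sum(x * b for x, b in zip(x_i, beta))  # x_i' beta = -1 + 0.5*2 = 0
pi_i = inverse_logit(eta)
print(pi_i)  # eta = 0 gives pi_i = 0.5
```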

The function $g(\pi) = \log\left(\frac{\pi}{1-\pi}\right)$ is called the logit function, and $\log\left(\frac{\pi}{1-\pi}\right)$ is called the log odds.

Note that $g(\pi_i) = x_i'\beta$. In GLM terminology, the logit is called the link function. For GLMs it is not necessary that the mean of $y_i$ itself be a linear function of $\beta$; rather, the link function applied to the mean is linear in $\beta$. Here are some other link functions for Bernoulli response regression:

  • probit: $\Phi^{-1}(\pi) = x'\beta$
  • complementary log-log (cloglog in R): $\log(-\log(1-\pi)) = x'\beta$
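The three link functions can be evaluated at the same success probability to see how they differ. This is a minimal sketch using only the Python standard library ($\Phi^{-1}$ via `statistics.NormalDist`); the value of $\pi$ is arbitrary:

```python
import math
from statistics import NormalDist

def logit(p):
    """Canonical link: log odds."""
    return math.log(p / (1.0 - p))

def probit(p):
    """Probit link: standard normal quantile Phi^{-1}(p)."""
    return NormalDist().inv_cdf(p)

def cloglog(p):
    """Complementary log-log link."""
    return math.log(-math.log(1.0 - p))

p = 0.75  # arbitrary success probability
print(logit(p), probit(p), cloglog(p))
```

Under each model, the printed value is what the linear predictor $x'\beta$ equals when the success probability is $\pi = 0.75$.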

For GLMs, Fisher's scoring method is typically used to obtain the MLE of $\beta$, denoted $\hat\beta$. Fisher's scoring method is a variation of the Newton-Raphson algorithm in which the Hessian matrix (the matrix of second partial derivatives of the log-likelihood) is replaced by its expected value (the negative of the Fisher information matrix).
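A minimal sketch of Fisher scoring for a two-parameter logistic regression, using only the standard library. For the logit link (the canonical link), the observed Hessian equals its expectation, so this update coincides with Newton-Raphson. The data set is made up for illustration:

```python
import math

def fisher_scoring_logistic(X, y, steps=25):
    """Fisher scoring for logistic regression with rows [1, x].
    Update: beta <- beta + I(beta)^{-1} U(beta), where
    U = X'(y - pi) and I = X'WX with W = diag(pi(1 - pi))."""
    beta = [0.0, 0.0]
    for _ in range(steps):
        U = [0.0, 0.0]
        I = [[0.0, 0.0], [0.0, 0.0]]
        for row, yi in zip(X, y):
            eta = row[0] * beta[0] + row[1] * beta[1]
            pi = 1.0 / (1.0 + math.exp(-eta))
            w = pi * (1.0 - pi)
            for j in range(2):
                U[j] += row[j] * (yi - pi)
                for k in range(2):
                    I[j][k] += w * row[j] * row[k]
        # Solve the 2x2 system I * delta = U by hand and step.
        det = I[0][0] * I[1][1] - I[0][1] * I[1][0]
        beta[0] += ( I[1][1] * U[0] - I[0][1] * U[1]) / det
        beta[1] += (-I[1][0] * U[0] + I[0][0] * U[1]) / det
    return beta

# Tiny hypothetical data set (illustrative only).
X = [[1, x] for x in [0, 1, 2, 3, 4, 5]]
y = [0, 0, 1, 0, 1, 1]
print(fisher_scoring_logistic(X, y))
```

At convergence the score vector $U(\hat\beta)$ is (numerically) zero, which is what characterizes the MLE.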

For sufficiently large samples, $\hat\beta$ is approximately normal with mean $\beta$ and a variance-covariance matrix that can be approximated by the inverse of the Fisher information matrix evaluated at $\hat\beta$, i.e. $\hat\beta \,\dot\sim\, N\!\left(\beta,\, I^{-1}(\hat\beta)\right)$.

The odds ratio: $\dfrac{\tilde\pi/(1-\tilde\pi)}{\pi/(1-\pi)} = \exp(\beta_j)$, where $\pi$ is the success probability at the original covariate values and $\tilde\pi$ is the success probability after the $j$th explanatory variable is increased by one unit. This can be interpreted as: a one-unit increase in the $j$th explanatory variable (with all other explanatory variables held constant) is associated with a multiplicative change in the odds of success by the factor $\exp(\beta_j)$.
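The odds-ratio identity can be checked numerically: under the logit link, adding $\beta_j$ to the linear predictor multiplies the odds by exactly $\exp(\beta_j)$. The coefficient and baseline linear predictor below are hypothetical:

```python
import math

def odds(eta):
    """Odds of success for linear predictor eta under the logit link."""
    pi = 1.0 / (1.0 + math.exp(-eta))
    return pi / (1.0 - pi)

beta_j = 0.4  # hypothetical coefficient of the j-th explanatory variable
eta = -0.3    # arbitrary value of x' beta before the one-unit increase

ratio = odds(eta + beta_j) / odds(eta)  # x_j increased by one unit
print(ratio, math.exp(beta_j))  # the two values agree
```

Algebraically this is because $\pi/(1-\pi) = \exp(x'\beta)$, so the ratio of odds is $\exp(x'\beta + \beta_j)/\exp(x'\beta) = \exp(\beta_j)$ regardless of the other covariates.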

If $(L_j, U_j)$ is a $100(1-\alpha)\%$ confidence interval for $\beta_j$, then $(\exp(L_j), \exp(U_j))$ is a $100(1-\alpha)\%$ confidence interval for $\exp(\beta_j)$. Similarly, if $(L, U)$ is a $100(1-\alpha)\%$ confidence interval for $x'\beta$, then a $100(1-\alpha)\%$ confidence interval for $\pi$ is $\left(\dfrac{1}{1+\exp(-L)},\ \dfrac{1}{1+\exp(-U)}\right)$, since the inverse logit is a monotone increasing transformation.
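A short sketch of transforming interval endpoints, assuming $(L_j, U_j)$ is an interval for $\beta_j$ and $(L, U)$ is an interval for the linear predictor $x'\beta$; the endpoint values are made up for illustration:

```python
import math

def invlogit(eta):
    """Inverse logit, mapping x' beta to pi; monotone increasing."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical 95% CI endpoints (illustrative only).
L_j, U_j = 0.1, 0.9   # CI for beta_j
L, U = -0.5, 1.2      # CI for x' beta

print(math.exp(L_j), math.exp(U_j))  # CI for exp(beta_j), the odds ratio
print(invlogit(L), invlogit(U))      # CI for pi
```

Because both $\exp(\cdot)$ and the inverse logit are monotone increasing, transforming the endpoints preserves the coverage probability of the interval.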