Chapter 26 A Generalized Linear Model for Bernoulli Response Data
Example: for each $i = 1, \dots, n$, $y_i \sim \text{Bernoulli}(\pi_i)$ with $\pi_i = \frac{\exp(x_i'\beta)}{1+\exp(x_i'\beta)}$, and $y_1, \dots, y_n$ are independent. This model is called a logistic regression model.
The function $g(\pi) = \log\left(\frac{\pi}{1-\pi}\right)$ is called the logit function. The ratio $\frac{\pi}{1-\pi}$ is the odds of success, so $\log\left(\frac{\pi}{1-\pi}\right)$ is called the log odds.
Note that $g(\pi_i) = x_i'\beta$. In GLM terminology, the logit is called the link function. For GLMs it is not necessary that the mean of $y_i$ itself be a linear function of $\beta$; only the link function applied to the mean must be. Here are some other link functions used for Bernoulli response data:
- probit: $\Phi^{-1}(\pi) = x'\beta$, where $\Phi$ is the standard normal CDF
- complementary log-log (`cloglog` in R): $\log(-\log(1-\pi)) = x'\beta$
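The three link functions above are invertible maps from $(0,1)$ onto the real line, so each linear predictor value corresponds to exactly one probability. A minimal sketch of the links and their inverses (the function names here are my own, not from the text; the probit quantile is computed by bisection on the normal CDF to keep the example dependency-free):

```python
import math

def logit(p):
    # logit link: log odds
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def probit(p):
    # probit link: standard normal quantile, found by bisection on
    # Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if 0.5 * (1.0 + math.erf(mid / math.sqrt(2.0))) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def inv_probit(eta):
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def cloglog(p):
    # complementary log-log link
    return math.log(-math.log(1.0 - p))

def inv_cloglog(eta):
    return 1.0 - math.exp(-math.exp(eta))

# each inverse link maps the linear predictor back to the same probability
for link, inv in [(logit, inv_logit), (probit, inv_probit), (cloglog, inv_cloglog)]:
    assert abs(inv(link(0.3)) - 0.3) < 1e-6
```

Note that the cloglog link, unlike the other two, is not symmetric about $\pi = 0.5$.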
For GLMs, Fisher's scoring method is typically used to obtain the MLE of $\beta$, denoted $\hat{\beta}$. Fisher's scoring method is a variation of the Newton-Raphson algorithm in which the Hessian matrix (the matrix of second partial derivatives of the log-likelihood) is replaced by its expected value, which is the negative of the Fisher information matrix.
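A sketch of Fisher scoring for the logistic (logit-link) model. For the canonical logit link the Hessian equals its expectation, so Fisher scoring coincides with Newton-Raphson; the function name and the simulated data below are illustrative, not from the text:

```python
import numpy as np

def fisher_scoring_logistic(X, y, tol=1e-8, max_iter=50):
    """Fisher scoring iterations for the logistic regression MLE."""
    beta = np.zeros(X.shape[1])
    info = np.eye(X.shape[1])
    for _ in range(max_iter):
        eta = X @ beta
        pi = 1.0 / (1.0 + np.exp(-eta))       # fitted probabilities
        score = X.T @ (y - pi)                # gradient of log-likelihood
        w = pi * (1.0 - pi)                   # Bernoulli variances
        info = X.T @ (w[:, None] * X)         # Fisher information I(beta)
        step = np.linalg.solve(info, score)   # scoring update direction
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # return the MLE and the estimated variance-covariance matrix I^{-1}(beta_hat)
    return beta, np.linalg.inv(info)

# check on simulated data (true_beta is made up for the demo)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
true_beta = np.array([-0.5, 1.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))
beta_hat, cov = fisher_scoring_logistic(X, y)
```

The inverse information matrix returned alongside $\hat{\beta}$ is exactly the variance estimate used in the normal approximation described next.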
For sufficiently large samples, $\hat{\beta}$ is approximately normal with mean $\beta$ and a variance-covariance matrix that can be estimated by the inverse of the Fisher information matrix evaluated at $\hat{\beta}$, i.e., $\hat{\beta} \,\dot\sim\, N\!\left(\beta, I^{-1}(\beta)\right)$, with $I^{-1}(\hat{\beta})$ as the estimate.
The odds ratio: let $\tilde{\pi}$ denote the success probability after a one-unit increase in the $j$th explanatory variable, with all other explanatory variables held constant. Then $\frac{\tilde{\pi}/(1-\tilde{\pi})}{\pi/(1-\pi)} = \exp(\beta_j)$. This can be interpreted as: a one-unit increase in the $j$th explanatory variable (with all other explanatory variables held constant) is associated with a multiplicative change in the odds of success by the factor $\exp(\beta_j)$.
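A quick numeric check of this interpretation, using a made-up coefficient vector and covariate values; the point is that the odds ratio depends only on $\beta_j$, not on where the other covariates sit:

```python
import math

beta   = [-1.0, 0.7, 0.3]  # illustrative: intercept, beta_1, beta_2
x      = [1.0, 2.0, 5.0]   # covariate vector (first entry is the intercept)
x_plus = [1.0, 3.0, 5.0]   # jth variable (j = 1) increased by one unit

def pi_of(x, beta):
    # inverse logit of the linear predictor x'beta
    eta = sum(b * v for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-eta))

pi, pi_tilde = pi_of(x, beta), pi_of(x_plus, beta)
odds_ratio = (pi_tilde / (1 - pi_tilde)) / (pi / (1 - pi))
# the odds ratio equals exp(beta_1) exactly, for any values of the other covariates
assert abs(odds_ratio - math.exp(beta[1])) < 1e-9
```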
If $(L_j, U_j)$ is a $100(1-\alpha)\%$ confidence interval for $\beta_j$, then $(\exp(L_j), \exp(U_j))$ is a $100(1-\alpha)\%$ confidence interval for $\exp(\beta_j)$. Similarly, if $(L, U)$ is a $100(1-\alpha)\%$ confidence interval for the linear predictor $x'\beta$, then $\left(\frac{1}{1+\exp(-L)}, \frac{1}{1+\exp(-U)}\right)$ is a $100(1-\alpha)\%$ confidence interval for $\pi$, since the inverse logit is monotone increasing.
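Because both transformations are monotone, the endpoints can simply be mapped through; a sketch with made-up interval endpoints (suppose $(L, U) = (0.2, 1.4)$ is a 95% CI for $x'\beta$):

```python
import math

L, U = 0.2, 1.4  # illustrative 95% CI endpoints for x'beta

# CI for exp(x'beta), the odds, via monotonicity of exp
ci_odds = (math.exp(L), math.exp(U))

# CI for pi via monotonicity of the inverse logit
ci_pi = (1.0 / (1.0 + math.exp(-L)), 1.0 / (1.0 + math.exp(-U)))
```

Transforming the interval on the linear-predictor scale is generally preferred to building a Wald interval for $\pi$ directly, since the resulting interval always stays inside $(0, 1)$.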