6.16 The Logistic Model

  • How should we model the relationship between \(p(X)=\Pr(Y=1 \mid X)\) and \(X\)?
    • See Figure 4.2 in James et al. (2013, 131)
    • Use either a linear probability model or logistic regression
  • Linear probability model: \(p(X)=\beta_{0}+\beta_{1}X\)
    • Predictions are linear in \(X\), so predicted probabilities can fall outside the \([0,1]\) range
  • Logistic regression (uses logistic function): \(p(X)=\frac{e^{\beta_{0}+\beta_{1}X}}{1+e^{\beta_{0}+\beta_{1}X}}\)
    • odds: \(\frac{p(X)}{1-p(X)}=e^{\beta_{0}+\beta_{1}X}\) (range: \([0,\infty)\); higher odds mean a higher probability of recidivism)
    • log-odds (logit): \(\log\left(\frac{p(X)}{1-p(X)}\right)=\beta_{0}+\beta_{1}X\), which is linear in \(X\) (James et al. 2013, 132)
  • Estimation of \(\beta_{0}\) and \(\beta_{1}\) usually relies on maximum likelihood
  • See James et al. (2013 Chap. 4.3.4) for an overview
  • Source: James et al. (2013 Chap. 4.3.1, 4.3.2, 4.3.4)
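
The points above can be illustrated with a minimal Python sketch. It is not taken from James et al. (2013): the function names, the coefficient values, and the simulated data are all assumptions made for illustration, and the Newton-Raphson fit is only one common way to maximize the logistic likelihood.

```python
import numpy as np

def linear_probability(x, b0, b1):
    # Linear probability model: p(X) = b0 + b1 * X (not bounded to [0, 1])
    return b0 + b1 * x

def logistic(x, b0, b1):
    # Logistic function: p(X) = e^(b0 + b1 X) / (1 + e^(b0 + b1 X))
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))

def odds(p):
    # Odds: p / (1 - p)
    return p / (1.0 - p)

def log_odds(p):
    # Log-odds (logit): log(p / (1 - p))
    return np.log(odds(p))

def fit_logistic(x, y, n_iter=25):
    # Maximum likelihood for (b0, b1) via Newton-Raphson (illustrative choice)
    X = np.column_stack([np.ones(len(x)), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        gradient = X.T @ (y - p)                   # score of the log-likelihood
        weights = p * (1.0 - p)
        information = X.T @ (X * weights[:, None]) # negative Hessian
        beta = beta + np.linalg.solve(information, gradient)
    return beta

# Hypothetical coefficients, chosen only to show the range problem
b0, b1 = 0.5, 0.3
x_grid = np.linspace(-3, 3, 7)
lpm = linear_probability(x_grid, b0, b1)  # some values fall below 0 or above 1
p = logistic(x_grid, b0, b1)              # always strictly between 0 and 1
print("linear probability model:", np.round(lpm, 2))
print("logistic regression:     ", np.round(p, 2))
print("log-odds:                ", np.round(log_odds(p), 2))

# Simulated data with assumed true coefficients (-0.5, 1.5)
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = rng.binomial(1, logistic(x, -0.5, 1.5))
beta_hat = fit_logistic(x, y)
print("estimated (b0, b1):", np.round(beta_hat, 2))
```

The log-odds printed for the logistic model equal \(\beta_{0}+\beta_{1}X\) on the grid, which is the sense in which logistic regression is linear on the logit scale. With real data one would typically use a library routine rather than a hand-written fit.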


James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer.