8.1 The Logistic Model

  • Predicting recidivism (0/1): How should we model the relationship between $p(X) = \Pr(Y = 1 \mid X)$ and $X$?
    • See Figure 4.2 in James et al. (2013, 131)
    • Use either a linear probability model or logistic regression
  • Linear probability model: $p(X) = \beta_0 + \beta_1 X$
    • Predictions are linear in $X$, so predicted probabilities can fall outside the $[0, 1]$ range
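  • Logistic regression (uses the logistic function): $p(X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$
    • odds: $\frac{p(X)}{1 - p(X)} = e^{\beta_0 + \beta_1 X}$ (range: $[0, \infty)$; the higher the odds, the higher the probability of recidivism/default)
    • log-odds/logit: $\log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + \beta_1 X$ (James et al. 2013, 132)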
      • Increasing $X$ by one unit changes the log-odds by $\beta_1$ (R usually reports coefficients on this log-odds scale)
  • Estimation of $\beta_0$ and $\beta_1$ usually relies on maximum likelihood
  • See James et al. (2013, chap. 4.3.4) for an overview
  • Source: James et al. (2013, chaps. 4.3.1, 4.3.2, 4.3.4)
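
The quantities in the list above can be sketched in a few lines of Python (used here only for illustration; the notes themselves refer to R). This is a minimal sketch, not the method used by any particular package: the function names, the toy data, and the coefficient values are all made up for this example, and the fit uses plain gradient ascent on the log-likelihood, chosen for brevity rather than speed.

```python
import math


def linear_probability(x, b0, b1):
    # Linear probability model: p(X) = b0 + b1 * X, not bounded to [0, 1].
    return b0 + b1 * x


def logistic_probability(x, b0, b1):
    # Logistic function, written as 1 / (1 + e^(-z)), which is algebraically
    # equal to e^z / (1 + e^z) with z = b0 + b1 * X.
    z = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    # Odds: p / (1 - p).
    return p / (1.0 - p)


def log_odds(p):
    # Log-odds (logit): log(p / (1 - p)).
    return math.log(odds(p))


def score(xs, ys, b0, b1):
    # Gradient of the Bernoulli log-likelihood with respect to (b0, b1).
    g0 = sum(y - logistic_probability(x, b0, b1) for x, y in zip(xs, ys))
    g1 = sum((y - logistic_probability(x, b0, b1)) * x for x, y in zip(xs, ys))
    return g0, g1


def fit_logistic(xs, ys, steps=20000, lr=0.1):
    # Maximum likelihood by gradient ascent on the mean log-likelihood.
    # The step count and learning rate are assumptions tuned to the toy data.
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0, g1 = score(xs, ys, b0, b1)
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Hypothetical toy data (not from the source): X is a predictor, Y is 0/1.
toy_x = [0, 1, 2, 3, 4, 5, 6, 7]
toy_y = [0, 0, 1, 0, 1, 0, 1, 1]

b0_hat, b1_hat = fit_logistic(toy_x, toy_y)
print("estimated coefficients (log-odds scale):", b0_hat, b1_hat)

# With illustrative coefficients b0 = 0.1, b1 = 0.2, the linear model
# leaves the [0, 1] range at X = 10 while the logistic model does not.
print("linear:", linear_probability(10, 0.1, 0.2))
print("logistic:", logistic_probability(10, 0.1, 0.2))
```

Applying log_odds to logistic_probability(x, b0, b1) returns b0 + b1 * x up to rounding, so the difference in log-odds between x + 1 and x is b1, which is the interpretation given in the list.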

References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer.