## 8.2 LR in R: Predicting Recidivism (1)

• Logistic regression (LR) models the probability that $$Y$$ belongs to a particular category (0 or 1)
• Rather than modeling response $$Y$$ directly
• COMPAS data: Model the probability of recidivating (reoffending)
• Outcome $$y$$: Recidivism is_recid (0,1,0,0,1,1,...)
• Various predictors $$x$$:
• age = age
• prior offenses = priors_count
• Use LR to obtain predicted values $$\hat{y}$$
• As probabilities, the predicted values range between 0 and 1
• They depend on the inputs/features (e.g., age, prior offenses)
• Convert predicted values (probabilities) to a binary variable
• e.g., classify an individual as recidivating (is_recid = Yes) if Pr(is_recid=Yes|age) > 0.5, i.e., p(age) > 0.5
• Here we call this variable classified
• More conservative: Use a lower threshold, e.g., classify an individual as recidivating (is_recid = Yes) if Pr(is_recid=Yes|age) > 0.1, so that more individuals are flagged as at risk
• Source: James et al. (2013, chap. 4.3)
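The steps above (fit the model, obtain predicted probabilities, convert them to a binary variable) can be sketched in base R. This is a minimal illustration, not the original analysis: the data frame `compas_sim` and its data-generating coefficients are simulated stand-ins assumed here because the actual COMPAS file is not shown, and `probability` and `classified_01` are hypothetical column names. Only `is_recid`, `age`, `priors_count`, and `classified` come from the text.

```r
# Simulated stand-in for the COMPAS data (assumption: NOT the real data)
set.seed(1)
n <- 500
compas_sim <- data.frame(
  age          = sample(18:70, n, replace = TRUE),
  priors_count = rpois(n, lambda = 2)
)

# Assumed data-generating process, for illustration only
p_true <- plogis(0.5 - 0.04 * compas_sim$age + 0.3 * compas_sim$priors_count)
compas_sim$is_recid <- rbinom(n, size = 1, prob = p_true)

# Fit the logistic regression: Pr(is_recid = 1 | age, priors_count)
fit <- glm(is_recid ~ age + priors_count,
           family = binomial,
           data = compas_sim)

# Predicted values as probabilities (type = "response" returns the 0-1 scale)
compas_sim$probability <- predict(fit, type = "response")

# Convert probabilities to a binary variable with a 0.5 threshold
compas_sim$classified <- ifelse(compas_sim$probability > 0.5, 1, 0)

# More conservative alternative: lower threshold of 0.1
compas_sim$classified_01 <- ifelse(compas_sim$probability > 0.1, 1, 0)

# Cross-tabulate observed outcome against the classification
table(observed = compas_sim$is_recid, classified = compas_sim$classified)
```

Lowering the threshold from 0.5 to 0.1 can only increase (or leave unchanged) the number of individuals classified as recidivating, which is why it is the more cautious choice when missing a true recidivist is considered costly.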

### References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer.