## 6.17 LR in R: Predicting Recidivism (1)

• Logistic regression (LR) models the probability that $$Y$$ belongs to a particular category (0 or 1)
• Rather than modeling the response $$Y$$ directly
• COMPAS data: Model the probability of recidivating (reoffending)
• Outcome $$y$$: Recidivism is_recid (0,1,0,0,1,1,...)
• Predictors $$x$$'s: age = age, prior offenses = priors_count
• Predicted values $$\hat{y}$$: Pr(is_recid=Yes|age)
• Values of Pr(is_recid=Yes|age) (abbr. p(age)) will range between 0 and 1
• For a given value of age (and of any other covariates in the model), a prediction can be made for the outcome is_recid
• We can convert our predicted value (= a probability) to a 0/1 variable by choosing a threshold
• e.g., predict that an individual will recidivate (is_recid = Yes) if Pr(is_recid=Yes|age) > 0.5 (p(age) > 0.5)
• More conservative (flags more individuals as likely to recidivate): Use a lower threshold, e.g., predict that an individual will recidivate (is_recid = Yes) if Pr(is_recid=Yes|age) > 0.1
• Source: James et al. (2013, Chap. 4.3)
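
The steps above can be sketched in R as follows. This is a minimal illustration, not the course's actual analysis. The data frame `data` is simulated here as a stand-in for the COMPAS data (only the variable names `is_recid`, `age` and `priors_count` come from the text), and the coefficients used to generate it are arbitrary assumptions. The object names `fit`, `p_hat`, `pred_50` and `pred_10` are likewise hypothetical.

```r
# Simulated stand-in for the COMPAS data (assumption: NOT the real data)
set.seed(1)
n <- 1000
data <- data.frame(
  age          = sample(18:70, n, replace = TRUE),
  priors_count = rpois(n, lambda = 3)
)

# Arbitrary data-generating process, chosen only for illustration
p_true <- plogis(0.5 - 0.04 * data$age + 0.15 * data$priors_count)
data$is_recid <- rbinom(n, size = 1, prob = p_true)

# Logistic regression: models Pr(is_recid = 1 | age, priors_count)
fit <- glm(is_recid ~ age + priors_count, family = binomial, data = data)

# Predicted values = probabilities, each between 0 and 1
data$p_hat <- predict(fit, type = "response")
range(data$p_hat)

# Convert probabilities to 0/1 predictions using two thresholds
data$pred_50 <- as.integer(data$p_hat > 0.5)
data$pred_10 <- as.integer(data$p_hat > 0.1)

# Compare observed outcomes with predictions at the 0.5 threshold
table(observed = data$is_recid, predicted = data$pred_50)

# Share of individuals predicted to recidivate under each threshold
mean(data$pred_50)
mean(data$pred_10)
```

Anyone predicted to recidivate at the 0.5 threshold is also predicted to recidivate at the 0.1 threshold, so lowering the threshold can only increase the number of individuals flagged.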

