1 Predicting Recidivism (1): The data

id  name                compas_screening_date  is_recid  is_recid_factor  age  priors_count
 1  miguel hernandez    2013-08-14             0         no               69   0
 3  kevon dixon         2013-01-27             1         yes              34   0
 4  ed philo            2013-04-14             1         yes              24   4
 5  marcu brown         2013-01-13             0         no               23   1
 6  bouthy pierrelouis  2013-03-26             0         no               43   2
 7  marsha miles        2013-11-30             0         no               44   0
 8  edward riddle       2014-02-19             1         yes              41   14
 9  steven stewart      2013-08-30             0         no               43   3
10  elizabeth thieme    2014-03-16             0         no               39   0
13  bo bradac           2013-11-04             1         yes              21   1

2 Predicting Recidivism (2): The Logistic Model

  • Predicting recidivism (0/1): How should we model the relationship between $\Pr(Y=1|X)=p(X)$ and $X$?
    • See Figure 1 below
    • Use either linear probability model or logistic regression
  • Linear probability model: $p(X)=\beta_0+\beta_1 X$
    • Linear predictions of the outcome (probabilities) can fall outside the [0,1] range
  • Logistic regression: $p(X)=\frac{e^{\beta_0+\beta_1 X}}{1+e^{\beta_0+\beta_1 X}}$
    • …forces them into range using the logistic function
    • odds: $\frac{p(X)}{1-p(X)}=e^{\beta_0+\beta_1 X}$ (range: $[0,\infty)$; the higher the odds, the higher the probability of recidivism/default)
    • log-odds/logit: $\log\left(\frac{p(X)}{1-p(X)}\right)=\beta_0+\beta_1 X$
      • …take the logarithm of both sides.
      • Increasing $X$ by one unit changes the log odds by $\beta_1$ (this is the scale on which R usually reports and interprets coefficients)
  • Estimation of β0 and β1 usually relies on maximum likelihood
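
The contrast between the two models can be sketched in a few lines of R. This is only an illustration: the coefficients `beta0` and `beta1` and the grid `x` are hypothetical values chosen for the example, not estimates from the COMPAS data, and the same linear predictor is reused for both models purely to compare the two functional forms.

```r
# Hypothetical coefficients and predictor values, chosen only for illustration
beta0 <- 0.5
beta1 <- 0.05
x <- seq(-40, 40, by = 20)

# Linear predictor beta0 + beta1 * x
eta <- beta0 + beta1 * x

# Linear probability model: the linear predictor is read directly as p(X),
# so nothing keeps it inside [0, 1]
p_linear <- eta

# Logistic regression: the logistic function maps the linear predictor into (0, 1)
p_logistic <- exp(eta) / (1 + exp(eta))

# Odds and log-odds implied by the logistic probabilities
odds <- p_logistic / (1 - p_logistic)
log_odds <- log(odds)

print(data.frame(x, p_linear, p_logistic, odds, log_odds))

# The log-odds recover the linear predictor (up to floating-point error)
print(all.equal(log_odds, eta))
```

Base R's `plogis()` computes the same logistic function, and `qlogis()` is its inverse (the logit), so `plogis(eta)` could replace the explicit formula.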

3 Predicting Recidivism (3)

  • Logistic regression (LR) models the probability that Y belongs to a particular category (0 or 1)
    • Rather than modeling response Y directly
  • COMPAS data: Model probability to recidivate (reoffend)
    • Outcome y: Recidivism is_recid (0,1,0,0,1,1,...)
    • Various predictors xs
      • age = age
      • prior offenses = priors_count
  • Use LR to obtain predicted values $\hat{y}$
    • As probabilities, the predicted values range between 0 and 1
    • They depend on the inputs/features (e.g., age, prior offences)
  • Convert predicted values (probabilities) to a binary variable
    • e.g., predict individuals will recidivate (is_recid = Yes) if Pr(is_recid=Yes|age) > 0.5
    • Here we call this variable y_hat_01
  • Source: James et al. ()
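
The thresholding step can be sketched as follows. The ten rows are the ones shown in the data excerpt of these notes; the coefficients are the rounded estimates from the `glm()` summary reported in the model-estimation section, typed in by hand here as an assumption so that the example runs without the full data set. The helper names `d`, `b0`, `b_age`, and `b_priors` are made up for this sketch.

```r
# Outcome, age, and prior offences for the ten individuals in the data excerpt
d <- data.frame(
  is_recid     = c(0, 1, 1, 0, 0, 0, 1, 0, 0, 1),
  age          = c(69, 34, 24, 23, 43, 44, 41, 43, 39, 21),
  priors_count = c(0, 0, 4, 1, 2, 0, 14, 3, 0, 1)
)

# Assumption: rounded coefficient estimates copied from the glm() summary
b0       <- 1.101001
b_age    <- -0.049831
b_priors <- 0.159982

# Predicted probabilities: logistic function applied to the linear predictor
d$y_hat <- plogis(b0 + b_age * d$age + b_priors * d$priors_count)

# Convert probabilities to a binary prediction using the 0.5 threshold
d$y_hat_01 <- as.integer(d$y_hat > 0.5)

print(d)

# Cross-tabulate observed outcomes against binary predictions
print(table(observed = d$is_recid, predicted = d$y_hat_01))
```

With a fitted model object available, the same conversion would typically be written as `as.integer(predict(fit, type = "response") > 0.5)`. The 0.5 cut-off is a convention, and other thresholds trade off the two kinds of classification error differently.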

4 Predicting Recidivism (4): Model estimation

  • Estimate model in R: glm(y ~ x1 + x2, family = binomial, data = data_train)
fit <- glm(as.factor(is_recid) ~ age + priors_count, 
           family = binomial, 
           data = data_train)
cat(paste(capture.output(summary(fit))[11:14], collapse="\n"))
              Estimate Std. Error z value            Pr(>|z|)    
(Intercept)   1.101001   0.097597   11.28 <0.0000000000000002 ***
age          -0.049831   0.002861  -17.42 <0.0000000000000002 ***
priors_count  0.159982   0.008236   19.43 <0.0000000000000002 ***
  • R output shows log odds: e.g., a one-unit increase in age is associated with a change of -0.05 in the log odds of is_recid (i.e., a decrease of about 0.05), holding priors_count constant
  • Log odds are difficult to interpret; predicted probabilities are much easier to work with
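
One common intermediate step, sketched here as an optional aid rather than as part of the original workflow, is to exponentiate the coefficients so that they are read as odds ratios. The vector `coefs` below is typed in from the rounded estimates in the output above (an assumption made so the snippet runs on its own); with the fitted model, `exp(coef(fit))` would give the same quantities.

```r
# Assumption: rounded estimates copied by hand from the summary output above
coefs <- c("(Intercept)" = 1.101001,
           age           = -0.049831,
           priors_count  = 0.159982)

# Exponentiating a log-odds coefficient gives an odds ratio:
# the factor by which the odds change for a one-unit increase in the predictor
odds_ratios <- exp(coefs)
print(round(odds_ratios, 3))
```

Read this way, each additional year of age multiplies the odds of recidivism by roughly 0.95, and each additional prior offence multiplies them by roughly 1.17, in each case holding the other predictor constant.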

5 Predicting Recidivism (5): Use model to predict

  • predict(): Predict values in R (or augment())

    • Once coefficients have been estimated, it is a simple matter to compute the probability of outcome for values of our predictors ()
    • predict(fit, newdata = NULL, type = "response"): Predict probability for each unit
    • Use argument type="response" to output probabilities of form P(Y=1|X) (as opposed to other information such as the logit)
  • predict(fit, newdata = data_predict, type = "response"): Predict probability setting values for particular Xs (contained in data_predict)

data_predict <- data.frame(age = c(20, 20, 40, 40),
                          priors_count = c(0, 2, 0, 2))
data_predict$y_hat <- predict(fit, newdata = data_predict, type = "response")
  age priors_count     y_hat
1  20            0 0.5260711
2  20            2 0.6045219
3  40            0 0.2906473
4  40            2 0.3607111
  • Q: How would you interpret these values?
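
To see where these numbers come from, the predictions can be reproduced by hand from the estimated coefficients. The sketch below types in the rounded estimates from the model summary (an assumption, so the results agree with `predict()` only up to rounding) and applies the logistic function to the linear predictor; the names `b` and `y_hat_manual` are made up for this example.

```r
# Assumption: rounded estimates copied by hand from the model summary
b <- c(intercept = 1.101001, age = -0.049831, priors_count = 0.159982)

data_predict <- data.frame(age = c(20, 20, 40, 40),
                           priors_count = c(0, 2, 0, 2))

# Linear predictor (log-odds) for each row
eta <- b[["intercept"]] +
  b[["age"]] * data_predict$age +
  b[["priors_count"]] * data_predict$priors_count

# Logistic function turns log-odds into probabilities P(Y = 1 | X)
data_predict$y_hat_manual <- plogis(eta)
print(data_predict)
```

For example, the first row is a 20-year-old with no prior offences: the log-odds are 1.101001 - 0.049831 * 20 = 0.104381, which the logistic function maps to a probability of about 0.526.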

6 Predicting Recidivism (6)

  • Background story by ProPublica: Machine Bias
  • Replication and extension by Dressel and Farid (): The Accuracy, Fairness, and Limits of Predicting Recidivism
    • Abstract: “Algorithms for predicting recidivism are commonly used to assess a criminal defendant’s likelihood of committing a crime. […] used in pretrial, parole, and sentencing decisions. […] We show, however, that the widely used commercial risk assessment software COMPAS is no more accurate or fair than predictions made by people with little or no criminal justice expertise. In addition, despite COMPAS’s collection of 137 features, the same accuracy can be achieved with a simple linear classifier with only two features.”
  • Very nice lab by Lee, Du, and Guerzhoy (): Auditing the COMPAS Score: Predictive Modeling and Algorithmic Fairness
  • We will work with the corresponding data and use it to grasp various concepts underlying statistical/machine learning


