Big data and Social Science

10.1 Dressel and Farid (2018): Predicting recidivism

Q: What's the topic of this study?
Dressel and Farid (2018): “The accuracy, fairness, and limits of predicting recidivism”
- “In the criminal justice system, predictive algorithms have been used to predict where crimes will most likely occur, who is most likely to commit a violent crime, who is likely to fail to appear at their court hearing, and who is likely to reoffend at some point in the future.”
- COMPAS has been used to assess more than 1 million offenders since it was developed in 1999 (its recidivism prediction component has been in use since 2000).
  - “software predicts a defendant’s risk of committing a misdemeanor or felony within 2 years of assessment from 137 features about an individual and the individual’s past criminal record.”
What is the research question?
- General: Are algorithms better at predicting than humans?
- Specific: Does the COMPAS algorithm perform better than humans in predicting recidivism?
What is the hypothesis?
- Possible hypotheses: the algorithm predicts recidivism worse than, better than, or as well as humans
What data do they use?
- Database of 2013/2014 pretrial defendants from Broward County, Florida
  - 7214 defendants with individual demographic information, criminal history, the COMPAS recidivism risk score, and each defendant’s arrest record within a 2-year period following the COMPAS scoring
    - COMPAS scores, ranging from 1 to 10, classify the risk of recidivism as low-risk (1 to 4), medium-risk (5 to 7), or high-risk (8 to 10)
  - Algorithmic assessment based on full set of 7214 defendants
  - Human assessment was based on a random subset of 1000 defendants, which was held fixed throughout all conditions
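The score ranges listed above can be written as a small lookup. This is a minimal sketch of the low/medium/high binning only; the function name `compas_risk_category` is made up for illustration and is not part of COMPAS or of the study's code.

```python
def compas_risk_category(score: int) -> str:
    """Map a COMPAS score (1 to 10) to the risk category used in the notes.

    Hypothetical helper for illustration; only the cut-offs come from the text.
    """
    if not 1 <= score <= 10:
        raise ValueError(f"COMPAS scores range from 1 to 10, got {score}")
    if score <= 4:
        return "low"      # scores 1 to 4
    if score <= 7:
        return "medium"   # scores 5 to 7
    return "high"         # scores 8 to 10


# Example: categorize a few scores
for s in (2, 5, 9):
    print(s, compas_risk_category(s))
```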
What is their finding?
- “people from a popular online crowdsourcing marketplace - who, it can reasonably be assumed, have little to no expertise in criminal justice - are as accurate and fair as COMPAS at predicting recidivism”
- “Collectively, these results cast significant doubt on the entire effort of algorithmic recidivism prediction”
- A simple linear classifier with only two features (age and total number of previous convictions) performs about as well as COMPAS
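To make "a simple linear classifier with two features" concrete, here is a rough sketch of what such a model looks like once fitted, written as a logistic model. The coefficients below are placeholders chosen purely for illustration; they are not the values estimated by Dressel and Farid, and the function name is hypothetical.

```python
import math

# Placeholder coefficients for illustration only (NOT taken from the paper).
W_AGE = -0.05      # assumed: predicted risk decreases with age
W_PRIORS = 0.25    # assumed: predicted risk increases with prior convictions
INTERCEPT = 0.5


def predict_recidivism(age: float, priors: int, threshold: float = 0.5):
    """Return (probability, predicted label) from a two-feature linear model."""
    z = INTERCEPT + W_AGE * age + W_PRIORS * priors   # linear combination
    prob = 1.0 / (1.0 + math.exp(-z))                 # logistic link
    return prob, prob >= threshold


# Example: a young defendant with several priors vs. an older one with none
print(predict_recidivism(age=20, priors=5))
print(predict_recidivism(age=60, priors=0))
```

The point of the sketch is how little goes into such a model: two inputs, two weights, and an intercept, compared with the 137 features COMPAS draws on.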
Q: What struck you about this study? What did you find interesting?
- What do the authors mention about race (Tip: Correlates with..)?
- What is the outcome variable in the study? What are false positive and false negatives in this context?
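As a starting point for the second question: when the outcome is whether a defendant reoffends, a false positive is a defendant predicted to reoffend who did not, and a false negative is a defendant predicted not to reoffend who did. The sketch below computes both error rates from a list of predictions and observed outcomes. The six cases are made up for illustration and are not data from the study.

```python
def error_rates(predicted, actual):
    """Compute accuracy, false positive rate, and false negative rate.

    predicted / actual: sequences of booleans (True = reoffends).
    """
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)            # predicted yes, did reoffend
    fp = sum(p and not a for p, a in pairs)        # predicted yes, did not
    tn = sum(not p and not a for p, a in pairs)    # predicted no, did not
    fn = sum(not p and a for p, a in pairs)        # predicted no, did reoffend
    return {
        "accuracy": (tp + tn) / len(pairs),
        # share of non-reoffenders wrongly flagged as likely to reoffend
        "false_positive_rate": fp / (fp + tn) if fp + tn else float("nan"),
        # share of reoffenders wrongly predicted not to reoffend
        "false_negative_rate": fn / (fn + tp) if fn + tp else float("nan"),
    }


# Made-up toy example, not data from the study
predicted = [True, True, False, False, True, False]
actual = [True, False, False, True, True, False]
print(error_rates(predicted, actual))
```

Computing these rates separately for each group (for example, by race) is one way to compare how the two kinds of error are distributed across groups.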
Insight: Widespread use of questionable prediction/classification algorithms

References

Dressel, Julia, and Hany Farid. 2018. “The Accuracy, Fairness, and Limits of Predicting Recidivism.” Science Advances 4 (1): eaao5580.