8.7 Exercise

  1. In the lab we used a logistic model to predict whether someone will recidivate. So far our predictive model has used only two variables/features, namely age and priors_count. Extend that model with three more predictors (race, sex, juv_fel_count) and train it on the training dataset data_train.
  2. Compute the training error rate for your model and compare it to that of the model we used in the lab (is_recid ~ age + priors_count). Did you manage to build a better model?
  3. Using your model, predict the probabilities of recidivism for three different ages (20, 40, 60 years). Hold the other features at the following values: race = "African-American", sex = "Male" and juv_fel_count = mean(data_train$juv_fel_count).
  4. Compute the test error rate for your model and compare it to that of the lab model (is_recid ~ age + priors_count). How does your model fare in terms of out-of-sample prediction?
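
One possible way to approach the four steps is sketched below. It is a sketch under stated assumptions rather than a reference solution: the data are a synthetic stand-in so that the code runs on its own (in the lab you would use your real data_train and a hypothetical data_test from your own split), the helper error_rate and the 0.5 classification threshold are our own choices, and because the exercise does not fix priors_count for step 3, we assume its training mean.

```r
# Synthetic stand-in for the lab data (ASSUMPTION: replace with the real
# data_train / data_test from the lab; only the column names match).
set.seed(1)
n <- 1000
data <- data.frame(
  age           = sample(18:70, n, replace = TRUE),
  priors_count  = rpois(n, 2),
  race          = factor(sample(c("African-American", "Caucasian", "Hispanic"),
                                n, replace = TRUE)),
  sex           = factor(sample(c("Male", "Female"), n, replace = TRUE)),
  juv_fel_count = rpois(n, 0.1)
)
p <- plogis(0.5 - 0.04 * data$age + 0.15 * data$priors_count +
              0.3 * data$juv_fel_count)
data$is_recid <- rbinom(n, 1, p)

idx        <- sample(n, 800)   # 80/20 split, chosen for illustration only
data_train <- data[idx, ]
data_test  <- data[-idx, ]

# Step 1: lab model and extended model, both fitted on the training data
fit_lab <- glm(is_recid ~ age + priors_count,
               family = binomial, data = data_train)
fit_ext <- glm(is_recid ~ age + priors_count + race + sex + juv_fel_count,
               family = binomial, data = data_train)

# Hypothetical helper: share of misclassified cases at a given threshold
error_rate <- function(fit, newdata, threshold = 0.5) {
  prob <- predict(fit, newdata = newdata, type = "response")
  pred <- as.integer(prob > threshold)
  mean(pred != newdata$is_recid)
}

# Step 2: training error rates
train_err <- c(lab      = error_rate(fit_lab, data_train),
               extended = error_rate(fit_ext, data_train))

# Step 3: predicted probabilities for ages 20, 40 and 60
# (ASSUMPTION: priors_count is held at its training mean)
new_cases <- data.frame(
  age           = c(20, 40, 60),
  priors_count  = mean(data_train$priors_count),
  race          = factor("African-American", levels = levels(data_train$race)),
  sex           = factor("Male", levels = levels(data_train$sex)),
  juv_fel_count = mean(data_train$juv_fel_count)
)
probs <- predict(fit_ext, newdata = new_cases, type = "response")

# Step 4: test error rates (out-of-sample)
test_err <- c(lab      = error_rate(fit_lab, data_test),
              extended = error_rate(fit_ext, data_test))

print(train_err)
print(probs)
print(test_err)
```

Note that a lower training error for the extended model does not by itself show that it is better: adding predictors can only improve the in-sample fit in terms of deviance, so the comparison on the test data in step 4 is the more informative one.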