Exercise
- In our lab we used a logistic model to predict whether someone will recidivate or not. So far we have used only two variables/features in our predictive model, namely age and priors_count. Extend the model we used above with three more predictors (race, sex, juv_fel_count) and train the model using the training dataset data_train.
- Compute the training error rate for your model and compare it to that of the model we used in the lab (is_recid ~ age + priors_count). Did you manage to build a better model?
- Using your model, predict the probabilities of recidivism for three different ages (20, 40, 60 years). Keep the values of the other features fixed at the following values: race = "African-American", sex = "Male" and juv_fel_count = mean(data_train$juv_fel_count).
- Compute the test error rate for your model and compare it to that of the model above (is_recid ~ age + priors_count). How does your model fare in terms of out-of-sample prediction?
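
One possible way to approach the four steps is sketched below in base R. It is a rough outline under several assumptions, not the lab's reference solution: the simulated data frames only stand in for the lab's data_train and data_test so that the snippet runs on its own, is_recid is assumed to be coded 0/1, the classification threshold of 0.5 and the helper error_rate() are hypothetical choices, and priors_count is held at its training mean for the predictions because the exercise does not fix a value for it.

```r
# Hypothetical stand-in data so the sketch runs on its own.
# In the lab, skip this part and use the existing data_train / data_test.
set.seed(1)
simulate_data <- function(n) {
  d <- data.frame(
    age = sample(18:70, n, replace = TRUE),
    priors_count = rpois(n, 3),
    race = factor(
      sample(c("African-American", "Caucasian", "Hispanic"), n, replace = TRUE),
      levels = c("African-American", "Caucasian", "Hispanic")
    ),
    sex = factor(sample(c("Male", "Female"), n, replace = TRUE),
                 levels = c("Female", "Male")),
    juv_fel_count = rpois(n, 0.1)
  )
  # Simulated outcome, coded 0/1 (assumption about the coding of is_recid).
  p <- plogis(0.5 - 0.04 * d$age + 0.15 * d$priors_count)
  d$is_recid <- rbinom(n, 1, p)
  d
}
data_train <- simulate_data(800)
data_test <- simulate_data(200)

# Step 1: the lab model and the extended model, both fit on the training data.
fit_lab <- glm(is_recid ~ age + priors_count,
               family = binomial, data = data_train)
fit_ext <- glm(is_recid ~ age + priors_count + race + sex + juv_fel_count,
               family = binomial, data = data_train)

# Hypothetical helper: share of cases where the predicted class
# (probability above the threshold) differs from the observed outcome.
error_rate <- function(fit, data, threshold = 0.5) {
  prob <- predict(fit, newdata = data, type = "response")
  predicted <- as.integer(prob > threshold)
  mean(predicted != data$is_recid)
}

# Step 2: training error rates of both models.
train_errors <- c(lab = error_rate(fit_lab, data_train),
                  extended = error_rate(fit_ext, data_train))

# Step 3: predicted probabilities for ages 20, 40 and 60.
# priors_count at its training mean is an assumption, not part of the exercise.
new_cases <- data.frame(
  age = c(20, 40, 60),
  priors_count = mean(data_train$priors_count),
  race = "African-American",
  sex = "Male",
  juv_fel_count = mean(data_train$juv_fel_count)
)
predicted_probs <- predict(fit_ext, newdata = new_cases, type = "response")

# Step 4: test error rates of both models.
test_errors <- c(lab = error_rate(fit_lab, data_test),
                 extended = error_rate(fit_ext, data_test))

print(train_errors)
print(predicted_probs)
print(test_errors)
```

Comparing train_errors with test_errors for the two models shows whether any gain from the extra predictors carries over to data the model has not seen; a lower training error alone does not establish that.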