8.4 LR in R: Predicting Recidivism (3): Use model to predict

  • predict(): Predict values in R
    • Once the coefficients have been estimated, it is a simple matter to compute the probability of the outcome for any given values of our predictors (James et al. 2013, 134)
    • predict(fit, newdata = NULL, type = "response"): With newdata = NULL (the default), predict the probability for each unit in the data used to fit the model
    • Use argument type = "response" to output probabilities of the form \(P(Y=1|X)\); the default, type = "link", returns the linear predictor instead, i.e. the logit
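library(dplyr)  # provides if_else(), select() and %>%
# Predicted probability P(Y = 1 | X) for each unit in the training data
data_train$probability <- predict(fit, type = "response")
# Classify as 1 when the predicted probability is at least 0.5, otherwise 0
data_train$classified <- if_else(data_train$probability >= 0.5, 1, 0)
head(data_train %>% select(is_recid, age, priors_count, probability, classified))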
is_recid age priors_count probability classified
       0  69            0   0.0880770          0
       1  34            0   0.3558881          0
       1  24            4   0.6329705          1
       0  23            1   0.5286854          1
       0  43            2   0.3270028          0
       0  44            0   0.2513234          0
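The relationship between type = "response" and the logit scale can be checked directly. The following is a minimal sketch on simulated data: the data frame sim, the model fit_sim and the coefficients used to generate the outcome are all hypothetical stand-ins, since the course's data_train and fit are not reproduced here.

```r
# Simulated stand-in data (hypothetical; not the recidivism data used above)
set.seed(1)
n <- 500
sim <- data.frame(age = sample(18:70, n, replace = TRUE),
                  priors_count = rpois(n, 2))
# Outcome drawn from an assumed logistic model, for illustration only
sim$is_recid <- rbinom(n, 1, plogis(1 - 0.05 * sim$age + 0.15 * sim$priors_count))

fit_sim <- glm(is_recid ~ age + priors_count, family = binomial, data = sim)

p_response <- predict(fit_sim, type = "response")  # probabilities P(Y = 1 | X)
log_odds   <- predict(fit_sim, type = "link")      # linear predictor (logit scale)

# Applying the inverse logit, plogis(), to the log-odds recovers the probabilities
same_scale <- all.equal(unname(p_response), unname(plogis(log_odds)))
same_scale

# Classifying at a probability cutoff of 0.5 corresponds to log-odds >= 0
classified <- as.integer(p_response >= 0.5)
table(classified, sim$is_recid)
```

The 0.5 cutoff is the one used in the notes above; other cutoffs trade off false positives against false negatives differently.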
  • predict(fit, newdata = data_predict, type = "response"): Predict probabilities for chosen values of the predictors (supplied as rows of data_predict)
data_predict <- data.frame(age = c(20, 20, 40, 40),
                           priors_count = c(0, 2, 0, 2))
data_predict$probability <- predict(fit, newdata = data_predict, type = "response")
data_predict
age priors_count probability
 20            0   0.5260711
 20            2   0.6045219
 40            0   0.2906473
 40            2   0.3607111
  • Q: How would you interpret these values?
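To see what predict() does with newdata, the same probabilities can be computed by hand from the estimated coefficients. This is a sketch under stated assumptions: the data (sim) and the model (fit_sim) are simulated and hypothetical, and only the grid of age and priors_count values mirrors the one used above, so the resulting numbers will differ from those in the notes.

```r
# Simulated stand-in data (hypothetical; not the recidivism data used above)
set.seed(2)
n <- 500
sim <- data.frame(age = sample(18:70, n, replace = TRUE),
                  priors_count = rpois(n, 2))
sim$is_recid <- rbinom(n, 1, plogis(1 - 0.05 * sim$age + 0.15 * sim$priors_count))

fit_sim <- glm(is_recid ~ age + priors_count, family = binomial, data = sim)

# Same grid of predictor values as data_predict in the notes
grid <- data.frame(age = c(20, 20, 40, 40),
                   priors_count = c(0, 2, 0, 2))

# Linear predictor: intercept + b_age * age + b_priors * priors_count
b <- unname(coef(fit_sim))  # order: intercept, age, priors_count
eta <- b[1] + b[2] * grid$age + b[3] * grid$priors_count

# Inverse logit turns the linear predictor into P(Y = 1 | X)
grid$by_hand <- 1 / (1 + exp(-eta))

# predict() should give the same probabilities
grid$by_predict <- unname(predict(fit_sim, newdata = grid, type = "response"))
grid
```

Each row is the model's predicted probability of the outcome for a hypothetical person with that age and number of priors, which is one way to read the data_predict table above.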


References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer.