6.18 LR in R: Predicting Recidivism (2)

  • Estimate model: glm(y ~ x1 + x2, family = binomial, data = data.train)
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   1.101001   0.097597   11.28   <2e-16 ***
## age          -0.049831   0.002861  -17.42   <2e-16 ***
## priors_count  0.159982   0.008236   19.43   <2e-16 ***
  • R output shows log odds: e.g., a one-unit increase in age is associated with a decrease of about 0.05 in the log odds of is_recid (an awkward scale to interpret)
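
Exponentiating the coefficients gives odds ratios, which are often easier to read than log odds. A minimal sketch, assuming the rounded estimates printed above are typed in by hand (the name coefs is illustrative; with a fitted model object one would typically call exp(coef(fit)) instead):

```r
# Estimates copied from the summary output above (rounded as printed)
coefs <- c("(Intercept)" = 1.101001, age = -0.049831, priors_count = 0.159982)

# exp() turns a change in log odds into a multiplicative change in odds
odds_ratios <- exp(coefs)

# About 0.951 for age and 1.173 for priors_count
print(round(odds_ratios[c("age", "priors_count")], 3))
```

Read this as: each additional year of age multiplies the odds of is_recid by about 0.95 (roughly 5% lower odds), and each additional prior multiplies them by about 1.17, holding the other predictor constant.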

  • predict(): Predict values

    • Once coefficients have been estimated, it is a simple matter to compute the probability of outcome for values of our predictors (James et al. 2013, 134)
    • predict(fit, newdata = NULL, type = "response"): Predict probability for each unit in the data used to fit the model
    • predict(fit, newdata = data_predict, type = "response"): Predict probability for particular Xs (contained in data_predict)
    • type = "response": Output probabilities of the form \(P(Y=1|X)\) (as opposed to the default type = "link", which returns the log odds, i.e., the logit)
    age  priors_count  Pr
    30   2             0.4815163
    30   4             0.5611905
    50   2             0.2552909
  • Q: How would you interpret these values?
  • Source: James et al. (2013, Chap. 4.3.3, 4.6.2)
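
To see what predict(..., type = "response") does with the predictor values in the table above, the probabilities can be recomputed by hand from the printed coefficients. This is a sketch, assuming the rounded estimates from the summary output; coefs and log_odds are illustrative names, and data_predict is rebuilt here from the table rather than taken from the original script:

```r
# Estimates copied from the summary output (rounded as printed)
coefs <- c(intercept = 1.101001, age = -0.049831, priors_count = 0.159982)

# The three predictor combinations shown in the table
data_predict <- data.frame(age = c(30, 30, 50), priors_count = c(2, 4, 2))

# Linear predictor (log odds): b0 + b1 * age + b2 * priors_count
log_odds <- coefs[["intercept"]] +
  coefs[["age"]] * data_predict$age +
  coefs[["priors_count"]] * data_predict$priors_count

# Inverse logit: plogis(z) = 1 / (1 + exp(-z)) maps log odds to P(Y = 1 | X)
data_predict$Pr <- plogis(log_odds)

print(data_predict)
```

Up to rounding of the coefficients, this should reproduce the values in the table (about 0.48, 0.56, and 0.26). For example, a 30-year-old with 2 priors has a predicted probability of recidivism of about 48%.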

References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer.