## 12.4 Classification trees

• Predict a qualitative (categorical) outcome rather than a quantitative one (vs. regression trees)
• Prediction: An unseen (test) observation is assigned to the most commonly occurring class (the mode) among the training observations in the region to which it belongs
• Example region: Young people (<25) with 3 previous offences
• Training observations: Most individuals in this region re-offended
• Unseen/test observations falling in this region: Predicted to re-offend as well
• To grow a classification tree, we use recursive binary splitting (partitioning)
• The training data are split into sub-populations through a sequence of dichotomous (yes/no) splits on the independent variables
• Criterion for making binary splits: Classification error rate (vs. RSS in regression trees)
• Minimize the classification error rate: Fraction of training observations in a region that do not belong to the most common class in that region
• $$E = 1 - \max\limits_{k}(\hat{p}_{mk})$$ where $$\hat{p}_{mk}$$ is the proportion of training observations in the $$m$$th region that are from the $$k$$th class
• In practice: The classification error rate is not sensitive enough for growing trees, so measures more sensitive to node purity are used, namely the Gini index or cross-entropy
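
The measures above can be illustrated with a minimal Python sketch. It is not a tree-growing implementation: it only computes the majority-class prediction and the three impurity measures for the training labels in one region, using the standard definitions of the Gini index, $$G=\sum_{k}\hat{p}_{mk}(1-\hat{p}_{mk})$$, and cross-entropy, $$D=-\sum_{k}\hat{p}_{mk}\log\hat{p}_{mk}$$. All function names, the label strings, and the class counts are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log


def class_proportions(labels):
    """Return {class: proportion} for the training labels in one region."""
    counts = Counter(labels)
    n = len(labels)
    return {k: c / n for k, c in counts.items()}


def predict_region(labels):
    """Predicted class for a region: the most common training class (the mode)."""
    return Counter(labels).most_common(1)[0][0]


def classification_error(labels):
    """E = 1 - max_k p_mk."""
    return 1 - max(class_proportions(labels).values())


def gini_index(labels):
    """G = sum_k p_mk * (1 - p_mk)."""
    return sum(p * (1 - p) for p in class_proportions(labels).values())


def cross_entropy(labels):
    """D = -sum_k p_mk * log(p_mk); classes with zero count never appear."""
    return -sum(p * log(p) for p in class_proportions(labels).values())


def weighted_impurity(regions, measure):
    """Impurity of a split: region impurities weighted by region size."""
    n = sum(len(r) for r in regions)
    return sum(len(r) / n * measure(r) for r in regions)


# Hypothetical region: 10 training observations, 7 of whom re-offended.
region = ["reoffend"] * 7 + ["no_reoffend"] * 3
print(predict_region(region))        # majority class in the region
print(classification_error(region))  # about 0.30
print(gini_index(region))            # about 0.42
print(cross_entropy(region))         # about 0.61

# Two hypothetical splits of 400 "a" and 400 "b" observations.
split_a = [["a"] * 300 + ["b"] * 100, ["a"] * 100 + ["b"] * 300]
split_b = [["a"] * 200 + ["b"] * 400, ["a"] * 200]

for name, split in [("A", split_a), ("B", split_b)]:
    print(name,
          weighted_impurity(split, classification_error),
          weighted_impurity(split, gini_index))
```

In this made-up example both splits have the same weighted classification error rate (0.25), but the Gini index is lower for split B, which produces one perfectly pure region. This is the sense in which the Gini index is more sensitive to node purity.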

### References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer.