Bagging
- A single decision tree may suffer from high variance, i.e., the fitted tree can look very different when trained on different random splits of the data
- We want a low-variance procedure: one that yields similar results when applied repeatedly to distinct datasets (e.g., linear regression)
- Bagging (short for bootstrap aggregation): construct \(B\) regression trees using \(B\) bootstrapped training sets, and average the resulting predictions (see the first sketch after this list)
- Bootstrap: take repeated samples, with replacement, from the (single) training set; build a predictive model on each bootstrapped sample and average the predictions
- Classification: for a given test observation, record the class predicted by each of the \(B\) trees and take a majority vote: the overall prediction is the most commonly occurring class among the \(B\) predictions (see the second sketch below)
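
A minimal sketch of bagging for regression, assuming a synthetic dataset from scikit-learn and illustrative choices for \(B\) and the tree settings:

```python
# Bagging for regression: fit B trees on bootstrapped samples, average predictions.
# Dataset, B, and tree settings are illustrative assumptions, not prescribed values.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

B = 100            # number of bootstrapped trees (assumed)
n = X.shape[0]
trees = []
for b in range(B):
    # Bootstrap: sample n observations with replacement from the training set
    idx = rng.integers(0, n, size=n)
    trees.append(DecisionTreeRegressor(random_state=b).fit(X[idx], y[idx]))

# Bagged prediction: average the B trees' predictions for each test observation
x_test = X[:5]
bagged_pred = np.mean([t.predict(x_test) for t in trees], axis=0)
print(bagged_pred)
```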
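
And a minimal sketch of the majority vote for classification, again with an assumed synthetic dataset and value of \(B\):

```python
# Bagging for classification: record the class predicted by each of the B trees,
# then take the most commonly occurring class per test observation.
# Dataset and B are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

B = 100
n = X.shape[0]
trees = []
for b in range(B):
    idx = rng.integers(0, n, size=n)  # bootstrap sample, with replacement
    trees.append(DecisionTreeClassifier(random_state=b).fit(X[idx], y[idx]))

x_test = X[:5]
votes = np.array([t.predict(x_test) for t in trees])  # shape (B, n_test)
# Majority vote: most common class label in each column of the vote matrix
majority = np.array([np.bincount(col).argmax() for col in votes.T])
print(majority)
```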