## 12.6 Bagging

• Single decision trees may suffer from high variance, i.e., the fitted tree (and its predictions) can change substantially across different random splits of the training data
• Why? A deep tree fits the training data closely, so small changes in the data can produce a very different tree
• We want a low-variance procedure: one that yields similar results when applied repeatedly to distinct datasets (e.g., linear regression)
• Bagging: construct $$B$$ regression trees using $$B$$ bootstrapped training sets, and average the resulting predictions: $$\hat{f}_{\text{bag}}(x) = \frac{1}{B}\sum_{b=1}^{B} \hat{f}^{*b}(x)$$
• Bootstrap: take repeated samples with replacement from the (single) training set; bagging (short for bootstrap aggregation) builds a predictive model on each bootstrap sample and averages the predictions
• Classification: for a given test observation, we can record the class predicted by each of the $$B$$ trees and take a majority vote: the overall prediction is the most commonly occurring class among the $$B$$ predictions
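The procedure above can be sketched in a few lines. This is a minimal stdlib-only illustration (not the implementation used in practice — libraries such as scikit-learn provide tuned versions): it fits a depth-1 regression stump to each of $$B$$ bootstrap samples and averages the predictions; the `fit_stump`, `bag`, and toy step-function data are all assumptions made for the example.

```python
import random
from collections import Counter
from statistics import mean

random.seed(0)

def fit_stump(xs, ys):
    """Fit a depth-1 regression tree (stump): pick the split point on x
    that minimises squared error, predicting the mean on each side."""
    best = None
    for s in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= s]
        right = [y for x, y in zip(xs, ys) if x > s]
        if not left or not right:
            continue
        ml, mr = mean(left), mean(right)
        err = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, s, ml, mr)
    _, s, ml, mr = best
    return lambda x: ml if x <= s else mr

def bag(xs, ys, B=100):
    """Bagging for regression: fit one stump per bootstrap sample
    (drawn with replacement), then average the B predictions."""
    n = len(xs)
    stumps = []
    for _ in range(B):
        idx = [random.randrange(n) for _ in range(n)]  # bootstrap sample
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(t(x) for t in stumps)

# Toy data: y is a noisy step function of x (assumed for illustration)
xs = [i / 20 for i in range(40)]
ys = [(0.0 if x < 1.0 else 1.0) + random.gauss(0, 0.1) for x in xs]

predict = bag(xs, ys, B=100)
print(predict(0.2), predict(1.8))  # close to 0 and 1, respectively

# Classification variant: majority vote over the B class predictions
votes = ["A", "B", "A", "A", "B"]
print(Counter(votes).most_common(1)[0][0])  # the most common class, "A"
```

Averaging over bootstrapped trees reduces variance because each stump overfits its own sample in a different way, and those fluctuations partially cancel in the mean; the majority vote plays the same role for classification.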