- Resampling methods (James et al. 2013, Ch. 5)
- Repeatedly draw samples from a training set and refit the model of interest on each sample to obtain additional information about the fitted model, e.g., the variability of a logistic regression fit
- Aim: Obtain information that would not be available from fitting the model only once on the original training sample
- Model assessment: Process of evaluating a model’s performance
- Model selection: Process of selecting the proper level of flexibility for a model
- Common resampling methods: Cross-validation (CV) and bootstrap
- Cross-validation is used to (1) estimate the test error of a given statistical learning method in order to evaluate its performance (model assessment), or (2) select the appropriate level of flexibility (model selection)
- Bootstrap is most commonly used to provide a measure of accuracy of a parameter estimate or of a given statistical learning method
- Here we discuss cross-validation!
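
The two uses of cross-validation listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not code from James et al. (2013): the data are synthetic, the learning method is k-nearest-neighbours regression (chosen only because the number of neighbours is an easy flexibility knob), and the helper names `knn_predict` and `cv_mse` are hypothetical.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the responses of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def cv_mse(data, k_neighbors, n_folds=5, seed=0):
    """Estimate the test MSE of k-NN regression by n_folds-fold CV."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    # Split the shuffled indices into n_folds roughly equal folds.
    folds = [idx[i::n_folds] for i in range(n_folds)]
    fold_mse = []
    for held_out in folds:
        held = set(held_out)
        # Refit (here: rebuild the neighbour set) without the held-out fold.
        train = [data[i] for i in idx if i not in held]
        errors = [
            (data[i][1] - knn_predict(train, data[i][0], k_neighbors)) ** 2
            for i in held_out
        ]
        fold_mse.append(sum(errors) / len(errors))
    # The CV estimate is the average of the per-fold errors.
    return sum(fold_mse) / n_folds


# Synthetic data (an assumption for illustration): y = sin(x) + noise.
data_rng = random.Random(1)
data = []
for _ in range(100):
    x = data_rng.uniform(0, 6)
    data.append((x, math.sin(x) + data_rng.gauss(0, 0.3)))

# (1) Model assessment: CV estimate of test error for each candidate.
# (2) Model selection: pick the flexibility level with the lowest estimate.
candidates = [1, 3, 5, 10, 25, 50]
cv_errors = {k: cv_mse(data, k) for k in candidates}
best_k = min(cv_errors, key=cv_errors.get)

for k in candidates:
    print(f"k={k:>2}  CV MSE={cv_errors[k]:.3f}")
print("selected k:", best_k)
```

A small number of neighbours gives a very flexible fit and a large number gives a rigid one, so comparing the CV estimates across `candidates` is one way to choose the level of flexibility. The same seed is used for every candidate so that all of them are evaluated on identical folds.
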
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer.