9.2 Resampling methods (1)

  • Resampling methods (James et al. 2013, Ch. 5)
    • Repeatedly draw samples from the training set and refit the model of interest on each sample to obtain additional information about the fitted model, e.g., the variability of a logistic regression fit
    • Aim: Obtain information that would not be available from fitting the model only once on the original training sample
  • Model assessment: Process of evaluating a model’s performance
  • Model selection: Process of selecting the proper level of flexibility for a model
  • Common resampling methods: Cross-validation (CV) and bootstrap
    • Cross-validation used to (1) estimate the test error associated with a given statistical learning method in order to evaluate its performance (model assessment), or (2) select the appropriate level of flexibility (model selection)
    • Bootstrap most commonly used to provide a measure of accuracy of a parameter estimate or of a given statistical learning method
  • Here we discuss cross-validation!
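
The two uses of cross-validation listed above can be sketched in a few lines. This is a minimal illustration, not the book's implementation. It assumes simulated data with a quadratic signal, polynomial regression as the model family, and k = 5 folds. The function name `kfold_cv_mse` and all variable names are hypothetical.

```python
import numpy as np

def kfold_cv_mse(x, y, degree, k=5, seed=0):
    """Estimate the test MSE of a polynomial fit of the given degree by k-fold CV."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))      # shuffle observation indices once
    folds = np.array_split(idx, k)     # k roughly equal-sized folds
    fold_mse = []
    for i in range(k):
        val = folds[i]                 # held-out fold
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coefs = np.polyfit(x[train], y[train], degree)   # refit on the other k-1 folds
        pred = np.polyval(coefs, x[val])                 # predict the held-out fold
        fold_mse.append(np.mean((y[val] - pred) ** 2))
    return float(np.mean(fold_mse))    # average held-out error over the k folds

# Simulated data (an assumption for illustration): quadratic signal plus noise
rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=200)
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(scale=1.0, size=200)

# (1) Model assessment: CV estimate of test error for each candidate degree
cv_errors = {d: kfold_cv_mse(x, y, d) for d in range(1, 6)}

# (2) Model selection: pick the flexibility (degree) with the lowest CV error
best_degree = min(cv_errors, key=cv_errors.get)
```

Each entry of `cv_errors` is an estimate of test error for one level of flexibility. Comparing the entries is what turns the same procedure into a model selection tool.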

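The general resampling idea (draw repeated samples from the training set, refit, and inspect how much the fits vary) can be sketched with the bootstrap. This is a minimal illustration under stated assumptions: the data are simulated, and a simple linear regression stands in for the logistic regression mentioned above so that the example needs only NumPy. The helper `fit_slope` and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated training sample (an assumption for illustration)
n = 200
x = rng.normal(size=n)
y = 0.5 + 2.0 * x + rng.normal(scale=1.0, size=n)

def fit_slope(x, y):
    """Fit y = intercept + slope * x by least squares and return the slope."""
    slope, intercept = np.polyfit(x, y, 1)   # coefficients come highest degree first
    return slope

# Refit the model on B bootstrap samples drawn with replacement
B = 1000
slopes = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)         # n row indices, sampled with replacement
    slopes[b] = fit_slope(x[idx], y[idx])

slope_hat = fit_slope(x, y)                  # single fit on the original sample
slope_se = slopes.std(ddof=1)                # bootstrap estimate of the slope's standard error
```

A single fit yields only `slope_hat`. The spread of `slopes` across resamples is the additional information that fitting the model once would not provide.
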
References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer.