9.5 Resampling methods (4): Leave-one-out cross-validation (LOOCV)

  • Q: James et al. (2013) use Figure 5.3 [p.179] to explain the Leave-One-Out Cross-Validation (LOOCV). Please inspect the figure and explain it in a few words.





  • Data set is repeatedly split into two parts, whereby a single observation \((x_{1},y_{1})\) is used for the validation set
    • Remaining observations \(\{(x_{2},y_{2}),\dots,(x_{n},y_{n})\}\) make up the training set
    • Statistical learning method is fit on the \(n-1\) training observations and a prediction is made for the excluded observation
    • Repeated for all \(n\) possible splits
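The procedure above can be sketched as follows (a minimal illustration using simple least-squares linear regression; the toy data and the function name `loocv_mse` are hypothetical, not from James et al.):

```python
import numpy as np

def loocv_mse(x, y):
    """LOOCV estimate of test MSE for simple linear regression y = b0 + b1*x."""
    n = len(x)
    errors = []
    for i in range(n):
        # Training set: all observations except the i-th
        mask = np.arange(n) != i
        # Fit by least squares on the n-1 training observations
        b1, b0 = np.polyfit(x[mask], y[mask], 1)
        # Predict the single held-out observation
        y_hat = b0 + b1 * x[i]
        errors.append((y[i] - y_hat) ** 2)
    # LOOCV estimate: average of the n squared prediction errors
    return np.mean(errors)

# Toy data: y roughly linear in x with Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=30)
print(loocv_mse(x, y))
```

Note that rerunning `loocv_mse` on the same data always returns the same value: there is no random splitting involved.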
  • Q: What might be advantages/disadvantages?



  • Advantages over validation set approach
    • Less bias because the training set is larger (Why?) [each fit uses \(n-1\) observations rather than roughly half the data]
    • Performing LOOCV repeatedly will always produce the same results (Why? No randomness in training/validation set splits)
  • Disadvantage: Can be computationally intensive (Why?) [the model must be fit \(n\) times]
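As an aside on the computational cost: for least-squares linear or polynomial regression, James et al. (2013, Section 5.1.2) note a shortcut that yields the LOOCV estimate from a single fit on the full data set:

\[
\mathrm{CV}_{(n)} = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{y_{i}-\hat{y}_{i}}{1-h_{i}}\right)^{2},
\]

where \(\hat{y}_{i}\) is the \(i\)-th fitted value from the full-data fit and \(h_{i}\) is the leverage of observation \(i\). For other statistical learning methods, no such shortcut exists and the model must indeed be refit \(n\) times.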

References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer.