9.5 Resampling methods (4): Leave-one-out cross-validation (LOOCV)

  • Q: James et al. (2013) use Figure 5.3 [p.179] to explain the Leave-One-Out Cross-Validation (LOOCV). Please inspect the figure and explain it in a few words.





  • Data set is repeatedly split into two parts, whereby a single observation \((x_{1},y_{1})\) is used for the validation set
    • Remaining observations \(\{(x_{2},y_{2}),\dots,(x_{n},y_{n})\}\) make up the training set
    • Statistical learning method is fit on the \(n-1\) training observations and a prediction is made for the excluded observation
    • Repeated for all \(n\) possible splits
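The procedure above can be sketched as follows (a minimal illustration using simple least-squares linear regression; the toy data and the function name `loocv_mse` are hypothetical, not from James et al.):

```python
import numpy as np

def loocv_mse(x, y):
    """LOOCV estimate of test MSE for simple linear regression y = b0 + b1*x."""
    n = len(x)
    errors = []
    for i in range(n):
        # Training set: all observations except the i-th
        mask = np.arange(n) != i
        # Fit by least squares on the n-1 training observations
        b1, b0 = np.polyfit(x[mask], y[mask], 1)
        # Predict the single held-out observation
        y_hat = b0 + b1 * x[i]
        errors.append((y[i] - y_hat) ** 2)
    # LOOCV estimate: average of the n squared prediction errors
    return np.mean(errors)

# Toy data: y roughly linear in x with Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=30)
print(loocv_mse(x, y))
```

Note that rerunning `loocv_mse` on the same data always returns the same value: there is no random splitting involved.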
  • Q: What might be advantages/disadvantages?



  • Advantages over validation set approach
    • Less bias because the training set is larger (Why?) [each fit uses \(n-1\) observations rather than roughly half the data]
    • Performing LOOCV repeatedly will always produce the same results (Why? No randomness in training/validation set splits)
  • Disadvantage: Can be computationally intensive (Why?) [the model must be fit \(n\) times]
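As an aside on the computational cost: for least-squares linear or polynomial regression, James et al. (2013, Section 5.1.2) note a shortcut that yields the LOOCV estimate from a single fit on the full data set:

\[
\mathrm{CV}_{(n)} = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{y_{i}-\hat{y}_{i}}{1-h_{i}}\right)^{2},
\]

where \(\hat{y}_{i}\) is the \(i\)-th fitted value from the full-data fit and \(h_{i}\) is the leverage of observation \(i\). For other statistical learning methods, no such shortcut exists and the model must indeed be refit \(n\) times.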

References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer.