9.7 Resampling methods (6): k-Fold Cross-Validation

  • James et al. (2013) use Figure 5.5 [p.179] to explain k-Fold Cross-Validation. Please inspect the figure and explain it in a few words.





  • LOOCV is a special case of k-Fold Cross-Validation in which \(k\) is set equal to \(n\)

  • Error rate: \(CV_{(k)} = \frac{1}{k}\sum_{i=1}^{k}Err_{i}\), where \(Err_{i}\) is the test error (e.g., the misclassification rate) computed on the \(i\)-th held-out fold (see the sketch below)
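
A minimal sketch of this computation, assuming a simulated binary outcome and a logistic regression fit (the data and all variable names are illustrative, not taken from James et al. 2013):

```r
# k-fold CV error rate, computed by hand
set.seed(42)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(2 * x))           # simulated binary outcome
df <- data.frame(x = x, y = y)

k <- 10
folds <- sample(rep(1:k, length.out = n))  # random fold labels 1..k

err <- numeric(k)
for (i in 1:k) {
  fit  <- glm(y ~ x, data = df[folds != i, ], family = binomial)
  phat <- predict(fit, newdata = df[folds == i, ], type = "response")
  err[i] <- mean((phat > 0.5) != df$y[folds == i])  # Err_i on held-out fold i
}
mean(err)  # CV_(k): average of the k fold-specific error rates
```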

  • Advantages

    • Q: What is the advantage of using \(k = 5\) or \(k = 10\) rather than \(k = n\)? [computation: the model is refit only \(k\) times instead of \(n\) times; see the sketch after this list]
  • Bias-variance trade-off (James et al. 2013, Ch. 5.1.4)

    • Q: k-Fold CV may give more accurate estimates of the test error rate than LOOCV: Why?
      • Training sets are smaller than in LOOCV, but larger than in the validation set approach
        • LOOCV is best from a bias perspective (each training set contains almost all observations), but the \(n\) models are fit on nearly identical observations, so their outputs are highly correlated
          • The mean of many highly correlated quantities has higher variance than the mean of many quantities that are not as highly correlated → the test error estimate from LOOCV has higher variance than the test error estimate from k-fold CV
    • Recommendation: use k-fold cross-validation with k = 5 or k = 10 (see the sketch below)
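
To make the computational and variance arguments concrete, here is a minimal sketch using cv.glm() from the boot package (the function used in the Ch. 5 lab of James et al. 2013); the simulated data, sample sizes, and replication count are illustrative assumptions:

```r
# Compare 10-fold CV and LOOCV on simulated linear-regression data
library(boot)
set.seed(42)

sim_data <- function(n) {
  x <- rnorm(n)
  data.frame(x = x, y = 2 * x + rnorm(n))
}

## 1) Computation: LOOCV refits the model n times, 10-fold CV only 10 times
df  <- sim_data(500)
fit <- glm(y ~ x, data = df)           # glm() without a family argument = least squares
system.time(cv.glm(df, fit, K = 10))   # 10 model fits
system.time(cv.glm(df, fit, K = 500))  # n = 500 model fits: noticeably slower
cv.glm(df, fit, K = 10)$delta[1]       # 10-fold CV estimate of the test MSE

## 2) Variance: spread of the two estimators across repeated datasets
## (the argument above predicts a larger spread for LOOCV)
est <- replicate(200, {
  d <- sim_data(100)
  f <- glm(y ~ x, data = d)
  c(loocv = cv.glm(d, f, K = 100)$delta[1],
    kfold = cv.glm(d, f, K = 10)$delta[1])
})
apply(est, 1, sd)  # sampling standard deviation of each estimator
```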

References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer.