9.1 Retake: Simple setup to build predictive model
- A simple setup to build a predictive model might look as follows:
- Randomly split data into one training dataset and one validation dataset
- Train model based on training data
- Predict outcome in training data and calculate training error rate
- If unhappy, change model (e.g. select more features), then retrain and recalculate the training error rate
- If happy, use trained model to predict outcome in validation dataset and calculate test error rate
- Model tuning
- e.g., parameter tuning, feature selection, up-/down-sampling of imbalanced data prior to training
- Sometimes we might want to use different datasets for model tuning vs. calculating the true/test error rate
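
The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a prescribed implementation: the data are synthetic, and the nearest-centroid classifier, the 70/30 split, the candidate feature sets, and the 0.25 "happy" threshold are all hypothetical choices made for the example.

```python
import random


def make_data(n, rng):
    """Hypothetical synthetic data: binary outcome, three features (the third is pure noise)."""
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(1.0 * y, 1.0), rng.gauss(1.5 * y, 1.0), rng.gauss(0.0, 1.0)]
        rows.append((x, y))
    return rows


def split(rows, train_frac, rng):
    """Randomly split rows into a training part and a validation part."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


def train(rows, features):
    """Nearest-centroid classifier: store the per-class mean of the selected features."""
    centroids = {}
    for label in (0, 1):
        members = [x for x, y in rows if y == label]
        centroids[label] = [sum(x[j] for x in members) / len(members) for j in features]
    return {"features": features, "centroids": centroids}


def predict(model, x):
    """Predict the class whose centroid is closest (squared Euclidean distance)."""
    def dist(label):
        pairs = zip(model["features"], model["centroids"][label])
        return sum((x[j] - c) ** 2 for j, c in pairs)
    return min((0, 1), key=dist)


def error_rate(model, rows):
    """Share of rows whose predicted outcome differs from the observed outcome."""
    wrong = sum(predict(model, x) != y for x, y in rows)
    return wrong / len(rows)


rng = random.Random(42)
data = make_data(400, rng)

# Step 1: random split into training and validation data (assumed 70/30).
train_rows, valid_rows = split(data, 0.7, rng)

# Steps 2-4: train, compute the training error, and if unhappy add features and retrain.
candidate_feature_sets = [[0], [0, 1]]  # hypothetical candidates
target = 0.25                           # hypothetical "happy" threshold
for features in candidate_feature_sets:
    model = train(train_rows, features)
    training_error = error_rate(model, train_rows)
    if training_error <= target:
        break

# Step 5: only now touch the validation data to estimate the test error rate.
test_error = error_rate(model, valid_rows)
print(f"features={model['features']} training error={training_error:.3f} test error={test_error:.3f}")
```

Here the model is changed using the training error only, so the validation data stay untouched until the final step. If tuning decisions were instead based on held-out data, a third, separate dataset would be needed to obtain an honest test error rate.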