Session 14: Final session
- Structure of a predictive paper
  - Introduction
    - Review of previous predictive papers for the respective data
  - Methods & data
    - Description of method etc.
  - Empirical results
  - Conclusion
- Term paper deadline
- pagedown & BibTeX
- API reviews
- Training & test data (RFs)
  - Definition of test error: Resampling methods (2): Cross-validation
  - RF logic: Out-of-Bag (OOB) Error Estimation
    - The OOB error is used to estimate the test error
    - But in principle you could also hold out some of your labeled training data to validate your model at the end.
      - In the lab "Lab: Random Forest for text classification" you would then use only a subset of the training observations (training_data==TRUE) to grow the forest, e.g., 80%, and use the other 20% for a final validation
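
The OOB and hold-out ideas above can be sketched in a few lines. This is a minimal illustration in plain Python (standard library only), not the lab's code: the toy data, the helper names (`make_data`, `fit_stump`, `bagged_stumps`, `ensemble_error`) and the choice of 50 trees are all assumptions made for the example, and bagged decision stumps stand in for a real random forest. Each stump is grown on a bootstrap sample; every training observation is then predicted only by the stumps that did not see it (its out-of-bag stumps), which gives the OOB error. The 20% hold-out set is never used for growing and serves as the final validation.

```python
import random
from collections import Counter

def make_data(n, rng):
    """Hypothetical toy data: one feature x, label 1 if x > 0.5, with 10% label noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.1:
            y = 1 - y
        data.append((x, y))
    return data

def fit_stump(sample):
    """Choose the threshold on a fixed grid that makes the fewest errors on the sample."""
    grid = [k / 20 for k in range(1, 20)]
    return min(grid, key=lambda t: sum(int(x > t) != y for x, y in sample))

def bagged_stumps(train, n_trees, rng):
    """Grow stumps on bootstrap samples and collect out-of-bag votes per observation."""
    n = len(train)
    stumps = []
    oob_votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap: draw n with replacement
        in_bag = set(idx)
        t = fit_stump([train[i] for i in idx])
        stumps.append(t)
        for i in range(n):
            if i not in in_bag:  # observation i was not used to grow this stump
                oob_votes[i][int(train[i][0] > t)] += 1
    # Majority vote over the OOB stumps only, skipping observations that were never OOB
    scored = [(v.most_common(1)[0][0], train[i][1]) for i, v in enumerate(oob_votes) if v]
    oob_error = sum(pred != y for pred, y in scored) / len(scored)
    return stumps, oob_error

def ensemble_error(stumps, data):
    """Misclassification rate of the full ensemble (majority vote) on unseen data."""
    wrong = 0
    for x, y in data:
        votes = Counter(int(x > t) for t in stumps)
        wrong += votes.most_common(1)[0][0] != y
    return wrong / len(data)

rng = random.Random(42)
labeled = make_data(500, rng)
rng.shuffle(labeled)

cut = int(0.8 * len(labeled))  # 80% to grow the ensemble, 20% held out
train, holdout = labeled[:cut], labeled[cut:]

stumps, oob_error = bagged_stumps(train, n_trees=50, rng=rng)
holdout_error = ensemble_error(stumps, holdout)

print(f"OOB error estimate: {oob_error:.3f}")
print(f"Hold-out error:     {holdout_error:.3f}")
```

If the OOB estimate works as intended, the two printed error rates should be in the same range, since both are computed on observations the respective stumps did not see during growing.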