8.12 The universal workflow of machine learning

Defining the problem and assembling a dataset (Chollet and Allaire 2018, Ch. 4.5)
- Define the problem at hand and the data on which we’ll train. Collect this data and, for supervised learning, annotate it with labels.
Choosing a measure of success
- Choose how we’ll measure success on our problem. Which metrics will we monitor on our validation data?
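The choice of metric matters most when classes are imbalanced. The sketch below is a minimal illustration in standard-library Python (not code from the book); the helper names `accuracy` and `precision_recall` and the toy labels are assumptions made up for this example. It shows a classifier that always predicts the majority class: accuracy looks good, while precision and recall for the rare class are zero.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Return 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy, imbalanced validation labels: nine negatives, one positive.
y_true = [0] * 9 + [1]
# A "model" that always predicts the majority class.
y_pred = [0] * 10

acc = accuracy(y_true, y_pred)
prec, rec = precision_recall(y_true, y_pred)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

On data like this, monitoring accuracy alone would hide the fact that the rare class is never detected.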
Deciding on an evaluation protocol
- Determine our evaluation protocol: Hold-out validation? K-fold validation? Which portion of the data should we use for validation?
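The two protocols named above differ only in how sample indices are partitioned. The following is a minimal sketch in standard-library Python, not the book's code; the function names `holdout_split` and `k_fold_splits`, the 20% validation fraction, and the fixed seed are illustrative assumptions.

```python
import random


def holdout_split(n_samples, val_fraction=0.2, seed=0):
    """Shuffle indices once and set aside a fixed fraction for validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_val = int(n_samples * val_fraction)
    return indices[n_val:], indices[:n_val]  # (train, validation)


def k_fold_splits(n_samples, k=4, seed=0):
    """Yield k (train, validation) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


train_idx, val_idx = holdout_split(10, val_fraction=0.2)
print(len(train_idx), len(val_idx))

for fold_train, fold_val in k_fold_splits(10, k=5):
    print(sorted(fold_val))
```

Hold-out validation is the simpler choice when data is plentiful. K-fold validation trains k models, and in exchange gives a less noisy estimate when the dataset is small.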
Preparing our data
Developing a model that does better than a baseline (what counts as a baseline? e.g., random guessing)
Scaling up: developing a model that overfits
Regularizing our model and tuning our hyperparameters based on performance on the validation data
- A lot of machine-learning research tends to focus only on this step, but keep the big picture in mind.
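Two of the later steps, beating a baseline and tuning on validation data, can be sketched in a few lines. This is a minimal illustration in standard-library Python under stated assumptions: `majority_baseline_accuracy` and `select_by_validation` are hypothetical helper names, the labels are toy data, and the `placeholder_scores` values are made-up stand-ins for the validation scores that real training runs would produce.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, val_labels):
    """Validation accuracy of always predicting the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return sum(label == majority for label in val_labels) / len(val_labels)


def select_by_validation(candidates, validation_score):
    """Score each hyperparameter candidate on validation data and keep the best."""
    scores = {c: validation_score(c) for c in candidates}
    best = max(scores, key=scores.get)
    return best, scores


# Toy labels: the baseline any real model should beat.
baseline = majority_baseline_accuracy([0, 0, 1], [0, 1, 0, 0])
print(f"baseline accuracy: {baseline:.2f}")

# Hypothetical regularization strengths mapped to made-up validation accuracies.
# In practice validation_score would train a model and evaluate it.
placeholder_scores = {0.0: 0.81, 0.01: 0.86, 0.1: 0.84}
best, scores = select_by_validation(placeholder_scores, placeholder_scores.get)
print(f"selected regularization strength: {best}")
```

Because the candidate is selected using the validation data, the validation score of the chosen model tends to be optimistic; a separate test set gives the final, unbiased check.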

Chollet, François, and J. J. Allaire. 2018. Deep Learning with R. 1st ed. Manning Publications.