Summary of three key questions:
What is being predicted? What type of predictive task is being performed?
How is predictive success evaluated?
What is the baseline performance and the performance of alternative benchmarks?
Actually, asking these three questions is useful for any kind of agent modeling and for measuring performance in general. (See Neth et al., 2016, for applications in assessments of human rationality.)
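The three questions can be made concrete with a small sketch. The data below are hypothetical and the helper names (`accuracy`, `majority_baseline`) are illustrative assumptions, not part of any package. The task is a binary classification, success is measured as accuracy, and the benchmark is a naive rule that always predicts the most frequent label:

```python
from collections import Counter


def accuracy(truth, predicted):
    """Proportion of cases in which the prediction matches the true label."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def majority_baseline(truth):
    """Predict the most frequent label for every case (a naive benchmark)."""
    most_common_label, _ = Counter(truth).most_common(1)[0]
    return [most_common_label] * len(truth)


# Hypothetical data: 10 cases, 8 of which are negative (0).
truth = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# Hypothetical model output: one false alarm and one miss.
predicted = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

model_acc = accuracy(truth, predicted)
baseline_acc = accuracy(truth, majority_baseline(truth))

print(f"Model accuracy:    {model_acc:.2f}")
print(f"Baseline accuracy: {baseline_acc:.2f}")
```

In this made-up example, the model is correct in 8 of 10 cases, which sounds respectable, but the naive baseline reaches the same accuracy. A performance figure is informative only in comparison to such benchmarks.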
19.5.2 Beware of biases
A final caveat concerns the data on which our predictions are based.
A story: When inspecting WW-II bomber planes after their missions, assume that
- 90% of planes show bullet hits on wings;
- 10% of planes show bullet hits on tanks.
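Where should we add more reinforcements?
The tempting answer is the wings, but only planes that returned could be inspected. If hits were spread roughly evenly over all planes, the rarity of tank hits among the returning planes suggests that planes hit in the tanks were mostly lost, so the tanks are the better place for reinforcements.
This is an instance of survivorship bias.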
More generally, we may be susceptible to biases due to the availability of data. Thus, when training and evaluating algorithms, we must always ask ourselves where the data is coming from.
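A minimal simulation sketch shows how such a pattern can arise. All numbers are assumptions chosen for illustration, not historical data: every plane is hit once, in the wings or the tanks with equal probability, and returns with probability 0.9 after a wing hit but only 0.1 after a tank hit. Under these assumptions, the expected share of tank hits among returning planes is 0.5 · 0.1 / (0.5 · 0.9 + 0.5 · 0.1) = 0.1, although half of all planes were hit in the tanks.

```python
import random


def simulate_missions(n_planes=100_000, p_return_wing_hit=0.9,
                      p_return_tank_hit=0.1, seed=1):
    """Simulate hit locations for all planes and for returning planes only.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    all_hits = {"wings": 0, "tanks": 0}
    returned_hits = {"wings": 0, "tanks": 0}
    for _ in range(n_planes):
        # Assumption: hits are equally likely on wings and tanks.
        location = "wings" if rng.random() < 0.5 else "tanks"
        all_hits[location] += 1
        # Survival depends on where the plane was hit.
        if location == "wings":
            p_return = p_return_wing_hit
        else:
            p_return = p_return_tank_hit
        if rng.random() < p_return:
            returned_hits[location] += 1
    return all_hits, returned_hits


def share(counts, key):
    """Proportion of the counts that falls into the category `key`."""
    return counts[key] / sum(counts.values())


all_hits, returned_hits = simulate_missions()
tank_share_all = share(all_hits, "tanks")
tank_share_returned = share(returned_hits, "tanks")

print(f"Tank hits among all planes:       {tank_share_all:.2f}")
print(f"Tank hits among returning planes: {tank_share_returned:.2f}")
```

An analyst who sees only the returning planes observes the second proportion and may wrongly conclude that tanks are rarely hit. The data-generating process, not the observed sample alone, determines which conclusion is warranted.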
Pointers to sources of inspiration and ideas:
Books, chapters, and packages
Articles and blogs
The Learning Machines blog (by Holger K. von Jouanne-Diedrich) provides many posts on essential aspects of prediction. For instance: