19.1 Introduction

Prediction is very difficult, especially if it’s about the future!

Niels Bohr

This familiar quote always seems a bit funny, although it is not immediately clear why. The fact that the same or highly similar quotes are also attributed to Mark Twain and Yogi Berra suggests that it expresses something obvious: a fact that is widely accessible and agreed upon. And while the quote may seem circular, it actually is not: although most predictions concern the future, we can predict anything not known to us, including things that happened in the past. When predicting data with computers, the criterion events (the outcomes being predicted) have typically already been determined and can be used to evaluate the predictions.

The temporal dimension in the quote has an analogue in the type of sampling that is used to evaluate predictions: Predicting other data from an existing sample (i.e., in-sample) corresponds to predicting the past, whereas predicting entirely new data (i.e., out-of-sample) corresponds to predicting the future.

The basic principles discussed in this chapter will apply to both general types of prediction.
As we will see, even when predicting the past or data within the same sample, prediction remains difficult.
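The distinction between in-sample and out-of-sample prediction can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are simulated from a noisy linear rule, the model is a simple least-squares line, and the helper names (`fit_line`, `mse`, `simulate`) are hypothetical. The same fitted model is evaluated once on the sample it was fitted to and once on new data that it has never seen.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mse(model, xs, ys):
    """Mean squared error of the model's predictions (the evaluation criterion)."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def simulate(n, rng):
    """Simulated data (an assumption): y = 1 + 2x plus Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0, 3) for x in xs]
    return xs, ys


rng = random.Random(1)             # fixed seed for reproducibility
x_old, y_old = simulate(20, rng)   # existing sample ("the past")
x_new, y_new = simulate(20, rng)   # entirely new data ("the future")

model = fit_line(x_old, y_old)     # the model only ever sees the existing sample

in_sample_error = mse(model, x_old, y_old)
out_of_sample_error = mse(model, x_new, y_new)

print("in-sample MSE:    ", round(in_sample_error, 2))
print("out-of-sample MSE:", round(out_of_sample_error, 2))
```

Because the data contain noise, neither error is zero: even for the sample that the model was fitted to, prediction remains imperfect.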

One of the most fundamental properties of thought is its power of predicting events.

K.J.W. Craik (1943), The nature of explanation, Ch. 5

Clarify key terminology: Prediction.

Distinguish between explanation (accounting for past data) and prediction (accounting for novel data).

An ability to predict implies specifying some mechanism. This need not be the true underlying mechanism, but it must be some explicit model (which could be the actual mechanism or an "as-if" model).


  • formal models
  • criteria for evaluating predictions


Craik, K. J. W. (1943). The nature of explanation. Cambridge University Press.