35.4 Regression for predictions

The regression equation for the red deer data

\[ \hat{y} = 4.398 - 0.181 x \]
can be used to make predictions. For example, we could predict the mean molar weight for male red deer aged \(10\) years. Since \(x\) represents the age in years, substitute \(x = 10\) into the regression equation:

\[\begin{eqnarray*} \hat{y} &=& 4.398 - (0.181\times 10)\\ &=& 4.398 - 1.81\\ &=& 2.588. \end{eqnarray*}\]
Male red deer aged 10 years are predicted to have a mean molar weight of 2.588 grams. This is a prediction of the mean: some individual male red deer aged 10 will have molars weighing more than this, and some will have molars weighing less.
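The same substitution can be sketched in code. This is a minimal illustration only: the coefficients are those of the regression equation above, while the function name `predict_mean_molar_weight` is a hypothetical name chosen for this example.

```python
# Coefficients taken from the regression equation in the text.
INTERCEPT = 4.398  # intercept of the fitted line (grams)
SLOPE = -0.181     # change in predicted mean molar weight (grams) per year of age


def predict_mean_molar_weight(age_years: float) -> float:
    """Return the predicted mean molar weight (grams) for a given age (years)."""
    return INTERCEPT + SLOPE * age_years


# Prediction for male red deer aged 10 years.
print(f"{predict_mean_molar_weight(10):.3f}")  # prints 2.588
```

The function returns a predicted mean, not the molar weight of any individual deer.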

Think 35.2 (Predicting) For male red deer \(12\) years of age, what is the predicted mean molar weight?
Think 35.3 (Predicting) For male red deer \(20\) years of age, what is the predicted mean molar weight?
Prediction: \(4.398 - (0.181 \times 20) = 0.778\), or about 0.78 grams.

This last prediction may be useful, but it may also be rubbish. The oldest deer in the data is aged 14.4 years, so the regression line may not apply to deer older than this (red deer may not even live to 20 years of age). We cannot tell whether the prediction is sensible, because we have no data for deer aged over 14.4 years. Making predictions outside the range of the available data is called extrapolation, and extrapolation can lead to nonsense predictions.
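One way to guard against accidental extrapolation is to have the prediction step check the requested age against the range of the data. The sketch below assumes only what the text states: the regression coefficients and the oldest observed age of 14.4 years. The youngest observed age is not given here, so only the upper limit is checked; the function name `predict_with_check` is hypothetical.

```python
import warnings

INTERCEPT = 4.398        # from the regression equation in the text
SLOPE = -0.181           # from the regression equation in the text
MAX_OBSERVED_AGE = 14.4  # oldest deer in the data, as stated in the text


def predict_with_check(age_years: float) -> float:
    """Predict mean molar weight (grams), warning when the age exceeds the data."""
    if age_years > MAX_OBSERVED_AGE:
        warnings.warn(
            f"Age {age_years} exceeds the oldest observed age "
            f"({MAX_OBSERVED_AGE}); this prediction is an extrapolation.",
            stacklevel=2,
        )
    return INTERCEPT + SLOPE * age_years


print(f"{predict_with_check(10):.3f}")  # within the data: no warning
print(f"{predict_with_check(20):.3f}")  # beyond the data: also emits a UserWarning
```

The prediction is still returned when extrapolating, since it may or may not be sensible; the warning only flags that the data cannot confirm it.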

Definition 35.1 (Extrapolation) Extrapolation refers to making predictions outside the range of the available data. Extrapolation beyond the data can lead to nonsense predictions.