35.5 Regression for understanding

The regression equation can be used to understand the relationship between the two variables. Consider again the red deer regression equation:

\[\begin{equation} \hat{y} = 4.398 - 0.181 x. \tag{35.1} \end{equation}\]

What does this equation tell us about the relationship between \(x\) (age, in years) and \(y\) (molar weight, in grams)?

35.5.1 The meaning of \(b_0\)

\(b_0\) is the predicted value of \(y\) when \(x=0\). Equation (35.1) predicts a molar weight of \(4.398\) grams for a deer zero years of age, which is likely to be nonsense: the prediction extrapolates beyond the data (the youngest deer in the sample is aged 4.4 years).
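As a small illustration of why the intercept needs care, the sketch below evaluates Eq. (35.1) at \(x=0\) and checks whether that age is younger than the youngest deer in the sample (4.4 years, as stated above). The function and constant names are hypothetical, chosen only for this example.

```python
# Coefficients from Eq. (35.1); all names here are illustrative only.
B0 = 4.398   # intercept (grams)
B1 = -0.181  # slope (grams per year)
MIN_OBSERVED_AGE = 4.4  # youngest deer in the sample (years)


def predict_molar_weight(age):
    """Predicted molar weight (grams) from Eq. (35.1)."""
    return B0 + B1 * age


def is_below_observed_ages(age):
    """True if the age is younger than any deer in the sample."""
    return age < MIN_OBSERVED_AGE


age = 0
print(predict_molar_weight(age))    # 4.398, the intercept b0
print(is_below_observed_ages(age))  # True: this prediction is an extrapolation
```

The prediction at age zero is simply \(b_0\), but the check shows it is made outside the range of ages that were observed.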

The value of the intercept \(b_0\) is sometimes meaningful, but is often meaningless. The value of the slope \(b_1\) is usually of greater interest, as it describes the relationship between the two variables.

35.5.2 The meaning of \(b_1\)

\(b_1\) tells us how much the value of \(y\) changes (on average) when the value of \(x\) increases by one. For the red-deer data, \(b_1\) tells us how much the molar weight changes (on average) when age increases by one year.
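To see why, write the regression equation in its general form, \(\hat{y} = b_0 + b_1 x\) (Eq. (35.1) has this form, with \(b_0 = 4.398\) and \(b_1 = -0.181\)). Using \(\hat{y}(x)\) as shorthand for the prediction at a given value of \(x\) (a notation assumed here for illustration only), increasing \(x\) by one changes the prediction by

\[
\hat{y}(x + 1) - \hat{y}(x) = \bigl(b_0 + b_1 (x + 1)\bigr) - \bigl(b_0 + b_1 x\bigr) = b_1,
\]

whatever the starting value of \(x\).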

Each extra year of age is associated with a change of \(-0.181\) grams in molar weight; that is, a mean decrease in molar weight of \(0.181\) grams. The molars of some individual deer will lose more weight than this in a given year, and some will lose less: the value is a mean weight loss per year.

To demonstrate, for \(x=10\), \(y\) is predicted to be \(\hat{y}= 2.588\). For deer one year older than this (i.e. \(x=11\)) we predict \(y\) to be \(b_1 = -0.181\) higher (or, equivalently, 0.181 lower). That is, we would predict \(\hat{y}= 2.588 - 0.181 = 2.407\). (This is the same prediction made by using \(x=11\) in Eq. (35.1).)
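The same arithmetic can be checked with a short sketch. The function name is hypothetical; the coefficients are those of Eq. (35.1).

```python
# Coefficients from Eq. (35.1); names are illustrative only.
B0, B1 = 4.398, -0.181


def predict_molar_weight(age):
    """Predicted molar weight (grams) for a deer of the given age (years)."""
    return B0 + B1 * age


pred_10 = predict_molar_weight(10)
pred_11 = predict_molar_weight(11)

# Rounded to three decimals to hide floating-point noise.
print(round(pred_10, 3))            # 2.588
print(round(pred_11, 3))            # 2.407
print(round(pred_11 - pred_10, 3))  # -0.181, the slope b1
```

The difference between the two predictions is the slope, whichever pair of ages one year apart is used.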

If the value of \(b_1\) is positive, then the predicted values of \(y\) increase as the values of \(x\) increase.

If the value of \(b_1\) is negative, then the predicted values of \(y\) decrease as the values of \(x\) increase.

This interpretation of \(b_1\) describes the relationship: each extra year of age is associated with a mean reduction of 0.181 grams in the weight of the molars, in male red deer. The units of the slope are the units of the response variable divided by the units of the explanatory variable (so in the deer example, the slope is \(-0.181\) grams per year).

Observe what happens if the slope is zero. Since \(b_1\) is the change in \(y\) (on average) when \(x\) increases by one, \(b_1=0\) means that the predicted value of \(y\) changes by zero when the value of \(x\) changes by one. In other words, if the value of \(x\) changes, the predicted value of \(y\) doesn’t change. This is equivalent to saying that there is no linear relationship between the variables. (We would also find \(r=0\).)

If the value of the slope is zero, there is no linear relationship between \(x\) and \(y\). In this case, the correlation coefficient is also zero.
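A minimal sketch of this case follows. It uses a small made-up data set (not the red deer data) and computes the least-squares slope, the intercept and the correlation coefficient from the usual sums of squares and cross-products. All variable names are hypothetical.

```python
import math

# Made-up data for illustration only: y rises, then falls, as x increases.
x = [1, 2, 3, 4, 5]
y = [1, 4, 5, 4, 1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products about the means.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

b1 = sxy / sxx                   # least-squares slope
b0 = mean_y - b1 * mean_x        # least-squares intercept
r = sxy / math.sqrt(sxx * syy)   # correlation coefficient

print(b1, b0, r)  # 0.0 3.0 0.0
```

With a slope of zero, the predicted value of \(y\) is 3 for every value of \(x\), and \(r\) is also zero. In this made-up example \(y\) does vary with \(x\), but not in a straight-line way, which is why the statement concerns the linear relationship only.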