Chapter 4 Measures of Association between Variables

We will now consider two measures that quantify the relationship between two variables.

The first is the covariance, and the second is the correlation coefficient, often denoted \(r\). (Note that \(r\) is the sample correlation coefficient, whereas the population correlation coefficient is usually denoted \(\rho\).)
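
For reference, the usual sample versions of these two measures can be written as below. The notation is an assumption made here for illustration: \(n\) paired observations \((x_i, y_i)\), sample means \(\bar{x}\) and \(\bar{y}\), sample covariance \(s_{xy}\), and sample standard deviations \(s_x\) and \(s_y\). The \(n - 1\) divisor is the common sample convention.

\[
s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}), \qquad r = \frac{s_{xy}}{s_x \, s_y}
\]

Dividing by \(s_x s_y\) removes the units of both variables, which is why \(r\) has the same sign as the covariance but is free of scale.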

If the covariance and correlation values are positive, this indicates that the two variables are positively related. In other words, when one increases, the other one generally also increases.
On the other hand, if the covariance and correlation values are negative, this indicates that the two variables are negatively related. That is, when one increases, the other will typically decrease.
Covariance and correlation values of 0 indicate that there is no linear relationship between the two variables. They may still be related in a nonlinear way.
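
To illustrate the "linear sense" caveat, here is a minimal sketch in Python using made-up data in which one variable is completely determined by the other, yet the covariance is zero. The helper `sample_covariance` is a hypothetical name written for this example, and it assumes the usual \(n - 1\) divisor.

```python
def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (n - 1 divisor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

# Made-up data: y is exactly x squared, so the two are clearly related,
# but the relationship is symmetric about x = 0 rather than linear.
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]

cov_xy = sample_covariance(x, y)
print(cov_xy)  # the positive and negative products cancel, giving zero
```

Because the correlation coefficient is the covariance divided by a positive quantity, it is also zero here, even though knowing \(x\) tells us \(y\) exactly.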

Apart from its sign (positive or negative), the covariance value can be hard to interpret, because its size depends on the units in which the two variables are measured. Correlation, by contrast, is a standardised measure, which makes it much easier to interpret. The correlation coefficient is always between -1 and 1 (inclusive). The closer \(|r|\) (the absolute value of \(r\)) is to 1, the stronger the linear relationship between the two variables.
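
The following sketch shows why scale matters for covariance but not for correlation. The data are invented for illustration (heights and weights of five hypothetical people), and the helper functions are hypothetical names that assume the usual \(n - 1\) sample divisor.

```python
from math import sqrt

def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (n - 1 divisor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def sample_correlation(x, y):
    """Sample correlation: covariance divided by both standard deviations."""
    sd_x = sqrt(sample_covariance(x, x))
    sd_y = sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (sd_x * sd_y)

# Invented data: the same heights recorded in metres and in centimetres.
heights_m = [1.60, 1.65, 1.70, 1.75, 1.80]
heights_cm = [h * 100 for h in heights_m]
weights_kg = [55.0, 62.0, 64.0, 72.0, 77.0]

cov_m = sample_covariance(heights_m, weights_kg)
cov_cm = sample_covariance(heights_cm, weights_kg)
r_m = sample_correlation(heights_m, weights_kg)
r_cm = sample_correlation(heights_cm, weights_kg)

print(cov_m, cov_cm)  # the covariance grows 100-fold with the change of units
print(r_m, r_cm)      # the correlation is the same in both cases
```

Changing the units of height changes the covariance by the same factor, while the correlation coefficient is unaffected and stays within -1 and 1.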

The table below can be used as a guide when interpreting the correlation coefficient \(r\).

Table 4.1: A guide to interpreting the strength of a correlation coefficient.
Range of \(|r|\)	Strength of correlation
0 to 0.3	None or very weak
0.3 to 0.5	Weak
0.5 to 0.8	Moderate
0.8 to 1	Strong
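
As a rough sketch, the guide in Table 4.1 could be applied in Python as follows. The function name `correlation_strength` is hypothetical. Since the ranges in the table share their end points, this sketch assumes that a value falling exactly on a boundary (0.3, 0.5 or 0.8) is placed in the stronger category.

```python
def correlation_strength(r):
    """Return the Table 4.1 label for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # strength depends only on |r|, not on the sign
    # Assumption: boundary values belong to the stronger category.
    if size < 0.3:
        return "None or very weak"
    if size < 0.5:
        return "Weak"
    if size < 0.8:
        return "Moderate"
    return "Strong"

print(correlation_strength(-0.65))  # Moderate
print(correlation_strength(0.92))   # Strong
```

Note that the label describes only the strength of the linear relationship. The sign of \(r\) still has to be read separately to tell whether the relationship is positive or negative.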

We will see some examples in the next section.