1.1 Test for significance: Correlation coefficient

After calculating the sample correlation coefficient \(r\), we can go one step further and test whether the correlation is significantly different from zero. Consider the following hypotheses:

\[H_0:\rho = 0 \;\;\text{versus}\;\;H_1: \rho \neq 0,\]

where:

  • \(\rho\) denotes the true (population) correlation coefficient.

By carrying out this hypothesis test, we can determine whether the correlation coefficient is significantly different from zero. If we reject \(H_0\), we conclude there is evidence to suggest that the correlation is not equal to zero, which means we have evidence of a significant linear relationship (or association) between the two variables. On the other hand, if we do not reject \(H_0\), we do not have enough evidence to conclude that the correlation is not equal to zero. A correlation of zero would mean there is no linear relationship between the two variables.
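
For reference, the test statistic usually used for this test is shown below. This is the standard form for a Pearson correlation test and assumes the two variables are approximately bivariate normal; \(n\), the number of paired observations, is notation introduced here for illustration:

\[t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}.\]

Under \(H_0\), this statistic follows a \(t\)-distribution with \(n - 2\) degrees of freedom, so values of \(r\) far from zero give a large \(|t|\) and a small \(p\)-value.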

Let's carry out the test to determine whether there is evidence of a significant association between the income and happiness variables:


    Pearson's product-moment correlation

data:  df$income_2019 and df$happiness_2019
t = 10.278, df = 76, p-value = 4.945e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6504941 0.8422277
sample estimates:
      cor 
0.7626303 

From the above output, we note the following:

  • The \(p\)-value is \(4.945 \times 10^{-16}\), which is far below 0.05, so we reject \(H_0\). That is, there is evidence to suggest that the correlation is not equal to zero, so we have evidence of a significant association between the two variables.
  • The test statistic is \(t = 10.278\), with 76 degrees of freedom.
  • The sample correlation coefficient is \(r = 0.76\).
  • The 95% confidence interval for \(\rho\) is (0.65, 0.84). That is, we are 95% confident that the true correlation between income and happiness is between 0.65 and 0.84.
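
As a sanity check, the test statistic and confidence interval can be recomputed from the summary values in the output alone. The following is a minimal Python sketch, not the code that produced the output. It assumes the sample size is \(n = 78\) (inferred from the 76 degrees of freedom, since the degrees of freedom are \(n - 2\)) and that the interval is the usual Fisher \(z\)-transform interval. All variable names are illustrative.

```python
import math
from statistics import NormalDist

# Values read off the output above.
r = 0.7626303   # sample correlation ("cor")
dof = 76        # degrees of freedom reported for the test
n = dof + 2     # assumed sample size, inferred from dof = n - 2

# t statistic for H0: rho = 0.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

# Approximate 95% confidence interval via the Fisher z-transform:
# atanh(r) is roughly normal with standard error 1 / sqrt(n - 3).
z = math.atanh(r)
se = 1 / math.sqrt(n - 3)
z_crit = NormalDist().inv_cdf(0.975)  # about 1.96
ci_low = math.tanh(z - z_crit * se)
ci_high = math.tanh(z + z_crit * se)

print(f"t = {t_stat:.3f}")
print(f"95% CI = ({ci_low:.4f}, {ci_high:.4f})")
```

Under these assumptions the printed values should closely match the reported \(t\) statistic and confidence interval. The \(p\)-value is not recomputed here because the Python standard library has no \(t\)-distribution.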