Inference about \(\rho\)

We need to be able to judge whether the population correlation coefficient, \(\rho\), could plausibly be zero. If \(\rho\) could plausibly be zero, then the data provide no evidence of a linear relationship between the two variables. In this course, we will use a table of critical values for \(r\) to carry out the two-sided test \[H_0:\rho =0 \mbox{ vs } H_1:\rho\neq 0.\] These critical values have been computed for a range of sample sizes and significance levels.

  1. Table 8 in your statistical tables gives critical values (\(c\)) for \(r\) (the sample correlation coefficient), for sample size \(n\) and significance level \(\alpha\), which we will take to be 5%.

  2. We reject \(H_0:\rho =0\) in favour of \(H_1:\rho\neq 0\) if \(|r|\) (the absolute value of the sample correlation) is greater than \(c\), where \(c\) is the critical value read from the statistical tables. Otherwise, we do not reject \(H_0\).
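As an illustration of the decision rule above, here is a minimal Python sketch. The function names `pearson_r` and `reject_h0` and the data are made up for this example. The critical value 0.632 is the commonly tabulated 5% two-sided value for \(n = 10\); it is an assumption here and should be checked against Table 8.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient r for paired observations x and y."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length, at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Corrected sums of squares and cross-products.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def reject_h0(r, c):
    """Two-sided test of H0: rho = 0. Reject when |r| exceeds the critical value c."""
    return abs(r) > c


# Hypothetical data with n = 10 pairs (not taken from the course).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

r = pearson_r(x, y)
c = 0.632  # assumed 5% two-sided critical value for n = 10; check against Table 8
print(f"r = {r:.3f}, reject H0: {reject_h0(r, c)}")
```

Because the rule uses \(|r|\), a strongly negative sample correlation leads to rejection in the same way as a strongly positive one.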

For information only

The theoretical basis for the test and confidence interval for \(\rho\) is the sampling distribution of \(r\), which is rather complex.

Rather than working with \(r\), we work with the transformed variable, \[T= \frac{1}{2} \mathrm{ln}\frac{1+r}{1-r}\]

\(T\) is approximately Normally distributed.

The mean of \(T\) is

\[ \mathrm{E}(T) = \frac{1}{2}\mathrm{ln}\frac{1+\rho}{1-\rho}\]

The variance of \(T\) is

\[ \mathrm{Var}(T) = \frac{1}{n-3}\]

Therefore, approximately,

\[ T \sim N\left(\frac{1}{2}\mathrm{ln}\frac{1+\rho}{1-\rho},\frac{1}{n-3}\right)\]

Standardising \(T\), we have, approximately, \(Z \sim N(0,1)\), where

\[\begin{eqnarray*} Z &=& \frac{\frac{1}{2}\mathrm{ln}\frac{1+r}{1-r}-\frac{1}{2}\mathrm{ln}\frac{1+\rho}{1-\rho}}{\sqrt{\frac{1}{n-3}}} \\ &=&\frac{\sqrt{n-3}}{2}\mathrm{ln }\frac{(1+r)(1-\rho)}{(1-r)(1+\rho )} \end{eqnarray*}\]

\(T\) and hence \(Z\) can be used to construct confidence intervals and tests for \(\rho\). In the practicals we will briefly consider \(p\)-values and confidence intervals for \(\rho\).
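To make this concrete, the following Python sketch uses the Normal approximation for \(T\) to compute an approximate two-sided \(p\)-value and confidence interval for \(\rho\). It is an illustration only, not necessarily the method used in the practicals. The function names `fisher_z_test` and `fisher_ci` and the inputs \(r = 0.5\), \(n = 28\) are assumptions made for the example. It relies on the fact that \(\frac{1}{2}\mathrm{ln}\frac{1+r}{1-r}\) is the inverse hyperbolic tangent of \(r\), so the interval for \(\rho\) is obtained by applying `tanh` to the end points of the interval on the \(T\) scale.

```python
import math
from statistics import NormalDist


def fisher_z_test(r, n, rho0=0.0):
    """Approximate two-sided test of H0: rho = rho0 based on T.

    Returns the standardised statistic Z and its two-sided p-value.
    """
    t = math.atanh(r)  # T = (1/2) ln((1 + r) / (1 - r))
    z = (t - math.atanh(rho0)) * math.sqrt(n - 3)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for rho at the given level."""
    t = math.atanh(r)
    # Normal quantile times the standard deviation of T, sqrt(1 / (n - 3)).
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    # Back-transform the end points from the T scale to the correlation scale.
    return math.tanh(t - half_width), math.tanh(t + half_width)


# Hypothetical inputs: sample correlation 0.5 from n = 28 pairs.
z, p_value = fisher_z_test(0.5, 28)
lower, upper = fisher_ci(0.5, 28)
print(f"Z = {z:.3f}, p-value = {p_value:.4f}")
print(f"95% CI for rho: ({lower:.3f}, {upper:.3f})")
```

Since this is an approximation, the conclusion may occasionally differ from the table-based test when \(|r|\) is close to the critical value, particularly for small \(n\).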