# 1 Intro

## 1.1 Intended Learning Outcomes

After attending this lab, you should be able to use R to:

• calculate least squares estimates of model parameters using vector-matrix formulation;
• calculate and interpret the sample correlation coefficient;
• perform hypothesis tests on the population correlation and interpret the decision.

## 1.2 Introduction

In the lectures we learned how to assess the strength of a linear relationship between random variables using the correlation coefficient. The population correlation measures the strength and direction of the linear relationship between two random variables X and Y, and is defined as

$\begin{equation} \rho(X,Y) = \frac{\text{Cov}(X,Y)}{\sqrt{\text{Var}(X)\text{Var}(Y)}}, \tag{1.1} \end{equation}$

and can be estimated by replacing each of $$\text{Cov}(X,Y)$$, $$\text{Var}(X)$$ and $$\text{Var}(Y)$$ by their unbiased estimators to give

$\begin{equation} r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}=\frac{\sum^{n}_{i=1}(x_i-\overline{x})(y_i-\overline{y})}{\sqrt{\sum^{n}_{i=1}(x_i-\overline{x})^2\sum^{n}_{i=1}(y_i-\overline{y})^2}}, \tag{1.2} \end{equation}$

the sample correlation coefficient ($$-1 \le r \le 1$$).
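As a minimal sketch of equation (1.2), the sample correlation can be computed in R directly from the sums of squares and products and then compared with the built-in `cor()` function. The vectors `x` and `y` below are hypothetical toy data invented for illustration; they do not come from the lab materials.

```r
# Hypothetical toy data, invented for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Sums of squares and products from equation (1.2)
Sxy <- sum((x - mean(x)) * (y - mean(y)))
Sxx <- sum((x - mean(x))^2)
Syy <- sum((y - mean(y))^2)

# Sample correlation coefficient computed by hand
r_manual <- Sxy / sqrt(Sxx * Syy)

# Built-in Pearson correlation for comparison
r_builtin <- cor(x, y)

# TRUE if the two values agree up to numerical tolerance
all.equal(r_manual, r_builtin)
```

The unbiased estimators each carry a factor of 1/(n - 1), which cancels between the numerator and the denominator, so only the sums are needed here.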

Given a sample of data, we can assess whether an observed correlation between variables is statistically significant, that is, whether it provides evidence of a linear relationship in the wider population. To do this we perform a hypothesis test (more on this in Chapter 2.2).
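As a rough preview of what such a test can look like in R, the sketch below uses the base function `cor.test()`. By default it tests the null hypothesis that the population correlation is zero against a two-sided alternative. The data are the same hypothetical toy vectors used for illustration only, and the procedure covered in Chapter 2.2 may differ in its details.

```r
# Hypothetical toy data, invented for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Test H0: rho = 0 against H1: rho != 0 using Pearson's correlation
test <- cor.test(x, y, alternative = "two.sided", method = "pearson")

test$estimate   # sample correlation coefficient r
test$statistic  # t statistic on n - 2 degrees of freedom
test$p.value    # p-value for the two-sided test
```

A small p-value would be read as evidence against a population correlation of zero. With only five invented observations, the result here has no substantive meaning.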