The Simple Linear Regression Model

Let us first consider the parameters of a simple linear regression model.

Data: \((y_i,x_i), \quad i=1,\dots,n\) \(\newline\) Model: \(y_i = \alpha + \beta x_i + \epsilon_i\).

Writing out the model for each \(i=1, \ldots, n\), we can see that we have \(n\) equations:

\[\begin{aligned} y_1 &= \alpha + \beta x_1 + \epsilon_1 \\ y_2 &= \alpha + \beta x_2 + \epsilon_2 \\ &\;\;\vdots \\ y_n &= \alpha + \beta x_n + \epsilon_n \end{aligned}\]

Writing \(n\) nearly identical equations this way is tedious. Because the equations share a repetitive pattern, we can instead use vector and matrix notation, together with the matrix operations of addition and multiplication, to express the same set of equations much more compactly.

First we group all the observations \(y_i\) into an \(n\)-dimensional column vector \(\mathbf{Y}\), and the errors \(\epsilon_i\) into another column vector \(\boldsymbol{\epsilon}\): \[\mathbf{Y} = \left( \begin{array}{c} y_{1} \\ \vdots \\ y_{n} \end{array} \right), \qquad \boldsymbol{\epsilon} = \left( \begin{array}{c} \epsilon_{1} \\ \vdots \\ \epsilon_{n} \end{array} \right).\]

Similarly, we stack the two parameters, \(\alpha\) and \(\beta\), into another column vector: \[\boldsymbol{\beta} = \left( \begin{array}{c} \alpha \\ \beta \end{array} \right).\]

Finally, we place a column of ones alongside the predictor values \(x_i\) to create a matrix with two columns: \[\mathbf{X} = \left( \begin{array}{cc} 1 & x_{1} \\ \vdots & \vdots \\ 1 & x_{n} \end{array} \right).\]
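As a concrete sketch (the predictor values and variable names below are made up for illustration), the design matrix can be built in NumPy by placing a column of ones beside the predictor column:

```python
import numpy as np

# Four made-up predictor values, so n = 4.
x = np.array([1.0, 2.0, 3.0, 4.0])
n = len(x)

# Put a column of ones next to the predictor column to form the n x 2 matrix X.
X = np.column_stack([np.ones(n), x])
print(X)
# [[1. 1.]
#  [1. 2.]
#  [1. 3.]
#  [1. 4.]]
```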

In other words, we define the matrix \(\mathbf{X}\) so that \(\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}\) reproduces the right-hand side of our original \(n\) equations.

We now have the ingredients to write our simple linear model

\[y_i = \alpha + \beta x_i + \epsilon_i, \quad i=1,\ldots,n \] in vector-matrix notation as \[ \mathbf{Y}_{(n \times 1)} = \mathbf{X}_{(n \times 2)}\, \boldsymbol{\beta}_{(2 \times 1)} + \boldsymbol{\epsilon}_{(n \times 1)}.\]
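To check that the matrix form really reproduces the \(n\) scalar equations, here is a minimal sketch with made-up values for \(\alpha\), \(\beta\), and the errors; it computes the responses both ways and confirms they agree:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 5
x = rng.uniform(0.0, 10.0, size=n)
alpha, beta = 2.0, 0.5                 # illustrative parameter values
eps = rng.normal(0.0, 1.0, size=n)     # illustrative error draws

# Scalar form: compute y_i = alpha + beta * x_i + eps_i one i at a time.
y_scalar = np.array([alpha + beta * x[i] + eps[i] for i in range(n)])

# Matrix form: Y = X b + eps, with b the stacked parameter vector (alpha, beta).
X = np.column_stack([np.ones(n), x])   # design matrix with n rows and 2 columns
b = np.array([alpha, beta])
y_matrix = X @ b + eps

print(np.allclose(y_scalar, y_matrix))  # True
```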

The subscripts in parentheses give the dimensions of each vector or matrix: the first number is the number of rows and the second is the number of columns. (A vector is just a matrix with a single row or a single column.) It is good practice to check that these dimensions match, meaning that the multiplication and addition of the matrices is possible and that the dimensions on both sides of the equation are equal. Remember that matrix multiplication is only possible when the number of columns of the left matrix equals the number of rows of the right matrix, and addition is only possible when both matrices have the same dimensions.
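Here is a quick sketch of this dimension check (the arrays below are placeholders; only their shapes matter):

```python
import numpy as np

n = 5
X = np.ones((n, 2))     # X is (n x 2)
b = np.ones((2, 1))     # beta is (2 x 1)
eps = np.ones((n, 1))   # epsilon is (n x 1)

# Inner dimensions match: (n x 2)(2 x 1) gives (n x 1), which can be
# added to the (n x 1) error vector, so both sides are (n x 1).
print((X @ b + eps).shape)  # (5, 1)

# Reversing the order violates the rule: (2 x 1)(n x 2) has inner
# dimensions 1 and n, which do not match, so NumPy raises an error.
try:
    b @ X
except ValueError as err:
    print("multiplication not possible:", err)
```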