Use of the general least squares estimate

Now we will show how the general least squares formula recovers the solution that we obtained from first principles for the simple linear model through the origin.

Data: \((y_i,x_i), \quad i=1,\dots,n\)

Model: \(y_i =\beta x_i + \epsilon_i\), where \(\epsilon_i \sim N(0,\sigma^2)\) and the \(\epsilon_i\)s are assumed uncorrelated.

Writing this model in vector-matrix form, we get

\[\begin{aligned} \mbox{Response } \mathbf{Y} &=\left( \begin{array}{c} y_{1} \\ \vdots \\ y_{n} \\ \end{array} \right), \\ \quad \mbox{parameters } \boldsymbol{\beta} &=\left( \begin{array}{c} \beta \end{array} \right), \\ \quad \mbox{design matrix } \mathbf{X}&=\left( \begin{array}{c} x_{1} \\ \vdots\\ x_{n} \end{array} \right), \\ \quad \mbox{and errors } \boldsymbol{\epsilon} &=\left( \begin{array}{c} \epsilon_{1} \\ \vdots \\ \epsilon_{n} \\ \end{array} \right).\\ \end{aligned}\]

Since \(\mathbf{X}\) is a single column, the quantities in the general formula are all scalars: \[(\mathbf{X}^T\mathbf{X})= \sum_{i=1}^n x_i^2, \quad (\mathbf{X}^T\mathbf{X})^{-1}= \frac 1 { \sum_{i=1}^n x_i^2} , \quad (\mathbf{X}^T\mathbf{Y})=\sum_{i=1}^n x_i y_i.\] Thus, \[\boldsymbol{\hat{\beta}} = (\mathbf{X}^T\mathbf{X})^{-1}(\mathbf{X}^T\mathbf{Y}) =\frac {\sum_{i=1}^n x_i y_i} { \sum_{i=1}^n x_i^2}, \] which is the same result as we obtained in the previous week.
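As a quick numerical sanity check, here is a minimal Python/NumPy sketch (the data values below are hypothetical, chosen only for illustration) showing that the general matrix formula and the first-principles formula \(\sum_i x_i y_i / \sum_i x_i^2\) give the same estimate:

```python
import numpy as np

# Hypothetical data for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 2.3, 2.8, 4.2, 5.1])

# Design matrix for the model through the origin: a single column of x's
X = x.reshape(-1, 1)

# General least squares estimate: solve (X^T X) beta = X^T Y
beta_hat_matrix = np.linalg.solve(X.T @ X, X.T @ y)

# First-principles formula: sum(x_i y_i) / sum(x_i^2)
beta_hat_direct = np.sum(x * y) / np.sum(x**2)

print(beta_hat_matrix[0], beta_hat_direct)  # both print the same value
```

Solving the normal equations with `np.linalg.solve` rather than forming \((\mathbf{X}^T\mathbf{X})^{-1}\) explicitly is the standard numerically stable choice, although here \(\mathbf{X}^T\mathbf{X}\) is just a \(1\times 1\) matrix.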

Additionally, we have

\[ \begin{aligned} RSS &= \mathbf{Y}^T\mathbf{Y}-\mathbf{Y}^T\mathbf{X}\boldsymbol{\hat{\beta}}\\ &= \sum_{i=1}^n y_i^2 - \sum_{i=1}^n x_i y_i \left(\frac {\sum_{i=1}^n x_i y_i} { \sum_{i=1}^n x_i^2} \right)\\ &= \sum_{i=1}^n y_i^2 - \frac {\left(\sum_{i=1}^n x_i y_i\right)^2} { \sum_{i=1}^n x_i^2}. \end{aligned}\]
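The same hypothetical data can be used to verify the two expressions for the RSS numerically; again this is only an illustrative sketch:

```python
import numpy as np

# Same hypothetical data as above
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 2.3, 2.8, 4.2, 5.1])
X = x.reshape(-1, 1)

# Fitted slope through the origin
beta_hat = np.sum(x * y) / np.sum(x**2)

# RSS via the matrix identity Y^T Y - Y^T X beta_hat
rss_matrix = y @ y - (y @ X).item() * beta_hat

# RSS in its closed form: sum(y_i^2) - (sum(x_i y_i))^2 / sum(x_i^2)
rss_closed = np.sum(y**2) - np.sum(x * y)**2 / np.sum(x**2)

print(rss_matrix, rss_closed)  # identical up to floating-point error
```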