5.4 OLS properties
The Gauss-Markov Theorem: given assumptions (1)-(7), the OLS estimator has the minimum variance in the class of all linear unbiased estimators, i.e. it is BLUE (Best Linear Unbiased Estimator)
If the OLS estimator is linear and unbiased and at the same time has the smallest variance, then it is efficient (in finite samples)
As the sample size increases indefinitely (\(n\rightarrow\infty\)), the variance of the OLS estimator converges to zero, so the estimates collapse onto the true parameter values. This property is called consistency!
Also, as the sample size increases, the distribution of the OLS estimator asymptotically approaches the normal distribution (according to the Central Limit Theorem); this property is called asymptotic normality. Both properties are illustrated in the simulation sketch below
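A small Monte Carlo sketch (hypothetical Python/NumPy code with an illustrative data-generating process, not part of the original text) can make these two asymptotic properties visible: across many simulated samples the slope estimates stay centered on the true value, their spread shrinks as \(n\) grows, and their distribution looks increasingly bell-shaped.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = np.array([1.0, 2.0])              # assumed true parameters (intercept, slope)

def ols_slope(n):
    """Draw one sample of size n and return the OLS slope estimate."""
    x = np.column_stack([np.ones(n), rng.normal(size=n)])   # design matrix
    u = rng.normal(size=n)                                   # disturbances
    y = x @ beta + u
    return np.linalg.solve(x.T @ x, x.T @ y)[1]

for n in (25, 100, 400, 1600):
    slopes = np.array([ols_slope(n) for _ in range(2000)])
    # the mean stays near the true slope, the spread shrinks roughly as 1/sqrt(n)
    print(f"n={n:5d}  mean={slopes.mean():.3f}  sd={slopes.std():.3f}")
```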
Considering the relations \(~\hat{\beta}=(x^{T}x)^{-1}x^{T}y~\) and \(~y=x\beta+u~\), the expectation and the variance of the OLS estimator \(\hat{\beta}\) can be determined
\[\begin{equation} \begin{aligned} \hat{\beta}&=(x^{T}x)^{-1}x^{T}y \\ &=(x^{T}x)^{-1}x^{T}(x\beta+u) \\ &=\beta+(x^{T}x)^{-1}x^{T}u \end{aligned} \tag{5.23} \end{equation}\]
By taking the expectation of the last equation in (5.23), where only \(u\) is a random vector and \(E(u)=0\) by assumption, we get \[\begin{equation} \begin{aligned} E(\hat{\beta})&=\beta+(x^{T}x)^{-1}x^{T}E(u) \\ &=\beta \end{aligned} \tag{5.24} \end{equation}\]
According to the result in (5.24), the estimator’s expectation is equal to the vector of true parameters, which proves that the OLS estimator is unbiased!
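The unbiasedness result (5.24) can also be checked numerically with a sketch of the same kind (hypothetical NumPy code, regressors held fixed across replications): the average of \(\hat{\beta}=(x^{T}x)^{-1}x^{T}y\) over many replications reproduces the true \(\beta\) up to simulation noise.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta = 200, np.array([1.0, 2.0, -0.5])                   # assumed true parameters
x = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # fixed design matrix

estimates = []
for _ in range(5000):
    u = rng.normal(size=n)                                   # disturbances with E(u) = 0
    y = x @ beta + u
    estimates.append(np.linalg.solve(x.T @ x, x.T @ y))      # (x'x)^{-1} x'y

print(np.mean(estimates, axis=0))   # close to [1.0, 2.0, -0.5]
```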
The difference between \(\hat{\beta}\) and \(\beta\) is the estimator’s sampling error
\[\begin{equation} \hat{\beta}-\beta=(x^{T}x)^{-1}x^{T}u \tag{5.25} \end{equation}\]
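Relation (5.25) can be verified directly in a simulated sample where the disturbance vector \(u\) is known (again a hypothetical NumPy sketch): the sampling error \(\hat{\beta}-\beta\) equals \((x^{T}x)^{-1}x^{T}u\) to machine precision.

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta = 100, np.array([1.0, 2.0])
x = np.column_stack([np.ones(n), rng.normal(size=n)])
u = rng.normal(size=n)
y = x @ beta + u

beta_hat = np.linalg.solve(x.T @ x, x.T @ y)
lhs = beta_hat - beta                          # sampling error
rhs = np.linalg.solve(x.T @ x, x.T @ u)        # (x'x)^{-1} x'u
print(np.allclose(lhs, rhs))                   # True
```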
The variance of the OLS estimator can be determined by taking the expectation of the outer product of this sampling error
\[\begin{equation}\begin{aligned} Var(\hat{\beta})&=\Gamma=E((\hat{\beta}-\beta)(\hat{\beta}-\beta)^{T}) \\ &=E((x^{T}x)^{-1}x^{T}uu^{T}x(x^{T}x)^{-1}) \\ &=(x^{T}x)^{-1}x^{T}E(uu^{T})x(x^{T}x)^{-1} \\ &=(x^{T}x)^{-1}x^{T}(\sigma_u^{2}I)x(x^{T}x)^{-1} \\ &=\sigma_u^{2}(x^{T}x)^{-1} \end{aligned} \tag{5.26} \end{equation}\]
According to the results in (5.24) and (5.26), the expectation of each estimator \(\hat{\beta}_j\) is equal to the parameter \(\beta_j\), with variance \[\begin{equation} Var(\hat{\beta}_j)=\sigma_u^{2}diag_j(x^{T}x)^{-1} \tag{5.27} \end{equation}\] where \(diag_j(x^{T}x)^{-1}\) denotes the \(j\)-th diagonal element of \((x^{T}x)^{-1}\)
The square root of the estimator’s variance (5.27) is called the standard error \(se(\hat{\beta}_j)\)
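In applied work \(\sigma_u^{2}\) is unknown and is typically replaced by the residual-based estimate \(\hat{\sigma}_u^{2}=\hat{u}^{T}\hat{u}/(n-k)\); assuming that convention, the following hypothetical NumPy sketch computes (5.26) and the standard errors in (5.27) from the diagonal of \(\hat{\sigma}_u^{2}(x^{T}x)^{-1}\).

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta = 150, np.array([1.0, 2.0, -0.5])
x = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = x @ beta + rng.normal(size=n)

xtx_inv = np.linalg.inv(x.T @ x)
beta_hat = xtx_inv @ x.T @ y
resid = y - x @ beta_hat
sigma2_hat = resid @ resid / (n - x.shape[1])   # estimate of sigma_u^2 with n - k correction

var_beta_hat = sigma2_hat * xtx_inv             # equation (5.26), sigma_u^2 estimated
se = np.sqrt(np.diag(var_beta_hat))             # standard errors, equation (5.27)
print(beta_hat)
print(se)
```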
Exercise 26. Two estimated models are given as: \[(1)~~~y_i=3.54+2.87x_i+\hat{u}_i~~~~~~~~~~~~~~~~\] \[(2)~~~y_i=4.11+1.96x_i+0.53z_i+\hat{u}_i\]
- Which problem appears in equation (1) when variable \(z\) is omitted although it is relevant?
- Which problem appears in equation (2) when variable \(z\) is included although it is irrelevant?
- Why does the slope coefficient with respect to variable \(x\) change in equation (2) after including variable \(z\)?
- In which case should the slope coefficient with respect to variable \(x\) not change? (The simulation sketch after this list can be used to explore these questions.)
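One way to build intuition for these questions is a small simulation (hypothetical NumPy sketch with an illustrative data-generating process, not taken from the text): when \(z\) is relevant and correlated with \(x\), omitting it shifts the estimated slope on \(x\); when \(x\) and \(z\) are uncorrelated, the slope on \(x\) is essentially the same in both models.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000

def slopes_on_x(corr_xz, beta_z):
    """Return the OLS slope on x from model (1) (z omitted) and model (2) (z included)."""
    z = rng.normal(size=n)
    x = corr_xz * z + np.sqrt(1.0 - corr_xz**2) * rng.normal(size=n)
    y = 3.0 + 2.0 * x + beta_z * z + rng.normal(size=n)
    X1 = np.column_stack([np.ones(n), x])        # short model: y on x only
    X2 = np.column_stack([np.ones(n), x, z])     # long model: y on x and z
    b1 = np.linalg.solve(X1.T @ X1, X1.T @ y)[1]
    b2 = np.linalg.solve(X2.T @ X2, X2.T @ y)[1]
    return b1, b2

print(slopes_on_x(corr_xz=0.6, beta_z=0.5))  # slopes differ: omitted-variable bias in (1)
print(slopes_on_x(corr_xz=0.0, beta_z=0.5))  # slopes nearly equal: no bias when x and z are uncorrelated
```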