5.4 OLS properties

The Gauss-Markov Theorem: under assumptions (1)-(7), the OLS estimator has the minimum variance in the class of all linear unbiased estimators, i.e. it is BLUE (Best Linear Unbiased Estimator)

  • Because the OLS estimator is linear, unbiased and has the smallest variance within this class, it is efficient (in finite samples)

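  • As the sample size increases indefinitely (\(n\rightarrow\infty\)), the variance of the OLS estimator converges to zero. Together with unbiasedness, this implies that \(\hat{\beta}\) converges in probability to \(\beta\). This property is called consistency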

  • Also, as the sample size increases, the distribution of the OLS estimator asymptotically approaches the normal distribution (by the Central Limit Theorem)

  • Using the relations \(~\hat{\beta}=(x^{T}x)^{-1}x^{T}y~\) and \(~y=x\beta+u~\), the expectation and the variance of the OLS estimator \(\hat{\beta}\) can be derived

\[\begin{equation} \begin{aligned} \hat{\beta}&=(x^{T}x)^{-1}x^{T}y \\ &=(x^{T}x)^{-1}x^{T}(x\beta+u) \\ &=\beta+(x^{T}x)^{-1}x^{T}u \end{aligned} \tag{5.23} \end{equation}\]

  • By taking the expectation of the last line in (5.23), where only \(u\) is a random vector and \(E(u)=0\), we get \[\begin{equation} \begin{aligned} E(\hat{\beta})&=\beta+(x^{T}x)^{-1}x^{T}E(u) \\ &=\beta \end{aligned} \tag{5.24} \end{equation}\]

  • According to (5.24), the expectation of the estimator is equal to the vector of the true parameters. This proves that the OLS estimator is unbiased

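  • The difference between \(\hat{\beta}\) and \(\beta\) is the estimator's sampling error (estimation error). It is not the bias: the bias \(E(\hat{\beta})-\beta\) is zero by (5.24)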

\[\begin{equation} \hat{\beta}-\beta=(x^{T}x)^{-1}x^{T}u \tag{5.25} \end{equation}\]

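  • The variance of the OLS estimator is the expectation of the outer product of the difference \(\hat{\beta}-\beta\) in (5.25) with itself, where \(E(uu^{T})=\sigma_u^{2}I\) is used in the last step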

\[\begin{equation}\begin{aligned} Var(\hat{\beta})&=\Gamma=E((\hat{\beta}-\beta)(\hat{\beta}-\beta)^{T}) \\ &=E((x^{T}x)^{-1}x^{T}uu^{T}x(x^{T}x)^{-1}) \\ &=(x^{T}x)^{-1}x^{T}E(uu^{T})x(x^{T}x)^{-1} \\ &=\sigma_u^{2}(x^{T}x)^{-1} \end{aligned} \tag{5.26} \end{equation}\]

  • According to (5.24) and (5.26), the expectation of each estimator \(\hat{\beta}_j\) is equal to the parameter \(\beta_j\), and its variance is the \(j\)-th diagonal element of \(\Gamma\) \[\begin{equation} Var(\hat{\beta}_j)=\sigma_u^{2}diag_j(x^{T}x)^{-1} \tag{5.27} \end{equation}\]

  • The square root of the estimator's variance in (5.27) is called the standard error \(se(\hat{\beta}_j)\)
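The results (5.24), (5.26) and (5.27) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the true parameters `beta`, the error standard deviation `sigma_u`, the sample sizes and the normally distributed regressors and errors are all hypothetical choices made only for illustration. The design matrix is held fixed and only the error vector \(u\) is redrawn, matching the assumption used in (5.24).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (not from the text): true parameters and error std. dev.
beta = np.array([1.0, 2.0, -0.5])
sigma_u = 1.5
n = 200

# Design matrix held fixed across replications: intercept plus two regressors
x = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
xtx_inv = np.linalg.inv(x.T @ x)

# Theoretical covariance matrix (5.26) and standard errors from (5.27)
gamma = sigma_u**2 * xtx_inv
se_theory = np.sqrt(np.diag(gamma))

# Monte Carlo: redraw only the error vector u, then re-estimate beta each time
reps = 5000
u = rng.normal(scale=sigma_u, size=(reps, n))
y = x @ beta + u              # one simulated sample per row
beta_hat = y @ x @ xtx_inv    # row r equals ((x'x)^{-1} x' y_r)'

print("mean of estimates:", beta_hat.mean(axis=0))       # compare with beta
print("simulated std:    ", beta_hat.std(axis=0, ddof=1))
print("theoretical se:   ", se_theory)


def slope_se(n_obs):
    """Theoretical standard error of the first slope for a fresh design of size n_obs."""
    x_n = np.column_stack([np.ones(n_obs), rng.normal(size=(n_obs, 2))])
    return sigma_u * np.sqrt(np.linalg.inv(x_n.T @ x_n)[1, 1])


# The standard error shrinks as n grows, which illustrates consistency
se_by_n = {n_obs: slope_se(n_obs) for n_obs in (50, 500, 5000)}
print(se_by_n)
```

The average of the simulated estimates should lie close to `beta`, and their standard deviations close to `se_theory`. In applied work \(\sigma_u^{2}\) is unknown, so the sketch uses the true value only because the data are simulated.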

Exercise 26. Two estimated models are given as: \[(1)~~~y_i=3.54+2.87x_i+\hat{u}_i~~~~~~~~~~~~~~~~\] \[(2)~~~y_i=4.11+1.96x_i+0.53z_i+\hat{u}_i\]

  1. Which problem appears in model \((1)\) when variable \(z\) is omitted although it is relevant?
    Solution If the relevant variable \(z\) is omitted from model \((1)\), the result is a problem called omitted variable bias. When \(z\) is correlated with \(x\), the estimated coefficient \(2.87\) is biased (it does not reflect the true relationship between \(x\) and \(y\)). The omitted variable is absorbed into the error term, which is then correlated with \(x\), so this is a form of the endogeneity problem.
  2. Which problem appears in model \((2)\) when variable \(z\) is not omitted although it is irrelevant?
    Solution Including an irrelevant variable typically leads to an inefficiency problem (higher standard errors for the estimated coefficients of the relevant variables), while the estimates remain unbiased.
  3. Why does the slope coefficient with respect to variable \(x\) change in model \((2)\) after including variable \(z\)?
    Solution The slope coefficient with respect to variable \(x\) changes from \(2.87\) to \(1.96\) after including variable \(z\) in model \((2)\), because \(x\) and \(z\) are correlated. In model \((1)\) the coefficient of \(x\) also picks up the part of the effect of \(z\) that is associated with \(x\). If the correlation is very high, including \(z\) can also lead to the multicollinearity problem (highly correlated RHS variables). One interpretation is that \(x\) affects \(y\) both directly and indirectly (variable \(z\) acts as a mediator).
  4. In which case should the slope coefficient with respect to variable \(x\) not change?
    Solution The slope coefficient with respect to variable \(x\) should not change if \(x\) and \(z\) are uncorrelated (zero correlation in the sample).
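The answers to questions 1, 3 and 4 can be illustrated with simulated data. The sketch below is an assumption-laden illustration, not the data behind models \((1)\) and \((2)\): the coefficients, the sample size and the parameter `rho`, which controls how strongly \(z\) depends on \(x\), are hypothetical. It also checks the algebraic identity that the slope of the short regression equals the slope of the long regression plus the coefficient of \(z\) times the slope from regressing \(z\) on \(x\).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000


def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def compare_models(rho):
    """Simulate data where z = rho * x + noise and fit both models.

    Returns the slope of x in the short model (z omitted), the slopes of
    x and z in the long model, and the slope from regressing z on x.
    """
    x = rng.normal(size=n)
    z = rho * x + rng.normal(size=n)
    # Hypothetical true model: y = 4 + 2x + 0.5z + u
    y = 4.0 + 2.0 * x + 0.5 * z + rng.normal(size=n)
    ones = np.ones(n)
    short = ols(np.column_stack([ones, x]), y)
    long = ols(np.column_stack([ones, x, z]), y)
    delta = ols(np.column_stack([ones, x]), z)[1]
    return short[1], long[1], long[2], delta


# Correlated regressors: omitting z shifts the slope of x
b_short, b_long_x, b_long_z, delta = compare_models(rho=1.8)
print("correlated:  ", b_short, b_long_x)
print("identity gap:", b_short - (b_long_x + b_long_z * delta))

# Uncorrelated regressors: both slopes of x are nearly the same
b_short0, b_long_x0, _, _ = compare_models(rho=0.0)
print("uncorrelated:", b_short0, b_long_x0)
```

With `rho=1.8` the short regression attributes part of the effect of \(z\) to \(x\), while with `rho=0.0` the two slopes differ only by sampling noise, since the sample correlation between \(x\) and \(z\) is then close to, but not exactly, zero.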