5.3 OLS assumptions

  • In total, seven assumptions may apply:
  1. Linearity in the parameters
  2. RHS variables are fixed (matrix x has the same values in repeated sampling)
  3. Constant variance of error terms (error term variances are equal across all observations, so-called homoskedasticity)
  4. Independence of error terms (no correlation between error terms, i.e. no autocorrelation)
  5. Normality of error terms (error terms are normally distributed with zero mean)
  6. Independence of RHS variables (matrix x has full column rank)
  7. Independence between RHS variables and error terms (exogeneity): an additional assumption that is required when the RHS variables are not fixed but random, as with time-series data
  • Not all assumptions are always required; which ones apply depends on the data type and model specification

  • When dealing with cross-sectional data, assumption (2) is fulfilled by default and thus assumption (7) is not required

  • When dealing with time-series data, assumption (2) is not fulfilled and thus assumption (7) is required

  • In a bivariate econometric model, assumption (6) is not required because there is only one variable on the right-hand side

  • In contrast, a multivariate econometric model requires assumption (6) to be fulfilled
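Assumption (6) can be checked numerically by comparing the rank of the matrix x with its number of columns. The following is a minimal sketch with NumPy; the two design matrices and the helper name `has_full_column_rank` are hypothetical and serve only as an illustration.

```python
import numpy as np

# Hypothetical design matrix: intercept column plus k = 2 regressors (n = 5).
x_ok = np.array([
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 0.0],
    [1.0, 5.0, 2.0],
    [1.0, 7.0, 1.0],
    [1.0, 8.0, 3.0],
])

# Same intercept and first regressor, but the last column is exactly
# twice the first regressor (perfect collinearity).
x_bad = x_ok.copy()
x_bad[:, 2] = 2.0 * x_bad[:, 1]


def has_full_column_rank(x):
    """Return True if the rank of x equals its number of columns (k + 1)."""
    return bool(np.linalg.matrix_rank(x) == x.shape[1])


full_rank_ok = has_full_column_rank(x_ok)
full_rank_bad = has_full_column_rank(x_bad)
print(full_rank_ok, full_rank_bad)
```

In the second matrix one column is an exact multiple of another, so the rank drops below k + 1 and assumption (6) fails.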

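  • Assumptions (3) and (4) can be embedded into the covariance matrix of the error terms, Ω

$$\mathrm{Var}(u \mid x) = E(uu^T) = \Omega = \begin{bmatrix} \sigma_u^2 & 0 & 0 & \cdots & 0 \\ 0 & \sigma_u^2 & 0 & \cdots & 0 \\ 0 & 0 & \sigma_u^2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \sigma_u^2 \end{bmatrix}$$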

  • Moreover, assumptions (3), (4) and (5) can be summarized in the following notation

$$u \mid x \sim N(0, \Omega); \quad \Omega = \sigma_u^2 I$$
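As an illustration of this notation, the sketch below builds a covariance matrix with equal variances and zero covariances, checks the two properties separately, and draws one error vector from the corresponding normal distribution. The values of `n` and `sigma2_u` are hypothetical and chosen only for the example.

```python
import numpy as np

n = 4            # hypothetical number of observations
sigma2_u = 2.5   # hypothetical error variance

# Omega = sigma_u^2 * I: equal variances on the diagonal, zero covariances elsewhere.
omega = sigma2_u * np.eye(n)

variances = np.diag(omega)
covariances = omega - np.diag(variances)

equal_variances = bool(np.allclose(variances, sigma2_u))   # assumption (3)
zero_covariances = bool(np.allclose(covariances, 0.0))     # assumption (4)

# Assumption (5): one draw of the error vector u | x ~ N(0, Omega).
rng = np.random.default_rng(0)
u = rng.multivariate_normal(mean=np.zeros(n), cov=omega)

print(equal_variances, zero_covariances, u.shape)
```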

  • The variance of the error terms, $\sigma_u^2$, is unknown and can be estimated from the squared residuals. An unbiased estimate of the error variance is given by

$$\hat{\sigma}_u^2 = \frac{\hat{u}^T \hat{u}}{n-(k+1)} = \frac{\sum_{i=1}^{n} \hat{u}_i^2}{n-k-1} \tag{5.19}$$

  • The square root of (5.19) is called the regression standard error

$$\hat{\sigma}_u = \sqrt{\frac{\sum_{i=1}^{n} \hat{u}_i^2}{n-k-1}} \tag{5.20}$$
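The two estimates can be computed directly from the residuals. The sketch below uses a small hypothetical sample (five observations, one regressor plus an intercept) and obtains the coefficients with `np.linalg.lstsq`; the data are invented for illustration only.

```python
import numpy as np

# Hypothetical sample: n = 5 observations, k = 1 regressor plus an intercept.
x = np.column_stack([np.ones(5), [1.0, 2.0, 3.0, 4.0, 5.0]])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
n, k = x.shape[0], x.shape[1] - 1

# Least-squares fit; only the coefficient vector is needed here.
beta_hat, *_ = np.linalg.lstsq(x, y, rcond=None)

u_hat = y - x @ beta_hat                    # residuals
sigma2_hat = (u_hat @ u_hat) / (n - k - 1)  # unbiased estimate of the error variance
sigma_hat = np.sqrt(sigma2_hat)             # regression standard error

print(sigma2_hat, sigma_hat)
```

For these five observations the sum of squared residuals is 3.6, so with n − k − 1 = 3 degrees of freedom the variance estimate is 1.2 and the regression standard error is its square root.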

  • The unbiased estimator $\hat{\sigma}_u^2$ is a random variable, distributed independently of the estimator $\hat{\beta}$, and thus

$$\frac{\hat{\sigma}_u^2 (n-k-1)}{\sigma_u^2} \sim \chi^2(df = n-k-1) \tag{5.21}$$

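  • Dividing the $\chi^2$ variable by its degrees of freedom $df = n-k-1$ gives

$$\frac{\chi^2}{df} = \frac{\hat{\sigma}_u^2 (n-k-1) / \sigma_u^2}{n-k-1} = \frac{\hat{\sigma}_u^2}{\sigma_u^2} \tag{5.22}$$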

  • The equality in (5.22) is important for deriving the test statistics used in significance testing of the estimated parameters (section 6.1)
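The distributional result can be illustrated with a small Monte Carlo experiment. The sketch below assumes normally distributed errors with a fixed matrix x across replications; the sample size, coefficients, error variance and number of replications are all hypothetical. It relies on the fact that a $\chi^2$ variable has mean $df$ and variance $2\,df$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, sigma2_u, reps = 20, 2, 4.0, 20_000   # hypothetical settings
df = n - k - 1

# Matrix x is drawn once and then held fixed across replications (assumption 2).
x = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([1.0, 0.5, -2.0])           # hypothetical true parameters

# Residual-maker matrix M = I - x (x'x)^{-1} x', so that u_hat = M y.
m = np.eye(n) - x @ np.linalg.solve(x.T @ x, x.T)

u = rng.normal(scale=np.sqrt(sigma2_u), size=(reps, n))
y = x @ beta + u                  # one simulated sample per row
u_hat = y @ m                     # M is symmetric, so row r equals (M y_r)'
sigma2_hat = np.sum(u_hat**2, axis=1) / df

q = sigma2_hat * df / sigma2_u    # should behave like chi-square with df degrees of freedom
ratio = sigma2_hat / sigma2_u     # q / df, the ratio in (5.22)

print(q.mean(), q.var(), ratio.mean())
```

If the assumptions hold, the simulated mean and variance of `q` should be close to $df$ and $2\,df$, and the average of `ratio` close to one, reflecting the unbiasedness of $\hat{\sigma}_u^2$.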

Exercise 24. Answer the following questions:

  1. What are the assumptions of a bivariate econometric model based on time-series data?
    Solution A bivariate model has only one RHS variable, and thus, assumption (6) is not an issue, while assumption (7) should be checked as assumption (2) is not fulfilled by default when dealing with time-series data. Therefore, assumptions (1), (3), (4), (5) and (7) are required.
  2. What are the assumptions of a multivariate econometric model based on cross-sectional data?
    Solution A multivariate model has more than one RHS variable, and thus, assumption (6) should be checked, while assumption (7) is not an issue as assumption (2) is fulfilled by default when dealing with cross-sectional data. Therefore, assumptions (1), (2), (3), (4), (5) and (6) are required.
  3. What are the assumptions of a multivariate econometric model based on time-series data?
    Solution All assumptions should be checked except assumption (2), which is typically not fulfilled when dealing with time-series data. Therefore, assumptions (1), (3), (4), (5), (6) and (7) are required.
  4. What are the dimensions of the matrix x?
    Solution The matrix x has n rows corresponding to the number of observations (sample size), and k+1 columns (with one additional column for the intercept and k independent variables).
  5. What is the formula for the OLS estimator?
    Solution The OLS estimator is given by the formula $\hat{\beta}_{OLS} = (x^T x)^{-1} x^T y$
  6. Which distribution do the error terms follow?
    Solution It is typically assumed that error terms follow a normal distribution with a zero mean and constant variance. However, error terms can also follow other statistical distributions, such as the t-distribution or Poisson distribution, depending on the data type and model specification.
  7. Which elements does the matrix Ω have?
    Solution The diagonal elements of the matrix Ω are variances, while the off-diagonal elements are covariances. The matrix Ω is always symmetric, but not necessarily diagonal (a symmetric matrix is diagonal if all its off-diagonal elements are zero).
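The answers on the dimensions of x and the OLS formula can be illustrated numerically. The sketch below uses hypothetical, noise-free data so that the formula recovers the chosen parameters exactly (up to rounding); the variable names and values are assumptions made for the example.

```python
import numpy as np

# Hypothetical data: n = 6 observations, k = 2 regressors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
x = np.column_stack([np.ones_like(x1), x1, x2])  # n rows, k + 1 columns
y = 1.0 + 2.0 * x1 - 0.5 * x2                    # noise-free, so beta is recovered

n, k = x.shape[0], x.shape[1] - 1

# Formula (x'x)^{-1} x'y; it requires x to have full column rank.
beta_ols = np.linalg.inv(x.T @ x) @ x.T @ y

print(x.shape, beta_ols)
```

The explicit inverse mirrors the formula as written; in practice a solver such as `np.linalg.solve` or `np.linalg.lstsq` is usually preferred for numerical stability.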

Exercise 25. Consider the multivariate time-series econometric model:

$$y_t = \beta_0 + \beta_1 x_t + \beta_2 z_t + u_t; \quad u_t \sim N(0, \Omega)$$

  1. Which assumption does not hold if $\mathrm{Cov}(x_t, z_t) \neq 0$?
    Solution The observed RHS variables $x_t$ and $z_t$ are not independent (as implied by the non-zero covariance), although assumption (6) requires them to be. Therefore, the independence of RHS variables does not hold.
  2. Is an endogeneity problem present if $\mathrm{Cov}(x_t, u_t) = 0$ and $\mathrm{Cov}(z_t, u_t) = 0$?
    Solution Both observed RHS variables $x_t$ and $z_t$ are independent of the error terms $u_t$ (as implied by the zero covariances), and thus the exogeneity assumption (7) holds, meaning that the endogeneity problem is not present.
  3. What kind of problem exists if $\mathrm{Cov}(u_t, u_{t-1}) \neq 0$?
    Solution If the covariance between error terms (lagged by one or more periods) is not zero, then an autocorrelation problem exists, i.e. assumption (4) does not hold.
  4. Which assumption does not hold if matrix Ω is diagonal but its diagonal elements are not all equal?
    Solution If the diagonal elements of matrix Ω are not equal, meaning that the variance of error terms is not constant across observations, the homoskedasticity assumption (3) does not hold.
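The last two questions can be summarized in a small helper that inspects a given matrix Ω and reports whether assumptions (3) and (4) hold. This is a minimal sketch: the function name `diagnose_omega` and the three example matrices are hypothetical, and in practice Ω is unknown and has to be assessed from the residuals.

```python
import numpy as np


def diagnose_omega(omega, tol=1e-12):
    """Report whether a known error covariance matrix satisfies assumptions (3) and (4)."""
    omega = np.asarray(omega, dtype=float)
    diag = np.diag(omega)
    off_diag = omega - np.diag(diag)
    return {
        "homoskedastic": bool(np.allclose(diag, diag[0], atol=tol)),     # assumption (3)
        "no_autocorrelation": bool(np.allclose(off_diag, 0.0, atol=tol)),  # assumption (4)
    }


# Hypothetical 3 x 3 examples.
spherical = 2.0 * np.eye(3)                   # equal variances, zero covariances
heteroskedastic = np.diag([1.0, 2.0, 3.0])    # diagonal, unequal variances
autocorrelated = np.array([                   # equal variances, non-zero covariances
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.50],
    [0.25, 0.50, 1.00],
])

for name, omega in [("spherical", spherical),
                    ("heteroskedastic", heteroskedastic),
                    ("autocorrelated", autocorrelated)]:
    print(name, diagnose_omega(omega))
```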