11.2 Time Series

$y_t = \beta_0 + x_{t1}\beta_1 + x_{t2}\beta_2 + ... + x_{t(k-1)}\beta_{k-1} + \epsilon_t$

Examples

Static Model
- $y_t=\beta_0 + x_1\beta_1 + x_2\beta_2 + x_3\beta_3 + \epsilon_t$
Finite Distributed Lag model
- $y_t=\beta_0 + pe_t\delta_0 + pe_{t-1}\delta_1 +pe_{t-2}\delta_2 + \epsilon_t$
- Long Run Propensity (LRP) is $LRP = \delta_0 + \delta_1 + \delta_2$
Dynamic Model
- $GDP_t = \beta_0 + \beta_1GDP_{t-1} + \epsilon_t$
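As an illustration of the finite distributed lag model and its long-run propensity, the following minimal sketch simulates data and recovers the LRP by OLS. The coefficient values, the sample size, and the variable name `pe` used for simulated data are assumptions chosen for the example, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000
pe = rng.normal(size=T)                 # simulated regressor (assumed)
delta = np.array([0.5, 0.3, 0.1])       # assumed true delta_0, delta_1, delta_2
eps = rng.normal(scale=0.5, size=T)

# y_t is defined for t = 2, ..., T-1 so that both lags exist
y = 1.0 + delta[0] * pe[2:] + delta[1] * pe[1:-1] + delta[2] * pe[:-2] + eps[2:]

# Regress y_t on a constant, pe_t, pe_{t-1}, pe_{t-2}
X = np.column_stack([np.ones(T - 2), pe[2:], pe[1:-1], pe[:-2]])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# LRP is the sum of the estimated lag coefficients
lrp_hat = coef[1:].sum()
print("estimated LRP:", lrp_hat, "true LRP:", delta.sum())
```

The LRP measures the total change in $y$ after a permanent one-unit increase in $pe$, which is why it sums all lag coefficients.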

Finite Sample Properties for Time Series:

A1-A3: OLS is unbiased
A1-A4: usual standard errors are consistent and Gauss-Markov Theorem holds (OLS is BLUE)
A1-A6: Finite Sample Wald Tests (t-test and F-test) are valid

A3 might not hold under time series setting

Spurious Time Trend - solvable
Strict vs Contemporaneous Exogeneity - not solvable

In time series data, there are many processes:

Autoregressive model of order p: AR(p)
Moving average model of order q: MA(q)
Autoregressive model of order p and moving average model of order q: ARMA(p,q)
Autoregressive conditional heteroskedasticity model of order p: ARCH(p)
Generalized autoregressive conditional heteroskedasticity model of orders p and q: GARCH(p,q)

11.2.1 Deterministic Time trend

Both the dependent and independent variables are trending over time

Spurious Time Series Regression

$y_t = \alpha_0 + t\alpha_1 + v_t$

and x takes the form

$x_t = \lambda_0 + t\lambda_1 + u_t$

$\alpha_1 \neq 0$ and $\lambda_1 \neq 0$
$v_t$ and $u_t$ are independent
there is no relationship between $y_t$ and $x_t$

If we estimate the regression,

$y_t = \beta_0 + x_t\beta_1 + \epsilon_t$

so the true $\beta_1=0$

Inconsistent: $plim(\hat{\beta}_1)=\frac{\alpha_1}{\lambda_1}$
Invalid Inference: $|t| \to^d \infty$ for $H_0: \beta_1=0$ , will always reject the null as $n \to \infty$
Uninformative $R^2$ : $plim(R^2) = 1$ will be able to perfectly predict as $n \to \infty$

We can rewrite the equation as

$\begin{aligned} y_t &=\beta_0 + \beta_1x_t+\epsilon_t \\ \epsilon_t &= \alpha_1t + v_t \end{aligned}$

where $\beta_0 = \alpha_0$ and $\beta_1=0$ . Since $x_t$ also trends with time, the omitted trend term $\alpha_1 t$ in $\epsilon_t$ is correlated with $x_t$ and we have the usual omitted variable bias.
Even when $y_t$ and $x_t$ are related ( $\beta_1 \neq 0$ ) but they are both trending over time, we still get spurious results with the simple regression of $y_t$ on $x_t$
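The three consequences of a spurious trend (inconsistency, an exploding t statistic, and $R^2$ approaching 1) can be checked in a small simulation. This is a sketch under assumed values $\alpha_1 = 0.5$ and $\lambda_1 = 0.2$; the helper `simple_ols` is hypothetical and written only for this example.

```python
import numpy as np

def simple_ols(x, y):
    """Slope, usual t statistic, and R^2 from regressing y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    sigma2 = resid @ resid / (n - k)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_stat = beta[1] / np.sqrt(cov[1, 1])
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta[1], t_stat, r2

rng = np.random.default_rng(1)
alpha1, lambda1 = 0.5, 0.2          # assumed trend slopes
results = {}
for T in (50, 500, 5000):
    t = np.arange(1, T + 1, dtype=float)
    y = 1.0 + alpha1 * t + rng.normal(size=T)   # v_t independent of u_t
    x = 2.0 + lambda1 * t + rng.normal(size=T)  # no true link between y and x
    results[T] = simple_ols(x, y)

for T, (b1, t_stat, r2) in results.items():
    print(T, b1, t_stat, r2)
```

As $T$ grows, the slope estimate should settle near $\alpha_1/\lambda_1 = 2.5$ rather than the true value 0, while the t statistic grows without bound and $R^2$ approaches 1.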

Solutions to Spurious Trend

Include time trend $t$ as an additional control
- consistent parameter estimates and valid inference
Detrend both dependent and independent variables and then regress the detrended outcome on detrended independent variables (i.e., regress residuals $\hat{v}_t$ on residuals $\hat{u}_t$ )
- Detrending is the same as partialing out in the [Frisch-Waugh-Lovell Theorem]
  - Could allow for non-linear time trends by including $t$, $t^2$, and $\exp(t)$
  - Allow for seasonality by including indicators for relevant “seasons” (quarters, months, weeks).
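The equivalence between adding $t$ as a control and detrending first can be verified numerically. The sketch below uses simulated data with assumed trend slopes and a true $\beta_1 = 0$; the helper `ols` is hypothetical.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients for y regressed on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(2)
T = 2000
t = np.arange(1, T + 1, dtype=float)
x = 2.0 + 0.2 * t + rng.normal(size=T)
y = 1.0 + 0.5 * t + rng.normal(size=T)   # y does not depend on x

ones = np.ones(T)

# (a) Include the time trend as an additional control
b_control = ols(np.column_stack([ones, x, t]), y)[1]

# (b) Detrend y and x separately, then regress residuals on residuals
Z = np.column_stack([ones, t])
y_res = y - Z @ ols(Z, y)
x_res = x - Z @ ols(Z, x)
b_detrended = ols(x_res[:, None], y_res)[0]

print(b_control, b_detrended)
```

Both routes give the same slope on $x_t$, and with the trend accounted for the estimate is close to the true value of 0.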

A3 does not hold under:

Feedback Effect
- $\epsilon_t$ influences next period’s independent variables
Dynamic Specification
- include last time period outcome as an explanatory variable
Dynamically Complete
- For a finite distributed lag model, the number of lags must be exactly correct.

11.2.2 Feedback Effect

$y_t = \beta_0 + x_t\beta_1 + \epsilon_t$

$E(\epsilon_t|\mathbf{X})= E(\epsilon_t| x_1,x_2, ...,x_t,x_{t+1},...,x_T)$

will not equal 0, because $y_t$ will likely influence $x_{t+1},..,x_T$

A3 is violated because we require the error to be uncorrelated with the regressors in every time period (strict exogeneity)

11.2.3 Dynamic Specification

$y_t = \beta_0 + y_{t-1}\beta_1 + \epsilon_t$

$E(\epsilon_t|\mathbf{X})= E(\epsilon_t| y_1,y_2, ...,y_t,y_{t+1},...,y_T)$

will not equal 0, because $y_t$ and $\epsilon_t$ are inherently correlated

A3 is violated because we require the error to be uncorrelated with the regressors in every time period (strict exogeneity)
Dynamic Specification is not allowed under A3

11.2.4 Dynamically Complete

$y_t = \beta_0 + x_t\delta_0 + x_{t-1}\delta_1 + \epsilon_t$

$E(\epsilon_t|\mathbf{X})= E(\epsilon_t| x_1,x_2, ...,x_t,x_{t+1},...,x_T)$

will not equal 0, because if we did not include enough lags, $x_{t-2}$ and $\epsilon_t$ are correlated

A3 is violated because we require the error to be uncorrelated with the regressors in every time period (strict exogeneity)
Can be corrected by including more lags (but when to stop?)

Without A3

then, we can

Focus on Large Sample Properties
Can use [A3a] instead of A3

[A3a] in time series becomes

$A3a: E(\mathbf{x}_t'\epsilon_t)= 0$

only the regressors in this time period need to be uncorrelated with the error in this time period (Contemporaneous Exogeneity)

$\epsilon_t$ can be correlated with $...,x_{t-2},x_{t-1},x_{t+1}, x_{t+2},...$
can have a dynamic specification $y_t = \beta_0 + y_{t-1}\beta_1 + \epsilon_t$

Deriving Large Sample Properties for Time Series

Assumptions A1, A2, [A3a]
The [Weak Law] and Central Limit Theorem depend on A5
- In time series, $x_t$ and $\epsilon_t$ are dependent over $t$, so A5 does not hold
- Without the [Weak Law] or the Central Limit Theorem, we cannot have Large Sample Properties for OLS
- Instead of A5, we consider [A5a]
Derivation of the Asymptotic Variance depends on A4
- time series setting introduces Serial Correlation: $Cov(\epsilon_t, \epsilon_s) \neq 0$

under A1, A2, [A3a], and [A5a], OLS estimator is consistent, and asymptotically normal
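A small Monte Carlo sketch can illustrate the difference between the finite sample and large sample results for a dynamic specification. It assumes an AR(1) process with $\beta_1 = 0.5$ and standard normal errors; the function name `ar1_ols_slope` and the replication counts are choices made for this example.

```python
import numpy as np

def ar1_ols_slope(rho, T, rng):
    """Simulate y_t = rho * y_{t-1} + eps_t and return the OLS slope on y_{t-1}."""
    eps = rng.normal(size=T + 1)
    y = np.empty(T + 1)
    y[0] = eps[0]
    for t in range(1, T + 1):
        y[t] = rho * y[t - 1] + eps[t]
    X = np.column_stack([np.ones(T), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return beta[1]

rng = np.random.default_rng(3)
rho = 0.5   # assumed true coefficient on the lagged outcome

# Average the estimate over many replications for a short and a long sample
mean_small = np.mean([ar1_ols_slope(rho, 20, rng) for _ in range(2000)])
mean_large = np.mean([ar1_ols_slope(rho, 2000, rng) for _ in range(200)])

print("T = 20:", mean_small, " T = 2000:", mean_large)
```

With a short sample the average estimate falls noticeably below 0.5 (strict exogeneity fails, so OLS is biased), while with a long sample it is close to 0.5, in line with consistency under contemporaneous exogeneity and weak dependence.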

11.2.5 Highly Persistent Data

If $y_t, \mathbf{x}_t$ are not weakly dependent stationary processes

$y_t$ and $y_{t-h}$ are not almost independent for large h
[A5a] does not hold and OLS is not consistent and does not have a limiting distribution.
Examples:
- Random Walk: $y_t = y_{t-1} + u_t$
- Random Walk with a drift: $y_t = \alpha+ y_{t-1} + u_t$

Solution: the first difference is a stationary process

$y_t - y_{t-1} = u_t$

If $u_t$ is a weakly dependent process (also called integrated of order 0) then $y_t$ is said to be difference-stationary process (integrated of order 1)
For regression, if $\{y_t, \mathbf{x}_t \}$ are random walks (integrated of order 1), we can consistently estimate the first-difference equation

$\begin{aligned} y_t - y_{t-1} &= (\mathbf{x}_t - \mathbf{x}_{t-1})\beta + \epsilon_t - \epsilon_{t-1} \\ \Delta y_t &= \Delta \mathbf{x}_t\beta + \Delta \epsilon_t \end{aligned}$
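To see why differencing matters, the sketch below regresses one random walk on another, independent random walk, first in levels and then in first differences, and records how often the usual t-test rejects $\beta = 0$ at the nominal 5% level. The sample size, number of replications, and helper `slope_t_stat` are assumptions for illustration.

```python
import numpy as np

def slope_t_stat(x, y):
    """Usual OLS t statistic on the slope from regressing y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

rng = np.random.default_rng(4)
T, reps = 200, 500
rej_levels = 0
rej_diffs = 0
for _ in range(reps):
    x = np.cumsum(rng.normal(size=T))   # random walk
    y = np.cumsum(rng.normal(size=T))   # independent random walk
    rej_levels += abs(slope_t_stat(x, y)) > 1.96
    rej_diffs += abs(slope_t_stat(np.diff(x), np.diff(y))) > 1.96

share_levels = rej_levels / reps
share_diffs = rej_diffs / reps
print("rejection share in levels:", share_levels)
print("rejection share in differences:", share_diffs)
```

In levels the test rejects far more often than 5% even though the series are unrelated; in first differences the rejection share is close to the nominal level.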

Unit Root Test

$y_t = \alpha + \rho y_{t-1} + u_t$

tests if $\rho=1$ (integrated of order 1)

Under the null $H_0: \rho = 1$ , OLS is not consistent or asymptotically normal.
Under the alternative $H_a: \rho < 1$ , OLS is consistent and asymptotically normal.
usual t-test is not valid, will need to use the transformed equation to produce a valid test.

Dickey-Fuller Test $\Delta y_t= \alpha + \theta y_{t-1} + v_t$ where $\theta = \rho -1$

$H_0: \theta = 0$ and $H_a: \theta < 0$
Under the null, $\Delta y_t$ is weakly dependent but $y_{t-1}$ is not.
Dickey and Fuller derived the non-normal asymptotic distribution. If you reject the null then $y_t$ is not a random walk.
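The Dickey-Fuller regression can be sketched directly with NumPy. The example assumes the tabulated asymptotic 5% critical value of about $-2.86$ for the specification with a constant and no trend; the function name `dickey_fuller_t`, the sample size, and the replication count are choices made here.

```python
import numpy as np

def dickey_fuller_t(y):
    """t statistic on theta in: delta y_t = alpha + theta * y_{t-1} + v_t."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    sigma2 = resid @ resid / (len(dy) - 2)
    se_theta = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1] / se_theta

DF_CRIT_5PCT = -2.86   # asymptotic 5% value, constant, no trend (assumed from tables)

rng = np.random.default_rng(5)
T, reps = 200, 500

# Under the null: simulate random walks and collect the test statistics
stats = np.array([dickey_fuller_t(np.cumsum(rng.normal(size=T)))
                  for _ in range(reps)])
share_df = np.mean(stats < DF_CRIT_5PCT)   # using the Dickey-Fuller value
share_normal = np.mean(stats < -1.645)     # using the one-sided normal value

# Under the alternative: a stationary AR(1) with rho = 0.5
u = rng.normal(size=T)
ar = np.empty(T)
ar[0] = u[0]
for t in range(1, T):
    ar[t] = 0.5 * ar[t - 1] + u[t]
t_ar = dickey_fuller_t(ar)

print(share_df, share_normal, t_ar)
```

Using the normal critical value rejects a true unit root far too often, which is why the usual t-test is not valid here. The Dickey-Fuller value keeps the rejection share near 5%, and the stationary series is rejected clearly.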

Concerns with the standard Dickey Fuller Test
1. Only considers a fairly simplistic dynamic relationship

$\Delta y_t = \alpha + \theta y_{t-1} + \gamma_1 \Delta y_{t-1} + ... + \gamma_p \Delta y_{t-p} + v_t$

with one additional lag, under the null $\Delta y_t$ is an AR(1) process and under the alternative $y_t$ is an AR(2) process.
Solution: include lags of $\Delta y_t$ as controls.

2. Does not allow for a time trend

$\Delta y_t = \alpha + \theta y_{t-1} + \delta t + v_t$

allows $y_t$ to have a quadratic relationship with $t$
Solution: include time trend (changes the critical values).

Augmented Dickey-Fuller Test $\Delta y_t = \alpha + \theta y_{t-1} + \delta t + \gamma_1 \Delta y_{t-1} + ... + \gamma_p \Delta y_{t-p} + v_t$ where $\theta = \rho - 1$

$H_0: \theta = 0$ and $H_a: \theta < 0$
Under the null, $\Delta y_t$ is weakly dependent but $y_{t-1}$ is not
Critical values are different with the time trend. If you reject the null, then $y_t$ is not a random walk.
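The regression with a time trend and lagged differences can be built by hand as follows. This is a sketch, not a substitute for a tested library routine: the function `adf_t_stat`, the lag choice `p=1`, and the simulated trend-stationary series are assumptions, and $-3.41$ is the tabulated asymptotic 5% critical value for the constant-plus-trend case.

```python
import numpy as np

def adf_t_stat(y, p=1, trend=True):
    """t statistic on theta with p lagged differences and an optional trend."""
    dy = np.diff(y)
    n = len(dy) - p                          # usable observations
    cols = [np.ones(n), y[p:-1]]             # constant and y_{t-1}
    if trend:
        cols.append(np.arange(n, dtype=float))
    for j in range(1, p + 1):
        cols.append(dy[p - j:len(dy) - j])   # lagged difference, lag j
    X = np.column_stack(cols)
    target = dy[p:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    sigma2 = resid @ resid / (n - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1] / se

ADF_TREND_CRIT_5PCT = -3.41   # asymptotic 5% value with constant and trend

# Trend-stationary example: linear trend plus stationary AR(1) noise
rng = np.random.default_rng(6)
T = 500
u = rng.normal(size=T)
noise = np.empty(T)
noise[0] = u[0]
for t in range(1, T):
    noise[t] = 0.5 * noise[t - 1] + u[t]
y = 0.1 * np.arange(T) + noise

t_stat = adf_t_stat(y, p=1, trend=True)
print(t_stat, t_stat < ADF_TREND_CRIT_5PCT)
```

For this series the statistic lies well below the critical value, so the unit root null is rejected once the trend is controlled for.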

11.2.5.0.1 Newey West Standard Errors

If A4 does not hold, we can use Newey West Standard Errors (HAC - Heteroskedasticity Autocorrelation Consistent)

$\hat{B} = T^{-1} \sum_{t=1}^{T} e_t^2 \mathbf{x'_tx_t} + \sum_{h=1}^{g}(1-\frac{h}{g+1})T^{-1}\sum_{t=h+1}^{T} e_t e_{t-h}(\mathbf{x_t'x_{t-h}+ x_{t-h}'x_t})$

estimates the covariances between observations up to $g$ periods apart
downweights more distant covariances to ensure $\hat{B}$ is PSD
How to choose g:
- For yearly data: $g = 1$ or 2 is likely to account for most of the correlation
- For quarterly or monthly data: g should be larger ($g = 4$ or 8 for quarterly and $g = 12$ or 24 for monthly)
- can also take integer part of $4(T/100)^{2/9}$ or integer part of $T^{1/4}$
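The formula for $\hat{B}$ translates almost line by line into NumPy. In the sketch below, the function names, the simulated AR(1) regressor and error, and the final sandwich step $(X'X/T)^{-1}\hat{B}(X'X/T)^{-1}/T$ are assumptions added for illustration; the sandwich step is the standard way to turn $\hat{B}$ into standard errors and is not spelled out in the text above.

```python
import numpy as np

def newey_west_B(X, e, g):
    """B-hat with Bartlett weights; X is T x k, e holds the OLS residuals."""
    T = X.shape[0]
    Xe = X * e[:, None]                  # row t is e_t * x_t
    B = Xe.T @ Xe / T                    # h = 0 term
    for h in range(1, g + 1):
        weight = 1.0 - h / (g + 1.0)
        G = Xe[h:].T @ Xe[:-h] / T       # sum of e_t e_{t-h} x_t' x_{t-h}
        B += weight * (G + G.T)
    return B

def ar1(rho, shocks):
    """Build an AR(1) series from a vector of shocks."""
    out = np.empty_like(shocks)
    out[0] = shocks[0]
    for t in range(1, len(shocks)):
        out[t] = rho * out[t - 1] + shocks[t]
    return out

rng = np.random.default_rng(7)
T = 500
x = ar1(0.7, rng.normal(size=T))
eps = ar1(0.7, rng.normal(size=T))       # serially correlated errors
y = 1.0 + 2.0 * x + eps

X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta

g = int(4 * (T / 100) ** (2 / 9))        # rule-of-thumb lag truncation
B_hat = newey_west_B(X, e, g)
A_inv = np.linalg.inv(X.T @ X / T)
se_nw = np.sqrt(np.diag(A_inv @ B_hat @ A_inv / T))
se_ols = np.sqrt(np.diag((e @ e / (T - 2)) * np.linalg.inv(X.T @ X)))

print("g:", g, "usual SE:", se_ols[1], "Newey-West SE:", se_nw[1])
```

With positively autocorrelated regressors and errors, the usual standard error understates the uncertainty, so the Newey-West standard error on the slope comes out larger.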

Testing for Serial Correlation

Run OLS regression of $y_t$ on $\mathbf{x_t}$ and obtain residuals $e_t$
Run OLS regression of $e_t$ on $\mathbf{x}_t, e_{t-1}$ and test whether coefficient on $e_{t-1}$ is significant.
Reject the null of no serial correlation if the coefficient is significant at the 5% level.
- Test using heteroskedastic robust standard errors
- can include $e_{t-2},e_{t-3},..$ in step 2 to test for higher order serial correlation (t-test would now be an F-test of joint significance)
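The testing steps above can be sketched as follows. The function names and simulated data are assumptions for illustration, and the robust standard error uses the basic heteroskedasticity-robust (HC0) sandwich form, which is one of several available variants.

```python
import numpy as np

def ols_fit(X, y):
    """Return OLS coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def serial_corr_t(X, y):
    """Robust t statistic on e_{t-1} in the regression of e_t on x_t and e_{t-1}."""
    _, e = ols_fit(X, y)                          # step 1: residuals
    Z = np.column_stack([X[1:], e[:-1]])          # step 2: x_t and e_{t-1}
    gamma, u = ols_fit(Z, e[1:])
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    Zu = Z * u[:, None]
    V = ZtZ_inv @ (Zu.T @ Zu) @ ZtZ_inv           # HC0 covariance matrix
    return gamma[-1] / np.sqrt(V[-1, -1])

rng = np.random.default_rng(8)
T = 500
x = rng.normal(size=T)
X = np.column_stack([np.ones(T), x])

# Case 1: AR(1) errors with coefficient 0.5
shocks = rng.normal(size=T)
eps = np.empty(T)
eps[0] = shocks[0]
for t in range(1, T):
    eps[t] = 0.5 * eps[t - 1] + shocks[t]
t_correlated = serial_corr_t(X, 1.0 + 2.0 * x + eps)

# Case 2: independent errors
t_iid = serial_corr_t(X, 1.0 + 2.0 * x + rng.normal(size=T))

print(t_correlated, t_iid)
```

Step 3 then compares the statistic with the 5% critical value (about 1.96 in absolute value): the serially correlated case should be rejected decisively.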