7 Time-series analysis

A time-series is a collection of observations recorded sequentially over time, $t=1,2,3,\dots,T$.
We can think of a single time-series as one specific realization of an underlying stochastic process. A stochastic process is a set of time-indexed random variables $\Big\{ y_{-\infty},\dots,y_{-2},~y_{-1},~y_0,~\underbrace{y_1,~y_2,~y_3,\dots,y_T}_{\text{time-series}},~y_{T+1},~y_{T+2},\dots,y_{+\infty}\Big\}$.
Discrete time-series are typically considered, with observations made at equidistant time points or time intervals, i.e. regular time-series such as monthly, quarterly or yearly data.
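To make the distinction between a stochastic process and its realization concrete, the sketch below draws two series from the same process at equidistant time points $t=1,\dots,T$. The AR(1) rule $y_t=\phi y_{t-1}+e_t$, the function name `simulate_ar1` and all parameter values are assumptions chosen for illustration; they do not come from the text above.

```python
import random


def simulate_ar1(T, phi=0.7, sigma=1.0, seed=None):
    """Draw one realization y_1, ..., y_T of the AR(1) process
    y_t = phi * y_{t-1} + e_t, with e_t ~ N(0, sigma^2) and |phi| < 1."""
    rng = random.Random(seed)
    # Draw y_0 from the stationary distribution, so the unobserved
    # pre-sample value is itself a random variable.
    y_prev = rng.gauss(0.0, sigma / (1.0 - phi ** 2) ** 0.5)
    path = []
    for _ in range(T):
        y_prev = phi * y_prev + rng.gauss(0.0, sigma)
        path.append(y_prev)
    return path


# Same process, two different realizations (two different observed time-series).
series_a = simulate_ar1(T=12, seed=1)
series_b = simulate_ar1(T=12, seed=2)
```

In practice only one such series is observed, and the properties of the underlying process have to be inferred from it.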
The simplest bivariate time-series model is the static model, in which $y_t$ depends only on the contemporaneous value $x_t$:

$\begin{equation} y_t=\beta_0+\beta_1x_t+u_t ~~~~~~~ t=1,2,3,\dots,T \tag{7.1} \end{equation}$

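OLS estimation of the static model (7.1) conventionally relies on the following assumptions:

a) exogeneity of the RHS variable (shown here in its contemporaneous form; strict exogeneity requires $Cov(x_s,u_t)=0$ for all $s$ and $t$)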

$\begin{equation} Cov(x_t,u_t)=0 \tag{7.2} \end{equation}$

b) no autocorrelation of the error terms

$\begin{equation} Cov(u_t,u_{t-j})=0~~~~\forall j=1,2,3,\dots,k \tag{7.3} \end{equation}$

c) constant variance of the error terms

$\begin{equation} Var(u_t)=\sigma^2_u~~~~~~~~~\forall t=1,2,3,\dots,T \tag{7.4} \end{equation}$
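Assumption (7.3) can be inspected informally by fitting the static model (7.1) by OLS and computing the sample autocorrelation of the residuals at lag $j$. The following is a minimal sketch in plain Python; the function names and the small data set are hypothetical and serve only as an illustration.

```python
def ols_bivariate(x, y):
    """Return (b0, b1) from an OLS fit of y_t = b0 + b1 * x_t + u_t."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(x, y, b0, b1):
    """OLS residuals u_hat_t = y_t - b0 - b1 * x_t."""
    return [yi - b0 - b1 * xi for xi, yi in zip(x, y)]


def sample_autocorr(e, j):
    """Sample autocorrelation of the series e at lag j >= 1."""
    n = len(e)
    e_bar = sum(e) / n
    dev = [ei - e_bar for ei in e]
    num = sum(dev[t] * dev[t - j] for t in range(j, n))
    den = sum(d * d for d in dev)
    return num / den


# Made-up illustrative numbers, not real data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 2.9, 4.2, 5.1, 5.8, 6.9, 8.2, 9.1]
b0, b1 = ols_bivariate(x, y)
u_hat = residuals(x, y, b0, b1)
r1 = sample_autocorr(u_hat, 1)  # lag-1 residual autocorrelation
```

Values of `sample_autocorr(u_hat, j)` far from zero suggest that (7.3) is violated; a formal test would be needed to confirm it.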

Assumptions (7.2), (7.3) and (7.4) are rarely all satisfied when dealing with time-series data.
For example, when the error terms are autocorrelated, the OLS estimates of the static model (7.1) are no longer efficient: they remain unbiased and consistent, but the conventional standard errors are wrong.
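A small Monte Carlo experiment can illustrate this point. The sketch below assumes that both $x_t$ and $u_t$ follow independent AR(1) processes with coefficient 0.8; this design, the parameter values and all function names are assumptions made for illustration. Across replications, the average slope estimate should be close to the true $\beta_1$, while the conventional standard error should understate the actual sampling variability of the estimates.

```python
import random
import statistics


def ar1(T, rho, rng):
    """One AR(1) path z_t = rho * z_{t-1} + e_t with e_t ~ N(0, 1)."""
    z = rng.gauss(0.0, 1.0 / (1.0 - rho ** 2) ** 0.5)  # stationary start
    path = []
    for _ in range(T):
        z = rho * z + rng.gauss(0.0, 1.0)
        path.append(z)
    return path


def ols_slope_and_se(x, y):
    """OLS slope of y on x (with intercept) and its conventional standard error."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se = (ssr / (n - 2) / s_xx) ** 0.5  # assumes no autocorrelation
    return b1, se


def monte_carlo(reps=500, T=100, beta0=1.0, beta1=2.0, rho=0.8, seed=42):
    """Simulate the static model repeatedly with autocorrelated x_t and u_t."""
    rng = random.Random(seed)
    slopes, ses = [], []
    for _ in range(reps):
        x = ar1(T, rho, rng)
        u = ar1(T, rho, rng)  # autocorrelated errors, independent of x
        y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]
        b1, se = ols_slope_and_se(x, y)
        slopes.append(b1)
        ses.append(se)
    # Mean estimate, actual spread of the estimates, average reported SE.
    return statistics.mean(slopes), statistics.stdev(slopes), statistics.mean(ses)


mean_b1, sd_b1, mean_se = monte_carlo()
```

Comparing `sd_b1` with `mean_se` shows how far the reported standard errors are from the true sampling variability in this particular design; the size of the gap depends on the assumed autocorrelation in both $x_t$ and $u_t$.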
The old-fashioned approach tries to overcome the autocorrelation problem by using the GLS method instead of OLS, or by keeping OLS and using robust standard errors (e.g., Newey-West standard errors). In other words, it treats autocorrelation as an estimation problem, as is typically done when dealing with cross-sectional data.
The modern approach is to make autocorrelation, which is typical of time-series data, a part of the model itself rather than treat it as an estimation problem.
If the properties of time-series data (autocorrelation, trending behavior, seasonality, structural breaks, nonstationarity, among others) are ignored, the static model may produce spurious results.
When analyzing time-series data, dynamic models are usually more appropriate than static models.
A dynamic model includes lagged variables on the RHS, i.e. values from previous periods, such as $x_{t-1}$, $y_{t-1}$, or both:

$\begin{equation}y_t=\beta_0+\beta_1x_t+\beta_2x_{t-1}+u_t ~~~~~~~~~~~~~~~~ \tag{7.5} \end{equation}$

$\begin{equation}y_t=\beta_0+\beta_1x_t+\beta_2y_{t-1}+u_t ~~~~~~~~~~~~~~~~ \tag{7.6} \end{equation}$

$\begin{equation}y_t=\beta_0+\beta_1x_t+\beta_2x_{t-1}+\beta_3y_{t-1}+u_t \tag{7.7} \end{equation}$
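Estimating any of these models requires aligning each observation with its lagged values, which costs one observation at the start of the sample. The sketch below builds the data rows for model (7.7); the function name `lagged_design` and the numbers are hypothetical. Models (7.5) and (7.6) use a subset of the same columns.

```python
def lagged_design(y, x):
    """Build rows (y_t, 1, x_t, x_{t-1}, y_{t-1}) for t = 2, ..., T.

    The first period is dropped because its lagged values are not observed.
    """
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    rows = []
    for t in range(1, len(y)):  # Python index t corresponds to period t + 1
        rows.append((y[t], 1.0, x[t], x[t - 1], y[t - 1]))
    return rows


# Made-up illustrative values, T = 5.
y = [10.0, 11.0, 13.0, 12.0, 15.0]
x = [1.0, 2.0, 4.0, 3.0, 5.0]
rows = lagged_design(y, x)  # T - 1 = 4 usable observations
```

Each row holds the dependent variable followed by the regressors of (7.7), so the effective sample runs from $t=2$ to $T$.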