6.1 Example: An \(AR(1)\) Model
Consider the following example, which describes an autoregressive model of order \(1\),
\[ y_t=\phi y_{t-1} + w_t \]
where \(w_t\sim\mathcal{N}(0,\tau^2)\) and we assume that \(\mathbb{E}[y_t]=0\) for all \(t\). What is the joint distribution of the \(y_t\)s in this case?
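To make the recursion concrete before working out the joint distribution, here is a minimal Python sketch that simulates a path from the model. The function name `simulate_ar1`, the burn-in length, and the seed are illustrative choices rather than part of the model. The burn-in is one common way to let the simulated series forget its arbitrary starting value of zero.

```python
import random


def simulate_ar1(n, phi, tau, burn_in=200, seed=0):
    """Simulate n values of y_t = phi * y_{t-1} + w_t with w_t ~ N(0, tau^2).

    The chain starts at 0 and the first `burn_in` values are discarded
    (an assumption of this sketch), so the retained values depend very
    little on the starting point when |phi| < 1.
    """
    rng = random.Random(seed)
    y_prev = 0.0
    path = []
    for t in range(burn_in + n):
        # One step of the AR(1) recursion: scale the previous value, add noise.
        y_prev = phi * y_prev + rng.gauss(0.0, tau)
        if t >= burn_in:
            path.append(y_prev)
    return path


y = simulate_ar1(n=500, phi=0.7, tau=1.0)
```

Plotting `y` for a few different values of `phi` is a quick way to see how the strength of the dependence between neighboring observations changes.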
If we assume that the process is second-order stationary, then, because \(w_t\) is independent of \(y_{t-1}\), the marginal variances satisfy
\[ \text{Var}(y_t) = \phi^2\text{Var}(y_{t-1}) + \tau^2. \] The stationarity assumption also implies that \(\text{Var}(y_t)=\text{Var}(y_{t-1})\). Therefore, if we rearrange terms, we must have that
\[ \text{Var}(y_t) = \frac{\tau^2}{1-\phi^2}. \] Note that this expression is undefined for \(|\phi|=1\) and negative for \(|\phi|>1\), so from here on we will assume \(|\phi|<1\). Furthermore, we can then show that
\[\begin{eqnarray*} \text{Cov}(y_t,y_{t-1}) & = & \text{Cov}(\phi y_{t-1} + w_t, y_{t-1})\\ & = & \phi\text{Var}(y_{t-1})\\ & = & \phi\,\frac{\tau^2}{1-\phi^2} \end{eqnarray*}\] where the second line uses the fact that \(w_t\) is independent of \(y_{t-1}\). Applying the same argument repeatedly across lags, we have
\[ \text{Cov}(y_t,y_{t-j}) = \phi^{|j|}\,\frac{\tau^2}{1-\phi^2}. \] From all this, we can see that the joint distribution of \(y_1,\dots,y_n\) is Normal with mean vector \(0\) and a covariance matrix whose elements are nonlinear functions of \(\phi\) and \(\tau\). While it is theoretically possible to compute this joint density and maximize it with respect to \(\phi\) and \(\tau\), some challenges arise relatively quickly:
As \(n\) increases, the \(n\times n\) covariance matrix quickly grows in size, making the computations more cumbersome, especially because some form of matrix decomposition must occur.
As \(n\) increases, we are taking larger and larger powers of \(\phi\), which can quickly lead to numerical instability and unfortunately cannot be solved by taking logs.
The formulation above ignores the sequential structure of the \(AR(1)\) model, which could be used to simplify the computations.
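The direct approach described above can be sketched as follows. This is a minimal, unoptimized illustration in plain Python that builds the full covariance matrix from the formula \(\text{Cov}(y_t,y_{t-j}) = \phi^{|j|}\tau^2/(1-\phi^2)\) and evaluates the joint Normal log-density through a Cholesky decomposition. The function names (`ar1_covariance`, `cholesky`, `dense_loglik`) are hypothetical, and in practice one would use a linear algebra library rather than hand-written loops.

```python
import math


def ar1_covariance(n, phi, tau):
    """n x n stationary AR(1) covariance: entry (i, j) is phi^|i-j| * tau^2 / (1 - phi^2)."""
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")
    marginal_var = tau ** 2 / (1.0 - phi ** 2)
    return [[marginal_var * phi ** abs(i - j) for j in range(n)] for i in range(n)]


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite matrix a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def dense_loglik(y, phi, tau):
    """Joint Normal log-density of y (mean 0, AR(1) covariance) via Cholesky."""
    n = len(y)
    L = cholesky(ar1_covariance(n, phi, tau))
    # Forward substitution solves L z = y, so that y^T Sigma^{-1} y = z^T z.
    z = []
    for i in range(n):
        z.append((y[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    quad = sum(v * v for v in z)
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)
```

The matrix needs storage of order \(n^2\) and the decomposition takes of order \(n^3\) operations, which is the first difficulty in the list. The entries `phi ** abs(i - j)` are the large powers of \(\phi\) mentioned in the second.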
Thankfully, the Kalman filter provides a computationally efficient way to evaluate this likelihood: it exploits the sequential structure of the model and thereby avoids both of the computational problems above.
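For this fully observed \(AR(1)\) model, the sequential structure gives the factorization \(p(y_1,\dots,y_n) = p(y_1)\prod_{t=2}^n p(y_t\mid y_{t-1})\), with \(y_1\sim\mathcal{N}(0,\tau^2/(1-\phi^2))\) and \(y_t\mid y_{t-1}\sim\mathcal{N}(\phi y_{t-1},\tau^2)\). The sketch below evaluates the log-likelihood this way in a single pass over the data. It is not a general Kalman filter, only an illustration of the kind of sequential evaluation that the filter generalizes, and the function name `sequential_loglik` is hypothetical.

```python
import math


def normal_logpdf(x, mean, var):
    """Log-density of a univariate Normal with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def sequential_loglik(y, phi, tau):
    """Log-likelihood of a stationary AR(1) series, one observation at a time."""
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")
    # The first observation comes from the stationary marginal distribution.
    loglik = normal_logpdf(y[0], 0.0, tau ** 2 / (1.0 - phi ** 2))
    # Each later observation depends on the past only through its predecessor.
    for prev, cur in zip(y, y[1:]):
        loglik += normal_logpdf(cur, phi * prev, tau ** 2)
    return loglik
```

Each term involves only \(\phi\) itself rather than high powers of it, no \(n\times n\) matrix is formed, and the cost grows linearly in \(n\). Maximizing over \(\phi\) and \(\tau\) could then be done, for example, by evaluating `sequential_loglik` over a grid of candidate values or by passing it to a numerical optimizer.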