## 5.2 The Kalman Filter

FUN FACT: The Kalman filter was developed by Rudolf Kalman while he worked at the Research Institute for Advanced Study in Baltimore, MD.

To introduce the Kalman filter, let’s take a simple model sometimes referred to as the “local level” model (strictly, that name applies to the special case $$\theta = 1$$), which has a state equation of

$x_t = \theta x_{t-1} + w_t$

and an observation equation of

$y_t = x_t + v_t$

where we assume the process noise $$w_t\sim\mathcal{N}(0,\tau^2)$$ and the measurement noise $$v_t\sim\mathcal{N}(0,\sigma^2)$$ are independent of each other and across time.
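To make the two equations concrete, here is a minimal sketch that simulates the model using only the Python standard library. The function name `simulate`, its arguments, and the default seed are illustrative assumptions, not part of the original text. Note that `tau` and `sigma` are standard deviations, so they correspond to $$\sqrt{\tau^2}$$ and $$\sqrt{\sigma^2}$$.

```python
import random


def simulate(n, theta, tau, sigma, x0=0.0, seed=0):
    """Simulate n steps of x_t = theta * x_{t-1} + w_t and y_t = x_t + v_t.

    tau and sigma are the standard deviations of w_t and v_t.
    Returns the hidden states and the observations as two lists.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    xs, ys = [], []
    x = x0
    for _ in range(n):
        x = theta * x + rng.gauss(0.0, tau)   # state equation
        y = x + rng.gauss(0.0, sigma)         # observation equation
        xs.append(x)
        ys.append(y)
    return xs, ys


# Example: 100 time points with theta = 0.9 and unit noise variances
states, observations = simulate(100, theta=0.9, tau=1.0, sigma=1.0)
```

In practice only `observations` would be available; the filter's job is to recover `states` from them.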

The basic one-dimensional Kalman filtering algorithm is as follows. We start with an initial state $$x_0^0$$ and initial variance $$P_0^0$$. Here the subscript indexes the time point being estimated and the superscript indexes the last observation used, so $$x_1^0$$ is the estimate of $$x_1$$ before $$y_1$$ has been seen. From here we compute

$\begin{eqnarray*} x_1^0 & = & \theta x_0^0\\ P_1^0 & = & \theta^2 P_0^0 + \tau^2 \end{eqnarray*}$

as our best guesses for $$x_1$$ and $$P_1$$ given our current state. Given our new observation $$y_1$$, we can then update our guess based on this new information to get

$\begin{eqnarray*} x_1^1 & = & x_1^0 + K_1(y_1-x_1^0)\\ P_1^1 & = & (1-K_1) P_1^0, \end{eqnarray*}$

where $$K_1 = P_1^0/(P_1^0 + \sigma^2)$$.
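As a quick worked instance of these two steps, take the purely illustrative values $$\theta = 0.5$$, $$\tau^2 = 1$$, $$\sigma^2 = 2$$, $$x_0^0 = 2$$, $$P_0^0 = 4$$, and suppose we observe $$y_1 = 3$$ (these numbers are made up for the arithmetic and do not come from a data set). Then

$\begin{eqnarray*} x_1^0 & = & 0.5 \times 2 = 1\\ P_1^0 & = & 0.5^2 \times 4 + 1 = 2\\ K_1 & = & 2/(2 + 2) = 0.5\\ x_1^1 & = & 1 + 0.5\,(3 - 1) = 2\\ P_1^1 & = & (1 - 0.5) \times 2 = 1. \end{eqnarray*}$

The observation pulls the estimate halfway from the prediction $$1$$ toward $$y_1 = 3$$, and the variance falls from $$2$$ to $$1$$.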

For the general case, we want to produce a new estimate of $$x_t$$ and we have the current state estimate $$x_{t-1}^{t-1}$$ and its variance $$P_{t-1}^{t-1}$$. The one-step prediction is then

$\begin{eqnarray*} x_{t}^{t-1} & = & \theta x_{t-1}^{t-1}\\ P_{t}^{t-1} & = & \theta^2 P_{t-1}^{t-1} + \tau^2. \end{eqnarray*}$

Given the new information $$y_t$$, we can then update our estimate to get

$\begin{eqnarray*} x_t^t & = & x_t^{t-1} + K_t(y_t-x_t^{t-1})\\ P_t^t & = & (1-K_t) P_t^{t-1} \end{eqnarray*}$

where

$K_t = \frac{P_t^{t-1}}{P_t^{t-1} + \sigma^2}$

is the Kalman gain coefficient.
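The predict and update recursion above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `kalman_filter` and its argument names are assumptions chosen to mirror the symbols in the text, with `tau2` and `sigma2` standing for the variances $$\tau^2$$ and $$\sigma^2$$.

```python
def kalman_filter(ys, theta, tau2, sigma2, x0, P0):
    """Run the one-dimensional Kalman filter over the observations ys.

    x0 and P0 play the roles of x_0^0 and P_0^0.
    Returns three lists: filtered states x_t^t, variances P_t^t, gains K_t.
    """
    x, P = x0, P0
    filtered, variances, gains = [], [], []
    for y in ys:
        # Prediction step: x_t^{t-1} and P_t^{t-1}
        x_pred = theta * x
        P_pred = theta ** 2 * P + tau2
        # Update step: Kalman gain, then x_t^t and P_t^t
        K = P_pred / (P_pred + sigma2)
        x = x_pred + K * (y - x_pred)
        P = (1 - K) * P_pred
        filtered.append(x)
        variances.append(P)
        gains.append(K)
    return filtered, variances, gains


# Example with a handful of made-up observations
xf, Pf, Kf = kalman_filter([1.2, 0.7, 1.9, 2.4], theta=0.9,
                           tau2=1.0, sigma2=1.0, x0=0.0, P0=1.0)
```

Note that the variances and gains do not depend on the observations at all, only on $$\theta$$, $$\tau^2$$, $$\sigma^2$$, and $$P_0^0$$, so they could be computed before any data arrive.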

If we look at the formula for the Kalman gain, it’s clear that if the measurement noise is high, so that $$\sigma^2$$ is large, then the Kalman gain will be closer to $$0$$, and the influence of the new data point $$y_t$$ will be small. If $$\sigma^2$$ is small, then the gain will be closer to $$1$$ and the filtered value $$x_t^t$$ will be adjusted more in the direction of $$y_t$$. Similarly, a large $$\tau^2$$ inflates the prediction variance $$P_t^{t-1}$$, which also pushes the gain toward $$1$$. This is important to remember when tuning the Kalman filtering algorithm for specific applications. The general idea is

$\begin{eqnarray*} \sigma^2\text{ is large} & \Rightarrow & \text{Trust the system}\\ \tau^2\text{ is large} & \Rightarrow & \text{Trust the data} \end{eqnarray*}$
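This rule of thumb can be checked numerically by iterating the variance recursion until the gain settles and comparing the result across noise settings. The sketch below is illustrative only: the helper name `steady_state_gain`, the starting variance, the iteration count, and the three variance pairs are all assumptions chosen for the demonstration.

```python
def steady_state_gain(theta, tau2, sigma2, P0=1.0, n_iter=200):
    """Iterate the predict/update variance recursion and return the final gain."""
    P = P0
    K = 0.0
    for _ in range(n_iter):
        P_pred = theta ** 2 * P + tau2      # P_t^{t-1}
        K = P_pred / (P_pred + sigma2)      # Kalman gain K_t
        P = (1 - K) * P_pred                # P_t^t
    return K


# Compare a noisy-measurement setting, a balanced one, and a noisy-state one
for tau2, sigma2 in [(1.0, 100.0), (1.0, 1.0), (100.0, 1.0)]:
    gain = steady_state_gain(theta=1.0, tau2=tau2, sigma2=sigma2)
    print(f"tau2={tau2:6.1f}  sigma2={sigma2:6.1f}  gain={gain:.3f}")
```

The gain is smallest when $$\sigma^2$$ dominates (the filter leans on the system model) and approaches $$1$$ when $$\tau^2$$ dominates (the filter leans on the data).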