3.6 The Fast Fourier Transform (FFT)

The problem with the Fourier transform as presented above, either in its sine/cosine regression model form or in its complex exponential form, is that it requires \(O(n^2)\) operations to compute all of the Fourier coefficients. There are \(n\) data points and \(n/2\) frequencies for which Fourier coefficients can be computed. Computing each coefficient requires multiplying and summing a cosine or sine against all \(n\) data points. For short time series this is not an issue, but for very long time series this can be a prohibitively expensive computation, even on today’s computers.

The Fast Fourier Transform (FFT) is a way to reduce the complexity of the Fourier transform computation from \(O(n^2)\) to \(O(n\log n)\), which is a dramatic improvement. The primary version of the FFT is the one due to Cooley and Tukey. The basic idea behind it is easy to see.

Suppose we have a time series \(y_0,\dots, y_{n-1}\) and we want to compute its complex Fourier coefficients. Going by the formula in the previous section, the first coefficient is \[ z_0 = \sum_{t=0}^{n-1} y_t, \] which is simply proportional to the mean of the data. If the data have been de-meaned or de-trended then this will be zero. The next Fourier coefficient is then \[\begin{eqnarray*} z_1 & = & \sum_{t=0}^{n-1} y_t\exp(-2\pi i\cdot 1\cdot t / n)\\ & = & y_0\exp(-2\pi i\cdot 1\cdot 0 / n) + y_1\exp(-2\pi i\cdot 1\cdot 1 / n) + y_2\exp(-2\pi i\cdot 1\cdot 2 / n) + \cdots. \end{eqnarray*}\] Now suppose we want to compute the next coefficient \(z_2\). This requires computing \[ z_2 = y_0\exp(-2\pi i\cdot 2\cdot 0 / n) + y_1\exp(-2\pi i\cdot 2\cdot 1 / n) + \cdots \] But wait! Notice that the exponential in the second term of the sum for \(z_2\) is the same as the exponential in the third term of the sum for \(z_1\). They are both equal to \(\exp(-2\pi i\cdot 1\cdot 2 / n)\). There is no need to compute this exponential quantity twice. We can simply compute it the first time, store it in memory, and then retrieve it when it is needed to compute \(z_2\) (assuming that retrieving it from memory is faster than computing it from scratch). One can think of the FFT algorithm as an elaborate bookkeeping algorithm that keeps track of these symmetries in computing the Fourier coefficients.
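Although the examples in this chapter use R, the storing-and-reusing idea can be sketched in a few lines of Python. The helper below (`dft_cached`, a name invented for this illustration) exploits the fact that \(\exp(-2\pi i p t/n)\) depends on \(p\) and \(t\) only through \((pt) \bmod n\), so only \(n\) distinct exponentials ever need to be computed and stored:

```python
import cmath

def dft_cached(y):
    """Naive O(n^2) DFT, but with each twiddle factor exp(-2*pi*i*k/n)
    computed once up front and then looked up via (p*t) mod n."""
    n = len(y)
    # Only n distinct exponentials ever occur across all coefficients.
    twiddle = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]
    return [sum(y[t] * twiddle[(p * t) % n] for t in range(n))
            for p in range(n)]
```

For example, `dft_cached([1.0, 2.0, 3.0, 4.0])` returns a first coefficient of \(10\), the sum of the data. Caching the exponentials alone does not change the \(O(n^2)\) count of multiplications; the full FFT bookkeeping described below is what reduces the complexity.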

3.6.1 Example: A Simple FFT

The efficiencies of the algorithm are easier to see with a simple example. Suppose we have a time series with 4 observations, \(y_0, y_1, y_2, y_3\). First, let \(\theta=\exp(-2\pi i/n)\), where here \(n=4\). Then the Fourier transform of the time series can be expressed in vector/matrix form as

\[ \mathbf{z}= \left[ \begin{array}{c} z_0\\ z_1\\ z_2\\ z_3 \end{array} \right] = \left[ \begin{array}{cccc} 1 & 1 & 1 & 1\\ 1 & \theta^1 & \theta^2 & \theta^3\\ 1 & \theta^2 & \theta^4 & \theta^6\\ 1 & \theta^3 & \theta^6 & \theta^{9} \end{array} \right] \left[ \begin{array}{c} y_0\\ y_1\\ y_2\\ y_3 \end{array} \right] \]

One can see the symmetry in the matrix of \(\theta\)s in the various powers of \(\theta\). Taking advantage of these symmetries by storing values instead of recomputing them produces a substantial reduction in computational complexity.

If \(\mathbf{y}=(y_0,\dots,y_{n-1})\), then the Fourier transform can be written as \(\mathbf{z}=W\mathbf{y}\), where \[ W = \left[ \begin{array}{ccccc} 1 & 1 & 1 & \cdots & 1\\ 1 & \theta^1 & \theta^2 & \cdots & \theta^{n-1}\\ 1 & \theta^2 & \theta^4 & \cdots & \theta^{2(n-1)}\\ 1 & \theta^3 & \theta^6 & \cdots & \theta^{3(n-1)}\\ \vdots & \vdots & & &\vdots \\ 1 & \theta^{(n-1)} & \theta^{2(n-1)} & \cdots & \theta^{(n-1)(n-1)} \end{array} \right] \] The symmetries in the \(W\) matrix obviously remain when we have a time series of length \(n\).

Consider the following simple time series of length 4.

set.seed(2020-02-03)
y <- rnorm(4)
y
[1] -1.545448388 -0.528393243 -1.086758791 -0.000111512

Here is the Fourier transform matrix for this series.

W <- rbind(exp(-2 * pi * 0 * (0:3) * 1i/4),
           exp(-2 * pi * 1 * (0:3) * 1i/4),
           exp(-2 * pi * 2 * (0:3) * 1i/4),
           exp(-2 * pi * 3 * (0:3) * 1i/4))
W
     [,1]  [,2]  [,3]  [,4]
[1,] 1+0i  1+0i  1+0i  1+0i
[2,] 1+0i  0-1i -1-0i  0+1i
[3,] 1+0i -1-0i  1+0i -1-0i
[4,] 1+0i  0+1i -1-0i  0-1i

We can now multiply the data in y with the Fourier transform matrix W and get

z <- W %*% y
drop(z)
[1] -3.1607119+0.0000000i -0.4586896+0.5282817i -2.1037024-0.0000000i
[4] -0.4586896-0.5282817i

To verify that we have the correct results, we can execute the FFT algorithm using the fft() function in R, which computes a univariate FFT.

z <- fft(y)
z
[1] -3.1607119+0.0000000i -0.4586896+0.5282817i -2.1037024+0.0000000i
[4] -0.4586896-0.5282817i

For a real time series, the first coefficient will always be real and will equal the sum of the time series, so \(z_0=-3.1607\). The second and third coefficients (\(z_1\) and \(z_2\)) are the Fourier coefficients corresponding to 1 cycle per 4 observations and 2 cycles per 4 observations, respectively. You will notice that because there are only 4 observations, 2 cycles per 4 observations is the Nyquist frequency.

The final Fourier coefficient (\(z_3\)) here is the complex conjugate of the second coefficient, because for a real series \(z_p=\bar{z}_{n-p}\), so \(z_1 = \bar{z}_3\). Practically speaking, for most analyses we will ignore the coefficients after \(p=n/2\) even though R computes all of them.

Notice also above that the imaginary component of \(z_2\), the Nyquist coefficient, is zero, as was stated previously, because \(\sin(2\pi t (n/2) / n)=\sin(\pi t)=0\) for all integer \(t\).
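These two symmetry properties are easy to check numerically for any real series. Here is a small sketch in Python (using an illustrative naive `dft` helper, not R's fft() function) that verifies the conjugate symmetry \(z_p=\bar{z}_{n-p}\) and the real-valued Nyquist coefficient:

```python
import cmath

def dft(y):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(y)
    return [sum(y[t] * cmath.exp(-2j * cmath.pi * p * t / n)
                for t in range(n)) for p in range(n)]

y = [-1.5, -0.5, -1.1, 0.0]   # any real series of even length works
z = dft(y)
n = len(y)
# Conjugate symmetry: z_p equals the conjugate of z_{n-p}
for p in range(1, n):
    assert abs(z[p] - z[n - p].conjugate()) < 1e-9
# The Nyquist coefficient (p = n/2) is real, up to rounding
assert abs(z[n // 2].imag) < 1e-9
```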

3.6.2 The Cooley-Tukey Algorithm

Apparently, John Tukey thought of the idea for the fast Fourier transform while sitting in a government meeting so I guess the lesson there is that sometimes meetings can in fact produce novel ideas.

More formally, let’s assume that the length of the time series is such that it can be factored into \(n=r\times s\). Then recall that each Fourier coefficient is \[ z_p=\sum_{t=0}^{n-1} y_t\exp(-2\pi ipt/n) \] for \(p=0,\dots,n-1\). Now let the index \(t\) be decomposed into \(t = rt_1 + t_0\), where we have \[\begin{eqnarray*} t_0 & = & 0,\dots,r-1\\ t_1 & = & 0,\dots,s-1 \end{eqnarray*}\] You can see that as \(t_0\) and \(t_1\) vary over the respective ranges, we can reconstruct every value of \(t\) from \(0\) to \(n-1\). For example

\[\begin{eqnarray*} n-1 & = & r t_1 + t_0\\ & = & r (s-1) + r-1 \\ & = & rs-r+r-1 \\ & = & rs-1 \\ & = & n-1 \end{eqnarray*}\] because \(n=rs\). It might look a bit counterintuitive but you just have to believe that all the numbers from \(0\) to \(n-1\) are in there!
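If you would rather not just believe it, a two-line enumeration makes the decomposition concrete. The sketch below (in Python, purely for illustration) checks that for \(n=6\) with \(r=2\) and \(s=3\), the pairs \((t_1, t_0)\) generate every \(t\) from \(0\) to \(n-1\) exactly once:

```python
# t = r*t_1 + t_0 with t_0 in 0..r-1 and t_1 in 0..s-1
# covers every t in 0..n-1 exactly once when n = r*s.
r, s = 2, 3
ts = sorted(r * t1 + t0 for t1 in range(s) for t0 in range(r))
print(ts)  # -> [0, 1, 2, 3, 4, 5]
```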

Now, we can re-write the equation for the Fourier coefficients as

\[\begin{eqnarray*} z_p & = & \sum_{t=0}^{n-1} y_t\exp(-2\pi ip t/n)\\ & = & \sum_{rt_1+t_0=0}^{n-1} y_{rt_1+t_0}\exp(-2\pi i p (rt_1 + t_0) / n)\\ & = & \sum_{rt_1+t_0=0}^{n-1} y_{rt_1+t_0}\exp(-2\pi i p t_0/n)\exp(-2\pi i prt_1/n)\\ & = & \sum_{t_0=0}^{r-1} \sum_{t_1=0}^{s-1} y_{rt_1+t_0}\exp(-2\pi i p t_0/n)\exp(-2\pi i prt_1/n) \end{eqnarray*}\] In the final line, verify that the double sum contains a total of \(r\times s=n\) terms. If \(p=0,\dots,n-1\), then we still require \(O(n^2)\) computations to compute all the Fourier coefficients.

Much like we did for \(t\), we can decompose \(p\) as \(p=sp_1 + p_0\), where \[\begin{eqnarray*} p_0 & = & 0,\dots,s-1\\ p_1 & = & 0,\dots,r-1 \end{eqnarray*}\]

Then for a single term in the final right-hand sum above, we have

\[\begin{eqnarray*} \exp(-2\pi i \,p\, rt_1/n) & = & \exp(-2\pi i \,(sp_1+p_0)\, rt_1/n)\\ & = & \exp(-2\pi i \,\mathbf{rs}p_1t_1/n)\exp(-2\pi i\,p_0\,rt_1/n) \end{eqnarray*}\]

Note that I have bolded the product \(r\times s\) in the exponential to remind you that \(r\times s = n\). Therefore, the last line translates to \[ \exp(-2\pi i \,p\, rt_1/n) = \exp(-2\pi i \,p_1t_1)\exp(-2\pi i\,p_0\,rt_1/n). \] However, we know that \(\exp(-2\pi ip_1 t_1)=1\) for any integer values of \(p_1\) and \(t_1\). Therefore, we then have \[ \exp(-2\pi i \,p\, rt_1/n) = \exp(-2\pi i\,p_0\,rt_1/n) \] Putting things back together, we now have \[\begin{eqnarray*} z_p & = & \sum_{t_0=0}^{r-1} \sum_{t_1=0}^{s-1} y_{rt_1+t_0}\exp(-2\pi i p t_0/n)\exp(-2\pi i p_0rt_1/n)\\ & = & \sum_{t_0=0}^{r-1} \left[ \exp(-2\pi i p t_0/n) \left( \sum_{t_1=0}^{s-1} y_{rt_1+t_0}\exp(-2\pi i p_0rt_1/n) \right) \right] \end{eqnarray*}\]

Now note that the expression in the inner set of parentheses does not depend on \(p_1\) and is only a function of \(p_0\) and \(t_0\). If we denote this constant as \(C(p_0, t_0)\), we can write \[ z_p = \sum_{t_0=0}^{r-1} C(p_0, t_0) \exp(-2\pi i p t_0/n). \] Note that \(C(p_0, t_0)\) also depends on \(r\) and a subset of values of \(y_0,\dots,y_{n-1}\).

If we look at the formula above (and bear in mind that \(p=0,\dots,n-1\)), we can do a little bookkeeping to see how many computations will be required to compute all the Fourier coefficients.

  1. To compute the \(z_p\)s, there are \(s\times r\) values of \(C(p_0, t_0)\) to compute (because \(p_0=0,\dots,s-1\) and \(t_0=0,\dots, r-1\)) and each value of \(C(p_0, t_0)\) requires \(s\) multiplications, giving us \(rs^2 = (rs)\times s = ns\) computations. The key here is that once we compute \(C(p_0, t_0)\) for every pair \(p_0\) and \(t_0\), they do not need to be computed again (they can be stored).

  2. Given the values of \(C(p_0, t_0)\), each \(z_p\) requires \(r\) multiplications and there are \(n\) different \(z_p\)s, giving us \(nr\) computations.

Adding the two sets of computations together gives us \(n(r + s)\) total. Recall that the naive Fourier transform algorithm requires \(O(n (r\times s)) = O(n^2)\) computations. Reducing the complexity down to \(O(n(r+s))\) is a major reduction.
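The two-stage bookkeeping above translates directly into code. The following Python sketch (the function name `fft_two_factor` and its structure are invented for this illustration) computes all of the \(C(p_0, t_0)\) values once in a first pass and then combines them, exactly as in steps 1 and 2; note that \(p_0 = p \bmod s\) because \(p = sp_1 + p_0\):

```python
import cmath

def fft_two_factor(y, r, s):
    """One Cooley-Tukey step for n = r*s: precompute the inner sums
    C(p0, t0), then combine them to get each z_p."""
    n = r * s
    assert len(y) == n
    # Stage 1 (n*s multiplications):
    # C[p0][t0] = sum over t1 of y[r*t1 + t0] * exp(-2i*pi*p0*r*t1/n)
    C = [[sum(y[r * t1 + t0] * cmath.exp(-2j * cmath.pi * p0 * r * t1 / n)
              for t1 in range(s))
          for t0 in range(r)]
         for p0 in range(s)]
    # Stage 2 (n*r multiplications):
    # z_p = sum over t0 of C[p0][t0] * exp(-2i*pi*p*t0/n), with p0 = p mod s
    return [sum(C[p % s][t0] * cmath.exp(-2j * cmath.pi * p * t0 / n)
                for t0 in range(r))
            for p in range(n)]
```

Checking this against a naive transform for, say, a series of length \(6\) with \(r=2\) and \(s=3\) produces identical coefficients, while using \(n(r+s)\) rather than \(n^2\) multiplications.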

Finally, note that there is nothing in the description of the FFT algorithm that requires any special ordering of the values of \(y_t\) or anything special about \(r\) or \(s\). In particular, if the values of \(r\) and \(s\) can themselves be factored into the product of two integers, then the entire algorithm can be recursively applied until there is no more factorization possible.

In the event that \(n\) is a power of \(2\), say \(n=2^K\) for some integer \(K>0\), then we could take \(r = 2\) and \(s = 2^{K-1}\). But then \(s\) itself could be factored into \(s=2\times 2^{K-2}\), etc. If the sample size \(n\) is highly composite, meaning that it can be decomposed into many factors, then the complexity of the FFT is \(O(n\log n)\). If \(n\) is in fact a power of \(2\), then the complexity is \(O(n\log_2 n)\), where \(\log_2 n=K\) is the number of times \(n\) can be factored into two integers.
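The fully recursive power-of-2 case can be sketched compactly. The Python function below (an illustrative textbook radix-2 implementation, not the code R uses internally) takes \(r=2\) at every level, splitting the series into even- and odd-indexed halves and recursing until the length is 1:

```python
import cmath

def fft_pow2(y):
    """Recursive radix-2 Cooley-Tukey FFT; len(y) must be a power of 2."""
    n = len(y)
    if n == 1:
        return list(y)
    even = fft_pow2(y[0::2])   # transform of even-indexed points (t_0 = 0)
    odd = fft_pow2(y[1::2])    # transform of odd-indexed points (t_0 = 1)
    z = [0j] * n
    for p in range(n // 2):
        # Combine with the twiddle factor exp(-2i*pi*p/n)
        w = cmath.exp(-2j * cmath.pi * p / n) * odd[p]
        z[p] = even[p] + w
        z[p + n // 2] = even[p] - w
    return z
```

For the length-4 example above, `fft_pow2([1.0, 2.0, 3.0, 4.0])` reproduces the same coefficients as a naive transform, with only \(K=\log_2 n\) levels of \(O(n)\) work.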

The benefit of having \(n\) be a power of \(2\) is so great that some implementations of the algorithm will pad the time series with \(0\)s in order to make the total length a power of \(2\). In R, this is the default for the periodogram function spec.pgram(), which uses the fft() function. There is even a function nextn() in R which will find the next value of \(n\) that is a power of \(2\) (or, more generally, a product of other small factors, such as \(3\) or \(5\)).

So for example, if your time series has a length of \(n=900\), the next power of \(2\) is

nextn(900, 2)
[1] 1024

You could then pad your time series with a vector of zeros of sufficient length to bring it up to the required length.
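As a sketch of the padding arithmetic (in Python, with `next_pow2` an illustrative stand-in for R's nextn(n, 2)):

```python
def next_pow2(n):
    """Smallest power of 2 that is >= n (what nextn(n, 2) returns in R)."""
    m = 1
    while m < n:
        m *= 2
    return m

n = 900
padded_length = next_pow2(n)    # 1024
n_zeros = padded_length - n     # pad the series with this many 0s
print(padded_length, n_zeros)   # -> 1024 124
```

Keep in mind that padding changes the set of frequencies at which the coefficients are computed (cycles per 1024 observations rather than per 900), which is usually acceptable for estimating the spectrum.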