2.2 Rates of Convergence
One of the ways in which algorithms will be compared is via their rates of convergence to some limiting value. Typically, we have an interative algorithm that is trying to find the maximum/minimum of a function and we want an estimate of how long it will take to reach that optimal value. There are three rates of convergence that we will focus on here—linear, superlinear, and quadratic—which are ordered from slowest to fastest.
In our context, rates of convergence are typically determined by how much information about the target function \(f\) we use in the updating process of the algorithm. Algorithms that use little information about \(f\), such as the bisection algorithm, converge slowly. Algorithms that require more information about \(f\), such as derivative information, typically converge more quickly. There is no free lunch!
2.2.1 Linear convergence
Suppose we have a sequence \(\{x_n\}\) such that \(x_n\rightarrow x_\infty\) in \(\mathfrak{R}^k\). We say the convergence is linear if there exists \(r\in(0, 1)\) such that
\[ \frac{\|x_{n+1}-x_\infty\|}{\|x_n-x_\infty\|}\leq r \] for all \(n\) sufficiently large.
2.2.2 Superlinear Convergence
We say a sequence \(\{x_n\}\) converges to \(x_\infty\) superlinearly if we have \[ \lim_{n\rightarrow\infty} \frac{\|x_{n+1}-x_\infty\|}{\|x_n-x_\infty\|} = 0 \]
The sequence above does not converge superlinearly because the ratio is always constant, and so never can converge to zero as \(n\rightarrow\infty\). However, the sequence \(x_n = 1 + \left(\frac{1}{n}\right)^n\) converges superlinearly to \(1\).
2.2.3 Quadratic Convergence
Quadratic convergence is the fastest form of convergence that we will discuss here and is generally considered desirable if possible to achieve. We say the sequence converges at a quadratic rate if there exists some constant \(0 < M < \infty\) such that \[ \frac{\|x_{n+1}-x_\infty\|}{\|x_n-x_\infty\|^2}\leq M \] for all \(n\) sufficiently large.
Extending the examples from above, the sequence \(x_n = 1 + \left(\frac{1}{n}\right)^{2^n}\) converges quadratically to \(1\). With this sequence, we have
\[ \frac{\|x_{n+1}-x_\infty\|}{\|x_n-x_\infty\|^2} = \frac{\left(\frac{1}{n+1}\right)^{2^{n+1}}}{\left(\frac{1}{n}\right)^{(2^n)2}} = \left(\frac{n}{n+1}\right)^{2^{n+1}} \leq 1 \]
2.2.4 Example: Bisection Algorithm
For the bisection algorithm, the error that we make in estimating the root is \(x_n = |b_n - a_n|\), where \(a_n\) and \(b_n\) represent the end points of the bracketing interval at iteration \(n\). However, we know that the size of the interval in the bisection algorithm decreases by a half at each iteration. Therefore, we can write \(x_n = 2^{-n}|b_0 - a_0|\) and we can write the rate of convergence as
\[ \frac{\|x_{n+1}-x_\infty\|}{\|x_n-x_\infty\|} = \frac{x_{n+1}}{x_n} = \frac{2^{-(n+1)}(b_0-a_0)}{2^{-n}(b_0-a_0)} = \frac{1}{2} \] Therefore, the error of the bisection algorithm converges linearly to \(0\).