\( \newcommand{\bm}[1]{\boldsymbol{#1}} \newcommand{\textm}[1]{\textsf{#1}} \def\T{{\mkern-2mu\raise-1mu\mathsf{T}}} \newcommand{\R}{\mathbb{R}} % real numbers \newcommand{\E}{{\rm I\kern-.2em E}} \newcommand{\w}{\bm{w}} % bold w \newcommand{\bmu}{\bm{\mu}} % bold mu \newcommand{\bSigma}{\bm{\Sigma}} % bold mu \newcommand{\bigO}{O} %\mathcal{O} \renewcommand{\d}[1]{\operatorname{d}\!{#1}} \)

Exercises

Exercise 13.1 (Indices and ETFs) Download price data corresponding to some financial indices (e.g., the S&P 500, Dow Jones Industrial Average, Nasdaq) and some ETFs that track each of these indices (e.g., the SPY for the S&P 500 index). Plot each index along with the corresponding ETFs on both linear and logarithmic scales. Assess the tracking capabilities.

Exercise 13.2 (Active versus passive investments) Download price data corresponding to some mutual funds and compare them with an appropriate financial index. Plot the price time series and compute some performance measure, such as the Sharpe ratio, to compare their performance. Do these results support the efficient-market hypothesis, promoted by Fama, or the view of inefficient and irrational markets, promoted by Shiller?
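As a possible starting point for the performance comparison, the following minimal sketch computes an annualized Sharpe ratio from a price series. The function name `sharpe_ratio`, the 252 trading periods per year, and the zero risk-free rate are assumptions made for illustration, and the two price series are synthetic placeholders standing in for the downloaded fund and index data.

```python
import numpy as np

def sharpe_ratio(prices, periods_per_year=252, rf_per_period=0.0):
    """Annualized Sharpe ratio computed from a price series (illustrative helper)."""
    prices = np.asarray(prices, dtype=float)
    returns = prices[1:] / prices[:-1] - 1.0      # simple per-period returns
    excess = returns - rf_per_period              # risk-free rate assumed constant
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

# Synthetic placeholder data (NOT real market data)
rng = np.random.default_rng(6)
fund = 100 * np.cumprod(1 + rng.normal(0.0004, 0.010, 500))
index = 100 * np.cumprod(1 + rng.normal(0.0004, 0.009, 500))

sr_fund, sr_index = sharpe_ratio(fund), sharpe_ratio(index)
print(f"fund: {sr_fund:.2f}, index: {sr_index:.2f}")
```

The Sharpe ratio depends only on returns, so rescaling a price series leaves it unchanged; this makes it suitable for comparing funds and indices quoted at very different price levels.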

Exercise 13.3 (Sparse regression via \(\ell_1\)-norm) Generate an underdetermined system of linear equations \(\bm{A}\bm{x} = \bm{b}\) with \(\bm{A} \in \R^{5\times 10}\). Then, solve the following sparse underdetermined system of linear equations via brute force (i.e., trying all possible \(2^{10}\) patterns for the variable \(\bm{x}\)): \[ \begin{array}{ll} \underset{\bm{x}}{\textm{minimize}} & \|\bm{x}\|_0\\ \textm{subject to} & \bm{A}\bm{x} = \bm{b}. \end{array} \] Finally, solve the following linear program and compare the solution with the previous one: \[ \begin{array}{ll} \underset{\bm{x}}{\textm{minimize}} & \|\bm{x}\|_1\\ \textm{subject to} & \bm{A}\bm{x} = \bm{b}. \end{array} \]
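One way to pose the \(\ell_1\)-norm problem as a linear program is to split \(\bm{x} = \bm{u} - \bm{v}\) with \(\bm{u}, \bm{v} \ge \bm{0}\) and minimize \(\bm{1}^\T(\bm{u} + \bm{v})\). The sketch below does this with SciPy's `linprog`; the helper name `l1_min` and the synthetic sparse vector `x_true` are assumptions made for illustration only.

```python
import numpy as np
from scipy.optimize import linprog

def l1_min(A, b):
    """Solve min ||x||_1 s.t. Ax = b via the split x = u - v with u, v >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)                   # objective: sum(u) + sum(v)
    A_eq = np.hstack([A, -A])            # encodes A (u - v) = b
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]

# Synthetic underdetermined system with a 2-sparse ground truth (assumption)
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 10))
x_true = np.zeros(10)
x_true[[2, 7]] = [1.5, -2.0]
b = A @ x_true

x_l1 = l1_min(A, b)
print(np.round(x_l1, 4))
```

Because `x_true` is feasible, the optimal \(\ell_1\)-norm can never exceed \(\|\bm{x}_\textm{true}\|_1\); whether the LP actually recovers the sparsest solution is what the comparison with brute force is meant to reveal.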

Exercise 13.4 (Sparse least squares) Generate an overdetermined system of linear equations \(\bm{A}\bm{x} = \bm{b}\) with \(\bm{A} \in \R^{10\times 5}\). Consider the resolution of the sparse regression problem \[ \begin{array}{ll} \underset{\bm{x}}{\textm{minimize}} & \left\Vert \bm{A}\bm{x} - \bm{b} \right\Vert_2^2\\ \textm{subject to} & \|\bm{x}\|_0 \leq k \end{array} \] via the following list of methods and plot the trade-off curve of regression error versus sparsity level for each method:

  1. Brute force (i.e., trying all possible \(2^{5}\) patterns for the variable \(\bm{x}\)).

  2. \(\ell_1\)-norm approximation: \[ \begin{array}{ll} \underset{\bm{x}}{\textm{minimize}} & \left\Vert \bm{A}\bm{x} - \bm{b} \right\Vert_2^2 + \lambda \|\bm{x}\|_1. \end{array} \]

  3. Concave approximation using a general-purpose nonlinear solver: \[ \begin{array}{ll} \underset{\bm{x}}{\textm{minimize}} & \begin{aligned}[t] \left\Vert \bm{A}\bm{x} - \bm{b} \right\Vert_2^2 + \lambda \sum_{i=1}^N \textm{log}\left(1 + \frac{|x_i|}{\varepsilon}\right) \end{aligned}. \end{array} \]

  4. Concave approximation again, but using the iterative reweighted \(\ell_1\)-norm method.
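The brute-force method in item 1 can be sketched as an exhaustive search over supports, solving an ordinary least-squares problem on each candidate subset of columns. The function name `best_subset` and the random data are assumptions for illustration; this is a minimal sketch, not a reference implementation.

```python
import itertools
import numpy as np

def best_subset(A, b, k):
    """Return (error, x) minimizing ||Ax - b||^2 over supports of size <= k."""
    m, n = A.shape
    best_err, best_x = float(b @ b), np.zeros(n)    # empty support: x = 0
    for size in range(1, k + 1):
        for support in itertools.combinations(range(n), size):
            cols = list(support)
            # least squares restricted to the selected columns
            coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
            err = float(np.sum((A[:, cols] @ coef - b) ** 2))
            if err < best_err:
                x = np.zeros(n)
                x[cols] = coef
                best_err, best_x = err, x
    return best_err, best_x

# Synthetic overdetermined system (assumption)
rng = np.random.default_rng(1)
A = rng.standard_normal((10, 5))
b = rng.standard_normal(10)

# Regression error for each sparsity level k = 0, ..., 5
errors = [best_subset(A, b, k)[0] for k in range(6)]
print(np.round(errors, 4))
```

The resulting list `errors` is the brute-force trade-off curve; it is non-increasing in \(k\) by construction and serves as the baseline against which the approximate methods can be compared.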

Exercise 13.5 (Cap-weighted indices) The portfolio of a cap-weighted index is defined in terms of the market capitalization. Denote by \(\bm{p}_t\) the prices of the \(N\) assets at time \(t\) and by \(\bm{n}\) the number of outstanding shares of the \(N\) assets. The capital portfolio of the assets is defined to be proportional to the market capitalization \(\bm{n} \odot \bm{p}_t\), which leads to the normalized portfolio \[ \bm{b}_t = \frac{\bm{n} \odot \bm{p}_t}{\bm{n}^\T\bm{p}_t}. \] Show that this normalized portfolio can also be expressed as \[ \bm{b}_t = \frac{\bm{b}_{t-1} \odot \left(\bm{1} + \bm{r}_t\right)}{\bm{b}_{t-1}^\T \left(\bm{1} + \bm{r}_t\right)}, \] where the returns are defined (with elementwise division) as \[\bm{r}_t = \frac{\bm{p}_t - \bm{p}_{t-1}}{\bm{p}_{t-1}} = \frac{\bm{p}_t}{\bm{p}_{t-1}} - \bm{1}.\]
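Before attempting the derivation, it may help to check the identity numerically. The sketch below is only a sanity check, not a proof; the share counts and prices are synthetic, and the number of outstanding shares is assumed to stay constant between \(t-1\) and \(t\).

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
n_shares = rng.uniform(1e6, 1e7, size=N)          # outstanding shares (assumed constant)
p_prev = rng.uniform(10, 100, size=N)             # prices at time t-1
p_now = p_prev * (1 + rng.normal(0, 0.02, N))     # prices at time t

# Direct definition of the cap-weighted portfolio at t-1 and t
b_prev = n_shares * p_prev / (n_shares @ p_prev)
b_direct = n_shares * p_now / (n_shares @ p_now)

# Recursive expression in terms of the returns
r = p_now / p_prev - 1
b_recursive = b_prev * (1 + r) / (b_prev @ (1 + r))

print(np.max(np.abs(b_direct - b_recursive)))
```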

Exercise 13.6 (Tracking error measures) Download price data corresponding to some financial index (e.g., the S&P 500, Dow Jones Industrial Average, Nasdaq) and some ETFs that track the index (e.g., the SPY for the S&P 500 index). Compute different tracking error measures, namely: the \(\ell_2\)-norm tracking error, the downside risk, the \(\ell_1\)-norm tracking error, and the Huberized tracking error. Finally, plot a histogram of the tracking errors as a more complete picture of the tracking performance (note that the previous error measures are summarizations of the histogram).
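The four measures can be computed from the per-period differences between index and ETF returns, as in the minimal sketch below. The function name `tracking_measures`, the \(1/T\) normalization of every measure, the specific Huber form (quadratic for \(|e| \le M\), linear beyond), and the threshold `M` are assumptions chosen for illustration; the return series are synthetic placeholders.

```python
import numpy as np

def tracking_measures(r_index, r_etf, M=1e-3):
    """Summaries of the tracking errors e_t = r_index_t - r_etf_t (illustrative)."""
    e = np.asarray(r_index) - np.asarray(r_etf)    # positive = ETF underperforms
    T = e.size
    # Huber penalty (assumed form): e^2 if |e| <= M, else M (2|e| - M)
    huber = np.where(np.abs(e) <= M, e**2, M * (2 * np.abs(e) - M))
    return {
        "l2": np.sum(e**2) / T,
        "downside": np.sum(np.maximum(e, 0) ** 2) / T,
        "l1": np.sum(np.abs(e)) / T,
        "huber": np.sum(huber) / T,
    }

# Synthetic placeholder returns (NOT real market data)
rng = np.random.default_rng(3)
r_index = rng.normal(0, 0.01, 250)
r_etf = r_index + rng.normal(0, 0.0005, 250)

measures = tracking_measures(r_index, r_etf)
for name, value in measures.items():
    print(f"{name}: {value:.3e}")
```

Note that the downside risk and the Huberized error can never exceed the \(\ell_2\)-norm tracking error under these definitions, since each term is bounded by the corresponding squared error.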

Exercise 13.7 (Two-stage index tracking methods) Download price data corresponding to some financial index, such as the S&P 500, and the corresponding constituent \(N\) assets for some period of time. Then, construct the benchmark return vector \(\bm{r}^\textm{b}\), the assets’ return matrix \(\bm{X}\), and formulate the sparse index tracking problem \[ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \begin{aligned}[t] \frac{1}{T}\left\|\bm{r}^\textm{b} - \bm{X}\w\right\|^2_2 \end{aligned}\\ \textm{subject to} & \begin{aligned}[t] \bm{1}^\T\w=1, \; \w\ge\bm{0} \end{aligned}\\ & \begin{aligned}[t] \|\w\|_0 \le K. \end{aligned} \end{array} \]

  1. Solve the problem via a naive two-stage approach: simply select the \(K\) active assets with some heuristic and then renormalize so that \(\bm{1}^\T\w=1\).

  2. Solve the problem via a two-stage approach with refitting of weights: select the \(K\) active assets as before and then solve the convex regression problem with the selected assets.

Plot the trade-off curve of regression error versus sparsity level \(K\) for each method.
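The naive approach in item 1 might be sketched as follows, assuming the selection heuristic is to keep the \(K\) assets with the largest weights in the index. The vector `b` of index weights and the function name `naive_two_stage` are hypothetical; other heuristics are equally valid for the exercise.

```python
import numpy as np

def naive_two_stage(b, K):
    """Keep the K largest index weights and renormalize them to sum to one."""
    b = np.asarray(b, dtype=float)
    active = np.argsort(b)[-K:]          # indices of the K largest weights
    w = np.zeros_like(b)
    w[active] = b[active]
    return w / w.sum()

# Hypothetical index weights for N = 6 assets
b = np.array([0.30, 0.25, 0.20, 0.12, 0.08, 0.05])
w_naive = naive_two_stage(b, K=3)
print(w_naive)
```

For the refitting variant in item 2, the same `active` set would be kept but the weights would be re-estimated by solving the constrained regression restricted to those assets.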

Exercise 13.8 (Sparse index tracking methods via concave sparsity approximation) Download price data corresponding to some financial index, such as the S&P 500, and the corresponding constituent \(N\) assets for some period of time. Then, construct the benchmark return vector \(\bm{r}^\textm{b}\), the assets’ return matrix \(\bm{X}\), and formulate the sparse index tracking problem \[ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \begin{aligned}[t] \frac{1}{T}\left\|\bm{r}^\textm{b} - \bm{X}\w\right\|^2_2 + \lambda \|\w\|_0 \end{aligned}\\ \textm{subject to} & \begin{aligned}[t] \bm{1}^\T\w=1, \; \w\ge\bm{0} \end{aligned} \end{array} \] for different values of the hyper-parameter \(\lambda\).

  1. Approximate the sparsity regularizer with the concave log-function: \[ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \begin{aligned}[t] \frac{1}{T}\left\|\bm{r}^\textm{b} - \bm{X}\w\right\|^2_2 + \lambda \sum_{i=1}^N \textm{log}\left(1 + \frac{|w_i|}{\varepsilon}\right) \end{aligned}\\ \textm{subject to} & \begin{aligned}[t] \bm{1}^\T\w=1, \; \w\ge\bm{0} \end{aligned}. \end{array} \] Then solve the problem with a general-purpose nonconvex solver.

  2. Apply the majorization-minimization approach to get the iterative reweighted \(\ell_1\)-norm method that sequentially solves, for \(k=0,1,2,\dots\), the following problem: \[ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \begin{aligned}[t] \frac{1}{T}\left\|\bm{r}^\textm{b} - \bm{X}\w\right\|^2_2 + \lambda \sum_{i=1}^N \alpha_i^k|w_i|\end{aligned}\\ \textm{subject to} & \begin{aligned}[t] \bm{1}^\T\w=1, \; \w\ge\bm{0}, \end{aligned} \end{array} \] where \[ \alpha_i^k = \frac{1}{\varepsilon + \left|w_i^k\right|}. \]

Plot the trade-off curve of regression error versus sparsity level for each method (by varying the hyper-parameter \(\lambda\)).
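The iterative reweighted \(\ell_1\)-norm method of item 2 can be sketched as an outer loop that updates the weights \(\alpha_i^k\) and an inner solver for the convex subproblem. Since \(\w \ge \bm{0}\), the penalty reduces to the linear term \(\lambda \sum_i \alpha_i^k w_i\). In this sketch the inner problem is solved approximately by projected gradient descent onto the simplex, which is a choice made here for self-containedness and not necessarily the solver intended by the exercise; all function names, iteration counts, and the synthetic data are assumptions.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1} (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def reweighted_l1_tracking(X, r_b, lam, eps=1e-3, n_outer=10, n_inner=500):
    """Iterative reweighted l1 method; inner problem via projected gradient."""
    T, N = X.shape
    L = (2.0 / T) * np.linalg.norm(X, 2) ** 2     # Lipschitz constant of the gradient
    w = np.full(N, 1.0 / N)                       # start from the uniform portfolio
    for _ in range(n_outer):
        alpha = 1.0 / (eps + np.abs(w))           # weights alpha_i^k from current w^k
        for _ in range(n_inner):
            grad = (2.0 / T) * (X.T @ (X @ w - r_b)) + lam * alpha
            w = project_simplex(w - grad / L)
    return w

# Synthetic placeholder data: benchmark built from 4 of 20 assets (assumption)
rng = np.random.default_rng(4)
T, N = 200, 20
X = rng.normal(0.0, 0.01, size=(T, N))
w_true = np.zeros(N)
w_true[:4] = [0.4, 0.3, 0.2, 0.1]
r_b = X @ w_true

w_hat = reweighted_l1_tracking(X, r_b, lam=1e-6)
print("active assets:", np.count_nonzero(w_hat > 1e-6))
```

Running this for a grid of \(\lambda\) values and recording the number of active assets together with the tracking error yields the points of the trade-off curve.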

Exercise 13.9 (Sparse index tracking for downside risk) Download price data corresponding to some financial index, such as the S&P 500, and the corresponding constituent \(N\) assets for some period of time. Then, construct the benchmark return vector \(\bm{r}^\textm{b}\), the assets’ return matrix \(\bm{X}\), and formulate the sparse index tracking problem \[ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \begin{aligned}[t] \frac{1}{T}\left\|\left(\bm{r}^\textm{b} - \bm{X}\w\right)^+\right\|^2_2 + \lambda \|\w\|_0 \end{aligned}\\ \textm{subject to} & \begin{aligned}[t] \bm{1}^\T\w=1, \; \w\ge\bm{0} \end{aligned} \end{array} \] for different values of the hyper-parameter \(\lambda\).

  1. Approximate the sparsity regularizer with the concave log-function and solve the problem with a general-purpose nonconvex solver.

  2. Apply the majorization-minimization approach to get the iterative reweighted \(\ell_1\)-norm method that solves sequentially the following convex problem: \[ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \begin{aligned}[t] \frac{1}{T}\left\|\left(\bm{r}^\textm{b} - \bm{X}\w\right)^+\right\|^2_2 + \lambda \sum_{i=1}^N \alpha_i^k|w_i|\end{aligned}\\ \textm{subject to} & \begin{aligned}[t] \bm{1}^\T\w=1, \; \w\ge\bm{0}, \end{aligned} \end{array} \] where \[ \alpha_i^k = \frac{1}{\varepsilon + \left|w_i^k\right|}. \]

  3. Apply the majorization-minimization approach fully to get the iterative reweighted \(\ell_1\)-norm method that solves sequentially the following convex problem: \[ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \begin{aligned}[t] \frac{1}{T}\left\|\big(\tilde{\bm{r}}^\textm{b}\big)^k - \bm{X}\w\right\|^2_2 + \lambda \sum_{i=1}^N \alpha_i^k|w_i|\end{aligned}\\ \textm{subject to} & \begin{aligned}[t] \bm{1}^\T\w=1, \; \w\ge\bm{0}, \end{aligned} \end{array} \] where now \[\big(\tilde{\bm{r}}^\textm{b}\big)^k = \bm{r}^\textm{b} + \left(\bm{X}\w^k - \bm{r}^\textm{b}\right)^+.\]

Plot the trade-off curve of regression error versus sparsity level for each method (by varying the hyper-parameter \(\lambda\)).
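The modified benchmark \(\big(\tilde{\bm{r}}^\textm{b}\big)^k\) in item 3 turns the downside-risk term into an ordinary least-squares term that upper-bounds it and coincides with it at \(\w = \w^k\). The following sketch checks both properties numerically on synthetic data; the function names and the random portfolios are assumptions for illustration.

```python
import numpy as np

def modified_benchmark(X, r_b, w_k):
    """r_tilde^k = r_b + (X w^k - r_b)^+ as defined in item 3."""
    return r_b + np.maximum(X @ w_k - r_b, 0.0)

def downside_risk(X, r_b, w):
    """(1/T) || (r_b - X w)^+ ||_2^2"""
    return np.sum(np.maximum(r_b - X @ w, 0.0) ** 2) / len(r_b)

def surrogate(X, r_b, w, w_k):
    """(1/T) || r_tilde^k - X w ||_2^2, the least-squares majorizer around w^k."""
    return np.sum((modified_benchmark(X, r_b, w_k) - X @ w) ** 2) / len(r_b)

# Synthetic placeholder data (NOT real market data)
rng = np.random.default_rng(5)
T, N = 100, 8
X = rng.normal(0, 0.01, (T, N))
r_b = X @ np.full(N, 1 / N) + rng.normal(0, 0.001, T)
w_k = rng.dirichlet(np.ones(N))        # current iterate
w_other = rng.dirichlet(np.ones(N))    # any other feasible portfolio

print(surrogate(X, r_b, w_k, w_k), downside_risk(X, r_b, w_k))          # equal
print(surrogate(X, r_b, w_other, w_k), downside_risk(X, r_b, w_other))  # surrogate >= DR
```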

Exercise 13.10 (FDR-controlling method for sparse index tracking) Download price data corresponding to some financial index, such as the S&P 500, and the corresponding constituent \(N\) assets for some period of time. Then, construct the benchmark return vector \(\bm{r}^\textm{b}\), the assets’ return matrix \(\bm{X}\), and formulate the sparse index tracking problem \[ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \begin{aligned}[t] \frac{1}{T}\left\|\bm{r}^\textm{b} - \bm{X}\w\right\|^2_2 + \lambda \|\w\|_0 \end{aligned}\\ \textm{subject to} & \begin{aligned}[t] \bm{1}^\T\w=1, \; \w\ge\bm{0}. \end{aligned} \end{array} \]

  1. Approximate the sparsity regularizer with the \(\ell_1\)-norm: \[ \begin{array}{ll} \underset{\w}{\textm{minimize}} & \begin{aligned}[t] \frac{1}{T}\left\|\bm{r}^\textm{b} - \bm{X}\w\right\|^2_2 + \lambda \|\w\|_1 \end{aligned}\\ \textm{subject to} & \begin{aligned}[t] \bm{1}^\T\w=1, \; \w\ge\bm{0} \end{aligned}. \end{array} \] Then solve the problem for different values of \(\lambda\) and plot the trade-off curve of regression error versus sparsity level.

  2. Employ the T-Rex method to choose automatically the active assets with FDR control.