7.3 Utility-Based Portfolios

$\newcommand{\bm}[1]{\boldsymbol{#1}} \newcommand{\textm}[1]{\textsf{#1}} \newcommand{\textnormal}[1]{\textsf{#1}} \def\T{{\mkern-2mu\raise-1mu\mathsf{T}}} \newcommand{\R}{\mathbb{R}} % real numbers \newcommand{\E}{{\rm I\kern-.2em E}} \newcommand{\w}{\bm{w}} % bold w \newcommand{\bmu}{\bm{\mu}} % bold mu \newcommand{\bSigma}{\bm{\Sigma}} % bold mu \newcommand{\bigO}{O} %\mathcal{O} \renewcommand{\d}[1]{\operatorname{d}\!{#1}}$

All the previous portfolio formulations in Sections 7.1 and 7.2 are based on some judicious combination of the mean and variance of the returns $R^\textm{portf}_t = \w^\T\bm{r}_t$ (here, $\bm{r}_t$ denotes linear returns). However, it is possible to express the interest of the investor in a more general way via utility functions.

7.3.1 Kelly Criterion Portfolio

In 1956, a scientist working for Bell Labs, John Larry Kelly, Jr., brought together game theory and information theory (Kelly, Jr., 1956). He showed that in order to achieve maximum growth of wealth, a gambler should place bets that maximize the expected value of the logarithm of the capital, usually referred to as the Kelly criterion. The Kelly criterion was applied to portfolio design in Markowitz (1959); see also Thorp (1971) and Thorp (1997).³⁵

Recall from (6.8) that, for a given fixed portfolio $\w$, the returns are $R^\textm{portf}_t = \w^\T\bm{r}_t$. From the geometric compounding of the portfolio wealth or NAV in (6.6), we can write the wealth accumulated during the periods $t=1,\dots,T$ as $W_T = W_0\prod_{t=1}^T \frac{W_t}{W_{t-1}} = W_0\prod_{t=1}^T \left(1 + \w^\T\bm{r}_t\right),$ where $W_0$ is the initial wealth and $W_t$ the wealth at time period $t$.

It turns out that the wealth grows exponentially as $W_t\sim e^{t\times G}$ (Cover and Thomas, 1991), where the exponent $G$ is the exponential rate of growth or, simply, growth rate: $G = \underset{T\rightarrow\infty}{\textm{lim}} \;\textm{log}\left(\frac{W_T}{W_0}\right)^{1/T}.$ The growth rate can be estimated asymptotically by the law of large numbers as $G = \underset{T\rightarrow\infty}{\textm{lim}} \; \frac{1}{T} \sum_{t=1}^T \textm{log}\left(1 + \w^\T\bm{r}_t\right) = \E\left[\textm{log}\left(1 + \w^\T\bm{r}\right)\right],$ where $\bm{r}$ is a random variable with the same distribution as each $\bm{r}_t$ .
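As a minimal numerical sketch of this identity (not part of the original text), the following Python snippet compounds the wealth of a fixed portfolio over synthetic returns and checks that $\frac{1}{T}\textm{log}(W_T/W_0)$ coincides with the sample average of $\textm{log}(1 + \w^\T\bm{r}_t)$. The i.i.d. Gaussian returns, the seed, the portfolio, and all numerical values are assumptions chosen only for illustration.

```python
import numpy as np

# Synthetic i.i.d. linear returns (assumed values, for illustration only)
rng = np.random.default_rng(42)
T, N = 5000, 3
mu = np.array([0.0005, 0.0008, 0.0003])       # assumed mean returns per period
sigma = np.array([0.010, 0.015, 0.008])       # assumed volatilities per period
R = mu + sigma * rng.standard_normal((T, N))  # T x N matrix whose rows are r_t

w = np.array([0.4, 0.3, 0.3])                 # fixed portfolio on the simplex
W0 = 1.0                                      # initial wealth

# Wealth by geometric compounding: W_T = W_0 * prod_t (1 + w' r_t)
portf_returns = R @ w
W_T = W0 * np.prod(1.0 + portf_returns)

# Growth rate computed in two equivalent ways
G_from_wealth = np.log(W_T / W0) / T
G_from_logs = np.mean(np.log1p(portf_returns))

print(f"growth rate from final wealth: {G_from_wealth:.6e}")
print(f"growth rate from average log:  {G_from_logs:.6e}")
```

For a finite $T$ the two quantities agree up to rounding error; it is only their common limit as $T\rightarrow\infty$ that equals the expectation.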

Maximizing the growth rate effectively maximizes the long-term wealth and it is a very compelling choice for portfolio design. We call this formulation the Kelly criterion portfolio: $\begin{equation} \begin{array}{ll} \underset{\w}{\textm{maximize}} & \E\left[\textm{log}\left(1 + \w^\T\bm{r}\right)\right]\\ \textm{subject to} & \bm{1}^\T\w=1, \quad \w\ge\bm{0}. \end{array} \tag{7.13} \end{equation}$ Good and bad properties of the Kelly criterion are discussed in MacLean et al. (2010).

Problem (7.13) is convex: the objective to be maximized is concave, since it is the expectation of the (concave) log of an affine function of $\w$, and the constraints are linear. In practice, however, we need to find an appropriate way to deal with the expected value in the objective function, as we discuss next via sample averages, the exponential cone, and other approximations.

Solving the Kelly Criterion Portfolio Directly via Sample Average

In practice, the expectation in problem (7.13) can be approximated by the sample mean: $\E\left[\textm{log}\left(1 + \w^\T\bm{r}\right)\right] \approx \frac{1}{T} \sum_{t=1}^T \textm{log}\left(1 + \w^\T\bm{r}_t\right).$ However, finding a solver that can deal directly with the log function may be challenging.
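When no solver with native support for the log is at hand, one option is a simple iterative scheme. The sketch below uses the multiplicative update associated with Cover's algorithm for log-optimal portfolios, which is a swapped-in technique and not the method of this section. It relies on the fact that $1 + \w^\T\bm{r}_t = \w^\T(\bm{1} + \bm{r}_t)$ whenever $\bm{1}^\T\w=1$. The function names, the lognormal synthetic data, and the fixed number of iterations are assumptions made for illustration.

```python
import numpy as np

def kelly_sample_average(R, num_iter=500):
    """Approximately maximize (1/T) sum_t log(1 + w' r_t) over the simplex
    with a multiplicative update (illustrative, not tuned for convergence)."""
    T, N = R.shape
    X = 1.0 + R                  # gross returns; assumed strictly positive
    w = np.full(N, 1.0 / N)      # start from the uniform portfolio
    for _ in range(num_iter):
        # Gradient of the objective: g_i = (1/T) sum_t x_ti / (w' x_t)
        g = np.mean(X / (X @ w)[:, None], axis=0)
        w = w * g                # remains on the simplex because w' g = 1
    return w

def growth_rate(w, R):
    """Sample-average estimate of E[log(1 + w' r)]."""
    return np.mean(np.log1p(R @ w))

# Synthetic lognormal gross returns (assumed parameters), so that r_t > -1
rng = np.random.default_rng(0)
T, N = 1000, 3
m = np.array([0.03, 0.08, 0.05])   # assumed means of the log-returns
s = np.array([0.10, 0.30, 0.15])   # assumed volatilities of the log-returns
R = np.exp(m + s * rng.standard_normal((T, N))) - 1.0

w_uniform = np.full(N, 1.0 / N)
w_kelly = kelly_sample_average(R)

print("Kelly weights:", np.round(w_kelly, 4))
print("growth rate (uniform):", growth_rate(w_uniform, R))
print("growth rate (Kelly):  ", growth_rate(w_kelly, R))
```

The update keeps the weights nonnegative and summing to one without any projection step, which is why it is convenient here; its convergence can be slow, so the iteration count would need checking in any serious use.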

Solving the Kelly Criterion Portfolio via Exponential Cone Programming

Interestingly, this problem can be reformulated in terms of the exponential cone (Cajas, 2021b), leading to an exponential cone program, a class of problems for which solvers are available.

Consider problem (7.13) with the sample average approximation and with the additional slack variables $q_t$ : $\begin{array}{ll} \underset{\w, \{q_t\}}{\textm{maximize}} & \frac{1}{T}\sum_{t=1}^T q_t\\ \textm{subject to} & q_t \leq \textm{log}\left(1 + \w^\T\bm{r}_t\right), \quad t=1,\dots,T,\\ & \bm{1}^\T\w=1, \quad \w\ge\bm{0}. \end{array}$ The constraints $q_t \leq \textm{log}\left(1 + \w^\T\bm{r}_t\right)$ can be equivalently written as $\textm{exp}(q_t) \leq 1 + \w^\T\bm{r}_t$ and the Kelly portfolio can be finally written as the exponential cone program $\begin{equation} \begin{array}{ll} \underset{\w, \{q_t\}}{\textm{maximize}} & \frac{1}{T}\sum_{t=1}^T q_t\\ \textm{subject to} & (q_t, 1, 1 + \w^\T\bm{r}_t) \in \mathcal{K}_\textm{exp}, \quad t=1,\dots,T,\\ & \bm{1}^\T\w=1, \quad \w\ge\bm{0}, \end{array} \tag{7.14} \end{equation}$ where $\mathcal{K}_\textm{exp}$ is the exponential cone (Chares, 2007) defined as $\mathcal{K}_{\textm{exp}} \triangleq \big\{(a,b,c) \mid c\geq b\,e^{a/b}, b>0\big\} \cup \big\{(a,b,c) \mid a\leq0, b=0, c\geq0\big\}.$
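To make the cone constraint concrete, the following sketch implements a membership test for $\mathcal{K}_\textm{exp}$ exactly as defined above and checks, for a single period, that $(q_t, 1, 1 + \w^\T\bm{r}_t) \in \mathcal{K}_\textm{exp}$ holds precisely when $q_t \leq \textm{log}(1 + \w^\T\bm{r}_t)$. The function name, the numerical tolerance, and the value used for the portfolio return are assumptions for illustration; in practice the membership is enforced by the solver rather than tested by hand.

```python
import math

def in_exp_cone(a, b, c, tol=1e-12):
    """Membership test for K_exp = {(a,b,c): c >= b*exp(a/b), b > 0}
    union {(a,b,c): a <= 0, b = 0, c >= 0}."""
    if b > 0:
        return c >= b * math.exp(a / b) - tol
    if b == 0:
        return a <= 0 and c >= 0
    return False

# Hypothetical portfolio return for one period (assumed value of w' r_t)
x = 0.02
q_max = math.log(1 + x)   # largest slack q_t allowed by q_t <= log(1 + x)

feasible = in_exp_cone(q_max - 1e-6, 1.0, 1 + x)    # q_t slightly below the bound
infeasible = in_exp_cone(q_max + 1e-6, 1.0, 1 + x)  # q_t slightly above the bound

print(feasible)    # True
print(infeasible)  # False
```

Since the objective pushes every $q_t$ upward, each cone constraint is tight at the optimum, so the slack variables recover $\textm{log}(1 + \w^\T\bm{r}_t)$.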

Solving the Kelly Criterion Portfolio via Mean–Variance Approximations

The most common way to deal with the expectation in the objective of problem (7.13) is via a second-order Taylor approximation³⁶ around the point $\bm{r}=\bm{0}$ (Markowitz, 1959), neglecting the term $(\w^\T\bmu)^2$, which is typically much smaller than $\w^\T\bSigma\w$: $\begin{equation} \E\left[\textm{log}\left(1 + \w^\T\bm{r}\right)\right] \approx \w^\T\bmu - \frac{1}{2}\w^\T\bSigma\w, \tag{7.15} \end{equation}$ which is a surprising and beautiful justification for the mean–variance formulation in (7.1) (with $\lambda=1$), that is, $\begin{array}{ll} \underset{\w}{\textm{maximize}} & \w^\T\bmu - \frac{1}{2}\w^\T\bSigma\w\\ \textm{subject to} & \bm{1}^\T\w=1, \quad \w\ge\bm{0}. \end{array}$

It is possible to find better Taylor approximations than (7.15) around the point $\bm{r}=\bmu$ , such as (Markowitz, 1959) $\begin{equation} \E\left[\textm{log}\left(1 + \w^\T\bm{r}\right)\right] \approx \textm{log}\left(1 + \w^\T\bmu\right) - \frac{1}{2}\frac{\w^\T\bSigma\w}{\left(1+\w^\T\bmu\right)^2} \tag{7.16} \end{equation}$ or, further approximated, $\begin{equation} \E\left[\textm{log}\left(1 + \w^\T\bm{r}\right)\right] \approx \w^\T\bmu - \frac{1}{2}(\w^\T\bmu)^2 - \frac{1}{2}\frac{\w^\T\bSigma\w}{1+2\w^\T\bmu}. \tag{7.17} \end{equation}$

One can also try an approximation over an interval (as opposed to the Taylor approximations, which focus on a single point) such as the Levy–Markowitz approximation (Levy and Markowitz, 1979): $\begin{equation} \begin{aligned}[b] \E\left[\textm{log}\left(1 + \w^\T\bm{r}\right)\right] &\approx \frac{1}{2\kappa^2}\textm{log}\left(\left(1+\w^\T\bmu\right)^2 - \kappa^2 \w^\T\bSigma\w\right)\\ &\qquad + \left(1-\frac{1}{\kappa^2}\right)\textm{log}\left(1+\w^\T\bmu\right), \end{aligned} \tag{7.18} \end{equation}$ where $\kappa$ measures the width of the approximating interval in standard deviations.

However, these other approximations may not bring any benefit in practice (Pulley, 1983) due to the nonconvexity and the fact that the parameters $\bmu$ and $\bSigma$ contain estimation errors an order of magnitude larger than the potential benefit of these refined approximations. See Markowitz (2014) for a historical perspective on mean–variance approximations. Chapter 9 explores higher-order moments, that is, skewness and kurtosis, to obtain better approximations for portfolio design.
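The approximations (7.15)–(7.18) can be compared numerically against the sample average of $\textm{log}(1 + \w^\T\bm{r}_t)$. The sketch below does so on synthetic data; the Gaussian returns, the seed, the portfolio, the value of $\kappa$, and the function name are all assumptions for illustration, and the sample mean and covariance stand in for $\bmu$ and $\bSigma$.

```python
import numpy as np

def kelly_approximations(w, mu, Sigma, kappa=1.0):
    """Mean-variance approximations (7.15)-(7.18) of E[log(1 + w' r)]."""
    m = w @ mu            # portfolio mean return
    v = w @ Sigma @ w     # portfolio variance
    return {
        "(7.15)": m - 0.5 * v,
        "(7.16)": np.log1p(m) - 0.5 * v / (1 + m) ** 2,
        "(7.17)": m - 0.5 * m**2 - 0.5 * v / (1 + 2 * m),
        "(7.18)": np.log((1 + m) ** 2 - kappa**2 * v) / (2 * kappa**2)
                  + (1 - 1 / kappa**2) * np.log1p(m),
    }

# Synthetic daily-scale returns (assumed values, for illustration only)
rng = np.random.default_rng(1)
T, N = 5000, 3
R = np.array([0.0005, 0.0008, 0.0003]) \
    + np.array([0.010, 0.015, 0.008]) * rng.standard_normal((T, N))

w = np.array([0.4, 0.3, 0.3])
mu_hat = R.mean(axis=0)               # sample mean
Sigma_hat = np.cov(R, rowvar=False)   # sample covariance matrix

exact = np.mean(np.log1p(R @ w))      # sample-average objective
approx = kelly_approximations(w, mu_hat, Sigma_hat)

print(f"sample average: {exact:.8f}")
for label, value in approx.items():
    print(f"{label}: {value:.8f}  (error {value - exact:+.2e})")
```

At this return scale the approximations differ from each other only by higher-order terms, which illustrates why the refinements are easily swamped by estimation error in $\bmu$ and $\bSigma$.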

7.3.2 Expected Utility Theory

The model of rational decision-making in most of economics and statistics is expected utility theory, which was axiomatized by von Neumann and Morgenstern (1944) and Savage (1954).

In the context of portfolio design, the utility $U(\cdot)$ is a function of the random portfolio return $\w^\T\bm{r}$ and the objective is the maximization of the expected utility: $\begin{equation} \begin{array}{ll} \underset{\w}{\textm{maximize}} & \E\left[U(\w^\T\bm{r})\right]\\ \textm{subject to} & \bm{1}^\T\w=1, \quad \w\ge\bm{0}. \end{array} \tag{7.19} \end{equation}$

The Kelly criterion portfolio in (7.13) is a particular case of the expected utility maximization in (7.19) by choosing the log utility $U(x) = \textm{log}\left(1 + x\right)$ . Some (concave) utilities of interest include:

$U(x) = \textm{log}\left(1 + x\right)$
$U(x) = \sqrt{1 + x}$
$U(x) = -\dfrac{1}{x}$
$U(x) = -p\dfrac{1}{x^p}$ for $p>0$ (as $p\rightarrow0$ it converges to the log utility)
$U(x) = -\dfrac{1}{\sqrt{1 + x}}$
$U(x) = 1 - \textm{exp}(-\lambda x)$ , with $\lambda>0$ being the risk aversion parameter.

However, this general expected utility theory, while useful in theory, is an elusive concept that may be of little practical help when faced with investment decisions, as opposed to the statistically sound Kelly criterion. As stated in Roy (1952):

In calling in a utility function to our aid, an appearance of generality is achieved at the cost of a loss of practical significance, and applicability in our results. A man who seeks advice about his actions will not be grateful for the suggestion that he maximise expected utility.

Problem (7.19) is convex as long as the utility $U(\cdot)$ is a concave function. Similarly to the Kelly criterion portfolio, the expected utility portfolio formulation can be numerically solved in a direct way or via mean–variance approximations as described next.

Solving the Expected Utility Portfolio Directly via Sample Average

In practice, problem (7.19) can be approximated by directly replacing the expectation in the objective function by the sample mean: $\E\left[U(\w^\T\bm{r})\right] \approx \frac{1}{T} \sum_{t=1}^T U\left(\w^\T\bm{r}_t\right).$ However, finding a solver that can deal directly with the utility function $U(\cdot)$ may be challenging, even if it is a concave function.
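For intuition, the sample-average objective is straightforward to evaluate for any utility, even when optimizing it in full generality is harder. The sketch below evaluates it for three of the utilities listed above and runs a crude grid search over two-asset portfolios $\w = (a, 1-a)$. The grid search is only a stand-in for a proper solver, and the synthetic lognormal data, the risk aversion value, and the function names are assumptions for illustration.

```python
import numpy as np

# A few concave utilities from the list above (lam is an assumed risk aversion)
utilities = {
    "log": lambda x: np.log1p(x),
    "sqrt": lambda x: np.sqrt(1.0 + x),
    "exponential": lambda x, lam=5.0: 1.0 - np.exp(-lam * x),
}

def expected_utility(w, R, U):
    """Sample-average estimate of E[U(w' r)] from the T x N return matrix R."""
    return np.mean(U(R @ w))

# Synthetic lognormal returns for two assets (assumed parameters)
rng = np.random.default_rng(7)
T = 2000
m = np.array([0.03, 0.08])   # assumed means of the log-returns
s = np.array([0.10, 0.30])   # assumed volatilities of the log-returns
R = np.exp(m + s * rng.standard_normal((T, 2))) - 1.0

# Crude grid search over w = (a, 1 - a) with a in [0, 1]
grid = np.linspace(0.0, 1.0, 101)
best = {}
for name, U in utilities.items():
    values = [expected_utility(np.array([a, 1.0 - a]), R, U) for a in grid]
    k = int(np.argmax(values))
    best[name] = (grid[k], values[k])
    print(f"{name:12s} best a = {grid[k]:.2f}, objective = {values[k]:.6f}")
```

A grid search scales poorly with the number of assets, so this is meant only to show how the objective is formed from the data.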

Solving the Expected Utility Portfolio via Mean–Variance Approximations

Similarly to the Kelly criterion portfolio, the most common way to deal with the expectation in the objective of problem (7.19) is via mean–variance approximations similar to that in (7.15); see Markowitz (2014).

We now consider some Taylor approximations around specific points³⁷ (Markowitz, 1959) and the Levy–Markowitz approximation along an interval (more specifically on three points of the interval) (Levy and Markowitz, 1979).

The second-order Taylor approximation around the point $\bm{r}=\bm{0}$ is $\begin{equation} \begin{aligned}[b] \E\left[U(\w^\T\bm{r})\right] & \approx U(0) + U'(0)\;\E\left[\w^\T\bm{r}\right] + \frac{1}{2}U''(0)\;\E\left[(\w^\T\bm{r})^2\right]\\ & = U(0) + U'(0)\;\w^\T\bmu + \frac{1}{2}U''(0)(\w^\T\bSigma\w + (\w^\T\bmu)^2), \end{aligned} \tag{7.20} \end{equation}$ whereas the approximation around the point $\bm{r}=\bmu$ reads $\begin{equation} \begin{aligned}[b] \E\left[U(\w^\T\bm{r})\right] & \approx U(\w^\T\bmu) + U'(\w^\T\bmu)\E\left[\w^\T(\bm{r} - \bmu)\right] + \frac{1}{2}U''(\w^\T\bmu)\E\left[\left(\w^\T(\bm{r} - \bmu)\right)^2\right]\\ & = U(\w^\T\bmu) + \frac{1}{2}U''(\w^\T\bmu)\w^\T\bSigma\w. \end{aligned} \tag{7.21} \end{equation}$

An approximation that fits the utility $U(\cdot)$ simultaneously on three points, namely, the mean $\w^\T\bmu$ (like the previous Taylor approximation) and the two points $\w^\T\bmu \pm \kappa\sqrt{\w^\T\bSigma\w}$, is (Levy and Markowitz, 1979)³⁸ $\begin{equation} \begin{aligned}[b] \E\left[U(\w^\T\bm{r})\right] &\approx U(\w^\T\bmu)\\ &\quad + \frac{U\left(\w^\T\bmu + \kappa\sqrt{\w^\T\bSigma\w}\right) + U\left(\w^\T\bmu - \kappa\sqrt{\w^\T\bSigma\w}\right) - 2U\left(\w^\T\bmu\right)}{2\kappa^2}. \end{aligned} \tag{7.22} \end{equation}$

Many empirical analyses have concluded that these mean–variance approximations perform well in practice for real data, and their difference is negligible (Levy and Markowitz, 1979; Markowitz, 1959, 2014; Pulley, 1983). Chapter 9 explores higher-order moments, that is, skewness and kurtosis, in these approximations for portfolio design.
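As a rough numerical check of (7.20)–(7.22), the sketch below evaluates the three approximations for the exponential utility $U(x) = 1 - \textm{exp}(-\lambda x)$, whose derivatives are available in closed form, and compares them with the sample average. The value of $\lambda$, the value of $\kappa$, the synthetic Gaussian returns, and the function names are assumptions for illustration.

```python
import numpy as np

lam = 5.0  # assumed risk aversion of the exponential utility

def U(x):
    return 1.0 - np.exp(-lam * x)

def dU(x):
    return lam * np.exp(-lam * x)        # first derivative U'

def d2U(x):
    return -lam**2 * np.exp(-lam * x)    # second derivative U''

def utility_approximations(w, mu, Sigma, kappa=1.0):
    """Mean-variance approximations (7.20)-(7.22) of E[U(w' r)]."""
    m = w @ mu             # portfolio mean return
    v = w @ Sigma @ w      # portfolio variance
    s = np.sqrt(v)         # portfolio standard deviation
    return {
        "(7.20)": U(0.0) + dU(0.0) * m + 0.5 * d2U(0.0) * (v + m**2),
        "(7.21)": U(m) + 0.5 * d2U(m) * v,
        "(7.22)": U(m) + (U(m + kappa * s) + U(m - kappa * s) - 2.0 * U(m))
                  / (2.0 * kappa**2),
    }

# Synthetic daily-scale returns (assumed values, for illustration only)
rng = np.random.default_rng(3)
T, N = 5000, 3
R = np.array([0.0005, 0.0008, 0.0003]) \
    + np.array([0.010, 0.015, 0.008]) * rng.standard_normal((T, N))

w = np.array([0.4, 0.3, 0.3])
mu_hat = R.mean(axis=0)
Sigma_hat = np.cov(R, rowvar=False)

exact = np.mean(U(R @ w))   # sample-average expected utility
approx = utility_approximations(w, mu_hat, Sigma_hat)

print(f"sample average: {exact:.8f}")
for label, value in approx.items():
    print(f"{label}: {value:.8f}  (error {value - exact:+.2e})")
```

For utilities without convenient closed-form derivatives, (7.22) has the practical advantage of requiring only evaluations of $U(\cdot)$.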

References

Cajas, D. (2021b). Kelly portfolio optimization: A disciplined convex programming framework. SSRN Electronic Journal.

Chares, R. (2007). Cones and Interior-Point Algorithms for Structured Convex Optimization Involving Powers and Exponentials (PhD thesis). Université Catholique de Louvain; École Polytechnique de Louvain.

Cover, T., and Thomas, J. (1991). Elements of Information Theory. John Wiley & Sons.

Kelly, Jr., J. L. (1956). A new interpretation of information rate. The Bell System Technical Journal, 35(4), 917–926.

Levy, H., and Markowitz, H. M. (1979). Approximating expected utility by a function of mean and variance. The American Economic Review, 69(3), 308–317.

MacLean, L., Thorp, E., and Ziemba, W. (2010). Long-term capital growth: The good and bad properties of the Kelly and fractional Kelly capital growth criteria. Quantitative Finance, 10(7), 681–687.

Markowitz, H. M. (1959). Portfolio Selection: Efficient Diversification of Investments. John Wiley & Sons.

Markowitz, H. M. (2014). Mean–variance approximations to expected utility. European Journal of Operational Research, 234(2), 346–355.

Neumann, J. von, and Morgenstern, O. (1944). Theory of Games and Economic Behavior. Princeton University Press.

Pulley, L. B. (1983). Mean–variance approximations to expected logarithmic utility. Operations Research, 31(4), 685–696.

Roy, A. (1952). Safety first and the holding of assets. Econometrica, 20(3), 431–449.

Savage, L. J. (1954). The Foundations of Statistics. New York: John Wiley & Sons.

Thorp, E. O. (1962). Beat the Dealer: A Winning Strategy for the Game of Twenty-One. New York: Blaisdell Publishing.

Thorp, E. O. (1971). Portfolio choice and the Kelly criterion. Business and Economics Statistics Section, Proceedings of the American Statistical Association, 215–224.

Thorp, E. O. (1997). The Kelly criterion in Blackjack, sports betting, and the stock market. In Proceedings of the 10th international conference on gambling and risk taking. Montreal.

³⁵ Edward O. Thorp was an American math professor, author, and blackjack player who wrote Beat the Dealer (Thorp, 1962), which became a classic and was the first book to prove mathematically that the house advantage in blackjack could be overcome by card counting.
³⁶ The Taylor expansion of the log function around the point $x=x_0$ is $\textm{log}(1 + x) = \textm{log}(1 + x_0) + \frac{1}{1 + x_0}(x - x_0) + \frac{1}{2}\frac{-1}{(1 + x_0)^2}(x - x_0)^2 + \cdots$
³⁷ The Taylor expansion of a utility function $U(\cdot)$ around the point $x=x_0$ is $U(x) \approx U(x_0) + U'(x_0)(x - x_0) + \frac{1}{2}U''(x_0)(x - x_0)^2 + \cdots$
³⁸ The Levy–Markowitz approximation of an expected utility over an interval of width $\kappa$ standard deviations centered at the mean is $\E\left[U(X)\right] \approx U(\mu) + \frac{1}{2\kappa^2}\left[U(\mu+\kappa\sigma) + U(\mu-\kappa\sigma) - 2U(\mu) \right],$ where $X$ denotes a random variable (with mean $\mu$ and standard deviation $\sigma$) and $U(\cdot)$ is the utility function.