3.8 Derivatives of Simple Matrix Functions

Result: Let $$\mathbf{A}$$ be an $$n\times n$$ symmetric matrix, and let $$\mathbf{x}$$ and $$\mathbf{y}$$ be $$n\times1$$ vectors. Then, \begin{align} \underset{n\times1}{\frac{\partial}{\partial\mathbf{x}}}\mathbf{x}^{\prime}\mathbf{y} & =\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\mathbf{x}^{\prime}\mathbf{y}\\ \vdots\\ \frac{\partial}{\partial x_{n}}\mathbf{x}^{\prime}\mathbf{y} \end{array}\right)=\mathbf{y},\tag{3.12}\\ \underset{n\times1}{\frac{\partial}{\partial\mathbf{x}}}\mathbf{Ax} & =\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\left(\mathbf{Ax}\right)^{\prime}\\ \vdots\\ \frac{\partial}{\partial x_{n}}\left(\mathbf{Ax}\right)^{\prime} \end{array}\right)=\mathbf{A},\tag{3.13}\\ \underset{n\times1}{\frac{\partial}{\partial\mathbf{x}}}\mathbf{x}^{\prime}\mathbf{Ax} & =\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\mathbf{x}^{\prime}\mathbf{Ax}\\ \vdots\\ \frac{\partial}{\partial x_{n}}\mathbf{x}^{\prime}\mathbf{Ax} \end{array}\right)=2\mathbf{Ax}.\tag{3.14} \end{align}

We demonstrate these results with simple $$2\times2$$ examples. Let $\mathbf{A}=\left(\begin{array}{cc} a & b\\ b & c \end{array}\right),~\mathbf{x}=\left(\begin{array}{c} x_{1}\\ x_{2} \end{array}\right),~\mathbf{y}=\left(\begin{array}{c} y_{1}\\ y_{2} \end{array}\right).$

First, consider (3.12). Here, $\mathbf{x}^{\prime}\mathbf{y}=x_{1}y_{1}+x_{2}y_{2}.$ Then, $\frac{\partial}{\partial\mathbf{x}}\mathbf{x}^{\prime}\mathbf{y}=\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\mathbf{x}^{\prime}\mathbf{y}\\ \frac{\partial}{\partial x_{2}}\mathbf{x}^{\prime}\mathbf{y} \end{array}\right)=\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\left(x_{1}y_{1}+x_{2}y_{2}\right)\\ \frac{\partial}{\partial x_{2}}\left(x_{1}y_{1}+x_{2}y_{2}\right) \end{array}\right)=\left(\begin{array}{c} y_{1}\\ y_{2} \end{array}\right)=\mathbf{y}.$

Next, consider (3.13). Note that $\mathbf{Ax}=\left(\begin{array}{cc} a & b\\ b & c \end{array}\right)\left(\begin{array}{c} x_{1}\\ x_{2} \end{array}\right)=\left(\begin{array}{c} ax_{1}+bx_{2}\\ bx_{1}+cx_{2} \end{array}\right),$ and $\left(\mathbf{Ax}\right)^{\prime}=\left(ax_{1}+bx_{2},bx_{1}+cx_{2}\right).$ Differentiating each element of this row vector with respect to $$x_{1}$$ gives the first row of the result, and differentiating with respect to $$x_{2}$$ gives the second row: $\frac{\partial}{\partial\mathbf{x}}\mathbf{Ax}=\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\left(ax_{1}+bx_{2},bx_{1}+cx_{2}\right)\\ \frac{\partial}{\partial x_{2}}\left(ax_{1}+bx_{2},bx_{1}+cx_{2}\right) \end{array}\right)=\left(\begin{array}{cc} a & b\\ b & c \end{array}\right)=\mathbf{A}.$

Finally, consider (3.14). We have $\mathbf{x}^{\prime}\mathbf{Ax}=\left(\begin{array}{cc} x_{1} & x_{2}\end{array}\right)\left(\begin{array}{cc} a & b\\ b & c \end{array}\right)\left(\begin{array}{c} x_{1}\\ x_{2} \end{array}\right)=ax_{1}^{2}+2bx_{1}x_{2}+cx_{2}^{2}.$ Then, \begin{align*} \frac{\partial}{\partial\mathbf{x}}\mathbf{x}^{\prime}\mathbf{Ax} & =\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\left(ax_{1}^{2}+2bx_{1}x_{2}+cx_{2}^{2}\right)\\ \frac{\partial}{\partial x_{2}}\left(ax_{1}^{2}+2bx_{1}x_{2}+cx_{2}^{2}\right) \end{array}\right)=\left(\begin{array}{c} 2ax_{1}+2bx_{2}\\ 2bx_{1}+2cx_{2} \end{array}\right)\\ & =2\left(\begin{array}{cc} a & b\\ b & c \end{array}\right)\left(\begin{array}{c} x_{1}\\ x_{2} \end{array}\right)=2\mathbf{Ax}. \end{align*}
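
The three identities can also be checked numerically. The following is a minimal sketch in plain Python, not part of the original text: it approximates each partial derivative by a central finite difference and compares the result with $$\mathbf{y}$$, $$\mathbf{A}$$ and $$2\mathbf{Ax}$$. The helper names (`matvec`, `dot`, `num_grad`) and all numeric values are illustrative assumptions.

```python
# Finite-difference check of (3.12)-(3.14) for a 2x2 symmetric matrix A.
# Helper names and numeric values are arbitrary, illustrative choices.

def matvec(A, x):
    """Return the matrix-vector product Ax as a list."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def dot(u, v):
    """Return the inner product u'v."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

def num_grad(f, x, h=1e-6):
    """Central-difference approximation to the gradient of scalar f at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        dn = list(x)
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2 * h))
    return grad

A = [[2.0, 0.5], [0.5, 1.0]]   # symmetric, with a = 2, b = 0.5, c = 1
x = [0.3, -1.2]
y = [1.5, 4.0]
n = len(x)

# (3.12): the gradient of x'y should be close to y.
grad_xy = num_grad(lambda v: dot(v, y), x)

# (3.13): row i of the result holds the derivative of (Ax)' w.r.t. x_i.
component_grads = [num_grad(lambda v, j=j: matvec(A, v)[j], x) for j in range(n)]
jac_Ax = [[component_grads[j][i] for j in range(n)] for i in range(n)]

# (3.14): the gradient of x'Ax should be close to 2Ax.
grad_quad = num_grad(lambda v: dot(v, matvec(A, v)), x)
two_Ax = [2 * t for t in matvec(A, x)]

print("d(x'y)/dx  :", grad_xy, "vs y  :", y)
print("d(Ax)/dx   :", jac_Ax, "vs A  :", A)
print("d(x'Ax)/dx :", grad_quad, "vs 2Ax:", two_Ax)
```

The printed pairs should agree up to floating-point rounding, since the functions involved are linear or quadratic in $$\mathbf{x}$$.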

Example 3.9 (Calculating an asset’s marginal contribution to portfolio volatility)

In portfolio risk budgeting (see Chapter 14), asset $$i$$’s marginal contribution to portfolio volatility $$\sigma_{p}=\left(\mathbf{x}^{\prime}\Sigma \mathbf{x}\right)^{1/2}$$ is given by $\mathrm{MCR}_{i}^{\sigma}=\frac{\partial\sigma_{p}}{\partial x_{i}}=\frac{\partial \left(\mathbf{x}^{\prime}\Sigma \mathbf{x}\right)^{1/2}}{\partial x_{i}},$ and approximates how much portfolio volatility changes when the allocation to asset $$i$$ increases by a small amount. Using the chain rule and the matrix derivative (3.14), we can compute the entire vector of asset marginal contributions at once:

\begin{align*} \frac{\partial \left(\mathbf{x}^{\prime}\Sigma \mathbf{x}\right)^{1/2}}{\partial\mathbf{x}} & = \frac{1}{2}\left(\mathbf{x}^{\prime}\Sigma \mathbf{x}\right)^{-1/2} \frac{\partial \mathbf{x}^{\prime}\Sigma \mathbf{x}}{\partial\mathbf{x}} = \frac{1}{2}\left(\mathbf{x}^{\prime} \Sigma \mathbf{x} \right)^{-1/2} 2 \Sigma \mathbf{x} \\ & = \left(\mathbf{x}^{\prime} \Sigma \mathbf{x} \right)^{-1/2} \Sigma \mathbf{x}= \frac{\Sigma \mathbf{x}}{\sigma_{p}}. \end{align*}

Then asset $$i$$’s marginal contribution is given by the $$i$$-th element of the $$n\times1$$ vector $$\frac{\Sigma \mathbf{x}}{\sigma_{p}}$$.
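
As a concrete illustration, the sketch below evaluates $$\frac{\Sigma \mathbf{x}}{\sigma_{p}}$$ for a two-asset portfolio in plain Python. The covariance matrix and weights are hypothetical values chosen only for this example, and the helper names are assumptions rather than anything defined in the text.

```python
import math

def matvec(A, x):
    """Return the matrix-vector product Ax as a list."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def dot(u, v):
    """Return the inner product u'v."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

# Hypothetical inputs: a 2x2 return covariance matrix and portfolio weights.
Sigma = [[0.04, 0.006],
         [0.006, 0.09]]
x = [0.6, 0.4]

Sigma_x = matvec(Sigma, x)             # the vector Sigma x
sigma_p = math.sqrt(dot(x, Sigma_x))   # portfolio volatility (x' Sigma x)^(1/2)
mcr = [s / sigma_p for s in Sigma_x]   # element i is asset i's marginal contribution

print("portfolio volatility  :", sigma_p)
print("marginal contributions:", mcr)
print("weighted sum x'MCR    :", dot(x, mcr))
```

One sanity check on the output: because $$\mathbf{x}^{\prime}\Sigma\mathbf{x}/\sigma_{p}=\sigma_{p}$$, the weighted sum of the marginal contributions printed on the last line should reproduce the portfolio volatility.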

$$\blacksquare$$

Example 2.33 (Finding the global minimum variance portfolio)

Let $$\mathbf{R}$$ denote an $$n\times1$$ random vector of asset returns with $$E[\mathbf{R}]=\mu$$ and $$\mathrm{var}(\mathbf{R})=\Sigma$$. The global minimum variance portfolio $$\mathbf{m}$$ (see Chapter 11, Section 11.3) solves the constrained minimization problem: $\min_{\mathbf{m}}~\sigma_{p,m}^{2}=\mathbf{m}^{\prime}\Sigma \mathbf{m}\text{ s.t. }\mathbf{m}^{\prime}\mathbf{1}=1.\tag{3.15}$ The Lagrangian function is: $L(\mathbf{m},\lambda)=\mathbf{m}^{\prime}\Sigma \mathbf{m}+\lambda(\mathbf{m}^{\prime}\mathbf{1}-1).$ Applying (3.14) to the first term and (3.12) to the second, the first order conditions can be expressed in matrix notation as \begin{align} \underset{(n\times1)}{\mathbf{0}} & =\frac{\partial L(\mathbf{m},\lambda)}{\partial\mathbf{m}}=\frac{\partial}{\partial\mathbf{m}}\mathbf{m}^{\prime}\Sigma \mathbf{m}+\frac{\partial}{\partial\mathbf{m}}\lambda(\mathbf{m}^{\prime}\mathbf{1}-1)=2\cdot\Sigma \mathbf{m}+\lambda\cdot\mathbf{1}\tag{3.16}\\ \underset{(1\times1)}{0} & =\frac{\partial L(\mathbf{m},\lambda)}{\partial\lambda}=\frac{\partial}{\partial\lambda}\mathbf{m}^{\prime}\Sigma \mathbf{m}+\frac{\partial}{\partial\lambda}\lambda(\mathbf{m}^{\prime}\mathbf{1}-1)=\mathbf{m}^{\prime}\mathbf{1}-1\tag{3.17} \end{align} These first order conditions represent a system of $$n+1$$ linear equations in $$n+1$$ unknowns ($$\mathbf{m}$$ and $$\lambda$$).
These equations can be represented in matrix form as the system $\left[\begin{array}{cc} 2\Sigma & \mathbf{1}\\ \mathbf{1}^{\prime} & 0 \end{array}\right]\left[\begin{array}{c} \mathbf{m}\\ \lambda \end{array}\right]=\left[\begin{array}{c} \mathbf{0}\\ 1 \end{array}\right],$ which is of the form $$\mathbf{Az}=\mathbf{b}$$ for $\mathbf{A}=\left[\begin{array}{cc} 2\Sigma & \mathbf{1}\\ \mathbf{1}^{\prime} & 0 \end{array}\right],\,\mathbf{z}=\left[\begin{array}{c} \mathbf{m}\\ \lambda \end{array}\right],\,\mathbf{b}=\left[\begin{array}{c} \mathbf{0}\\ 1 \end{array}\right].$ The portfolio weight vector $$\mathbf{m}$$ can be found as the first $$n$$ elements of $$\mathbf{z}=\mathbf{A}^{-1}\mathbf{b}$$.
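
To make the last step concrete, the sketch below builds $$\mathbf{A}$$ and $$\mathbf{b}$$ for a three-asset case and solves $$\mathbf{Az}=\mathbf{b}$$ in plain Python. The covariance matrix is a hypothetical example, and the small Gaussian-elimination routine `solve` is an illustrative stand-in for a library linear solver; the text itself only states $$\mathbf{z}=\mathbf{A}^{-1}\mathbf{b}$$ and does not prescribe a method.

```python
def solve(A, b):
    """Solve Az = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b_i] for row, b_i in zip(A, b)]   # augmented matrix [A | b]
    for col in range(n):
        # Swap in the row with the largest pivot; this also avoids dividing by
        # the zero in the bottom-right corner of A.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):                      # back substitution
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z

# Hypothetical 3x3 covariance matrix (symmetric and positive definite).
Sigma = [[0.04, 0.006, 0.002],
         [0.006, 0.09, 0.01],
         [0.002, 0.01, 0.0625]]
n = len(Sigma)

# A = [[2*Sigma, 1], [1', 0]] and b = (0, ..., 0, 1)'.
A = [[2 * s for s in row] + [1.0] for row in Sigma] + [[1.0] * n + [0.0]]
b = [0.0] * n + [1.0]

z = solve(A, b)
m, lam = z[:n], z[n]      # first n elements are the weights, the last is lambda

print("weights m       :", m)
print("sum of weights  :", sum(m))
print("multiplier lam  :", lam)
```

The solution can be checked against the first order conditions: the weights should sum to one as in (3.17), and each element of $$2\Sigma\mathbf{m}+\lambda\mathbf{1}$$ should be zero up to rounding as in (3.16).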

$$\blacksquare$$