3.8 Derivatives of Simple Matrix Functions

Result: Let \(\mathbf{A}\) be an \(n\times n\) symmetric matrix, and let \(\mathbf{x}\) and \(\mathbf{y}\) be \(n\times1\) vectors. Then, \[\begin{align} \underset{n\times1}{\frac{\partial}{\partial\mathbf{x}}}\mathbf{x}^{\prime}\mathbf{y} & =\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\mathbf{x}^{\prime}\mathbf{y}\\ \vdots\\ \frac{\partial}{\partial x_{n}}\mathbf{x}^{\prime}\mathbf{y} \end{array}\right)=\mathbf{y},\tag{3.12}\\ \underset{n\times1}{\frac{\partial}{\partial\mathbf{x}}}\mathbf{Ax} & =\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\left(\mathbf{Ax}\right)^{\prime}\\ \vdots\\ \frac{\partial}{\partial x_{n}}\left(\mathbf{Ax}\right)^{\prime} \end{array}\right)=\mathbf{A},\tag{3.13}\\ \underset{n\times1}{\frac{\partial}{\partial\mathbf{x}}}\mathbf{x}^{\prime}\mathbf{Ax} & =\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\mathbf{x}^{\prime}\mathbf{Ax}\\ \vdots\\ \frac{\partial}{\partial x_{n}}\mathbf{x}^{\prime}\mathbf{Ax} \end{array}\right)=2\mathbf{Ax}.\tag{3.14} \end{align}\]

We will demonstrate these results with simple \(2\times2\) examples. Let \[ \mathbf{A}=\left(\begin{array}{cc} a & b\\ b & c \end{array}\right),~\mathbf{x}=\left(\begin{array}{c} x_{1}\\ x_{2} \end{array}\right),~\mathbf{y}=\left(\begin{array}{c} y_{1}\\ y_{2} \end{array}\right). \]

First, consider (3.12). We have \[ \mathbf{x}^{\prime}\mathbf{y}=x_{1}y_{1}+x_{2}y_{2}. \] Then, \[ \frac{\partial}{\partial\mathbf{x}}\mathbf{x}^{\prime}\mathbf{y}=\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\mathbf{x}^{\prime}\mathbf{y}\\ \frac{\partial}{\partial x_{2}}\mathbf{x}^{\prime}\mathbf{y} \end{array}\right)=\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\left(x_{1}y_{1}+x_{2}y_{2}\right)\\ \frac{\partial}{\partial x_{2}}\left(x_{1}y_{1}+x_{2}y_{2}\right) \end{array}\right)=\left(\begin{array}{c} y_{1}\\ y_{2} \end{array}\right)=\mathbf{y}. \]

Next, consider (3.13). Note that \[ \mathbf{Ax}=\left(\begin{array}{cc} a & b\\ b & c \end{array}\right)\left(\begin{array}{c} x_{1}\\ x_{2} \end{array}\right)=\left(\begin{array}{c} ax_{1}+bx_{2}\\ bx_{1}+cx_{2} \end{array}\right), \] and \[ \left(\mathbf{Ax}\right)^{\prime}=\left(ax_{1}+bx_{2},bx_{1}+cx_{2}\right). \] Then, \[ \frac{\partial}{\partial\mathbf{x}}\mathbf{Ax}=\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\left(ax_{1}+bx_{2},bx_{1}+cx_{2}\right)\\ \frac{\partial}{\partial x_{2}}\left(ax_{1}+bx_{2},bx_{1}+cx_{2}\right) \end{array}\right)=\left(\begin{array}{cc} a & b\\ b & c \end{array}\right)=\mathbf{A}. \]

Finally, consider (3.14). We have \[ \mathbf{x}^{\prime}\mathbf{Ax}=\left(\begin{array}{cc} x_{1} & x_{2}\end{array}\right)\left(\begin{array}{cc} a & b\\ b & c \end{array}\right)\left(\begin{array}{c} x_{1}\\ x_{2} \end{array}\right)=ax_{1}^{2}+2bx_{1}x_{2}+cx_{2}^{2}. \] Then, \[\begin{align*} \frac{\partial}{\partial\mathbf{x}}\mathbf{x}^{\prime}\mathbf{Ax} & =\left(\begin{array}{c} \frac{\partial}{\partial x_{1}}\left(ax_{1}^{2}+2bx_{1}x_{2}+cx_{2}^{2}\right)\\ \frac{\partial}{\partial x_{2}}\left(ax_{1}^{2}+2bx_{1}x_{2}+cx_{2}^{2}\right) \end{array}\right)=\left(\begin{array}{c} 2ax_{1}+2bx_{2}\\ 2bx_{1}+2cx_{2} \end{array}\right)\\ & =2\left(\begin{array}{cc} a & b\\ b & c \end{array}\right)\left(\begin{array}{c} x_{1}\\ x_{2} \end{array}\right)=2\mathbf{Ax}. \end{align*}\]
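The three results (3.12)-(3.14) can also be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original text: the helper `numerical_gradient` and the numeric values chosen for \(a\), \(b\), \(c\), \(\mathbf{x}\) and \(\mathbf{y}\) are illustrative assumptions. It approximates each partial derivative by a central difference and stacks the results row by row, matching the layout used in the Result above.

```python
import numpy as np


def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of d f / d x.

    Row i of the result holds the partial derivative of f (a scalar or a
    vector, laid out as a row) with respect to x_i.
    """
    x = np.asarray(x, dtype=float)
    rows = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        up = np.atleast_1d(f(x + step))
        down = np.atleast_1d(f(x - step))
        rows.append((up - down) / (2 * h))
    return np.squeeze(np.array(rows))


# Hypothetical values (assumptions for illustration): a=2, b=0.5, c=1
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])          # symmetric 2 x 2 matrix
x = np.array([1.0, -2.0])
y = np.array([3.0, 4.0])

grad_xy = numerical_gradient(lambda v: v @ y, x)        # (3.12): compare with y
grad_Ax = numerical_gradient(lambda v: A @ v, x)        # (3.13): compare with A
grad_xAx = numerical_gradient(lambda v: v @ A @ v, x)   # (3.14): compare with 2 A x

print(np.allclose(grad_xy, y))
print(np.allclose(grad_Ax, A))
print(np.allclose(grad_xAx, 2 * A @ x))
```

Because the functions involved are linear or quadratic in \(\mathbf{x}\), the central difference agrees with the analytic derivative up to floating-point rounding, so each comparison should hold to a tight tolerance.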

Example 3.9 (Calculating an asset’s marginal contribution to portfolio volatility)

In portfolio risk budgeting (see Chapter 14), asset \(i\)’s marginal contribution to portfolio volatility \(\sigma_{p}=\left(\mathbf{x}^{\prime}\Sigma\mathbf{x}\right)^{1/2}\) is given by \[ \mathrm{MCR}_{i}^{\sigma}=\frac{\partial\sigma_{p}}{\partial x_{i}}=\frac{\partial \left(\mathbf{x}^{\prime}\Sigma \mathbf{x}\right)^{1/2}}{\partial x_{i}}, \] and approximates how much portfolio volatility changes when the allocation to asset \(i\) increases by a small amount. Using the chain rule and the matrix derivative (3.14), we can compute the entire vector of asset marginal contributions at once:

\[\begin{align*} \frac{\partial \left(\mathbf{x}^{\prime}\Sigma \mathbf{x}\right)^{1/2}}{\partial\mathbf{x}} & = \frac{1}{2}\left(\mathbf{x}^{\prime}\Sigma \mathbf{x}\right)^{-1/2} \frac{\partial \mathbf{x}^{\prime}\Sigma \mathbf{x}}{\partial\mathbf{x}} = \frac{1}{2}\left(\mathbf{x}^{\prime} \Sigma \mathbf{x} \right)^{-1/2} 2 \Sigma \mathbf{x} \\ & = \left(\mathbf{x}^{\prime} \Sigma \mathbf{x} \right)^{-1/2} \Sigma \mathbf{x}= \frac{\Sigma \mathbf{x}}{\sigma_{p}}. \end{align*}\]

Asset \(i\)’s marginal contribution is then the \(i\)-th element of the \(n\times1\) vector \(\frac{\Sigma \mathbf{x}}{\sigma_{p}}\).
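As a rough numerical illustration of the formula \(\Sigma\mathbf{x}/\sigma_{p}\), the sketch below computes the vector of marginal contributions for a three-asset portfolio in Python with NumPy. The covariance matrix `Sigma` and the weight vector `x` are made-up values chosen only for illustration; they do not come from the text. The final lines compare the first element against a small finite-difference bump in \(x_{1}\).

```python
import numpy as np

# Hypothetical covariance matrix and allocation vector (illustrative only)
Sigma = np.array([[0.040, 0.006, 0.012],
                  [0.006, 0.090, 0.018],
                  [0.012, 0.018, 0.160]])
x = np.array([0.5, 0.3, 0.2])

# Portfolio volatility sigma_p = (x' Sigma x)^(1/2)
sigma_p = np.sqrt(x @ Sigma @ x)

# Vector of marginal contributions: Sigma x / sigma_p
mcr = Sigma @ x / sigma_p

# Finite-difference check for asset 1: bump x_1 by a small amount h
h = 1e-6
bumped = x.copy()
bumped[0] += h
fd = (np.sqrt(bumped @ Sigma @ bumped) - sigma_p) / h

print(mcr)
print(abs(fd - mcr[0]) < 1e-5)
```

The finite-difference value is exactly the "small increase in the allocation to asset \(i\)" interpretation given above, so it should agree with the corresponding element of `mcr` up to an error of order `h`.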

\(\blacksquare\)

Example 3.10 (Finding the global minimum variance portfolio)

Let \(\mathbf{R}\) denote an \(n\times1\) random vector of asset returns with \(E[\mathbf{R}]=\mu\) and \(\mathrm{var}(\mathbf{R})=\Sigma\). The global minimum variance portfolio \(\mathbf{m}\) (see Chapter 11, Section 11.3) solves the constrained minimization problem: \[\begin{equation} \min_{\mathbf{m}}~\sigma_{p,m}^{2}=\mathbf{m}^{\prime}\Sigma\mathbf{m}\text{ s.t. }\mathbf{m}^{\prime}\mathbf{1}=1,\tag{3.15} \end{equation}\] where \(\mathbf{1}\) is an \(n\times1\) vector of ones. The Lagrangian function is: \[ L(\mathbf{m},\lambda)=\mathbf{m}^{\prime}\Sigma \mathbf{m}+\lambda(\mathbf{m}^{\prime}\mathbf{1}-1). \] Using (3.14) and (3.12), the first order conditions can be expressed in matrix notation as, \[\begin{align} \underset{(n\times1)}{\mathbf{0}} & =\frac{\partial L(\mathbf{m},\lambda)}{\partial\mathbf{m}}=\frac{\partial}{\partial\mathbf{m}}\mathbf{m}^{\prime}\Sigma \mathbf{m}+\frac{\partial}{\partial\mathbf{m}}\lambda(\mathbf{m}^{\prime}\mathbf{1}-1)=2\cdot\Sigma \mathbf{m}+\lambda\cdot\mathbf{1},\tag{3.16}\\ \underset{(1\times1)}{0} & =\frac{\partial L(\mathbf{m},\lambda)}{\partial\lambda}=\frac{\partial}{\partial\lambda}\mathbf{m}^{\prime}\Sigma \mathbf{m}+\frac{\partial}{\partial\lambda}\lambda(\mathbf{m}^{\prime}\mathbf{1}-1)=\mathbf{m}^{\prime}\mathbf{1}-1.\tag{3.17} \end{align}\] These first order conditions represent a system of \(n+1\) linear equations in \(n+1\) unknowns (\(\mathbf{m}\) and \(\lambda\)). These equations can be represented in matrix form as the system \[ \left[\begin{array}{cc} 2\Sigma & \mathbf{1}\\ \mathbf{1}^{\prime} & 0 \end{array}\right]\left[\begin{array}{c} \mathbf{m}\\ \lambda \end{array}\right]=\left[\begin{array}{c} \mathbf{0}\\ 1 \end{array}\right], \] which is of the form \(\mathbf{Az}=\mathbf{b}\) for \[ \mathbf{A}=\left[\begin{array}{cc} 2\Sigma & \mathbf{1}\\ \mathbf{1}^{\prime} & 0 \end{array}\right],\,\mathbf{z}=\left[\begin{array}{c} \mathbf{m}\\ \lambda \end{array}\right],\,\mathbf{b}=\left[\begin{array}{c} \mathbf{0}\\ 1 \end{array}\right]. \] Provided \(\mathbf{A}\) is invertible, the portfolio weight vector \(\mathbf{m}\) can be found as the first \(n\) elements of \(\mathbf{z}=\mathbf{A}^{-1}\mathbf{b}\).
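The linear system \(\mathbf{Az}=\mathbf{b}\) can be assembled and solved directly. The following is a minimal sketch in Python with NumPy, assuming a made-up \(3\times3\) covariance matrix `Sigma` (not taken from the text). As a cross-check, it also evaluates the equivalent closed-form expression \(\mathbf{m}=\Sigma^{-1}\mathbf{1}/(\mathbf{1}^{\prime}\Sigma^{-1}\mathbf{1})\), which follows from solving (3.16) for \(\mathbf{m}\) and substituting into (3.17).

```python
import numpy as np

# Hypothetical covariance matrix (illustrative values only)
Sigma = np.array([[0.040, 0.006, 0.012],
                  [0.006, 0.090, 0.018],
                  [0.012, 0.018, 0.160]])
n = Sigma.shape[0]
ones = np.ones(n)

# Assemble A = [[2*Sigma, 1], [1', 0]] and b = [0, ..., 0, 1]'
A = np.zeros((n + 1, n + 1))
A[:n, :n] = 2 * Sigma
A[:n, n] = ones
A[n, :n] = ones
b = np.zeros(n + 1)
b[n] = 1.0

# Solve A z = b; the first n elements of z are the weights m,
# the last element is the Lagrange multiplier lambda
z = np.linalg.solve(A, b)
m, lam = z[:n], z[n]

# Cross-check against m = Sigma^{-1} 1 / (1' Sigma^{-1} 1)
w = np.linalg.solve(Sigma, ones)
m_closed = w / w.sum()

print(m)
print(np.isclose(m.sum(), 1.0), np.allclose(m, m_closed))
```

The sketch calls `np.linalg.solve` instead of forming \(\mathbf{A}^{-1}\) explicitly. Both give the same \(\mathbf{z}\) in exact arithmetic, but solving the system directly is the usual choice numerically.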

\(\blacksquare\)