Chapter 15 Multivariate Normal Distribution
15.1 Introduction
In previous sections we introduced joint distributions of random variables \(X\) and \(Y\) and considered their marginal and conditional distributions. In this section, we study the special case where the joint distribution of \(X_1, X_2, \ldots, X_n\) is a multivariate normal distribution. In this case both the marginal and the conditional distributions are themselves (multivariate) normal distributions. We pay particular attention to the special case \(n=2\), the bivariate normal distribution. Multivariate normal distributions appear in many areas of statistics, and being able to manipulate them is an important skill.
15.2 \(n\)-Dimensional Normal Distribution
A random vector \(\mathbf{X} = (X_1,X_2,\dots,X_n)^T\) is said to have an \(\mathit{n}\)-dimensional normal distribution with parameters \(\mathbf{\mu}\) and \(\mathbf{\Sigma}\) if the joint p.d.f. of \(\mathbf{X}\) is given by \[ f_{\mathbf{X}}(\mathbf{x}) = (2\pi)^{-\frac{n}{2}} \left| \mathbf{\Sigma} \right|^{-\frac{1}{2}} \exp \left\{ -\frac{1}{2} (\mathbf{x}-\mathbf{\mu})^T \mathbf{\Sigma}^{-1} (\mathbf{x}-\mathbf{\mu}) \right\}, \qquad \mathbf{x} \in \mathbb{R}^n,\] where \(\mathbf{\mu} = (\mu_1,\mu_2,\dots,\mu_n)^T\) is the mean vector and \(\mathbf{\Sigma} = (\sigma_{ij})\) is the variance-covariance matrix, an \(n \times n\) real, symmetric, positive definite matrix (equivalently, a real symmetric matrix whose eigenvalues are all positive). It is denoted by \[\mathbf{X} \sim N_n(\mathbf{\mu},\mathbf{\Sigma}).\]
The multivariate normal distribution has the following important properties:
- If \(\mathbf{D}\) is a \(p \times n\) matrix and \(\mathbf{X} \sim N_n(\mathbf{\mu},\mathbf{\Sigma})\), then \(\mathbf{Z} = \mathbf{D}\mathbf{X} \sim N_p (\mathbf{D} \mathbf{\mu}, \mathbf{D} \mathbf{\Sigma} \mathbf{D}^T)\).
- The marginal distribution of each component \(X_i\) is normal with \(E[X_i]=\mu_i\) and \(\text{Var}(X_i)=\sigma_{ii}\). This is a direct consequence of the first property, taking \(\mathbf{D}=(0,\dots,0,1,0,\dots,0)\), the \(1 \times n\) row vector whose \(i\)th component is \(1\) and whose other components are \(0\).
- The components \(X_1,X_2,\dots,X_n\) of a multivariate normal random vector are independent of each other if and only if \(X_1, X_2,\ldots,X_n\) are uncorrelated, i.e. \(\sigma_{ij} = \text{Cov}(X_i,X_j) = 0\) for all \(i \neq j\).
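That is, for jointly normal random variables, being uncorrelated (zero covariance) implies independence. Joint normality matters here: two random variables that are each normal, but not jointly normal, can be uncorrelated without being independent.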
- Conditional distributions derived from joint normal distributions are normal.
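The first property can be checked numerically. The following is a minimal sketch in plain Python (no external libraries); the helper names `matmul`, `transpose` and `linear_transform`, and the values of \(\mathbf{\mu}\), \(\mathbf{\Sigma}\) and \(\mathbf{D}\), are illustrative assumptions and not part of the notes. It computes the mean \(\mathbf{D}\mathbf{\mu}\) and variance-covariance matrix \(\mathbf{D}\mathbf{\Sigma}\mathbf{D}^T\) of \(\mathbf{Z}=\mathbf{D}\mathbf{X}\).

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def linear_transform(d, mu, sigma):
    """Return (D mu, D Sigma D^T), the parameters of Z = D X for X ~ N_n(mu, Sigma)."""
    mu_column = [[m] for m in mu]
    mean = [row[0] for row in matmul(d, mu_column)]
    cov = matmul(matmul(d, sigma), transpose(d))
    return mean, cov


# Illustrative (hypothetical) parameters; sigma is symmetric positive definite.
mu = [1.0, 2.0, 0.0]
sigma = [[4.0, 1.0, 0.0],
         [1.0, 9.0, 2.0],
         [0.0, 2.0, 1.0]]
d = [[1.0, 1.0, 0.0],    # Z1 = X1 + X2
     [0.0, 1.0, -1.0]]   # Z2 = X2 - X3

mean, cov = linear_transform(d, mu, sigma)
print(mean)  # [3.0, 2.0]
print(cov)   # [[15.0, 8.0], [8.0, 6.0]]
```

The diagonal entries agree with the usual variance formulas, for example \(\text{Var}(X_1+X_2) = 4 + 9 + 2(1) = 15\). Choosing `d` to be a single row with one entry equal to \(1\) recovers the marginal distribution in the second property.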
The bivariate normal distribution is the special case \(n=2\) of the \(n\)-dimensional normal distribution.
The random variables \(X_1\) and \(X_2\) are said to have a bivariate normal distribution with mean \(\mathbf{\mu} = (\mu_1,\mu_2)\) and variance-covariance matrix \(\mathbf{\Sigma} = \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix}\) if their joint p.d.f. is given by \[\begin{align*} f_{X_1,X_2}(x_1,x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} &\exp \left\{ -\frac{1}{2(1-\rho^2)} \left[ \left( \frac{x_1-\mu_1}{\sigma_1} \right)^2 \right. \right. \\ & \left. \left. -2\rho \left( \frac{x_1-\mu_1}{\sigma_1} \right) \left( \frac{x_2-\mu_2}{\sigma_2} \right) + \left( \frac{x_2-\mu_2}{\sigma_2} \right)^2 \right] \right\}. \end{align*}\]
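To see how this density arises from the general \(n\)-dimensional formula, the following worked step may help. It is a sketch that assumes \(\sigma_1, \sigma_2 > 0\) and \(-1 < \rho < 1\), so that \(\mathbf{\Sigma}\) is invertible. For \(n=2\), \[\begin{align*} \left| \mathbf{\Sigma} \right| &= \sigma_1^2 \sigma_2^2 - \rho^2 \sigma_1^2 \sigma_2^2 = \sigma_1^2 \sigma_2^2 (1-\rho^2), \\ \mathbf{\Sigma}^{-1} &= \frac{1}{1-\rho^2} \begin{pmatrix} \frac{1}{\sigma_1^2} & -\frac{\rho}{\sigma_1 \sigma_2} \\ -\frac{\rho}{\sigma_1 \sigma_2} & \frac{1}{\sigma_2^2} \end{pmatrix}. \end{align*}\] Hence \((2\pi)^{-1} \left| \mathbf{\Sigma} \right|^{-\frac{1}{2}} = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\) and \[ (\mathbf{x}-\mathbf{\mu})^T \mathbf{\Sigma}^{-1} (\mathbf{x}-\mathbf{\mu}) = \frac{1}{1-\rho^2} \left[ \left( \frac{x_1-\mu_1}{\sigma_1} \right)^2 -2\rho \left( \frac{x_1-\mu_1}{\sigma_1} \right) \left( \frac{x_2-\mu_2}{\sigma_2} \right) + \left( \frac{x_2-\mu_2}{\sigma_2} \right)^2 \right], \] which, substituted into the \(n\)-dimensional p.d.f., gives the bivariate p.d.f. above.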
For \(X_1\) and \(X_2\) with bivariate normal distribution, we have:
- \(X_1 \sim N(\mu_1, \sigma_1^2)\) and \(X_2 \sim N(\mu_2, \sigma_2^2)\) and \(\text{Cov}(X_1,X_2) = \rho \sigma_1 \sigma_2\).
- \((X_1 | X_2 = x_2) \sim N \left( \mu_1 + \rho \frac{\sigma_1}{\sigma_2} (x_2-\mu_2), \sigma_1^2 (1-\rho^2) \right)\).
- \((X_2 | X_1 = x_1) \sim N \left( \mu_2 + \rho \frac{\sigma_2}{\sigma_1} (x_1-\mu_1), \sigma_2^2 (1-\rho^2) \right)\).
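The conditional distribution formulas can be evaluated directly. The sketch below is plain Python; the function name `conditional_x1_given_x2` and the parameter values are hypothetical choices made for illustration. It returns the mean and variance of \((X_1 | X_2 = x_2)\).

```python
def conditional_x1_given_x2(mu1, mu2, sigma1, sigma2, rho, x2):
    """Mean and variance of (X1 | X2 = x2) for a bivariate normal (X1, X2).

    sigma1 and sigma2 are standard deviations (not variances), and
    rho is the correlation, assumed to satisfy -1 < rho < 1.
    """
    mean = mu1 + rho * (sigma1 / sigma2) * (x2 - mu2)
    variance = sigma1 ** 2 * (1 - rho ** 2)
    return mean, variance


# Illustrative (hypothetical) values: mu = (0, 1), sigma1 = 2, sigma2 = 1, rho = 0.5.
cond_mean, cond_var = conditional_x1_given_x2(
    mu1=0.0, mu2=1.0, sigma1=2.0, sigma2=1.0, rho=0.5, x2=3.0
)
print(cond_mean)  # 2.0
print(cond_var)   # 3.0
```

Note that the conditional variance \(\sigma_1^2(1-\rho^2)\) does not depend on the observed value \(x_2\) and is never larger than the marginal variance \(\sigma_1^2\). By symmetry, the distribution of \((X_2 | X_1 = x_1)\) is obtained by swapping the roles of the two components in the function call.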
Suppose that \(\mathbf{Z} =(Z_1,Z_2)\), where \(Z_1\) and \(Z_2\) are independent \(N(0,1)\) random variables, that is, \[\mathbf{Z} \sim N_2 ( \mathbf{0}, \mathbf{I}_2),\] where \(\mathbf{I}_2\) is the \(2 \times 2\) identity matrix.
Then for any \(\mu =(\mu_1,\mu_2) \in \mathbb{R}^2\) and variance-covariance matrix \[ \Sigma = \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2\end{pmatrix}, \] with \(\sigma_1, \sigma_2 >0\) and \(-1 < \rho <1\), \[\mathbf{X} \sim N_2 ( \mu, \Sigma)\] can be expressed as a transformation of \(\mathbf{Z}\). Specifically, \[ \mathbf{X} = \mu + \mathbf{L} \mathbf{Z}, \] where \[ \mathbf{L} = \begin{pmatrix} \sigma_1 & 0 \\ \rho \sigma_2 & \sqrt{1-\rho^2} \sigma_2\end{pmatrix}. \] Note that \[ \Sigma = \mathbf{L} \mathbf{L}^\top \] and this is known as the Cholesky decomposition.
The Cholesky decomposition extends to any \(n \times n\) variance-covariance matrix \(\Sigma\) and is a common way to transform \(n\) independent standard normal random variables into a random vector with a given multivariate normal distribution.
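The construction \(\mathbf{X} = \mu + \mathbf{L}\mathbf{Z}\) can be sketched in code as follows. This is a minimal illustration in plain Python, not a production implementation: the function names `cholesky` and `sample_mvn` and the numerical values are assumptions made for the example, and in practice a linear algebra library routine would normally be used instead.

```python
import math
import random


def cholesky(sigma):
    """Return the lower-triangular L with L L^T = sigma.

    sigma is a symmetric positive definite matrix stored as a list of rows.
    """
    n = len(sigma)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                diag = sigma[i][i] - partial
                if diag <= 0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(diag)
            else:
                lower[i][j] = (sigma[i][j] - partial) / lower[j][j]
    return lower


def sample_mvn(mu, sigma, rng):
    """Draw one sample X = mu + L Z, where Z has independent N(0, 1) components."""
    lower = cholesky(sigma)
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + sum(l * zk for l, zk in zip(row, z)) for m, row in zip(mu, lower)]


# Illustrative (hypothetical) bivariate parameters.
sigma1, sigma2, rho = 2.0, 3.0, 0.5
sigma = [[sigma1 ** 2, rho * sigma1 * sigma2],
         [rho * sigma1 * sigma2, sigma2 ** 2]]

lower = cholesky(sigma)
print(lower)  # first column is [sigma1, rho * sigma2] = [2.0, 1.5]

rng = random.Random(0)  # fixed seed so the draw is reproducible
print(sample_mvn([1.0, -1.0], sigma, rng))
```

For these values the computed factor matches the closed form for \(\mathbf{L}\) given above: its entries are \(\sigma_1 = 2\), \(0\), \(\rho\sigma_2 = 1.5\) and \(\sqrt{1-\rho^2}\,\sigma_2\).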
Trivariate normal.
Suppose \(\mathbf{X}=(X_1,X_2,X_3)^T \sim N_3(\mathbf{0},\mathbf{\Sigma})\), where \[\mathbf{\Sigma} = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 4 & 0 \\ 0 & 0 & 5 \end{pmatrix}.\]
Find the distribution of \(Y=X_1+X_2\).
Determine the constant \(c\) such that \(Y_1 = 2X_1 + cX_2\) and \(Y_2 = 2X_1 + cX_3\) are independent.
Attempt Exercise 1: Trivariate Normal and then watch Video 23 for the solutions.
Video 23: Trivariate Normal
Alternatively, the solutions are available here:
Solution to Exercise 1
Writing \(Y = X_1+X_2\) in the form \(\mathbf{DX}\) requires \(\mathbf{D} = \begin{pmatrix} 1 & 1 & 0 \end{pmatrix}\). By the properties of a multivariate normal distribution \[Y \sim N(\mathbf{D0}, \mathbf{D \Sigma D}^T),\] where \(\mathbf{D0}=0\) and \[\begin{align*} \mathbf{D} \mathbf{\Sigma} \mathbf{D}^T &= \begin{pmatrix} 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} 2 & 1 & 0 \\ 1 & 4 & 0 \\ 0 & 0 & 5 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \\ &= \begin{pmatrix} 3 & 5 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \\ &= 8. \end{align*}\] Therefore, \(Y \sim N(0,8)\).
Let \(\mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = \mathbf{DX}\). Choose \[\mathbf{D} = \begin{pmatrix} 2 & c & 0 \\ 2 & 0 & c \end{pmatrix}.\] By the properties of a multivariate normal distribution, \(\mathbf{Y} \sim N_2(\mathbf{D0}, \mathbf{D \Sigma D}^T)\), where \(\mathbf{D0} = \mathbf{0}\) and \[\begin{align*} \mathbf{D \Sigma D}^T &= \begin{pmatrix} 2 & c & 0 \\ 2 & 0 & c \end{pmatrix} \begin{pmatrix} 2 & 1 & 0 \\ 1 & 4 & 0 \\ 0 & 0 & 5 \end{pmatrix} \begin{pmatrix} 2 & 2 \\ c & 0 \\ 0 & c \end{pmatrix} \\[3pt] &= \begin{pmatrix} 4+c & 2+4c & 0 \\ 4 & 2 & 5c \end{pmatrix} \begin{pmatrix} 2 & 2 \\ c & 0 \\ 0 & c \end{pmatrix} \\[3pt] &= \begin{pmatrix} 8+4c+4c^2 & 8+2c \\ 8+2c & 8+5c^2 \end{pmatrix}. \end{align*}\]
Since \(\mathbf{Y} = (Y_1,Y_2)^T\) is bivariate normal, \(Y_1\) and \(Y_2\) are independent if and only if they are uncorrelated, that is, \(\text{Cov}(Y_1,Y_2) = 8+2c = 0\). Therefore \(c=-4\).
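Both parts of the solution can be checked numerically. The sketch below is plain Python; the helper names `matmul`, `transpose`, `quadratic_form` and `cov_of_y` are illustrative and not part of the exercise. It recomputes \(\mathbf{D \Sigma D}^T\) for \(Y = X_1 + X_2\) and for \((Y_1, Y_2)\) with \(c=-4\).

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def quadratic_form(d, sigma):
    """Return D Sigma D^T, the variance-covariance matrix of D X."""
    return matmul(matmul(d, sigma), transpose(d))


# Variance-covariance matrix used in the solution to Exercise 1.
SIGMA = [[2, 1, 0],
         [1, 4, 0],
         [0, 0, 5]]

# First part: Y = X1 + X2, so D = (1 1 0).
var_y = quadratic_form([[1, 1, 0]], SIGMA)[0][0]
print(var_y)  # 8


def cov_of_y(c):
    """Variance-covariance matrix of (Y1, Y2) = (2 X1 + c X2, 2 X1 + c X3)."""
    d = [[2, c, 0],
         [2, 0, c]]
    return quadratic_form(d, SIGMA)


# Second part: with c = -4 the off-diagonal entries 8 + 2c vanish.
cov_y = cov_of_y(-4)
print(cov_y)  # [[56, 0], [0, 88]]
```

The diagonal entries agree with the formulas \(8+4c+4c^2\) and \(8+5c^2\) evaluated at \(c=-4\).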
Task: Lab 8
Attempt the R Markdown file for Lab 8:
Lab 8: Transformations and Multivariate Normal Distribution