3.6 Multivariate Probability Distributions Using Matrix Algebra

In this section, we show how matrix algebra can be used to simplify many of the messy expressions concerning expectations and covariances between multiple random variables, and how certain multivariate probability distributions (e.g., the multivariate normal distribution) can be expressed using matrix algebra.

3.6.1 Random vectors

Let X_{1},\ldots,X_{n} denote n random variables. For i=1,\ldots,n let \mu_{i}=E[X_{i}] and \sigma_{i}^{2}=\mathrm{var}(X_{i}), and let \sigma_{ij}=\mathrm{cov}(X_{i},X_{j}) for i\neq j. Define the n\times1 random vector \mathbf{X}=(X_{1},\ldots,X_{n})^{\prime}. Associated with \mathbf{X} is the n\times1 vector of expected values:

\mu=E[\mathbf{X}]=\left(\begin{array}{c} E[X_{1}]\\ \vdots\\ E[X_{n}] \end{array}\right)=\left(\begin{array}{c} \mu_{1}\\ \vdots\\ \mu_{n} \end{array}\right).

3.6.2 Covariance matrix

The covariance matrix, \Sigma, summarizes the variances and covariances of the elements of the random vector \mathbf{X}. In general, the covariance matrix of a random vector \mathbf{X} (sometimes called the variance of the vector \mathbf{X}) with mean vector \mu is defined as:

\mathrm{cov}(\mathbf{X})=E[(\mathbf{X}-\mu)(\mathbf{X}-\mu)^{\prime}]=\Sigma.

If \mathbf{X} has n elements then \Sigma will be the symmetric and positive semi-definite n\times n matrix,

\Sigma=\left(\begin{array}{cccc} \sigma_{1}^{2} & \sigma_{12} & \cdots & \sigma_{1n}\\ \sigma_{12} & \sigma_{2}^{2} & \cdots & \sigma_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ \sigma_{1n} & \sigma_{2n} & \cdots & \sigma_{n}^{2} \end{array}\right).

For the case n=2, we have by direct calculation
\begin{eqnarray*}
E[(\mathbf{X}-\mu)(\mathbf{X}-\mu)^{\prime}] & = & E\left[\left(\begin{array}{c} X_{1}-\mu_{1}\\ X_{2}-\mu_{2} \end{array}\right)(X_{1}-\mu_{1},\,X_{2}-\mu_{2})\right]\\
 & = & E\left[\left(\begin{array}{cc} (X_{1}-\mu_{1})^{2} & (X_{1}-\mu_{1})(X_{2}-\mu_{2})\\ (X_{2}-\mu_{2})(X_{1}-\mu_{1}) & (X_{2}-\mu_{2})^{2} \end{array}\right)\right]\\
 & = & \left(\begin{array}{cc} E[(X_{1}-\mu_{1})^{2}] & E[(X_{1}-\mu_{1})(X_{2}-\mu_{2})]\\ E[(X_{2}-\mu_{2})(X_{1}-\mu_{1})] & E[(X_{2}-\mu_{2})^{2}] \end{array}\right)\\
 & = & \left(\begin{array}{cc} \mathrm{var}(X_{1}) & \mathrm{cov}(X_{1},X_{2})\\ \mathrm{cov}(X_{2},X_{1}) & \mathrm{var}(X_{2}) \end{array}\right)=\left(\begin{array}{cc} \sigma_{1}^{2} & \sigma_{12}\\ \sigma_{12} & \sigma_{2}^{2} \end{array}\right)=\Sigma.
\end{eqnarray*}

If \rho_{12}\neq\pm1 then \det(\Sigma)\neq0, \Sigma has rank 2 and is positive definite. However, if \rho_{12}=\pm1 then \det(\Sigma)=0, \Sigma has rank 1 and is positive semi-definite.
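These rank conditions are easy to check numerically in R. The sketch below uses \sigma_{1}=1 and \sigma_{2}=2 (the values from the examples later in this section); the make.Sigma helper is just for illustration and is not a function from the text:

```r
# det(Sigma) distinguishes the full-rank and reduced-rank cases;
# make.Sigma is an illustrative helper, not a function from the text
sigma.1 = 1
sigma.2 = 2
make.Sigma = function(rho.12) {
  sigma.12 = rho.12*sigma.1*sigma.2
  matrix(c(sigma.1^2, sigma.12, sigma.12, sigma.2^2), 2, 2)
}
det(make.Sigma(0.5))
## [1] 3
det(make.Sigma(1))
## [1] 0
eigen(make.Sigma(1))$values  # smallest eigenvalue is (numerically) zero
```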

3.6.3 Correlation matrix

The correlation matrix, \mathbf{C}, summarizes all pairwise correlations between the elements of the n\times1 random vector \mathbf{X} and is given by

\mathbf{C}=\left(\begin{array}{cccc} 1 & \rho_{12} & \cdots & \rho_{1n}\\ \rho_{12} & 1 & \cdots & \rho_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ \rho_{1n} & \rho_{2n} & \cdots & 1 \end{array}\right).

Let \Sigma denote the covariance matrix of \mathbf{X}. Then the correlation matrix \mathbf{C} can be computed using

\mathbf{C}=\mathbf{D}^{-1}\Sigma\mathbf{D}^{-1},

where \mathbf{D} is an n\times n diagonal matrix with the standard deviations of the elements of \mathbf{X} along the main diagonal:

\mathbf{D}=\left(\begin{array}{cccc} \sigma_{1} & 0 & \cdots & 0\\ 0 & \sigma_{2} & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \sigma_{n} \end{array}\right).

Example 2.28 (Computing correlation matrix from covariance matrix in R)

In R the cov2cor() function can be used to compute the correlation matrix given the covariance matrix using (3.6):

sigma.1 = 1
sigma.2 = 2
rho.12 = 0.5
sigma.12 = rho.12*sigma.1*sigma.2
Sigma = matrix(c(sigma.1^2, sigma.12, sigma.12, sigma.2^2), 
               2, 2, byrow=TRUE)
Sigma
##      [,1] [,2]
## [1,]    1    1
## [2,]    1    4
cov2cor(Sigma)
##      [,1] [,2]
## [1,]  1.0  0.5
## [2,]  0.5  1.0
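The same result can be obtained by computing \mathbf{C}=\mathbf{D}^{-1}\Sigma\mathbf{D}^{-1} directly, a quick sketch using the same \Sigma:

```r
# C = D^{-1} %*% Sigma %*% D^{-1} computed by hand
Sigma = matrix(c(1, 1, 1, 4), 2, 2)
D.inv = diag(1/sqrt(diag(Sigma)))  # D^{-1}: reciprocal standard deviations
C = D.inv %*% Sigma %*% D.inv
C
##      [,1] [,2]
## [1,]  1.0  0.5
## [2,]  0.5  1.0
```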

3.6.4 Variance of linear combination of random vectors

Consider the n\times1 random vector \mathbf{X} with mean vector \mu and covariance matrix \Sigma. Let \mathbf{a}=(a_{1},\ldots,a_{n})^{\prime} be an n\times1 vector of constants and consider the random variable Y=\mathbf{a}^{\prime}\mathbf{X}=a_{1}X_{1}+\cdots+a_{n}X_{n}. The expected value of Y is:

\mu_{Y}=E[Y]=E[\mathbf{a}^{\prime}\mathbf{X}]=\mathbf{a}^{\prime}E[\mathbf{X}]=\mathbf{a}^{\prime}\mu.

The variance of Y is \mathrm{var}(Y)=\mathrm{var}(\mathbf{a}^{\prime}\mathbf{X})=E[(\mathbf{a}^{\prime}\mathbf{X}-\mathbf{a}^{\prime}\mu)^{2}]=E[(\mathbf{a}^{\prime}(\mathbf{X}-\mu))^{2}], since Y=\mathbf{a}^{\prime}\mathbf{X} is a scalar. Now we use a trick from matrix algebra. If z is a scalar (think of z=2), then z\cdot z=z\cdot z^{\prime}=z^{2}. Let z=\mathbf{a}^{\prime}(\mathbf{X}-\mu), so that z\cdot z^{\prime}=\mathbf{a}^{\prime}(\mathbf{X}-\mu)(\mathbf{X}-\mu)^{\prime}\mathbf{a}. Then
\begin{eqnarray*}
\mathrm{var}(Y) & = & E[z^{2}]=E[z\cdot z^{\prime}]=E[\mathbf{a}^{\prime}(\mathbf{X}-\mu)(\mathbf{X}-\mu)^{\prime}\mathbf{a}]\\
 & = & \mathbf{a}^{\prime}E[(\mathbf{X}-\mu)(\mathbf{X}-\mu)^{\prime}]\mathbf{a}=\mathbf{a}^{\prime}\mathrm{cov}(\mathbf{X})\mathbf{a}=\mathbf{a}^{\prime}\Sigma\mathbf{a}.
\end{eqnarray*}
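As a quick numerical illustration, \mathbf{a}^{\prime}\Sigma\mathbf{a} can be computed in R; here \mathbf{a}=(0.5,0.5)^{\prime} is an arbitrary example vector and \Sigma is the 2\times2 covariance matrix used in the earlier examples:

```r
# var(a'X) = a' Sigma a; a = (0.5, 0.5)' is an arbitrary example vector
Sigma = matrix(c(1, 1, 1, 4), 2, 2)
a = c(0.5, 0.5)
var.Y = as.numeric(t(a) %*% Sigma %*% a)
var.Y
## [1] 1.75
```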

3.6.5 Covariance between linear combination of two random vectors

Consider the n\times1 random vector \mathbf{X} with mean vector \mu and covariance matrix \Sigma. Let \mathbf{a}=(a_{1},\ldots,a_{n})^{\prime} and \mathbf{b}=(b_{1},\ldots,b_{n})^{\prime} be n\times1 vectors of constants, and consider the random variables Y=\mathbf{a}^{\prime}\mathbf{X}=a_{1}X_{1}+\cdots+a_{n}X_{n} and Z=\mathbf{b}^{\prime}\mathbf{X}=b_{1}X_{1}+\cdots+b_{n}X_{n}. From the definition of covariance we have:

\mathrm{cov}(Y,Z)=E[(Y-E[Y])(Z-E[Z])],

which may be rewritten in matrix notation as
\begin{eqnarray*}
\mathrm{cov}(\mathbf{a}^{\prime}\mathbf{X},\mathbf{b}^{\prime}\mathbf{X}) & = & E[(\mathbf{a}^{\prime}\mathbf{X}-\mathbf{a}^{\prime}\mu)(\mathbf{b}^{\prime}\mathbf{X}-\mathbf{b}^{\prime}\mu)]\\
 & = & E[\mathbf{a}^{\prime}(\mathbf{X}-\mu)\mathbf{b}^{\prime}(\mathbf{X}-\mu)]\\
 & = & E[\mathbf{a}^{\prime}(\mathbf{X}-\mu)(\mathbf{X}-\mu)^{\prime}\mathbf{b}]\\
 & = & \mathbf{a}^{\prime}E[(\mathbf{X}-\mu)(\mathbf{X}-\mu)^{\prime}]\mathbf{b}=\mathbf{a}^{\prime}\Sigma\mathbf{b}.
\end{eqnarray*}
Since \mathbf{a}^{\prime}(\mathbf{X}-\mu) and \mathbf{b}^{\prime}(\mathbf{X}-\mu) are scalars, we can use the trick \mathbf{a}^{\prime}(\mathbf{X}-\mu)\mathbf{b}^{\prime}(\mathbf{X}-\mu)=\mathbf{a}^{\prime}(\mathbf{X}-\mu)(\mathbf{X}-\mu)^{\prime}\mathbf{b}.
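The formula \mathbf{a}^{\prime}\Sigma\mathbf{b} is likewise a one-line computation in R; \mathbf{a} and \mathbf{b} below are arbitrary example vectors:

```r
# cov(a'X, b'X) = a' Sigma b; a and b are arbitrary example vectors
Sigma = matrix(c(1, 1, 1, 4), 2, 2)
a = c(0.5, 0.5)
b = c(1, -1)
cov.YZ = as.numeric(t(a) %*% Sigma %*% b)
cov.YZ
## [1] -1.5
```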

3.6.6 Variance of a vector of linear functions of a random vector

Consider the n\times1 random vector \mathbf{X} with mean vector \mu and covariance matrix \Sigma. Let \mathbf{A} be an m\times n matrix. Then the m\times1 random vector \mathbf{Y}=\mathbf{A}\mathbf{X} represents m linear functions of \mathbf{X}. By the linearity of expectation, we have E[\mathbf{Y}]=\mathbf{A}E[\mathbf{X}]=\mathbf{A}\mu. The m\times m covariance matrix of \mathbf{Y} is given by
\begin{eqnarray*}
\mathrm{var}(\mathbf{Y}) & = & E[(\mathbf{Y}-E[\mathbf{Y}])(\mathbf{Y}-E[\mathbf{Y}])^{\prime}]=E[(\mathbf{A}\mathbf{X}-\mathbf{A}\mu)(\mathbf{A}\mathbf{X}-\mathbf{A}\mu)^{\prime}]\\
 & = & E[\mathbf{A}(\mathbf{X}-\mu)(\mathbf{A}(\mathbf{X}-\mu))^{\prime}]=E[\mathbf{A}(\mathbf{X}-\mu)(\mathbf{X}-\mu)^{\prime}\mathbf{A}^{\prime}]\\
 & = & \mathbf{A}E[(\mathbf{X}-\mu)(\mathbf{X}-\mu)^{\prime}]\mathbf{A}^{\prime}=\mathbf{A}\Sigma\mathbf{A}^{\prime}.
\end{eqnarray*}
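A small sketch of \mathbf{A}\Sigma\mathbf{A}^{\prime} in R, where \mathbf{A} is an arbitrary example matrix mapping (X_{1},X_{2})^{\prime} to (X_{1}+X_{2},\,X_{1}-X_{2})^{\prime}:

```r
# var(A X) = A Sigma A'; A maps (X1, X2) to (X1 + X2, X1 - X2)
Sigma = matrix(c(1, 1, 1, 4), 2, 2)
A = matrix(c(1,  1,
             1, -1), 2, 2, byrow=TRUE)
var.Y = A %*% Sigma %*% t(A)
var.Y
##      [,1] [,2]
## [1,]    7   -3
## [2,]   -3    3
```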

3.6.7 Multivariate normal distribution

Consider n random variables X_{1},\ldots,X_{n} and assume they are jointly normally distributed. Define the n\times1 vectors \mathbf{X}=(X_{1},\ldots,X_{n})^{\prime}, \mathbf{x}=(x_{1},\ldots,x_{n})^{\prime} and \mu=(\mu_{1},\ldots,\mu_{n})^{\prime}, and the n\times n covariance matrix:

\Sigma=\left(\begin{array}{cccc} \sigma_{1}^{2} & \sigma_{12} & \cdots & \sigma_{1n}\\ \sigma_{12} & \sigma_{2}^{2} & \cdots & \sigma_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ \sigma_{1n} & \sigma_{2n} & \cdots & \sigma_{n}^{2} \end{array}\right).

Then \mathbf{X}\sim N(\mu,\,\Sigma) means that the random vector \mathbf{X} has a multivariate normal distribution with mean vector \mu and covariance matrix \Sigma. The pdf of the multivariate normal distribution can be compactly expressed as:

f(\mathbf{x})=\frac{1}{(2\pi)^{n/2}\det(\Sigma)^{1/2}}\exp\left(-\frac{1}{2}(\mathbf{x}-\mu)^{\prime}\Sigma^{-1}(\mathbf{x}-\mu)\right).
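The pdf can be evaluated directly in base R. The dmvn helper below is an illustrative sketch (not a function from the text), checked against the product of univariate normal densities in the independent case \Sigma=\mathbf{I}_{2}:

```r
# Evaluate the multivariate normal pdf from the formula above;
# dmvn is an illustrative helper, not a function from the text
dmvn = function(x, mu, Sigma) {
  n = length(mu)
  d = x - mu
  as.numeric(exp(-0.5 * t(d) %*% solve(Sigma) %*% d) /
               ((2*pi)^(n/2) * sqrt(det(Sigma))))
}
# Sanity check: with Sigma = I_2 the pdf factors into univariate normal pdfs
dmvn(c(0.5, -0.2), mu=c(0, 0), Sigma=diag(2))
dnorm(0.5)*dnorm(-0.2)  # same value
```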

Example 3.8 (Creating a multivariate normal random vector from a vector of independent standard normal random variables)

Let \mathbf{Z}=(Z_{1},\ldots,Z_{n})^{\prime} denote a vector of n iid standard normal random variables. Then \mathbf{Z}\sim N(\mathbf{0},\,\mathbf{I}_{n}), where \mathbf{I}_{n} denotes the n-dimensional identity matrix. Given \mathbf{Z}, we would like to create a random vector \mathbf{X} such that \mathbf{X}\sim N(\mu,\,\Sigma). Let \Sigma^{1/2} denote the upper triangular Cholesky factor of \Sigma such that \Sigma=\Sigma^{1/2\prime}\Sigma^{1/2}. Define \mathbf{X} using \mathbf{X}=\mu+\Sigma^{1/2\prime}\mathbf{Z}. Then \begin{eqnarray*} E[\mathbf{X}] & = & \mu+\Sigma^{1/2\prime}E[\mathbf{Z}]=\mu,\\ \mathrm{var}(\mathbf{X}) & = & \mathrm{var}\left(\Sigma^{1/2\prime}\mathbf{Z}\right)=\Sigma^{1/2\prime}\mathrm{var}(\mathbf{Z})\Sigma^{1/2}=\Sigma^{1/2\prime}\mathbf{I}_{n}\Sigma^{1/2}=\Sigma. \end{eqnarray*} Hence, \mathbf{X}\sim N(\mu,\,\Sigma).

\blacksquare

Example 2.30 (Simulating multivariate normal random vectors in R)

Here, we illustrate how to simulate \mathbf{X} \sim N(\mu, \Sigma) where \mu = (1,1)^{\prime} and \Sigma = \left(\begin{array}{cc} 1 & 1\\ 1 & 4 \end{array}\right). First, specify the mean vector and covariance matrix for the bivariate normal distribution:

mu.vec = c(1,1)
sigma.1 = 1
sigma.2 = 2
rho.12 = 0.5
sigma.12 = rho.12*sigma.1*sigma.2
Sigma = matrix(c(sigma.1^2, sigma.12, sigma.12, sigma.2^2), 
               2, 2, byrow=TRUE)
mu.vec
## [1] 1 1
Sigma
##      [,1] [,2]
## [1,]    1    1
## [2,]    1    4

Second, compute the Cholesky factorization of \Sigma:

Sigma.5 = chol(Sigma)
Sigma.5
##      [,1]     [,2]
## [1,]    1 1.000000
## [2,]    0 1.732051

Third, simulate \mathbf{Z} \sim N(\mathbf{0}, \mathbf{I}_{2}):

n = 2
set.seed(123)
Z = rnorm(n)

Fourth, compute \mathbf{X} = \mu + \Sigma^{1/2 \prime}\mathbf{Z}:

X = mu.vec + t(Sigma.5)%*%Z
X
##            [,1]
## [1,] 0.43952435
## [2,] 0.04084525

\blacksquare
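As a follow-up check (not part of the original example), generating many draws of \mathbf{X}=\mu+\Sigma^{1/2\prime}\mathbf{Z} and comparing the sample mean vector and sample covariance matrix with \mu and \Sigma confirms the construction; N = 100000 is an arbitrary simulation size:

```r
# Generate many draws of X = mu + t(Sigma.5) %*% Z and check sample moments
mu.vec = c(1, 1)
Sigma = matrix(c(1, 1, 1, 4), 2, 2)
Sigma.5 = chol(Sigma)          # upper triangular Cholesky factor
set.seed(123)
N = 100000
Z = matrix(rnorm(2*N), 2, N)   # each column is a draw of Z ~ N(0, I_2)
X = mu.vec + t(Sigma.5) %*% Z  # each column is a draw of X ~ N(mu, Sigma)
rowMeans(X)                    # close to mu = (1, 1)
cov(t(X))                      # close to Sigma
```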