Chapter 6 Multivariate Normal Distributions
To date, we have studied a number of standard probability distributions: uniform, Bernoulli, binomial, geometric, negative binomial, Poisson, exponential, gamma, normal. These all govern exactly one quantity. However, throughout this course we have also studied joint probability distributions: that is, the study of multiple quantities simultaneously. This raises the question: are there any standard probability distributions in higher dimensions?
6.1 Matrices: symmetry, eigenvalues and positive definiteness
Throughout this chapter, we will use the language of matrices freely. As a point of notation, we will represent all matrices with variable names in a bold font, for example $\mathbf{A}$. A vector will be thought of as a matrix in this sense: it is a matrix with either one row or one column.
Further, when we use the notation $\mathbf{A} = (a_{ij})$, this labels the entry of $\mathbf{A}$ in the $i^{th}$ row and $j^{th}$ column by $a_{ij}$.
We consolidate our knowledge on the definition of symmetric matrices, the determinant of an $n \times n$ matrix, the definitions of eigenvalues and eigenvectors, and what it means for a matrix to be positive definite.
Definition 6.1.1. Let $\mathbf{A}$ be an $n \times n$ matrix. Then $\mathbf{A}$ is said to be symmetric if $\mathbf{A}^T = \mathbf{A}$.
Definition 6.1.1 is equivalent to $\mathbf{A} = (a_{ij})$ satisfying $a_{ij} = a_{ji}$ for all $i, j$ with $1 \le i, j \le n$.
Are the following matrices symmetric?
The answers are: Yes; Yes; No; Yes.
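Symmetry is also easy to check in R; the following is a minimal sketch with an illustrative matrix (not one of the matrices above):

# an illustrative 2 x 2 matrix
A <- matrix(c(1, 5, 5, 3), nrow=2)
# a matrix is symmetric exactly when it equals its transpose
all(A == t(A))   # TRUE
isSymmetric(A)   # TRUE, the built-in check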
Definition 6.1.3. Let $\mathbf{A}$ be an $n \times n$ matrix. A scalar $\lambda$ is called an eigenvalue of $\mathbf{A}$ if there is a non-zero vector $\mathbf{x}$ such that $\mathbf{A}\mathbf{x} = \lambda\mathbf{x}$. The vector $\mathbf{x}$ corresponding to $\lambda$ is called an eigenvector of $\mathbf{A}$.
It was shown that an eigenvalue $\lambda$ of an $n \times n$ matrix $\mathbf{A}$ satisfies the characteristic polynomial: $\det(\mathbf{A} - \lambda\mathbf{I}) = 0$. Recall that the determinant of any $n \times n$ matrix $\mathbf{A} = (a_{ij})$ is calculated by fixing some value $i$ with $1 \le i \le n$, and finding
$$\det(\mathbf{A}) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} M_{ij},$$
where $M_{ij}$ is the $(i, j)$ minor of $\mathbf{A}$, that is, the determinant of the $(n-1) \times (n-1)$ matrix obtained by deleting the $i^{th}$ row and $j^{th}$ column of $\mathbf{A}$.
What are the eigenvalues and eigenvectors of
$$\mathbf{D} = \begin{pmatrix} 13 & -4 & 2 \\ -4 & 11 & -2 \\ 2 & -2 & 8 \end{pmatrix}?$$
Note that $\mathbf{D}$ is a $3 \times 3$ matrix, so in the notation of Definition 6.1.3, $n = 3$. Calculate
$$\det(\mathbf{D} - \lambda\mathbf{I}) = (13 - \lambda)\bigl[(11 - \lambda)(8 - \lambda) - 4\bigr] + 4\bigl[4 - 4(8 - \lambda)\bigr] + 2\bigl[8 - 2(11 - \lambda)\bigr].$$
So
$$\det(\mathbf{D} - \lambda\mathbf{I}) = -\lambda^3 + 32\lambda^2 - 311\lambda + 952 = -(\lambda - 7)(\lambda - 8)(\lambda - 17).$$
So $\lambda = 7, 8, 17$ are the eigenvalues.
For $\lambda = 7$, we are looking for an eigenvector $\mathbf{x} = (x_1, x_2, x_3)^T$ satisfying $(\mathbf{D} - 7\mathbf{I})\mathbf{x} = \mathbf{0}$, that is,
$$6x_1 - 4x_2 + 2x_3 = 0, \quad (1)$$
$$-4x_1 + 4x_2 - 2x_3 = 0, \quad (2)$$
$$2x_1 - 2x_2 + x_3 = 0. \quad (3)$$
Equation (1) rearranges to $x_3 = 2x_2 - 3x_1$. Substituting this into equation (2), we obtain
$$-4x_1 + 4x_2 - 2(2x_2 - 3x_1) = 2x_1 = 0.$$
So clearly $x_1 = 0$, and $x_2 = 1$, $x_3 = 2$ is a solution. So we have an eigenvector $(0, 1, 2)^T$.
For $\lambda = 8$, we are looking for an eigenvector $\mathbf{x} = (x_1, x_2, x_3)^T$ satisfying
$$5x_1 - 4x_2 + 2x_3 = 0, \quad -4x_1 + 3x_2 - 2x_3 = 0, \quad 2x_1 - 2x_2 = 0.$$
Similarly to the previous eigenvalue, we solve these three equations simultaneously, giving the eigenvector $(2, 2, -1)^T$.
For $\lambda = 17$, we are looking for an eigenvector $\mathbf{x} = (x_1, x_2, x_3)^T$ satisfying
$$-4x_1 - 4x_2 + 2x_3 = 0, \quad -4x_1 - 6x_2 - 2x_3 = 0, \quad 2x_1 - 2x_2 - 9x_3 = 0.$$
Similarly to the first eigenvalue, we solve these three equations simultaneously, giving the eigenvector $(5, -4, 2)^T$.
Therefore $\mathbf{D}$ has eigenvalues $7, 8, 17$ with corresponding eigenvectors given respectively by $(0, 1, 2)^T$, $(2, 2, -1)^T$, $(5, -4, 2)^T$.
A symmetric $n \times n$ matrix $\mathbf{A}$ is positive definite if the quantity $\mathbf{x}^T\mathbf{A}\mathbf{x}$ is positive for all non-zero column vectors $\mathbf{x}$ of length $n$.
Note that $\mathbf{A}\mathbf{x}$ is the transformation of $\mathbf{x}$ under $\mathbf{A}$. It follows that $\mathbf{x}^T\mathbf{A}\mathbf{x}$ is the dot product of $\mathbf{x}$ and $\mathbf{A}\mathbf{x}$. This being positive is equivalent to the angle between $\mathbf{x}$ and $\mathbf{A}\mathbf{x}$ being less than $90°$. Therefore a positive definite matrix is informally a matrix that transforms all vectors in such a way that they still point in the same general direction.
Determine whether the matrix $\mathbf{A} = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$ is positive definite.
For any non-zero vector $\mathbf{x} = (x_1, x_2)^T$, we have
$$\mathbf{x}^T\mathbf{A}\mathbf{x} = 2x_1^2 - 2x_1x_2 + 2x_2^2 = x_1^2 + x_2^2 + (x_1 - x_2)^2.$$
Note for all real numbers $x_1, x_2$, we have $(x_1 - x_2)^2 \ge 0$ and $x_1^2 + x_2^2 > 0$ whenever $(x_1, x_2) \ne (0, 0)$. It follows that $\mathbf{x}^T\mathbf{A}\mathbf{x} > 0$ for all $\mathbf{x} \ne \mathbf{0}$, so $\mathbf{A}$ is positive definite.
Determine whether the matrix $\mathbf{B} = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$ is positive definite.
Set $\mathbf{x} = (1, -1)^T$. We have
$$\mathbf{x}^T\mathbf{B}\mathbf{x} = 1 - 2 - 2 + 1 = -2 < 0.$$
Therefore $\mathbf{B}$ is not positive definite.
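Positive definiteness can also be checked numerically: a symmetric matrix is positive definite if and only if all of its eigenvalues are positive. The following is a minimal sketch in R, applied to the two matrices above:

A <- matrix(c(2, -1, -1, 2), nrow=2)
B <- matrix(c(1, 2, 2, 1), nrow=2)
# positive definite exactly when every eigenvalue is positive
all(eigen(A)$values > 0)   # TRUE
all(eigen(B)$values > 0)   # FALSE: B has eigenvalue -1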
Matrix calculations can be performed in R. To create a matrix variable you can use the matrix function. As input the function takes a vector, below this is c(2, 4, 3, 1, 5, 7), representing the entries, and a parameter nrow representing the number of rows the entries should be sorted into.
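For example (a sketch assuming the entries are sorted into two rows; note that matrix fills its entries column by column by default):

A <- matrix(c(2, 4, 3, 1, 5, 7), nrow=2)
A
#      [,1] [,2] [,3]
# [1,]    2    3    5
# [2,]    4    1    7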
R also allows for algebraic manipulations of matrices such as matrix addition, matrix multiplication and scalar multiplication.
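For instance, a minimal sketch with two illustrative 2 x 2 matrices (here %*% denotes matrix multiplication, while * acts entry-wise):

A <- matrix(c(1, 2, 3, 4), nrow=2)
B <- matrix(c(0, 1, 1, 0), nrow=2)
A + B     # matrix addition
A %*% B   # matrix multiplication
3 * A     # scalar multiplication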
Finally R allows for easy calculation of eigenvalues and eigenvectors of a square matrix.
D = matrix( c(13, -4, 2, -4, 11, -2, 2, -2, 8), nrow=3)
ev <- eigen(D)
values <- ev$values
vectors <- ev$vectors
values
vectors
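Note that eigen returns the eigenvalues in decreasing order and normalises each eigenvector to unit length, so, up to scaling and sign, the output agrees with the eigenvalues and eigenvectors computed by hand in the example above.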
6.2 Definition of Multivariate Normal Distribution
The normal distribution that we saw in Section 1.3 generalises to higher dimensions. Let $n$ represent the dimension we are interested in.
Informally, a multivariate normal distribution can be defined as follows: consider $n$ independent standard normal distributions $Z_1, \dots, Z_n$, that is, $Z_i \sim N(0, 1)$ for all $i$ with $1 \le i \le n$. From these, form a vector of random variables $\mathbf{Z} = (Z_1, \dots, Z_n)^T$. Then for any non-singular $n \times n$ matrix $\mathbf{A}$ and any vector $\boldsymbol{\mu} \in \mathbb{R}^n$, define $\mathbf{X} = \mathbf{A}\mathbf{Z} + \boldsymbol{\mu}$. Then the vector of random variables $\mathbf{X} = (X_1, \dots, X_n)^T$ is distributed by a multivariate normal distribution.
This is saying that every component $X_i$ is a linear combination of some fixed group of standard normal distributions, plus a translation.
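This construction is easy to simulate; the following is a minimal sketch in R with $n = 2$ and an illustrative choice of $\mathbf{A}$ and $\boldsymbol{\mu}$:

# two independent standard normal variables Z_1, Z_2
Z <- rnorm(2)
# an illustrative non-singular matrix A and translation vector mu
A <- matrix(c(1, 0.5, 0.5, 1), nrow=2)
mu <- c(1, 2)
# one realisation of X = A Z + mu
X <- A %*% Z + mu
X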
Formally, we define the multivariate normal distribution by the joint probability density function of $\mathbf{X}$.
Definition 6.2.1. Let $\boldsymbol{\mu} \in \mathbb{R}^n$ and let $\Sigma$ be an $n \times n$ real, symmetric, positive definite matrix (equivalently, a real symmetric matrix all of whose eigenvalues are positive).
A vector of random variables $\mathbf{X} = (X_1, \dots, X_n)^T$ is said to have an $n$-dimensional normal distribution with parameters $\boldsymbol{\mu}$ and $\Sigma$, denoted $\mathbf{X} \sim N_n(\boldsymbol{\mu}, \Sigma)$, if the joint PDF of $\mathbf{X}$ is given by
$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}\sqrt{\det\Sigma}} \exp\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right), \qquad \mathbf{x} \in \mathbb{R}^n.$$
The PDF outlined in the formal Definition 6.2.1 can be derived from the informal construction of $\mathbf{X}$ at the beginning of the section by taking $\Sigma = \mathbf{A}\mathbf{A}^T$.
Note for $n = 1$, with $\boldsymbol{\mu} = (\mu)$ and $\Sigma = (\sigma^2)$, Definition 6.2.1 simplifies to give the PDF of the one-dimensional normal distribution: $f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$.
The multivariate normal distribution has the following important properties:
The expectation of $X_i$ is given by the $i^{th}$ entry of $\boldsymbol{\mu}$, that is, $E(X_i) = \mu_i$ for all $1 \le i \le n$. Succinctly, the vector $\boldsymbol{\mu}$ is the expectation vector of $\mathbf{X}$;
The covariance $\mathrm{Cov}(X_i, X_j)$, reducing to the variance $\mathrm{Var}(X_i)$ when $i = j$, is given by the element in the $i^{th}$ row and $j^{th}$ column of $\Sigma$, that is, $\mathrm{Cov}(X_i, X_j) = \Sigma_{ij}$ for all $1 \le i, j \le n$. Succinctly, the matrix $\Sigma$ is the variance-covariance matrix of $\mathbf{X}$. It follows that if $X_1, \dots, X_n$ are independent then $\Sigma_{ij} = 0$ for all $i \ne j$;
These two points indicate why $\boldsymbol{\mu}$ and $\Sigma$ are the two input parameters for the multivariate normal distribution.
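These properties can be checked empirically by simulation; the following is a minimal sketch using rmvnorm from the mixtools package (used again in the next section), with illustrative parameters:

library(mixtools)
mu <- c(1, 2)
Sigma <- matrix(c(2, -1, -1, 2), nrow=2)
# simulate a large sample and compare the sample moments with the parameters
X <- rmvnorm(n=10000, mu, Sigma)
colMeans(X)   # approximately (1, 2)
cov(X)        # approximately Sigma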
6.3 Two-Dimensional Normal Distribution
In this section we study the $2$-dimensional normal distribution, often referred to as the bivariate normal distribution.
When $n = 2$, or another suitably small value, we can explicitly label the elements of $\boldsymbol{\mu}$ and $\Sigma$ as
$$\boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}$$
respectively. Here $\sigma_1$ is the standard deviation of $X_1$, $\sigma_2$ is the standard deviation of $X_2$ and $\rho$ is the correlation coefficient $\mathrm{Corr}(X_1, X_2)$.
Why are the upper right and lower left entries of $\Sigma$ equal to $\rho\sigma_1\sigma_2$?
Show that $\Sigma$ is non-singular if and only if $\sigma_1\sigma_2 > 0$ and $\rho \ne \pm 1$.
Substitution of these values into Definition 6.2.1 gives
$$f_{X_1, X_2}(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}} \exp\left(-\frac{1}{2(1 - \rho^2)}\left[\frac{(x_1 - \mu_1)^2}{\sigma_1^2} - \frac{2\rho(x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1\sigma_2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2}\right]\right).$$
The following code creates a contour plot of the PDF above with $\boldsymbol{\mu} = (0, 0)^T$ and $\Sigma = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$.
library(mnormt)
#create bivariate normal distribution
x <- seq(-3, 3, 0.1)
y <- seq(-3, 3, 0.1)
mu <- c(0, 0)
sigma <- matrix(c(2, -1, -1, 2), nrow=2)
f <- function(x, y) dmnorm(cbind(x, y), mu, sigma)
z <- outer(x, y, f)
#create contour plot
contour(x, y, z)
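As a sanity check, the explicit bivariate formula above can be coded directly and compared with dmnorm; a minimal sketch with the same $\Sigma$, for which $\sigma_1 = \sigma_2 = \sqrt{2}$ and $\rho = -1/2$:

library(mnormt)
sig1 <- sqrt(2); sig2 <- sqrt(2); rho <- -1/2
# the bivariate normal PDF written out explicitly (with mu_1 = mu_2 = 0)
f2 <- function(x1, x2) {
  q <- (x1/sig1)^2 - 2*rho*(x1/sig1)*(x2/sig2) + (x2/sig2)^2
  exp(-q / (2*(1 - rho^2))) / (2*pi*sig1*sig2*sqrt(1 - rho^2))
}
# the two agree at an arbitrarily chosen point
f2(0.3, -0.7)
dmnorm(c(0.3, -0.7), c(0, 0), matrix(c(2, -1, -1, 2), nrow=2))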
A sample of size $200$ can be observed from a bivariate normal distribution using the following R code.
library(mixtools)
# Set up means, standard deviations, correlations
mu1 <- 1; mu2 <- 2
sig1 <- 1; sig2 <- 1; rho <- 0;
# Construct mean vector and (co)variance matrix
mu <- c(mu1,mu2)
Sigma <- matrix(c(sig1^2,rho*sig1*sig2,rho*sig1*sig2,sig2^2), ncol=2)
#plot all realisations
X <- rmvnorm(n=200, mu, Sigma)
plot(X)
In the above code, setting mu as a vector of length $n$ and Sigma as an $n \times n$ matrix before inputting them into rmvnorm allows one to simulate an $n$-dimensional normal distribution.
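For instance, a minimal sketch simulating a $3$-dimensional normal distribution, reusing the positive definite matrix $\mathbf{D}$ from Section 6.1 as the variance-covariance matrix and an illustrative mean vector:

library(mixtools)
mu <- c(0, 1, 2)   # an illustrative mean vector
Sigma <- matrix(c(13, -4, 2, -4, 11, -2, 2, -2, 8), nrow=3)
X <- rmvnorm(n=200, mu, Sigma)
dim(X)   # 200 rows (realisations) and 3 columns (components)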
6.4 Properties of Multivariate Normal Distribution
The following theorem states that a linear transformation of a multivariate normal distribution is itself a multivariate normal distribution.
Theorem 6.4.1. Let $\mathbf{X} \sim N_n(\boldsymbol{\mu}, \Sigma)$ and consider some $m \times n$ matrix $\mathbf{B}$ of full rank, with $m \le n$. Define a new random vector $\mathbf{Y} = \mathbf{B}\mathbf{X}$. Then $\mathbf{Y} \sim N_m(\mathbf{B}\boldsymbol{\mu}, \mathbf{B}\Sigma\mathbf{B}^T)$.
As a corollary of this theorem, we can easily calculate the marginal distributions of a multivariate normal distribution.
The marginal distribution of each component $X_i$ is a one-dimensional normal distribution; specifically, $X_i \sim N(\mu_i, \Sigma_{ii})$.
This is a direct consequence of Theorem 6.4.1 by setting $\mathbf{B} = (0, \dots, 0, 1, 0, \dots, 0)$, where the $1$ is in the $i^{th}$ position.
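This corollary can be illustrated by simulation; a minimal sketch with illustrative parameters, overlaying the claimed marginal density on a histogram of the first component:

library(mixtools)
mu <- c(1, 2)
Sigma <- matrix(c(2, -1, -1, 2), nrow=2)
X <- rmvnorm(n=5000, mu, Sigma)
# the first component should be distributed as N(1, 2)
hist(X[,1], freq=FALSE, breaks=40)
curve(dnorm(x, mean=1, sd=sqrt(2)), add=TRUE)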
Suppose $\mathbf{X} = (X_1, X_2)^T \sim N_2(\boldsymbol{\mu}, \Sigma)$, where $\boldsymbol{\mu}$ and $\Sigma$ are written in the form of Section 6.3. Find the distribution of $X_1 + X_2$.
Suppose $\mathbf{X} = (X_1, X_2)^T \sim N_2(\boldsymbol{\mu}, \Sigma)$, where $\boldsymbol{\mu}$ and $\Sigma$ are written in the form of Section 6.3. Determine the constant $c$ such that $Y_1 = X_1$ and $Y_2 = X_2 - cX_1$ are independent.
Writing $\mathbf{Y} = (Y_1, Y_2)^T$ in the form $\mathbf{Y} = \mathbf{B}\mathbf{X}$ requires
$$\mathbf{B} = \begin{pmatrix} 1 & 0 \\ -c & 1 \end{pmatrix}.$$
By the properties of a multivariate normal distribution, $\mathbf{Y} \sim N_2(\mathbf{B}\boldsymbol{\mu}, \mathbf{B}\Sigma\mathbf{B}^T)$.
Calculate
$$\mathbf{B}\boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 - c\mu_1 \end{pmatrix} \quad \text{and} \quad \mathbf{B}\Sigma\mathbf{B}^T = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 - c\sigma_1^2 \\ \rho\sigma_1\sigma_2 - c\sigma_1^2 & \sigma_2^2 - 2c\rho\sigma_1\sigma_2 + c^2\sigma_1^2 \end{pmatrix}.$$
For $Y_1$ and $Y_2$ to be independent, necessarily $\mathrm{Cov}(Y_1, Y_2) = \rho\sigma_1\sigma_2 - c\sigma_1^2 = 0$. Therefore $c = \rho\sigma_2/\sigma_1$. Since $\mathbf{Y}$ is itself multivariate normal, zero covariance implies that $Y_1$ and $Y_2$ are indeed independent.
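A quick simulation check of this example (a sketch, with illustrative values of $\sigma_1$, $\sigma_2$ and $\rho$):

library(mixtools)
sig1 <- 1; sig2 <- 2; rho <- 0.5
mu <- c(0, 0)
Sigma <- matrix(c(sig1^2, rho*sig1*sig2, rho*sig1*sig2, sig2^2), nrow=2)
X <- rmvnorm(n=10000, mu, Sigma)
# with c = rho*sig2/sig1, the sample correlation of X1 and X2 - c*X1
# should be close to zero
c0 <- rho*sig2/sig1
cor(X[,1], X[,2] - c0*X[,1])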