Chapter 2 Multivariate Normal Distributions
To date, we have studied a number of standard probability distributions: uniform, Bernoulli, binomial, geometric, negative binomial, Poisson, exponential, gamma and normal. These are all random variables that govern exactly one quantity. This raises the question: are there standard probability distributions in higher dimensions?
2.1 Definition of Multivariate Normal Distribution
The normal distribution generalises to higher dimensions. Let $n$ represent the dimension we are interested in.
Informally, a multivariate normal distribution can be defined as follows. Consider $n$ independent standard normal random variables $Z_1, \dots, Z_n$, that is, $Z_i \sim N(0,1)$ for all $i$ with $1 \le i \le n$. From this, form a vector of random variables $\mathbf{Z} = (Z_1, \dots, Z_n)^T$. Then for any non-singular $n \times n$ matrix $A$ and any vector $\mu \in \mathbb{R}^n$, define $\mathbf{X} = A\mathbf{Z} + \mu$. Then the vector of random variables $\mathbf{X}$ is distributed by a multivariate normal distribution.
This is saying that every component $X_i$ is a linear combination of some fixed collection of standard normal random variables, plus a translation.
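The construction above can be sketched in a few lines of base R. The matrix and vector below are arbitrary illustrative choices, not values from the text.

```r
# A minimal sketch of the informal construction X = A Z + mu
set.seed(1)
A  <- matrix(c(1, 0.5, 0, 2), nrow = 2)   # any non-singular 2x2 matrix
mu <- c(1, -1)
Z  <- rnorm(2)                            # two independent N(0,1) draws
X  <- as.vector(A %*% Z + mu)             # one realisation of N_2(mu, A A^T)
X
```

Repeating the last three lines many times produces a cloud of points distributed according to the corresponding bivariate normal distribution.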
Formally, we define the multivariate normal distribution by the joint probability density function of $\mathbf{X} = (X_1, \dots, X_n)^T$.
Let $\mu \in \mathbb{R}^n$ and let $\Sigma$ be an $n \times n$ real, symmetric, positive definite matrix, that is, one all of whose eigenvalues are positive.
A vector of random variables $\mathbf{X} = (X_1, \dots, X_n)^T$ is said to have an $n$-dimensional normal distribution with parameters $\mu$ and $\Sigma$, denoted $\mathbf{X} \sim N_n(\mu, \Sigma)$, if the joint PDF of $\mathbf{X}$ is given by
$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}\sqrt{\det \Sigma}} \exp\left(-\frac{1}{2}(\mathbf{x} - \mu)^T \Sigma^{-1} (\mathbf{x} - \mu)\right), \qquad \mathbf{x} \in \mathbb{R}^n.$$
The PDF outlined in the formal Definition 2.1.1 can be derived from the informal construction of $\mathbf{X} = A\mathbf{Z} + \mu$ at the beginning of the section by taking $\Sigma = AA^T$.
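The choice $\Sigma = AA^T$ can be checked directly from the construction, using the fact that $\operatorname{Var}(\mathbf{Z}) = I_n$ for a vector of independent standard normal random variables:
$$\operatorname{Var}(\mathbf{X}) = \operatorname{Var}(A\mathbf{Z} + \mu) = A \operatorname{Var}(\mathbf{Z}) A^T = A I_n A^T = AA^T.$$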
Note that for $n = 1$, Definition 2.1.1 simplifies to give the PDF of the one-dimensional normal distribution.
The multivariate normal distribution has the following important properties:
The expectation of $X_i$ is given by the $i^{\text{th}}$ entry of $\mu$, that is, $E(X_i) = \mu_i$ for all $1 \le i \le n$. Succinctly, the vector $\mu$ is the expectation vector of $\mathbf{X}$;
The covariance $\operatorname{Cov}(X_i, X_j)$, reducing to the variance $\operatorname{Var}(X_i)$ when $i = j$, is given by the element in the $i^{\text{th}}$ row and $j^{\text{th}}$ column of $\Sigma$, that is, $\operatorname{Cov}(X_i, X_j) = \Sigma_{ij}$ for all $1 \le i \le n$ and $1 \le j \le n$. Succinctly, the matrix $\Sigma$ is the variance-covariance matrix of $\mathbf{X}$. It follows that if $X_1, \dots, X_n$ are independent then $\Sigma_{ij} = 0$ for all $i \ne j$;
These two points indicate why $\mu$ and $\Sigma$ are the two input parameters for the multivariate normal distribution.
Consider $\mathbf{X} = (X_1, X_2)^T \sim N_2(\mu, \Sigma)$. Then $X_1$ and $X_2$ are independent if and only if $\operatorname{Cov}(X_1, X_2) = 0$.
Clearly if $X_1$ and $X_2$ are independent then $\operatorname{Cov}(X_1, X_2) = 0$ by well-established results in statistics. It remains to show the other direction of the implication.
Assume $\operatorname{Cov}(X_1, X_2) = 0$. Then $\Sigma$ is diagonal:
$$\Sigma = \begin{pmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{pmatrix}.$$
Now calculate that the joint PDF factorises:
$$f_{X_1, X_2}(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2} \exp\left(-\frac{(x_1-\mu_1)^2}{2\sigma_1^2} - \frac{(x_2-\mu_2)^2}{2\sigma_2^2}\right) = f_{X_1}(x_1)\, f_{X_2}(x_2).$$
Therefore $X_1$ and $X_2$ are independent.
In practice, one often has observations that are assumed to come from a multivariate normal distribution whose parameters are not specified. The expectation vector $\mu$ and the variance-covariance matrix $\Sigma$ can be approximated using the usual sample statistics: the sample mean $\bar{\mathbf{X}}$ and the sample variance matrix $S$.
Consider an independent and identically distributed random sample $\mathbf{X}_1, \dots, \mathbf{X}_m$ from $N_n(\mu, \Sigma)$. Show that $\bar{\mathbf{X}} \sim N_n(\mu, \frac{1}{m}\Sigma)$.
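One possible approach, sketched here on the assumption that the linear-transformation result of Theorem 2.3.1 below may be used: stack the sample into a single $nm$-dimensional normal vector and apply the linear map that averages the blocks,
$$\bar{\mathbf{X}} = \frac{1}{m}\sum_{i=1}^m \mathbf{X}_i = B \begin{pmatrix} \mathbf{X}_1 \\ \vdots \\ \mathbf{X}_m \end{pmatrix}, \qquad B = \frac{1}{m}\begin{pmatrix} I_n & \cdots & I_n \end{pmatrix},$$
so that
$$\bar{\mathbf{X}} \sim N_n\!\left(B \begin{pmatrix} \mu \\ \vdots \\ \mu \end{pmatrix},\; B \operatorname{diag}(\Sigma, \dots, \Sigma)\, B^T\right) = N_n\!\left(\mu, \tfrac{1}{m}\Sigma\right).$$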
If $\mathbf{X}_1, \dots, \mathbf{X}_m$ is an independent and identically distributed random sample from $N_n(\mu, \Sigma)$, then the sample mean $\bar{\mathbf{X}}$ and sample variance matrix $S$ are independent.
Since $\bar{\mathbf{X}}$ is a linear transformation of the sample, from Exercise 2.1 we know $\bar{\mathbf{X}} \sim N_n(\mu, \frac{1}{m}\Sigma)$. Set $\mathbf{Y}_i = \mathbf{X}_i - \bar{\mathbf{X}}$ for each $1 \le i \le m$. Then
$$\operatorname{Cov}(\bar{\mathbf{X}}, \mathbf{Y}_i) = \operatorname{Cov}(\bar{\mathbf{X}}, \mathbf{X}_i) - \operatorname{Var}(\bar{\mathbf{X}}) = \tfrac{1}{m}\Sigma - \tfrac{1}{m}\Sigma = 0.$$
Since $\bar{\mathbf{X}}$ and the $\mathbf{Y}_i$ are jointly normal with zero covariance, $\bar{\mathbf{X}}$ and $(\mathbf{Y}_1, \dots, \mathbf{Y}_m)$ are independent. Since $S$ is a function of $\mathbf{Y}_1, \dots, \mathbf{Y}_m$ only, it follows that $\bar{\mathbf{X}}$ and $S$ are independent.
2.2 Two-Dimensional Normal Distribution
In this section we study the $2$-dimensional normal distribution, often referred to as the bivariate normal distribution.
When $n = 2$, we can explicitly label the elements of $\mu$ and $\Sigma$ as
$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}$$
respectively. Here $\sigma_1$ is the standard deviation of $X_1$, $\sigma_2$ is the standard deviation of $X_2$, and $\rho$ is the correlation coefficient $\operatorname{Corr}(X_1, X_2)$.
Why are the upper right and lower left entries of $\Sigma$ equal to $\rho\sigma_1\sigma_2$?
Show that $\Sigma$ is non-singular if and only if $\sigma_1\sigma_2 \ne 0$ and $\rho \ne \pm 1$.
Substitution of these values into Definition 2.1.1 gives
$$f_{X_1, X_2}(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left(-\frac{1}{2(1-\rho^2)}\left[\frac{(x_1-\mu_1)^2}{\sigma_1^2} - \frac{2\rho(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1\sigma_2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2}\right]\right).$$
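As a sanity check, the explicit bivariate formula can be compared numerically against the matrix form of Definition 2.1.1 using only base R; the evaluation point and parameter values below are arbitrary.

```r
# Explicit bivariate normal PDF in terms of mu1, mu2, sigma1, sigma2, rho
biv_pdf <- function(x1, x2, mu1, mu2, s1, s2, rho) {
  q <- ((x1 - mu1)^2 / s1^2
        - 2 * rho * (x1 - mu1) * (x2 - mu2) / (s1 * s2)
        + (x2 - mu2)^2 / s2^2) / (1 - rho^2)
  exp(-q / 2) / (2 * pi * s1 * s2 * sqrt(1 - rho^2))
}

# Matrix form of the PDF from Definition 2.1.1 (n = 2)
matrix_pdf <- function(x, mu, Sigma) {
  d <- x - mu
  as.numeric(exp(-0.5 * t(d) %*% solve(Sigma) %*% d) /
             (2 * pi * sqrt(det(Sigma))))
}

# sigma1 = 1, sigma2 = 2, rho = 0.5
Sigma <- matrix(c(1, 1, 1, 4), nrow = 2)
all.equal(biv_pdf(0.3, -0.2, 0, 0, 1, 2, 0.5),
          matrix_pdf(c(0.3, -0.2), c(0, 0), Sigma))
```

The two functions agree at every point, which confirms the substitution.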
The following code creates a contour plot of the PDF above with $\mu = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ and $\Sigma = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$.
library(mnormt)
# evaluate the bivariate normal PDF on a grid
x <- seq(-3, 3, 0.1)
y <- seq(-3, 3, 0.1)
mu <- c(0, 0)
sigma <- matrix(c(2, -1, -1, 2), nrow=2)
f <- function(x, y) dmnorm(cbind(x, y), mu, sigma)
z <- outer(x, y, f)
#create contour plot
contour(x, y, z)
A sample of size $200$ can be simulated from a bivariate normal distribution using the following R code.
library(mixtools)
# Set up means, standard deviations, correlation
mu1 <- 1; mu2 <- 2
sig1 <- 1; sig2 <- 1; rho <- 0
# Construct mean vector and (co)variance matrix
mu <- c(mu1, mu2)
Sigma <- matrix(c(sig1^2, rho*sig1*sig2, rho*sig1*sig2, sig2^2), ncol=2)
# Simulate and plot all realisations
X <- rmvnorm(n=200, mu, Sigma)
plot(X)
In the above code, setting mu as a vector of length $n$ and Sigma as an $n \times n$ matrix before inputting them into rmvnorm allows one to simulate an $n$-dimensional normal distribution.
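Higher-dimensional samples can also be drawn with base R alone: the Cholesky factor of Sigma plays the role of the matrix $A$ in the construction $\mathbf{X} = A\mathbf{Z} + \mu$. The parameter values below are illustrative.

```r
# Simulate 500 draws from a 3-dimensional normal using base R only
set.seed(42)
mu    <- c(0, 1, 2)
Sigma <- matrix(c(1, 0.3, 0,
                  0.3, 2, 0.5,
                  0, 0.5, 3), nrow = 3)
R <- chol(Sigma)                       # upper triangular, Sigma = t(R) %*% R
Z <- matrix(rnorm(3 * 500), ncol = 3)  # rows of independent N(0,1) draws
X <- Z %*% R + matrix(mu, 500, 3, byrow = TRUE)  # each row ~ N_3(mu, Sigma)
dim(X)
```

Each row of X is one realisation; the sample mean of the rows should be close to mu and the sample covariance close to Sigma.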
2.3 Properties of Multivariate Normal Distribution
The following theorem states that a linear transformation of a multivariate normal random vector is itself multivariate normally distributed.
Let $\mathbf{X} \sim N_n(\mu, \Sigma)$ and consider some $m \times n$ matrix $B$ of full rank $m \le n$. Define a new random vector $\mathbf{Y} = B\mathbf{X}$. Then $\mathbf{Y} \sim N_m(B\mu, B\Sigma B^T)$;
As a corollary of this theorem, we can easily calculate the marginal distributions of a multivariate normal distribution.
The marginal distribution of each component $X_i$ is a one-dimensional normal distribution, namely $X_i \sim N(\mu_i, \Sigma_{ii})$.
This is a direct consequence of Theorem 2.3.1 by setting $B = (0, \dots, 0, 1, 0, \dots, 0)$, where the $1$ is in the $i^{\text{th}}$ position.
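For example, taking $i = 1$ in two dimensions with the parametrisation of Section 2.2, $B = (1, 0)$ gives
$$X_1 = B\mathbf{X} \sim N_1\big(B\mu,\, B\Sigma B^T\big) = N(\mu_1, \sigma_1^2).$$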
Suppose , where Find the distribution of .
Suppose , where Determine the constant such that and are independent.
Writing in the form requires
By the properties of a multivariate normal distribution,
Calculate and
By Proposition 2.1.2, the independence of the two components is equivalent to their covariance being equal to zero. Therefore .
2.4 Inference Testing
Let $\mathbf{X}_1, \dots, \mathbf{X}_m$ be a random sample from $N_n(\mu, \Sigma)$ where $\mu$ is unknown, but the covariance matrix $\Sigma$ is known. Suppose one wished to do hypothesis testing for the mean of the population from the sample. In particular, how can one conduct the hypothesis test
$$H_0: \mu = \mathbf{a} \quad \text{against} \quad H_1: \mu \ne \mathbf{a},$$
where $\mathbf{a}$ is some specified fixed vector, to infer on the population mean $\mu$?
If $\mathbf{X} \sim N_n(\mu, \Sigma)$ and $\Sigma$ is positive definite, then
$$(\mathbf{X} - \mu)^T \Sigma^{-1} (\mathbf{X} - \mu) \sim \chi^2_n.$$
Define $\mathbf{Y} = \Sigma^{-1/2}(\mathbf{X} - \mu)$. Then by Theorem 2.3.1, $\mathbf{Y} \sim N_n(\mathbf{0}, I_n)$, and so by Corollary 2.3.2 the components of $\mathbf{Y}$ have univariate normal distributions with mean $0$ and variance $1$; moreover, since the variance-covariance matrix $I_n$ is diagonal, they are independent.
Then
$$(\mathbf{X} - \mu)^T \Sigma^{-1} (\mathbf{X} - \mu) = \mathbf{Y}^T \mathbf{Y} = \sum_{i=1}^n Y_i^2.$$
Since each $Y_i \sim N(0,1)$ and the $Y_i$ are pairwise independent, it follows by the definition of the chi-squared distribution that
$$\sum_{i=1}^n Y_i^2 \sim \chi^2_n.$$
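The $n = 1$ case of this proposition can be verified numerically in R: if $Y \sim N(0,1)$ then $Y^2 \sim \chi^2_1$, which is equivalent to the identity $P(\chi^2_1 \le q) = 2\Phi(\sqrt{q}) - 1$ relating the two distribution functions.

```r
# P(Y^2 <= q) = P(-sqrt(q) <= Y <= sqrt(q)) = 2*pnorm(sqrt(q)) - 1
q <- 2.3
all.equal(pchisq(q, df = 1), 2 * pnorm(sqrt(q)) - 1)
```

The identity holds exactly, for any choice of $q > 0$.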
By Exercise 2.1, $\bar{\mathbf{X}} \sim N_n(\mu, \frac{1}{m}\Sigma)$. Define the test statistic
$$\zeta^2 = m(\bar{\mathbf{X}} - \mathbf{a})^T \Sigma^{-1} (\bar{\mathbf{X}} - \mathbf{a}).$$
Assuming the null hypothesis $\mu = \mathbf{a}$ is true, the test statistic can be written
$$\zeta^2 = (\bar{\mathbf{X}} - \mu)^T \left(\tfrac{1}{m}\Sigma\right)^{-1} (\bar{\mathbf{X}} - \mu),$$
and so by Proposition 2.4.1, the test statistic follows a $\chi^2_n$ distribution. Therefore at significance level $\alpha$, one should reject the null hypothesis if $\zeta^2 > \chi^2_{n,\alpha}$, where $\chi^2_{n,\alpha}$ is the upper $\alpha$ quantile of the $\chi^2_n$ distribution. The $p$-value is given by $P(\chi^2_n > \zeta^2)$. Furthermore, the $100(1-\alpha)\%$ confidence region for $\mu$ is
$$\left\{\mathbf{v} \in \mathbb{R}^n : m(\bar{\mathbf{X}} - \mathbf{v})^T \Sigma^{-1} (\bar{\mathbf{X}} - \mathbf{v}) \le \chi^2_{n,\alpha}\right\},$$
which will be an ellipsoid.
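The procedure above can be packaged as a small base-R function. mvn_mean_test is a hypothetical helper name, and solve() is used here for the matrix inverse; nothing outside base R is assumed.

```r
# Chi-squared test for H0: mu = a when Sigma is known, given the sample
# mean xbar of m observations from N_n(mu, Sigma)
mvn_mean_test <- function(xbar, a, Sigma, m, alpha = 0.05) {
  d     <- xbar - a
  zeta2 <- as.numeric(m * t(d) %*% solve(Sigma) %*% d)
  n     <- length(a)
  crit  <- qchisq(1 - alpha, df = n)               # upper alpha quantile
  list(statistic = zeta2,
       critical  = crit,
       p_value   = pchisq(zeta2, df = n, lower.tail = FALSE),
       reject    = zeta2 > crit)
}

# If the sample mean equals the hypothesised mean, the statistic is 0
mvn_mean_test(c(0, 0), c(0, 0), diag(2), m = 100)$reject
```

Note that the function only needs the summary statistic xbar, not the full sample, since $\Sigma$ is assumed known.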
Download the file weight-height_data.csv from the Moodle page. The data details the weights and heights of $5000$ adult males in pounds and inches respectively. Assume that the data follows a $N_2(\mu, \Sigma)$ distribution where
$$\Sigma = \begin{pmatrix} 8.2 & 48.9 \\ 48.9 & 391.3 \end{pmatrix}.$$
A recent article in a men's health magazine claims that men have an average height of $68.97$ inches and an average weight of $186.93$ pounds. Use R to conduct a hypothesis test to determine whether this claim is accurate.
Begin by loading the data into R:
df <- read.csv("weight-height_data.csv")
head(df)
## Height Weight
## 1 73.84702 241.8936
## 2 68.78190 162.3105
## 3 74.11011 212.7409
## 4 71.73098 220.0425
## 5 69.88180 206.3498
## 6 67.25302 152.2122
The entirety of the data can be visualised using the plot() function. Note that the data follows an ellipse shape, typical of a multivariate normal distribution.
plot(df[,1],df[,2],
main="Scatter Plot of Height and Weight of Males",
xlab="Height",
ylab="Weight",
cex=0.3,
pch=19
)
The question is equivalent to conducting the hypothesis test $H_0: \mu = (68.97, 186.93)^T$ against $H_1: \mu \ne (68.97, 186.93)^T$ at the $5\%$ significance level. By the above theory, one should calculate the test statistic $\zeta^2$. This can be completed by the following code. Note that the sample mean is calculated using the base R mean() function and stored in a variable sample_mean, while the matrix inverse is computed using the inv() function from the matlib library. The covariance matrix given in the question is stored in a variable cov_mat, and the proposed mean is stored in a variable a.
library(matlib)
sample_mean = c(mean(df[,1]), mean(df[,2]))
a = c(68.97,186.93)
cov_mat = matrix(c(8.2, 48.9, 48.9, 391.3), nrow=2, ncol=2)
zeta2 = 5000 * t(sample_mean - a) %*% inv(cov_mat) %*% (sample_mean - a)
zeta2
## [,1]
## [1,] 4.956178
The test statistic should finally be compared to the critical value $\chi^2_{2,0.05}$, that is, the value such that $P(\chi^2_2 > \chi^2_{2,0.05}) = 0.05$. This can be computed using the built-in quantile function for the chi-squared distribution in R called qchisq().
zeta2 < qchisq(0.95, df=2)
## [,1]
## [1,] TRUE
The output indicates that the test statistic is within the acceptance region, and we do not reject the null hypothesis. Indeed, the $p$-value of the observed value of the test statistic can be calculated using the distribution function of the chi-squared distribution. In R, this is given by pchisq().
pchisq(zeta2, df=2, lower.tail=FALSE)
## [,1]
## [1,] 0.08390342
One might think that conducting $n$ separate hypothesis tests, one z-test for the mean of each of the univariate normal distributions governing each variable, would be equivalent to the above hypothesis test. However, this is not the case. Since the correlations between the variables are not taken into consideration, some of the separate univariate z-tests may indicate a rejection of the null hypothesis while the multivariate test described above does not, and vice versa.
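The discrepancy can be illustrated with a small numerical sketch; the sample mean, correlation, and sample size below are hypothetical, chosen to make the point. Each univariate z statistic is well inside the acceptance region, yet the multivariate statistic is far beyond its critical value.

```r
# Hypothetical summary statistics for m observations of two strongly
# correlated variables
m     <- 100
xbar  <- c(0.15, -0.15)                 # observed sample mean
a     <- c(0, 0)                        # hypothesised mean
Sigma <- matrix(c(1, 0.95, 0.95, 1), nrow = 2)

# Separate univariate z statistics: both well below 1.96 in absolute value
z <- sqrt(m) * (xbar - a) / sqrt(diag(Sigma))

# Multivariate statistic: far beyond the critical value qchisq(0.95, 2)
zeta2 <- as.numeric(m * t(xbar - a) %*% solve(Sigma) %*% (xbar - a))
```

Here neither univariate z-test rejects at the $5\%$ level, but the multivariate test rejects decisively, because a mean shift in opposite directions for two strongly positively correlated variables is very unlikely under the null hypothesis.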