30 Joint Normal Distributions

Jointly continuous random variables $X$ and $Y$ have a Bivariate Normal distribution with parameters $μ_{X}$ , $μ_{Y}$ , $σ_{X} > 0$ , $σ_{Y} > 0$ , and $- 1 < ρ < 1$ if the joint pdf is $\begin{aligned} f_{X, Y} (x, y) & = \frac{1}{2 π σ_{X} σ_{Y} \sqrt{1 - ρ^{2}}} \exp \left(- \frac{1}{2 (1 - ρ^{2})} \left[{\left(\frac{x - μ_{X}}{σ_{X}}\right)}^{2} + {\left(\frac{y - μ_{Y}}{σ_{Y}}\right)}^{2} - 2 ρ \left(\frac{x - μ_{X}}{σ_{X}}\right) \left(\frac{y - μ_{Y}}{σ_{Y}}\right)\right]\right), \quad - \infty < x < \infty, \; - \infty < y < \infty \end{aligned}$
It can be shown that if the pair $(X, Y)$ has a BivariateNormal( $μ_{X}$ , $μ_{Y}$ , $σ_{X}$ , $σ_{Y}$ , $ρ$ ) distribution, then $\begin{aligned} E (X) & = μ_{X} \\ E (Y) & = μ_{Y} \\ SD (X) & = σ_{X} \\ SD (Y) & = σ_{Y} \\ Corr (X, Y) & = ρ \end{aligned}$
A Bivariate Normal density has elliptical contours. For each height $c$ with $0 < c < \frac{1}{2 π σ_{X} σ_{Y} \sqrt{1 - ρ^{2}}}$ (the peak of the density), the set $\{(x, y) : f_{X, Y} (x, y) = c\}$ is an ellipse centered at $(μ_{X}, μ_{Y})$ . The density decreases as $(x, y)$ moves away from $(μ_{X}, μ_{Y})$ , most steeply along the minor axis of the ellipse, and least steeply along the major axis of the ellipse.
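
To make the formula and the contour description concrete, here is a minimal sketch that evaluates the joint pdf directly. The function name `bivariate_normal_pdf` and the parameter values (standard units with $ρ = 0.5$) are illustrative assumptions, not part of the text. With equal SDs and $ρ > 0$ the major axis lies along the line $y = x$ and the minor axis along $y = -x$, so two points at the same distance from the center can have quite different densities.

```python
import math


def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Joint pdf of a BivariateNormal(mu_x, mu_y, sigma_x, sigma_y, rho) pair at (x, y)."""
    if sigma_x <= 0 or sigma_y <= 0 or not -1 < rho < 1:
        raise ValueError("need sigma_x > 0, sigma_y > 0, and -1 < rho < 1")
    zx = (x - mu_x) / sigma_x  # x in standard units
    zy = (y - mu_y) / sigma_y  # y in standard units
    quad = zx**2 + zy**2 - 2 * rho * zx * zy  # bracketed term in the exponent
    norm_const = 2 * math.pi * sigma_x * sigma_y * math.sqrt(1 - rho**2)
    return math.exp(-quad / (2 * (1 - rho**2))) / norm_const


# Hypothetical parameters: means 0, SDs 1, rho = 0.5.
params = (0, 0, 1, 1, 0.5)
peak = bivariate_normal_pdf(0, 0, *params)          # density at the point of means
along_major = bivariate_normal_pdf(1, 1, *params)   # step along y = x
along_minor = bivariate_normal_pdf(1, -1, *params)  # same distance, along y = -x

print(peak, along_major, along_minor)
```

The two off-center points are equally far from $(0, 0)$, but the density drops faster in the minor-axis direction, which is what the elliptical contours encode.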
A scatterplot of $(x, y)$ pairs generated from a Bivariate Normal distribution will have a roughly linear association (when $ρ \neq 0$ ), and the cloud of points will resemble an ellipse.
If $X$ and $Y$ have a Bivariate Normal distribution, then the marginal distributions are also Normal: $X$ has a Normal $(μ_{X}, σ_{X})$ distribution and $Y$ has a Normal $(μ_{Y}, σ_{Y})$ distribution.
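
The means, SDs, and correlation above can be checked by simulation. The sketch below uses one standard construction, which is an assumption here rather than something the text prescribes: start from two independent Normal(0, 1) values $Z_1, Z_2$, set $X = μ_{X} + σ_{X} Z_1$ and $Y = μ_{Y} + σ_{Y} (ρ Z_1 + \sqrt{1 - ρ^{2}} Z_2)$. The function names, the seed, and the parameter values are all hypothetical.

```python
import math
import random
import statistics


def simulate_bivariate_normal(mu_x, mu_y, sigma_x, sigma_y, rho, rng):
    """Build one (X, Y) pair from two independent Normal(0, 1) draws."""
    z1 = rng.gauss(0, 1)
    z2 = rng.gauss(0, 1)
    x = mu_x + sigma_x * z1
    y = mu_y + sigma_y * (rho * z1 + math.sqrt(1 - rho**2) * z2)
    return x, y


def sample_corr(xs, ys):
    """Sample correlation coefficient of two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(12345)  # fixed seed so the run is reproducible
pairs = [simulate_bivariate_normal(10, 20, 2, 3, -0.6, rng) for _ in range(50_000)]
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]

print("mean, SD of X:", statistics.mean(xs), statistics.stdev(xs))  # near 10 and 2
print("mean, SD of Y:", statistics.mean(ys), statistics.stdev(ys))  # near 20 and 3
print("correlation:  ", sample_corr(xs, ys))                        # near -0.6
```

The sample summaries should land close to the parameters, up to simulation error.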
If $X$ and $Y$ have a Bivariate Normal distribution and $Corr (X, Y) = 0$ then $X$ and $Y$ are independent. (Remember, in general it is possible to have situations where the correlation is 0 but the random variables are not independent.)
It can also be shown that if $X$ and $Y$ have a Bivariate Normal distribution then any conditional distribution is Normal. The conditional distribution of $Y$ given $X = x$ is

$N (μ_{Y} + \frac{ρ σ_{Y}}{σ_{X}} (x - μ_{X}), σ_{Y} \sqrt{1 - ρ^{2}})$

The conditional expected value of $Y$ given $X = x$ is a linear function of $x$ , called the regression line of $Y$ on $X$ : $E (Y | X = x) = μ_{Y} + ρ σ_{Y} (\frac{x - μ_{X}}{σ_{X}})$
- The regression line passes through the point of means $(μ_{X}, μ_{Y})$ and has slope $\frac{ρ σ_{Y}}{σ_{X}}$
- The regression line estimates that if the given $x$ value is $z$ SDs above the mean of $X$ , then the corresponding $Y$ values will be, on average, $ρ z$ SDs above the mean of $Y$ (a negative value means below the mean): $\frac{E (Y | X = x) - μ_{Y}}{σ_{Y}} = ρ (\frac{x - μ_{X}}{σ_{X}})$
- Since $| ρ | < 1$ , for a given $x$ value the corresponding $Y$ values will be, on average, relatively closer to the mean of $Y$ (in SD units) than the given $x$ value is to the mean of $X$ . This is known as regression to the mean.
For Bivariate Normal distributions, the conditional SD (and hence the conditional variance) of $Y$ given $X = x$ does not depend on $x$ : $SD (Y | X = x) = σ_{Y} \sqrt{1 - ρ^{2}}$
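
The regression line and the constant conditional SD can be sketched numerically as follows. The helper name `conditional_y_given_x` and the parameter values are hypothetical, chosen only to illustrate the formulas above.

```python
import math


def conditional_y_given_x(x, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Mean and SD of the Normal conditional distribution of Y given X = x."""
    cond_mean = mu_y + rho * sigma_y * (x - mu_x) / sigma_x  # regression line
    cond_sd = sigma_y * math.sqrt(1 - rho**2)                # does not involve x
    return cond_mean, cond_sd


# Hypothetical parameters for illustration only.
mu_x, mu_y, sigma_x, sigma_y, rho = 50, 100, 10, 20, 0.5

results = {}
for z in (-2, 0, 1, 2):
    x = mu_x + z * sigma_x  # x value that is z SDs from the mean of X
    cond_mean, cond_sd = conditional_y_given_x(x, mu_x, mu_y, sigma_x, sigma_y, rho)
    z_y = (cond_mean - mu_y) / sigma_y  # conditional mean in SD units of Y
    results[z] = (cond_mean, cond_sd, z_y)
    print(f"x is {z:+d} SDs from mean: E(Y|X=x) = {cond_mean}, "
          f"which is {z_y:+.2f} SDs from the mean of Y; SD(Y|X=x) = {cond_sd:.3f}")
```

In each row the conditional mean sits $ρ z$ SDs from the mean of $Y$, which is regression to the mean, while the conditional SD is the same for every $x$.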
$X$ and $Y$ have a Bivariate Normal distribution if and only if every linear combination of $X$ and $Y$ has a Normal distribution. That is, $X$ and $Y$ have a Bivariate Normal distribution if and only if $a X + b Y + c$ has a Normal distribution for all $a$ , $b$ , $c$ .
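
As a worked formula, the parameters of that Normal distribution follow from linearity of expected value and the variance of a sum. This is a standard fact that is not derived in this section, and it uses $Cov (X, Y) = ρ σ_{X} σ_{Y}$ :

$\begin{aligned} E (a X + b Y + c) & = a μ_{X} + b μ_{Y} + c \\ Var (a X + b Y + c) & = a^{2} σ_{X}^{2} + b^{2} σ_{Y}^{2} + 2 a b ρ σ_{X} σ_{Y} \end{aligned}$

For instance, taking $a = b = 1$ and $c = 0$ , the sum $X + Y$ has mean $μ_{X} + μ_{Y}$ and variance $σ_{X}^{2} + σ_{Y}^{2} + 2 ρ σ_{X} σ_{Y}$ ; taking $a = 1$ , $b = - 1$ gives the difference $X - Y$ , whose variance has $- 2 ρ σ_{X} σ_{Y}$ in place of the last term.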

Example 30.1

Suppose that SAT Math ( $M$ ) and Reading ( $R$ ) scores of CalPoly students have a Bivariate Normal distribution. Math scores have mean 640 and SD 80, Reading scores have mean 610 and SD 70, and the correlation between scores is 0.7.

- Find the probability that a student has a Math score above 700.
- Find the probability that a student has a total score above 1500.
- Compute and interpret $E (M | R = 700)$ .
- Find the probability that a student has a higher Math than Reading score if the student scores 700 on Reading.
- Describe how you could use a Normal(0, 1) spinner to simulate an $(M, R)$ pair.
- Find the probability that a student has a higher Math than Reading score.
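
One possible way to check the numerical parts of this example is sketched below, using `statistics.NormalDist` from the Python standard library. The variable names are illustrative. The approach relies on the facts stated earlier: marginal and conditional distributions are Normal, and linear combinations such as $M + R$ and $M - R$ are Normal, with variances $σ_{M}^{2} + σ_{R}^{2} \pm 2 ρ σ_{M} σ_{R}$ (a standard variance-of-a-sum fact, assumed here).

```python
import math
from statistics import NormalDist

# Parameters given in the example.
mu_m, sd_m = 640, 80   # Math
mu_r, sd_r = 610, 70   # Reading
rho = 0.7
cov = rho * sd_m * sd_r  # Cov(M, R) = rho * SD(M) * SD(R)

# Marginal distribution of M is Normal(640, 80).
p_math_above_700 = 1 - NormalDist(mu_m, sd_m).cdf(700)

# Total score M + R is Normal; variances add, plus twice the covariance.
total = NormalDist(mu_m + mu_r, math.sqrt(sd_m**2 + sd_r**2 + 2 * cov))
p_total_above_1500 = 1 - total.cdf(1500)

# Conditional distribution of M given R = 700 (regression of M on R).
cond_mean = mu_m + rho * sd_m * (700 - mu_r) / sd_r
cond_sd = sd_m * math.sqrt(1 - rho**2)
p_m_gt_r_given_r700 = 1 - NormalDist(cond_mean, cond_sd).cdf(700)

# Difference M - R is Normal; variances add, minus twice the covariance.
diff = NormalDist(mu_m - mu_r, math.sqrt(sd_m**2 + sd_r**2 - 2 * cov))
p_m_gt_r = 1 - diff.cdf(0)

print("P(M > 700)           =", round(p_math_above_700, 4))
print("P(M + R > 1500)      =", round(p_total_above_1500, 4))
print("E(M | R = 700)       =", round(cond_mean, 2))
print("P(M > R | R = 700)   =", round(p_m_gt_r_given_r700, 4))
print("P(M > R)             =", round(p_m_gt_r, 4))
```

Note that a Reading score of 700 is about 1.29 SDs above the Reading mean, while the conditional mean Math score is only $0.7 \times 1.29 \approx 0.9$ SDs above the Math mean, consistent with regression to the mean.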

Example 30.2

Let $X$ and $I$ be independent, $X$ has a Normal(0,1) distribution, and $I$ takes values 1 or $- 1$ with probability $1 / 2$ each. Let $Y = I X$ .

- How could you use spinners to simulate an $(X, Y)$ pair?
- Identify the distribution of $Y$ .
- Sketch a scatterplot of simulated $(X, Y)$ values.
- Are $X$ and $Y$ independent? (Careful, it is not enough to say “no, because $Y$ is a function of $X$ ”. You can check that $Y$ and $I$ are independent even though $Y$ is a function of $I$ .)
- Find $Cov (X, Y)$ and $Corr (X, Y)$ .
- Is the distribution of $X + Y$ Normal? (Hint: find $P (X + Y = 0)$ .)
- Does the pair $(X, Y)$ have a Bivariate Normal distribution?
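
A small simulation can help explore this example. The sketch below is one way to do it, with `random.gauss` playing the role of the Normal(0, 1) spinner and `random.choice` playing the role of a fair two-sector spinner for $I$ ; the seed and sample size are arbitrary choices.

```python
import random
import statistics

rng = random.Random(2024)  # fixed seed so the run is reproducible
n = 100_000

xs, ys = [], []
for _ in range(n):
    x = rng.gauss(0, 1)       # X ~ Normal(0, 1)
    i = rng.choice((1, -1))   # I = 1 or -1 with probability 1/2 each
    xs.append(x)
    ys.append(i * x)          # Y = I * X

mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
sample_cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Every simulated point lies on one of the lines y = x or y = -x.
all_on_cross = all(abs(y) == abs(x) for x, y in zip(xs, ys))

# X + Y equals exactly 0 whenever I = -1, so this fraction is far from 0.
frac_sum_zero = sum(1 for x, y in zip(xs, ys) if x + y == 0) / n

print("mean, SD of Y:      ", mean_y, statistics.stdev(ys))
print("sample covariance:  ", sample_cov)
print("all |Y| == |X|:     ", all_on_cross)
print("fraction X + Y == 0:", frac_sum_zero)
```

The sample covariance comes out near 0 even though the points fall on an X shape, and $X + Y$ puts roughly half its probability on the single value 0, which a Normal distribution cannot do.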

If the pair $(X, Y)$ has a joint Normal distribution then each of $X$ and $Y$ has a Normal distribution.
But Example 30.2 shows that the converse is not true. That is, if each of $X$ and $Y$ has a Normal distribution, it is not necessarily true that the pair $(X, Y)$ has a joint Normal distribution.
However, if $X$ and $Y$ are independent and each of $X$ and $Y$ has a Normal distribution, then the pair $(X, Y)$ has a joint Normal distribution.