27 Normal Distributions
- Normal distributions are probably the most important distributions in probability and statistics.
- Any Normal distribution follows the “empirical rule” which determines the percentiles that give a Normal distribution its particular bell shape. For example,
- 38% of values are within 0.5 standard deviations of the mean
- 50% of values are within 0.67 standard deviations of the mean
- 68% of values are within 1 standard deviation of the mean
- 87% of values are within 1.5 standard deviations of the mean
- 95% of values are within 2 standard deviations of the mean
- 99% of values are within 2.6 standard deviations of the mean
- 99.7% of values are within 3 standard deviations of the mean
Percentile | SDs away from the mean |
---|---|
0.1% | 3.09 SDs below the mean |
0.5% | 2.58 SDs below the mean |
1% | 2.33 SDs below the mean |
2.5% | 1.96 SDs below the mean |
10% | 1.28 SDs below the mean |
15.9% | 1 SD below the mean |
25% | 0.67 SDs below the mean |
30.9% | 0.5 SDs below the mean |
50% | 0 SDs above the mean |
69.1% | 0.5 SDs above the mean |
75% | 0.67 SDs above the mean |
84.1% | 1 SD above the mean |
90% | 1.28 SDs above the mean |
97.5% | 1.96 SDs above the mean |
99% | 2.33 SDs above the mean |
99.5% | 2.58 SDs above the mean |
99.9% | 3.09 SDs above the mean |
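The percentages and percentiles above can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `prob_within` is introduced here only for illustration.

```python
from statistics import NormalDist

# Standard Normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

def prob_within(k):
    """Probability that a Normal value falls within k SDs of its mean."""
    return Z.cdf(k) - Z.cdf(-k)

# Central probabilities for several multiples of the SD.
for k in (0.5, 0.67, 1, 1.5, 2, 2.6, 3):
    print(f"within {k} SDs: {100 * prob_within(k):.1f}%")

# Percentiles expressed as a number of SDs from the mean (negative means below).
for p in (0.001, 0.025, 0.25, 0.5, 0.75, 0.975, 0.999):
    print(f"percentile {100 * p:g}%: {Z.inv_cdf(p):+.2f} SDs")
```

Because every Normal distribution is a shifted and rescaled Standard Normal distribution, the same numbers of SDs apply for any mean and standard deviation.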
- A continuous random variable $Z$ has a Standard Normal distribution if its pdf is
  $$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \qquad -\infty < z < \infty$$
- If $Z$ has a Standard Normal distribution then
  $$\text{E}(Z) = 0, \qquad \text{SD}(Z) = 1$$
- The Standard Normal pdf is symmetric about its mean of 0, and the peak of the density occurs at 0.
- The standard deviation is 1, and 1 also indicates the distance from the mean to where the concavity of the density changes. That is, there are inflection points at $\pm 1$.
- A continuous random variable $X$ has a Normal (a.k.a., Gaussian) distribution with mean $\mu$ and standard deviation $\sigma > 0$ if its pdf is
  $$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right), \qquad -\infty < x < \infty$$
- If $X$ has a Normal($\mu$, $\sigma$) distribution then
  $$\text{E}(X) = \mu, \qquad \text{SD}(X) = \sigma$$
- A Normal density is a particular “bell-shaped” curve which is symmetric about its mean $\mu$. The mean $\mu$ is a location parameter: $\mu$ indicates where the center and peak of the distribution is.
- The standard deviation $\sigma$ is a scale parameter: $\sigma$ indicates the distance from the mean to where the concavity of the density changes. That is, there are inflection points at $\mu \pm \sigma$.
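As a rough numerical check of the density and its inflection points, the sketch below codes the usual Normal pdf, with `mu` as the mean and `sigma` as the standard deviation, and compares it with `statistics.NormalDist`. The values 70 and 10 are hypothetical, and `normal_pdf` and `second_difference` are names introduced only for this illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal(mu, sigma) density evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 70.0, 10.0  # hypothetical values, chosen only for illustration
dist = NormalDist(mu, sigma)

# The hand-written formula agrees with the library density.
for x in (50.0, 60.0, 70.0, 80.0, 90.0):
    assert math.isclose(normal_pdf(x, mu, sigma), dist.pdf(x), rel_tol=1e-9)

def second_difference(x, h=0.01):
    """Finite-difference estimate of the second derivative of the density at x."""
    return (normal_pdf(x + h, mu, sigma) - 2 * normal_pdf(x, mu, sigma)
            + normal_pdf(x - h, mu, sigma)) / h**2

# The sign of the second derivative changes at a distance of one SD from the mean.
print("concave down at the mean:", second_difference(mu) < 0)
print("concave down 0.5 SDs above:", second_difference(mu + 0.5 * sigma) < 0)
print("concave up 2 SDs above:", second_difference(mu + 2 * sigma) > 0)
```

The density is concave down within one SD of the mean and concave up farther out, which is why the SD can be read off a plot as the distance from the peak to the point where the curve changes from "cap" to "cup".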
Example 27.1 The pdfs in the plot below represent the distribution of hypothetical test scores in three classes. The test scores in each class follow a Normal distribution. Identify the mean and standard deviation for each class.
Example 27.2 Daily high temperatures (degrees Fahrenheit) in San Luis Obispo in August follow (approximately) a Normal distribution with a mean of 76.9 degrees F. The temperature exceeds 100 degrees Fahrenheit on about 1.5% of August days.
What is the standard deviation?
Suppose the mean increases by 2 degrees Fahrenheit. On what percentage of August days will the daily high temperature exceed 100 degrees Fahrenheit? (Assume the standard deviation does not change.)
A mean of 78.9 is about 1.03 times as large as a mean of 76.9. By what (multiplicative) factor has the percentage of 100-degree days increased? What do you notice?
If the mean is 76.9, what is the 25th percentile of daily high temperatures?
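One way to check hand calculations for Example 27.2 is the sketch below, which uses `statistics.NormalDist` and takes only the numbers given in the example as inputs. The variable names are illustrative, and the printed values are approximations, not a substitute for working through the standardization by hand.

```python
from statistics import NormalDist

mean = 76.9          # mean daily high temperature (degrees F), from the example
threshold = 100.0    # temperature of interest (degrees F)
p_exceed = 0.015     # P(temperature > 100), from the example

# 100 degrees is the 98.5th percentile; convert it to a number of SDs above the mean.
z = NormalDist().inv_cdf(1 - p_exceed)
sd = (threshold - mean) / z

# Shift the mean up by 2 degrees, keep the SD, and recompute the tail probability.
p_new = 1 - NormalDist(mean + 2, sd).cdf(threshold)

# 25th percentile under the original mean.
q25 = NormalDist(mean, sd).inv_cdf(0.25)

print(f"standard deviation: {sd:.2f}")
print(f"P(> 100) after the shift: {p_new:.4f}")
print(f"factor of increase: {p_new / p_exceed:.2f}")
print(f"25th percentile: {q25:.1f}")
```

Comparing the factor of increase in the tail probability with the much smaller relative change in the mean is the point of the third question.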
27.1 Bivariate Normal distributions
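- Jointly continuous random variables $X$ and $Y$ have a Bivariate Normal distribution with parameters $\mu_X$, $\mu_Y$, $\sigma_X > 0$, $\sigma_Y > 0$, and $-1 < \rho < 1$ if the joint pdf is, for $-\infty < x < \infty$, $-\infty < y < \infty$,
  $$f_{X,Y}(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} \exp\left(-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_X}{\sigma_X}\right)^2 + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2 - 2\rho\left(\frac{x-\mu_X}{\sigma_X}\right)\left(\frac{y-\mu_Y}{\sigma_Y}\right)\right]\right)$$
- It can be shown that if the pair $(X, Y)$ has a BivariateNormal($\mu_X$, $\mu_Y$, $\sigma_X$, $\sigma_Y$, $\rho$) distribution then
  $$\text{E}(X) = \mu_X, \quad \text{E}(Y) = \mu_Y, \quad \text{SD}(X) = \sigma_X, \quad \text{SD}(Y) = \sigma_Y, \quad \text{Corr}(X, Y) = \rho$$
- A Bivariate Normal density has elliptical contours. For each height $h > 0$ the set $\{(x, y): f_{X,Y}(x, y) = h\}$ is an ellipse. The density decreases as $(x, y)$ moves away from $(\mu_X, \mu_Y)$, most steeply along the minor axis of the ellipse, and least steeply along the major axis of the ellipse.
- A scatterplot of $(x, y)$ pairs generated from a Bivariate Normal distribution will have a rough linear association and the cloud of points will resemble an ellipse.
- If $X$ and $Y$ have a Bivariate Normal distribution, then the marginal distributions are also Normal: $X$ has a Normal($\mu_X$, $\sigma_X$) distribution and $Y$ has a Normal($\mu_Y$, $\sigma_Y$) distribution.
- If $X$ and $Y$ have a Bivariate Normal distribution and $\text{Corr}(X, Y) = 0$ then $X$ and $Y$ are independent. (Remember, in general it is possible to have situations where the correlation is 0 but the random variables are not independent.)
- It can also be shown that if $X$ and $Y$ have a Bivariate Normal distribution then any conditional distribution is Normal. The conditional distribution of $Y$ given $X = x$ is Normal with mean $\mu_Y + \rho\sigma_Y\left(\frac{x-\mu_X}{\sigma_X}\right)$ and standard deviation $\sigma_Y\sqrt{1-\rho^2}$.
- The conditional expected value of $Y$ given $X = x$ is a linear function of $x$, called the regression line of $Y$ on $X$:
  $$\text{E}(Y \mid X = x) = \mu_Y + \rho\sigma_Y\left(\frac{x-\mu_X}{\sigma_X}\right)$$
  - The regression line passes through the point of means $(\mu_X, \mu_Y)$ and has slope $\rho\sigma_Y/\sigma_X$.
  - The regression line estimates that if the given $x$ value is $z$ SDs above the mean of $X$, then the corresponding $Y$ values will be, on average, $\rho z$ SDs away from the mean of $Y$.
  - Since $|\rho| < 1$, for a given $x$ value the corresponding $Y$ values will be, on average, relatively closer to the mean of $Y$ than the given $x$ value is to the mean of $X$. This is known as regression to the mean.
- For Bivariate Normal distributions, the conditional variance of $Y$ given $X = x$ does not depend on $x$:
  $$\text{Var}(Y \mid X = x) = \sigma_Y^2(1-\rho^2)$$
- $X$ and $Y$ have a Bivariate Normal distribution if and only if every linear combination of $X$ and $Y$ has a Normal distribution. That is, $X$ and $Y$ have a Bivariate Normal distribution if and only if $aX + bY + c$ has a Normal distribution for all $a$, $b$, $c$.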
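The properties above can be explored by simulation. The sketch below builds Bivariate Normal pairs from two independent Normal(0, 1) draws, a standard construction: set x = mu_x + sigma_x * z1 and y = mu_y + sigma_y * (rho * z1 + sqrt(1 - rho^2) * z2), where mu_x, mu_y, sigma_x, sigma_y are the means and SDs and rho is the correlation. The parameter values, the seed, and the function name `simulate_pair` are hypothetical choices made only for illustration.

```python
import math
import random

def simulate_pair(mu_x, mu_y, sigma_x, sigma_y, rho, rng):
    """Build one (x, y) pair from two independent Normal(0, 1) draws."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x = mu_x + sigma_x * z1
    y = mu_y + sigma_y * (rho * z1 + math.sqrt(1 - rho**2) * z2)
    return x, y

# Hypothetical parameter values, chosen only for illustration.
mu_x, mu_y, sigma_x, sigma_y, rho = 50.0, 100.0, 5.0, 20.0, 0.6

rng = random.Random(12345)
pairs = [simulate_pair(mu_x, mu_y, sigma_x, sigma_y, rho, rng) for _ in range(100_000)]

# Sample correlation of the simulated pairs; it should be close to rho.
n = len(pairs)
mean_x = sum(x for x, _ in pairs) / n
mean_y = sum(y for _, y in pairs) / n
sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
syy = sum((y - mean_y) ** 2 for _, y in pairs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
sample_corr = sxy / math.sqrt(sxx * syy)

# Average y among pairs whose x is about 1 SD above the mean of X,
# compared with the regression-line value mu_y + rho * sigma_y * 1.
near = [y for x, y in pairs if abs((x - mu_x) / sigma_x - 1.0) < 0.1]
cond_mean = sum(near) / len(near)
line_value = mu_y + rho * sigma_y * 1.0

print(f"sample correlation: {sample_corr:.3f} (rho = {rho})")
print(f"average y when x is about 1 SD above its mean: {cond_mean:.1f}")
print(f"regression-line value: {line_value:.1f}")
```

With x one SD above its mean, the average y is only about rho SDs above its own mean, which is regression to the mean in a simulated setting.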
Example 27.3 Suppose that SAT Math (
Find the probability that a student has a Math score above 700.
Find the probability that a student has a total score above 1500.
Compute and interpret
.
Compute and interpret
.
Find the probability that a student has a higher Math than Reading score if the student scores 700 on Reading.
Describe how you could use a Normal(0, 1) spinner to simulate an
pair.
Find the probability that a student has a higher Math than Reading score.
Example 27.4 Let
How could you use spinners to simulate an
pair?
Identify the distribution of
.
Sketch a scatterplot of simulated
values.
Are
and independent? (Careful, it is not enough to say “no, because is a function of ”. You can check that and are independent even though is a function of .)
Find
and .
Is the distribution of
Normal? (Hint: find .)
Does the pair
have a Bivariate Normal distribution?
- If the pair $(X, Y)$ has a joint Normal distribution then each of $X$ and $Y$ has a Normal distribution.
- But the example shows that the converse is not true. That is, if each of $X$ and $Y$ has a Normal distribution, it is not necessarily true that the pair $(X, Y)$ has a joint Normal distribution.
- However, if $X$ and $Y$ are independent and each of $X$ and $Y$ has a Normal distribution, then the pair $(X, Y)$ has a joint Normal distribution.