9 Joint Distributions and Correlation
9.1 Joint distributions
- The joint distribution of random variables X and Y is a probability distribution on (X, Y) pairs.
Example 9.1
Roll a fair four-sided die twice. Let X be the sum of the two rolls and Y be the larger of the two rolls.
- Construct a “flat” table displaying the distribution of (X, Y) pairs, with one pair in each row.
- Construct a two-way table displaying the joint distribution of X and Y.
- Sketch a plot depicting the joint distribution of X and Y.
- Starting with the two-way table, how could you obtain the marginal distribution of X? Of Y?
- Starting with the marginal distribution of X and the marginal distribution of Y, could you necessarily construct the two-way table of the joint distribution? Explain.
- The joint distribution of two random variables summarizes the possible pairs of values and their relative likelihoods.
- It is possible to obtain marginal distributions from a joint distribution.
- In general, marginal distributions alone are not enough to determine a joint distribution. (The exception is when random variables are independent.)
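As a rough illustration of these ideas, the following simulation sketch approximates the joint and marginal distributions in Example 9.1 (assuming X is the sum and Y is the larger of the two rolls; NumPy and pandas are used here simply as convenient tools, not as required course software).

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 100_000

# Two rolls of a fair four-sided die
roll1 = rng.integers(1, 5, size=n)
roll2 = rng.integers(1, 5, size=n)

# Assumed definitions (as in Example 9.1): X = sum, Y = larger of the two rolls
x = roll1 + roll2
y = np.maximum(roll1, roll2)

# Approximate joint distribution as a two-way table of relative frequencies
joint = pd.crosstab(x, y, normalize=True).rename_axis(index="X", columns="Y")
print(joint)

# Marginal distributions: collapse the two-way table over the other variable
print(joint.sum(axis=1))  # marginal distribution of X: sum each row over the values of Y
print(joint.sum(axis=0))  # marginal distribution of Y: sum each column over the values of X
```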
Example 9.2
Continuing the dice rolling example, construct a spinner representing the joint distribution of X and Y.
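One way to think about such a spinner is as a single mechanism that selects an (X, Y) pair with its joint probability. Here is a minimal sketch of that idea, again assuming the X and Y from Example 9.1:

```python
import numpy as np
from collections import Counter

# Exact joint pmf of (X, Y) from the 16 equally likely (roll1, roll2) outcomes
# (still assuming X = sum, Y = larger, as in Example 9.1)
pmf = Counter()
for r1 in range(1, 5):
    for r2 in range(1, 5):
        pmf[(r1 + r2, max(r1, r2))] += 1 / 16

pairs = list(pmf.keys())
probs = list(pmf.values())

# One "spinner" for (X, Y) pairs: each spin selects a pair with its joint probability
rng = np.random.default_rng(1)
idx = rng.choice(len(pairs), size=5, p=probs)
print([pairs[i] for i in idx])
```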
Example 9.3
Donny Don’t says: “Now I see why we need the spinner from the previous example to simulate an (X, Y) pair. But if we just want to simulate values of X by itself, we would need to construct a separate spinner for the marginal distribution of X.” Do you agree? Explain your reasoning.
- The joint distribution of two continuous random variables can be described by a probability density function, for which volumes under the surface determine probabilities. The “density” height is whatever it needs to be so that volumes under the surface represent appropriate probabilities.
- Marginal distributions can be obtained from a joint distribution by “stacking”/“collapsing”/“aggregating” out the other variable.
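To make “volumes under the surface determine probabilities” and “collapsing out the other variable” concrete, here is a small numerical sketch. It uses an independent standard bivariate Normal density purely as a convenient example, and the grid-based Riemann sums are only rough approximations.

```python
import numpy as np

# A joint density for two continuous random variables; an independent standard
# bivariate Normal density is used here purely as a convenient example
def f(x, y):
    return np.exp(-(x**2 + y**2) / 2) / (2 * np.pi)

# Grid for a rough Riemann-sum approximation
xs = np.linspace(-6, 6, 1201)
ys = np.linspace(-6, 6, 1201)
dx, dy = xs[1] - xs[0], ys[1] - ys[0]
xg, yg = np.meshgrid(xs, ys, indexing="ij")
z = f(xg, yg)

# Probabilities are volumes under the density surface
in_region = (xg > 0) & (xg < 1) & (yg > 0) & (yg < 1)
print("P(0 < X < 1, 0 < Y < 1) is approximately", z[in_region].sum() * dx * dy)

# Marginal density of X: "collapse" (integrate) the joint density over y
marginal_x = z.sum(axis=1) * dy
print("marginal density of X at 0 is approximately", marginal_x[np.argmin(np.abs(xs))])
```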
9.2 Correlation
- Quantities like long run average, variance, and standard deviation summarize characteristics of the marginal distribution of a single random variable.
- Covariance and correlation summarize in a single number a characteristic of the joint distribution of two random variables, namely, the degree to which they “co-deviate from their respective means”.
- The covariance of random variables X and Y is defined as the long run average of the product of the paired deviations from the respective means: Cov(X, Y) = E[(X - E(X))(Y - E(Y))]. (See the simulation sketch after this list.)
- A positive covariance indicates an overall positive association: above average values of X tend to be associated with above average values of Y.
- A negative covariance indicates an overall negative association: above average values of X tend to be associated with below average values of Y.
- A covariance of zero indicates that the random variables are uncorrelated: there is no overall positive or negative association. But be careful: if X and Y are uncorrelated, there can still be a relationship between X and Y. We will see examples later that demonstrate that being uncorrelated does not necessarily imply that random variables are independent.
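Here is the simulation sketch referred to above: it approximates the covariance directly from its definition as a long run average of products of paired deviations, again assuming the X and Y from Example 9.1.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# Two rolls of a fair four-sided die; X = sum, Y = larger (assumed, as in Example 9.1)
roll1 = rng.integers(1, 5, size=n)
roll2 = rng.integers(1, 5, size=n)
x = roll1 + roll2
y = np.maximum(roll1, roll2)

# Covariance as the long run average of the product of paired deviations
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
print(cov_xy)                      # direct computation from the definition
print(np.cov(x, y, ddof=0)[0, 1])  # NumPy's built-in covariance, for comparison
```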
Example 9.4
Consider the probability space corresponding to two rolls of a fair four-sided die. Let X be the sum of the two rolls and Y be the larger of the two rolls. Compute and interpret the covariance of X and Y.
- The numerical value of the covariance depends on the measurement units of both variables, so interpreting it can be difficult.
- Covariance is a measure of joint association between two random variables that has many nice theoretical properties, but the correlation (coefficient) is often a more practical measure.
- The correlation for two random variables is the covariance between the corresponding standardized random variables.
- When standardizing, subtracting the means doesn’t change the scale of the possible pairs of values; it merely shifts the center of the joint distribution. Therefore, correlation is the covariance divided by the product of the standard deviations: Corr(X, Y) = Cov(X, Y) / (SD(X) SD(Y)). (See the simulation sketch after this list.)
- A correlation coefficient has no units and is measured on a universal scale. Regardless of the original measurement units of the random variables X and Y, -1 ≤ Corr(X, Y) ≤ 1. Corr(X, Y) = 1 if and only if Y = aX + b for some a > 0; Corr(X, Y) = -1 if and only if Y = aX + b for some a < 0.
- Therefore, correlation is a standardized measure of the strength of the linear association between two random variables.
- The closer the correlation is to 1 or -1, the closer the joint distribution of (X, Y) pairs hugs a straight line, with positive or negative slope.
- Because correlation is computed between standardized random variables, correlation is not affected by a linear rescaling of either variable (e.g., a change in measurement units from minutes to seconds).
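Here is the simulation sketch referred to above: it checks numerically that the covariance of the standardized variables equals the covariance divided by the product of the standard deviations, and that correlation is unchanged by a linear rescaling (again assuming the X and Y from Example 9.1).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000

roll1 = rng.integers(1, 5, size=n)
roll2 = rng.integers(1, 5, size=n)
x = roll1 + roll2              # X = sum (assumed, as in Example 9.1)
y = np.maximum(roll1, roll2)   # Y = larger of the two rolls

# Correlation as the covariance of the standardized random variables
x_std = (x - x.mean()) / x.std()
y_std = (y - y.mean()) / y.std()
corr1 = np.mean(x_std * y_std)

# Equivalently, covariance divided by the product of standard deviations
corr2 = np.mean((x - x.mean()) * (y - y.mean())) / (x.std() * y.std())
print(corr1, corr2)

# Correlation is unaffected by a linear rescaling (e.g., a change of units)
x_rescaled = 60 * x + 10
print(np.corrcoef(x_rescaled, y)[0, 1])  # same correlation as corr1/corr2
```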
Example 9.5
Donny Don’t has just completed a problem where it was assumed that SAT Math scores follow a Normal(500, 100) distribution. Now a follow-up problem asks Donny how he could simulate a single (Math, Reading) pair of scores. Donny says: “That’s easy; just spin the Normal(500, 100) spinner twice, once for Math and once for Reading.” Do you agree? Explain your reasoning.
- Just as Normal distributions are commonly assumed for marginal distributions of individual random variables, joint Normal distributions are often assumed for joint distributions of several random variables.
- A “Bivariate Normal” distribution is a joint distribution for two random variables which has five parameters: the two means, the two standard deviations, and the correlation.
- A marginal Normal distribution is a “bell-shaped curve”; a Bivariate Normal distribution is a “mound-shaped” curve — imagine a pile of sand.
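Here is a minimal sketch of simulating (Math, Reading) pairs from a Bivariate Normal distribution, assuming Normal(500, 100) marginal distributions for both scores and a correlation of 0.7 (the correlation value is made up purely for illustration):

```python
import numpy as np

# Hypothetical parameters: SAT Math and Reading scores, each Normal(500, 100),
# with an assumed correlation of 0.7 (this correlation value is made up)
mean_m, mean_r = 500, 500
sd_m, sd_r = 100, 100
rho = 0.7

# The five parameters determine the mean vector and covariance matrix
mean = [mean_m, mean_r]
cov = [[sd_m**2, rho * sd_m * sd_r],
       [rho * sd_m * sd_r, sd_r**2]]

rng = np.random.default_rng(4)
math_scores, reading_scores = rng.multivariate_normal(mean, cov, size=10_000).T

# Check that the simulated pairs have (approximately) the intended parameters
print(math_scores.mean(), math_scores.std(), reading_scores.mean(), reading_scores.std())
print(np.corrcoef(math_scores, reading_scores)[0, 1])
```

Note that, unlike Donny’s proposal in Example 9.5, the two scores are not simulated with independent spins; the correlation parameter controls how strongly Math and Reading scores tend to move together.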