21 Covariance and Correlation
- Quantities like expected value and variance summarize characteristics of the marginal distribution of a single random variable.
- When there are multiple random variables their joint distribution is of interest.
- Covariance and correlation summarize a characteristic of the joint distribution of two random variables, namely, the degree to which they “co-deviate from their respective means”.
21.1 Covariance
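- The covariance between two random variables $X$ and $Y$ is
$$\text{Cov}(X, Y) = \text{E}\left[\left(X - \text{E}(X)\right)\left(Y - \text{E}(Y)\right)\right] = \text{E}(XY) - \text{E}(X)\text{E}(Y)$$
- Covariance is defined as the long run average of the products of the paired deviations from the mean.
- Covariance can be computed as the expected value of the product minus the product of expected values.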
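As a small check that the two expressions for covariance agree, here is a minimal sketch in Python. The joint pmf and the helper `expect` are hypothetical, chosen only for illustration; they are not taken from the examples in this chapter.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y), chosen only for illustration.
joint_pmf = {
    (0, 0): Fraction(3, 10),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(2, 10),
    (1, 1): Fraction(4, 10),
}

def expect(g, pmf):
    """Expected value of g(X, Y): sum of g(x, y) weighted by the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

mean_x = expect(lambda x, y: x, joint_pmf)  # E(X) = 3/5
mean_y = expect(lambda x, y: y, joint_pmf)  # E(Y) = 1/2

# Definition: expected product of the paired deviations from the means.
cov_definition = expect(lambda x, y: (x - mean_x) * (y - mean_y), joint_pmf)

# Shortcut: expected value of the product minus the product of expected values.
cov_shortcut = expect(lambda x, y: x * y, joint_pmf) - mean_x * mean_y

print(cov_definition, cov_shortcut)  # 1/10 1/10
```

Exact fractions are used so that the two results can be compared without rounding error.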
Example 21.1 Consider the probability space corresponding to two rolls of a fair four-sided die. Let
Find $\text{E}(XY)$. Is it equal to $\text{E}(X)\text{E}(Y)$?
Find $\text{Cov}(X, Y)$. Why is the covariance positive?
- $\text{Cov}(X, Y) > 0$ (positive association): above average values of $X$ tend to be associated with above average values of $Y$.
- $\text{Cov}(X, Y) < 0$ (negative association): above average values of $X$ tend to be associated with below average values of $Y$.
- $\text{Cov}(X, Y) = 0$ indicates that the random variables are uncorrelated: there is no overall positive or negative association.
- But be careful! Uncorrelated is not the same as independent.
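To see why uncorrelated does not imply independent, consider a sketch with a hypothetical pair: $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$. Here $Y$ is completely determined by $X$, yet the covariance is zero. The names `joint_pmf` and `expect` are assumptions made for this illustration.

```python
from fractions import Fraction

# Hypothetical example: X is uniform on {-1, 0, 1} and Y = X**2.
joint_pmf = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expect(g, pmf):
    """Expected value of g(X, Y) under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

covariance = (
    expect(lambda x, y: x * y, joint_pmf)
    - expect(lambda x, y: x, joint_pmf) * expect(lambda x, y: y, joint_pmf)
)

# Independence would require P(X = 0, Y = 0) = P(X = 0) * P(Y = 0).
p_joint = joint_pmf.get((0, 0), Fraction(0))
p_x0 = sum(p for (x, _), p in joint_pmf.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint_pmf.items() if y == 0)

print(covariance)            # 0, so X and Y are uncorrelated
print(p_joint, p_x0 * p_y0)  # 1/3 versus 1/9, so X and Y are not independent
```

The association between $X$ and $Y$ is perfectly predictable but not linear, and covariance only picks up the overall positive or negative tendency.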
Example 21.2 What is another name for $\text{Cov}(X, X)$?
21.2 Law of the unconscious statistician (LOTUS)
- In general, $\text{E}\left(g(X, Y)\right) \neq g\left(\text{E}(X), \text{E}(Y)\right)$.
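- LOTUS for two random variables
$$\text{E}\left(g(X, Y)\right) = \sum_{x}\sum_{y} g(x, y)\, p_{X, Y}(x, y) \quad \text{(discrete)}$$
$$\text{E}\left(g(X, Y)\right) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x, y)\, f_{X, Y}(x, y)\, dx\, dy \quad \text{(continuous)}$$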
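For discrete random variables, LOTUS amounts to a weighted sum of $g(x, y)$ over the joint pmf. The sketch below applies it to two rolls of a fair four-sided die with a hypothetical choice of $g$, the squared difference of the rolls; the function name `lotus` and the choice of $g$ are assumptions for illustration only.

```python
from fractions import Fraction
from itertools import product

# Two rolls of a fair four-sided die: 16 equally likely ordered pairs.
joint_pmf = {pair: Fraction(1, 16) for pair in product(range(1, 5), repeat=2)}

def lotus(g, pmf):
    """E(g(X, Y)) as the sum of g(x, y) weighted by the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

def g(x, y):
    """Hypothetical transformation: squared difference of the two rolls."""
    return (x - y) ** 2

mean_first = lotus(lambda x, y: x, joint_pmf)   # E(first roll) = 5/2
mean_second = lotus(lambda x, y: y, joint_pmf)  # E(second roll) = 5/2

expected_g = lotus(g, joint_pmf)        # E(g(X, Y)) computed with LOTUS
g_of_means = g(mean_first, mean_second)  # g(E(X), E(Y)), a different quantity

print(expected_g, g_of_means)  # 5/2 0
```

Plugging the means into $g$ gives 0, while the long run average of the squared difference is 5/2, which is why the transformation has to be applied before averaging.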
Example 21.3 Let
21.3 Correlation
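- The correlation (coefficient) between random variables $X$ and $Y$ is
$$\text{Corr}(X, Y) = \text{Cov}\left(\frac{X - \text{E}(X)}{\text{SD}(X)}, \frac{Y - \text{E}(Y)}{\text{SD}(Y)}\right) = \frac{\text{Cov}(X, Y)}{\text{SD}(X)\,\text{SD}(Y)}$$
- The correlation for two random variables is the covariance between the corresponding standardized random variables. Therefore, correlation is a standardized measure of the association between two random variables.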
- A correlation coefficient has no units and is measured on a universal scale. Regardless of the original measurement units of the random variables $X$ and $Y$,
$$-1 \le \text{Corr}(X, Y) \le 1$$
- $\text{Corr}(X, Y) = 1$ if and only if $Y = aX + b$ for some $a > 0$.
- $\text{Corr}(X, Y) = -1$ if and only if $Y = aX + b$ for some $a < 0$.
- Therefore, correlation is a standardized measure of the strength of the linear association between two random variables.
- Covariance is the correlation times the product of the standard deviations: $\text{Cov}(X, Y) = \text{Corr}(X, Y)\,\text{SD}(X)\,\text{SD}(Y)$.
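The following sketch illustrates that correlation does not depend on measurement units while covariance does. The joint pmf, the helper names `expect` and `summarize`, and the unit change $X \mapsto 100X + 32$ are all hypothetical choices for illustration.

```python
import math

# Hypothetical joint pmf of (X, Y), chosen only for illustration.
joint_pmf = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}

def expect(g, pmf):
    """Expected value of g(X, Y) under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

def summarize(pmf):
    """Return (Cov(X, Y), SD(X), SD(Y), Corr(X, Y)) for a joint pmf."""
    mean_x = expect(lambda x, y: x, pmf)
    mean_y = expect(lambda x, y: y, pmf)
    cov = expect(lambda x, y: (x - mean_x) * (y - mean_y), pmf)
    sd_x = math.sqrt(expect(lambda x, y: (x - mean_x) ** 2, pmf))
    sd_y = math.sqrt(expect(lambda x, y: (y - mean_y) ** 2, pmf))
    return cov, sd_x, sd_y, cov / (sd_x * sd_y)

cov, sd_x, sd_y, corr = summarize(joint_pmf)

# Change the units of X with a hypothetical linear rescaling 100 * X + 32.
rescaled_pmf = {(100 * x + 32, y): p for (x, y), p in joint_pmf.items()}
cov_rescaled, _, _, corr_rescaled = summarize(rescaled_pmf)

print(cov, cov_rescaled)    # covariance scales with the units of X
print(corr, corr_rescaled)  # correlation is unchanged, up to rounding
```

Because the rescaling has a positive slope, the correlation keeps its sign as well as its magnitude; a negative slope would flip the sign.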
Example 21.4 Let
Recall that in Example 20.5 we found the marginal distributions of $X$ and $Y$.
Compute $\text{Cov}(X, Y)$.
Compute $\text{Corr}(X, Y)$.