• Quantities like expected value and variance summarize characteristics of the marginal distribution of a single random variable.
• When there are multiple random variables their joint distribution is of interest.
• Covariance and correlation summarize a characteristic of the joint distribution of two random variables, namely, the degree to which they “co-deviate from their respective means”.

## 21.1 Covariance

• The covariance between two random variables $$X$$ and $$Y$$ is \begin{align*} \text{Cov}(X,Y) & = \text{E}\left[\left(X-\text{E}[X]\right)\left(Y-\text{E}[Y]\right)\right]\\ & = \text{E}(XY) - \text{E}(X)\text{E}(Y) \end{align*}
• Covariance is defined as the long run average of the products of the paired deviations from the respective means.
• Covariance can be computed as the expected value of the product minus the product of expected values.
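The equivalence of the two formulas can be checked numerically. A minimal simulation sketch (the particular Normal distributions here are just illustrative choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.normal(0, 1, n)
y = x + rng.normal(0, 1, n)   # true Cov(X, Y) = Var(X) = 1

# Long run average of the products of paired deviations from the means
cov_deviations = np.mean((x - x.mean()) * (y - y.mean()))

# Shortcut: expected value of the product minus product of expected values
cov_shortcut = np.mean(x * y) - x.mean() * y.mean()
```

The two sample versions agree exactly (up to floating point), mirroring the algebraic identity above.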

Example 21.1 Consider the probability space corresponding to two rolls of a fair four-sided die. Let $$X$$ be the sum of the two rolls and $$Y$$ the larger of the two rolls. The joint pmf is contained in Example 10.1. It can also be shown that $$\text{E}(X) = 5$$ and $$\text{E}(Y) = 3.125$$.

1. Find $$\text{E}(XY)$$. Is it equal to $$\text{E}(X)\text{E}(Y)$$?

2. Find $$\text{Cov}(X, Y)$$. Why is the covariance positive?
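The quantities in this example can be verified by enumerating the 16 equally likely outcomes directly. A sketch using exact arithmetic via Python's `fractions` module:

```python
from itertools import product
from fractions import Fraction

# The 16 equally likely outcomes of two rolls of a fair four-sided die
outcomes = list(product([1, 2, 3, 4], repeat=2))
p = Fraction(1, len(outcomes))

E_X  = sum((a + b) * p for a, b in outcomes)             # X = sum of the rolls: 5
E_Y  = sum(max(a, b) * p for a, b in outcomes)           # Y = larger roll: 3.125
E_XY = sum((a + b) * max(a, b) * p for a, b in outcomes) # 16.875
cov  = E_XY - E_X * E_Y                                  # 16.875 - 5(3.125) = 1.25
```

Since $$\text{E}(XY) = 16.875 \neq 15.625 = \text{E}(X)\text{E}(Y)$$, the covariance is positive.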

• $$\text{Cov}(X,Y)>0$$ (positive association): above average values of $$X$$ tend to be associated with above average values of $$Y$$
• $$\text{Cov}(X,Y)<0$$ (negative association): above average values of $$X$$ tend to be associated with below average values of $$Y$$
• $$\text{Cov}(X,Y)=0$$ indicates that the random variables are uncorrelated: there is no overall positive or negative association. $X \text{ and } Y \text{ are uncorrelated} \Longleftrightarrow \text{Cov}(X, Y) = 0 \Longleftrightarrow \text{E}(XY) = \text{E}(X)\text{E}(Y)$
• But be careful! Uncorrelated is not the same as independent.

Example 21.2 What is another name for $$\text{Cov}(X, X)$$?

## 21.2 Law of the unconscious statistician (LOTUS)

• In general \begin{align*} \text{E}\left(g(X, Y)\right) & \neq g\left(\text{E}(X), \text{E}(Y)\right) \end{align*}
• LOTUS for two random variables \begin{align*} & \text{Discrete } X, Y \text{ with joint pmf } p_{X, Y}: & \text{E}[g(X, Y)] & = \sum_{x}\sum_{y} g(x, y)\, p_{X, Y}(x, y)\\ & \text{Continuous } X, Y \text{ with joint pdf } f_{X, Y}: & \text{E}[g(X, Y)] & = \int_{-\infty}^\infty\int_{-\infty}^\infty g(x, y)\, f_{X, Y}(x, y)\,dx\,dy \end{align*}

Example 21.3 Let $$X$$ and $$Y$$ denote the resistances (ohms) of two randomly selected resistors, with, respectively, Uniform(135, 165) and Uniform(162, 198) marginal distributions. Assume $$X$$ and $$Y$$ are independent. Suppose the resistors are connected in parallel so that the system resistance is $R = \frac{1}{1/X + 1/Y} = \frac{XY}{X+Y}$ Compute $$\text{E}(R)$$.
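A Monte Carlo sketch of this computation: simulate many $$(X, Y)$$ pairs from the independent marginal distributions and average the values of $$g(X, Y) = XY/(X+Y)$$ (the sample size and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
x = rng.uniform(135, 165, n)   # resistance of the first resistor
y = rng.uniform(162, 198, n)   # resistance of the second resistor

r = x * y / (x + y)            # system resistance for each simulated pair
E_R = r.mean()
```

Note that $$g(\text{E}(X), \text{E}(Y)) = (150)(180)/330 \approx 81.8$$ happens to be close to the simulated $$\text{E}(R)$$ here, but in general $$\text{E}(g(X, Y)) \neq g(\text{E}(X), \text{E}(Y))$$.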

## 21.3 Correlation

• The correlation (coefficient) between random variables $$X$$ and $$Y$$ is \begin{align*} \text{Corr}(X,Y) & = \text{Cov}\left(\frac{X-\text{E}(X)}{\text{SD}(X)},\frac{Y-\text{E}(Y)}{\text{SD}(Y)}\right)\\ & = \frac{\text{Cov}(X, Y)}{\text{SD}(X)\text{SD}(Y)} \end{align*}
• The correlation for two random variables is the covariance between the corresponding standardized random variables. Therefore, correlation is a standardized measure of the association between two random variables.
• A correlation coefficient has no units and is measured on a universal scale: regardless of the original measurement units of the random variables $$X$$ and $$Y$$, $-1\le \textrm{Corr}(X,Y)\le 1$
• $$\textrm{Corr}(X,Y) = 1$$ if and only if $$Y=aX+b$$ for some $$a>0$$
• $$\textrm{Corr}(X,Y) = -1$$ if and only if $$Y=aX+b$$ for some $$a<0$$
• Therefore, correlation is a standardized measure of the strength of the linear association between two random variables.
• Covariance is the correlation times the product of the standard deviations. $\text{Cov}(X, Y) = \text{Corr}(X, Y)\text{SD}(X)\text{SD}(Y)$
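These properties can be illustrated by simulation: correlation is unchanged by (positive) linear changes of units, and equals 1 for an exactly linear relationship. A sketch (the distributions and slopes are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.normal(50, 10, n)          # measurements in some original units
y = x + rng.normal(0, 5, n)

def corr(a, b):
    # Cov of standardized variables = Cov(a, b) / (SD(a) SD(b))
    return np.mean((a - a.mean()) * (b - b.mean())) / (a.std() * b.std())

r_original = corr(x, y)
r_rescaled = corr(9 / 5 * x + 32, 100 * y)   # linear unit changes, positive slopes
r_linear   = corr(x, 3 * x + 7)              # Y = aX + b with a > 0, so Corr = 1
```

Changing units leaves the correlation untouched, while the exactly linear pair has correlation 1 (up to floating point).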

Example 21.4 Let $$X$$ be the time (hours), starting now, until the next earthquake (of any magnitude) occurs in SoCal, and let $$Y$$ be the time (hours), starting now, until the second earthquake from now occurs (so that $$Y-X$$ is the time between the first and second earthquake). Suppose that $$X$$ and $$Y$$ are continuous RVs with joint pdf

$f_{X, Y}(x, y) = \begin{cases} 4e^{-2y}, & 0 < x< y < \infty,\\ 0, & \text{otherwise} \end{cases}$

Recall that in Example 20.5 we found the marginal distributions of $$X$$ and $$Y$$. Using the marginal distributions, it can be shown that $$\text{E}(X) = 1/2$$, $$\text{E}(Y) = 1$$, $$\text{SD}(X)=1/2$$, and $$\text{SD}(Y)=1/\sqrt{2}.$$

1. Compute $$\text{Cov}(X, Y)$$.

2. Compute $$\text{Corr}(X, Y)$$.
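The calculations in this example can be checked by simulation. Since the joint pdf factors as $$4e^{-2y} = \left(2e^{-2x}\right)\left(2e^{-2(y-x)}\right)$$ for $$0 < x < y$$, one way to simulate a pair is to generate $$X$$ as an Exponential(rate 2) time and then add an independent Exponential(rate 2) time to get $$Y$$. A sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
# X ~ Exponential with rate 2 (mean 1/2); Y = X + independent Exponential(rate 2)
x = rng.exponential(1 / 2, n)
y = x + rng.exponential(1 / 2, n)

cov_xy  = np.mean(x * y) - x.mean() * y.mean()   # theoretical value 1/4
corr_xy = cov_xy / (x.std() * y.std())           # theoretical value 1/sqrt(2) ~ 0.707
```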

## 21.4 Independent implies uncorrelated

• If $$X$$ and $$Y$$ are independent then $$X$$ and $$Y$$ are uncorrelated, that is, $$\text{Cov}(X, Y) = 0$$.

Example 21.5 Consider a simple electrical circuit with just a single resistor whose resistance follows a Uniform(10, 20) distribution.

1. Suppose a random current is generated. Assume the current and resistance are independent. If the current follows an Exponential distribution with mean 0.1, what is the expected voltage?

2. Suppose a random voltage is applied. Assume the voltage and resistance are independent. If the voltage follows a Normal(10, 2) distribution, what is the expected current?
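A simulation sketch of both parts, using Ohm's law ($$V = IR$$, so $$I = V/R$$). Since independence implies the variables are uncorrelated, $$\text{E}(IR) = \text{E}(I)\text{E}(R)$$; note in part 2 that $$\text{E}(1/R) \neq 1/\text{E}(R)$$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
resistance = rng.uniform(10, 20, n)

# Part 1: V = IR with I, R independent, so E(V) = E(I)E(R) = (0.1)(15) = 1.5
current = rng.exponential(0.1, n)          # Exponential with mean 0.1
E_V = np.mean(current * resistance)

# Part 2: I = V/R; independence gives E(I) = E(V)E(1/R), and
# E(1/R) = ln(2)/10 ~ 0.0693 is NOT 1/E(R) = 1/15 ~ 0.0667
voltage = rng.normal(10, 2, n)
E_I = np.mean(voltage / resistance)        # approximately 10 ln(2)/10 = ln(2) ~ 0.693
```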

## 21.5 Uncorrelated does not imply independent

Example 21.6 Find $$\text{Cov}(X,Y)$$ in each of the following situations. Notice that the marginal distribution of $$X$$ is the same for each part, and similarly for $$Y$$, and $$\text{E}(Y) = 0$$.

1. $$p_{X, Y}(x, y)$$

   | $$x$$ \ $$y$$ | -1 | 0 | 1 | $$p_X(x)$$ |
   |---|---|---|---|---|
   | 0 | 1/6 | 1/6 | 1/6 | 1/2 |
   | 1 | 1/6 | 1/6 | 1/6 | 1/2 |
   | $$p_Y(y)$$ | 1/3 | 1/3 | 1/3 | |

2. $$p_{X, Y}(x, y)$$

   | $$x$$ \ $$y$$ | -1 | 0 | 1 | $$p_X(x)$$ |
   |---|---|---|---|---|
   | 0 | 5/24 | 2/24 | 5/24 | 1/2 |
   | 1 | 3/24 | 6/24 | 3/24 | 1/2 |
   | $$p_Y(y)$$ | 1/3 | 1/3 | 1/3 | |
3. In which of the previous parts are $$X$$ and $$Y$$ independent? Why?
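The covariances in both parts can be computed by enumerating the joint pmfs directly. A sketch using exact arithmetic:

```python
from fractions import Fraction as F

# Joint pmfs from parts 1 and 2, keyed by (x, y)
pmf1 = {(x, y): F(1, 6) for x in (0, 1) for y in (-1, 0, 1)}
pmf2 = {(0, -1): F(5, 24), (0, 0): F(2, 24), (0, 1): F(5, 24),
        (1, -1): F(3, 24), (1, 0): F(6, 24), (1, 1): F(3, 24)}

def cov(pmf):
    E_x  = sum(x * p for (x, y), p in pmf.items())
    E_y  = sum(y * p for (x, y), p in pmf.items())
    E_xy = sum(x * y * p for (x, y), p in pmf.items())
    return E_xy - E_x * E_y

cov1 = cov(pmf1)   # 0: in part 1 the joint pmf is the product of the marginals
cov2 = cov(pmf2)   # also 0, even though X and Y are NOT independent in part 2
```

In part 2, for example, $$p_{X,Y}(1, 0) = 6/24 \neq (1/2)(1/3) = p_X(1)p_Y(0)$$, so $$X$$ and $$Y$$ are dependent despite having zero covariance.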

Example 21.7 Suppose that

• $$X$$ has a Normal(0, 1) distribution
• $$U$$ has a Uniform(-2, 2) distribution
• $$X$$ and $$U$$ are generated independently
• $$Y = UX$$.

1. Compute $$\text{Cov}(X, Y)$$. Are $$X$$ and $$Y$$ uncorrelated?

2. Are $$X$$ and $$Y$$ independent?
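A simulation sketch of this construction: the covariance comes out (approximately) zero, yet the dependence between $$X$$ and $$Y$$ is visible in, say, the clearly positive correlation between $$X^2$$ and $$Y^2$$ (since $$|Y| \le 2|X|$$, knowing $$X$$ constrains $$Y$$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
x = rng.normal(0, 1, n)
u = rng.uniform(-2, 2, n)   # generated independently of x
y = u * x

# Cov(X, Y) = E(U)E(X^2) = 0, so X and Y are uncorrelated
cov_xy = np.mean(x * y) - x.mean() * y.mean()

# But X and Y are not independent: X^2 and Y^2 are clearly correlated
corr_sq = np.corrcoef(x**2, y**2)[0, 1]
```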

• If $$X$$ and $$Y$$ are independent then $$X$$ and $$Y$$ are uncorrelated, that is, $$\text{Cov}(X, Y) = 0$$.
• However, the converse is not true in general: there are many situations in which $$\text{Cov}(X, Y) = 0$$ but $$X$$ and $$Y$$ are not independent.
• Independence means there is no probabilistic relationship between two random variables.
• Covariance only measures a particular kind of probabilistic relationship, namely: how closely do the $$(X, Y)$$ pairs tend to follow a single straight line with non-zero slope?