21  Covariance and Correlation

21.1 Covariance

  • The covariance between two random variables X and Y is Cov(X,Y) = E[(X − E[X])(Y − E[Y])] = E(XY) − E(X)E(Y)
  • Covariance is defined as the long-run average of the products of the paired deviations from their respective means
  • Covariance can be computed as the expected value of the product minus the product of expected values.

Example 21.1 Consider the probability space corresponding to two rolls of a fair four-sided die. Let X be the sum of the two rolls and Y the larger of the two rolls. The joint pmf is contained in Example 10.1. It can also be shown that E(X)=5 and E(Y)=3.125.

  1. Find E(XY). Is it equal to E(X)E(Y)?




  2. Find Cov(X,Y). Why is the covariance positive?




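One way to check hand computations here is to enumerate the probability space directly: each of the 16 equally likely ordered pairs of rolls determines both X and Y. A short sketch using only the standard library (exact fractions, no rounding):

```python
from fractions import Fraction

# Enumerate the 16 equally likely ordered outcomes of two rolls of a fair d4.
outcomes = [(i, j) for i in range(1, 5) for j in range(1, 5)]
p = Fraction(1, 16)  # probability of each ordered pair

# X = sum of the two rolls, Y = larger of the two rolls
E_X  = sum(p * (i + j) for i, j in outcomes)          # 5, as stated above
E_Y  = sum(p * max(i, j) for i, j in outcomes)        # 3.125, as stated above
E_XY = sum(p * (i + j) * max(i, j) for i, j in outcomes)

cov = E_XY - E_X * E_Y
print(E_XY, E_X * E_Y, cov)
```

Comparing `E_XY` with `E_X * E_Y` answers part 1, and their difference is the covariance in part 2.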
  • Cov(X,Y)>0 (positive association): above average values of X tend to be associated with above average values of Y
  • Cov(X,Y)<0 (negative association): above average values of X tend to be associated with below average values of Y
  • Cov(X,Y)=0 indicates that the random variables are uncorrelated: there is no overall positive or negative association. X and Y are uncorrelated ⟺ Cov(X,Y) = 0 ⟺ E(XY) = E(X)E(Y)
  • But be careful! Uncorrelated is not the same as independent.

Example 21.2 What is another name for Cov(X,X)?






21.2 Law of the unconscious statistician (LOTUS)

  • In general, E(g(X,Y)) ≠ g(E(X), E(Y))
  • LOTUS for two random variables:
    Discrete X, Y with joint pmf p_{X,Y}: E[g(X,Y)] = ∑_x ∑_y g(x,y) p_{X,Y}(x,y)
    Continuous X, Y with joint pdf f_{X,Y}: E[g(X,Y)] = ∫∫ g(x,y) f_{X,Y}(x,y) dx dy

Example 21.3 Let X and Y denote the resistances (ohms) of two randomly selected resistors, with Uniform(135, 165) and Uniform(162, 198) marginal distributions, respectively. Assume X and Y are independent. Suppose the resistors are connected in parallel so that the system resistance is R = 1/(1/X + 1/Y) = XY/(X + Y). Compute E(R).






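The exact answer requires the double integral from LOTUS, but a Monte Carlo check is a quick sanity test: simulate independent draws from the two uniform distributions and average R = XY/(X + Y). A minimal sketch, assuming nothing beyond NumPy:

```python
import numpy as np

rng = np.random.default_rng(21)  # fixed seed for reproducibility
n = 1_000_000

x = rng.uniform(135, 165, n)  # resistance of the first resistor
y = rng.uniform(162, 198, n)  # resistance of the second resistor

r = x * y / (x + y)           # parallel resistance for each simulated pair
print(r.mean())               # Monte Carlo estimate of E(R)

# Note: R is a nonlinear function of (X, Y), so E(R) is generally NOT
# E(X)E(Y)/(E(X)+E(Y)); LOTUS is what justifies averaging r itself.
```

The estimate should land a little below XY/(X+Y) evaluated at the means, illustrating that plugging expected values into a nonlinear function does not give the expected value of the function.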
21.3 Correlation

  • The correlation (coefficient) between random variables X and Y is Corr(X,Y) = Cov((X − E(X))/SD(X), (Y − E(Y))/SD(Y)) = Cov(X,Y) / (SD(X) SD(Y))
  • The correlation for two random variables is the covariance between the corresponding standardized random variables. Therefore, correlation is a standardized measure of the association between two random variables.
  • A correlation coefficient has no units and is measured on a universal scale. Regardless of the original measurement units of the random variables X and Y, −1 ≤ Corr(X,Y) ≤ 1
  • Corr(X,Y)=1 if and only if Y=aX+b for some a>0
  • Corr(X,Y) = −1 if and only if Y = aX + b for some a < 0
  • Therefore, correlation is a standardized measure of the strength of the linear association between two random variables.
  • Covariance is the correlation times the product of the standard deviations. Cov(X,Y) = Corr(X,Y) · SD(X) · SD(Y)
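The "unit-free" and "linear" claims above are easy to verify numerically: a positive linear rescaling (a change of units) leaves the correlation unchanged, and an exact linear relationship gives correlation ±1. A minimal NumPy sketch; the simulated data here are arbitrary, just for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = x + rng.normal(size=100_000)    # x and y are positively associated

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]  # sample correlation coefficient

# Correlation is unit-free: a positive linear change of units leaves it
# unchanged, even though the covariance changes by the scale factor.
print(corr(x, y), corr(100 * x + 32, y))        # (essentially) equal

# Exact linear relationships give correlation +1 or -1.
print(corr(x, 3 * x + 1), corr(x, -3 * x + 1))  # +1 and -1
```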

Example 21.4 Let X be the time (hours), starting now, until the next earthquake (of any magnitude) occurs in SoCal, and let Y be the time (hours), starting now, until the second earthquake from now occurs (so that YX is the time between the first and second earthquake). Suppose that X and Y are continuous RVs with joint pdf

f_{X,Y}(x,y) = 4e^{−2y}, for 0 < x < y < ∞, and 0 otherwise

Recall that in Example 20.5 we found the marginal distributions of X and Y. Using the marginal distributions, it can be shown that E(X) = 1/2, E(Y) = 1, SD(X) = 1/2, and SD(Y) = 1/√2.

  1. Compute Cov(X,Y).




  2. Compute Corr(X,Y).




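For this joint pdf there is a direct way to simulate (X, Y): take X ~ Exponential(rate 2) and let Y = X plus an independent Exponential(rate 2) gap, since the product of those two densities, 2e^{−2x} · 2e^{−2(y−x)} = 4e^{−2y} on 0 < x < y, recovers the joint pdf above. Under that construction, a simulation sketch to check the covariance and correlation computations:

```python
import numpy as np

rng = np.random.default_rng(20)
n = 1_000_000

# X = time of first earthquake, gap = time between first and second;
# both Exponential with rate 2 (scale = 1/2), independent, Y = X + gap.
x = rng.exponential(scale=1/2, size=n)
gap = rng.exponential(scale=1/2, size=n)
y = x + gap

print(np.cov(x, y)[0, 1])       # estimate of Cov(X, Y)
print(np.corrcoef(x, y)[0, 1])  # estimate of Corr(X, Y)
```

The construction also explains the sign of the answer: Y literally contains X, so Cov(X, Y) = Cov(X, X + gap) = Var(X).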
21.4 Independent implies uncorrelated

  • If X and Y are independent then X and Y are uncorrelated, that is, Cov(X,Y)=0.

Example 21.5 Consider a simple electrical circuit with just a single resistor whose resistance follows a Uniform(10, 20) distribution.

  1. Suppose a random current is generated. Assume the current and resistance are independent. If the current follows an Exponential distribution with mean 0.1, what is the expected voltage?




  2. Suppose a random voltage is applied. Assume the voltage and resistance are independent. If the voltage follows a Normal(10, 2) distribution, what is the expected current?




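Both parts use the fact that independence lets expectations factor, E(g(I)h(R)) = E(g(I))E(h(R)), together with Ohm's law V = IR. The subtlety in part 2 is that the relevant factor is E(1/R), which is not 1/E(R). A simulation sketch to check both parts:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1_000_000
r = rng.uniform(10, 20, n)    # resistance (ohms)

# Part 1: V = I * R with I ~ Exponential(mean 0.1), independent of R.
i = rng.exponential(scale=0.1, size=n)
print((i * r).mean())         # estimate of E(V) = E(I) E(R)

# Part 2: I = V / R with V ~ Normal(10, 2), independent of R.
v = rng.normal(10, 2, n)
print((v / r).mean())         # estimate of E(I) = E(V) E(1/R)

# Caution: E(1/R) and 1/E(R) differ; only the former appears above.
print((1 / r).mean(), 1 / r.mean())
```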
21.5 Uncorrelated does not imply independent

Example 21.6 Find Cov(X,Y) in each of the following situations. Notice that the marginal distribution of X is the same in each part, and similarly for Y; also, E(Y) = 0.

  1. pX,Y(x,y)
     x \ y     -1      0      1    pX(x)
       0      1/6    1/6    1/6     1/2
       1      1/6    1/6    1/6     1/2
     pY(y)    1/3    1/3    1/3
  2. pX,Y(x,y)
     x \ y     -1      0      1    pX(x)
       0      5/24   2/24   5/24    1/2
       1      3/24   6/24   3/24    1/2
     pY(y)    1/3    1/3    1/3
  3. In which of the previous parts are X and Y independent? Why?




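The two tables can be checked mechanically: compute Cov(X, Y) from each joint pmf, then compare every joint probability with the product of the corresponding marginals. A sketch using exact fractions, with the pmfs transcribed from parts 1 and 2 above:

```python
from fractions import Fraction as F

# Joint pmfs as {(x, y): probability}, transcribed from the tables above.
pmf1 = {(x, y): F(1, 6) for x in (0, 1) for y in (-1, 0, 1)}
pmf2 = {(0, -1): F(5, 24), (0, 0): F(2, 24), (0, 1): F(5, 24),
        (1, -1): F(3, 24), (1, 0): F(6, 24), (1, 1): F(3, 24)}

def cov(pmf):
    ex  = sum(p * x for (x, y), p in pmf.items())
    ey  = sum(p * y for (x, y), p in pmf.items())
    exy = sum(p * x * y for (x, y), p in pmf.items())
    return exy - ex * ey

def independent(pmf):
    # Independence: every joint probability equals the product of marginals.
    px = {x: sum(p for (a, _), p in pmf.items() if a == x) for x in (0, 1)}
    py = {y: sum(p for (_, b), p in pmf.items() if b == y) for y in (-1, 0, 1)}
    return all(pmf[(x, y)] == px[x] * py[y]
               for x in (0, 1) for y in (-1, 0, 1))

print(cov(pmf1), cov(pmf2))                  # covariances for parts 1 and 2
print(independent(pmf1), independent(pmf2))  # independence check for part 3
```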
Example 21.7 Suppose that

  • X has a Normal(0, 1) distribution
  • U has a Uniform(-2, 2) distribution
  • X and U are generated independently
  • Y=UX.
  1. Compute Cov(X,Y). Are X and Y uncorrelated?





  2. Are X and Y independent?




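A simulation makes both answers vivid: the sample covariance of (X, Y) hovers near 0, yet Y is clearly tied to X, since |Y| ≤ 2|X| always. One way to expose the dependence numerically is to correlate transformations of the variables, such as X² and Y², which would have to be uncorrelated if X and Y were independent. A NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1_000_000

x = rng.normal(0, 1, n)     # X ~ Normal(0, 1)
u = rng.uniform(-2, 2, n)   # U ~ Uniform(-2, 2), independent of X
y = u * x                   # Y = UX

# Uncorrelated: Cov(X, Y) = E(U) E(X^2) = 0, and the estimate reflects it.
print(np.cov(x, y)[0, 1])

# Not independent: X^2 and Y^2 are positively correlated, which is
# impossible for independent X and Y.
print(np.corrcoef(x ** 2, y ** 2)[0, 1])
```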
  • If X and Y are independent then X and Y are uncorrelated, that is, Cov(X,Y)=0.
  • However, the converse is not true in general: there are many situations in which Cov(X,Y)=0 but X and Y are not independent.
  • Independence means there is no probabilistic relationship between two random variables.
  • Covariance only measures a particular kind of probabilistic relationship, namely: how closely do the (X,Y) pairs tend to follow a single straight line with non-zero slope?