Chapter 4 Bivariate Distributions

4.1 Bivariate Distributions of the Discrete Type

Definition of a joint probability mass function:

Given a random experiment with sample space $S$ and discrete random variables $X$ and $Y$, the joint p.m.f. $f(x,y)$ is the function $f: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ that gives the probability that $X=x$ and $Y=y$ simultaneously, i.e. $f(x,y) = P(X=x \cap Y=y)$.

Example: Let $X$ represent the number of heads obtained when we flip a fair coin twice. The sample space is $S_X = \{0, 1, 2\}$. You should recognize that $X$ is binomial; $X \sim \text{BIN}(n=2, \pi=0.5)$.

Let $Y$ represent the result from rolling a standard six-sided die. Its sample space is $S_Y = \{1, 2, 3, 4, 5, 6\}$. We often describe $Y$ as having a discrete uniform distribution with $n=6$.

In general, the p.m.f. for a discrete random variable $Y \sim \text{DU}(n)$ is: $$f(y) = \begin{cases} \dfrac{1}{n}, & y = 1, 2, \ldots, n \\ 0, & \text{otherwise} \end{cases}$$

The sample space for the joint distribution of $X$ and $Y$ is all ordered pairs with non-zero probability: $$S = \{(0,1), (0,2), (0,3), \ldots, (2,4), (2,5), (2,6)\}$$

Eventually, we would like to have a formula to represent the joint pmf f(x,y). First, we will consider the marginal distributions.

For $X$, we know that $$f(x) = \binom{2}{x} (0.5)^x (0.5)^{2-x}; \quad x = 0, 1, 2$$

Since X is binomial, we can use the ‘shortcut’ formulas, rather than the definition, for finding the expected value and variance.

$$E(X) = n\pi = 2 \times 0.5 = 1 \qquad \operatorname{Var}(X) = n\pi(1-\pi) = \frac{1}{2}$$

Since $Y$ is discrete uniform, we have the following 'shortcut' formulas, which can be derived (exercise #64) using the definitions of expected value and variance, along with some of the summation formulas we found through proof by induction in Appendix B.

$$E(Y) = \frac{n+1}{2} = \frac{7}{2} \qquad \operatorname{Var}(Y) = \frac{n^2 - 1}{12} = \frac{35}{12}$$
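As a quick check of these shortcuts (separate from the algebraic derivation in exercise #64), here is a minimal Python sketch, added for illustration, that computes $E(Y)$ and $\operatorname{Var}(Y)$ directly from the definitions for $n=6$:

```python
from fractions import Fraction

# Verify the discrete uniform 'shortcut' formulas for n = 6 by direct summation
n = 6
f = Fraction(1, n)                            # DU(n) pmf: f(y) = 1/n

EY = sum(y * f for y in range(1, n + 1))      # E(Y) by definition
EY2 = sum(y**2 * f for y in range(1, n + 1))
VarY = EY2 - EY**2                            # Var(Y) by definition

assert EY == Fraction(n + 1, 2)               # 7/2
assert VarY == Fraction(n**2 - 1, 12)         # 35/12
print(EY, VarY)                               # 7/2 35/12
```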

Let’s look at a graph of the joint p.m.f. How do we get the probabilities that are shown?

(Figure: graph of the joint p.m.f. of $X$ and $Y$)

Consider $f(1,1)$, the probability that both $X=1$ and $Y=1$. Since $X$ and $Y$ are independent, we can find $f(x,y) = f_X(x) f_Y(y)$. We would NOT be able to directly multiply if the variables were not independent.

$$f(1,1) = f_X(1) f_Y(1) = \frac{1}{2} \times \frac{1}{6} = \frac{1}{12}$$

Similarly, $f(0,2)$ and $f(2,6)$ are both equal to $\frac{1}{24}$.

$$f(0,2) = f_X(0) f_Y(2) = \frac{1}{4} \times \frac{1}{6} = \frac{1}{24} \qquad f(2,6) = f_X(2) f_Y(6) = \frac{1}{4} \times \frac{1}{6} = \frac{1}{24}$$

So the joint pmf of $X$ and $Y$ is: $$f(x,y) = \begin{cases} \dfrac{1}{24}, & x = 0, 2; \ y = 1, 2, 3, 4, 5, 6 \\[4pt] \dfrac{1}{12}, & x = 1; \ y = 1, 2, 3, 4, 5, 6 \end{cases}$$

or $$f(x,y) = \begin{cases} \dbinom{2}{x} \dfrac{1}{24}, & x = 0, 1, 2; \ y = 1, 2, 3, 4, 5, 6 \\[4pt] 0, & \text{otherwise} \end{cases}$$
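As an illustration (not part of the text's exercises), a few lines of Python can tabulate this joint pmf with exact fractions and confirm that it sums to 1 and matches the probabilities computed above:

```python
from fractions import Fraction
from math import comb

fX = {x: Fraction(comb(2, x), 4) for x in (0, 1, 2)}   # BIN(2, 0.5) marginal
fY = {y: Fraction(1, 6) for y in range(1, 7)}          # DU(6) marginal

# Independence lets us multiply the marginals: f(x,y) = C(2,x)/24
f = {(x, y): fX[x] * fY[y] for x in fX for y in fY}

assert sum(f.values()) == 1                            # valid joint pmf
assert f[(1, 1)] == Fraction(1, 12)
assert f[(0, 2)] == f[(2, 6)] == Fraction(1, 24)
```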

Now let’s look at a situation where we know the joint pmf and need to determine the marginal pmfs (i.e. the individual distributions of X and Y). Also, we will determine if X and Y are independent or not.

Suppose the joint pmf for discrete random variables $X$ and $Y$ is:

$$f(x,y) = \frac{x+y}{32}, \quad x = 1, 2; \ y = 1, 2, 3, 4$$

(Draw a diagram to display $f(x,y)$, $f_X(x)$, $f_Y(y)$.)

The marginal pmfs for $X$ and $Y$ are: $$f_X(x) = \sum_y f(x,y) = \sum_{y=1}^{4} \frac{x+y}{32} = \frac{x+1}{32} + \frac{x+2}{32} + \frac{x+3}{32} + \frac{x+4}{32} = \frac{4x+10}{32} = \frac{2x+5}{16}, \quad x = 1, 2$$ (and $f_X(x) = 0$ otherwise). Summing over $x$ instead gives $$f_Y(y) = \sum_{x=1}^{2} \frac{x+y}{32} = \frac{2y+3}{32}, \quad y = 1, 2, 3, 4$$ (and $f_Y(y) = 0$ otherwise).
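The same marginalization can be done mechanically by summing the joint pmf over the other variable; here is a short illustrative Python sketch:

```python
from fractions import Fraction

xs, ys = (1, 2), (1, 2, 3, 4)
f = {(x, y): Fraction(x + y, 32) for x in xs for y in ys}   # joint pmf

fX = {x: sum(f[(x, y)] for y in ys) for x in xs}            # sum out y
fY = {y: sum(f[(x, y)] for x in xs) for y in ys}            # sum out x

assert all(fX[x] == Fraction(2 * x + 5, 16) for x in xs)
assert all(fY[y] == Fraction(2 * y + 3, 32) for y in ys)
```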

4.2 The Correlation Coefficient

Recall that $\operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)$ when $X$ and $Y$ are independent. What if they are NOT independent?

In that case, we have $$\operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2E(XY) - 2E(X)E(Y)$$ where the final two terms do NOT cancel out, due to the lack of independence. We define covariance as: $$\operatorname{Cov}(X,Y) = E(XY) - E(X)E(Y) = E[(X - \mu_X)(Y - \mu_Y)].$$

Thus, the variance of the sum of two jointly distributed random variables is: $$\operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X,Y)$$

Obviously, if $X$ and $Y$ are independent, then $\operatorname{Cov}(X,Y) = 0$. However, the converse is not always true.

Let's compute the covariance for our dependent discrete random variables $X$ and $Y$ where: $$f(x,y) = \frac{x+y}{32}, \quad x = 1, 2; \ y = 1, 2, 3, 4$$ $$f_X(x) = \frac{2x+5}{16}, \quad x = 1, 2 \qquad f_Y(y) = \frac{2y+3}{32}, \quad y = 1, 2, 3, 4$$

We need to compute $E(X)$, $E(Y)$, and $E(XY)$.

$$E(X) = \frac{25}{16}, \qquad E(Y) = \frac{90}{32} = \frac{45}{16}$$

$$E(XY) = \sum_{x=1}^{2} \sum_{y=1}^{4} xy\,f(x,y) = (1)(1)\frac{1+1}{32} + (1)(2)\frac{1+2}{32} + \cdots + (2)(4)\frac{2+4}{32} = \frac{140}{32}$$

Hence $$\operatorname{Cov}(X,Y) = \frac{140}{32} - \frac{25}{16} \times \frac{45}{16} = \frac{1120}{256} - \frac{1125}{256} = -\frac{5}{256}.$$

The covariance is hard to interpret. I have no idea if $\operatorname{Cov}(X,Y) = \sigma_{XY} = -\frac{5}{256}$ indicates a strong or weak degree of dependence between $X$ and $Y$. We typically 'standardize' covariance into a unitless version known as the correlation coefficient, denoted as $\rho$. The correlation coefficient has the property that $-1 \leq \rho \leq 1$.

$$\rho = \frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}.$$

For our random variables, $\operatorname{Var}(X) = \frac{63}{256}$ and, since $E(Y^2) = \frac{290}{32} = \frac{145}{16}$, $\operatorname{Var}(Y) = \frac{145}{16} - \left(\frac{45}{16}\right)^2 = \frac{295}{256}$.

So the correlation is $$\rho = \frac{-5/256}{\sqrt{(63/256) \times (295/256)}} = \frac{-5}{\sqrt{63 \times 295}} = \frac{-5}{\sqrt{18585}} \approx -0.037$$ This is very close to zero, so the linear dependence between $X$ and $Y$ is quite weak.
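All of these moments can be double-checked by brute-force summation over the eight support points; a minimal illustrative Python sketch:

```python
from fractions import Fraction
from math import sqrt

xs, ys = (1, 2), (1, 2, 3, 4)
f = {(x, y): Fraction(x + y, 32) for x in xs for y in ys}    # joint pmf

EX   = sum(x * p for (x, y), p in f.items())                 # 25/16
EY   = sum(y * p for (x, y), p in f.items())                 # 45/16
EXY  = sum(x * y * p for (x, y), p in f.items())             # 140/32
VarX = sum(x**2 * p for (x, y), p in f.items()) - EX**2      # 63/256
VarY = sum(y**2 * p for (x, y), p in f.items()) - EY**2      # 295/256

cov = EXY - EX * EY                                          # -5/256
rho = float(cov) / sqrt(float(VarX * VarY))
print(cov, round(rho, 4))                                    # -5/256 -0.0367
```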

Suppose we consider two independent random variables. Let $X \sim \text{BIN}(2, \frac{1}{2})$ and $Y \sim \text{DU}(4)$; that is, $X$ is the number of heads obtained by flipping 2 coins and $Y$ is the result of rolling a four-sided die.

For $X$, we have $E(X) = n\pi = 2\left(\frac{1}{2}\right) = 1$ and $\operatorname{Var}(X) = n\pi(1-\pi) = 2\left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = \frac{1}{2}$.

For $Y$, we have $E(Y) = \frac{n+1}{2} = \frac{5}{2}$ and $\operatorname{Var}(Y) = \frac{n^2-1}{12} = \frac{15}{12}$.

Since $$f(x) = \binom{2}{x}\left(\frac{1}{2}\right)^x \left(\frac{1}{2}\right)^{2-x} = \frac{1}{4}\binom{2}{x}, \quad x = 0, 1, 2$$ and $$f(y) = \frac{1}{4}, \quad y = 1, 2, 3, 4,$$ then $$f(x,y) = \binom{2}{x}\frac{1}{16}, \quad x = 0, 1, 2; \ y = 1, 2, 3, 4$$

The covariance is: $$E(XY) = \sum_{x=0}^{2} \sum_{y=1}^{4} xy\,f(x,y) = (0)(1)\binom{2}{0}\frac{1}{16} + \cdots + (2)(4)\binom{2}{2}\frac{1}{16} = \frac{40}{16} = \frac{5}{2}$$ $$\operatorname{Cov}(X,Y) = E(XY) - E(X)E(Y) = \frac{5}{2} - (1)\left(\frac{5}{2}\right) = 0$$

Since $\operatorname{Cov}(X,Y) = 0$, we have $\rho = 0$.
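Again as an illustration, summing directly over the 12 support points confirms the zero covariance:

```python
from fractions import Fraction
from math import comb

f = {(x, y): Fraction(comb(2, x), 16)
     for x in (0, 1, 2) for y in (1, 2, 3, 4)}       # joint pmf

EX  = sum(x * p for (x, y), p in f.items())          # 1
EY  = sum(y * p for (x, y), p in f.items())          # 5/2
EXY = sum(x * y * p for (x, y), p in f.items())      # 5/2

assert EXY - EX * EY == 0    # Cov(X,Y) = 0, as independence guarantees
```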

Remember, when $X$ and $Y$ are independent, then $\operatorname{Cov}(X,Y) = \rho = 0$, but this implication does not work in the other direction. That is, it is possible to have $\operatorname{Cov}(X,Y) = 0$ when $X$ and $Y$ are not independent!

Example: Suppose that the joint pmf of $X$ and $Y$ is defined over three ordered pairs $(x,y)$ as follows: $$f(x,y) = \frac{1}{3}, \quad (x,y) \in \{(0,1), (1,0), (2,1)\}$$

If you draw the joint distribution, notice that the points do NOT form a Cartesian product or 'rectangular' support space. In this simple distribution, if I know that $Y=0$, then I am certain that $X=1$, hence the variables cannot be independent.

$$f_X(x) = \frac{1}{3}, \quad x = 0, 1, 2 \qquad \text{but} \qquad f_Y(0) = \frac{1}{3}, \quad f_Y(1) = \frac{2}{3}$$ and notice that $f(1,1) = 0$ (the point $(1,1)$ is not in our space) but that $f_X(1) f_Y(1) = \frac{1}{3} \times \frac{2}{3} \neq 0$, so $f(x,y) \neq f_X(x) f_Y(y)$ for at least some $(x,y)$. Thus, $X$ and $Y$ are not independent.

Let's compute the covariance. $$E(X) = \sum x f(x) = 0\left(\tfrac{1}{3}\right) + 1\left(\tfrac{1}{3}\right) + 2\left(\tfrac{1}{3}\right) = 1$$ $$E(Y) = \sum y f(y) = 0\left(\tfrac{1}{3}\right) + 1\left(\tfrac{2}{3}\right) = \frac{2}{3}$$ $$E(XY) = \sum xy f(x,y) = (0)(1)\left(\tfrac{1}{3}\right) + (1)(0)\left(\tfrac{1}{3}\right) + (2)(1)\left(\tfrac{1}{3}\right) = \frac{2}{3}$$ $$\operatorname{Cov}(X,Y) = E(XY) - E(X)E(Y) = \frac{2}{3} - 1\left(\frac{2}{3}\right) = 0$$

Thus, we have an example where two random variables are NOT independent but still have a covariance and a correlation of zero.
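A few lines of Python (illustrative) make both facts concrete at once: the joint pmf fails to factor, yet the covariance is exactly zero:

```python
from fractions import Fraction

f = {(0, 1): Fraction(1, 3), (1, 0): Fraction(1, 3), (2, 1): Fraction(1, 3)}

fX = {x: sum(p for (a, b), p in f.items() if a == x) for x in (0, 1, 2)}
fY = {y: sum(p for (a, b), p in f.items() if b == y) for y in (0, 1)}

# Not independent: f(1,1) = 0, but fX(1)*fY(1) = (1/3)(2/3) = 2/9
assert f.get((1, 1), 0) != fX[1] * fY[1]

EX  = sum(a * p for (a, b), p in f.items())        # 1
EY  = sum(b * p for (a, b), p in f.items())        # 2/3
EXY = sum(a * b * p for (a, b), p in f.items())    # 2/3
assert EXY - EX * EY == 0                          # yet Cov(X,Y) = 0
```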

4.3 Conditional Distributions

We can find conditional distributions: a conditional distribution is the ratio of the joint distribution to the appropriate marginal.

$$f_{X|Y=y}(x) = \frac{f(x,y)}{f_Y(y)} = \frac{(x+y)/32}{(2y+3)/32} = \frac{x+y}{2y+3}, \quad x = 1, 2; \ y = 1, 2, 3, 4$$

For instance, $f_{X|y=1} = g(x|y=1) = \frac{x+1}{5}, \quad x = 1, 2$.

Similarly, we can find the conditional distribution in the other direction.

$$f_{Y|X=x}(y) = \frac{f(x,y)}{f_X(x)} = \frac{(x+y)/32}{(4x+10)/32} = \frac{x+y}{4x+10}, \quad x = 1, 2; \ y = 1, 2, 3, 4$$

For instance, $f_{Y|x=2} = h(y|x=2) = \frac{2+y}{18}, \quad y = 1, 2, 3, 4$.

We can find conditional means and variances just by applying the definitions of mean and variance.

$$E(X) = \sum_x x f_X(x) = \sum_{x=1}^{2} x\,\frac{4x+10}{32} = 1 \times \frac{14}{32} + 2 \times \frac{18}{32} = \frac{50}{32} = 1.5625$$ $$E(X|y=1) = \sum_x x\,g(x|y=1) = \sum_{x=1}^{2} x\,\frac{x+1}{5} = 1 \times \frac{2}{5} + 2 \times \frac{3}{5} = \frac{8}{5} = 1.6$$

Notice that $E(X) \neq E(X|y=1)$.

$$\operatorname{Var}(X) = E(X^2) - [E(X)]^2 = \sum_x x^2 f_X(x) - \left(\frac{50}{32}\right)^2 = \sum_{x=1}^{2} x^2\,\frac{4x+10}{32} - \left(\frac{25}{16}\right)^2 = 1^2 \times \frac{14}{32} + 2^2 \times \frac{18}{32} - \frac{625}{256} = \frac{63}{256} = 0.2460938$$

$$\operatorname{Var}(X|y=1) = E(X^2|y=1) - [E(X|y=1)]^2 = \sum_x x^2 g(x|y=1) - \left(\frac{8}{5}\right)^2 = \sum_{x=1}^{2} x^2\,\frac{x+1}{5} - \left(\frac{8}{5}\right)^2 = 1^2 \times \frac{2}{5} + 2^2 \times \frac{3}{5} - \frac{64}{25} = \frac{6}{25} = 0.24$$ Again, $\operatorname{Var}(X) \neq \operatorname{Var}(X|y=1)$.
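A short illustrative Python sketch reproduces these conditional moments by normalizing the $y=1$ column of the joint pmf:

```python
from fractions import Fraction

xs, ys = (1, 2), (1, 2, 3, 4)
f = {(x, y): Fraction(x + y, 32) for x in xs for y in ys}

fY1 = sum(f[(x, 1)] for x in xs)                 # fY(1) = 5/32
g = {x: f[(x, 1)] / fY1 for x in xs}             # g(x|y=1) = (x+1)/5

EX_c = sum(x * g[x] for x in xs)                      # 8/5 = 1.6
VarX_c = sum(x**2 * g[x] for x in xs) - EX_c**2       # 6/25 = 0.24

assert EX_c == Fraction(8, 5) and VarX_c == Fraction(6, 25)
```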

Are the random variables $X$ and $Y$ independent? If they are, then the joint pmf is equal to the product of the marginal pmfs (i.e. the joint pmf can be factored into the pmfs for the two individual random variables).

We have $f(x,y) = \frac{x+y}{32}$, $f_X(x) = \frac{2x+5}{16}$, and $f_Y(y) = \frac{2y+3}{32}$.

Clearly $X$ and $Y$ are not independent, as $$\frac{2x+5}{16} \times \frac{2y+3}{32} \neq \frac{x+y}{32}$$

4.4 Bivariate Distributions of the Continuous Type

A joint probability density function (pdf) is some function $f: \mathbb{R}^k \to \mathbb{R}$ such that:

  • $f(x_1, x_2, \ldots, x_k) = f(\mathbf{x}) \geq 0$ for all $\mathbf{x} \in \mathbb{R}^k$

  • $\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, x_2, \ldots, x_k)\,dx_1\,dx_2 \cdots dx_k = \int_{\mathbf{x} \in \mathbb{R}^k} f(\mathbf{x})\,d\mathbf{x} = 1$

Example: Let $X$ and $Y$ have joint pdf $$f(x,y) = x + y, \quad 0 < x < 1, \ 0 < y < 1$$

The marginal pdfs are easily obtained by integrating out the other variable.

$$f_X(x) = \int_0^1 (x+y)\,dy = \left[xy + \frac{y^2}{2}\right]_0^1 = x + \frac{1}{2}, \quad 0 < x < 1$$

Similarly, $f_Y(y) = y + \frac{1}{2}, \quad 0 < y < 1$.

Suppose I want the probability $P(X+Y \leq 1)$. I must use the joint pdf to evaluate this.

$$P(X+Y \leq 1) = \int_0^1 \int_0^{1-x} (x+y)\,dy\,dx = \int_0^1 \left[xy + \frac{y^2}{2}\right]_0^{1-x} dx = \int_0^1 \left[x(1-x) + \frac{(1-x)^2}{2}\right] dx = \int_0^1 \left[x - x^2 + \frac{1}{2} - x + \frac{x^2}{2}\right] dx = \int_0^1 \left[\frac{1}{2} - \frac{1}{2}x^2\right] dx = \left[\frac{x}{2} - \frac{x^3}{6}\right]_0^1 = \frac{1}{2} - \frac{1}{6} = \frac{1}{3}$$

Geometrically, this probability is the volume under the surface $f(x,y) = x + y$ above the set $\{(x,y) : 0 < x, \ 0 < y, \ x + y \leq 1\}$.
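If sympy is available, the double integral can be checked symbolically; a minimal illustrative sketch:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x + y                                  # joint pdf on the unit square

# Inner integral over y from 0 to 1-x, then outer integral over x
prob = sp.integrate(sp.integrate(f, (y, 0, 1 - x)), (x, 0, 1))
print(prob)                                # 1/3
```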

The conditional distribution is defined similarly as before, with $$\text{Conditional} = \frac{\text{Joint}}{\text{Marginal}}$$

$$g(x|y) = f_{X|Y=y}(x) = \frac{f(x,y)}{f_Y(y)}$$ In our example, $$g(x|y) = \frac{x+y}{y + 1/2}, \quad 0 < x < 1.$$

Another example of a joint pdf

Suppose that $$f(x,y) = \frac{1}{40}, \quad 0 \leq x \leq 10, \ 10 - x \leq y \leq 14 - x$$

Do the following:

  1. Draw the support space $S_{XY}$.

  2. Find the marginal pdf for $X$, $f_X(x)$.

  3. Find the conditional pdf for $Y|X=x$, $f_{Y|X=x}(y)$ or $g(y|x)$.

  4. Find the conditional expectation $E(Y|x)$.

  5. Find the conditional variance $\operatorname{Var}(Y|x)$.

Notice that $X$ and $Y$ are not independent. The support space $S_{XY}$ is not rectangular, as the support of $Y$ depends upon the value of $X$.

Find the marginal pdf for X.

$$f_X(x) = \int f(x,y)\,dy = \int_{10-x}^{14-x} \frac{1}{40}\,dy = \frac{y}{40}\bigg|_{10-x}^{14-x} = \frac{14-x}{40} - \frac{10-x}{40} = \frac{4}{40} = \frac{1}{10}, \quad 0 \leq x \leq 10$$ Notice that $X \sim \text{UNIF}(0, 10)$.

Find the conditional pdf for Y given X.

$$g(y|x) = \frac{f(x,y)}{f_X(x)} = \frac{1/40}{1/10} = \frac{1}{4}, \quad 10 - x \leq y \leq 14 - x, \ 0 \leq x \leq 10$$

Notice that, conditional on $X = x$, $Y \sim \text{UNIF}(10-x, 14-x)$.

The conditional expectation is:

$$E(Y|x) = \int y\,g(y|x)\,dy = \int_{10-x}^{14-x} \frac{y}{4}\,dy = \frac{y^2}{8}\bigg|_{10-x}^{14-x} = \frac{(14-x)^2}{8} - \frac{(10-x)^2}{8} = \frac{(196 - 28x + x^2) - (100 - 20x + x^2)}{8} = \frac{96 - 8x}{8} = 12 - x$$ So if $X = 5$, then $E(Y|X=5) = 12 - 5 = 7$.

The conditional variance is:

$$E(Y^2|x) = \int y^2 g(y|x)\,dy = \int_{10-x}^{14-x} \frac{y^2}{4}\,dy = \frac{y^3}{12}\bigg|_{10-x}^{14-x} = \frac{(14-x)^3}{12} - \frac{(10-x)^3}{12}$$

Recall that $(x-y)^3 = x^3 - 3x^2y + 3xy^2 - y^3$, so

$$E(Y^2|x) = \frac{(2744 - 588x + 42x^2 - x^3) - (1000 - 300x + 30x^2 - x^3)}{12} = \frac{1744 - 288x + 12x^2}{12} = \frac{436}{3} - 24x + x^2$$ $$\operatorname{Var}(Y|x) = E(Y^2|x) - [E(Y|x)]^2 = \left(\frac{436}{3} - 24x + x^2\right) - (12 - x)^2 = \left(\frac{436}{3} - 24x + x^2\right) - (144 - 24x + x^2) = \frac{436}{3} - 144 = \frac{4}{3}$$
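Both conditional moments can be confirmed symbolically, avoiding the cubic expansions; a sketch assuming sympy is installed:

```python
import sympy as sp

x, y = sp.symbols('x y')
g = sp.Rational(1, 4)                      # g(y|x) = 1/4 on [10-x, 14-x]

EY   = sp.expand(sp.integrate(y * g, (y, 10 - x, 14 - x)))      # 12 - x
EY2  = sp.integrate(y**2 * g, (y, 10 - x, 14 - x))
VarY = sp.simplify(EY2 - EY**2)                                 # 4/3

print(EY, VarY)                            # 12 - x  4/3
```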

If I had defined $X \sim \text{UNIF}(0,10)$ and $Y \sim \text{UNIF}(0,4)$ such that $X$ and $Y$ were independent, then $$f(x,y) = f_X(x) \times f_Y(y) = \frac{1}{10} \times \frac{1}{4} = \frac{1}{40}, \quad 0 \leq x \leq 10, \ 0 \leq y \leq 4$$

Of course, the support space is now rectangular (although that does NOT guarantee independence) and the support for $Y$ no longer depends on $X$.

4.5 The Bivariate Normal Distribution