23  Conditional Distributions

23.1 Discrete random variables: Conditional probability mass functions

Example 23.1 Roll a fair four-sided die once and let \(X\) be the number rolled. Then flip a fair coin \(X\) times and let \(Y\) be the number of heads.

  1. Identify the possible values of \(X\).




  2. Identify the possible values of \(Y\).




  3. Find the conditional distribution of \(Y\) given \(X=4\).




  4. Find the conditional distribution of \(Y\) given \(X=3\).




  5. Find the probability that \(X=3\) and \(Y=2\).




  6. Find the probability that \(X=3\) and \(Y=y\) for \(y = 0, 1, 2, 3, 4\).




  7. Find the joint distribution of \(X\) and \(Y\).




  8. Find the marginal distribution of \(Y\).




  9. Find the conditional distribution of \(X\) given \(Y=2\).




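Answers to these parts can be checked with a short simulation. Below is a minimal sketch for Example 23.1, assuming Python with NumPy (not part of the original notes; the seed and number of repetitions are arbitrary). The conditional distribution of \(Y\) given \(X=3\) is approximated by keeping only the repetitions on which \(X=3\).

```python
import numpy as np

rng = np.random.default_rng(2023)
n_rep = 100_000

# Two-stage simulation: roll a fair four-sided die, then flip a fair coin
# that many times and count the heads.
x = rng.integers(1, 5, size=n_rep)      # die roll: 1, 2, 3, or 4
y = rng.binomial(n=x, p=0.5)            # number of heads in x flips

# Conditional distribution of Y given X = 3: keep only the repetitions
# with X = 3 and tabulate Y on that subset.
y_given_x3 = y[x == 3]
values, counts = np.unique(y_given_x3, return_counts=True)
print(dict(zip(values, counts / len(y_given_x3))))

# Joint probability P(X = 3, Y = 2): proportion of repetitions on which
# both events occur.
print(np.mean((x == 3) & (y == 2)))
```
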
  • Let \(X\) and \(Y\) be two discrete random variables defined on a probability space with probability measure \(\text{P}\). For any fixed \(x\) with \(\text{P}(X=x)>0\), the conditional probability mass function (pmf) of \(Y\) given \(X=x\) is a function \(p_{Y|X}\) defined by \(p_{Y|X}(y|x)=\text{P}(Y=y|X=x)\). \[\begin{align*} p_{Y|X}(y|x) = \text{P}(Y=y|X=x) & = \frac{\text{P}(X=x,Y=y)}{\text{P}(X=x)} = \frac{p_{X,Y}(x,y)}{p_X(x)}& & \text{a function of $y$ for fixed $x$} \end{align*}\]
  • To emphasize, the notation \(p_{Y|X}(\cdot|x)\) represents the distribution of the random variable \(Y\) given a fixed value \(x\) of the random variable \(X\). In the expression \(p_{Y|X}(y|x)\), \(y\) is treated as the variable and \(x\) is treated as a fixed constant.
  • Notice that the pmfs satisfy \[ \text{conditional} = \frac{\text{joint}}{\text{marginal}} \]
  • Conditional distributions can be obtained from a joint distribution by slicing and renormalizing. The conditional pmf of \(Y\) given \(X=x\) can be thought of as:
    • the slice of the joint pmf \(p_{X, Y}(x, y)\) of \((X, Y)\) corresponding to \(X=x\), a function of \(y\) alone,
    • renormalized — by dividing by \(p_X(x)\) — so that the probabilities for the slice, corresponding to the different \(y\) values, sum to 1.
  • For a fixed \(x\), the shape of the conditional pmf of \(Y\) given \(X=x\) is determined by the shape of the \(x\)-slice of the joint pmf, \(p_{X, Y}(x, y)\). That is, \[ \text{As a function of values of $Y$}, \quad p_{Y|X}(y|x) \propto p_{X, Y}(x, y) \]
  • For each fixed \(x\), the conditional pmf \(p_{Y|X}(\cdot |x)\) is a different distribution on values of the random variable \(Y\). There is not one “conditional distribution of \(Y\) given \(X\)”, but rather a family of conditional distributions of \(Y\) given different values of \(X\).
  • Rearranging the definition of a conditional pmf yields the multiplication rule for pmfs of discrete random variables \[\begin{align*} p_{X,Y}(x,y) & = p_{Y|X}(y|x)p_X(x)\\ & = p_{X|Y}(x|y)p_Y(y)\\ \text{joint} & = \text{conditional}\times\text{marginal} \end{align*}\]
  • Marginal distributions can be obtained from the joint distribution by collapsing/stacking using the law of total probability. The law of total probability for pmfs is \[\begin{align*} p_{Y}(y) & = \sum_x p_{X,Y}(x, y)\\ & =\sum_x p_{Y|X}(y|x)p_X(x) \end{align*}\]
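
The multiplication rule and the law of total probability can also be carried out mechanically. The sketch below (assuming Python; exact arithmetic via the standard library's `fractions` and `math.comb`) builds the joint pmf for Example 23.1 as conditional times marginal, collapses it to get the marginal pmf of \(Y\), and renormalizes a slice to get the conditional pmf of \(X\) given \(Y=2\).

```python
from fractions import Fraction
from math import comb

x_values = [1, 2, 3, 4]
y_values = [0, 1, 2, 3, 4]

def p_X(x):
    # Marginal pmf of X: fair four-sided die.
    return Fraction(1, 4)

def p_Y_given_X(y, x):
    # Conditional pmf of Y given X = x: Binomial(x, 1/2).
    return Fraction(comb(x, y), 2 ** x)      # comb(x, y) is 0 when y > x

# Multiplication rule: joint = conditional * marginal.
p_XY = {(x, y): p_Y_given_X(y, x) * p_X(x) for x in x_values for y in y_values}

# Law of total probability: collapse the joint pmf over x.
p_Y = {y: sum(p_XY[(x, y)] for x in x_values) for y in y_values}

# Conditional pmf of X given Y = 2: slice the joint pmf and renormalize.
p_X_given_Y2 = {x: p_XY[(x, 2)] / p_Y[2] for x in x_values}

print(p_Y)
print(p_X_given_Y2)
```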

23.2 Continuous random variables: Conditional probability density functions

  • Let \(X\) and \(Y\) be two continuous random variables with joint pdf \(f_{X,Y}\) and marginal pdfs \(f_X, f_Y\). For any fixed \(x\) with \(f_X(x)>0\), the conditional probability density function (pdf) of \(Y\) given \(X=x\) is a function \(f_{Y|X}\) defined by \[\begin{align*} f_{Y|X}(y|x) &= \frac{f_{X,Y}(x,y)}{f_X(x)}& & \text{a function of $y$ for fixed $x$} \end{align*}\]
  • To emphasize, the notation \(f_{Y|X}(y|x)\) represents a conditional distribution of the random variable \(Y\) for a fixed value \(x\) of the random variable \(X\). In the expression \(f_{Y|X}(y|x)\), \(x\) is treated as a fixed constant and \(y\) is treated as the variable.
  • Notice that the pdfs satisfy \[ \text{conditional} = \frac{\text{joint}}{\text{marginal}} \]
  • Conditional distributions can be obtained from a joint distribution by slicing and renormalizing. The conditional pdf of \(Y\) given \(X=x\) can be thought of as:
    • the slice of the joint pdf \(f_{X, Y}(x, y)\) of \((X, Y)\) corresponding to \(X=x\), a function of \(y\) alone,
    • renormalized — by dividing by \(f_X(x)\) — so that the density heights, corresponding to the different \(y\) values, are rescaled to make the total area under the density slice equal to 1.
  • For a fixed \(x\), the shape of the conditional pdf of \(Y\) given \(X=x\) is determined by the shape of the \(x\)-slice of the joint pdf, \(f_{X, Y}(x, y)\). That is, \[ \text{As a function of values of $Y$}, \quad f_{Y|X}(y|x) \propto f_{X, Y}(x, y) \]
  • For each fixed \(x\), the conditional pdf \(f_{Y|X}(\cdot |x)\) is a different distribution on values of the random variable \(Y\). There is not one “conditional distribution of \(Y\) given \(X\)”, but rather a family of conditional distributions of \(Y\) given different values of \(X\).
  • Rearranging the definition of a conditional pdf yields the multiplication rule for pdfs of continuous random variables \[\begin{align*} f_{X,Y}(x,y) & = f_{Y|X}(y|x)f_X(x)\\ & = f_{X|Y}(x|y)f_Y(y)\\ \text{joint} & = \text{conditional}\times\text{marginal} \end{align*}\]
  • Marginal distributions can be obtained from the joint distribution by collapsing/stacking using the law of total probability. The law of total probability for pdfs is \[\begin{align*} f_{Y}(y) & = \int_{-\infty}^\infty f_{X,Y}(x, y)\, dx\\ & =\int_{-\infty}^\infty f_{Y|X}(y|x)f_X(x)\, dx \end{align*}\]
  • Remember that the probability that a continuous random variable is equal to a particular value is 0; that is, for continuous \(X\), \(\text{P}(X=x)=0\). When we condition on \(\{X=x\}\) we are really conditioning on \(\{|X-x|<\epsilon\}\) and seeing what happens in the idealized limit when \(\epsilon\to0\).
  • When simulating, never condition on \(\{X=x\}\); rather, condition on \(\{|X-x|<\epsilon\}\) where \(\epsilon\) represents some suitable degree of precision (e.g. \(\epsilon=0.005\) if rounding to two decimal places).
  • Remember pdfs do not return probabilities directly; \(f_{Y|X}(y|x)\) is not a probability of anything. But \(f_{Y|X}(y|x)\) is related to the probability that \(Y\) is “close to” \(y\) given that \(X\) is “close to” \(x\): \[ \text{P}(y-\epsilon/2<Y < y+\epsilon/2\; \vert\; x-\epsilon/2<X < x+\epsilon/2) \approx \epsilon f_{Y|X}(y|x) \]
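
As a concrete illustration of conditioning on \(\{|X-x|<\epsilon\}\), the following sketch (assuming Python with NumPy, not part of the original notes) uses the spinner setup of Example 23.2 below and estimates a conditional probability given \(X=6\) by retaining only the repetitions with \(X\) within \(\epsilon\) of 6.

```python
import numpy as np

rng = np.random.default_rng(2023)
n_rep = 1_000_000

# Spin the Uniform(1, 4) spinner twice (the setup of Example 23.2 below);
# X is the sum of the two spins and Y is the larger of the two.
u = rng.uniform(1, 4, size=(n_rep, 2))
x, y = u.sum(axis=1), u.max(axis=1)

# "Condition on X = 6" by conditioning on {|X - 6| < eps} for small eps.
# Shrinking eps trades fewer retained repetitions for less discretization error.
for eps in [0.1, 0.01]:
    keep = np.abs(x - 6) < eps
    print(eps, keep.sum(), np.mean(y[keep] > 3.5))   # estimates P(Y > 3.5 | X = 6)
```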

Example 23.2 Recall the continuous analog of the four-sided die problem. Spin the Uniform(1, 4) spinner twice and let \(X\) be the sum of the two spins and \(Y\) the larger of the two spins (or the common value if a tie). Recall that the joint pdf is

\[ f_{X, Y}(x, y) = \begin{cases} 2/9, & 2<x<8,\; 1<y<4,\; x/2<y<x-1,\\ 0, & \text{otherwise,} \end{cases} \] the marginal pdf of \(Y\) is \[ f_Y(y) = \begin{cases} (2/9)(y-1), & 1<y<4,\\ 0, & \text{otherwise,} \end{cases} \] and the marginal pdf of \(X\) is \[ f_X(x) = \begin{cases} (1/9)(x-2), & 2 < x< 5,\\ (1/9)(8-x), & 5<x<8,\\ 0, & \text{otherwise.} \end{cases} \]

  1. Find \(f_{X|Y}(\cdot|3)\), the conditional pdf of \(X\) given \(Y=3\).




  2. Find \(\text{P}(X > 5.5 | Y = 3)\).




  3. Find \(f_{X|Y}(\cdot|4)\), the conditional pdf of \(X\) given \(Y=4\).




  4. Find \(\text{P}(X > 5.5 | Y = 4)\).




  5. Find \(f_{X|Y}(\cdot|y)\), the conditional pdf of \(X\) given \(Y=y\), for \(1<y<4\).




  6. Find \(f_{Y|X}(\cdot|3.5)\), the conditional pdf of \(Y\) given \(X=3.5\).




  7. Find \(f_{Y|X}(\cdot|6)\), the conditional pdf of \(Y\) given \(X=6\).




  8. Find \(f_{Y|X}(\cdot|x)\), the conditional pdf of \(Y\) given \(X=x\), for \(2<x<8\).




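The marginal pdfs stated at the start of Example 23.2 can be checked numerically. Below is a minimal sketch, assuming Python with NumPy and a simple Riemann sum (grid size arbitrary): integrating the joint pdf over one variable, with the other held fixed, should reproduce the stated marginal values.

```python
import numpy as np

# Joint pdf of (X, Y) from Example 23.2, with an indicator for the support.
def f_XY(x, y):
    support = (2 < x) & (x < 8) & (1 < y) & (y < 4) & (x / 2 < y) & (y < x - 1)
    return np.where(support, 2 / 9, 0.0)

grid = np.linspace(0, 10, 20001)
d = grid[1] - grid[0]

# Riemann sums: f_Y(3) should be (2/9)(3 - 1) = 4/9 and f_X(6) should be (8 - 6)/9 = 2/9.
print(np.sum(f_XY(grid, 3.0)) * d)   # integrate the joint pdf over x with y = 3 fixed
print(np.sum(f_XY(6.0, grid)) * d)   # integrate the joint pdf over y with x = 6 fixed
```
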
Example 23.3 Suppose \(X\) and \(Y\) are continuous random variables with joint pdf

\[ f_{X, Y}(x, y) = \frac{1}{x}e^{-x}, \qquad x > 0,\quad 0<y<x. \]

  1. Identify by name the one-way conditional distributions that you can obtain from the joint pdf, without doing any calculus or computation.




  2. Identify by name the marginal distribution you can obtain without doing any calculus or computation.




  3. Describe how you could use the Exponential(1) spinner and the Uniform(0, 1) spinner to generate an \((X, Y)\) pair.




  4. Sketch a plot of the joint pdf.




  5. Sketch a plot of the marginal pdf of \(Y\).




  6. Set up the calculation you would perform to find the marginal pdf of \(Y\).
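
Whatever integral you set up in part 6 can be checked numerically. Below is a minimal sketch, assuming Python with NumPy and truncating the integral over \(x\) at 30 (an assumption; \(e^{-x}\) is negligible beyond that).

```python
import numpy as np

# Joint pdf of (X, Y) from Example 23.3, with an indicator for the support.
def f_XY(x, y):
    support = (x > 0) & (y > 0) & (y < x)
    return np.where(support, np.exp(-x) / np.maximum(x, 1e-12), 0.0)

x_grid = np.linspace(0, 30, 300001)   # truncate the x integral at 30
dx = x_grid[1] - x_grid[0]

# Riemann-sum approximation of the marginal pdf of Y at a few values of y.
for y in [0.5, 1.0, 2.0]:
    print(y, np.sum(f_XY(x_grid, y)) * dx)
```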