12  Joint Distributions

Example 12.1

Roll a fair four-sided die twice. Let \(X\) be the sum of the two rolls, and let \(Y\) be the larger of the two rolls (or the common value if both rolls are the same). Recall Table 5.1.

  1. Compute and interpret \(p_{X, Y}(5, 3) = \text{P}(X = 5, Y = 3)\).




  2. Construct a “flat” table displaying the distribution of \((X, Y)\) pairs, with one pair in each row.




  3. Construct a two-way table displaying the joint distribution of \(X\) and \(Y\).




  4. Sketch a plot depicting the joint distribution of \(X\) and \(Y\).




  5. Starting with the two-way table, how could you obtain \(\text{P}(X = 5)\)?




  6. Starting with the two-way table, how could you obtain the marginal distribution of \(X\)? of \(Y\)?




  7. Starting with the marginal distribution of \(X\) and the marginal distribution of \(Y\), could you necessarily construct the two-way table of the joint distribution? Explain.
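
As a sketch of how parts 1 and 5 might be organized: each of the 16 (first roll, second roll) outcomes is equally likely, and the event \(\{X = 5, Y = 3\}\) occurs only for the outcomes \((2, 3)\) and \((3, 2)\).

\[
p_{X, Y}(5, 3) = \text{P}(X = 5, Y = 3) = \text{P}(\{(2, 3), (3, 2)\}) = \frac{2}{16} = 0.125
\]

In the long run, about 12.5% of pairs of rolls would have a sum of 5 and a larger roll of 3. For part 5, the event \(\{X = 5\}\) can be split according to the possible values of \(Y\), which amounts to adding across the \(x = 5\) row of the two-way table.

\[
\text{P}(X = 5) = p_{X, Y}(5, 3) + p_{X, Y}(5, 4) = \frac{2}{16} + \frac{2}{16} = \frac{4}{16} = 0.25
\]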




Table 12.1: Flat table representing the joint distribution of the sum (\(X\)) and larger (\(Y\)) of two rolls of a four-sided die.
\((x, y)\)   \(\text{P}(X = x, Y = y)\)
(2, 1)       0.0625
(3, 2)       0.1250
(4, 2)       0.0625
(4, 3)       0.1250
(5, 3)       0.1250
(5, 4)       0.1250
(6, 3)       0.0625
(6, 4)       0.1250
(7, 4)       0.1250
(8, 4)       0.0625
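
The flat table can be reproduced by listing all 16 equally likely outcomes and adding up the probabilities of outcomes that share the same \((x, y)\) pair. The following is a minimal sketch in base R; the names `rolls` and `flat_table` are illustrative and do not come from the text.

```r
# All 16 equally likely (first roll, second roll) outcomes
rolls = expand.grid(u1 = 1:4, u2 = 1:4)

# Sum and larger roll for each outcome
rolls$x = rolls$u1 + rolls$u2
rolls$y = pmax(rolls$u1, rolls$u2)

# Each outcome has probability 1/16
rolls$p = 1 / 16

# Add up the probabilities of outcomes sharing the same (x, y) pair
flat_table = aggregate(p ~ x + y, data = rolls, FUN = sum)

# Sort by x, then y, to match the row order of the flat table
flat_table = flat_table[order(flat_table$x, flat_table$y), ]
flat_table
```

Only pairs that can actually occur appear in the result, so it has one row per possible \((x, y)\) pair.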
Table 12.2: Two-way table representing the joint distribution of the sum (\(X\)) and larger (\(Y\)) of two rolls of a four-sided die.
\(x\) \ \(y\)     1      2      3      4
2              1/16      0      0      0
3                 0   2/16      0      0
4                 0   1/16   2/16      0
5                 0      0   2/16   2/16
6                 0      0   1/16   2/16
7                 0      0      0   2/16
8                 0      0      0   1/16
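
Parts 5 through 7 can be explored numerically. The sketch below rebuilds the two-way table from the 16 equally likely outcomes, sums rows and columns to get the marginal distributions, and then builds a second joint table with the same margins. The names `joint`, `p_x`, `p_y`, and `independent_joint` are illustrative, not from the text, and the second table is just one of many possible tables with these margins.

```r
# All 16 equally likely outcomes, with the sum and larger roll of each
rolls = expand.grid(u1 = 1:4, u2 = 1:4)
rolls$x = rolls$u1 + rolls$u2
rolls$y = pmax(rolls$u1, rolls$u2)

# Two-way table of joint probabilities (each outcome has probability 1/16)
joint = table(x = rolls$x, y = rolls$y) / 16

# Marginal distribution of X: add across each row
p_x = rowSums(joint)

# Marginal distribution of Y: add down each column
p_y = colSums(joint)

# P(X = 5) is the total of the x = 5 row
p_x["5"]

# A different joint table with the same margins: products of the marginals
independent_joint = outer(p_x, p_y)

# The margins agree ...
all.equal(rowSums(independent_joint), p_x)
all.equal(colSums(independent_joint), p_y)

# ... but the cell probabilities do not
joint["5", "3"]
independent_joint["5", "3"]
```

Both tables have the same row and column totals but different cell probabilities, which suggests that the marginal distributions alone are not enough to recover the joint distribution.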

Example 12.2

Continuing the dice rolling example, construct a spinner representing the joint distribution of \(X\) and \(Y\).
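
One way to think about such a spinner: it has one sector per possible \((x, y)\) pair, and each sector covers a fraction of the circle equal to that pair's probability. The sketch below, using the probabilities from Table 12.1, computes the sector boundaries as fractions of a full turn and simulates a few spins. The names `sectors`, `n_spins`, `landing`, and `spins` are hypothetical and chosen only for illustration.

```r
# The ten possible (x, y) pairs and their probabilities, from Table 12.1
sectors = data.frame(
  x = c(2, 3, 4, 4, 5, 5, 6, 6, 7, 8),
  y = c(1, 2, 2, 3, 3, 4, 3, 4, 4, 4),
  p = c(1, 2, 1, 2, 2, 2, 1, 2, 2, 1) / 16
)

# Sector boundaries as fractions of a full turn around the spinner
sectors$end = cumsum(sectors$p)
sectors$start = sectors$end - sectors$p

# Spin: a uniform landing point in (0, 1) falls in exactly one sector
n_spins = 10  # hypothetical number of spins
landing = runif(n_spins)
index = findInterval(landing, sectors$end) + 1

# Each spin returns a whole (x, y) pair, not x and y separately
spins = sectors[index, c("x", "y")]
spins
```

A single spin produces both values at once, which is what distinguishes a joint spinner from a pair of separate spinners for \(X\) and \(Y\).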






N_rep = 16000

# first roll 
u1 = sample(1:4, size = N_rep, replace = TRUE)

# second roll
u2 = sample(1:4, size = N_rep, replace = TRUE)

# sum
x = u1 + u2

# max
y = pmax(u1, u2)

dice_sim = data.frame(u1, u2, x, y)
Repetition   First roll   Second roll   X (sum)   Y (max)
1            1            2             3         2
2            2            4             6         4
3            1            3             4         3
4            4            2             6         4
5            4            3             7         4
6            2            1             3         2
# Joint distribution: counts
table(x, y)
   y
x      1    2    3    4
  2 1018    0    0    0
  3    0 2025    0    0
  4    0  990 1937    0
  5    0    0 2040 2005
  6    0    0  942 2056
  7    0    0    0 1944
  8    0    0    0 1043
# Joint distribution: proportions
table(x, y) / N_rep
   y
x           1         2         3         4
  2 0.0636250 0.0000000 0.0000000 0.0000000
  3 0.0000000 0.1265625 0.0000000 0.0000000
  4 0.0000000 0.0618750 0.1210625 0.0000000
  5 0.0000000 0.0000000 0.1275000 0.1253125
  6 0.0000000 0.0000000 0.0588750 0.1285000
  7 0.0000000 0.0000000 0.0000000 0.1215000
  8 0.0000000 0.0000000 0.0000000 0.0651875
# Approximate P(X = 5, Y = 3): proportion of repetitions with sum 5 and max 3
sum((x == 5) * (y == 3)) / N_rep
[1] 0.1275
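
To see how close simulated proportions come to the theoretical probabilities, the two tables can be compared cell by cell. The following self-contained sketch repeats the simulation with an arbitrary seed (an assumption made only for reproducibility, not part of the original code) and reports the largest absolute gap; `sim_joint`, `true_joint`, and `max_gap` are illustrative names.

```r
set.seed(12)  # arbitrary seed, only so the sketch is reproducible

# Simulate 16000 pairs of rolls and tabulate the proportions of (x, y) pairs
n_rep = 16000
u1 = sample(1:4, size = n_rep, replace = TRUE)
u2 = sample(1:4, size = n_rep, replace = TRUE)
sim_joint = table(x = u1 + u2, y = pmax(u1, u2)) / n_rep

# Theoretical joint distribution from the 16 equally likely outcomes
rolls = expand.grid(u1 = 1:4, u2 = 1:4)
true_joint = table(x = rolls$u1 + rolls$u2,
                   y = pmax(rolls$u1, rolls$u2)) / 16

# Largest absolute gap between a simulated proportion and its probability
max_gap = max(abs(sim_joint - true_joint))
max_gap
```

With this many repetitions the gap should be small, and impossible pairs such as \((2, 4)\) have a simulated proportion of exactly 0.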
library(tidyverse)
library(viridis)

ggplot(dice_sim |>
         # changing to factor ("categorical") helps with plotting
         mutate(x = factor(x), y = factor(y)),
       aes(x = x, y = y)) +
  
  # fill color is relative frequency
  stat_bin_2d(aes(fill = after_stat(count) / sum(after_stat(count)))) +
  
  # color scale
  scale_fill_viridis(limits = c(0, 2 / 16 + 0.01)) + 
  
  # labels
  labs(x = "X (sum)",
       y = "Y (max)",
       fill = "Relative frequency")