12 Joint Distributions

Example 12.1

Roll a fair four-sided die twice. Let $X$ be the sum of the two dice, and let $Y$ be the larger of the two rolls (or the common value if both rolls are the same). Recall Table 5.1.

Compute and interpret $p_{X, Y} (5, 3) = P (X = 5, Y = 3)$ .
Construct a “flat” table displaying the distribution of $(X, Y)$ pairs, with one pair in each row.
Construct a two-way table displaying the joint distribution of $X$ and $Y$ .
Sketch a plot depicting the joint distribution of $X$ and $Y$ .
Starting with the two-way table, how could you obtain $P (X = 5)$ ?
Starting with the two-way table, how could you obtain the marginal distribution of $X$ ? of $Y$ ?
Starting with the marginal distribution of $X$ and the marginal distribution of $Y$ , could you necessarily construct the two-way table of the joint distribution? Explain.

The joint distribution of random variables $X$ and $Y$ is a probability distribution on $(x, y)$ pairs, and describes how the values of $X$ and $Y$ vary together or jointly.
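Marginal distributions can be obtained from a joint distribution by summing over the values of the other variable, that is, by “stacking”/“collapsing”/“aggregating” out the other variable.
In general, marginal distributions alone are not enough to determine a joint distribution. (The exception is when the random variables are independent, in which case each joint probability is the product of the corresponding marginal probabilities.)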
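
The three points above can be checked by direct enumeration. The following is a minimal sketch, not part of the original example code; the names `outcomes`, `joint`, `marginal_x`, `marginal_y`, and `independent_joint` are introduced here for illustration. It lists the 16 equally likely pairs of rolls, builds the exact joint distribution, collapses it to the marginals, and then compares one joint probability with the product of the corresponding marginals.

```r
# Enumerate the 16 equally likely (first roll, second roll) outcomes
outcomes = expand.grid(u1 = 1:4, u2 = 1:4)
outcomes$x = outcomes$u1 + outcomes$u2        # sum
outcomes$y = pmax(outcomes$u1, outcomes$u2)   # larger roll

# Exact joint distribution: each outcome has probability 1/16
joint = table(x = outcomes$x, y = outcomes$y) / nrow(outcomes)

# Marginal distributions: sum the joint probabilities over the other variable
marginal_x = rowSums(joint)
marginal_y = colSums(joint)

# The joint table we would get if X and Y were independent
independent_joint = outer(marginal_x, marginal_y)

# Compare one cell: actual joint probability vs. product of marginals
joint["5", "3"]              # 2/16 = 0.125
independent_joint["5", "3"]  # (4/16) * (5/16) = 0.078125
```

Because the two values differ, the marginal distributions alone would not reproduce the joint table here; $X$ and $Y$ are not independent.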

Table 12.1: Flat table representing the joint distribution of the sum (X) and larger (Y) of two rolls of a four-sided die.
(x, y)	P(X = x, Y = y)
(2, 1)	0.0625
(3, 2)	0.1250
(4, 2)	0.0625
(4, 3)	0.1250
(5, 3)	0.1250
(5, 4)	0.1250
(6, 3)	0.0625
(6, 4)	0.1250
(7, 4)	0.1250
(8, 4)	0.0625
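
As a worked check of the $(5, 3)$ row, writing each outcome as (first roll, second roll): the sum is 5 and the larger roll is 3 only for the outcomes $(2, 3)$ and $(3, 2)$, and each of the 16 outcomes is equally likely, so

$$
p_{X, Y}(5, 3) = P(X = 5, Y = 3) = P(\{(2, 3), (3, 2)\}) = \frac{2}{16} = 0.125 .
$$

In words, in about 12.5% of pairs of rolls the sum is 5 and the larger roll is 3.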

Table 12.2: Two-way table representing the joint distribution of the sum ( $X$ ) and larger ( $Y$ ) of two rolls of a four-sided die.
$x$ \ $y$	1	2	3	4
2	1/16	0	0	0
3	0	2/16	0	0
4	0	1/16	2/16	0
5	0	0	2/16	2/16
6	0	0	1/16	2/16
7	0	0	0	2/16
8	0	0	0	1/16
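
Marginal probabilities come from summing across a row or down a column of the two-way table. For example, summing the $x = 5$ row and the $y = 3$ column gives

$$
P(X = 5) = \sum_{y} p_{X, Y}(5, y) = 0 + 0 + \frac{2}{16} + \frac{2}{16} = \frac{4}{16},
\qquad
P(Y = 3) = \sum_{x} p_{X, Y}(x, 3) = \frac{2}{16} + \frac{2}{16} + \frac{1}{16} = \frac{5}{16} .
$$

Repeating the row sums for every $x$ gives the marginal distribution of $X$, and repeating the column sums for every $y$ gives the marginal distribution of $Y$.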

Figure 12.1: Tile plot representing the joint distribution of the sum ( $X$ ) and larger ( $Y$ ) of two rolls of a four-sided die.

Example 12.2

Continuing the dice rolling example, construct a spinner representing the joint distribution of $X$ and $Y$ .

N_rep = 16000

# first roll 
u1 = sample(1:4, size = N_rep, replace = TRUE)

# second roll
u2 = sample(1:4, size = N_rep, replace = TRUE)

# sum
x = u1 + u2

# max
y = pmax(u1, u2)

dice_sim = data.frame(u1, u2, x, y)

Repetition	First roll	Second roll	X (sum)	Y (max)
1	1	2	3	2
2	2	4	6	4
3	1	3	4	3
4	4	2	6	4
5	4	3	7	4
6	2	1	3	2

# Joint distribution: counts
table(x, y)

   y
x      1    2    3    4
  2 1018    0    0    0
  3    0 2025    0    0
  4    0  990 1937    0
  5    0    0 2040 2005
  6    0    0  942 2056
  7    0    0    0 1944
  8    0    0    0 1043

# Joint distribution: proportions
table(x, y) / N_rep

   y
x           1         2         3         4
  2 0.0636250 0.0000000 0.0000000 0.0000000
  3 0.0000000 0.1265625 0.0000000 0.0000000
  4 0.0000000 0.0618750 0.1210625 0.0000000
  5 0.0000000 0.0000000 0.1275000 0.1253125
  6 0.0000000 0.0000000 0.0588750 0.1285000
  7 0.0000000 0.0000000 0.0000000 0.1215000
  8 0.0000000 0.0000000 0.0000000 0.0651875

# Approximate P(X = 5, Y = 3): proportion of repetitions with x == 5 and y == 3
sum((x == 5) * (y == 3)) / N_rep

[1] 0.1275
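
The simulated joint table can be collapsed in the same way to approximate the marginal distributions. The sketch below repeats the simulation so that it runs on its own; the seed value and the names `joint_sim`, `marginal_x_sim`, and `marginal_y_sim` are assumptions introduced here, so the numbers will not match the output shown above exactly.

```r
set.seed(12345)  # arbitrary seed, assumed here only for reproducibility

N_rep = 16000
u1 = sample(1:4, size = N_rep, replace = TRUE)  # first roll
u2 = sample(1:4, size = N_rep, replace = TRUE)  # second roll
x = u1 + u2        # sum
y = pmax(u1, u2)   # max

# Simulated joint distribution: relative frequencies of (x, y) pairs
joint_sim = table(x, y) / N_rep

# Approximate marginal distributions: sum over the other variable
marginal_x_sim = rowSums(joint_sim)
marginal_y_sim = colSums(joint_sim)

marginal_x_sim
marginal_y_sim

# The same marginal for X, computed from the simulated x values alone
all.equal(as.numeric(marginal_x_sim), as.numeric(table(x) / N_rep))
```

The simulated marginal relative frequencies should be close to the exact row and column sums of the two-way table, for example about 4/16 for $X = 5$.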

library(tidyverse)
library(viridis)

ggplot(dice_sim |>
         # changing to factor ("categorical") helps with plotting
         mutate(x = factor(x), y = factor(y)),
       aes(x = x, y = y)) +
  
  # fill color is relative frequency
  stat_bin_2d(aes(fill = after_stat(count) / sum(after_stat(count)))) +
  
  # color scale
  scale_fill_viridis(limits = c(0, 2 / 16 + 0.01)) + 
  
  # labels
  labs(x = "X (sum)",
       y = "Y (max)",
       fill = "Relative frequency")

Tile plot of simulated pairs of the sum ( $X$ ) and larger ( $Y$ ) of two rolls of a fair four-sided die.
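
Example 12.2 asks for a spinner representing the joint distribution. One way to mimic such a spinner in code, sketched here under the assumption that the spinner has one sector per $(x, y)$ pair from Table 12.1 with area equal to that pair's probability, is to sample rows of the flat table with those probabilities. The names `xy_pairs`, `N_spins`, `spin`, and `spinner_sim` and the seed value are hypothetical choices made for this sketch.

```r
# The 10 possible (x, y) pairs and their probabilities, from Table 12.1
xy_pairs = data.frame(
  x = c(2, 3, 4, 4, 5, 5, 6, 6, 7, 8),
  y = c(1, 2, 2, 3, 3, 4, 3, 4, 4, 4),
  p = c(1, 2, 1, 2, 2, 2, 1, 2, 2, 1) / 16
)

set.seed(2024)  # arbitrary seed, assumed here only for reproducibility
N_spins = 16000

# Each "spin" lands on one row of xy_pairs, with chance equal to its sector area
spin = sample(nrow(xy_pairs), size = N_spins, replace = TRUE, prob = xy_pairs$p)
spinner_sim = xy_pairs[spin, c("x", "y")]

# Relative frequencies of the simulated pairs
table(spinner_sim$x, spinner_sim$y) / N_spins
```

Unlike simulating the two rolls, a single spin generates the $(X, Y)$ pair directly, so the resulting relative frequencies should be close to the joint probabilities in Table 12.1.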