6 Common Discrete Distributions

6.1 Poisson Distributions

Example 6.1 Let $X$ be the number of home runs hit (in total by both teams) in a randomly selected Major League Baseball game. Technically, there is no fixed upper bound on what $X$ can be, so mathematically it is convenient to consider $0, 1, 2, \dots$ as the possible values of $X$ . Assume that the pmf of $X$ is $p_{X} (x) = e^{- 2.3} \frac{{2.3}^{x}}{x!}, x = 0, 1, 2, \dots$ This distribution is called the Poisson(2.3) distribution.

Compute $P (X = 3)$ . Then interpret the value both as a long-run relative frequency and as a relative likelihood.
Construct a table, plot, and spinner corresponding to the distribution of $X$ .
Compute and interpret $P (X \leq 2)$ .
Compute and interpret $P (X \leq 13)$ . (The most home runs ever hit in a baseball game is 13.)

Relative frequencies of home runs per game in the 2018 MLB season, compared with the Poisson(2.3) distribution.

Table representing the Poisson(2.3) probability mass function.
$x$	$p (x)$	Value
0	$e^{- 2.3} \frac{{2.3}^{0}}{0!}$	0.100259
1	$e^{- 2.3} \frac{{2.3}^{1}}{1!}$	0.230595
2	$e^{- 2.3} \frac{{2.3}^{2}}{2!}$	0.265185
3	$e^{- 2.3} \frac{{2.3}^{3}}{3!}$	0.203308
4	$e^{- 2.3} \frac{{2.3}^{4}}{4!}$	0.116902
5	$e^{- 2.3} \frac{{2.3}^{5}}{5!}$	0.053775
6	$e^{- 2.3} \frac{{2.3}^{6}}{6!}$	0.020614
7	$e^{- 2.3} \frac{{2.3}^{7}}{7!}$	0.006773
8	$e^{- 2.3} \frac{{2.3}^{8}}{8!}$	0.001947
9	$e^{- 2.3} \frac{{2.3}^{9}}{9!}$	0.000498
10	$e^{- 2.3} \frac{{2.3}^{10}}{10!}$	0.000114
11	$e^{- 2.3} \frac{{2.3}^{11}}{11!}$	0.000024
12	$e^{- 2.3} \frac{{2.3}^{12}}{12!}$	0.000005
13	$e^{- 2.3} \frac{{2.3}^{13}}{13!}$	0.000001
14	$e^{- 2.3} \frac{{2.3}^{14}}{14!}$	0.000000

dpois(3, 2.3)

[1] 0.2033082
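
The value above can be checked against the pmf formula and read in two ways. The sketch below evaluates $e^{-2.3} \, 2.3^{3} / 3!$ directly, approximates the long-run relative frequency of games with exactly 3 home runs by simulation, and computes a ratio of probabilities as a relative likelihood. The seed, the number of repetitions, and the variable names are arbitrary choices for illustration, and comparing 3 home runs with 5 is just one possible comparison.

```r
mu = 2.3

# Evaluate the pmf formula by hand and compare with dpois
p3_formula = exp(-mu) * mu^3 / factorial(3)
p3_builtin = dpois(3, mu)
c(p3_formula, p3_builtin)

# Long-run relative frequency: proportion of simulated games with X = 3
set.seed(1)        # arbitrary seed, only for reproducibility
N_rep = 100000     # arbitrary number of simulated games
x_sim = rpois(N_rep, mu)
rel_freq = mean(x_sim == 3)
rel_freq

# Relative likelihood: how many times more likely is X = 3 than X = 5?
ratio_3_vs_5 = dpois(3, mu) / dpois(5, mu)
ratio_3_vs_5
```

Under this model, about 20% of games in the long run have exactly 3 home runs, and a game with 3 home runs is roughly 3.8 times as likely as a game with 5.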

x = 0:14

p_x = dpois(x, 2.3)

data.frame(x, p_x) |>
  kbl(digits = 6) |>
  kable_styling(fixed_thead = TRUE)

x	p_x
0	0.100259
1	0.230595
2	0.265185
3	0.203308
4	0.116902
5	0.053775
6	0.020614
7	0.006773
8	0.001947
9	0.000498
10	0.000114
11	0.000024
12	0.000005
13	0.000001
14	0.000000

sum(dpois(0:2, 2.3))

[1] 0.5960388

ppois(2, 2.3)

[1] 0.5960388

sum(dpois(0:13, 2.3))

[1] 0.9999998

ppois(13, 2.3)

[1] 0.9999998

rpois(10, 2.3)

 [1] 2 4 2 1 0 3 2 6 1 2

N_rep = 10000

x = rpois(N_rep, 2.3)

# Summarize the simulated values
plot(table(x) / N_rep,
     type = "h",
     xlab = "x",
     ylab = "Approximate P(X = x)")

Figure 6.1: Simulated distribution of $X$ with a Poisson(2.3) distribution.

A discrete random variable $X$ has a Poisson distribution with parameter $μ > 0$ if its probability mass function $p_{X}$ satisfies $p_{X} (x) = \frac{e^{- μ} μ^{x}}{x!}, x = 0, 1, 2, \dots$
The function $μ^{x} / x!$ defines the shape of the pmf. The constant $e^{- μ}$ ensures that the probabilities sum to 1.
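
The role of the constant can be checked numerically. The sketch below sums the shape function $μ^{x} / x!$ over $x = 0, 1, \dots, 100$ and compares the total with $e^{μ}$. The cutoff at 100 is an assumption made for illustration: the true sum is infinite, but for $μ = 2.3$ the terms beyond 100 are negligible.

```r
mu = 2.3
x = 0:100   # truncation point chosen for illustration; the full sum is infinite

# Shape of the pmf, without the normalizing constant
shape = mu^x / factorial(x)

# The shape values sum to (approximately) exp(mu) ...
shape_total = sum(shape)
c(shape_total, exp(mu))

# ... so multiplying by exp(-mu) rescales the total to 1
pmf_total = sum(exp(-mu) * shape)
pmf_total
```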
Poisson distributions are often used to model random variables that count the number of “relatively rare” events that occur.
In R:
- rpois(N_rep, mu) to simulate values
- dpois(x, mu) to compute the probability mass function
- ppois(x, mu) to compute the cumulative distribution function $P (X \leq x)$ .
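
As a small illustration of how these functions fit together, the sketch below rebuilds the cumulative distribution function from the pmf with `cumsum` and computes an upper-tail probability $P (X > 2)$ in two equivalent ways. The value $μ = 2.3$ is reused from Example 6.1, and the variable names are illustrative.

```r
mu = 2.3
x = 0:14

# The cdf is the running sum of the pmf
cdf_from_pmf = cumsum(dpois(x, mu))
cdf_builtin = ppois(x, mu)
all.equal(cdf_from_pmf, cdf_builtin)

# P(X > 2) by the complement rule, and via the lower.tail argument of ppois
p_gt_2 = 1 - ppois(2, mu)
p_gt_2_alt = ppois(2, mu, lower.tail = FALSE)
c(p_gt_2, p_gt_2_alt)
```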

6.2 Binomial Distributions

Example 6.2 Consider an extremely simplified model for the daily closing price of a certain stock. Every day the price either goes up or goes down, and the movements are independent from day to day. Assume that the probability that the stock price goes up on any single day is 0.25. Let $X$ be the number of days on which the price goes up in the next 5 days.

Compute and interpret $P (X = 0)$ .
Compute the probability that the price goes up on the first day and then down on the following four days.
Why is $P (X = 1)$ different from the probability in the previous part? Compute and interpret $P (X = 1)$ .
Suggest a general formula for the probability mass function of $X$ .
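
One way to work through these parts numerically is sketched below, using only independence and the probability 0.25 of an up movement. The variable names are illustrative.

```r
p = 0.25   # probability the price goes up on any single day
n = 5      # number of days

# P(X = 0): the price goes down on all 5 days
p_none = (1 - p)^n

# One particular sequence: up on day 1, then down on days 2 through 5
p_up_then_down = p * (1 - p)^4

# P(X = 1): the single up day can be any of the 5 days,
# and each such sequence has the same probability
p_one = 5 * p_up_then_down

c(p_none, p_up_then_down, p_one)
```

$P (X = 1)$ is five times the probability of the single sequence because the one up day can fall on any of the 5 days.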

A discrete random variable $X$ has a Binomial distribution with parameters $n$ , a nonnegative integer, and $p \in [0, 1]$ if its probability mass function is $p_{X} (x) = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad x = 0, 1, 2, \dots, n$
In R:
- rbinom(N_rep, n, p) to simulate values
- dbinom(x, n, p) to compute the probability mass function
- pbinom(x, n, p) to compute the cumulative distribution function $P (X \leq x)$ .
The binomial coefficient (read “ $n$ choose $x$ ”) $\binom{n}{x} = \frac{n!}{x! (n - x)!}$ counts the number of success/failure sequences of length $n$ in which there are exactly $x$ successes. (Remember: by definition $0! = 1$ .)
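
The counting interpretation can be checked by brute force for small $n$ . The sketch below lists all $2^{5} = 32$ success/failure sequences of length 5, counts how many have each number of successes, and compares the counts with `choose`. It then assembles the pmf from the formula and compares it with `dbinom`. The values $n = 5$ and $p = 0.25$ are taken from Example 6.2; the variable names are illustrative.

```r
n = 5
p = 0.25
x = 0:n

# All 2^n sequences of 0s (failure) and 1s (success), one per row
sequences = expand.grid(rep(list(0:1), n))

# Number of sequences with exactly x successes, for x = 0, ..., n
counts = as.vector(table(rowSums(sequences)))

# Compare with the binomial coefficient, computed two ways
rbind(counts,
      choose = choose(n, x),
      formula = factorial(n) / (factorial(x) * factorial(n - x)))

# Binomial pmf from the formula, compared with dbinom
pmf_formula = choose(n, x) * p^x * (1 - p)^(n - x)
all.equal(pmf_formula, dbinom(x, n, p))
```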

Example 6.3 Continuing Example 6.2.

Construct a table, plot, and spinner representing the distribution of $X$ .
Compute and interpret $P (X \leq 2)$ .

Table representing the Binomial(5, 0.25) probability mass function.
$x$	$p (x)$	Value
0	$\binom{5}{0} {0.25}^{0} (1 - 0.25)^{5 - 0}$	0.237305
1	$\binom{5}{1} {0.25}^{1} (1 - 0.25)^{5 - 1}$	0.395508
2	$\binom{5}{2} {0.25}^{2} (1 - 0.25)^{5 - 2}$	0.263672
3	$\binom{5}{3} {0.25}^{3} (1 - 0.25)^{5 - 3}$	0.087891
4	$\binom{5}{4} {0.25}^{4} (1 - 0.25)^{5 - 4}$	0.014648
5	$\binom{5}{5} {0.25}^{5} (1 - 0.25)^{5 - 5}$	0.000977

dbinom(1, 5, 0.25)

[1] 0.3955078

x = 0:5

p_x = dbinom(x, 5, 0.25)

data.frame(x, p_x) |>
  kbl(digits = 6) |>
  kable_styling(fixed_thead = TRUE)

x	p_x
0	0.237305
1	0.395508
2	0.263672
3	0.087891
4	0.014648
5	0.000977

sum(dbinom(0:2, 5, 0.25))

[1] 0.8964844

pbinom(2, 5, 0.25)

[1] 0.8964844

rbinom(10, 5, 0.25)

 [1] 1 1 0 1 2 0 1 0 3 0

N_rep = 10000

x = rbinom(N_rep, 5, 0.25)

# Summarize the simulated values
plot(table(x) / N_rep,
     type = "h",
     xlab = "x",
     ylab = "Approximate P(X = x)")

Figure 6.2: Simulated distribution of $X$ with a Binomial(5, 0.25) distribution.

Example 6.4 Continuing Example 6.2.

What does the random variable $5 - X$ represent? What is its distribution?
Suppose that the price is currently $100 and each day it either moves up $2 or down $2. Let $S$ be the stock price after 5 days. How does $S$ relate to $X$ ? Does $S$ have a Binomial distribution?
Recall that $X$ is the number of days on which the price goes up in the next five days. Suppose that $Y$ is the number of days on which the price goes up in the ten days after that (days 6-15). What is the distribution of $X + Y$ ? (Continue to assume independence between days, with probability 0.25 of an up movement on any day.)
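
Simulation is one way to check conjectured answers to these parts. The sketch below simulates $X$ and $Y$ independently, builds $S$ from the $2 up and $2 down moves, and compares the simulated distribution of $X + Y$ with a candidate Binomial(15, 0.25) pmf. The seed, the number of repetitions, and the variable names are arbitrary; the Binomial(15, 0.25) candidate is a conjecture being tested, not something the simulation proves.

```r
set.seed(2)       # arbitrary seed, only for reproducibility
N_rep = 100000    # arbitrary number of repetitions

x = rbinom(N_rep, 5, 0.25)    # up days among days 1-5
y = rbinom(N_rep, 10, 0.25)   # up days among days 6-15

# Price after 5 days: start at 100, +2 for each up day, -2 for each down day
s = 100 + 2 * x - 2 * (5 - x)
sort(unique(s))   # the simulated values of S

# Simulated distribution of X + Y versus the candidate Binomial(15, 0.25) pmf
total = x + y
sim = as.vector(table(factor(total, levels = 0:15))) / N_rep
max_diff = max(abs(sim - dbinom(0:15, 15, 0.25)))
max_diff
```

Since $S = 100 + 2 X - 2 (5 - X) = 80 + 4 X$ , the possible values of $S$ are not $0, 1, \dots, n$ , which is worth keeping in mind when deciding whether $S$ itself is Binomial.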

Imagine a box containing tickets
- Each ticket is labeled either 1 (“success”) or 0 (“failure”)
- $p$ is the proportion of tickets in the box labeled 1 (“success”); the rest are labeled 0 (“failure”).
- Randomly select $n$ tickets from the box with replacement and let $X$ be the number of tickets in the sample that are labeled 1.
- Then $X$ has a Binomial( $n$ , $p$ ) distribution.
- Since the tickets are labeled 1 and 0, the random variable $X$ which counts the number of successes is equal to the sum of the 1/0 values on the tickets.
The above situation involves a sequence of Bernoulli( $p$ ) trials.
- There are only two possible outcomes, “success” (1) and “failure” (0), on each trial.
- The unconditional/marginal probability of success is the same on every trial, and equal to $p$
- The trials are independent
  - If sampling with replacement, or
  - If sampling without replacement but the population size (number of tickets in the box) is much larger than the sample size $n$
If $X$ counts the number of successes in a fixed number, $n$ , of Bernoulli( $p$ ) trials then $X$ has a Binomial( $n, p$ ) distribution.
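
The box model can be simulated directly. The sketch below uses a hypothetical box of four tickets, one labeled 1 and three labeled 0 (so $p = 0.25$ ), draws $n = 5$ tickets with replacement many times, and compares the relative frequencies of the sum with the Binomial(5, 0.25) pmf. It then looks at sampling without replacement from a much larger hypothetical box (2500 tickets labeled 1 and 7500 labeled 0), using the hypergeometric pmf `dhyper` for the exact probabilities. The box contents, seed, and number of repetitions are all assumptions made for illustration.

```r
set.seed(3)        # arbitrary seed, only for reproducibility
n_draws = 5
N_rep = 20000      # arbitrary number of repetitions

# Hypothetical box: one "success" ticket out of four, so p = 0.25
box = c(1, 0, 0, 0)

# Draw n tickets with replacement and add up the 1/0 labels
x = replicate(N_rep, sum(sample(box, size = n_draws, replace = TRUE)))
rel_freq = as.vector(table(factor(x, levels = 0:n_draws))) / N_rep
round(cbind(x = 0:n_draws,
            simulated = rel_freq,
            binomial = dbinom(0:n_draws, n_draws, 0.25)), 4)

# Without replacement from a large box, exact probabilities are hypergeometric
# (m = tickets labeled 1, n = tickets labeled 0, k = number of draws)
p_without = dhyper(0:n_draws, m = 2500, n = 7500, k = n_draws)
max_gap = max(abs(p_without - dbinom(0:n_draws, n_draws, 0.25)))
max_gap
```

With 10000 tickets and only 5 draws, the exact without-replacement probabilities differ from the Binomial values by very little, which is the sense in which the trials are approximately independent.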

Example 6.5 In each of the following situations determine whether or not $X$ has a Binomial distribution. If so, specify $n$ and $p$ . If not, explain why not.

Roll a die 20 times; $X$ is the number of times the die lands on an even number.
Roll a die 20 times; $X$ is the number of times the die lands on 6.
Roll a die until it lands on 6; $X$ is the total number of rolls.
Roll a die 20 times; $X$ is the sum of the numbers rolled.
Shuffle a standard deck of 52 cards (13 hearts, 39 other cards) and deal 5 without replacement; $X$ is the number of hearts dealt. (Hint: be careful about why.)
Roll a fair six-sided die 10 times and a fair four-sided die 10 times; $X$ is the number of 3s rolled (out of 20).
Randomly select a sample of 35 Cal Poly students; $X$ is the number of students in the sample who are CA residents.

Example 6.6 Donny Dont is thoroughly confused about the distinction between a random variable and its distribution. Help him understand by providing a simple concrete example of two different random variables $X$ and $Y$ that have the same distribution. Can you think of $X$ and $Y$ that have the same distribution but for which $P (X = Y) = 0$ ? (Hint: think coin flipping.)
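
Following the hint, one possible construction (among many) can be explored by simulation: flip a fair coin once, let $X$ be 1 if it lands on heads and 0 otherwise, and let $Y$ be 1 if it lands on tails and 0 otherwise. The sketch below simulates many flips; the seed, the number of repetitions, and the variable names are arbitrary.

```r
set.seed(4)      # arbitrary seed, only for reproducibility
N_rep = 10000    # arbitrary number of simulated flips

x = rbinom(N_rep, 1, 0.5)   # 1 if the flip lands on heads, 0 otherwise
y = 1 - x                   # 1 if the same flip lands on tails, 0 otherwise

# Both variables take the values 0 and 1 about half the time each ...
table(x) / N_rep
table(y) / N_rep

# ... yet they never take the same value on the same flip
prop_equal = mean(x == y)
prop_equal
```

The two tables describe the same distribution, while $X$ and $Y$ themselves are different functions of the outcome of the flip.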