6  Common Discrete Distributions

6.1 Poisson Distributions

Example 6.1 Let X be the number of home runs hit (in total by both teams) in a randomly selected Major League Baseball game. Technically, there is no fixed upper bound on what X can be, so mathematically it is convenient to consider \(0, 1, 2, \ldots\) as the possible values of X. Assume that the pmf of X is

\[p_X(x) = e^{-2.3}\,\frac{2.3^{x}}{x!}, \qquad x = 0, 1, 2, \ldots\]

This distribution is called the Poisson(2.3) distribution.

  1. Compute P(X=3). Then interpret the value as a long run relative frequency, and as a relative likelihood.




  2. Construct a table, plot, and spinner corresponding to the distribution of X.




  3. Compute and interpret \(P(X \le 2)\).




  4. Compute and interpret \(P(X \le 13)\). (The most home runs ever hit in a baseball game is 13.)




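Table representing the Poisson(2.3) probability mass function.
x   p(x)   Value
0   \(e^{-2.3}\,2.3^{0}/0!\)   0.100259
1   \(e^{-2.3}\,2.3^{1}/1!\)   0.230595
2   \(e^{-2.3}\,2.3^{2}/2!\)   0.265185
3   \(e^{-2.3}\,2.3^{3}/3!\)   0.203308
4   \(e^{-2.3}\,2.3^{4}/4!\)   0.116902
5   \(e^{-2.3}\,2.3^{5}/5!\)   0.053775
6   \(e^{-2.3}\,2.3^{6}/6!\)   0.020614
7   \(e^{-2.3}\,2.3^{7}/7!\)   0.006773
8   \(e^{-2.3}\,2.3^{8}/8!\)   0.001947
9   \(e^{-2.3}\,2.3^{9}/9!\)   0.000498
10   \(e^{-2.3}\,2.3^{10}/10!\)   0.000114
11   \(e^{-2.3}\,2.3^{11}/11!\)   0.000024
12   \(e^{-2.3}\,2.3^{12}/12!\)   0.000005
13   \(e^{-2.3}\,2.3^{13}/13!\)   0.000001
14   \(e^{-2.3}\,2.3^{14}/14!\)   0.000000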
dpois(3, 2.3)
[1] 0.2033082
library(kableExtra)  # provides kbl() and kable_styling()

x = 0:14

p_x = dpois(x, 2.3)

data.frame(x, p_x) |>
  kbl(digits = 6) |>
  kable_styling(fixed_thead = TRUE)
x p_x
0 0.100259
1 0.230595
2 0.265185
3 0.203308
4 0.116902
5 0.053775
6 0.020614
7 0.006773
8 0.001947
9 0.000498
10 0.000114
11 0.000024
12 0.000005
13 0.000001
14 0.000000
sum(dpois(0:2, 2.3))
[1] 0.5960388
ppois(2, 2.3)
[1] 0.5960388
sum(dpois(0:13, 2.3))
[1] 0.9999998
ppois(13, 2.3)
[1] 0.9999998
rpois(10, 2.3)
 [1] 0 1 2 2 3 1 1 2 0 1
N_rep = 10000

x = rpois(N_rep, 2.3)

# Summarize the simulated values
plot(table(x) / N_rep,
     type = "h",
     xlab = "x",
     ylab = "Approximate P(X = x)")

  • A discrete random variable X has a Poisson distribution with parameter \(\mu > 0\) if its probability mass function \(p_X\) satisfies \[p_X(x) = e^{-\mu}\,\frac{\mu^{x}}{x!}, \qquad x = 0, 1, 2, \ldots\]
  • The function \(\mu^{x}/x!\) defines the shape of the pmf. The constant \(e^{-\mu}\) ensures that the probabilities sum to 1.
  • Poisson distributions are often used to model random variables which count the number of “relatively rare” events that occur.
  • In R:
    • rpois(N_rep, mu) to simulate values
    • dpois(x, mu) to compute the probability mass function
    • ppois(x, mu) to compute the cumulative distribution function \(P(X \le x)\).
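
As a quick check of the summary above, the following sketch computes the Poisson pmf directly from the formula and compares it with dpois. The value mu = 2.3 is taken from Example 6.1; the name p_formula and the truncation of the infinite sum at 100 terms are choices made here for illustration only.

```r
mu = 2.3
x = 0:14

# pmf computed directly from the formula e^(-mu) * mu^x / x!
p_formula = exp(-mu) * mu^x / factorial(x)

# Agrees with the built-in pmf up to floating point error
all.equal(p_formula, dpois(x, mu))

# The "shape" terms mu^x / x! add up to e^mu over x = 0, 1, 2, ...
# (truncated at 100 terms here), which is why multiplying by e^(-mu)
# makes the probabilities sum to 1
shape_sum = sum(mu^(0:100) / factorial(0:100))
shape_sum
exp(mu)
```

The last two printed values should agree to many decimal places, since the terms beyond x = 100 are negligible for this value of mu.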

6.2 Binomial Distributions

Example 6.2

Consider an extremely simplified model for the daily closing price of a certain stock. Every day the price either goes up or goes down, and the movements are independent from day-to-day. Assume that the probability that the stock price goes up on any single day is 0.25. Let X be the number of days in which the price goes up in the next 5 days.

  1. Compute and interpret P(X=0).





  2. Compute the probability that the price goes up on the first day and then down on the following four days.





  3. Why is P(X=1) different from the probability in the previous part? Compute and interpret P(X=1).




  4. Suggest a general formula for the probability mass function of X.




  • A discrete random variable X has a Binomial distribution with parameters n, a nonnegative integer, and \(p \in [0, 1]\) if its probability mass function is \[p_X(x) = \binom{n}{x} p^{x} (1-p)^{n-x}, \qquad x = 0, 1, 2, \ldots, n\]
  • In R:
    • rbinom(N_rep, n, p) to simulate values
    • dbinom(x, n, p) to compute the probability mass function
    • pbinom(x, n, p) to compute the cumulative distribution function \(P(X \le x)\).
  • The binomial coefficient (read “n choose x”) \[\binom{n}{x} = \frac{n!}{x!\,(n-x)!}\] counts the number of success/failure sequences of length n in which there are exactly x successes. (Remember: by definition \(0! = 1\).)
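
The pieces of the Binomial pmf can be checked numerically. The sketch below uses n = 5 and p = 0.25 from Example 6.2 and relies on R's built-in choose() for the binomial coefficient; the name p_formula is introduced here only for illustration.

```r
n = 5
p = 0.25
x = 0:n

# choose(n, x) counts the sequences with exactly x successes;
# each such sequence has probability p^x * (1 - p)^(n - x)
p_formula = choose(n, x) * p^x * (1 - p)^(n - x)

# Agrees with the built-in pmf, and the probabilities sum to 1
all.equal(p_formula, dbinom(x, n, p))
sum(p_formula)

# choose(n, x) matches the factorial formula n! / (x! (n - x)!)
all.equal(choose(n, x), factorial(n) / (factorial(x) * factorial(n - x)))
```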

Example 6.3

Continuing Example 6.2.

  1. Construct a table, plot, and spinner representing the distribution of X.




  2. Compute and interpret \(P(X \le 2)\).




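Table representing the Binomial(5, 0.25) probability mass function.
x   p(x)   Value
0   \(\binom{5}{0}\,0.25^{0}\,(1-0.25)^{5-0}\)   0.237305
1   \(\binom{5}{1}\,0.25^{1}\,(1-0.25)^{5-1}\)   0.395508
2   \(\binom{5}{2}\,0.25^{2}\,(1-0.25)^{5-2}\)   0.263672
3   \(\binom{5}{3}\,0.25^{3}\,(1-0.25)^{5-3}\)   0.087891
4   \(\binom{5}{4}\,0.25^{4}\,(1-0.25)^{5-4}\)   0.014648
5   \(\binom{5}{5}\,0.25^{5}\,(1-0.25)^{5-5}\)   0.000977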
dbinom(1, 5, 0.25)
[1] 0.3955078
x = 0:5

p_x = dbinom(x, 5, 0.25)

data.frame(x, p_x) |>
  kbl(digits = 6) |>
  kable_styling(fixed_thead = TRUE)
x p_x
0 0.237305
1 0.395508
2 0.263672
3 0.087891
4 0.014648
5 0.000977
sum(dbinom(0:2, 5, 0.25))
[1] 0.8964844
pbinom(2, 5, 0.25)
[1] 0.8964844
rbinom(10, 5, 0.25)
 [1] 2 1 2 2 0 2 2 1 2 1
N_rep = 10000

x = rbinom(N_rep, 5, 0.25)

# Summarize the simulated values
plot(table(x) / N_rep,
     type = "h",
     xlab = "x",
     ylab = "Approximate P(X = x)")


Example 6.4

Continuing Example 6.2.

  1. What does the random variable \(5 - X\) represent? What is its distribution?




  2. Suppose that the price is currently $100 and each day it either moves up $2 or down $2. Let S be the stock price after 5 days. How does S relate to X? Does S have a Binomial distribution?




  3. Recall that X is the number of days on which the price goes up in the next five days. Suppose that Y is the number of days on which the price goes up in the ten days after that (days 6-15). What is the distribution of X+Y? (Continue to assume independence between days, with probability 0.25 of an up movement on any day.)




  • Imagine a box containing tickets
    • Each ticket is labeled either 1 (“success”) or 0 (“failure”)
    • p is the proportion of tickets in the box labeled 1 (“success”); the rest are labeled 0 (“failure”).
    • Randomly select n tickets from the box with replacement and let X be the number of tickets in the sample that are labeled 1.
    • Then X has a Binomial(n, p) distribution.
    • Since the tickets are labeled 1 and 0, the random variable X which counts the number of successes is equal to the sum of the 1/0 values on the tickets.
  • The above situation involves a sequence of Bernoulli(p) trials.
    • There are only two possible outcomes, “success” (1) and “failure” (0), on each trial.
    • The unconditional/marginal probability of success is the same on every trial, and equal to p
    • The trials are independent
      • If sampling with replacement, or
      • If sampling without replacement but the population size (number of tickets in the box) is much larger than the sample size n
  • If X counts the number of successes in a fixed number, n, of Bernoulli(p) trials then X has a Binomial(n,p) distribution.
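
The box model above can be simulated directly. The sketch below draws 1/0 tickets with replacement, sums them to get X, and compares the simulated relative frequencies with the Binomial(n, p) pmf. It also compares the Binomial pmf with the exact probabilities for sampling without replacement from a large box, computed with dhyper. The values n = 5 and p = 0.25 follow Example 6.2; the seed, the box size of 10000 tickets, and all variable names are assumptions made here for illustration.

```r
set.seed(1)  # arbitrary seed, only so the results are reproducible

n = 5
p = 0.25
N_rep = 10000

# Each row is one repetition: n tickets drawn with replacement from a box
# in which the proportion of tickets labeled 1 is p
tickets = matrix(sample(c(1, 0), size = N_rep * n, replace = TRUE,
                        prob = c(p, 1 - p)),
                 nrow = N_rep, ncol = n)

# X is the sum of the 1/0 values, which equals the number of successes
x = rowSums(tickets)

# Simulated relative frequencies next to the Binomial(n, p) pmf
sim = as.numeric(table(factor(x, levels = 0:n))) / N_rep
round(cbind(x = 0:n, simulated = sim, binomial = dbinom(0:n, n, p)), 4)

# Sampling WITHOUT replacement from a large (hypothetical) box of 10000 tickets:
# the exact probabilities are hypergeometric, but they are close to Binomial
# because the box is much larger than the sample size n
n_ones = 2500   # tickets labeled 1, so the proportion of 1s is p = 0.25
n_zeros = 7500  # tickets labeled 0
p_without = dhyper(0:n, m = n_ones, n = n_zeros, k = n)
max(abs(p_without - dbinom(0:n, n, p)))
```

The simulated column should be close to, but not exactly equal to, the Binomial column, with differences shrinking as N_rep grows. The final value shows how small the gap is between sampling with and without replacement when the box is large relative to n.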

Example 6.5

In each of the following situations determine whether or not X has a Binomial distribution. If so, specify n and p. If not, explain why not.

  1. Roll a die 20 times; X is the number of times the die lands on an even number.




  2. Roll a die 20 times; X is the number of times the die lands on 6.




  3. Roll a die until it lands on 6; X is the total number of rolls.




  4. Roll a die 20 times; X is the sum of the numbers rolled.




  5. Shuffle a standard deck of 52 cards (13 hearts, 39 other cards) and deal 5 without replacement; X is the number of hearts dealt. (Hint: be careful about why.)




  6. Roll a fair six-sided die 10 times and a fair four-sided die 10 times; X is the number of 3s rolled (out of 20).




  7. Randomly select a sample of 35 Cal Poly students; X is the number of students in the sample who are CA residents.




Example 6.6

Donny Dont is thoroughly confused about the distinction between a random variable and its distribution. Help him understand by providing a simple concrete example of two different random variables X and Y that have the same distribution. Can you think of X and Y that have the same distribution but for which P(X=Y)=0? (Hint: think coin flipping.)