13  Equally Likely Outcomes, Counting Rules, and Uniform Probability Measures

13.1 Equally likely outcomes

  • For a sample space \(\Omega\) with finitely many possible outcomes, assuming equally likely outcomes corresponds to a probability measure \(\text{P}\) which satisfies \[ \text{P}(A) = \frac{|A|}{|\Omega|} = \frac{\text{number of outcomes in $A$}}{\text{number of outcomes in $\Omega$}} \qquad \text{when outcomes are equally likely} \]
  • Computing probabilities in the equally likely case reduces to just counting outcomes.
  • But remember: even if the sample space outcomes are equally likely, the possible values of related random variables are usually not.
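Here is a minimal Python sketch of the idea, using two fair dice as an assumed illustration (this setup is not one of the examples below): the 36 ordered pairs are equally likely, so probabilities reduce to counting, but the values of the sum are not equally likely.

    # Sketch (assumed two-dice illustration): probability as counting.
    from itertools import product
    from collections import Counter

    omega = list(product(range(1, 7), repeat=2))    # 36 equally likely outcomes
    A = [w for w in omega if w[0] + w[1] == 7]      # event: the sum is 7
    print(len(A) / len(omega))                      # P(sum = 7) = 6/36

    # The induced distribution of the sum is NOT uniform on 2, ..., 12.
    sums = Counter(w[0] + w[1] for w in omega)
    print({s: c / len(omega) for s, c in sorted(sums.items())})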

13.2 Some counting rules

Example 13.1 Suppose that to send an internet packet from the east coast of the US to the west coast, a packet must go through a major east-coast city (Boston, New York, Washington DC, or Atlanta), then a major midwest city (Chicago, St. Louis, or New Orleans), and then a major west-coast city (San Francisco or Los Angeles). How many possible routes are there?






  • Multiplication principle for counting. Suppose that stage 1 of a process can be completed in any one of \(n_1\) ways. Further, suppose that for each way of completing stage 1, stage 2 can be completed in any one of \(n_2\) ways. Then the two-stage process can be completed in any one of \(n_1\times n_2\) ways. (See the sketch after this list.)
  • This rule extends naturally to an \(\ell\)-stage process, which can be completed in any one of \(n_1\times n_2\times n_3\times\cdots\times n_\ell\) ways.
  • In the multiplication principle it is not important whether there is a “first” or “second” stage. What is important is that there are distinct stages, each with its own number of “choices”.
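As a sketch of the multiplication principle in Python, itertools.product enumerates the routes of Example 13.1 stage by stage (this is just one way to organize the count):

    # Sketch: enumerate the routes in Example 13.1.
    from itertools import product

    east = ["Boston", "New York", "Washington DC", "Atlanta"]
    midwest = ["Chicago", "St. Louis", "New Orleans"]
    west = ["San Francisco", "Los Angeles"]

    routes = list(product(east, midwest, west))      # one route per (east, midwest, west) triple
    print(len(routes))                               # 4 * 3 * 2 = 24
    print(len(east) * len(midwest) * len(west))      # multiplication principle gives the same count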

Example 13.2 Suppose the board of directors of a corporation has identified 5 candidates — Ariana, Beyonce, Cardi, Drake, Elvis — for three executive positions: chief executive officer (CEO), chief financial officer (CFO), and chief operating officer (COO). In the interest of fairness, the board assigns 3 of the 5 candidates to the positions completely at random. No individual can hold more than one of the positions.

When calculating probabilities below, consider the sample space of all possible executive teams.

  1. How many executive teams are possible?



  2. What is the probability that Ariana is CEO, Beyonce is CFO, and Cardi is COO?




  3. What is the probability that Ariana is CEO and Beyonce is CFO?



  4. What is the probability that Ariana is CEO?



  • Number of “ordered” arrangements. The number of “ordered” arrangements of \(k\) items, selected without replacement from a set of \(n\) distinct items, is \[ n(n-1)(n-2)\cdots(n-k+1) = \frac{n!}{(n-k)!} \]
  • Recall the factorial notation: \(m!=m(m-1)(m-2)\cdots (3)(2)(1)\). For example, \(5!=5\times4\times3\times2\times1=120\). By definition, \(0!=1\).
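A quick Python check of the \(n!/(n-k)!\) formula in the executive-team setting of Example 13.2 (\(n=5\) candidates, \(k=3\) positions); math.perm is assumed to be available (Python 3.8+).

    # Sketch: count "ordered" arrangements of k = 3 of n = 5 candidates.
    from itertools import permutations
    from math import factorial, perm

    candidates = ["Ariana", "Beyonce", "Cardi", "Drake", "Elvis"]
    teams = list(permutations(candidates, 3))    # each tuple is a (CEO, CFO, COO) assignment
    print(len(teams))                            # 5 * 4 * 3 = 60
    print(factorial(5) // factorial(5 - 3))      # n! / (n - k)! = 60
    print(perm(5, 3))                            # math.perm(n, k) gives the same count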

Example 13.3 Your boss is forming a committee of 3 people for a new project team, and 5 people — Ariana, Beyonce, Cardi, Drake, Elvis — have volunteered to be on the committee. In the interest of fairness, 3 of the 5 people will be selected uniformly at random to form the committee.

  1. How is this situation different from the executive team example?



  2. How many possible committees consist of Ariana, Beyonce, Cardi? How many executive teams consisted of Ariana, Beyonce, Cardi?



  3. How many different possible committees of 3 people can be formed from the 5 volunteers?



  • The following is the relationship between “ordered” and “unordered” counting. \[\begin{align*} & \quad \left(\text{number of ``ordered'' selections of $k$ from $n$}\right)\\ = &\quad \left(\text{number of ``unordered'' selections of $k$ from $n$}\right)\\ &\qquad \times\left(\text{number of ways of arranging the $k$ items in order}\right). \end{align*}\]
  • “Ordered” and “unordered” are somewhat misnomers. It is not important whether there is a “first”, “second”, “third”, etc. What is important is that there are distinct stages, each with its own number of “choices”.
    • In Example 13.2, it doesn’t matter if we pick the CEO first and the CFO second; what does matter is that choosing the CEO is a distinct stage from choosing the CFO.
  • Number of permutations. The number of ways of arranging \(k\) items in order is \[ k\times (k-1)\times (k-2)\times\cdots\times 3\times 2\times1 = k! \]
  • Number of combinations. The number of ways to choose \(k\) items without replacement from a group of \(n\) distinct items where “order” does not matter, denoted \(\binom{n}{k}\), is \[ \binom{n}{k} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k!} = \frac{n!}{k!(n-k)!} \]
  • The quantity on the right is just a compact way of representing the quantity in the middle. But since factorials can be very large, it’s best to use the quantity in the middle to compute. In R: choose(n, k). In Python: scipy.special.comb(n, k). (See the sketch after this list.)
  • The symbol \(\binom{n}{k}\) is by definition equal to the quantity in the middle above. It is read as “\(n\) choose \(k\)” and is referred to as a binomial coefficient.
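A short Python sketch verifying the ordered/unordered relationship for the committee setting of Example 13.3 (\(n=5\), \(k=3\)); math.comb plays the same role as R’s choose(n, k) and scipy.special.comb(n, k).

    # Sketch: combinations vs. permutations for n = 5, k = 3.
    from itertools import combinations, permutations
    from math import comb, factorial

    people = ["Ariana", "Beyonce", "Cardi", "Drake", "Elvis"]
    committees = list(combinations(people, 3))   # "unordered" selections
    ordered = list(permutations(people, 3))      # "ordered" selections

    print(len(committees))                       # 5 choose 3 = 10
    print(comb(5, 3))                            # same value via math.comb
    print(len(ordered) == len(committees) * factorial(3))   # 60 = 10 * 3!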

Example 13.4 Continuing Example 13.3. Your boss is forming a committee of 3 people for a new project team, and 5 people — Ariana, Beyonce, Cardi, Drake, Elvis — have volunteered to be on the committee. In the interest of fairness, 3 of the 5 people will be selected uniformly at random to form the committee.

  1. Find the probability that the committee consists of Ariana, Beyonce, and Cardi.



  2. Find the probability that Ariana and Beyonce are on the committee.



  3. Find the probability that Ariana is on the committee.



13.3 Hypergeometric distributions

Example 13.5 Capture-recapture sampling is a technique often used to estimate the size of a population. Suppose you want to estimate \(N\), the number of monarch butterflies in Pismo Beach. (Assume that \(N\) is a fixed but unknown number; the population size doesn’t change over time.) You first capture a sample of \(N_1\) butterflies, selected randomly without replacement, and tag them and release them. At a later date, you then capture a second sample of \(n\) butterflies, selected randomly without replacement. Let \(X\) be the number of butterflies in the second sample that have tags (because they were also caught in the first sample). (Assume that the tagging has no effect on behavior, so that selection in the first sample is independent of selection in the second sample.)

In practice, \(N\) is unknown. But let’s start with a simpler, but unrealistic, example where there are \(N=52\) butterflies, \(N_1 = 13\) are tagged in the first sample, and \(n=5\) is the size of the second sample.
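Here is a sketch of one possible simulation approach for this small version (NumPy assumed): repeatedly draw the second sample of \(n=5\) without replacement from a population of 13 tagged and 39 untagged butterflies and tally \(X\).

    # Sketch: approximate the distribution of X by simulation (N = 52, N1 = 13, n = 5).
    import numpy as np

    rng = np.random.default_rng()
    population = np.array([1] * 13 + [0] * 39)    # 1 = tagged, 0 = untagged

    reps = 100_000
    x_values = np.array([rng.choice(population, size=5, replace=False).sum()
                         for _ in range(reps)])

    values, counts = np.unique(x_values, return_counts=True)
    print(dict(zip(values.tolist(), (counts / reps).round(4))))   # approximate distribution of X
    print(x_values.mean())                                        # approximate long run average value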

  1. What are the possible values of \(X\)?




  2. Describe in detail how you could use simulation to approximate the distribution of \(X\).




  3. Find \(\text{P}(X = 0)\) in two ways.




  4. Find the probability that in the second sample the first butterfly selected is tagged but the rest are not.




  5. Find the probability that in the second sample the first four butterflies selected are not tagged but the fifth is.




  6. Find \(\text{P}(X = 1)\) in two ways.




  7. Find \(\text{P}(X = 2)\) in two ways.




  8. Suggest a formula for determining the distribution of \(X\).




  9. Suggest a simple shortcut formula for the long run average value of \(X\).




  10. Now suppose that \(N\) is unknown and that there are \(N_1 = 1000\) tagged butterflies (from the first sample). In a later sample of \(n=100\) butterflies, \(x = 5\) are tagged. What is an intuitive estimate of \(N\) based on this sample?




  • A discrete random variable \(X\) has a Hypergeometric distribution with parameters \(n, N_0, N_1\), all nonnegative integers — with \(N = N_0+N_1\) and \(p=N_1/N\) — if its distribution satisfies1 \[\begin{align*} \text{P}(X=x) & = \frac{\binom{N_1}{x}\binom{N_0}{n-x}}{\binom{N_0+N_1}{n}},\quad x = 0, 1, \ldots, n; x \le N_1; x \ge n - N_0 \\ \end{align*}\]
  • If \(X\) has a Hypergeometric(\(n\), \(N_0\), \(N_1\)) distribution, then \[\begin{align*} \text{Long run average value of } X & = np\\ \text{Variance of } X & = np(1-p)\left(\frac{N-n}{N-1}\right) \end{align*}\]
  • Imagine a box containing \(N=N_1+N_0\) tickets
    • \(N_1\) of which are labeled 1 (“success”) and \(N_0\) of which are labeled 0 (“failure”).
    • Randomly select \(n\) tickets from the box without replacement and let \(X\) be the number of tickets in the sample that are labeled 1.
    • Then \(X\) has a Hypergeometric(\(n\), \(N_0\), \(N_1\)) distribution.
    • Since the tickets are labeled 1 and 0, the random variable \(X\) which counts the number of successes is equal to the sum of the 1/0 values on the tickets.
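A sketch that computes the Hypergeometric distribution for the small version of Example 13.5 exactly. scipy.stats.hypergeom is an assumption here (the text above only names scipy.special.comb), and note that scipy orders its parameters as (population size, number of successes, sample size).

    # Sketch: exact Hypergeometric probabilities and moments for N = 52, N1 = 13, n = 5.
    import numpy as np
    from scipy.stats import hypergeom

    N, N1, n = 52, 13, 5
    N0, p = N - N1, N1 / N
    dist = hypergeom(N, N1, n)      # scipy order: (population size, number tagged, sample size)

    x = np.arange(0, n + 1)
    print(dict(zip(x.tolist(), dist.pmf(x).round(4))))          # P(X = x) for x = 0, ..., 5

    print(dist.mean(), n * p)                                   # both equal n p = 1.25
    print(dist.var(), n * p * (1 - p) * (N - n) / (N - 1))      # variance with finite population correction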

13.4 Continuous Uniform probability measures

  • The continuous analog of equally likely outcomes is a uniform probability measure.
  • When the sample space is uncountable, size is measured continuously (length, area, volume) rather than discretely (counting).

\[ \text{P}(A) = \frac{|A|}{|\Omega|} = \frac{\text{size of } A}{\text{size of } \Omega} \qquad \text{if $\text{P}$ is a uniform probability measure} \]

Example 13.6 A circuit board is covered by a grid of squares, each square having sides 0.75mm in length. (Assume that the lines forming the sides of the squares have negligible width.) A single drop of solder lands, uniformly at random, on the circuit board. The diameter of the drop is 0.50mm. What is the probability that the drop of solder lands entirely within a single square? Hint: focus on whichever square the center of the drop lands in and think of where the center needs to be so that the entire drop falls inside the square.
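Following the hint, the drop lies entirely within a square exactly when its center lands at least 0.25mm (the drop’s radius) from every side of the 0.75mm square, and the center is uniformly distributed over that square. A Monte Carlo sketch of this reasoning, with NumPy assumed:

    # Sketch: simulate the center of the drop uniformly inside one 0.75mm square.
    import numpy as np

    rng = np.random.default_rng()
    side, radius = 0.75, 0.25       # square side length; drop radius = 0.50 / 2

    reps = 1_000_000
    centers = rng.uniform(0, side, size=(reps, 2))
    inside = np.all((centers >= radius) & (centers <= side - radius), axis=1)

    print(inside.mean())                          # simulated probability
    print((side - 2 * radius) ** 2 / side ** 2)   # ratio of areas under the uniform measure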







  1. We must have \(X\le N_1\) since there can’t be more successes in the sample than there are in the population. Similarly, we must have \(X\ge n - N_0\) (that is, \(n-X\le N_0\)) since there can’t be more failures in the sample than there are in the population. Often the population sizes \(N_0\) and \(N_1\) are large relative to the sample size \(n\), in which case \(X\) simply takes values \(0, 1,\ldots, n\).