1.2 Interpretations of probability

In the previous section we encountered a variety of scenarios which involved uncertainty, a.k.a. randomness. Just as there are a few “types” of randomness, there are a few ways of interpreting probability, namely, long run relative frequency and subjective probability.

Exercise 1.2 Revisit the scenarios in Exercise 1.1. Now consider how “probability” is interpreted in the different scenarios. In each scenario, what does “probability” mean? How might you estimate the probability? Start to make guesses for the probabilities; are they “high” or “low”? How high or low? Again, the goal is not to do any calculations but rather to think about, via these examples, similarities and differences of situations in which probabilities are of interest.

In particular, compare

  1. Scenarios 3 and 4
  2. Scenarios 5 and 6
  3. Scenarios 6 and 7
  4. How are scenarios 1 through 8 (collectively) different from scenarios 9 through 13 (collectively)?
  • The probability of an event is a number in the interval \([0, 1]\) measuring the event’s likelihood or degree of uncertainty.
  • A probability can take any value on the continuous scale from 0% to 100% [3]. In particular, a probability requires much more interpretation than “is the probability greater than, less than, or equal to 50%?”
  • When interpreting probabilities, be careful not to confuse “the particular” with “the general”.
    • (“The particular.”) A very specific event, surprising or not, often has low probability.
      • Even though in 10,000 flips of a fair coin we would expect to see about 5000 heads, the probability that exactly 5000 out of 10,000 flips are heads is fairly small (about 0.008 [4]).
      • The probability that the winning powerball number is 6-7-16-23-26-(4) is exactly the same as the probability that the winning powerball number is 1-2-3-4-5-(6). Each of these sequences is just one of the roughly 300 million possible sequences, and each sequence has about a 1 in 300 million chance of being the winning number. However, many people think 6-7-16-23-26-(4) is more likely because 1-2-3-4-5-(6) “doesn’t look random”.
      • The probability that you get a text from your best friend at 7:43pm on Oct 12, 2019 inviting you to dinner after you’ve just ordered pizza from your favorite pizza place is probably pretty small. None of these items — getting a text, having a friend invite you to dinner, ordering pizza from your favorite pizza place — is unusual, but the chances of them all combining in this way at this particular time are fairly small.
    • (“The general.”) However, if there are many like events, their combined probability can be high.
      • The probability that around 5000 out of 10,000 coin flips land on heads is fairly large. For example, if “around” is interpreted as between 4900 and 5100 (for a proportion of heads between 0.49 and 0.51), the probability is about 0.956.
      • The probability that the winning powerball number is an ordered sequence, like 1-2-3-4-5-(6), is extremely small [5]. However, the probability that the winning number is not an ordered sequence, like 6-7-16-23-26-(4), is extremely high. When interpreting probabilities, be careful not to confuse an event like “the winning number is 6-7-16-23-26-(4)” (the particular, low probability) with an event like “the winning number is not an ordered sequence” (the general, high probability).
      • The probability that some time in the next month or so a friend invites you for dinner after you’ve already had dinner on your own is probably fairly high.
  • Even if an event has extremely small probability, given enough repetitions of the random phenomenon, the probability that the event occurs on at least one of the repetitions is high.
    • The probability that a specific powerball ticket is the winning number is about 1 in 300 million. So if you buy a single ticket, it is extremely unlikely that you will win.
    • However, if hundreds of millions of powerball tickets are sold, the probability that someone somewhere wins is pretty high. For example, if 500 million tickets are sold then there is a roughly 80% chance that at least one ticket has the winning number (under certain assumptions).
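
The probabilities quoted in the bullets above can be checked with a short computation. The following is a minimal sketch rather than the method used in the text: it assumes independent fair flips, assumes every ticket is an independent, uniformly random pick, and assumes Powerball draws 5 white balls from 69 plus one powerball from 26 (the white-ball count is an assumption not stated in the text). The function and variable names are illustrative.

```python
import math


def prob_heads_between(n_flips, low, high):
    """Exact probability that n_flips fair flips give between low and high heads (inclusive)."""
    favorable = sum(math.comb(n_flips, k) for k in range(low, high + 1))
    # Python integers have arbitrary precision, so the huge numerator and
    # denominator do not overflow; only the final ratio is rounded to a float.
    return favorable / 2 ** n_flips


def prob_at_least_one_winner(n_tickets, n_combinations):
    """P(at least one winning ticket), assuming independent, uniformly random tickets."""
    # 1 - (1 - 1/N)^n, written with log1p/expm1 for numerical accuracy.
    return -math.expm1(n_tickets * math.log1p(-1.0 / n_combinations))


# Assumed game format: 5 white balls out of 69, then 1 powerball out of 26.
POWERBALL_COMBINATIONS = math.comb(69, 5) * 26  # 292,201,338

p_exact = prob_heads_between(10_000, 5_000, 5_000)    # the particular: about 0.008
p_around = prob_heads_between(10_000, 4_900, 5_100)   # the general: about 0.956
p_winner = prob_at_least_one_winner(500_000_000, POWERBALL_COMBINATIONS)  # about 0.82

print(f"exactly 5000 heads:        {p_exact:.4f}")
print(f"between 4900 and 5100:     {p_around:.4f}")
print(f"someone wins (500M sold):  {p_winner:.3f}")
```

Real ticket purchases are neither independent nor uniformly random, which is one reason the text qualifies the 80% figure with “under certain assumptions”.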

1.2.1 Relative frequency

The probability that a single flip of a fair coin lands on heads is 0.5. How do we interpret this 0.5? The notion of “fairness” implies that the two outcomes, heads and tails, should be equally likely, so we have a “50/50 chance”. But how else can we interpret this 50%? One way is by considering what would happen if we flipped the coin many times. Now, if we flipped the coin twice, we wouldn’t necessarily expect to see one head and one tail. And we already mentioned that if we flipped the coin 10,000 times, the chance of seeing exactly 5000 heads is small. But in many flips, we might expect to see heads on something close to 50% of flips.

Consider Figure 1.1 below. Each dot represents a set of 10,000 fair coin flips. There are 100 dots displayed, representing 100 different sets of 10,000 coin flips each. For each set of flips, the proportion of the 10,000 flips which landed on heads is recorded. For example, if in one set 4973 out of 10,000 flips landed on heads, the proportion of heads is 0.4973. The plot displays 100 such proportions. We see that only 5 of these 100 proportions are less than 0.49 or greater than 0.51. So if between 0.49 and 0.51 is considered “close to 0.5”, then yes, in 10,000 coin flips we would expect the proportion of heads to be close to 0.5. (In 10,000 flips, the probability of heads on between 49% and 51% of flips is 0.956, so 95 out of 100 provides a rough estimate of this probability.)

Figure 1.1: Proportion of flips which are heads in 100 sets of 10,000 fair coin flips. Each dot represents a set of 10,000 fair coin flips.
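
An experiment in the spirit of Figure 1.1 can be sketched in a few lines. This is not the code that produced the figure; the seed, the function name, and the use of Python's standard `random` module are assumptions made for illustration, and the exact count printed will vary with the seed.

```python
import random


def proportions_of_heads(n_sets, n_flips, seed=0):
    """Simulate n_sets sets of n_flips fair coin flips; return the proportion of heads in each set."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_sets):
        # Each of the n_flips random bits plays the role of one fair flip
        # (1 = heads, 0 = tails); counting the 1 bits counts the heads.
        flips = rng.getrandbits(n_flips)
        n_heads = bin(flips).count("1")
        proportions.append(n_heads / n_flips)
    return proportions


proportions = proportions_of_heads(n_sets=100, n_flips=10_000)
n_close = sum(0.49 <= p <= 0.51 for p in proportions)
print(f"{n_close} of {len(proportions)} proportions are between 0.49 and 0.51")
```

Each run with a different seed gives a different set of 100 dots, but the count of proportions within 0.01 of 0.5 should typically land in the mid-90s.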

But what if we want to be stricter about what qualifies as “close to 0.5”? You might suspect that with even more flips the proportion of heads would be even closer to 50%. Indeed, this is the case. Figure 1.2 displays the results of 100 sets of 1,000,000 fair coin flips. The pattern seems similar to Figure 1.1, but pay close attention to the horizontal axis, which covers a much shorter range of values than in the previous figure. Now 96 of the 100 proportions are between 0.499 and 0.501. So in 1,000,000 flips we would expect the proportion of heads to be between 0.499 and 0.501, pretty close to 0.5. (In 1,000,000 flips, the probability of heads on between 49.9% and 50.1% of flips is 0.955, and 96 out of 100 sets provides a rough estimate of this probability.)

Figure 1.2: Proportion of flips which are heads in 100 sets of 1,000,000 fair coin flips. Each dot represents a set of 1,000,000 fair coin flips.

In Figure 1.3 each dot represents a set of 100 million flips. The pattern seems similar to the previous figures, but again pay close attention to the horizontal axis, which covers a smaller range of values. Now 96 of the 100 proportions are between 0.4999 and 0.5001. (In 100 million flips, the probability of heads on between 49.99% and 50.01% of flips is about 0.955, so 96 out of 100 sets provides a rough estimate of this probability.)

Figure 1.3: Proportion of flips which are heads in 100 sets of 100,000,000 fair coin flips. Each dot represents a set of 100,000,000 fair coin flips.

The previous figures illustrate that the more flips there are, the more likely it is that we observe a proportion of flips landing on heads close to 0.5. We also see that with more flips we can refine our definition of “close to 0.5”: increasing the number of flips by a factor of 100 (10,000 to 1,000,000 to 100,000,000) seems to give us an additional decimal place of precision (\(0.5\pm 0.01\) to \(0.5\pm 0.001\) to \(0.5\pm 0.0001\)). These observations illustrate the relative frequency interpretation of probability.
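
One way to see why a hundredfold increase in flips buys one more decimal place is sketched here, ahead of the formal treatment in later chapters. Assuming independent fair flips, and writing \(\hat{p}_n\) for the proportion of heads in \(n\) flips (notation introduced here only for illustration), the standard deviation of the proportion is

\[
\mathrm{SD}(\hat{p}_n) = \sqrt{\frac{0.5\,(1-0.5)}{n}} = \frac{0.5}{\sqrt{n}},
\qquad\text{so}\qquad
\mathrm{SD}(\hat{p}_{100n}) = \frac{0.5}{\sqrt{100\,n}} = \frac{1}{10}\,\mathrm{SD}(\hat{p}_n).
\]

For \(n = 10{,}000\) this gives a standard deviation of 0.005, so \(0.5\pm 0.01\) extends two standard deviations on either side of 0.5. The same is true of \(0.5\pm 0.001\) for \(n = 1{,}000{,}000\) (standard deviation 0.0005) and of \(0.5\pm 0.0001\) for \(n = 100{,}000{,}000\) (standard deviation 0.00005).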

  • The probability of an event corresponding to the result of a random phenomenon can be interpreted as the proportion of times that the event would occur in a very large number of hypothetical repetitions of the random phenomenon.
  • That is, a probability can be interpreted as a long run proportion or long run relative frequency.
  • This means that the probability of an event can be approximated by simulating the random phenomenon a large number of times and computing the proportion of simulated repetitions on which the event occurred. This proportion is also called the relative frequency of the event.
    • A simulation involves an artificial recreation of the random phenomenon, usually using a computer.
    • For example, if a basketball player is successful on 90% of her free throw attempts, we can simulate the player shooting a single free throw attempt by taking 10 cards and labeling 9 as “success” and 1 as “miss” then shuffling well and dealing one card.
  • The long run relative frequency interpretation of probability can be applied when a situation can be repeated numerous times, at least conceptually, and the outcome can be observed each time.
  • The relative frequency of a particular event will settle down to a single constant value after many repetitions, and that long run value is the probability of that event.
    • However, what constitutes the random phenomenon or how the simulation is conducted depends on certain assumptions. Changing those assumptions can affect probabilities of interest.
      • For example, if you’re interested in the probability that a die lands on 1, you need to know if it’s a four-sided die or a six-sided die, and if the die is considered “fair”.
      • As a more complicated example, simulating the outcome of the next Superbowl involves many assumptions.
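
The card-based free throw simulation described above can be imitated on a computer to watch the relative frequency settle down. This is a minimal sketch under the stated assumption that the player succeeds on 90% of attempts; the function names, the seed, and the checkpoints are illustrative choices, not part of the original text.

```python
import random


def simulate_free_throw(rng):
    """Shuffle 10 cards (9 labeled 'success', 1 labeled 'miss') and deal the top card."""
    deck = ["success"] * 9 + ["miss"]
    rng.shuffle(deck)
    return deck[0]


def running_relative_frequency(n_reps, checkpoints, seed=1):
    """Repeat the simulation n_reps times; record the relative frequency of success at each checkpoint."""
    rng = random.Random(seed)
    successes = 0
    results = {}
    for rep in range(1, n_reps + 1):
        if simulate_free_throw(rng) == "success":
            successes += 1
        if rep in checkpoints:
            results[rep] = successes / rep
    return results


frequencies = running_relative_frequency(20_000, {10, 100, 1_000, 20_000})
for rep, freq in frequencies.items():
    print(f"after {rep:>6} attempts: relative frequency of success = {freq:.4f}")
```

Changing the deck, say to 8 “success” cards and 2 “miss” cards, changes the assumption behind the simulation and therefore the value the relative frequency settles to.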

1.2.2 Subjective probability

The relative frequency interpretation is natural in scenarios 1 through 8 of Exercise 1.1. We can regard situations like flipping a coin, drawing powerballs from a bin, or selecting a student at random as repeatable.

On the other hand, it is difficult to conceptualize scenarios 9 through 13 of Exercise 1.1 as relative frequencies. Superbowl 2020 will only be played once, the 2020 U.S. Presidential Election will only be conducted once (we hope), and there was only one April 17, 2009 on which you either did or did not eat an apple. But while these situations are not naturally repeatable, they still involve randomness (uncertainty) and it is still reasonable to assign probabilities. At this point in time, the Chargers are less likely than the Patriots (ugh) to win Superbowl 2020, Donald Trump is more likely than Dwayne Johnson to win the U.S. 2020 Presidential Election, and if you’ve always been an “apple-a-day” person, there’s a good chance you ate one on April 17, 2009. So it still makes sense to talk about probability in uncertain, but not necessarily repeatable, situations.

However, the meaning of probability does seem different in scenarios 9 through 13 compared to 1 through 8. Consider Superbowl 2020. As of Sept 16,

  • According to fivethirtyeight.com, the Patriots have a 19% chance of winning the Superbowl, the highest of any team, while the Chargers have a 4% chance.
  • According to footballoutsider.com, the Patriots have a 26% chance of winning the Superbowl, the highest of any team, while the Chargers have a 6% chance.
  • According to playoffstatus.com the Patriots have a 7% chance of winning the Superbowl, behind the Chiefs and Packers at 9% each, while the Chargers have a 3% chance.

All three websites, as well as many others, ascribe different probabilities to the Patriots or Chargers winning. Which website, if any, is correct? In the coin flipping example, we could perform a simulation to see that the long run relative frequency is 0.5. However, simulating Superbowl 2020 involves first simulating the 2019 season to determine the playoff matchups, then simulating the playoffs to see which teams make the Superbowl, then simulating the Superbowl matchup itself. And simulating the 2019 season involves simulating all the weekly matchups and potential injuries and their effects. Even just simulating a single game involves many assumptions; differences of opinion regarding these assumptions can lead to different probabilities. For example, according to fivethirtyeight, the Chiefs have a 69% chance of beating the Ravens on Sept 22, but according to pickingpros.com it’s only 52%. Unlike in the coin flipping problem, there is no single set of rules for running the simulation, and there is no single relative frequency that determines the probability. Therefore, in a situation like forecasting the Superbowl we consider subjective probability.
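
The dependence on assumptions can be illustrated with a toy simulation. The sketch below is purely hypothetical: it imagines a team that must win three playoff games in a row, and two forecasters who assume different single-game win probabilities. The numbers are invented for illustration and do not come from any of the websites mentioned, and the model ignores everything (regular season, opponents, injuries) that a real forecast would consider.

```python
import random


def simulate_title_probability(game_win_probs, n_reps=100_000, seed=2):
    """Estimate the chance of winning every game, given assumed per-game win probabilities."""
    rng = random.Random(seed)
    titles = 0
    for _ in range(n_reps):
        # The team wins the title only if it wins every game on its path;
        # all() stops at the first simulated loss.
        if all(rng.random() < p for p in game_win_probs):
            titles += 1
    return titles / n_reps


# Hypothetical assumptions made by two different forecasters about the same team.
forecaster_a = [0.70, 0.60, 0.55]
forecaster_b = [0.60, 0.52, 0.50]

estimate_a = simulate_title_probability(forecaster_a)
estimate_b = simulate_title_probability(forecaster_b)
print(f"forecaster A's title probability: {estimate_a:.3f}")
print(f"forecaster B's title probability: {estimate_b:.3f}")
```

Both simulations are run correctly, yet they disagree (roughly 0.23 versus 0.16) because the inputs differ. Neither relative frequency is “the” probability; each reflects the assumptions of the forecaster who chose the inputs.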

  • There are many situations where the outcome is uncertain, but it does not make sense to consider the situation as repeatable. In such situations, a subjective (a.k.a. personal) probability describes the degree of likelihood a given individual ascribes to a certain event.
  • As the name suggests, different individuals might have different subjective (personal) probabilities for the same event.
    • This is in contrast to the long run relative frequency situation, in which the probability is agreed to be defined as the long run relative frequency.
  • The fivethirtyeight NFL predictions are the output of a probabilistic forecast.
    • Probabilistic forecasts combine observed data and statistical models to make predictions.
    • Rather than providing a single prediction (such as “the Patriots will win Superbowl 2020”), probabilistic forecasts provide a range of scenarios and their relative likelihoods.
    • Be sure to make a distinction between assumption and observation.

  3. Probabilities are usually defined as decimals, but are often colloquially referred to as percentages. We’re not sticklers; we’ll refer to probabilities as decimals and as percentages.

  4. We will see how to compute probabilities like this one in upcoming chapters.

  5. If a “sequence” is defined with the powerball as the last number in the sequence, then the probability is about 21 out of 300 million. Since the powerball must be a number from 1 to 26, there are only 21 tickets out of 300 million possibilities for which the numbers are in an ordered sequence: 1-2-3-4-5-(6), 2-3-4-5-6-(7), … 21-22-23-24-25-(26).