1.1 Instances of randomness

A wide variety of situations involve probability. Consider just a few examples.

The probability that you roll doubles in a turn of a board game.
The probability you win the next Powerball lottery if you purchase a single ticket, 4-8-15-16-42, plus the Powerball number, 23.
The probability that a “randomly selected” Cal Poly student is a California resident.
The probability that the high temperature in San Luis Obispo tomorrow is above 90 degrees F.
The probability that Hurricane Peter makes landfall in the U.S.
The probability that the San Francisco 49ers win the next Superbowl.
The probability that President Biden wins the 2024 U.S. Presidential Election.
The probability that extraterrestrial life currently exists somewhere in the universe.
The probability that Alexander Hamilton actually wrote 51 of the Federalist Papers. (The papers were published under a common pseudonym and authorship of some of the papers is disputed.)
The probability that you ate an apple on April 17, 2009.

Example 1.1 How are the situations above similar, and how are they different? What is one feature that all of the situations have in common? Is the interpretation of “probability” the same in all situations? Take some time to consider these questions before looking at the solution. The goal here is to just think about these questions, and not to compute any probabilities (or to even think about how you would).

Solution to Example 1.1

This exercise is intended to motivate discussion, so you might have thought of some other ideas we don’t address here. That’s good! And some of the things you considered might come up later in the book. But here are a few thoughts we specifically want to mention now.

The one feature that all of the situations have in common is uncertainty. Sometimes the uncertainty arises from a repeatable physical phenomenon that can result in multiple potential outcomes, like rolling dice or drawing the winning Powerball number. In other cases, there is uncertainty because the probability concerns the future, like tomorrow’s high temperature or the result of the next Superbowl. But there can also be uncertainty about the past: there are some Federalist papers for which the author is unknown, and you probably don’t know for sure whether or not you ate an apple on April 17, 2009.

Whenever there is uncertainty, it is reasonable to consider relative likelihoods of potential outcomes. For example, even though you don’t know for certain whether you ate an apple on April 17, 2009, if you’re usually an apple-a-day person (or were when you were younger) you might think the probability is high. We don’t know for sure what team will win the next Superbowl, but we might think that the 49ers are more likely than the Las Vegas (?!?!) Raiders to be the winner.

While all of the situations in the example involve uncertainty, it seems that there are different “types” of uncertainty. Even though we don’t know which side a die will land on, the notion of “fairness” implies that the sides are “equally likely”. Likewise, there are some rules to how the Powerball drawing works, and it seems like these rules should determine the probability of drawing that particular winning number.

However, there aren’t any specific “rules of uncertainty” that govern whether or not you ate an apple on April 17, 2009. You either did or you didn’t, but that doesn’t mean the two outcomes are necessarily equally likely. Regarding the Superbowl, of course there are rules that govern the NFL season and playoffs, but there are no “rules of uncertainty” that tell us precisely how likely any particular team is to win any particular game, let alone how likely a team is to advance to and win the Superbowl.

It also seems that there are different interpretations of probability. Given that a six-sided die is fair, we might all agree that the probability that it lands on any particular side is 1/6. Similarly, given the rules of the Powerball lottery, we might all agree on the probability that a drawing results in a particular winning number. However, there isn’t necessarily consensus about what the high temperature will be in San Luis Obispo tomorrow. Different weather prediction models, forecasters, or websites might provide different values for the probability that the high temperature will be above 90 degrees Fahrenheit. Similarly, Superbowl odds might vary by source. Situations like tomorrow’s weather or the Superbowl where there is no consensus about the “rules of uncertainty” require some subjectivity in determining probabilities.

Finally, some of these situations are repeatable. We could (in principle) roll a pair of dice many times and see how often we get doubles, or repeat the Powerball drawing over and over to see how the winning numbers behave. However, many of these situations involve something that only happens once, like tomorrow or April 17, 2009 or the next Superbowl. Even when the phenomenon happens only once in reality, we can still develop models of what might happen if we were to hypothetically repeat the phenomenon many times. For example, meteorologists use historical data and meteorological models to forecast potential paths of a hurricane.
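To make the idea of repetition concrete, here is a minimal sketch of rolling a pair of dice many times on a computer rather than by hand. It assumes fair six-sided dice and uses Python's standard pseudorandom number generator as a stand-in for physical rolls; the function name `doubles_relative_frequency` and the seed value are our own illustrative choices.

```python
import random


def doubles_relative_frequency(n_rolls, seed=None):
    """Simulate n_rolls rolls of a pair of fair six-sided dice and
    return the fraction of rolls in which both dice show the same face."""
    rng = random.Random(seed)  # a seed makes the simulation reproducible
    doubles = 0
    for _ in range(n_rolls):
        first = rng.randint(1, 6)   # randint includes both endpoints
        second = rng.randint(1, 6)
        if first == second:
            doubles += 1
    return doubles / n_rolls


# Compare a few rolls with many rolls (assumed seed, chosen arbitrarily).
for n in (10, 1000, 100000):
    print(n, "rolls:", doubles_relative_frequency(n, seed=1))
```

With only a few rolls the proportion of doubles can vary a lot from one run to the next. If the dice are fair, the proportion should settle near 1/6 as the number of rolls grows, since 6 of the 36 equally likely pairs are doubles.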

The subject of probability concerns random phenomena. A phenomenon is random¹ if there are multiple potential outcomes, and there is uncertainty about which outcome will occur. Uncertainty is understood in broad terms, and in particular does not only concern future occurrences.

Some phenomena involve physical randomness², like flipping coins, rolling dice, drawing Powerballs at random from a bin, or randomly selecting Cal Poly students. In many other situations randomness just vaguely reflects uncertainty.

Contrary to colloquial uses of the word, random does not mean haphazard. In a random phenomenon, while individual outcomes are uncertain, we will see that there is a regular distribution of outcomes over a large number of (hypothetical) repetitions. For example,

In two flips of a fair coin we wouldn’t necessarily see one head and one tail. But in 10000 flips of a fair coin, we might expect to see close to 5000 heads and 5000 tails.
We don’t know who will win the next Superbowl, but we can and should consider some teams as more likely to win than others. We could imagine a large number of hypothetical 2021-2022 seasons; how often would we expect the 49ers to win? The Raiders?
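The coin flip example can be explored with a short simulation. The sketch below assumes a fair coin (each flip lands on heads with probability 0.5) and uses Python's standard `random` module; the helper name `count_heads` and the seed are hypothetical choices for illustration.

```python
import random


def count_heads(n_flips, seed=None):
    """Simulate n_flips flips of a fair coin and return the number of heads."""
    rng = random.Random(seed)
    # rng.random() is uniform on [0, 1), so it is below 0.5 half the time.
    return sum(rng.random() < 0.5 for _ in range(n_flips))


# Individual flips are uncertain, but the proportion of heads
# tends to become more regular as the number of flips grows.
for n in (2, 10, 100, 10000):
    heads = count_heads(n, seed=2024)
    print(f"{n:>6} flips: {heads} heads, proportion {heads / n:.3f}")
```

Changing the seed changes the individual outcomes. For 10000 flips, the count of heads would typically stay within a few hundred of 5000, while for 2 flips any of 0, 1, or 2 heads is plausible.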

Random also does not necessarily mean equally likely. In a random phenomenon, certain outcomes or events might be more or less likely than others. For example,

It’s much more likely than not that a randomly selected Cal Poly student is a California resident.
Not all NFL teams are equally likely to win the next Superbowl.

Finally, randomness is also not necessarily undesirable. In particular, many statistical applications employ the planned use of randomness with the goal of collecting “good” data. For example,

Random selection involves selecting a sample of individuals “at random” from a population (e.g., via random digit dialing), with the goal of selecting a representative sample.
Random assignment involves assigning individuals at random to groups (e.g., in a randomized experiment), with the goal of constructing groups that are similar in all aspects so that the effect of a treatment (like a new vaccine) can be isolated.
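Both uses of planned randomness can be sketched in a few lines of code. The example below is only an illustration under assumed numbers: a hypothetical population of 1000 individuals identified by ID, a sample of 50, and two groups of 25. None of these values come from a real study.

```python
import random

rng = random.Random(42)  # fixed seed (arbitrary) so the sketch is reproducible

# Hypothetical population of 1000 individuals, labeled by ID number.
population = list(range(1, 1001))

# Random selection: draw 50 individuals without replacement,
# so every individual has the same chance of being in the sample.
sample = rng.sample(population, k=50)

# Random assignment: shuffle the sampled individuals, then split
# them into two equal-size groups.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment = shuffled[:25]
control = shuffled[25:]

print("treatment group:", sorted(treatment))
print("control group:  ", sorted(control))
```

Because chance alone decides who is sampled and who lands in each group, no characteristic of the individuals systematically influences the selection or the assignment.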

The probability of an event associated with a random phenomenon is a number in the interval $[0, 1]$ measuring the event’s likelihood or degree of uncertainty. A probability can take any value on the continuous scale from 0% to 100%³. In particular, a probability requires much more interpretation than “is the probability greater than, less than, or equal to 50%?” As Example 1.1 suggests, there can be different interpretations of “probability”, which we’ll start to explore in the next section.

1.1.1 Exercises

For each of the following, provide an example of a random phenomenon that fits the description. Try to think of examples that are interesting to you personally!
1. Just two possible outcomes, but they are not equally likely.
2. Physically repeatable (at least in principle).
3. Well defined “rules of randomness”.
4. Involves subjectivity in determining probabilities.
5. Involves uncertainty about the future.
6. Involves uncertainty about the present or past.
7. Associated with the planned use of randomness in a particular statistical study.

¹ In this book, “random” and “uncertain” are synonyms; the opposite of “random” is “certain”. (Later we will encounter random variables; “constant” is an antonym of “random variable”.) The word “random” has many uses in everyday life, which have evolved over time. Unfortunately, some of the everyday meanings of “random”, like “haphazard” or “unexpected”, are contrary to what we mean by “random” in this book. For example, we would consider Steph Curry shooting a free throw to be a random phenomenon because we’re not certain if he’ll make it or miss it; but we would not consider this process to be haphazard or unexpected.
² We will refer to as “random” any scenario that involves a reasonable degree of uncertainty. We’re avoiding philosophical questions about what is “true” randomness, like the following. Is a coin flip really random? If all factors that affect the trajectory of the coin were known precisely, then wouldn’t the outcome be determined? Does true randomness only exist in quantum mechanics?
³ Probabilities are usually defined as decimals, but are often colloquially referred to as percentages. We’re not sticklers; we’ll refer to probabilities both as decimals and as percentages.