2 Working with Probabilities

Because random phenomena are so varied, an outcome can be virtually anything. In particular, an outcome does not have to be a number.
The sample space is the set of all possible outcomes of the random phenomenon.
An event is something that could happen. That is, an event is a collection of outcomes that satisfy some criteria. If the sample space outcomes are represented by rows in a spreadsheet, then an event is a subset of rows that satisfies some criteria.
Events are typically denoted with capital letters near the start of the alphabet, with or without subscripts (e.g. $A$ , $B$ , $C$ , $A_{1}$ , $A_{2}$ ). Events can be composed from others using basic set operations like unions ( $A \cup B$ ), intersections ( $A \cap B$ ), and complements ( $A^{c}$ ).
- Read $A^{c}$ as “not $A$ ”.
- Read $A \cap B$ as “ $A$ and $B$ ”.
- Read $A \cup B$ as “ $A$ or $B$ ”. Note that unions ( $\cup$ , “or”) are always inclusive. $A \cup B$ occurs if $A$ occurs but $B$ does not, $B$ occurs but $A$ does not, or both $A$ and $B$ occur.
A collection of events $A_{1}, A_{2}, \dots$ are disjoint (a.k.a. mutually exclusive) if $A_{i} \cap A_{j} = \emptyset$ for all $i \neq j$ . That is, multiple events are disjoint if none of the events have any outcomes in common.
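The set operations and the disjointness condition above can be sketched in base R, treating each event as a vector of outcomes. The sample space (one roll of a four-sided die), the events, and the helper `pairwise_disjoint` are hypothetical, introduced only for illustration.

```r
# Hypothetical sample space: one roll of a four-sided die
omega = 1:4

# Hypothetical events
A = c(1, 2)   # roll is at most 2
B = c(2, 4)   # roll is even

A_complement = setdiff(omega, A)   # "not A": outcomes 3, 4
A_and_B = intersect(A, B)          # "A and B": outcome 2
A_or_B = union(A, B)               # "A or B" (inclusive): outcomes 1, 2, 4

# Hypothetical helper: TRUE if every pair of events has an empty intersection
pairwise_disjoint = function(events) {
  pairs = combn(length(events), 2)
  all(apply(pairs, 2, function(idx) {
    length(intersect(events[[idx[1]]], events[[idx[2]]])) == 0
  }))
}

pairwise_disjoint(list(c(1), c(2, 3), c(4)))   # TRUE
pairwise_disjoint(list(A, B))                  # FALSE: outcome 2 is shared
```

Note that `union` keeps an outcome only once, which matches the inclusive reading of "or".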
A probability measure, typically denoted $P$ , assigns probabilities to events to quantify their relative likelihoods according to the assumptions of the model of the random phenomenon.
The probability of event $A$ , computed according to probability measure $P$ , is denoted $P (A)$ .
A valid probability measure $P$ must satisfy the following three logical consistency “axioms”.
- For any event $A$ , $0 \leq P (A) \leq 1$ .
- If $Ω$ represents the sample space then $P (Ω) = 1$ .
- (Countable additivity.) If $A_{1}, A_{2}, A_{3}, \dots$ are disjoint then $P (A_{1} \cup A_{2} \cup A_{3} \cup \dots) = P (A_{1}) + P (A_{2}) + P (A_{3}) + \dots$
Additional properties of a probability measure follow from the axioms.
- Complement rule. For any event $A$ , $P (A^{c}) = 1 - P (A)$ .
- Subset rule. If $A \subseteq B$ then $P (A) \leq P (B)$ .
- Addition rule for two events. If $A$ and $B$ are any two events, then $P (A \cup B) = P (A) + P (B) - P (A \cap B)$ .
- Law of total probability. If $C_{1}, C_{2}, C_{3} \dots$ are disjoint events with $C_{1} \cup C_{2} \cup C_{3} \cup \dots = Ω$ , then $P (A) = P (A \cap C_{1}) + P (A \cap C_{2}) + P (A \cap C_{3}) + \dots$
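As a brief illustration of how such properties follow from the axioms, here is a sketch of the complement rule and the addition rule. It relies on additivity for finitely many disjoint events, which is a consequence of countable additivity.

Since $A$ and $A^{c}$ are disjoint and $A \cup A^{c} = Ω$ ,

$$1 = P (Ω) = P (A \cup A^{c}) = P (A) + P (A^{c}) \quad \Longrightarrow \quad P (A^{c}) = 1 - P (A).$$

For the addition rule, $A \cup B$ is the disjoint union of $A$ and $B \cap A^{c}$ , while $B$ is the disjoint union of $A \cap B$ and $B \cap A^{c}$ , so

$$P (A \cup B) = P (A) + P (B \cap A^{c}), \qquad P (B) = P (A \cap B) + P (B \cap A^{c}).$$

Subtracting the second equation from the first gives $P (A \cup B) = P (A) + P (B) - P (A \cap B)$ .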
The axioms of a probability measure are minimal logical consistency requirements that ensure that probabilities of different events fit together in a valid, coherent way.
A single probability measure corresponds to a particular set of assumptions about the random phenomenon.

Example 2.1 The probability that a randomly selected U.S. household has a pet dog is 0.47. The probability that a randomly selected U.S. household has a pet cat is 0.25. (These values are based on the 2018 General Social Survey (GSS).)

Represent the information provided using proper symbols.
Donny Don’t says: “the probability that a randomly selected U.S. household has a pet dog OR a pet cat is $0.47 + 0.25 = 0.72$ .” Do you agree? What must be true for Donny to be correct? Explain. (Hint: for the remaining parts it helps to consider two-way tables.)
What is the smallest possible value of the probability that a randomly selected U.S. household has a pet dog AND a pet cat? Describe the (unrealistic) situation in which this extreme case would occur.
What is the largest possible value of the probability that a randomly selected U.S. household has a pet dog AND a pet cat? Describe the (unrealistic) situation in which this extreme case would occur. What would be the probability that a randomly selected U.S. household has a pet dog OR a pet cat in this scenario?
Donny Don’t says: “I remember hearing once that in probability OR means add and AND means multiply. So the probability that a randomly selected U.S. household has a pet dog AND a pet cat is $0.47 \times 0.25 = 0.1175$ .” Do you agree? Explain.
According to the GSS, the probability that a randomly selected U.S. household has a pet dog AND a pet cat is $0.15$ . Compute the probability that a randomly selected U.S. household has a pet dog OR a pet cat.
Compute and interpret $P (C \cap D^{c})$ .
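One way to organize the parts of Example 2.1 is to compute the pieces of a two-way table from the given values. The sketch below assumes $D$ denotes the event that the household has a pet dog and $C$ the event that it has a pet cat; the variable names are illustrative.

```r
# Probabilities given in Example 2.1
p_dog = 0.47
p_cat = 0.25
p_both = 0.15   # P(dog AND cat), the GSS value quoted in the example

# Bounds on P(dog AND cat) implied by the two marginal probabilities alone
p_both_min = max(0, p_dog + p_cat - 1)   # 0: no household has both
p_both_max = min(p_dog, p_cat)           # 0.25: every cat household also has a dog

# Addition rule: P(dog OR cat) = P(dog) + P(cat) - P(dog AND cat)
p_either = p_dog + p_cat - p_both        # 0.57

# Cat but no dog: P(C and not D) = P(C) - P(C and D)
p_cat_only = p_cat - p_both              # 0.10

# Two-way table of joint probabilities (rows: dog, columns: cat)
two_way = matrix(
  c(p_both,     p_dog - p_both,
    p_cat_only, 1 - p_either),
  nrow = 2, byrow = TRUE,
  dimnames = list(dog = c("yes", "no"), cat = c("yes", "no"))
)
two_way
```

Donny's sum $0.47 + 0.25$ would be correct only if `p_both` were 0, and his product $0.47 \times 0.25$ does not match the GSS value of 0.15, so neither shortcut applies here.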

2.1 Equally Likely Outcomes and Uniform Probability Measures

For a sample space $Ω$ with finitely many possible outcomes, assuming equally likely outcomes corresponds to a probability measure $P$ which satisfies $P (A) = \frac{| A |}{| Ω |} = \frac{\text{number of outcomes in } A}{\text{number of outcomes in } Ω} \qquad \text{when outcomes are equally likely}$

Example 2.2 Roll a fair four-sided die twice, and record the result of each roll in sequence.

How many possible outcomes are there? Are they equally likely?
Compute $P (A)$ , where $A$ is the event that the sum of the two dice is 4.
Compute $P (B)$ , where $B$ is the event that the sum of the two dice is at most 3.
Compute $P (C)$ , where $C$ is the event that the larger of the two rolls (or the common roll if a tie) is 3.
Compute and interpret $P (A \cap C)$ .

Table 2.1: Table representing the sample space of two rolls of a four-sided die. The outcomes marked “yes” in the last column comprise the event $A$ , the sum is equal to 4.
First roll	Second roll	Sum is 4?
1	1	no
1	2	no
1	3	yes
1	4	no
2	1	no
2	2	yes
2	3	no
2	4	no
3	1	yes
3	2	no
3	3	no
3	4	no
4	1	no
4	2	no
4	3	no
4	4	no
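The counting in Example 2.2 can be checked by enumerating the sample space in R. This is a minimal sketch that assumes the 16 ordered pairs are equally likely; the variable names are illustrative.

```r
# Sample space: all 16 ordered pairs (first roll, second roll)
rolls = expand.grid(first = 1:4, second = 1:4)
n_outcomes = nrow(rolls)   # 16

roll_sum = rolls$first + rolls$second
roll_max = pmax(rolls$first, rolls$second)

# Events as logical vectors over the rows of the sample space
A = roll_sum == 4    # sum is 4
B = roll_sum <= 3    # sum is at most 3
C = roll_max == 3    # larger roll (or common roll if a tie) is 3

# With equally likely outcomes, P(event) = (rows in event) / (total rows)
p_A = sum(A) / n_outcomes             # 3/16
p_B = sum(B) / n_outcomes             # 3/16
p_C = sum(C) / n_outcomes             # 5/16
p_A_and_C = sum(A & C) / n_outcomes   # 2/16

# Outcomes in both A and C: (3, 1) and (1, 3)
rolls[A & C, ]
```

Representing events as logical vectors mirrors the spreadsheet picture: each row is an outcome, and an event selects a subset of rows.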

The continuous analog of equally likely outcomes is a uniform probability measure. When the sample space is uncountable, size is measured continuously (length, area, volume) rather than discretely (counting). $P (A) = \frac{| A |}{| Ω |} = \frac{\text{size of } A}{\text{size of } Ω} \qquad \text{if } P \text{ is a uniform probability measure}$

Example 2.3 Regina and Cady are meeting for lunch. Suppose they each arrive uniformly at random at a time between noon and 1:00, independently of each other. Record their arrival times as minutes after noon, so noon corresponds to 0 and 1:00 to 60.

Draw a picture representing the sample space.
Compute the probability that the first person to arrive has to wait at most 15 minutes for the other person to arrive. In other words, compute the probability that they arrive within 15 minutes of each other.
Compute the probability that the first person to arrive arrives before 12:15.
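Because the pair of arrival times is uniform over a square, both probabilities in Example 2.3 can be worked out as ratios of areas. The following is a sketch of that calculation, writing $(x, y)$ for the (Regina, Cady) arrival times in minutes after noon; these symbols are introduced here for illustration.

The sample space is the square $[0, 60] \times [0, 60]$ , with area $60 \times 60 = 3600$ . The event that they arrive within 15 minutes of each other is the band $| x - y | \leq 15$ . Its complement consists of two right triangles with legs of length 45, with total area $45^{2} = 2025$ , so

$$P (| x - y | \leq 15) = 1 - \frac{2025}{3600} = \frac{7}{16} = 0.4375.$$

The event that the first arrival is before 12:15 is $\min(x, y) < 15$ . Its complement, both arriving at 12:15 or later, is a square with side 45, so

$$P (\min(x, y) < 15) = 1 - \frac{45^{2}}{60^{2}} = \frac{7}{16} = 0.4375.$$

The simulated relative frequencies reported with the code that follows (0.435 and 0.438) are close to this exact value.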

library(kableExtra)  # provides kbl() and kable_styling() used below

N_rep = 1000

# Simulate values uniformly between 0 and 60, independently
u1 = runif(N_rep, 0, 60)
u2 = runif(N_rep, 0, 60)

# waiting time
waiting_time = abs(u1 - u2)

# first time
first_arrival_time = pmin(u1, u2)

# put the variables together in a data frame
meeting_sim = data.frame(u1, u2, waiting_time, first_arrival_time)

# first few rows (with kable formatting)
head(meeting_sim) |>
  kbl(digits = 3) |>
  kable_styling()

u1	u2	waiting_time	first_arrival_time
59.440	8.202	51.239	8.202
13.179	37.741	24.562	13.179
11.262	14.005	2.743	11.262
2.024	34.330	32.307	2.024
55.781	36.682	19.098	36.682
48.931	57.138	8.207	48.931

# Approximate probability that waiting time is less than 15
sum(waiting_time < 15) / N_rep

[1] 0.435

# Approximate probability that first arrival time is less than 15
sum(first_arrival_time < 15) / N_rep

[1] 0.438

# "Base R" plots
plot(u1, u2,
     col = ifelse(waiting_time < 15, "orange", "black"),
     xlab = "Regina's arrival time",
     ylab = "Cady's arrival time")
abline(a = 15, b = 1)
abline(a = -15, b = 1)
plot(u1, u2,
     col = ifelse(first_arrival_time < 15, "orange", "black"),
     xlab = "Regina's arrival time",
     ylab = "Cady's arrival time")

Simulated event that waiting time is less than 15

Simulated event that first arrival time is less than 15

library(ggplot2)

# ggplots
ggplot(meeting_sim,
       aes(x = u1, y = u2, col = (waiting_time < 15))) +
  geom_point() +
  geom_abline(slope = c(1, 1), intercept = c(15, -15)) +
  labs(x = "Regina's arrival time",
       y = "Cady's arrival time")
ggplot(meeting_sim,
       aes(x = u1, y = u2, col = (first_arrival_time < 15))) +
  geom_point() +
  labs(x = "Regina's arrival time",
       y = "Cady's arrival time")