Chapter 3 Probability and random events

Probability is a method of mathematically modeling a random process so that we can understand it and/or make predictions about its future results. Probability is an essential tool for casinos, as well as for banks, insurance companies, and any other businesses that manage risks.

Chapter goals

In this chapter we will learn how to:

Model random events using the tools of probability
Calculate and interpret marginal, joint, and conditional probabilities
Interpret and use the assumptions of independence and equal outcome probability

This chapter uses mathematical notation and terminology that you have seen before but may need to review. If you have difficulty with the math, please refer to the sections on Sets and on Functions in the Math Review appendix.

Example 3.1 Example application: Roulette

We will develop ideas by considering the casino game of Roulette. The picture below shows what a roulette wheel looks like.

[Figure: roulette wheel. Source: Roulette Vectors by Vecteezy]

Here are the rules:

The game features:
- a ball.
- a spinning wheel with numbered/colored slots.
- a table on which to place bets.
The slots are numbered from 0 to 36:
- Slot number 0 is green
- 18 slots are red
- 18 slots are black.
- The picture above depicts an American roulette table, which has an additional green slot labeled “00.”
- I will assume we have a European roulette table, which does not include the “00” slot.
Players can place various bets on the table including:
- Red (ball lands in a red slot) pays $1 per $1 bet
- Black (ball lands in a black slot) pays $1 per $1 bet
- A straight bet on any specific number (ball lands on that number) pays $35 per $1 bet

Like other casino games, a roulette game is an example of a random process. Something will happen, it matters (to the players and the casino) what will happen, but we don’t know in advance what will happen.

3.1 Outcomes and events

To build a probabilistic model of a random process, we start by defining the outcome we are interested in. An outcome can be a simple yes/no result, it can be a number, or it can be a much more complex object. The outcome should be a complete description of the random process, in the sense that everything we are interested in can be defined in terms of the outcome.

Example 3.2 Outcomes in roulette

The outcome of a single game of roulette can be defined as the number of the slot in which the ball lands. Call that number $b$.

The set of all possible outcomes is called the sample space, and is usually denoted by $\Omega$.

Example 3.3 The sample space in roulette

The sample space for a game of roulette can be defined as the set of all numbers the ball can land on: \[\Omega = \{0,1,2,\ldots,36\}\] This sample space has $|\Omega| = 37$ elements.

Next, we define a set of events that we are interested in. We can think of an event as either:

A statement that is either true or false OR
A subset of the sample space

These two concepts are equivalent, though the subset concept makes the math clearer.

Example 3.4 Events in roulette

These roulette events are well-defined for our sample space:

Ball lands on 14: \[b \in \{14\}\]
Ball lands on red: \[\begin{align} b \in Red &= \left\{\begin{aligned} & 1,3,5,7,9,12,14,16,18, \\ & 19,21,23,25,27,30,32,34,36 \\ \end{aligned}\right\} \end{align}\]
Ball lands on black: \[\begin{align} b \in Black &= \left\{\begin{aligned} & 2,4,6,8,10,11,13,15,17, \\ & 20,22,24,26,28,29,31,33,35 \\ \end{aligned}\right\} \end{align}\]
Ball lands on one of the first 12 numbers: \[b \in First12 = \{1,2,3,4,5,6,7,8,9,10,11,12\}\]

We could define many more events, depending on what bets we are interested in.
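The "event as a subset" idea maps directly onto a set data type. The following is a minimal sketch in Python (the language choice and the names `OMEGA`, `RED`, `BLACK`, `FIRST12`, and `occurs` are illustrative assumptions, not part of the chapter's notation). It shows that asking whether an event happened is the same as asking whether the outcome $b$ is an element of the event's set.

```python
# Sample space for European roulette: slots 0 through 36.
OMEGA = set(range(37))

# Events are subsets of the sample space (sets copied from Example 3.4).
RED = {1, 3, 5, 7, 9, 12, 14, 16, 18,
       19, 21, 23, 25, 27, 30, 32, 34, 36}
BLACK = {2, 4, 6, 8, 10, 11, 13, 15, 17,
         20, 22, 24, 26, 28, 29, 31, 33, 35}
FIRST12 = set(range(1, 13))


def occurs(event, b):
    """Return True if the event happens when the ball lands on b."""
    return b in event


print(len(OMEGA))           # 37
print(occurs(RED, 14))      # True
print(occurs(BLACK, 14))    # False
print(occurs(FIRST12, 14))  # False
```

Every slot other than 0 is either red or black, which is why the three sets `{0}`, `RED`, and `BLACK` together make up the whole sample space.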

An event that contains only one outcome is called an elementary event.

Since events are sets, we can use the terminology and mathematical tools for sets.

Example 3.5 Relationships among events

In our roulette example:

Two events are identical $(A = B)$ if they contain exactly the same outcomes:
- The event “ball lands on 14” and “a bet on 14 wins” are identical since $\{14\} = \{14\}$.
- Intuitively, identical means they are just two different ways of describing the same event.
An event implies another event $(A \subset B)$ if all of its outcomes are also in the implied event
- The event “ball lands on 14” implies the event “ball lands on red” since $\{14\} \subset Red$.
- When an event happens, any event it implies also happens.
Two events are disjoint $(A \cap B = \emptyset)$ if they share no outcomes:
- The events “ball lands on red” and “ball lands on black” are disjoint since $Red \cap Black = \emptyset$.
- If two events are disjoint, they cannot both happen.
- But they can both fail to happen. For example, if the ball lands in the green zero slot ($b = 0$), neither red nor black wins.
Any two elementary events are either identical or disjoint
- The events “ball lands on 14” and “ball lands on 25” are disjoint since $\{14\} \cap \{25\} = \emptyset$.
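The relationships in this example can be checked mechanically with set operations. This is a small sketch under the assumption that events are stored as Python sets; the helper names `implies` and `disjoint` are hypothetical and chosen only to mirror the terminology above.

```python
RED = {1, 3, 5, 7, 9, 12, 14, 16, 18,
       19, 21, 23, 25, 27, 30, 32, 34, 36}
BLACK = {2, 4, 6, 8, 10, 11, 13, 15, 17,
         20, 22, 24, 26, 28, 29, 31, 33, 35}


def implies(a, b):
    """Event a implies event b when every outcome in a is also in b."""
    return a <= b


def disjoint(a, b):
    """Two events are disjoint when they share no outcomes."""
    return a.isdisjoint(b)


print({14} == {14})             # True: identical events
print(implies({14}, RED))       # True: landing on 14 implies landing on red
print(disjoint(RED, BLACK))     # True: red and black cannot both happen
print(0 in RED or 0 in BLACK)   # False: on 0, both red and black fail
print(disjoint({14}, {25}))     # True: distinct elementary events
```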

3.2 Probabilities

Our final step is to define a probability distribution for this random process, which is a function that assigns a number to each possible event. The number is called the event’s probability.

Probabilities are always between zero and one:

If an event has probability zero, it definitely will not happen.
If an event has probability strictly between zero and one, it might happen.
If an event has probability one, it definitely will happen.

3.2.1 The axioms of probability

All valid probability distributions must obey the following three conditions, which are sometimes called the axioms of probability.

Probabilities are never negative: \[\Pr(A) \geq 0\]
One of the outcomes will definitely happen: \[\Pr(\Omega) = 1\]
For any two disjoint events $A$ and $B$, the probability that $A$ or $B$ happen is the sum of their individual probabilities: \[\Pr(A \cup B) = \Pr(A) + \Pr(B)\]

Probability distributions have many other properties, but they can all be derived from these three axioms.

Example 3.6 Outcome probabilities for a fair roulette game

Let’s assume that the roulette wheel is “fair” in the sense that each outcome has the same probability. I should emphasize that this doesn’t have to be the case; it is just an assumption. But it is a reasonable one here because casinos are required by law to run fair roulette wheels and would be subject to heavy penalties if they ran unfair ones. Later on, we will use statistics to check whether a roulette wheel is fair.

Call that probability $p$: \[p = \Pr(b = 0) = \Pr(b = 1) = \cdots = \Pr(b = 36)\] To find the value of $p$ we use the axioms of probability. By axiom 2, one of the outcomes will happen: \[\Pr(\Omega) = 1\] Since the different outcomes are disjoint, applying axiom 3 repeatedly implies that: \[\underbrace{\Pr(\Omega)}_{1} = \underbrace{\Pr(\{0\})}_{p} + \underbrace{\Pr(\{1\})}_{p} + \cdots + \underbrace{\Pr(\{36\})}_{p}\] Summarizing this equation: \[1 = 37p\] Solving for $p$ we get: \[p = 1/37 \approx 0.027\] That is, each of the 37 elementary events has a probability of $1/37$.

Since this is an introductory course, our sample space will usually contain a finite number of outcomes, as in our roulette example. In that case, probability calculations are pretty simple:

Find the probability of each elementary event.
To find the probability of a specific event, just add up the probabilities of its elementary events.

Example 3.7 Event probabilities for a fair roulette game

In the roulette example, the probability of any event $A$ is just the number of outcomes in $A$ times the probability of each outcome $1/37$: \[ \Pr(A) = |A|*1/37\] The notation $|A|$ just means the size of (number of elements in) the set $A$.

For example: \[\Pr(b=25) = |\{25\}|*1/37 = 1/37 \approx 0.027\] \[\Pr(Red) = |Red|*1/37 = 18/37 \approx 0.486\] \[\Pr(Even) = |Even|*1/37 = 18/37 \approx 0.486\] \[\Pr(First12) = |First12|*1/37 = 12/37 \approx 0.324\]
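The counting rule $\Pr(A) = |A| \times 1/37$ can be sketched in a few lines of Python. The function name `pr` and the set names are illustrative assumptions, and the sketch relies on the fair-wheel assumption from Example 3.6. Exact fractions are used so the results can be compared directly with the values above.

```python
from fractions import Fraction

OMEGA = set(range(37))
RED = {1, 3, 5, 7, 9, 12, 14, 16, 18,
       19, 21, 23, 25, 27, 30, 32, 34, 36}
EVEN = set(range(2, 37, 2))   # 2, 4, ..., 36 (0 is not counted as even here)
FIRST12 = set(range(1, 13))


def pr(event):
    """Probability of an event when all 37 outcomes are equally likely."""
    return Fraction(len(event), len(OMEGA))


events = [("b=25", {25}), ("Red", RED), ("Even", EVEN), ("First12", FIRST12)]
for name, event in events:
    p = pr(event)
    print(f"Pr({name}) = {p} ~ {float(p):.3f}")
# Pr(b=25) = 1/37 ~ 0.027
# Pr(Red) = 18/37 ~ 0.486
# Pr(Even) = 18/37 ~ 0.486
# Pr(First12) = 12/37 ~ 0.324
```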

However, not all sample spaces contain a finite number of outcomes. For example, suppose we are interested in using probability to model the unemployment rate, or a person’s income. Those are real numbers, and can take on any of an infinite number of values. This adds a few complications, and is the reason that the probability axioms refer to events (sets of outcomes) and not individual outcomes.

What do probabilities really mean?

What does it really mean to say that the probability of the ball landing in a red slot is about 0.486? That’s actually a tough question. There are two standard interpretations for probabilities:

Frequentist or classical interpretation: we are thinking of the random process as something that could be repeated many times, and the probability of an event is the approximate fraction of times that the event will occur. That is, if you go to a casino and bet 1000 times on Red, you will win about 486 times.
Bayesian or subjectivist interpretation: the random process is a one-time occurrence, but we have limited information about it and the probability of an event represents the strength of our belief that the event will happen.

The frequentist interpretation of probability is well-suited for simple repeated settings like casino games or car insurance, while the Bayesian interpretation makes more sense for things like predicting election results.
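The frequentist interpretation can be illustrated with a simulation. The sketch below assumes a fair European wheel and uses Python's pseudo-random number generator as a stand-in for real spins; the function name `fraction_red` and the seed are arbitrary choices for illustration. As the number of simulated spins grows, the fraction of red results should settle near $18/37 \approx 0.486$, though any single run will differ slightly.

```python
import random

RED = {1, 3, 5, 7, 9, 12, 14, 16, 18,
       19, 21, 23, 25, 27, 30, 32, 34, 36}


def fraction_red(n_spins, seed=0):
    """Simulate n_spins fair spins and return the fraction that land on red."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    wins = sum(rng.randrange(37) in RED for _ in range(n_spins))
    return wins / n_spins


for n in (100, 10_000, 100_000):
    print(n, round(fraction_red(n), 3))

print("theoretical:", round(18 / 37, 3))
```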

3.2.2 Additional rules for probabilities

Let $A$ and $B$ be two events. Then our three axioms of probability imply several additional rules:

Probabilities cannot be higher than one. \[\Pr(A) \leq 1\]
Probabilities of identical events are identical: \[A = B \implies \Pr(A) = \Pr(B)\]
Probabilities of implied events are at least as large: \[A \subset B \implies \Pr(A) \leq \Pr(B)\]
The probability of an event not happening is: \[\Pr(A^c) = 1 - \Pr(A)\]
The probability of nothing happening is: \[\Pr(\emptyset) = 0\]
The probability of either $A$ or $B$ happening is: \[\begin{align} \Pr(A \cup B) &= \Pr(A) + \Pr(B) - \Pr(A \cap B) \\ &\leq \Pr(A) +\Pr(B) \end{align}\]

These results are not hard to prove, but I will not go through the proofs. However, I will use these results so you should be familiar with them.
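Although the proofs are omitted, the rules can at least be checked numerically in the roulette example. The following sketch is a spot check on one pair of events, not a proof; the names `pr`, `A`, and `B` are illustrative, and the fair-wheel assumption is used throughout.

```python
from fractions import Fraction

OMEGA = set(range(37))
RED = {1, 3, 5, 7, 9, 12, 14, 16, 18,
       19, 21, 23, 25, 27, 30, 32, 34, 36}
EVEN = set(range(2, 37, 2))


def pr(event):
    """Probability of an event on a fair 37-slot wheel."""
    return Fraction(len(event), len(OMEGA))


A, B = RED, EVEN

assert pr(A) <= 1                               # never above one
assert pr({14}) <= pr(RED)                      # {14} is a subset of Red
assert pr(OMEGA - A) == 1 - pr(A)               # complement rule
assert pr(set()) == 0                           # empty event
assert pr(A | B) == pr(A) + pr(B) - pr(A & B)   # union rule
assert pr(A | B) <= pr(A) + pr(B)               # union bound

print(pr(A | B))  # 28/37
```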

3.3 Joint and conditional probabilities

We are often interested in more than one event, and want to talk about how they are related. For example:

In some casino games like poker or blackjack, players take an additional action after partial information about the outcome is revealed.
Politicians often use polls to predict the winner of an election.
Finance people often want to model multiple market scenarios and forecast a company’s earnings under each of them.
Economists often have data on current economic conditions and want to predict future economic conditions.

This section will develop some tools for dealing with the relationship between different random events.

3.3.1 Joint probabilities

The joint probability of two events $A$ and $B$ is the probability that they both happen: \[\Pr(A \cap B)\] Remember that the intersection ($\cap$) of $A$ and $B$ is the set of all outcomes that are in both $A$ and $B$.

Example 3.8 Joint probabilities for roulette bets

Consider two events for a game of roulette: \[\begin{align} Red &= \left\{\begin{aligned} & 1,3,5,7,9,12,14,16,18, \\ & 19,21,23,25,27,30,32,34,36 \\ \end{aligned}\right\} \\ Even &= \left\{\begin{aligned} & 2,4,6,8,10,12,14,16,18, \\ & 20,22,24,26,28,30,32,34,36 \\ \end{aligned}\right\} \end{align}\] Suppose you are interested in the probability that the ball lands on a number that is both red and even. This event is just the intersection of $Red$ and $Even$ so this joint probability is: \[\begin{align} \Pr(Red \cap Even) &= \Pr(\{12,14,16,18,30,32,34,36\}) \\ &= 8/37 \\ &\approx 0.216 \end{align}\]
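With events stored as sets, the joint probability is just the probability of the intersection. The sketch below (set names are illustrative, and a fair wheel is assumed) reproduces the calculation above.

```python
from fractions import Fraction

RED = {1, 3, 5, 7, 9, 12, 14, 16, 18,
       19, 21, 23, 25, 27, 30, 32, 34, 36}
EVEN = set(range(2, 37, 2))

# The joint event "red and even" is the set intersection.
red_and_even = RED & EVEN
print(sorted(red_and_even))  # [12, 14, 16, 18, 30, 32, 34, 36]

# Each of the 37 outcomes has probability 1/37 on a fair wheel.
p_joint = Fraction(len(red_and_even), 37)
print(p_joint, round(float(p_joint), 3))  # 8/37 0.216
```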

Joint probabilities are just probabilities, so they obey all of the axioms and rules of probability described in Section 3.2.

3.3.2 Conditional probabilities

The conditional probability of an event $A$ given another event $B$ is defined as: \[\Pr(A|B) = \frac{\Pr(A \cap B)}{\Pr(B)}\] The conditional probability answers the question: if we already know that $B$ is true, what are the chances that $A$ is true?

Conditional probabilities are very important when playing poker. At the beginning of the game, every player has an equal chance of having a winning hand. But that is no longer true after you see your cards - having “good” cards increases your chance of winning, and having “bad” cards decreases that chance. In other words, your bet should be based on $\Pr(win|cards)$ rather than $\Pr(win)$. Good poker players have detailed knowledge of these conditional probabilities.

Example 3.9 Conditional probabilities in roulette

In our roulette example: \[\Pr(Red|Even) = \frac{\Pr(Red \cap Even)}{\Pr(Even)} = \frac{8/37}{18/37} \approx 0.444\] \[\Pr(b = 14|Even) = \frac{\Pr(b = 14 \cap Even)}{\Pr(Even)} = \frac{1/37}{18/37} \approx 0.056\]
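The definition of conditional probability translates into a short function. This is a sketch under the fair-wheel assumption; `pr` and `pr_given` are hypothetical helper names. The function refuses to condition on an event with zero probability, since the ratio is not defined in that case.

```python
from fractions import Fraction

N_SLOTS = 37
RED = {1, 3, 5, 7, 9, 12, 14, 16, 18,
       19, 21, 23, 25, 27, 30, 32, 34, 36}
EVEN = set(range(2, 37, 2))


def pr(event):
    """Probability of an event on a fair 37-slot wheel."""
    return Fraction(len(event), N_SLOTS)


def pr_given(a, b):
    """Conditional probability Pr(a | b) = Pr(a and b) / Pr(b)."""
    if pr(b) == 0:
        raise ValueError("conditioning event has zero probability")
    return pr(a & b) / pr(b)


print(pr_given(RED, EVEN), round(float(pr_given(RED, EVEN)), 3))    # 4/9 0.444
print(pr_given({14}, EVEN), round(float(pr_given({14}, EVEN)), 3))  # 1/18 0.056
```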

Like joint probabilities, conditional probabilities are just probabilities, so they obey all of the axioms and rules of probability described in Section 3.2.

3.3.3 Independent events

One common “trick” in modeling joint and conditional probabilities is to assume that certain events are unrelated to each other. This can simplify the math significantly.

We say that two events $A$ and $B$ are independent if their joint probability is just the two individual probabilities multiplied together: \[\Pr(A \cap B) = \Pr(A)\Pr(B)\] We usually express independence with the notation $A \bot B$.

The definition of independence is not very intuitive, but we can clarify it by doing a little math. Consider two independent events $A$ and $B$ that have nonzero¹ probability. Then by the definition of independence: \[\Pr(A|B) = \frac{\Pr(A \cap B)}{\Pr(B)} = \frac{\Pr(A)\Pr(B)}{\Pr(B)} = \Pr(A)\] By the same reasoning: \[\Pr(B|A) = \Pr(B)\] In other words, knowing that one of these events is true tells you nothing useful about whether the other event is true.

When would it be reasonable to assume events are independent? The typical scenario would be where there is simply no physical or logical relationship between them, usually due to a separation in time and space.

Example 3.10 Independence across roulette games

We have already shown that events related to a single roulette game are not necessarily independent. But the outcomes/events of two different roulette games can be reasonably assumed to be independent of one another.

Suppose that I bring $100 to a casino this afternoon for a few games of roulette. I bet all of my money on Red for the first game.

If I lose, I am broke and stop playing.
If I win, I keep all of my money (both my initial bet and my winnings) on Red for the next spin.
I keep playing until I run out of money.

After 3 games:

If Red wins all 3 games, my money has doubled three times and I have $w = \$800$.
Otherwise, I have nothing ($w = \$0$).

What is the probability of each of these events? Since we can assume that each game’s outcome is independent, this is an easy problem: \[\begin{align} \Pr(w = \$800) &= \Pr(Red_1 \cap Red_2 \cap Red_3) \\ &= \Pr(Red_1) \times \Pr(Red_2) \times \Pr(Red_3) \tag{3.1} \\ &= (18/37) \times (18/37) \times (18/37) \approx 0.115 \\ \Pr(w = \$0) &= 1 - \Pr(w = \$800) \\ &\approx 0.885 \\ \end{align}\] So we have an 11.5% chance of winning big, and an 88.5% chance of going broke.

Very important: equation (3.1) only follows from the previous equation because we have assumed the events $Red_1$, $Red_2$, and $Red_3$ are independent.
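The calculation in equation (3.1) can be cross-checked by brute force. The sketch below assumes that the three spins are fair and independent, which is equivalent to treating all $37^3$ sequences of three results as equally likely; the variable names are illustrative. It counts the sequences in which every spin is red and compares the result with the product of the individual probabilities.

```python
from fractions import Fraction
from itertools import product

RED = {1, 3, 5, 7, 9, 12, 14, 16, 18,
       19, 21, 23, 25, 27, 30, 32, 34, 36}

# Product rule under independence: Pr(Red1 and Red2 and Red3).
p_red = Fraction(len(RED), 37)
p_win = p_red ** 3

# Cross-check: enumerate all 37**3 equally likely sequences of three spins
# and count those in which every spin lands on red.
n_all_red = sum(
    1 for spins in product(range(37), repeat=3)
    if all(b in RED for b in spins)
)
assert Fraction(n_all_red, 37 ** 3) == p_win

print(p_win, round(float(p_win), 3))  # 5832/50653 0.115
print(round(float(1 - p_win), 3))     # 0.885
```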

When is it not reasonable to assume that events are independent? In almost any other case. Remember that events are defined in terms of the same underlying outcome, so they are typically related unless you have some very specific reason to assume otherwise.

Example 3.11 Independence within a roulette game?

Consider the roulette events “Red wins” and “Even wins.” We earlier showed that the unconditional probability that Red wins is: \[\Pr(Red) = 18/37 \approx 0.486\] The conditional probability that Red wins given that Even wins is: \[\Pr(Red|Even) = 8/18 \approx 0.444\] Since $0.444 \neq 0.486$, these two events are not independent.
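The same conclusion follows from the definition of independence, which compares the joint probability with the product of the individual probabilities. A minimal sketch, assuming a fair wheel and using the hypothetical helper names `pr` and `independent`:

```python
from fractions import Fraction

RED = {1, 3, 5, 7, 9, 12, 14, 16, 18,
       19, 21, 23, 25, 27, 30, 32, 34, 36}
EVEN = set(range(2, 37, 2))


def pr(event):
    """Probability of an event on a fair 37-slot wheel."""
    return Fraction(len(event), 37)


def independent(a, b):
    """Check the definition: Pr(a and b) == Pr(a) * Pr(b)."""
    return pr(a & b) == pr(a) * pr(b)


print(pr(RED & EVEN))        # 8/37
print(pr(RED) * pr(EVEN))    # 324/1369
print(independent(RED, EVEN))  # False
```

Exact fractions are used here so that the comparison is not affected by floating-point rounding.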

A common mistake by students who are new to probability and statistics is to take results that only apply under independence and use them when there is no reason to believe that independence holds. Don’t make this mistake: independence is an assumption, and one that can easily be incorrect.

3.3.4 Law of total probability

In addition to the results we have already discussed, there are two important results using conditional probabilities:

The first is the law of total probability which is a rule for determining unconditional probabilities from conditional probabilities:

\[\Pr(A) = \Pr(A|B)\Pr(B) + \Pr(A|B^c)\Pr(B^c)\]

The law of total probability allows us to create a set of scenarios, calculate probabilities under each scenario, and then add them up. It is useful when we are modeling random outcomes that occur in multiple stages, for example a poker game or an energy company making a series of investments to develop an oil field.

Example 3.12 The law of total probability in poker

Suppose you are playing Texas hold’em poker with a few friends, and the hand has one card left to deal (the “river”). If the last card is a heart (25% probability), you will have a flush and win the hand with a probability you estimate to be 90%. If not, you will win with a probability you estimate to be 10%. What are your overall chances of winning?

The answer can be calculated using the law of total probability: \[\begin{align} \Pr(\textrm{Win}) &= \Pr(\textrm{Win}|\textrm{Hearts})\Pr(\textrm{Hearts}) + \Pr(\textrm{Win}|\textrm{not Hearts})\Pr(\textrm{not Hearts}) \\ &= 0.9*0.25 + 0.1*0.75 \\ &= 0.3 \end{align}\] So you have a 30% chance of winning.
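The two-scenario version of the law of total probability is easy to wrap in a function. This is a sketch with a hypothetical function name and argument names; the inputs are the estimates from the poker example.

```python
def total_probability(p_a_given_b, p_a_given_not_b, p_b):
    """Pr(A) = Pr(A|B) Pr(B) + Pr(A|not B) Pr(not B)."""
    return p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)


# Poker example: B = "river card is a heart", A = "you win the hand".
p_win = total_probability(p_a_given_b=0.9, p_a_given_not_b=0.1, p_b=0.25)
print(round(p_win, 3))  # 0.3
```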

3.3.5 Bayes’ law

The second is Bayes’ law, which is a rule for determining conditional probabilities: \[\Pr(A|B) = \frac{\Pr(B|A)\Pr(A)}{\Pr(B)}\] Bayes’ law is particularly useful in evaluating evidence, because it allows us to restate one conditional probability in terms of another.

Both the law of total probability and Bayes’ law follow from the definition of conditional probabilities. They are easy to prove, but I won’t prove them here. Instead, I will use an example to show how they can be useful.

Example 3.13 False positives in medical testing

When someone is tested for a disease, the test comes back either “positive” (indicating that the person has the disease) or “negative” (indicating that the person does not have the disease). However, no test is perfect. Sometimes people who do not have the disease test positive (“false positives”) and sometimes people who do have the disease test negative (“false negatives”).

Let the event $T$ mean a particular patient tests positive for a disease, and let the event $D$ mean that this patient actually has the disease.

The sensitivity of the test is an infected patient’s probability of testing positive: \[\Pr(T|D) = p\] the specificity of the test is a healthy patient’s probability of testing negative: \[\Pr(T^c|D^c) = q\] and the prevalence of the infection is the probability that a given patient has the disease: \[\Pr(D) = d\] Suppose that a patient has tested positive. What is the probability that he has the disease, i.e. what is the value of $\Pr(D|T)$?

This is a classic probability question, as it makes use of Bayes’ law and the law of total probability, and it has obvious practical usage.

Since we want a conditional probability, we start by stating Bayes’ law: \[\Pr(D|T) = \frac{\Pr(T|D)\Pr(D)}{\Pr(T)}\] Bayes’ law will allow us to calculate $\Pr(D|T)$ if we can find the components of the right side of this equation. We already know that $\Pr(T|D)=p$ and $\Pr(D)=d$, so all we need is to find $\Pr(T)$.

Since $\Pr(T)$ is an unconditional probability, we can use the law of total probability: \[\Pr(T) = \underbrace{\Pr(T|D)}_{p} \underbrace{\Pr(D)}_{d} + \underbrace{\Pr(T|D^c)}_{1-q} \underbrace{\Pr(D^c)}_{1-d}\] where $\Pr(T|D^c) = 1 - \Pr(T^c|D^c) = 1-q$ and $\Pr(D^c) = 1 - \Pr(D) = 1-d$ by the complement rule. Plugging these results into our formula we get: \[\Pr(D|T) = \frac{pd}{pd + (1-q)(1-d)}\] which is the result we need.

Now, let’s try out some numbers. Suppose that false positives are rare ($q = 0.99$), and false negatives never happen ($p=1$).

Suppose the disease itself is fairly common ($d = 0.10$). Then: \[\Pr(D|T) = \frac{1*0.1}{1*0.1 + (1-0.99)*(1-0.1)} \approx 0.917\]
Suppose the disease itself is quite rare ($d = 0.001$). Then: \[\Pr(D|T) = \frac{1*0.001}{1*0.001 + (1-0.99)*(1-0.001)} \approx 0.091\]

In other words, the exact same test has a very different meaning depending on the prevalence in the population: when the disease is common a positive test means a 91.7% chance of having the disease, and when the disease is rare a positive test result means a 9.1% chance of having the disease.
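The formula for $\Pr(D|T)$ is simple enough to experiment with directly. The sketch below uses a hypothetical function name; the arguments `p`, `q`, and `d` correspond to the sensitivity, specificity, and prevalence defined above. It reproduces the two cases and makes it easy to try other prevalence values.

```python
def prob_disease_given_positive(p, q, d):
    """Pr(D|T) from sensitivity p, specificity q, and prevalence d."""
    # Law of total probability: Pr(T) = p*d + (1-q)*(1-d).
    pr_t = p * d + (1 - q) * (1 - d)
    # Bayes' law: Pr(D|T) = Pr(T|D) Pr(D) / Pr(T).
    return p * d / pr_t


for d in (0.10, 0.001):
    print(d, round(prob_disease_given_positive(p=1.0, q=0.99, d=d), 3))
# 0.1 0.917
# 0.001 0.091
```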

This general issue (even a small false positive rate can have a big impact when prevalence is low) appeared repeatedly in March and April of 2020. Several studies by well-known researchers² dramatically overestimated the early prevalence of the COVID-19 virus and thus dramatically underestimated its fatality rate. These studies were regularly cited as support by those who wanted to substantially relax public health restrictions in April 2020, and had substantial real world consequences.

Chapter review

In this chapter we have learned the basic terminology and concepts of probability. You may have seen a number of these terms and ideas in high school, but we are approaching them at a higher level. Be sure to review these terms and concepts in detail, and do the practice problems to test your knowledge.

Our next step is to take our general framework of outcomes and events and apply it to random variables - outcomes that are specifically numerical.

Practice problems

Answers can be found in the appendix.

Most of these practice problems will be based on the casino game of craps. Craps is played with a pair of 6-sided dice.

Players take turns rolling the dice, and the player currently rolling the dice is called the “shooter.” There are various bets - pass, don’t pass, come, don’t come, field, place, buy - that can be placed on the results of multiple rolls of the dice. These bets and their probability calculations can be quite complex, so we will focus on “single roll” bets.

A bet on “Snake Eyes” wins if the total showing on the dice is 2.
A bet on “Yo” wins if the total showing on the dice is 11.
A bet on “Boxcars” wins if the total showing on the dice is 12.
A bet on “Field” wins if the total showing on the dice is 2, 3, 4, 9, 10, 11, or 12.

For this example, assume that

One die is red and the other is white.
Both dice are fair; that is, each side has equal probability.
The dice are independent of one another.

An outcome for a single roll of the dice is a pair of numbers $(r,w)$ where $r$ is the number showing on the red die, and $w$ is the number showing on the white die. For example, an outcome $(2,4)$ means that the red die is showing 2 and the white die is showing 4.

SKILL #1: Define outcomes and sample space for a simple example

Let $\Omega$ be the sample space for the outcome of a single roll in craps.
1. Define $\Omega$ by enumeration.
2. Find the cardinality of $\Omega$.
Using enumeration, define the following events:
1. Yo wins
2. Snake eyes wins
3. Boxcars wins
4. Field wins

SKILL #2: Use set theory to work with events

Which of the following statements are true?
1. The events “Yo wins” and “Boxcars wins” are identical.
2. The events “Yo wins” and $(r,w) = (5,6)$ are identical.
3. The events “Boxcars wins” and $(r,w) = (6,6)$ are identical.
Which of the following statements are true?
1. The events “Yo wins” and “Boxcars wins” are disjoint.
2. The events “Yo wins” and “Field wins” are disjoint.
3. The events “Yo wins” and “Boxcars loses” are disjoint.
4. The events “Yo wins” and “Field loses” are disjoint.
Which of the following statements are true?
1. The event “Yo wins” implies the event “Boxcars wins.”
2. The event “Yo wins” implies the event “Boxcars loses.”
3. The event “Yo wins” implies the event “Field wins.”
4. The event “Yo wins” implies the event “Field loses.”
Which of the following are elementary events?
1. Yo wins.
2. Yo loses.
3. Boxcars wins.
4. Boxcars loses.
5. Field wins.
6. Field loses.

SKILL #3: Calculate event probabilities from elementary event probabilities

Calculate each of the following elementary event probabilities:
1. $(r,w) = (1,1)$
2. $(r,w) = (3,4)$
3. $(r,w) = (6,6)$
Find the probability of each of the following events:
1. A bet on Yo wins.
2. A bet on Snake eyes wins.
3. A bet on Boxcars wins.
4. A bet on Field wins.

SKILL #4: Calculate joint and conditional probabilities

Calculate each of the following joint probabilities:
1. $\Pr(\textrm{Yo wins} \cap \textrm{Boxcars wins})$
2. $\Pr(\textrm{Yo wins} \cap \textrm{Field wins})$
3. $\Pr(\textrm{Yo wins} \cap \textrm{Boxcars loses})$
Calculate each of the following conditional probabilities:
1. $\Pr(\textrm{Yo wins} | \textrm{Boxcars wins})$
2. $\Pr(\textrm{Yo wins} | \textrm{Field wins})$
3. $\Pr(\textrm{Yo wins} | \textrm{Boxcars loses})$
4. $\Pr(\textrm{Field wins} | \textrm{Yo wins})$
5. $\Pr(\textrm{Boxcars wins} | \textrm{Yo wins})$
Which of the following pairs of events are independent?
1. Yo wins and Boxcars wins.
2. Yo wins and Field wins.
3. Yo wins and Yo wins.
4. $r = 3$ and $r = 5$.
5. $r = 3$ and $w =5$.

SKILL #5: Apply the axioms of probability

Let $A$ be an event. Which of the following statements are true?
1. $\Pr(A) \geq 0$.
2. $\Pr(A) > 0$.
3. $\Pr(A) \leq 1$.
4. $\Pr(A) < 1$.
5. $\Pr(A^c) \geq 0$.
6. $\Pr(A^c) > 0$.
7. $\Pr(A^c) \leq 1$.
8. $\Pr(A^c) < 1$.
9. $\Pr(A^c) = 1 - \Pr(A)$.
Let $A$ and $B$ be two events. Which of the following statements are true?
1. $\Pr(A \cup B) = \Pr(A) + \Pr(B)$.
2. $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$.
3. $\Pr(A \cup B) \leq \Pr(A) + \Pr(B)$.
4. $\Pr(A \cap B) = \Pr(A)\Pr(B)$.
Let $A$ and $B$ be two disjoint events. Which of the following statements are true?
1. $\Pr(A \cap B) = 0$.
2. $\Pr(A \cap B) = \Pr(A) + \Pr(B)$.
3. $\Pr(A \cup B) = 0$.
4. $\Pr(A \cup B) = \Pr(A) + \Pr(B)$.
5. $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$.
6. $\Pr(A \cup B) \leq \Pr(A) + \Pr(B)$.
7. $\Pr(A \cap B) = \Pr(A)\Pr(B)$.
8. $\Pr(A | B) = 0$
Let $A$ and $B$ be two events such that $A \subset B$. Which of the following statements are true?
1. $\Pr(A) \leq \Pr(B)$
2. $\Pr(A \cap B) = \Pr(A)$
3. $\Pr(A | B) = 1$
Let $A$ and $B$ be two independent events. Which of the following statements are true?
1. $\Pr(A \cap B) = 0$.
2. $\Pr(A \cap B) = \Pr(A)\Pr(B)$.
3. $\Pr(A|B) = \Pr(A)$.

¹ You may wonder: if it makes more sense to describe independence in terms of conditional probabilities, why do we define it in terms of joint probabilities? The key is the requirement that the events have nonzero probability. When $B$ has zero probability the conditional probability $\Pr(A|B)$ is not well defined since its denominator is zero.
² If you are interested in learning more about this, an article in Science provides an overview of the controversy, and a blog post by statistician Andrew Gelman provides a thorough discussion of the statistical issues.