Chapter 2 The Language of Probability and Simulation

A phenomenon is random if there are multiple potential outcomes, and there is uncertainty about which outcome will occur. This chapter introduces the fundamental terminology and objects of random phenomena, including

Possible outcomes of the random phenomenon
Related events that could occur
Random variables which measure numeric quantities based on outcomes
Probability measures which assign likelihoods to events in a logically coherent way and reflect assumptions about the random phenomenon
Distributions of random variables which describe their pattern of variability, and can be summarized by percentiles, expected values, standard deviations (and variances), and correlations (and covariances)
Conditioning, which involves revising probabilities and distributions to reflect new information
Probability models which put it all together

A probability model of a random phenomenon consists of a sample space of possible outcomes, associated events and random variables, and a probability measure which specifies probabilities of events and determines distributions of random variables according to the assumptions of the model. We will see techniques for computing probabilities, distributions, and related characteristics, but in many situations explicit computation is difficult. In this chapter we will also introduce simulation, a powerful tool for investigating probability models and solving complex problems.

Simulation involves using a probability model to artificially recreate a random phenomenon, usually using a computer. Given a probability model, we can simulate outcomes, occurrences of events, and values of random variables, according to the specifications of the probability measure. Simulation can be used to approximate probabilities of events, distributions of random variables, long-run averages, and other characteristics.

Throughout this chapter we will illustrate ideas with three examples.

Example 2.1 (Total or best?) In many sports, a competitor’s final ranking is based on the results of multiple attempts. Competitors in Olympic bobsled, for example, make four separate timed runs on the same course and their ranking is based on their total time. Competitors in Olympic shot put make six throws, but their ranking is based on their best throw. In sports with multiple attempts, how do the rankings compare if they are based on the total (or average) over all attempts (as in bobsled) or on the best attempt (as in shot put)?

We’ll start to investigate this idea with an overly simplified example: roll a four-sided²² die twice and consider the sum and the larger of the two rolls (or the common roll if a tie).
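
The dice example can be simulated in a few lines. The following is a minimal Python sketch, not a definitive implementation. It assumes a fair four-sided die and independent rolls, neither of which is stated above, and the names `simulate_two_rolls` and `relative_frequencies` are hypothetical, introduced only for illustration. Each repetition records the sum and the larger of the two rolls, and the relative frequencies over many repetitions approximate the distributions of these two quantities.

```python
import random
from collections import Counter


def simulate_two_rolls(n_reps, seed=None):
    """Simulate n_reps repetitions of rolling a four-sided die twice.

    Assumes a fair die and independent rolls (an assumption of this sketch).
    Returns a list of (sum, larger roll) pairs, one pair per repetition.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_reps):
        first = rng.randint(1, 4)   # randint includes both endpoints
        second = rng.randint(1, 4)
        results.append((first + second, max(first, second)))
    return results


def relative_frequencies(values):
    """Return {value: proportion of repetitions with that value}."""
    counts = Counter(values)
    n = len(values)
    return {value: counts[value] / n for value in sorted(counts)}


results = simulate_two_rolls(10000, seed=2)
sum_freq = relative_frequencies([s for s, _ in results])
max_freq = relative_frequencies([m for _, m in results])

print("Approximate distribution of the sum:", sum_freq)
print("Approximate distribution of the larger roll:", max_freq)
```

Increasing the number of repetitions should bring the relative frequencies closer to the probabilities implied by the assumed model.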

Example 2.2 (Matching problem) The “matching problem” is a well-known probability problem. The general setup involves $n$ distinct objects labeled $1, \ldots, n$ which are placed in $n$ distinct boxes labeled $1, \ldots, n$, with exactly one object placed in each box. The problem appears in many contexts; the following version is from FiveThirtyEight.

A geology museum in California has $n$ different rocks sitting in a row on a shelf, with labels on the shelf telling what type of rock each is. An earthquake hits and the rocks all fall off the shelf and get mixed up. A janitor comes in and, wanting to clean the floor, puts the rocks back on the shelf in random order. We might be interested in things like whether all the rocks are put back in the correct spot, or how many are, or if the heaviest rock is put back in the correct spot.
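
One way to explore these questions is by simulation. The sketch below is illustrative only and rests on assumptions not stated in the problem: "random order" is taken to mean that every arrangement of the rocks is equally likely, the number of rocks is set to 4, and rock 1 is arbitrarily designated the heaviest. The names `shuffle_rocks` and `count_matches` are hypothetical.

```python
import random


def shuffle_rocks(n, rng):
    """Return a random placement: placement[i] is the rock put in spot i + 1.

    Assumes all n! arrangements are equally likely (an assumption of this sketch).
    """
    placement = list(range(1, n + 1))
    rng.shuffle(placement)
    return placement


def count_matches(placement):
    """Count the rocks that were put back in their correct spot."""
    return sum(rock == spot for spot, rock in enumerate(placement, start=1))


n_rocks = 4        # hypothetical value chosen for illustration
n_reps = 10000
rng = random.Random(1)

placements = [shuffle_rocks(n_rocks, rng) for _ in range(n_reps)]
match_counts = [count_matches(p) for p in placements]

# Proportion of repetitions in which every rock is in the correct spot
prop_all_correct = sum(m == n_rocks for m in match_counts) / n_reps
# Long-run average number of rocks in the correct spot
mean_matches = sum(match_counts) / n_reps
# Proportion of repetitions in which rock 1 (assumed heaviest) is in spot 1
prop_heaviest_correct = sum(p[0] == 1 for p in placements) / n_reps

print("Approximate P(all rocks correct):", prop_all_correct)
print("Approximate average number correct:", mean_matches)
print("Approximate P(heaviest rock correct):", prop_heaviest_correct)
```

Changing `n_rocks` lets the same sketch be reused for other values of $n$.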

Example 2.3 (Meeting problem) Regina and Cady plan to meet for lunch but they are not sure of their arrival times. We might be interested in questions involving whether they arrive within 15 minutes of one another, who arrives first and at what time, or how long the first person to arrive needs to wait for the second.
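
The example does not specify how the arrival times behave, so any simulation needs a model. The sketch below makes one possible set of assumptions, chosen purely for illustration: each person arrives at a time uniformly distributed over a 60-minute window, independently of the other. The function name `simulate_arrivals` and the 60-minute window are hypothetical; different assumptions would give different answers.

```python
import random


def simulate_arrivals(n_reps, window=60.0, seed=None):
    """Simulate (Regina, Cady) arrival times in minutes after the window opens.

    Assumes independent, uniformly distributed arrival times over the window;
    this is an assumption of the sketch, not part of the problem statement.
    """
    rng = random.Random(seed)
    return [(rng.uniform(0, window), rng.uniform(0, window))
            for _ in range(n_reps)]


arrivals = simulate_arrivals(10000, seed=3)

# Waiting time of the first person to arrive, in minutes
waits = [abs(regina - cady) for regina, cady in arrivals]

# Proportion of repetitions in which they arrive within 15 minutes of each other
prop_within_15 = sum(w <= 15 for w in waits) / len(waits)
# Proportion of repetitions in which Regina arrives first
prop_regina_first = sum(regina < cady for regina, cady in arrivals) / len(arrivals)
# Long-run average waiting time of the first person to arrive
mean_wait = sum(waits) / len(waits)

print("Approximate P(arrive within 15 minutes):", prop_within_15)
print("Approximate P(Regina arrives first):", prop_regina_first)
print("Approximate average wait (minutes):", mean_wait)
```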

Why four-sided? Simply to make the number of possibilities a little more manageable (e.g., for in-class simulation activities). Rolling a four-sided die twice yields 16 possible pairs, while rolling a six-sided die twice yields 36 possible pairs.