2.3 Random variables

Statisticians use the terms observational unit and variable. Observational units are the people, places, things, etc., for which data are observed. Variables are the measurements made on the observational units. For example, the observational units in a study could be college students, while variables could be high school GPA, college GPA, SAT score, years in college, number of Statistics courses taken, etc.

In probability, an outcome of a random phenomenon plays a role analogous to an observational unit in statistics. The sample space of outcomes is often only vaguely defined. In many situations we are less interested in detailing the outcomes themselves and more interested in whether or not certain events occur, or in measurements that we can make on the outcomes. For example, if the random phenomenon corresponds to selecting a random sample of students at a college, an outcome could be the list of students selected for the sample. But we are less interested in who the students are, and more interested in questions which involve variables, such as: What is the distribution of SAT scores? What is the relationship between high school GPA and college GPA? What is the average number of years before graduation?

Roughly, a random variable assigns a number to each outcome of a random phenomenon. More precisely:

Definition 2.3 A random variable (RV) \(X\) is a function that takes an outcome in the sample space as input and returns a real number as output; that is, \(X:\Omega \mapsto \mathbb{R}\). Random variables are typically denoted by capital letters near the end of the alphabet, with or without subscripts: e.g. \(X\), \(Y\), \(Z\), or \(X_1\), \(X_2\), \(X_3\), etc. The value that the random variable \(X\) assigns to the outcome \(\omega\) is denoted \(X(\omega)\).

Example 2.14 Flip a coin 4 times, and record the result of each trial in sequence. For example, HTTH means heads on the first and last trials and tails on the second and third. One random variable is \(X\), the number of heads flipped.

  1. Explain why \(X\) is a random variable.
  2. Evaluate each of the following: \(X(HHHH), X(HTHT), X(TTHH)\).
  3. Identify the possible values of \(X\). Why not let the sample space just consist of this set of possible values?

Solution to Example 2.14

  1. \(X\) maps each outcome to a number via the function “count the number of heads”.
  2. \(X(HHHH) = 4, X(HTHT) = 2, X(TTHH) = 2\).
  3. The possible values of \(X\) are \(0, 1, 2, 3, 4\). You might ask: if we only care about the number of heads, why bother with the coin flip sequence at all? That is, why not define the sample space as \(\{0, 1, 2, 3, 4\}\) rather than \(\{HHHH, HHHT, HHTH, \ldots\}\)? The main reason is that we often want to define many random variables on the same sample space, and study the relationships between them. For example, if a sample space outcome were just the number of heads, we would not be able to obtain information about the length of the longest run of heads in a row, or the proportion of heads on trials that follow heads. Moreover, we would not be able to study the relationship between random variables unless they were defined on a common sample space. As a statistics analogy, you would not be able to study the relationship between SAT scores and college GPA unless you measured both variables for the same set of observational units. (Another, less important, reason is that it is often convenient to work with probability spaces in which the outcomes are equally likely, as is the case for \(\{HHHH, HHHT, HHTH, \ldots\}\) but not for \(\{0, 1, 2, 3, 4\}\), for four flips of a fair coin.)
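
The idea that a random variable is a function on the sample space can be made concrete in code. The following is a minimal Python sketch, not part of the original example: the names `omega_space` and `longest_run` are our own, and outcomes are represented as strings such as `"HTTH"`. It defines two random variables on the same sample space of flip sequences, which would not be possible if outcomes were just the numbers 0 through 4.

```python
from itertools import product

# Sample space: all 16 sequences of four flips, written as strings like "HTTH".
omega_space = ["".join(flips) for flips in product("HT", repeat=4)]

def X(omega):
    """Random variable: number of heads in the outcome omega."""
    return omega.count("H")

def longest_run(omega):
    """A second random variable on the same space: longest run of heads."""
    best = current = 0
    for flip in omega:
        current = current + 1 if flip == "H" else 0
        best = max(best, current)
    return best

print(X("HHHH"), X("HTHT"), X("TTHH"))      # 4 2 2
print(sorted({X(w) for w in omega_space}))  # [0, 1, 2, 3, 4]
print(X("HHTH"), longest_run("HHTH"))       # 3 2
```

Both functions take the same kind of input (a flip sequence), so their values can be compared outcome by outcome.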

The RV itself is typically denoted with a capital letter (\(X\)); possible values of that random variable are denoted with lower case letters (\(x\)). Think of the capital letter \(X\) as a label standing in for a formula like “the number of heads in 4 flips of a coin” and \(x\) as a dummy variable standing in for a particular value like 3.

We are often interested in events which involve random variables. For example, the expressions \(X=x\) or \(\{X=x\}\) are shorthand for the event \(\{\omega\in\Omega: X(\omega)=x\}\), the set of outcomes \(\omega\) for which \(X(\omega)=x\). Remember that any event is a subset of the sample space. So objects like \(\{X=x\}\) are subsets[12] of \(\Omega\).

Example 2.15 Flip a coin 4 times, and record the result of each trial in sequence. For example, HTTH means heads on the first and last trials and tails on the second and third. One random variable is \(X\), the number of heads flipped.
  1. Identify and interpret the event \(\{X=3\}\). That is, identify the outcomes \(\omega\) for which \(X(\omega)=3\).
  2. Identify and interpret the event \(\{X=4\}\).
  3. Identify and interpret the event \(\{X\ge3\}\). How is this event related to the previous two?
Solution to Example 2.15
  1. \(\{X=3\} = \{HHHT, HHTH, HTHH, THHH\}\) is the event that exactly 3 of the flips land on heads. This is an event because it is a subset of the sample space.
  2. \(\{X=4\}= \{HHHH\}\), the event that exactly 4 of the flips land on heads. Notice that the event is the set \(\{HHHH\}\), which consists of the single outcome \(HHHH\).
  3. \(\{X\ge3\} = \{HHHT, HHTH, HTHH, THHH, HHHH\}\) is the event that at least 3 of the flips land on heads. Also \(\{X\ge 3\} = \{X=3\}\cup \{X=4\}\).
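
Events such as \(\{X=3\}\) are sets of outcomes, so they can be built by filtering the sample space. The sketch below assumes outcomes are stored as strings like `"HHHT"`; the helper `event` is a hypothetical name introduced only for illustration.

```python
from itertools import product

omega_space = ["".join(flips) for flips in product("HT", repeat=4)]

def X(omega):
    """Number of heads in the outcome omega."""
    return omega.count("H")

def event(condition):
    """The set of outcomes omega for which condition(omega) is true."""
    return {omega for omega in omega_space if condition(omega)}

X_eq_3 = event(lambda w: X(w) == 3)
X_eq_4 = event(lambda w: X(w) == 4)
X_ge_3 = event(lambda w: X(w) >= 3)

print(sorted(X_eq_3))             # ['HHHT', 'HHTH', 'HTHH', 'THHH']
print(X_eq_4)                     # {'HHHH'}
print(X_ge_3 == X_eq_3 | X_eq_4)  # True
```

The last line checks the set identity \(\{X\ge 3\} = \{X=3\}\cup \{X=4\}\) using Python's set union operator `|`.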

Recall that for a mathematical function[13] \(g\), given an input \(u\), the function returns a real number \(g(u)\). For example, if \(g(u) = u^2\) then \(g(3) = 9\). If the input comes from some set \(S\) (i.e. \(u\in S\)), we often write \(g:S\mapsto \mathbb{R}\).

Likewise, a random variable \(X\) is a function which maps each outcome \(\omega\) in the sample space \(\Omega\) to a real number \(X(\omega)\); \(X:\Omega\mapsto\mathbb{R}\). For a single outcome \(\omega\), the value \(x = X(\omega)\) is a single number; notice that \(x\) represents the output of the function \(X\) rather than the input. However, it is important to remember that the RV \(X\) itself is a function, and not a single number.

You are probably familiar with functions which have simple closed form formulas of their arguments: \(g(u)=5u\), \(g(u)=u^2\), \(g(u)=e^u\), etc. While any random variable is some function, the function is rarely specified as an explicit mathematical formula of its input \(\omega\). Often, outcomes are not even numbers (e.g., sequences of coin flips), or are only vaguely specified if at all (e.g., possible paths of a hurricane). In the coin flip example, we defined \(X\) only through the words “number of flips that land on heads”; translating even this simple situation into a formula of \(\omega\) requires some notation[14].

It is more appropriate to think of a RV as a function in the sense of a scale at a grocery store which maps a fruit to its weight, \(X: \text{fruit}\mapsto\text{weight}\). Put an apple on the scale and the scale returns a number, \(X(\text{apple})\), the weight of the apple. Likewise, \(X(\text{orange})\), \(X(\text{banana})\). The RV \(X\) is the scale itself. (This simplistic analogy assumes a sample space outcome is a single fruit. Of course, it’s even more complicated in reality since an outcome can be considered a set of fruits, so that we have for example \(X(\{\text{2 apples}, \text{3 oranges}\})\), and not all fruits weigh the same, so that \(X(\text{this apple})\) is not the same as \(X(\text{that apple})\).)

The RV itself is denoted with a capital letter; possible values of that random variable are denoted with lower case letters. For example \(\{X=x\}\) is shorthand for the event \(\{\omega\in\Omega: X(\omega)=x\}\), the set of outcomes \(\omega\) for which \(X(\omega)=x\). Think of the capital letter \(X\) as a label standing in for a formula like “the number of heads in 4 flips of a coin” and \(x\) as a dummy variable standing in for a particular value like 3. Remember that any event is a subset of the sample space. So objects like \(\{X=x\}\) are subsets[15] of \(\Omega\). In the scale analogy, if \(X\) is weight measured in pounds, we might have \(\{X>5\}=\{\text{watermelon}, \text{pineapple}\}\), the set of fruits that weigh more than 5 pounds.

Example 2.16 Roll a four-sided die twice, and record the result of each roll in sequence. For example, the outcome \((3, 1)\) represents a 3 on the first roll and a 1 on the second; this is not the same outcome as \((1, 3)\). Let \(X\) be the sum of the two dice, and let \(Y\) be the larger of the two rolls (or the common value if both rolls are the same).
  1. Evaluate \(X((1, 3))\), \(X((4, 3))\), and \(X((2,2))\).
  2. Evaluate \(Y((1, 3))\), \(Y((4, 3))\), and \(Y((2,2))\).
  3. Identify and interpret \(\{X = 4\}\).
  4. Identify and interpret \(\{X = 3\}\).
  5. Identify and interpret \(\{X \le 3\}\).
  6. Identify and interpret \(\{Y = 4\}\).
  7. Identify and interpret \(\{Y = 3\}\).
  8. Identify and interpret \(\{Y \le 3\}\).
  9. Identify and interpret \(\{X = 4, Y = 3\}\) (that is, \(\{X = 4\}\cap \{Y = 3\}\)).
  10. Identify and interpret \(\{X = 4, Y \le 3\}\).
  11. Identify and interpret \(\{X = 3, Y = 3\}\).
  12. Identify and interpret \(\{X \ge Y\}\).
  13. Identify the possible values of \(X\).
  14. Identify the possible values of \(Y\).
  15. Identify the possible values of the pair \((X, Y)\).
Solution to Example 2.16
  1. \(X\) is the sum of the two rolls, so \(X((1, 3))=4\), \(X((4, 3))=7\), and \(X((2,2))=4\).
  2. \(Y\) is the larger of the two rolls (or the common value if a tie) so \(Y((1, 3))=3\), \(Y((4, 3))=4\), and \(Y((2,2))=2\).
  3. \(\{X = 4\} =\{(1, 3), (2, 2), (3, 1)\}\) is the event that the sum of the two dice is 4.
  4. \(\{X = 3\} =\{(1, 2), (2, 1)\}\) is the event that the sum of the two dice is 3.
  5. \(\{X \le 3\}=\{(1, 1), (1, 2), (2, 1)\}\) is the event that the sum of the two dice is at most 3.
  6. \(\{Y = 4\}=\{(1, 4), (2, 4), (3, 4), (4, 4), (4, 1), (4, 2), (4,3)\}\) is the event that the larger of the two rolls is 4.
  7. \(\{Y = 3\}=\{(1, 3), (2, 3), (3, 3), (3, 1), (3, 2)\}\) is the event that the larger of the two rolls is 3.
  8. \(\{Y \le 3\}=\{(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)\}\) is the event that the larger of the two rolls is at most 3. Notice that since in this example \(Y\) can only take values 1, 2, 3, 4, we have \(\{Y\le 3\} = \{Y=4\}^c\).
  9. \(\{X = 4, Y = 3\} \equiv \{X = 4\}\cap \{Y = 3\}=\{(1, 3), (3, 1)\}\) is the event that both the sum of the two dice is 4 and the larger of the two rolls is 3. Even though this involves two random variables, it is a single event (that is, a single subset of the sample space). There are only two outcomes for which both the sum of the two dice is 4 and the larger of the two dice is 3.
  10. \(\{X = 4, Y \le 3\} \equiv \{X = 4\}\cap \{Y \le 3\}=\{(1, 3), (2, 2), (3, 1)\}\) is the event that both the sum of the two dice is 4 and the larger of the two rolls is at most 3. Notice that since in this example \(\{X=4\} \subset \{Y\le 3\}\), we have \(\{X = 4, Y \le 3\} = \{X=4\}\).
  11. \(\{X = 3, Y = 3\} \equiv \{X = 3\}\cap \{Y = 3\}=\emptyset\), since there are no outcomes for which both the sum is 3 and the larger of the two dice is 3. (If the larger of the two dice is 3, then the sum must be at least 4.)
  12. The event \(\{X\ge Y\}\) represents the set of outcomes \(\{\omega: X(\omega) \ge Y(\omega)\}\). In this example, for every possible outcome the sum of the two dice is at least as large as the larger of the two dice, so \(\{X \ge Y\} = \Omega\).
  13. The possible values of \(X\) are \(\{2, 3, \ldots, 8\}\).
  14. The possible values of \(Y\) are \(\{1, 2, 3, 4\}\).
  15. The possible values of the pair \((X, Y)\) are \(\{(2, 1), (3, 2), (4, 2), (4, 3), (5, 3), (5, 4), (6, 3), (6, 4), (7, 4), (8,4)\}\). Notice that while, for example, 8 is a possible value of \(X\) and 1 is a possible value of \(Y\), (8, 1) is not a possible value of the pair \((X, Y)\); it’s not possible for the larger of the two dice to be 1 but their sum to be 8.
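
The events and possible values in this example can be checked by brute-force enumeration. The following is a sketch under our own representation choices: outcomes are Python tuples such as `(3, 1)`, and `event` is a hypothetical helper, not notation from the text.

```python
from itertools import product

# Sample space: the 16 ordered pairs of rolls of a four-sided die.
omega_space = list(product(range(1, 5), repeat=2))

def X(omega):
    """Sum of the two rolls."""
    return omega[0] + omega[1]

def Y(omega):
    """Larger of the two rolls (or the common value if they are equal)."""
    return max(omega)

def event(condition):
    """The set of outcomes for which condition(omega) is true."""
    return {w for w in omega_space if condition(w)}

print(sorted(event(lambda w: X(w) == 4 and Y(w) == 3)))   # [(1, 3), (3, 1)]
print(event(lambda w: X(w) == 3 and Y(w) == 3))           # set()
print(event(lambda w: X(w) >= Y(w)) == set(omega_space))  # True

# Possible values of the pair (X, Y): 10 pairs, not all 7 * 4 combinations.
pairs = sorted({(X(w), Y(w)) for w in omega_space})
print(pairs)
```

Because both `X` and `Y` take the same outcome as input, a joint event like \(\{X=4, Y=3\}\) is a single filter on a single sample space.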

When dealing with probabilities, it is common to write \(\textrm{P}(X=3)\) instead of \(\textrm{P}(\{X=3\})\), and \(\textrm{P}(X = 4, Y = 3)\) instead of \(\textrm{P}(\{X = 4\}\cap \{Y = 3\})\); read the comma in \(\textrm{P}(X = 4, Y = 3)\) as “and”. But keep in mind that an expression like “\(X=3\)” really represents an event \(\{X=3\}\), an expression which itself represents \(\{\omega\in\Omega: X(\omega) = 3\}\), a subset of \(\Omega\).

Example 2.17 Customers enter a deli and take a number to mark their place in line. When the deli opens the counter starts at 0; the first customer to arrive takes number 1, the second 2, etc. We record the counter over time, continuously, as it changes as customers arrive. Time is measured in minutes after the deli opens (time 0). A sample space outcome could be represented as a path of the number of customers over time; a few such paths are illustrated in Figure 2.3. Notice the stairstep feature: a customer arrives and takes a number, then the counter stays on that number for some time (the flat spots) until another customer arrives and the counter increases by one (the jumps).

There are many random variables that could be of interest, including:

  • \(N_t\), the number of customers that have arrived by time \(t\), where \(t\ge0\) is minutes after time 0
  • \(T_j\), the time (in minutes after time 0) at which the \(j\)th customer arrives, for \(j=1, 2, \ldots\)
  • \(W_j\), the “waiting” time (in minutes) between the arrivals of the \((j-1)\)th and the \(j\)th customers; for \(j=1\), the time from opening until the first customer arrives.

Figure 2.3: Sample space outcomes for Example 2.17. Left: a single sample path of the number of customer arrivals over time. Right: several possible paths.

For the outcome represented by the path in the plot on the left in Figure 2.3, identify (as best as you can from the plot) the value of the following random variables.

  1. \(N_4\)
  2. \(N_{6.5}\)
  3. \(T_4\)
  4. \(T_5\)
  5. \(W_1\)
  6. \(W_5\)
Solution to Example 2.17
  1. For this outcome \(N_4=3\); the number of customers that have arrived by time 4 is 3.
  2. For this outcome \(N_{6.5}=5\); the number of customers that have arrived by time 6.5 is 5. The number of customers is a whole number, but time is measured continuously (e.g., 6.5 minutes after opening).
  3. For this outcome \(T_4\approx 5.1\). The path jumps to 4 a little after time 5, so the time at which the fourth customer arrives (when the counter jumps to 4) is about 5.1 minutes after open.
  4. For this outcome \(T_5\approx 5.9\). The path jumps to 5 a little before time 6, so the time at which the fifth customer arrives (when the counter jumps to 5) is about 5.9 minutes after open.
  5. For this outcome \(W_1\approx 2\). \(W_1\) is the waiting time from open until the first customer arrives, which seems to happen at about time 2 (when the counter jumps to 1).
  6. For this outcome \(W_5\approx 0.8\). \(W_5\) is the time elapsed between the arrival of the fourth (at time 5.1) and fifth (at time 5.9) customers, which is about 0.8 minutes.
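
One way to see that \(N_t\), \(T_j\), and \(W_j\) are all functions of the same outcome is to store a single sample path as its list of arrival times and compute each random variable from that list. In the sketch below, the arrival times are hypothetical: the first, fourth, and fifth values echo the approximate readings in the solution above, while the second and third (2.9 and 3.6) are made up for illustration and are not read from Figure 2.3.

```python
from bisect import bisect_right

# One outcome (sample path), stored as sorted arrival times in minutes.
# Hypothetical values: 2.9 and 3.6 are invented for this illustration.
arrival_times = [2.0, 2.9, 3.6, 5.1, 5.9]

def N(t):
    """Number of customers that have arrived by time t (inclusive)."""
    return bisect_right(arrival_times, t)

def T(j):
    """Arrival time of the j-th customer, j = 1, 2, ..."""
    return arrival_times[j - 1]

def W(j):
    """Time between the (j-1)-th and j-th arrivals; W(1) is measured from opening."""
    previous = 0.0 if j == 1 else T(j - 1)
    return T(j) - previous

print(N(4), N(6.5))          # 3 5
print(T(4), T(5))            # 5.1 5.9
print(W(1), round(W(5), 1))  # 2.0 0.8
```

A different outcome corresponds to a different list of arrival times; the functions `N`, `T`, and `W` stay the same.
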
Example 2.18 (Matching problem) So-called “matching problems” concern the following generic scenario. A set of \(n\) cards labeled \(1, 2, \ldots, n\) are placed in \(n\) boxes labeled \(1, 2, \ldots, n\), with exactly one card in each box. Typical questions of interest involve whether the number of a card matches the number of the box in which it is placed. (More colorful descriptions include returning babies at random to mothers or placing rocks at random back on a museum shelf.) Consider the case \(n=4\). Let \(Y\) be the number of cards (out of 4) which match the number of the box in which they are placed. For \(j=1, 2, 3, 4\), let \(I_j=1\) if card \(j\) is placed in box \(j\), and let \(I_j=0\) otherwise.
  1. Evaluate \(Y(1234)\), \(Y(1243)\), and \(Y(2143)\).
  2. Evaluate \(I_1(1234)\), \(I_1(1243)\), and \(I_1(2143)\).
  3. Evaluate \(I_2(1234)\), \(I_2(1243)\), and \(I_2(2143)\).
  4. Evaluate \(I_3(1234)\), \(I_3(1243)\), and \(I_3(2143)\).
  5. Evaluate \(I_4(1234)\), \(I_4(1243)\), and \(I_4(2143)\).
  6. Identify and interpret \(\{Y=4\}\).
  7. Identify and interpret \(\{Y=0\}\).
  8. Identify and interpret \(\{Y=3\}\).
  9. Identify and interpret \(\{I_1=1\}\).
  10. Identify and interpret \(\{I_3=1\}\).
  11. What is the relationship between \(Y\) and the \(I_j\)’s?
Solution to Example 2.18
  1. \(Y\) counts the number of matches so \(Y(1234)=4\) (all numbers match), \(Y(1243)=2\) (1 and 2 match, 3 and 4 do not), and \(Y(2143)=0\) (no numbers match).
  2. \(I_1=1\) only if card 1 is in the first position, so \(I_1(1234)=1\), \(I_1(1243)=1\), and \(I_1(2143)=0\).
  3. \(I_2=1\) only if card 2 is in the second position, so \(I_2(1234)=1\), \(I_2(1243)=1\), and \(I_2(2143)=0\).
  4. \(I_3=1\) only if card 3 is in the third position, so \(I_3(1234)=1\), \(I_3(1243)=0\), and \(I_3(2143)=0\).
  5. \(I_4=1\) only if card 4 is in the fourth position, so \(I_4(1234)=1\), \(I_4(1243)=0\), and \(I_4(2143)=0\).
  6. \(\{Y=4\}=\{1234\}\) is the event that all 4 cards match. Notice that the event is the set \(\{1234\}\), which consists of the single outcome \(1234\).
  7. \(\{Y=0\}=\{2143, 2341, 2413, 3142, 3412, 3421, 4123, 4312, 4321\}\) is the event that none of the cards match.
  8. \(\{Y=3\}=\emptyset\) is the event in which exactly 3 cards match their box. There are no outcomes in which exactly 3 cards match; if three cards match, then the fourth card must necessarily match too.
  9. \(\{I_1=1\}=\{1234, 1243, 1324, 1342, 1423, 1432\}\) is the event that card 1 is placed in box 1. Since our sample space consists of the placements of each of the cards, each event must be expressed in terms of these outcomes.
  10. \(\{I_3=1\}=\{1234, 1432, 2134, 2431, 4132, 4231\}\) is the event that card 3 is placed in box 3. Since our sample space consists of the placements of each of the cards, each event must be expressed in terms of these outcomes.
  11. \(Y=I_1+I_2+I_3+I_4\). \(Y\) represents the total count of matches. The sum \(I_1+I_2+I_3+I_4\) starts with the first box and adds 1 to the count if there is a match (and 0 otherwise), and continues for each of the boxes, resulting in the total count of matches. For example, \(Y(1243) = 2 = 1 + 1 + 0 + 0 = I_1(1243)+I_2(1243)+I_3(1243)+I_4(1243)\). Notice that all the random variables are defined on the same probability space; that is, they have the same inputs. We expand on this idea further in the next subsection.
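
The relationship \(Y=I_1+I_2+I_3+I_4\) can be verified over the whole sample space. This is a sketch under an assumed representation: an outcome is a string such as `"2143"` whose \(j\)th character is the card placed in box \(j\). The function `I(j)` is our own construction that returns the indicator \(I_j\) as a Python function.

```python
from itertools import permutations

# Sample space: the 24 placements of cards 1-4 into boxes 1-4.
omega_space = ["".join(p) for p in permutations("1234")]

def I(j):
    """Return the indicator random variable I_j as a function of the outcome."""
    return lambda omega: 1 if omega[j - 1] == str(j) else 0

def Y(omega):
    """Number of matches, computed as the sum of the four indicators."""
    return sum(I(j)(omega) for j in range(1, 5))

print(Y("1234"), Y("1243"), Y("2143"))        # 4 2 0
print(len([w for w in omega_space if Y(w) == 0]))  # 9
print([w for w in omega_space if Y(w) == 3])  # []
print(sorted(w for w in omega_space if I(3)(w) == 1))
```

The last line lists the six outcomes in \(\{I_3=1\}\); the empty list confirms that exactly three matches is impossible.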

Random variables that only take two possible values, 0 and 1, (like \(I_1\) in the previous example) have a special name.

Definition 2.4 An indicator (a.k.a. Bernoulli) RV can take only the values 0 or 1. If \(A\) is an event then the indicator RV \(\textrm{I}_A\) is defined as \[ \textrm{I}_A(\omega) = \begin{cases} 1, & \omega \in A,\\ 0, & \omega \notin A \end{cases} \]

That is, \(\textrm{I}_A\) equals 1 if event \(A\) occurs, and \(\textrm{I}_A\) equals 0 if event \(A\) does not occur. Indicators provide the bridge between events and random variables. While simple, they are very useful. For example, representing a count as a sum of indicator RVs as in Example 2.18 is a very common and useful strategy.
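
As a small illustration of Definition 2.4, an indicator can be built directly from an event. The sketch below uses the four-coin-flip sample space with outcomes as strings; `indicator` is a hypothetical helper name, and the events `A[i]` ("flip \(i+1\) lands on heads") are introduced only for this example.

```python
from itertools import product

omega_space = ["".join(flips) for flips in product("HT", repeat=4)]

def indicator(A):
    """Indicator random variable of the event A (a set of outcomes)."""
    return lambda omega: 1 if omega in A else 0

# A[i] is the event that flip i (0-based) lands on heads.
A = [{w for w in omega_space if w[i] == "H"} for i in range(4)]
I = [indicator(A_i) for A_i in A]

def X(omega):
    """Number of heads, written as a sum of indicators."""
    return sum(I_i(omega) for I_i in I)

print([I_i("HTTH") for I_i in I])  # [1, 0, 0, 1]
print(X("HTTH"))                   # 2
```

Each indicator answers a yes/no question about the outcome, and adding them up produces the count.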

2.3.1 Transformations of random variables

A function of a random variable is also a random variable. That is, if \(X\) is a random variable and \(g:\mathbb{R}\mapsto\mathbb{R}\) is a function, then \(Y=g(X)\) is a random variable[16].

For example, suppose a sample space consists of a set of circles of various sizes. If \(X\) is a random variable representing the radius of a circle selected from this sample space, then \(Y = \pi X^2\) is a random variable representing the area of a circle selected from this sample space. Here we can write \(Y=g(X)\) with \(g(u) = \pi u^2\).
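
In code, a transformation is just function composition: apply \(X\) to the outcome, then apply \(g\) to the result. The sketch below assumes a small hypothetical sample space of three labeled circles; the labels and radii are made up for illustration.

```python
import math

# Hypothetical sample space: circles identified by a label, with known radii.
radii = {"a": 1.0, "b": 2.0, "c": 0.5}

def X(omega):
    """Radius of the selected circle."""
    return radii[omega]

def g(u):
    """Area of a circle with radius u."""
    return math.pi * u ** 2

def Y(omega):
    """Area of the selected circle: Y = g(X), the composition of g and X."""
    return g(X(omega))

print(X("b"))            # 2.0
print(round(Y("b"), 2))  # 12.57
```

Note that `Y` takes an outcome (a circle) as input, not a radius; the radius only appears as an intermediate value.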

Sums and products, etc., of random variables defined on the same probability space are random variables. That is, if random variables \(X\) and \(Y\) are defined on the same probability space then \(X+Y\), \(X-Y\), \(XY\), and \(X/Y\) (where \(Y\neq 0\)) are also random variables. Similarly, it is possible to make comparisons such as \(X\ge Y\) for random variables defined on the same probability space. The following example emphasizes what we mean by “defined on the same probability space”.

Example 2.19 Roll a four-sided die twice, and record the result of each roll in sequence. For example, the outcome \((3, 1)\) represents a 3 on the first roll and a 1 on the second; this is not the same outcome as \((1, 3)\). Let \(X_1\) be the result of the first roll, and \(X_2\) the result of the second.
  1. Evaluate \(X_1((3, 1))\) and \(X_2((3,1))\).
  2. If an outcome is represented by \(\omega=(\omega_1, \omega_2)\) (e.g. (3, 1)), specify the functions that define the random variables \(X_1\) and \(X_2\).
  3. Is \(X=X_1 + X_2\) a valid random variable? If so, identify the function that defines it.
  4. Evaluate \(X((3, 1))\).
Solution to Example 2.19
  1. \(X_1((3, 1))=3\) and \(X_2((3,1))=1\).
  2. Recall that there is a single sample space corresponding to the pairs of rolls, rather than a separate sample space for each of the individual rolls. Therefore, random variables need to be defined on the sample space corresponding to pairs of rolls. \(X_1((\omega_1, \omega_2))=\omega_1\) and \(X_2((\omega_1, \omega_2))=\omega_2\). That is, \(X_1\) maps an ordered pair to its first coordinate, and \(X_2\) to its second.
  3. Yes, \(X=X_1 + X_2\) is a valid random variable. For each outcome \((\omega_1, \omega_2)\), \(X\) returns a number: \(X((\omega_1, \omega_2))=X_1((\omega_1, \omega_2)) + X_2((\omega_1, \omega_2))=\omega_1+\omega_2\).
  4. \(X((3, 1)) = X_1((3, 1))+X_2((3,1))=3+1=4\).

The above example probably seems like notational overkill. And of course we could have defined \(X\) directly as \(X((\omega_1,\omega_2))=\omega_1+\omega_2\) without the need for \(X_1, X_2\). But we introduced the example to emphasize that it only makes sense to add random variables if they are defined on the same sample space. Remember that adding two random variables involves adding two functions, in the way that the function \(g = g_1 + g_2\) is defined by \(g(u) = g_1(u) + g_2(u)\). It only makes sense to add two functions together if they have the same inputs.
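
The pointwise definition \(g(u) = g_1(u) + g_2(u)\) translates directly into code. In this sketch outcomes are tuples like `(3, 1)`, and `add` is a hypothetical helper that builds the sum of two random variables as a new function.

```python
def X1(omega):
    """Result of the first roll."""
    return omega[0]

def X2(omega):
    """Result of the second roll."""
    return omega[1]

def add(U, V):
    """Pointwise sum of two random variables defined on the same sample space."""
    return lambda omega: U(omega) + V(omega)

X = add(X1, X2)

print(X1((3, 1)), X2((3, 1)))  # 3 1
print(X((3, 1)))               # 4
```

The helper `add` passes the same `omega` to both `U` and `V`, which is only meaningful when both functions accept the same kind of outcome.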

For example, consider the random variable \(X\) from Example 2.14 and the random variable \(Y\) from Example 2.18. Not only would it make little practical sense to consider \(X+Y\); it would make no mathematical sense. How would \(X+Y\) even be defined? \(X\) is defined for sequences of coin flips like HHTH, while \(Y\) is defined for the shuffles in the matching problem like 2143. If you attempted to add \(X\) and \(Y\), which outcomes would go together? Would you add \(X(HHTH)\) to \(Y(2143)\)? Why not add \(X(HHTH)\) to \(Y(1234)\)? Again, adding two random variables involves adding two functions, and it doesn’t make sense to add those functions if they have different inputs.

As a more practical example, if \(X\) represents SAT math score and \(Y\) represents SAT verbal score then we might be interested in the total score \(X+Y\). Requiring \(X\) and \(Y\) to be defined on the same probability space is like requiring the scores to be measured for the same students. For example, Antwan has both a Math score \(X(\text{Antwan})\) and a Verbal score \(Y(\text{Antwan})\), so we can consider the total score \((X+Y)(\text{Antwan}) = X(\text{Antwan})+Y(\text{Antwan})\). (In statistical terms, the variables are measured for the same observational units.) It wouldn’t make sense to add SAT Math scores from one set of students to SAT verbal scores for a different set of students; for example \(X(\text{Antwan}) + Y(\text{Maria})\) makes no sense.

Example 2.20 Flip a fair coin four times and record the results in order, e.g. HHTT means two heads followed by two tails. Recall that in Section 1.4 we considered the proportion of the flips which immediately follow a H that result in H. Remember that we do not consider this proportion if no flips follow a H, i.e., when the outcome is either TTTT or TTTH.

Let:

  • \(Z\) be the number of flips immediately following H.
  • \(Y\) be the number of flips immediately following H that result in H.
  • \(X\) be the proportion of flips immediately following H that result in H.
  1. Is \(X\) a random variable? How does it relate to \(Y\) and \(Z\)?
  2. For each of the possible outcomes in the sample space, find the value of \((Z, Y, X)\).
Solution to Example 2.20
  1. Yes, \(X\) is a random variable because it maps each coin flip sequence to the proportion of the flips which immediately follow a H that result in H for that sequence; see the table below. Also, \(X=Y/Z\). Technically \(X=Y/Z\) is not defined for outcomes TTTT and TTTH, since \(Z=0\) for them, but we’re ignoring those sequences for the purposes of investigating the proportion of the flips which immediately follow a H that result in H.
  2. The values of \((Z, Y, X)\) for each outcome are listed in Table 2.1.

Table 2.1: Possible values of (1) \(Z\), the number of flips immediately following H, (2) \(Y\), the number of flips immediately following H that result in H, and (3) \(X\), the proportion of flips immediately following H that result in H, for four flips of a fair coin.

| Outcome (\(\omega\)) | \(Z\) | \(Y\) | \(X = Y/Z\) |
|---|---|---|---|
| HHHH | 3 | 3 | 1 |
| HHHT | 3 | 2 | 2/3 |
| HHTH | 2 | 1 | 1/2 |
| HTHH | 2 | 1 | 1/2 |
| THHH | 2 | 2 | 1 |
| HHTT | 2 | 1 | 1/2 |
| HTHT | 2 | 0 | 0 |
| HTTH | 1 | 0 | 0 |
| THHT | 2 | 1 | 1/2 |
| THTH | 1 | 0 | 0 |
| TTHH | 1 | 1 | 1 |
| HTTT | 1 | 0 | 0 |
| THTT | 1 | 0 | 0 |
| TTHT | 1 | 0 | 0 |
| TTTH | 0 | not defined | not defined |
| TTTT | 0 | not defined | not defined |
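
The values of \(Z\), \(Y\), and \(X\) can be computed for every outcome with a short script. The following is a sketch under our own conventions: outcomes are strings, `X` returns `None` when \(Z=0\), and, unlike the table, the code evaluates \(Y\) to 0 for TTTH and TTTT rather than leaving it undefined.

```python
from fractions import Fraction
from itertools import product

def Z(omega):
    """Number of flips immediately following an H (H's among the first three flips)."""
    return omega[:-1].count("H")

def Y(omega):
    """Number of flips immediately following an H that are themselves H."""
    return sum(1 for a, b in zip(omega, omega[1:]) if a == "H" and b == "H")

def X(omega):
    """Proportion Y/Z as an exact fraction; None when Z = 0 (assumed convention)."""
    return Fraction(Y(omega), Z(omega)) if Z(omega) > 0 else None

for flips in product("HT", repeat=4):
    omega = "".join(flips)
    print(omega, Z(omega), Y(omega), X(omega))
```

Using `Fraction` keeps values such as 2/3 exact instead of printing a rounded decimal.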

  12. Technically, we have some collection \(\mathcal{F}\) of events of interest, and so we require sets like \(\{X\le x\}\) to be in \(\mathcal{F}\). This requirement is satisfied by requiring \(X\) to be an \(\mathcal{F}\)-measurable function.

  13. Throughout, we use \(g\) to denote a generic function, and reserve \(f\) to represent a probability density function. Likewise, we represent a generic function argument (or “dummy variable”) with \(u\), since \(x\) is often used to represent possible values of a random variable \(X\); in the RV scenario \(x\) typically represents the output of the function \(X\) rather than the input (which is a sample space outcome \(\omega\)).

  14. It’s easiest if we label a flip of heads as 1 and tails as 0. Represent an outcome \(\omega\) as \((\omega_1, \omega_2, \omega_3, \omega_4)\), where \(\omega_i\in\{0,1\}\) is the result of the \(i\)th flip. Then \(X(\omega)=\sum_{i=1}^{4} \omega_i\) represents the number of heads. For example, outcome HHTH would be represented as \((1, 1, 0, 1)\) and \(X((1, 1, 0, 1)) = 1 + 1 + 0 + 1 = 3\). This could be coded as sum(omega).

  15. Technically, we have some collection \(\mathcal{F}\) of events of interest, and so we require sets like \(\{X\le x\}\) to be in \(\mathcal{F}\). This requirement is satisfied by requiring \(X\) to be an \(\mathcal{F}\)-measurable function.

  16. \(Y(\omega) = g(X(\omega))\) so \(Y\) maps \(\Omega\) to \(\mathbb{R}\) via the composition of the functions \(g\) and \(X\); that is, \(Y=g\circ X\).