21 Expected Value
- The distribution of a random variable specifies the possible values and the probability of any event that involves the random variable.
- Characteristics of distributions based on long run averages can be defined as “expected” values.
Example 21.1 Recall the matching problem with $n = 4$. The distribution of $X$, the number of matches, is displayed below.
x | P(X=x) |
---|---|
0 | 0.3750 |
1 | 0.3333 |
2 | 0.2500 |
4 | 0.0417 |
- Describe two ways for simulating values of $X$.
- The table below displays 10 simulated values of $X$. How could you use the results of this simulation to approximate the long run average value of $X$? How could you get a better approximation of the long run average?
Repetition | Value of X |
---|---|
1 | 0 |
2 | 1 |
3 | 0 |
4 | 0 |
5 | 2 |
6 | 0 |
7 | 1 |
8 | 1 |
9 | 4 |
10 | 2 |
- Rather than adding the 10 values and dividing by 10, how could you simplify the calculation in the previous part?
- The table below summarizes 24000 simulated values of $X$. Approximate the long run average value of $X$.
Value of X | Number of repetitions |
---|---|
0 | 8979 |
1 | 7993 |
2 | 6068 |
4 | 960 |
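As a minimal sketch of the calculation asked for here, the long run average can be approximated directly from the summary table by weighting each value by the number of times it occurred. The variable names are illustrative only.

```python
# Counts from the table above: value of X -> number of repetitions.
counts = {0: 8979, 1: 7993, 2: 6068, 4: 960}

n_reps = sum(counts.values())  # total number of simulated values

# Sum of all simulated values, computed as value * (number of times it occurred).
total = sum(value * count for value, count in counts.items())

# The usual average: sum of the values divided by the number of values.
long_run_average = total / n_reps
print(n_reps, total, round(long_run_average, 4))  # 24000 23969 0.9987
```

Dividing each count by 24000 first gives relative frequencies, so the same number can be read as a relative-frequency-weighted average of the possible values.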
- Recall the distribution of $X$. What would be the corresponding mathematical formula for the theoretical long run average value of $X$? This number is called the “expected value” of $X$.
- Is the expected value the most likely value of $X$?
- Is the expected value of $X$ the “value that we would expect” on a single repetition of the phenomenon?
- Explain in what sense the expected value is “expected”.
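The following is one possible sketch of the ideas in this example, not a definitive implementation. It assumes the matching problem means placing 4 labeled objects uniformly at random into 4 labeled spots and counting the objects that land in their own spot. The function name `simulate_matches` is hypothetical, and the exact fractions 9/24, 8/24, 6/24, 1/24 are assumed as the unrounded versions of the probabilities in the table.

```python
import random
from fractions import Fraction

def simulate_matches(n=4, rng=random):
    """Shuffle n labeled objects and count how many land in their own spot."""
    spots = list(range(n))
    rng.shuffle(spots)
    return sum(1 for position, label in enumerate(spots) if position == label)

rng = random.Random(21)  # fixed seed so the run is reproducible
n_reps = 100_000
simulated = [simulate_matches(4, rng) for _ in range(n_reps)]

# Long run average: the usual average of many simulated values.
simulated_average = sum(simulated) / n_reps

# Theoretical counterpart: weight each possible value by its probability.
# ASSUMPTION: these fractions are the exact values behind the rounded table.
pmf = {0: Fraction(9, 24), 1: Fraction(8, 24), 2: Fraction(6, 24), 4: Fraction(1, 24)}
expected_value = sum(x * p for x, p in pmf.items())

print(simulated_average)  # close to the expected value; depends on the seed
print(expected_value)     # 1
```

Under these assumptions the expected value is 1, while the most likely single value in the table is 0.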
Example 21.2
Let
- How could you use simulation to approximate the long run average value of $X$?
- Suppose the values of $X$ are truncated¹ to integers. That is, 0.73 is recorded as 0, 1.15 is recorded as 1, 2.999 is recorded as 2, 3.001 is recorded as 3, etc. The following table summarizes 10000 simulated values of $X$, truncated. Using just these values, how would you approximate the long run average value of $X$?
Truncated value of X | Number of repetitions |
---|---|
0 | 6302 |
1 | 2327 |
2 | 915 |
3 | 287 |
4 | 94 |
5 | 43 |
6 | 22 |
7 | 5 |
8 | 4 |
9 | 1 |
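A minimal sketch of the approximation from the truncated table: each truncated value is weighted by its relative frequency. The variable names are illustrative only.

```python
# Counts from the table above: truncated value of X -> number of repetitions.
counts = {0: 6302, 1: 2327, 2: 915, 3: 287, 4: 94, 5: 43, 6: 22, 7: 5, 8: 4, 9: 1}

n_reps = sum(counts.values())

# Relative frequencies approximate the probability of each truncated value.
relative_freq = {value: count / n_reps for value, count in counts.items()}

# Sum of value * count, divided by the number of repetitions.
total = sum(value * count for value, count in counts.items())
approx_average = total / n_reps

print(n_reps, total, approx_average)  # 10000 5817 0.5817
print(relative_freq[0], relative_freq[1], relative_freq[2])  # 0.6302 0.2327 0.0915
```

Because truncation only ever moves a value down, this calculation tends to underestimate the long run average of the untruncated values.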
- How could you approximate the probability that the truncated value of $X$ is 0? 1? 2? Suggest a formula for the (approximate) long run average value of $X$. (Don’t worry if the approximation isn’t great; we’ll see how to improve it.)
- Truncating to the nearest integer turns out not to yield a great approximation of the long run average value of $X$. How could we get a better approximation?
- Suppose instead of truncating to an integer, we truncate to the first decimal. For example, 0.73 is recorded as 0.7, 1.15 is recorded as 1.1, 2.999 is recorded as 2.9, 3.001 is recorded as 3.0, etc. Suggest a formula for the (approximate) long run average value of $X$.
- We can continue in this way, truncating to the second decimal place, then the third, and so on. Considering what happens in the limit, suggest a formula for the theoretical long run average value of $X$.
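The effect of truncating more and more finely can be sketched in code. The idea does not depend on a particular distribution; purely as an assumption for illustration, the values below are drawn from an Exponential distribution with rate 1 via `random.expovariate`. The helper `truncate` is hypothetical.

```python
import math
import random

def truncate(x, decimals):
    """Drop (rather than round) everything after the given number of decimal places."""
    scale = 10 ** decimals
    return math.floor(x * scale) / scale

rng = random.Random(2)  # fixed seed so the run is reproducible
# ASSUMPTION: an Exponential(rate 1) distribution, used only for illustration.
values = [rng.expovariate(1.0) for _ in range(100_000)]

# Average of the untruncated simulated values, for comparison.
exact_average = sum(values) / len(values)

# Average of the truncated values for 0, 1, 2, 3 decimal places.
averages = {}
for decimals in (0, 1, 2, 3):
    averages[decimals] = sum(truncate(x, decimals) for x in values) / len(values)
    gap = exact_average - averages[decimals]
    print(decimals, round(averages[decimals], 4), round(gap, 4))
```

In a run like this, the gap between the truncated average and the untruncated average shrinks as more decimal places are kept, which is the limiting behavior the last part of the example asks about.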
- The expected value (a.k.a. expectation, a.k.a. mean) of a random variable $X$ defined on a probability space with measure $\text{P}$ is a number, denoted $\text{E}(X)$, representing the probability-weighted average value of $X$. Expected value is defined as
  - Discrete $X$ with pmf $p_X$: $\text{E}(X) = \sum_x x\, p_X(x)$
  - Continuous $X$ with pdf $f_X$: $\text{E}(X) = \int_{-\infty}^{\infty} x\, f_X(x)\, dx$
- Note well that $\text{E}(X)$ represents a single number.
- The expected value is the “balance point” (center of gravity) of a distribution.
- The expected value of a random variable $X$ is defined by the probability-weighted average according to the underlying probability measure. But the expected value can also be interpreted as the long-run average value, and so can be approximated via simulation.
- Read the symbol $\text{E}(\cdot)$ as
  - Simulate lots of values of what’s inside $(\cdot)$.
  - Compute the average. This is a “usual” average; just sum all the simulated values and divide by the number of simulated values.
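The "simulate, then average" reading of expected value can be sketched as a small helper. The function `approx_expected_value` and the fair six-sided die are hypothetical illustrations, not taken from the examples in this section.

```python
import random
from fractions import Fraction

def approx_expected_value(simulate, n_reps=100_000):
    """Approximate an expected value: simulate many values, then take the usual average."""
    values = [simulate() for _ in range(n_reps)]
    return sum(values) / len(values)

rng = random.Random(0)  # fixed seed so the run is reproducible

def roll():
    """Hypothetical random variable: one roll of a fair six-sided die."""
    return rng.randint(1, 6)

# Long-run average interpretation.
estimate = approx_expected_value(roll)

# Probability-weighted average: each face 1..6 has probability 1/6.
exact = sum(x * Fraction(1, 6) for x in range(1, 7))

print(estimate)  # close to 3.5; depends on the seed
print(exact)     # 7/2
```

Note that 7/2 is a single number and is not itself a possible value of the die, which fits the "balance point" interpretation.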
Example 21.3
Let
- Donny Dont says
. Do you agree?
- Compute
.
- Compute
.
- Compute
.
- Find the median value (50th percentile) of
. Is the median less than, greater than, or equal to the mean? Why does this make sense?
Example 21.4 Recall Example 15.5 in which we assume that
- Recall from Example 15.5 that
. Evaluate the pmf for and use arithmetic to compute . (This will technically only give an approximation, since there is non-zero probability that , but the calculation will give you a concrete example before jumping to the next part.)
- Use the pmf and infinite series to compute
.
- Interpret
in context.
21.1 “Law of the unconscious statistician” (LOTUS)
Example 21.5
Flip a coin 3 times and let
- Find the distribution of
.
- Compute
.
- How could we have computed
without first finding the distribution of ?
- Is
equal to ?
- The “law of the unconscious statistician” (LOTUS) says that the expected value of a transformed random variable can be found without finding the distribution of the transformed random variable, simply by applying the probability weights of the original random variable to the transformed values.
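- LOTUS says we don’t have to first find the distribution of $Y = g(X)$ to find $\text{E}(g(X))$; rather, we simply apply the transformation $g$ to each possible value $x$ of $X$ and then apply the corresponding weight for $x$ to $g(x)$.
  - Discrete $X$ with pmf $p_X$: $\text{E}(g(X)) = \sum_x g(x)\, p_X(x)$
  - Continuous $X$ with pdf $f_X$: $\text{E}(g(X)) = \int_{-\infty}^{\infty} g(x)\, f_X(x)\, dx$
- Whether in the short run or the long run, in general
  $$\text{Average of } g(X) \neq g(\text{Average of } X)$$
- In terms of expected values, in general
  $$\text{E}(g(X)) \neq g(\text{E}(X))$$
  The left side represents first transforming the values and then averaging the transformed values. The right side represents first averaging the values and then plugging the average (a single number) into the transformation formula.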
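A small sketch can make the comparison concrete. It reuses the matching-problem distribution from Example 21.1, entered as the exact fractions 9/24, 8/24, 6/24, 1/24 (assumed to be the unrounded versions of the table's probabilities). The transformation $g(x) = x^2$ is a hypothetical choice made only for illustration.

```python
from fractions import Fraction

# Distribution of X from Example 21.1, as exact fractions (ASSUMED unrounded values).
pmf_x = {0: Fraction(9, 24), 1: Fraction(8, 24), 2: Fraction(6, 24), 4: Fraction(1, 24)}

def g(x):
    """Hypothetical transformation chosen for illustration."""
    return x ** 2

# Route 1: derive the distribution of Y = g(X), then apply the definition of E(Y).
pmf_y = {}
for x, p in pmf_x.items():
    pmf_y[g(x)] = pmf_y.get(g(x), 0) + p
e_y_from_distribution = sum(y * p for y, p in pmf_y.items())

# Route 2 (LOTUS): weight each transformed value g(x) by the probability of x.
e_g_of_x = sum(g(x) * p for x, p in pmf_x.items())

# For comparison: average first, then transform the single number E(X).
g_of_e_x = g(sum(x * p for x, p in pmf_x.items()))

print(e_y_from_distribution, e_g_of_x, g_of_e_x)  # 2 2 1
```

Both routes give the same expected value for the transformed variable, while transforming the average gives a different number.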
Example 21.6
Let
- Find
using the distribution of and the definition of expected value. Remember: if we did not have the distribution of , we would first have to derive it as in Example 18.7.
- Describe how to use simulation to approximate
, in a way that is analogous to the method in the previous part.
- Find
using LOTUS.
- Describe how to use simulation to approximate
, in a way that is analogous to the method in the previous part.
- Is
equal to ?
Example 21.7
We want to find
¹ We could also round to the nearest integer. Whether we truncate or round won’t matter as we consider what happens in the limit.