6 Marginal Distributions
- The (probability) distribution of a collection of random variables identifies the possible values that the random variables can take and their relative likelihoods.
- We will see many ways of describing a distribution, depending on how many random variables are involved and their types (discrete or continuous).
- In the context of multiple random variables, the distribution of any one of the random variables is called a marginal distribution.
6.1 Discrete random variables
- The probability distribution of a single discrete random variable $X$ is often displayed in a table (or plot) containing the probability of the event $\{X = x\}$ for each possible value $x$.
- In some cases, a distribution has a “formulaic” shape. For a discrete random variable $X$, the “probability mass function” expresses $P(X = x)$ as a function of $x$.
Example 6.1 Roll a fair four-sided die twice. Let
- Construct a table and plot displaying the marginal distribution of $X$.
- Construct a table and plot displaying the marginal distribution of $Y$.
- Describe the distribution of
in terms of long run relative frequency.
- Describe the distribution of
in terms of relative degree of likelihood.
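As a rough illustration of how such marginal tables can be built, the sketch below enumerates the 16 equally likely outcomes of two rolls and tabulates two random variables. The definitions used here ($X$ = sum of the two rolls, $Y$ = larger of the two rolls) are assumptions made for the sake of the example, and `marginal_pmf` is a hypothetical helper name.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption (for illustration only): X = sum of the two rolls,
# Y = larger of the two rolls (or the common value if they tie).
outcomes = list(product(range(1, 5), repeat=2))  # 16 equally likely pairs


def marginal_pmf(rv):
    """Return {value: probability} for a random variable defined on the rolls."""
    counts = Counter(rv(rolls) for rolls in outcomes)
    return {value: Fraction(n, len(outcomes)) for value, n in sorted(counts.items())}


pmf_x = marginal_pmf(sum)  # marginal distribution of the sum
pmf_y = marginal_pmf(max)  # marginal distribution of the larger roll

for name, pmf in (("X", pmf_x), ("Y", pmf_y)):
    print(f"Marginal distribution of {name}")
    for value, prob in pmf.items():
        print(f"  {value}: {prob} = {float(prob):.4f}")
```

Each table lists the possible values of one variable with their probabilities, and the probabilities in each table sum to 1.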
6.2 Simulating from a marginal distribution
- Any marginal distribution can be represented by a single spinner.
- In principle, there are always two ways of simulating a value $x$ of a random variable $X$.
  - Simulate from the probability space. Simulate an outcome $\omega$ from the underlying probability space and set $x = X(\omega)$.
  - Simulate from the distribution. Construct a spinner corresponding to the distribution of $X$ and spin it once to generate $x$.
- The second method requires that the distribution of $X$ is known. However, as we will see in many examples, it is common to specify the distribution of a random variable directly without defining the underlying probability space.
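The two simulation methods can be compared side by side. The sketch below assumes, purely for illustration, that the random variable is the sum of two rolls of a fair four-sided die; the function names are hypothetical, and `random.choices` with weights stands in for the spinner.

```python
import random
from collections import Counter

random.seed(0)

# Assumption for illustration: X = sum of two rolls of a fair four-sided die.
X_VALUES = [2, 3, 4, 5, 6, 7, 8]
X_WEIGHTS = [1, 2, 3, 4, 3, 2, 1]  # spinner sector sizes, in sixteenths


def simulate_from_probability_space():
    """Method 1: simulate an outcome (two rolls), then evaluate X on it."""
    rolls = (random.randint(1, 4), random.randint(1, 4))
    return sum(rolls)


def simulate_from_distribution():
    """Method 2: spin a spinner whose sectors match the distribution of X."""
    return random.choices(X_VALUES, weights=X_WEIGHTS, k=1)[0]


n = 100_000
freq_1 = Counter(simulate_from_probability_space() for _ in range(n))
freq_2 = Counter(simulate_from_distribution() for _ in range(n))

for x, w in zip(X_VALUES, X_WEIGHTS):
    print(f"x={x}: theory {w / 16:.4f}, "
          f"method 1 {freq_1[x] / n:.4f}, method 2 {freq_2[x] / n:.4f}")
```

Both methods should give relative frequencies close to the same theoretical probabilities, even though only the first one ever looks at individual rolls.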
Example 6.2 Continuing Example 6.1.
Construct a spinner to represent the marginal distribution of $X$.
Describe two methods for simulating a value of $X$.
Suppose you also construct a spinner corresponding to the marginal distribution of $Y$. Could you simulate an $(X, Y)$ pair by spinning each of the spinners once? Explain.
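One way to explore the last question is to spin two marginal spinners independently and check whether the resulting pairs could actually arise from two rolls. This is only a sketch: it assumes $X$ is the sum and $Y$ is the larger of the two rolls (definitions chosen for illustration), and `is_possible` is a hypothetical helper.

```python
import random

random.seed(1)

# Assumption for illustration: X = sum, Y = larger of two rolls of a fair
# four-sided die. Weights are the marginal probabilities in sixteenths.
X_VALUES, X_WEIGHTS = [2, 3, 4, 5, 6, 7, 8], [1, 2, 3, 4, 3, 2, 1]
Y_VALUES, Y_WEIGHTS = [1, 2, 3, 4], [1, 3, 5, 7]


def is_possible(x, y):
    """True if some pair of rolls has sum x and maximum y."""
    return any(a + b == x and max(a, b) == y
               for a in range(1, 5) for b in range(1, 5))


n = 10_000
pairs = [
    (random.choices(X_VALUES, weights=X_WEIGHTS)[0],
     random.choices(Y_VALUES, weights=Y_WEIGHTS)[0])
    for _ in range(n)
]
impossible = [pair for pair in pairs if not is_possible(*pair)]
print(f"{len(impossible) / n:.1%} of independently spun pairs cannot occur")
```

Under these assumptions, independent spins produce pairs such as a sum of 2 with a larger roll of 4, which no pair of rolls can generate. The marginal spinners alone do not carry the relationship between the two variables.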
6.3 Continuous random variables
- Simulated values of a continuous random variable are usually plotted in a histogram which groups the observed values into “bins” and plots densities or frequencies for each bin.
- In a histogram, the areas of the bars represent relative frequencies; the axis that represents the height of the bars is called “density”.
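The relationship between density and relative frequency can be checked by hand: the height of a bar is its relative frequency divided by its bin width, so that area equals relative frequency. The sketch below uses placeholder data (Uniform(0, 1) values) and deliberately unequal bin widths; both choices are assumptions made for illustration.

```python
import random

random.seed(2)

# Placeholder data: simulated values of a continuous random variable.
# Uniform(0, 1) is an assumption chosen only to keep the sketch simple.
values = [random.random() for _ in range(10_000)]

# Bins of unequal width, to show that area (not height) is relative frequency.
edges = [0.0, 0.1, 0.3, 0.6, 1.0]

densities = []
for left, right in zip(edges[:-1], edges[1:]):
    count = sum(left <= v < right for v in values)
    rel_freq = count / len(values)
    density = rel_freq / (right - left)  # bar height on the density axis
    densities.append(density)
    print(f"[{left:.1f}, {right:.1f}): "
          f"relative frequency {rel_freq:.3f}, density {density:.3f}")

# The bar areas (height times width) add up to the total relative frequency.
total_area = sum(d * (r - l) for d, l, r in zip(densities, edges[:-1], edges[1:]))
print(f"total area of bars: {total_area:.3f}")
```

The relative frequencies differ from bin to bin because the bins have different widths, while the densities stay roughly equal, which is what a flat distribution should look like.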
Example 6.3 A continuous random voltage
Use simulation to approximate $P(X = 5)$. What do you notice? Why does this make sense?
Compute $P(4.995 < X < 5.005)$, the probability that $X$ rounded to two decimal places is equal to 5. Compare with a simulation-based approximation.
Compute $P(4.9995 < X < 5.0005)$, the probability that $X$ rounded to three decimal places is equal to 5. Compare with a simulation-based approximation.
Explain why $P(X = 5) = 0$. Also explain why this is not a problem in practical applications.
- The probability that a continuous random variable $X$ equals any particular value is 0. That is, if $X$ is continuous then $P(X = x) = 0$ for all $x$.
- Even though any specific value of a continuous random variable has probability 0, intervals still can have positive probability.
- In practical applications involving continuous random variables, there is some reasonable degree of precision, “equal to” really means “close to”, and “close to” probabilities do correspond to intervals which can have positive probability.
- The marginal distribution of a continuous random variable can be described by a probability density function, for which areas under the density curve determine probabilities.
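A small simulation can make the contrast between "equal to" and "close to" concrete. The sketch below assumes, for illustration only, a Uniform(0, 10) random variable; under that assumption the interval probabilities are simply interval length divided by 10.

```python
import random

random.seed(3)

# Assumption for illustration only: X ~ Uniform(0, 10).
n = 1_000_000
values = [random.uniform(0, 10) for _ in range(n)]

# Relative frequency of hitting 5 exactly versus rounding to 5.
exactly_5 = sum(v == 5 for v in values) / n
within_2_dp = sum(4.995 < v < 5.005 for v in values) / n    # rounds to 5.00
within_3_dp = sum(4.9995 < v < 5.0005 for v in values) / n  # rounds to 5.000

print(f"P(X = 5)                ~ {exactly_5}")
print(f"P(4.995 < X < 5.005)    ~ {within_2_dp:.5f} (theory 0.00100)")
print(f"P(4.9995 < X < 5.0005)  ~ {within_3_dp:.5f} (theory 0.00010)")
```

The interval probabilities shrink as the required precision increases but stay positive, while exact equality essentially never occurs in the simulation.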
6.4 Normal distributions
- Normal distributions follow a particular bell-shaped curve which corresponds to a very specific pattern of variation.
Example 6.4 Recall the meeting problem. Suppose that arrival times (minutes after noon) follow a Normal distribution with mean 30 and standard deviation 10, as represented by the spinner in Figure 5.2. Suppose we spin this spinner many times.
About what percent of values would be below 30? Above 30?
About what percent of values would be between 20 and 30? Between 30 and 40?
How would the shape of the distribution below 30 compare to that above 30?
About what percent of values would be between 10 and 50?
Sketch a plot of the distribution of arrival times. (A very rough sketch is fine for now, but be sure to determine where the density would be higher or lower.) Then run a simulation and sketch the simulated distribution of arrival times.
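The percentages asked about in Example 6.4 can be approximated by simulation and compared with values computed from the Normal(30, 10) distribution. This is a minimal sketch using Python's standard library (`random.gauss` and `statistics.NormalDist`); the helper names `simulated` and `theoretical` are hypothetical.

```python
import random
from statistics import NormalDist

random.seed(4)

arrival = NormalDist(mu=30, sigma=10)  # mean and SD from Example 6.4
n = 100_000
sims = [random.gauss(30, 10) for _ in range(n)]  # many "spins" of the spinner


def simulated(lo, hi):
    """Relative frequency of simulated values strictly between lo and hi."""
    return sum(lo < x < hi for x in sims) / n


def theoretical(lo, hi):
    """Probability of the interval (lo, hi) under Normal(30, 10)."""
    return arrival.cdf(hi) - arrival.cdf(lo)


below_30 = sum(x < 30 for x in sims) / n
print(f"below 30: simulated {below_30:.3f}, theory {arrival.cdf(30):.3f}")

for lo, hi in [(20, 30), (30, 40), (10, 50)]:
    print(f"between {lo} and {hi}: "
          f"simulated {simulated(lo, hi):.3f}, theory {theoretical(lo, hi):.3f}")
```

The intervals correspond to one standard deviation on either side of the mean and two standard deviations around it, so the results reflect the symmetric, bell-shaped pattern described above.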
6.5 Percentiles
- A distribution is characterized by its percentiles.
- Roughly, the value $x$ is the $p$th percentile (a.k.a. quantile) of a distribution of a random variable $X$ if $p$ percent of values of the variable are less than or equal to $x$: $P(X \le x) = p/100$.
- A spinner basically describes a distribution by specifying all the percentiles. For example,
- The 25th percentile goes 25% of the way around the axis (at “3 o’clock”)
- The 50th percentile goes 50% of the way around the axis (at “6 o’clock”)
- The 75th percentile goes 75% of the way around the axis (at “9 o’clock”)
Example 6.5 Recall the spinner that represents the Normal(30, 10) distribution. According to this distribution:
What percent of values are less than 23.26?
What is the 25th percentile?
What is the 75th percentile?
A value of 40 corresponds to what percentile?
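Percentile questions like these can be checked numerically. The sketch below uses `statistics.NormalDist` from Python's standard library, where `cdf` converts a value to a percentile and `inv_cdf` converts a percentile to a value. The `spin` function is a hypothetical illustration of the spinner idea: a uniformly random position around the axis is mapped to the corresponding percentile.

```python
import random
from statistics import NormalDist

random.seed(6)

dist = NormalDist(mu=30, sigma=10)  # the Normal(30, 10) distribution

# Value -> percentile (cdf) and percentile -> value (inv_cdf).
print(f"P(X <= 23.26)   = {dist.cdf(23.26):.3f}")
print(f"25th percentile = {dist.inv_cdf(0.25):.2f}")
print(f"75th percentile = {dist.inv_cdf(0.75):.2f}")
print(f"40 is at about the {100 * dist.cdf(40):.0f}th percentile")


def spin():
    """Spin once: pick a fraction of the way around, return that percentile."""
    while True:
        u = random.random()
        if u > 0:  # inv_cdf requires 0 < u < 1
            return dist.inv_cdf(u)


spins = [spin() for _ in range(50_000)]
share_below_q1 = sum(s <= dist.inv_cdf(0.25) for s in spins) / len(spins)
print(f"share of spins at or below the 25th percentile: {share_below_q1:.3f}")
```

About a quarter of the spins should land at or below the 25th percentile, which is the sense in which the spinner specifies every percentile at once.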
6.6 Transformations
- Many random variables are derived as transformations of other random variables.
- A function of a random variable is a random variable: if $X$ is a random variable and $g$ is a function then $Y = g(X)$ is a random variable.
- In general, the distribution of $g(X)$ will have a different shape than the distribution of $X$.
  - The exception is when $g$ is a linear rescaling.
Example 6.6 Let
Let
. Sketch a spinner corresponding to the distribution of .
Sketch a plot of the distribution of
. (A very rough sketch is fine for now, but be sure to determine where the density would be higher or lower.) Then run a simulation and sketch the simulated distribution of .
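The claim that a linear rescaling preserves the shape of a distribution, while other transformations generally do not, can be checked with a quick simulation. Everything in the sketch below is an assumption chosen for illustration: the starting variable is Uniform(0, 1), the linear rescaling is $3 + 2U$, the nonlinear transformation is $-\log(1 - U)$, and `bin_fractions` is a hypothetical helper.

```python
import math
import random

random.seed(5)

# Hypothetical setup: U ~ Uniform(0, 1), with one linear and one nonlinear g.
n = 100_000
u = [random.random() for _ in range(n)]
linear = [3 + 2 * v for v in u]            # linear rescaling of U
nonlinear = [-math.log(1 - v) for v in u]  # nonlinear transformation of U


def bin_fractions(values, lo, hi, bins=4):
    """Fraction of all values falling in each equal-width bin on [lo, hi)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), bins - 1)] += 1
    return [c / len(values) for c in counts]


frac_u = bin_fractions(u, 0, 1)
frac_linear = bin_fractions(linear, 3, 5)
frac_nonlinear = bin_fractions(nonlinear, 0, 4)

print("U:          ", [f"{f:.3f}" for f in frac_u])
print("3 + 2U:     ", [f"{f:.3f}" for f in frac_linear])
print("-log(1 - U):", [f"{f:.3f}" for f in frac_nonlinear])
```

The first two rows should both be roughly flat, since the rescaled values are just relabeled on a different axis. The third row should decrease from bin to bin, showing a different shape.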