Chapter 4 Random variables

Chapter 3 developed a general framework for modeling random outcomes and events. This framework can be applied to any set of random outcomes, no matter how complex.

However, many of the random outcomes we are interested in are quantitative, that is, they can be described by a number. Quantitative outcomes are also called “random variables.” In addition to the basic tools of probability developed in Chapter 3, we have some extremely useful specialized tools for random variables. This chapter will develop these tools.

Chapter goals

In this chapter we will learn how to:

  • Calculate and interpret the CDF and PDF of a random variable, or several random variables.
  • Calculate and interpret the expected value of a discrete random variable from its PDF.
  • Calculate and interpret the variance and standard deviation of a discrete random variable from its PDF.
  • Work with common probability distributions including the Bernoulli, binomial, uniform and normal.

4.1 Introduction to random variables

A random variable is a number whose value depends on a random outcome. The idea here is that we are going to use a random variable to describe some (but not necessarily every) aspect of the outcome.

Random variables in roulette

Here are a few random variables we could define in a roulette game:

  • The original outcome \(b\).
  • An indicator for whether a bet on red wins: \[r = I(b \in Red)=\begin{cases}1 & b \in Red\\ 0 & b \notin Red \\ \end{cases}\]
  • The net payout from a $1 bet on red: \[ w_{red} = w_{red}(b) = \begin{cases} 1 & \textrm{ if } b \in Red \\ -1 & \textrm{ if } b \in Red^c \end{cases} \] That is, a player who bets $1 on red wins $1 if the ball lands on red and loses $1 if the ball lands anywhere else.
  • The net payout from a $1 bet on 14: \[ w_{14} = w_{14}(b) = \begin{cases} 35 & \textrm{ if } b = 14 \\ -1 & \textrm{ if } b \neq 14 \end{cases} \] That is, a player who bets $1 on 14 wins $35 if the ball lands on 14 and loses $1 if the ball lands anywhere else.

All of these random variables are defined in terms of the underlying outcome, but we can also define random variables in terms of other random variables. For example, we could have defined \(w_{red}\) as \(w_{red} = 2r-1\).

A random variable is always a function of the original outcome, but for convenience, we usually leave its dependence on the original outcome implicit, and write it as if it were an ordinary variable.

4.1.1 Probability distributions

A random variable has its own sample space (normally \(\mathbb{R}\)) and probability distribution. This probability distribution can be derived from the probability distribution of the underlying outcome.

Probability distributions for roulette

  • The probability distribution for \(b\) is: \[\Pr(b = 0) = 1/37 \approx 0.027\] \[\Pr(b = 1) = 1/37 \approx 0.027\] \[\vdots\] \[\Pr(b = 36) = 1/37 \approx 0.027\] All other values of \(b\) have probability zero.
  • The probability distribution for \(w_{red}\) is: \[\Pr(w_{red} = 1) = \Pr(b \in Red) = 18/37 \approx 0.486\] \[\Pr(w_{red} = -1) = \Pr(b \notin Red) = 19/37 \approx 0.514\] All other values of \(w_{red}\) have probability zero.
  • The probability distribution for \(w_{14}\) is: \[\Pr(w_{14} = 35) = \Pr(b = 14) = 1/37 \approx 0.027\] \[\Pr(w_{14} = -1) = \Pr(b \neq 14) = 36/37 \approx 0.973\] All other values of \(w_{14}\) have probability zero.

Notice that these random variables are related to each other since they all depend on the same underlying outcome. Section 6.1 will explain how we can describe and analyze those relationships.

4.1.1.1 The support

The support of a random variable \(x\) is the smallest¹ set \(S_x \subset \mathbb{R}\) such that \(\Pr(x \in S_x) = 1\).

In plain language, the support is the set of all values in the sample space that have some chance of actually happening.

The support in roulette

The support is just the set of values with non-zero probability:

  • The support of \(b\) is \(S_{b} = \{0,1,2,\ldots,36\}\).
  • The support of \(w_{red}\) is \(S_{red} = \{-1,1\}\).
  • The support of \(w_{14}\) is \(S_{14} = \{-1,35\}\).

The random variables we have considered so far have discrete support. That is, the support is a set of isolated points each of which has a strictly positive probability. But not all random variables have a discrete support. That will complicate the math quite a bit, as we will need to use calculus.

4.1.2 The PDF and CDF

The PDF and CDF are both functions that allow us to describe the probability distribution of a random variable.

4.1.2.1 The PDF of a discrete random variable

We can describe the probability distribution of a random variable with a function called its probability density function (PDF).

The PDF of a discrete random variable is defined as: \[f_x(a) = \Pr(x = a)\] where \(a\) is any number. By convention, we typically use a lower-case \(f\) to represent a PDF, and we use the subscript when needed to clarify which specific random variable we are talking about.

The PDF in roulette

Our three random variables are all discrete, and each has its own PDF:

\[f_b(a) = \Pr(b = a) = \begin{cases} 1/37 & a \in \{0,1,\ldots,36\} \\ 0 & a \notin \{0,1,\ldots,36\} \\ \end{cases}\] \[f_{red}(a) = \Pr(w_{red} = a) = \begin{cases} 19/37 & a = -1 \\ 18/37 & a = 1 \\ 0 & a \notin \{-1,1\} \\ \end{cases}\] \[f_{14}(a) = \Pr(w_{14} = a) = \begin{cases} 36/37 & a = -1 \\ 1/37 & a = 35 \\ 0 & a \notin \{-1,35\} \\ \end{cases}\] Figure 4.1 below shows these three PDFs.

Figure 4.1: PDFs for the roulette example

We can calculate any probability from the PDF by simple addition. That is: \[\Pr(x \in A) = \sum_{s \in S_x} f_x(s)I(s \in A)\] where² \(A \subset \mathbb{R}\) is any event defined for \(x\).

Some event probabilities in roulette

Since the outcome in roulette is discrete, we can calculate any event probability by adding up the probabilities of the event’s outcomes.

The probability of the event \(b \leq 3\) can be calculated: \[\begin{align} \Pr(b \leq 3) &= \sum_{s=0}^{36}f_b(s)I(s \leq 3) \\ &= f_b(0) + f_b(1) + f_b(2) + f_b(3) \\ &= 4/37 \end{align}\]

The probability of the event \(b \in Even\) can be calculated: \[\begin{align} \Pr(b \in Even) &= \sum_{s=0}^{36}f_b(s)I(s \in Even) \\ &= f_b(2) + f_b(4) + \cdots + f_b(36) \\ &= 18/37 \end{align}\]
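If you want to verify these sums numerically, here is a minimal sketch in base R (the vector names are my own, not from the text). The logical subscript inside the brackets plays the role of the indicator \(I(s \in A)\):

```r
# PDF of b: each of the 37 pockets has probability 1/37
support_b <- 0:36
f_b <- rep(1/37, 37)

# Pr(b <= 3): sum the PDF over the outcomes in the event
sum(f_b[support_b <= 3])                        # 4/37, about 0.108

# Pr(b in Even): the 18 even numbers from 2 to 36 (zero is excluded)
sum(f_b[support_b %% 2 == 0 & support_b > 0])   # 18/37, about 0.486
```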

The PDF of a discrete random variable has several general properties:

  1. It is always between zero and one: \[0 \leq f_x(a) \leq 1\] since it is a probability.
  2. It sums up to one over the support: \[\sum_{a \in S_x} f_x(a) = \Pr(x \in S_x) = 1\] since the support has probability one by definition.
  3. It is strictly positive for all values in the support: \[a \in S_x \implies f_x(a) > 0\] since the support is the smallest set that has probability one.

These properties can all be proved from the definitions, but we will skip the proofs.

4.1.2.2 The CDF

Another way to describe the probability distribution of a random variable is with a function called its cumulative distribution function (CDF). The CDF is a little less intuitive than the PDF, but it has the advantage that it always has the same definition whether or not the random variable is discrete.

The CDF of the random variable \(x\) is the function \(F_x:\mathbb{R} \rightarrow [0,1]\) defined by: \[F_x(a) = \Pr(x \leq a)\] where \(a\) is any number. By convention, we typically use an upper-case \(F\) to indicate a CDF, and we use the subscript to indicate what random variable we are talking about.

The CDF has several properties:

  1. It always lies between zero and one: \[0 \leq F_x(a) \leq 1\] since it is a probability.
  2. It starts at zero and ends at one: \[F_x(-\infty) = \Pr(x \leq -\infty) = 0\] \[F_x(\infty) = \Pr(x \leq \infty) = 1\]
  3. It is non-decreasing. That is, for any \(a_1 \leq a_2\) \[F_x(a_1) \leq F_x(a_2)\] This is because the event \(x \leq a_2\) implies the event \(x \leq a_1\), so it must be at least as probable.
  4. For any \(a_1 < a_2\), \[\Pr(a_1 < x \leq a_2) = F_x(a_2) - F_x(a_1)\]

As I said earlier, the CDF is well-defined and has these properties whether \(x\) is discrete or continuous.

If a random variable is discrete, we can construct its CDF by just adding up the PDF: \[\begin{align} F_x(a) &= \Pr(x \leq a) \\ &= \sum_{s \in S_x} f_x(s)I(s \leq a) \end{align}\] This formula leads to a “stair-step” appearance: the CDF is flat for all values outside of the support, and then jumps up at all values in the support.
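Here is a minimal base R sketch of this construction for \(w_{red}\) (the names are mine):

```r
# PDF of w_red
support_red <- c(-1, 1)
f_red <- c(19/37, 18/37)

# CDF at any point a: add up the PDF over support values <= a
F_red <- function(a) sum(f_red[support_red <= a])

F_red(-2)   # 0
F_red(0)    # 19/37, about 0.514
F_red(1)    # 1
```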

CDFs for roulette

  • The CDF of \(b\) is: \[F_b(a) = \begin{cases} 0 & a < 0 \\ 1/37 & 0 \leq a < 1 \\ 2/37 & 1 \leq a < 2 \\ \vdots & \vdots \\ 36/37 & 35 \leq a < 36 \\ 1 & a \geq 36 \\ \end{cases}\]
  • The CDF of \(w_{red}\) is: \[F_{red}(a) = \begin{cases} 0 & a < -1 \\ 19/37 & -1 \leq a < 1 \\ 1 & a \geq 1 \\ \end{cases}\]
  • The CDF of \(w_{14}\) is: \[F_{14}(a) = \begin{cases} 0 & a < -1 \\ 36/37 & -1 \leq a < 35 \\ 1 & a \geq 35 \\ \end{cases}\]
Figure 4.2 below graphs these CDFs.

Figure 4.2: CDFs for the roulette example

Notice that they show all of the general properties described above. In addition, they all have a distinctive “stair-step” shape, jumping up at each point in \(S_x\) and staying flat between those points. This is a general property of CDFs for discrete random variables.

We can also go the other way, and construct the PDF of a discrete random variable from its CDF. Each little jump in the CDF is a point in the support, and the size of the jump is exactly equal to the PDF at that point.

In more formal mathematics, the formula for deriving the PDF of a discrete random variable from its CDF would be written:

\[f_x(a) = F_x(a) - \lim_{\epsilon \rightarrow 0^+} F_x(a-\epsilon)\] but we can just think of it as the size of the jump.

4.1.2.3 Continuous random variables

So far we have considered random variables with a discrete support. However, many random variables of interest have a continuous support: they can take on any real value within some range.

For example, Canada produced 31.251 million metric tons of wheat in 2019. If we think of that number as a random variable, it’s clear that this number could have been 31.252 million metric tons if circumstances were different. It also could have been any number between those numbers, for example 31.2511 million or 31.2517 million.

A continuous random variable has the property that the probability of any specific value is zero: \[\Pr(x=a) = 0\] Now this creates something of a paradox: by the rules of probability the probability that \(x\) takes on some value is \(\Pr(x \in \mathbb{R}) = 1\) but the probability that \(x\) takes on any specific value is zero. How can this work?

I’ll explain how it works with an example.

The standard uniform distribution

Consider a random variable \(x\) that has the standard uniform distribution. What that means is that:

  1. The support of \(x\) is the range \([0,1]\).
  2. All values in this range are equally likely.
The CDF of the standard uniform distribution is: \[F_x(a) = \Pr(x \leq a) = \begin{cases} 0 & a < 0 \\ a & a \in [0,1] \\1 & a > 1 \\ \end{cases}\] Figure 4.3 below shows the CDF of the standard uniform distribution.

Figure 4.3: CDF for the standard uniform distribution

As we have seen, the CDF of a discrete random variable rises in a “stair-step” manner. In contrast, the standard uniform CDF rises smoothly with no jumps. All continuous random variables have a CDF with this property.

Let \(a_1\) and \(a_2\) be any two numbers between 0 and 1, and let \(a_1 < a_2\). Then the probability of \(x\) being between \(a_1\) and \(a_2\) is: \[\Pr(a_1 < x \leq a_2) = F_x(a_2) - F_x(a_1) = a_2 - a_1\] As \(a_2\) gets closer and closer to \(a_1\) this number gets closer and closer to zero, so the probability of \(x\) being exactly \(a_1\) is zero.
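R's built-in `punif()` function is the standard uniform CDF, so we can watch this probability shrink numerically. A small sketch (the specific endpoints are my own illustration):

```r
# Pr(a1 < x <= a2) for a standard uniform x, via the built-in CDF punif()
a1 <- 0.30
punif(0.40)  - punif(a1)   # 0.1
punif(0.31)  - punif(a1)   # 0.01
punif(0.301) - punif(a1)   # 0.001: shrinks toward zero as a2 approaches a1
```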

The PDF of a continuous random variable is defined as just the derivative of its CDF: \[f_x(a) = \frac{dF_x(a)}{da}\]

The PDF of the standard uniform distribution

The PDF of a standard uniform random variable is: \[f_x(a) = \begin{cases} 0 & a < 0 \\ 1 & a \in [0,1] \\ 0 & a > 1 \\ \end{cases}\] which looks like this:

Figure 4.4: PDF for the standard uniform distribution

Now, in order to work with continuous random variables we would need to use integral calculus. Integral calculus is taught in MATH 158, which is not a prerequisite for this course, so:

  • Most of my examples will be for the discrete case.
  • I will briefly show you the math for the continuous case, but I will not expect you to do it.
  • Most of the results I give you will apply for both cases.

Integral calculus for continuous random variables

I have defined the PDF for a continuous random variable based on its CDF, but we can also go the other way and calculate the CDF from the PDF. The formula for that calculation is: \[F_x(a) = \int_{-\infty}^a f_x(v)dv\] More generally the probability of \(x\) being between any two numbers is:

\[\Pr(a \leq x \leq b) = F_x(b) - F_x(a) = \int_a^b f_x(v)dv\]

Unless you have taken MATH 152 or MATH 158, you may have no idea what this is or how to solve it. That’s OK! All you need to know for this course is that it can be solved.

4.2 The properties of a random variable

The probability distribution of a random variable is fully described by its PDF or CDF. However, we will often be interested in describing the random variable with a few simple summary numbers.

For example, we might be interested in the most common value (also called the mode), or we might be interested in a “typical” value, or we might be interested in a simple measure of how much the random variable tends to vary. All of these quantities can be defined and calculated from the PDF or CDF.

4.2.1 The mode

Roughly speaking, the mode of a random variable is its most likely value (i.e., the value with the highest PDF).

The mode in roulette

  • The mode of \(w_{red}\) is \(-1\). That is, losing $1 is the most likely outcome.
  • The mode of \(w_{14}\) is also \(-1\).

What is the mode of \(b\), given that all 37 values in its support are equally likely? In order to answer that, we would need to define the mode more precisely than just “the most likely value.” But this is an introductory course, so let’s just leave it at that.

4.2.2 Range

The range of a random variable is the interval from the minimum value in its support to the maximum value in the support.

The range in roulette

  • The range of \(w_{red}\) is \([-1,1]\).
  • The range of \(w_{14}\) is \([-1,35]\).
  • The range of \(b\) is \([0,36]\).

4.2.3 Medians, quantiles and percentiles

Roughly speaking, the median of a random variable is the value in the “middle” of the distribution, i.e., we have the same probability of being above the median as we do of being below the median.

As with the mode, this simple definition works fine in some cases but not others. But in this case, let’s fine-tune the definition so it always works.

We will start by defining something more general called a quantile or percentile. Let \(x\) be a random variable with CDF \(F_x\), and let \(\alpha\) be any number between zero and one. Then the \(\alpha\) quantile of \(x\) is defined as:

\[q_\alpha(x) = \min\{a: \Pr(x \leq a) \geq \alpha\} = \min\{a: F_x(a) \geq \alpha\}\] The \(\alpha\) quantile of a distribution is also called the \(100*\alpha\) percentile; for example the 0.25 quantile of \(x\) is also called the 25th percentile of \(x\).

Quantiles in roulette

The CDF of \(w_{red}\) is: \[F_{red}(a) = \begin{cases}0 & a < -1 \\ 0.514 & -1 \leq a < 1 \\ 1 & a \geq 1 \\ \end{cases}\]

So its 0.25 quantile (25th percentile) is: \[q_{0.25}(w_{red}) = \min\{a: \Pr(w_{red} \leq a) \geq 0.25\} = \min [-1,\infty) = -1\] its 0.5 quantile (50th percentile) is: \[q_{0.5}(w_{red}) = \min\{a: \Pr(w_{red} \leq a) \geq 0.5\} = \min [-1,\infty) = -1\] and its 0.75 quantile (75th percentile) is: \[q_{0.75}(w_{red}) = \min\{a: \Pr(w_{red} \leq a) \geq 0.75\} = \min [1,\infty) = 1\]
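The definition translates directly into base R. A minimal sketch (the function and vector names are mine):

```r
# q_alpha(x) = smallest support point a with F_x(a) >= alpha
support_red <- c(-1, 1)
F_red <- c(19/37, 1)      # CDF evaluated at each support point

quantile_from_cdf <- function(alpha) min(support_red[F_red >= alpha])

quantile_from_cdf(0.25)   # -1
quantile_from_cdf(0.50)   # -1  (the median)
quantile_from_cdf(0.75)   #  1
```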

Having precisely defined quantiles and percentiles, we can now precisely define the median: the median is the 0.5 quantile or 50th percentile.

The median in roulette

The median of \(w_{red}\) is just its 0.5 quantile or 50th percentile: \[median(w_{red}) = q_{0.5}(w_{red}) = -1\]

4.2.4 The expected value

Both the median and mode are often interpreted as measures of a random variable’s “central tendency”, or as predictions of its “typical” value. The expected value (also called the mean) is another measure of central tendency.

4.2.4.1 Definition of expected value

The expected value of a random variable \(x\) is written \(E(x)\). When \(x\) is discrete, it is defined as:

\[E(x) = \sum_{a \in S_x} a\Pr(x=a) = \sum_{a \in S_x} af_x(a)\] The expected value is also called the mean, the population mean or the expectation of the random variable.

The expected value in roulette

The expected value of \(b\) is: \[E(b) = 0*\underbrace{f_b(0)}_{1/37} + 1*\underbrace{f_b(1)}_{1/37} + \cdots + 36*\underbrace{f_b(36)}_{1/37} = 18\]

The expected value of \(w_{red}\) is: \[E(w_{red}) = -1*\underbrace{f_{red}(-1)}_{19/37} + 1*\underbrace{f_{red}(1)}_{18/37} \approx -0.027\] The expected value of \(w_{14}\) is: \[E(w_{14}) = -1*\underbrace{f_{14}(-1)}_{36/37} + 35*\underbrace{f_{14}(35)}_{1/37} \approx -0.027\] That is, each dollar bet on red leads to an average loss of 2.7 cents for the bettor, as does each dollar bet on 14.
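As a numerical check, here is the same probability-weighted sum in base R (the names are mine):

```r
# E(x) as a probability-weighted sum over the support
support_b <- 0:36
f_b <- rep(1/37, 37)
sum(support_b * f_b)       # 18

support_14 <- c(-1, 35)
f_14 <- c(36/37, 1/37)
sum(support_14 * f_14)     # -1/37, about -0.027
```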

We can think of the expected value of a random variable as a weighted average of its possible values, with each value weighted by the probability of observing that value.

The expected value for a continuous random variable

When \(x\) is continuous, its expected value is defined as: \[E(x) = \int_{-\infty}^{\infty} af_x(a)da\] Notice that this looks just like the definition for the discrete case, but with the sum replaced by an integral.

There is even a general definition that covers both discrete and continuous variables, as well as any mix between them: \[E(x) = \int_{-\infty}^{\infty} a \, dF_x(a)\] Again, I do not expect you to understand, remember, or use either of these definitions, only to know that they exist.

4.2.4.2 Properties of the expected value

In addition to taking the expected value of \(x\), we can also take the expected value of any function of \(x\): \[E(g(x)) = \sum_{s \in S_x} g(s)\Pr(x = s) = \sum_{s \in S_x} g(s)f_x(s)\] Now, remember that the expected value is a sum, and so it has some of the same properties as sums. In particular, it is linear: for any constants \(a\) and \(b\), \[E(a + bx) = a + bE(x)\] That is, we can take the expected value “inside” any linear function. This will turn out to be a very handy property.

Unfortunately, this handy property applies only to linear functions. If \(g(\cdot)\) is not a linear function, then in general \(E(g(x)) \neq g(E(x))\). For example: \[E(x^2) \neq E(x)^2\] \[E( 1/x ) \neq 1 / E(x)\] Students frequently make this mistake, so try to avoid it.

Using the linearity of the expected value

Suppose we bet $100 on red. Our net payout will be \(100w_{red}\) so our expected payout will be: \[E(100 w_{red}) = 100 \, E(w_{red}) = 100*(-0.027) = -2.7\] In other words, the average result of a $100 bet on red is a loss of $2.70.

4.2.5 Variance and standard deviation

The mode, median and expected value all aim to describe a typical or central value of the random variable. We are also interested in measures of how much the random variable varies. We have already seen one such measure, the range, but there are others, including the variance and standard deviation.

The variance of a random variable \(x\) is defined as: \[\sigma_x^2 = var(x) = E((x-E(x))^2)\] The standard deviation of a random variable is defined as the (positive) square root of its variance. \[\sigma_x = sd(x) = \sqrt{var(x)}\] Both variance and standard deviation can be thought of as measures of how much \(x\) tends to deviate from its central tendency \(E(x)\).

Variance and standard deviation in roulette

The variance of \(r\) is: \[var(r) = (0-\underbrace{E(r)}_{18/37})^2 *\frac{19}{37} + (1-\underbrace{E(r)}_{18/37})^2 * \frac{18}{37} \approx 0.25\] and its standard deviation is: \[sd(r) = \sqrt{var(r)} \approx 0.5\]

The variance of \(w_{red}\) is: \[var(w_{red}) = (-1-\underbrace{E(w_{red})}_{\approx -0.027})^2 * \frac{19}{37} + (1-\underbrace{E(w_{red})}_{\approx -0.027})^2 * \frac{18}{37} \approx 1.0\] and its standard deviation is \[sd(w_{red}) = \sqrt{var(w_{red})} \approx 1.0\]

The variance of \(w_{14}\) is \[var(w_{14}) = (-1-\underbrace{E(w_{14})}_{\approx -0.027})^2 * \frac{36}{37} + (35-\underbrace{E(w_{14})}_{\approx -0.027})^2 * \frac{1}{37} \approx 34.1\] and its standard deviation is \[sd(w_{14}) = \sqrt{var(w_{14})} \approx 5.8\] That is, a bet on 14 has the same expected payout as a bet on red, but its payout is much more variable.
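The same calculation as a base R sketch (the names are mine):

```r
# var(x) = E[(x - E(x))^2], computed directly from the PDF
support_14 <- c(-1, 35)
f_14 <- c(36/37, 1/37)

mu_14  <- sum(support_14 * f_14)              # about -0.027
var_14 <- sum((support_14 - mu_14)^2 * f_14)  # about 34.1
sd_14  <- sqrt(var_14)                        # about 5.8
```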

The variance is the expected value (sum) of a square, which implies several standard properties:

  • It is always non-negative: \[var(x) \geq 0\] \[sd(x) \geq 0\]
  • For any constants \(a\) and \(b\): \[var(a +bx) = b^2 var(x)\] \[sd(a +bx) = |b| \, sd(x)\]
  • The variance can be written as: \[var(x) = E(x^2) - E(x)^2\]

We can easily derive these properties but we will skip that now.

4.3 Standard distributions

Some probability distributions appear so often in applications that we have given them names. We will go through a few of the most important ones below.

4.3.1 Discrete uniform

The discrete uniform distribution is a distribution that puts equal probability on every value in a discrete set \(S_x\). Its PDF is: \[f_x(a) = \begin{cases} 1/|S_x| & a \in S_x \\ 0 & a \notin S_x \\ \end{cases}\] Discrete uniform distributions appear in gambling and similar applications.

The discrete uniform distribution in roulette

In our roulette example, the outcome \(b\) has a discrete uniform distribution on \(\Omega = \{0,1,\ldots,36\}\).

4.3.2 Bernoulli

The Bernoulli probability distribution is usually written: \[x \sim Bernoulli(p)\] It has discrete support \(S_x = \{0,1\}\) and PDF: \[f_x(a) = \begin{cases} (1-p) & a = 0 \\ p & a = 1 \\ 0 & a = \textrm{anything else}\\ \end{cases}\] We typically use Bernoulli random variables to model the probability of some event \(A\). If we define \(x\) as the indicator variable \(x=I(A)\), then \(x \sim Bernoulli(p)\) where \(p=\Pr(A)\).

The mean and variance of a \(Bernoulli(p)\) random variable are: \[E(x) = (1-p)*0 + p*1 = p\] \[var(x) = E[(x-E(x))^2] = E[(x-p)^2] = (1-p)(0-p)^2 + p(1-p)^2 = p(1-p)\]

The Bernoulli distribution in roulette

The variable \(r = I(b \in Red)\) has the \(Bernoulli(18/37)\) distribution.
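A quick simulation check in base R: `rbinom()` with `size = 1` draws Bernoulli random variables (a Bernoulli is a binomial with \(n=1\)). The seed and sample size are arbitrary choices of mine:

```r
# Simulate r ~ Bernoulli(18/37) and compare to the formulas E(x) = p, var(x) = p(1-p)
set.seed(1)                 # arbitrary seed, for reproducibility
p <- 18/37
r <- rbinom(100000, size = 1, prob = p)

mean(r)   # close to p, about 0.486
var(r)    # close to p*(1 - p), about 0.25
```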

4.3.3 Binomial

The binomial probability distribution is usually written: \[x \sim Binomial(n,p)\] It has discrete support \(S_x = \{0,1,2,\ldots,n\}\) and its PDF is: \[f_x(a) = \begin{cases} \frac{n!}{a!(n-a)!} p^a(1-p)^{n-a} & a \in S_x \\ 0 & \textrm{anything else} \\ \end{cases}\] The binomial distribution is typically used to model frequencies or counts.

Let \((b_1,b_2,\ldots,b_n)\) be a sequence of \(n\) independent random variables from the \(Bernoulli(p)\) distribution and let: \[x = \sum_{i=1}^n b_i\] count up the number of times that \(b_i\) is equal to one (i.e., the event modeled by \(b_i\) happened). Then it is possible to derive the distribution of \(x\), and it turns out to be \(Binomial(n,p)\).

I won’t derive the formula for the binomial PDF, but the intuition is simple: \(\frac{n!}{a!(n-a)!}\) is the number of outcomes in which \(x=a\), and \(p^a(1-p)^{n-a}\) is the probability of each of those outcomes.

The mean and variance of a binomial random variable are: \[E(x) = np\] \[var(x) = np(1-p)\]

The binomial distribution in roulette

Suppose we play 50 games of roulette, and bet on red in every game. Let \(WIN50\) be the number of times we win.

Since the outcome of a single bet on red is \(r \sim Bernoulli(18/37)\), this means that \(WIN50 \sim Binomial(50,18/37)\).
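R's `dbinom()` and `pbinom()` functions are the binomial PDF and CDF, so probabilities for \(WIN50\) are one-liners. For instance (the cutoff of 25 wins is my own illustration):

```r
# WIN50 ~ Binomial(50, 18/37)
n <- 50
p <- 18/37

dbinom(25, size = n, prob = p)       # Pr(WIN50 = 25)
pbinom(24, size = n, prob = p)       # Pr(WIN50 <= 24)
1 - pbinom(24, size = n, prob = p)   # Pr(WIN50 >= 25): winning at least half
n * p                                # E(WIN50), about 24.3
```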

4.3.4 Uniform and standard uniform

The uniform probability distribution is usually written \[x \sim U(L,H)\] where \(L < H\). It is a continuous probability distribution with support \(S_x = [L,H]\) and PDF: \[f_x(a) = \begin{cases}\frac{1}{H-L} & a \in S_x \\ 0 & \textrm{otherwise} \\ \end{cases}\] The uniform distribution puts equal probability on all values between \(L\) and \(H\). We have already seen the standard uniform distribution, which is just the \(U(0,1)\) distribution.

Uniform distributions are commonly used by computers because:

  • It is easy for a computer to generate a random number from the standard uniform distribution.
  • You can generate a random variable with any probability distribution you like by following these steps (sketched in code just below this list):
    1. Generate a random variable \(q \sim U(0,1)\).
    2. Calculate \(x = F^{-1}(q)\) where \(F^{-1}\) is the inverse CDF of the distribution you want.
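Here is a minimal base R sketch of those two steps for the discrete variable \(w_{red}\), whose inverse CDF we can write down by hand (the seed and sample size are arbitrary choices of mine):

```r
# Inverse-CDF sampling: turn U(0,1) draws into draws of w_red
set.seed(42)                # arbitrary seed
q <- runif(100000)          # step 1: standard uniform draws

# Step 2: apply the inverse CDF of w_red.
# F_red jumps from 0 to 19/37 at -1 and from 19/37 to 1 at +1,
# so the inverse maps q <= 19/37 to -1 and everything else to +1.
w_red <- ifelse(q <= 19/37, -1, 1)

mean(w_red == -1)   # close to 19/37, about 0.514
```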

Every video game you have ever played is constantly generating \(U(0,1)\) random numbers and using them to determine the behavior of non-player characters, the location of resources, etc. Without that element of randomness, these games would be way too predictable to be much fun.

The mean and variance of the \(U(L,H)\) distribution are: \[E(x) = \frac{L+H}{2}\] \[var(x) = \frac{(H-L)^2}{12}\] As with all continuous random variables, these calculations would require integration, which is beyond the scope of this course.
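Although deriving these formulas requires integration, checking them does not. A quick Monte Carlo sketch in base R (the endpoints 2 and 5, the seed, and the sample size are my own example values):

```r
# Simulate U(2, 5) draws and compare to the formulas above
set.seed(7)               # arbitrary seed
x <- runif(100000, min = 2, max = 5)

mean(x)   # close to (2 + 5)/2 = 3.5
var(x)    # close to (5 - 2)^2/12 = 0.75
```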

4.3.5 Normal and standard normal

The normal distribution is typically written as: \[ x \sim N(\mu,\sigma^2)\] It is a continuous distribution with support \(S_x = \mathbb{R}\) and PDF: \[f_x(a) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(a-\mu)^2}{2\sigma^2}}\] The normal distribution is also called the Gaussian distribution.

The normal distribution looks strange, but it turns out to be a very important one in statistics for two reasons:

  1. Any linear function of a normally distributed random variable is also normally distributed. That is, suppose that \(x \sim N(\mu,\sigma^2)\) and let \(y = a + bx\) where \(a\) and \(b\) are any constants. Then \(y \sim N(a+b\mu,b^2\sigma^2)\).
  2. A very important result called the Central Limit Theorem tells us that many random variables have a distribution that is well-approximated by the normal distribution. We will discuss this in much more detail later.

The mean and variance of a \(N(\mu,\sigma^2)\) random variable are: \[E(x) = \mu\] \[var(x) = \sigma^2\]

The \(N(0,1)\) distribution is also called the standard normal distribution. The standard normal distribution is so useful that we have a special symbol for its PDF: \[\phi(a) = \frac{1}{\sqrt{2\pi}} e^{-\frac{a^2}{2}}\] and its CDF: \[\Phi(a) = \int_{-\infty}^a \phi(b)db\] The standard normal CDF \(\Phi(\cdot)\) does not have a closed-form expression, but it is easy to calculate on a computer and is available as a built-in function in Excel, R or any other program used to analyze data.

Why is this useful? Well, remember that linear functions of normal random variables are also normal. This will allow us to calculate the CDF of any \(N(\mu,\sigma^2)\) random variable using the standard normal CDF.

Consider a random variable \(x \sim N(\mu,\sigma^2)\). Define another random variable \(z = \frac{x-\mu}{\sigma} = -\frac{\mu}{\sigma} + \frac{1}{\sigma}x\). This is a linear function of \(x\), so: \[z \sim N\left(-\frac{\mu}{\sigma} + \frac{\mu}{\sigma},\; \sigma^2*\left(\frac{1}{\sigma}\right)^2\right)\] or equivalently \(z \sim N(0,1)\).

This implies: \[\begin{align} F_x(a) &= \Pr\left(x \leq a\right) \\ &= \Pr\left( \frac{x-\mu}{\sigma} \leq \frac{a-\mu}{\sigma}\right)\\ &= \Pr\left( z \leq \frac{a-\mu}{\sigma}\right) \\ &= \Phi\left(\frac{a-\mu}{\sigma}\right) \end{align}\] Since the standard normal CDF is available as a built-in function in Excel or R, we can use this result to calculate the CDF for any normally distributed random variable.
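In R, the built-in `pnorm()` function computes \(\Phi\), and it can also take the mean and standard deviation as arguments. A sketch with made-up parameter values of my own:

```r
# F_x(a) for x ~ N(mu, sigma^2), two equivalent ways
mu <- 10; sigma <- 2; a <- 13

pnorm((a - mu)/sigma)             # standardize, then apply Phi: about 0.933
pnorm(a, mean = mu, sd = sigma)   # same answer using pnorm's own arguments
```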


  1. Technically, it is the smallest closed set, but let’s ignore that for now.

  2. If you are unfamiliar with the notation here, please refer to Section A.3.4 in the Math Review Appendix.