4.5 Random Variate Generation
Up to this point we have investigated how to generate numbers between 0 and 1 and how to assess the quality of those randomly generated numbers.
For simulation models we want to be more generally able to simulate observations that appear to be realizations of random variables with known distributions. We now study how this can be done. But before this, let’s see how R implements random variate generation.
4.5.1 Random Generation in R
In the next few sections we will learn results that allow for the simulation of random observations from generic distributions. No matter how the methods work, they have a very simple and straightforward implementation in R.
We have already learned that we can simulate observations from the uniform between zero and one using the code runif(N)
where N
is the number of observations to simulate. We can notice that it is similar to the commands dunif
and punif
we have already seen for the pdf and cdf of the uniform.
Not surprisingly we can generate observations from any random variable using the syntax r
followed by the naming of the variable chosen. So for instance:
runif
generates random observations from the Uniform;rnorm
generates random observations from the Normal;rexp
generates random observations from the Exponential;rbinom
generates random observations from the Binomial;rpois
generates random observations from the Poisson;
Each of these functions takes as first input the number of observations that we want to simulate. They then have additional inputs that can be given, which depend on the random variable chosen and are the same that we saw in the past.
So for instance
rnorm(10, mean = 1, sd = 2)
generates ten observations from a Normal distribution with mean 1 and standard deviation 2.
4.5.2 The Inverse Transform Method
The simplest method to simulate observations from generic random variables is the so-called inverse transform method.
Suppose we are interested in a random variable \(X\) whose cdf is \(F\). We must assume that \(F\) is:
known and written in closed-form;
continuous;
strictly increasing.
Then one can prove that \[ X = F^{-1}(U), \] where \(F^{-1}\) is the inverse of \(F\) and \(U\) is a continuous uniform distribution between zero and one.
The above results gives us the following algorithm to simulate observations from a random variable \(X\) with distribution \(F\):
compute the inverse \(F^{-1}\) of \(F\);
generate independent random observations \(u_1,u_2,\dots,u_N\) from a Uniform between zero and one;
compute \(x_1=F^{-1}(u_1), x_2=F^{-1}(u_2),\dots,x_N=F^{-1}(u_n)\).
Then \(x_1,x_2,\dots,x_N\) are independent random observations of the random variable \(X\).
So the above algorithm can be applied to generate observations from continuous random variables with a cdf in closed-form. Therefore, for instance the above algorithm cannot be straightforwardly used for Normals. However it can be used to simulate Exponential and Uniform distributions. Furthermore, it cannot be directly used to simulate discrete random variables. A simple adaptation of this method, which we will not see here, can however be used for discrete random variables.
4.5.2.1 Simulating Exponentials
Recall that if \(X\) is Exponential with parameter \(\lambda\) then its cdf is \[ F(x)= 1- e^{-\lambda x}, \hspace{1cm} \mbox{for } x\geq 0 \] Suppose we want to simulate observations \(x_1,\dots,x_N\) from such a distribution.
First we need to compute the inverse of \(F\). This means solving the equation: \[ 1-e^{-\lambda x} = u \] for \(x\). This can be done following the steps: \[\begin{eqnarray*} 1-e^{-\lambda x} &=& u\\ e^{-\lambda x} &=& 1 - u\\ -\lambda x &=& \log(1-u)\\ x &=& -\frac{1}{\lambda}\log(1-u) \end{eqnarray*}\] So \(F^{-1}(u)=-\log(1-u)/\lambda\).
Second we need to simulate random uniform observations using
runif
.Last, we apply the inverse function to the randomly simulated observations.
Let’s give the R code.
set.seed(2021)
# Define inverse function
<- function(u,lambda) -log(1-u)/lambda
invF # Simulate 5 uniform observations
<- runif(5)
u # Compute the inverse
invF(u, lambda = 2)
## [1] 0.3000720 0.7657289 0.6183896 0.2404265 0.5057456
First we defined the inverse function of an Exponential with parameter lambda
in invF
. Then we simulated five observations from a uniform. Last we applied the function invF
for a parameter lambda
equal to two. The output are therefore five observations from an Exponential random variable with parameter 2.
4.5.2.2 Simulating Generic Uniforms
We know how to simulate uniformly between 0 and 1, but we do not know how to simulate uniformly between two generic values \(a\) and \(b\).
Recall that the cdf of the uniform distribution between \(a\) and \(b\) is \[ F(x)=\frac{x-a}{b-a}, \hspace{2cm} \mbox{for } a\leq x \leq b \] The inverse transform method requires the inverse of \(F\), which using simple algebra can be computed as \[ F^{-1}(u)=a + (b-a)u \] So given a sequence \(u_1,\dots,u_N\) of random observations from a Uniform between 0 and 1, we can simulate numbers at uniform between \(a\) and \(b\) by computing \[ x_1 = a + (b-a)u_1,\dots, x_N=a+(b-a)u_N \] In R:
set.seed(2021)
<- 2
a <- 6
b + (b-a)*runif(5) a
## [1] 3.805069 5.135119 4.838729 3.526977 4.545295
The code simulates five observations from a Uniform between two and six. This can be equally achieved by simply using:
set.seed(2021)
runif(5, min = 2, max = 6)
## [1] 3.805069 5.135119 4.838729 3.526977 4.545295
Notice that since we fixed the seed, the two methods return exactly the same sequence of numbers.
4.5.3 Simulating Bernoulli and Binomial
Bernoulli random variables represent binary experiments with a probability of success equal to \(\theta\). A simple simulation algorithm to simulate one Bernoulli observation is:
generate \(u\) uniformly between zero and one;
if \(u< \theta\) set \(x=0\), otherwise \(x=1\).
We will not prove that this actually works, but it intuitively does. Let’s code it in R.
set.seed(2021)
<- 0.5
theta <- runif(5)
u <- ifelse(u < theta, 0, 1)
x x
## [1] 0 1 1 0 1
So here we simulated five observations from a Bernoulli with parameter 0.5: the toss of a fair coin. Three times the coin showed head, and twice tails.
From this comment, it is easy to see how to simulate one observation from a Binomial: by simply summing the randomly generated observations from Bernoullis. So if we were to sum the five numbers above, we would get one random observations from a Binomial with parameter \(n=5\) and \(\theta=0.5\).
4.5.4 Simulating Other Distributions
There are many other algorithms that allow to simulate specific as well as generic random variables. Since these are a bit more technical we will not consider them here, but it is important for you to know that we now can simulate basically any random variable you are interested in!