8.6 The Nonparametric Bootstrap

In this section, we describe the easiest and most common form of the bootstrap: the nonparametric bootstrap. As we shall see, the nonparametric bootstrap procedure is very similar to a Monte Carlo simulation experiment. The main difference is how the random data is generated. In a Monte Carlo experiment, the random data is created from a computer random number generator for a specific probability distribution (e.g., normal distribution). In the nonparametric bootstrap, the random data is created by resampling with replacement from the original data.

The procedure for the nonparametric bootstrap is as follows:

  1. Resample. Create \(B\) bootstrap samples by sampling with replacement from the original data \(\{r_{1},\ldots,r_{T}\}\).43 Each bootstrap sample has \(T\) observations (the same as the original sample): \[\begin{eqnarray*} \left\{r_{11}^{*},r_{12}^{*},\ldots,r_{1T}^{*}\right\} & = & \mathrm{1st\,bootstrap\,sample}\\ & \vdots\\ \left\{r_{B1}^{*},r_{B2}^{*},\ldots,r_{BT}^{*}\right\} & = & \mathrm{Bth\,bootstrap\,sample} \end{eqnarray*}\]
  2. Estimate \(\theta\). From each bootstrap sample estimate \(\theta\) and denote the resulting estimate \(\hat{\theta}^{*}\). There will be \(B\) values of \(\hat{\theta}^{*}\): \(\left\{\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\right\}\).
  3. Compute statistics. Using \(\left\{\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\right\}\), compute estimates of the bias and standard error of \(\hat{\theta}\), and an approximate 95% confidence interval for \(\theta\).

This procedure is very similar to the procedure used to perform a Monte Carlo experiment described in Chapter 7. The main difference is how the hypothetical samples are created. With the nonparametric bootstrap, the hypothetical samples are created by resampling with replacement from the original data. In this regard the bootstrap treats the sample as if it were the population. This ensures that the bootstrap samples inherit the same distribution as the original data – whatever that distribution may be. If the original data is normally distributed, then the nonparametric bootstrap samples will also be normally distributed. If the data is Student’s t distributed, then the nonparametric bootstrap samples will be Student’s t distributed. With Monte Carlo simulation, the hypothetical samples are simulated under an assumed model and distribution. This requires one to specify values for the model parameters and the distribution from which to simulate.

8.6.1 Bootstrap bias estimate

The nonparametric bootstrap can be used to estimate the bias of an estimator \(\hat{\theta}\) using the bootstrap estimates \(\{\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\}\). The key idea is to treat the empirical distribution (i.e., histogram) of the bootstrap estimates \(\{\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\}\) as an approximation to the unknown distribution of \(\hat{\theta}.\)

Recall that the bias of an estimator is defined as

\[\begin{equation*} E[\hat{\theta}]-\theta. \end{equation*}\]

The bootstrap estimate of bias is given by

\[\begin{align} \widehat{\mathrm{bias}}_{boot}(\hat{\theta},\theta)= \bar{\theta}^{\ast} - \hat{\theta} = \frac{1}{B}\sum_{j=1}^{B}\hat{\theta}_{j}^{\ast}-\hat{\theta}.\tag{8.13}\\ \text{(bootstrap mean - estimate)}\nonumber \end{align}\]

The bootstrap bias estimate (8.13) is the difference between the mean of the bootstrap estimates of \(\theta\) and the sample estimate of \(\theta\). This is similar to the Monte Carlo estimate of bias discussed in Chapter 7. However, the Monte Carlo estimate of bias is the difference between the mean of the Monte Carlo estimates of \(\theta\) and the true value of \(\theta.\) The bootstrap estimate of bias does not require knowing the true value of \(\theta\). Effectively, the bootstrap treats the sample estimate \(\hat{\theta}\) as the population value \(\theta\) and the bootstrap mean \(\bar{\theta}^{\ast} = \frac{1}{B}\sum_{j=1}^{B}\hat{\theta}_{j}^{\ast}\) as an approximation to \(E[\hat{\theta}]\). Here, \(\widehat{\mathrm{bias}}_{boot}(\hat{\theta},\theta)=0\) if the center of the histogram of \(\{\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\}\) is at \(\hat{\theta}.\)

Given the bootstrap estimate of bias (8.13), we can compute a bias-adjusted estimate:

\[\begin{equation} \hat{\theta}_{adj} = \hat{\theta} - (\bar{\theta}^{\ast} - \hat{\theta}) = 2\hat{\theta} - \bar{\theta}^{\ast} \tag{8.14} \end{equation}\]

If the bootstrap estimate of bias is large, it may be tempting to use the bias-adjusted estimate (8.14) in place of the original estimate. This is generally not done in practice because the bias adjustment introduces extra variability into the estimator. So what use is the bootstrap bias estimate? It tells you whether or not your estimate appears to be biased, and this information can influence decisions you make based on the estimate.
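To make the computations concrete, the sketch below evaluates (8.13) and (8.14) for a hypothetical sample estimate theta.hat and vector of bootstrap estimates theta.boot (placeholder objects with simulated values, used purely for illustration):

set.seed(123)
theta.hat = 0.10                       # hypothetical sample estimate
theta.boot = rnorm(1000, 0.11, 0.02)   # hypothetical bootstrap estimates
# bootstrap bias estimate (8.13): bootstrap mean minus sample estimate
bias.boot = mean(theta.boot) - theta.hat
# bias-adjusted estimate (8.14)
theta.adj = 2*theta.hat - mean(theta.boot)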

8.6.2 Bootstrap standard error estimate

For a scalar estimate \(\hat{\theta}\), the bootstrap estimate of \(\widehat{\mathrm{se}}(\hat{\theta})\) is given by the sample standard deviation of the bootstrap estimates \(\{\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\}\):

\[\begin{equation} \widehat{\mathrm{se}}_{boot}(\hat{\theta})=\sqrt{\frac{1}{B-1}\sum_{j=1}^{B}\left(\hat{\theta}_{j}^{\ast} - \bar{\theta}^{\ast} \right)^{2}},\tag{8.15} \end{equation}\]

where \(\bar{\theta}^{\ast} = \frac{1}{B}\sum_{j=1}^{B}\hat{\theta}_{j}^{\ast}\). Here, \(\widehat{\mathrm{se}}_{boot}(\hat{\theta})\) is the size of a typical deviation of a bootstrap estimate, \(\hat{\theta}^{*}\), from the mean of the bootstrap estimates (a typical deviation from the middle of the histogram of the bootstrap estimates). This is very closely related to the Monte Carlo estimate of \(\widehat{\mathrm{se}}(\hat{\theta})\), which is the sample standard deviation of the estimates of \(\theta\) from the Monte Carlo samples.

For a \(k \times 1\) vector estimate \(\hat{\theta}\), the bootstrap estimate of \(\widehat{\mathrm{var}}(\hat{\theta})\) is

\[\begin{equation} \widehat{\mathrm{var}}(\hat{\theta}) = \frac{1}{B-1}\sum_{j=1}^B (\hat{\theta}_{j}^{\ast} - \bar{\theta}^{\ast})(\hat{\theta}_{j}^{\ast} - \bar{\theta}^{\ast})'. \end{equation}\]
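When the \(B\) bootstrap estimates are stacked in a \(B\times k\) matrix, with one bootstrap estimate per row, this is exactly the sample covariance matrix computed by the R function cov(). A minimal sketch, using a simulated stand-in for the matrix of bootstrap estimates:

# theta.boot.mat: B x k matrix, row j holds the j-th bootstrap estimate of theta
theta.boot.mat = matrix(rnorm(1000*2), nrow=1000, ncol=2)   # simulated stand-in
var.boot = cov(theta.boot.mat)    # k x k bootstrap variance matrix estimate
se.boot = sqrt(diag(var.boot))    # bootstrap standard error of each element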

8.6.3 Bootstrap confidence intervals

In addition to providing standard error estimates, the bootstrap is commonly used to compute confidence intervals for scalar parameters. There are several ways of computing confidence intervals with the bootstrap and which confidence interval to use in practice depends on the characteristics of the distribution of the bootstrap estimates {\(\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\)}.

If the bootstrap distribution appears to be normal then you can compute a 95% confidence interval for \(\theta\) using two simple methods. The first method mimics the rule-of-thumb that is justified by the CLT to compute a 95% confidence interval but uses the bootstrap standard error estimate (8.15):

\[\begin{equation} \hat{\theta}\pm2\times\widehat{\mathrm{se}}_{boot}(\hat{\theta}) = [\hat{\theta} - 2\times \widehat{\mathrm{se}}_{boot}(\hat{\theta}), ~ \hat{\theta} + 2\times \widehat{\mathrm{se}}_{boot}(\hat{\theta})].\tag{8.16} \end{equation}\]

The 95% confidence interval (8.16) is called the normal approximation 95% confidence interval.
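Using the hypothetical theta.hat and theta.boot objects from the sketch in Section 8.6.1, (8.16) is two lines of R:

se.boot = sd(theta.boot)
# normal approximation 95% confidence interval (8.16)
c(lower = theta.hat - 2*se.boot, upper = theta.hat + 2*se.boot)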

The second method directly uses the distribution of the bootstrap estimates {\(\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\)} to form a 95% confidence interval:

\[\begin{equation} [\hat{q}_{.025}^{*},\,\hat{q}_{.975}^{*}],\tag{8.17} \end{equation}\]

where \(\hat{q}_{.025}^{*}\) and \(\hat{q}_{.975}^{*}\) are the 2.5% and 97.5% empirical quantiles of {\(\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\)}, respectively. By construction, 95% of the bootstrap estimates lie in the quantile-based interval (8.17). This 95% confidence interval is called the bootstrap percentile 95% confidence interval.

The main advantage of (8.17) over (8.16) is transformation invariance: if (8.17) is a 95% confidence interval for \(\theta\) and \(f(\theta)\) is a monotonically increasing function of \(\theta\), then \([f(\hat{q}_{.025}^{*}),\,f(\hat{q}_{.975}^{*})]\) is a 95% confidence interval for \(f(\theta)\). The interval (8.17) can also be asymmetric, whereas (8.16) is always symmetric about \(\hat{\theta}\).
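The percentile interval (8.17) is one call to the R function quantile(), and the transformation invariance property is easy to verify numerically. Continuing with the hypothetical theta.boot vector:

# percentile 95% confidence interval (8.17): empirical 2.5% and 97.5% quantiles
quantile(theta.boot, probs=c(0.025, 0.975))
# transformation invariance: for monotonically increasing f, e.g. f(x) = exp(x),
# the interval for f(theta) is f applied to the interval endpoints
exp(quantile(theta.boot, probs=c(0.025, 0.975)))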

The bootstrap normal approximation and percentile 95% confidence intervals (8.16) and (8.17) will perform well (i.e., will have approximately correct coverage probability) if the bootstrap distribution looks normally distributed. However, if the bootstrap distribution is asymmetric then these confidence intervals may have coverage probability different from 95%, especially if the asymmetry is substantial. The basic shape of the bootstrap distribution can be visualized with the histogram of \(\{\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\}\), and normality can be visually evaluated using the normal QQ-plot of \(\{\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\}\). If the histogram is asymmetric and/or not bell-shaped, or if the normal QQ-plot is not linear, then normality is suspect and the confidence intervals (8.16) and (8.17) may be inaccurate.

If the bootstrap distribution is asymmetric then it is recommended to use Efron's bias- and skewness-adjusted bootstrap percentile confidence (BCA) interval. The details of how to compute the BCA interval are a bit complicated and so are omitted; the interested reader is referred to Chapter 10 of Hansen (2011) for a good explanation. Suffice it to say that the 95% BCA method adjusts the 2.5% and 97.5% quantiles of the bootstrap distribution for bias and skewness. The BCA confidence interval can be computed using the boot function boot.ci() with optional argument type="bca".44

8.6.4 Performing the Nonparametric Bootstrap in R

The nonparametric bootstrap procedure is easy to perform in R. You can implement the procedure by "brute force" in very much the same way as you perform a Monte Carlo experiment. In this approach you program all parts of the bootstrapping procedure. Alternatively, you can use the R package boot, which contains functions that automate certain parts of the bootstrapping procedure.

8.6.4.1 Brute force implementation

In the brute force implementation you program all parts of the bootstrap procedure. This typically involves four steps:

  1. Sample with replacement from the original data using the R function sample() to create \(\{r_{1}^{*},r_{2}^{*},\ldots,r_{T}^{*}\}\). Do this \(B\) times.
  2. Compute \(B\) values of the statistic of interest \(\hat{\theta}\) from the bootstrap samples, giving \(\{\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\}\). Use \(B=1,000\) for routine applications of the bootstrap; to minimize simulation noise, \(B=10,000\) or higher may be required.
  3. Compute the bootstrap bias and SE values from \(\{\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\}\) using the R functions mean() and sd(), respectively.
  4. Compute the histogram and normal QQ-plot of \(\{\hat{\theta}_{1}^{*},\ldots,\hat{\theta}_{B}^{*}\}\) using the R functions hist() and qqnorm(), respectively, to see if the bootstrap distribution looks normal.

Typically steps 1 and 2 are performed inside a for loop or within the R function apply().

In step 1, sampling with replacement from the original data is performed with the R function sample(). To illustrate, consider sampling with replacement from the \(5\times1\) return vector:

r = c(0.1,0.05,-0.02,0.03,-0.04)
r
## [1]  0.10  0.05 -0.02  0.03 -0.04

Recall, in R we can extract elements from a vector by subsetting based on their location, or index, in the vector. Since the vector r has five elements, its index is the integer vector 1:5. To extract the first, second, and fifth elements of r use the index vector c(1, 2, 5):

r[c(1, 2, 5)]
## [1]  0.10  0.05 -0.04

Using this idea, you can extract a random sample (of any given size) with replacement from r by creating a random sample with replacement of the integers \(\{1,2,\ldots,5\}\) and using this set of integers to extract the sample from r. The R function sample() can be used for this. When you pass a positive integer value n to sample(), with the optional argument replace=TRUE, it returns a random sample from the set of integers from \(1\) to n. For example, to create a random sample with replacement of size 5 from the integers \(\{1,2,\ldots,5\}\) use:45

set.seed(123)
idx = sample(5, replace=TRUE)
idx
## [1] 3 3 2 2 3

We can then get a random sample with replacement from the vector r by subsetting using the index vector idx:

r[idx]
## [1] -0.02 -0.02  0.05  0.05 -0.02

This two-step process is automated when you pass a vector of observations to sample():

set.seed(123)
sample(r, replace=TRUE)
## [1] -0.02 -0.02  0.05  0.05 -0.02
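Putting these pieces together, steps 1 and 2 of the brute force procedure can also be written compactly without an explicit for loop using the R function replicate() (a convenience wrapper around sapply(), in the spirit of the apply() approach mentioned above). A sketch using the toy return vector r and the sample mean as the statistic:

set.seed(123)
B = 1000
# each replication resamples r with replacement and computes the statistic
muhat.boot = replicate(B, mean(sample(r, replace=TRUE)))
mean(muhat.boot)   # bootstrap mean
sd(muhat.boot)     # bootstrap standard error estimate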
Example 2.35 (Brute force nonparametric bootstrap of GWN model in R)

Consider using the nonparametric bootstrap to compute estimates of the bias and standard error for \(\hat{\mu}\) in the GWN model for Microsoft. The R code for the brute force “for loop” to implement the nonparametric bootstrap is:

B = 1000 
muhat.boot = rep(0, B) 
n.obs = nrow(msftRetS) 
set.seed(123) 
for (i in 1:B) {   
   boot.data = sample(msftRetS, n.obs, replace=TRUE)   
   muhat.boot[i] = mean(boot.data) 
}

The bootstrap bias estimate is:

mean(muhat.boot) - muhatS
## [1] 0.000483

which is very close to zero and is consistent with \(\hat{\mu}\) being unbiased. The bootstrap standard error estimate is

sd(muhat.boot)
## [1] 0.00796

and is equal (to three decimals) to the analytic standard error estimate computed earlier. This confirms that the nonparametric bootstrap accurately computes \(\widehat{\mathrm{se}}(\hat{\mu}).\) Figure 8.2 shows the histogram and normal QQ-plot of the bootstrap estimates created with:

par(mfrow=c(1,2))   
hist(muhat.boot, col="cornflowerblue")   
abline(v=muhatS, col="white", lwd=2)   
qqnorm(muhat.boot, col="cornflowerblue")   
qqline(muhat.boot) 

Figure 8.2: Histogram and normal QQ-plot of bootstrap estimates of \(\mu\) for Microsoft.

The distribution of the bootstrap estimates looks normal. As a result, the bootstrap 95% confidence interval for \(\mu\) has the form:

se.boot = sd(muhat.boot)
lower = muhatS - 2*se.boot
upper = muhatS + 2*se.boot
ans = cbind(muhatS, se.boot, lower, upper)
colnames(ans) = c("Estimate", "Std Error", "2.5%", "97.5%")
rownames(ans) = "mu"
ans
##    Estimate Std Error     2.5%  97.5%
## mu  0.00915   0.00796 -0.00677 0.0251

\(\blacksquare\)

It is important to keep in mind that the bootstrap resampling algorithm draws random samples from the observed data. In the example above, the random number generator in R was initialized with set.seed(123). If the seed is set to a different integer, say 345, then the \(B=1,000\) bootstrap samples will be different and we will get different numerical results for the bootstrap bias and standard error estimates. How different these results will be depends on the number of bootstrap samples \(B\) and the statistic \(\hat{\theta}\) being bootstrapped. Generally, the smaller \(B\) is, the more variable the bootstrap results will be. It is good practice to repeat the bootstrap analysis with several different random number seeds to see how variable the results are, as in the sketch below. If the results vary considerably across seeds then it is advised to increase the number of bootstrap simulations \(B\). It is common to use \(B=1,000\) for routine calculations, and \(B=10,000\) for final results.
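A minimal version of this check, reusing msftRetS, n.obs, and B from the example above (the seed values are arbitrary):

seeds = c(123, 345, 678)
se.by.seed = rep(0, length(seeds))
for (s in seq_along(seeds)) {
  set.seed(seeds[s])
  muhat.boot = rep(0, B)
  for (i in 1:B) {
    muhat.boot[i] = mean(sample(msftRetS, n.obs, replace=TRUE))
  }
  se.by.seed[s] = sd(muhat.boot)
}
se.by.seed   # similar values across seeds suggest B is large enough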

8.6.4.2 R package boot

The R package boot implements a variety of bootstrapping techniques, including the basic nonparametric bootstrap described above. The boot package was written to accompany the textbook Bootstrap Methods and Their Application by Davison and Hinkley (1997). The two main functions in boot are boot() and boot.ci(). The boot() function implements the bootstrap for a statistic computed from a user-supplied function. The boot.ci() function computes bootstrap confidence intervals given the output from the boot() function.

The arguments to boot() are:

library(boot)
args(boot)
## function (data, statistic, R, sim = "ordinary", stype = c("i", 
##     "f", "w"), strata = rep(1, n), L = NULL, m = 0, weights = NULL, 
##     ran.gen = function(d, p) d, mle = NULL, simple = FALSE, ..., 
##     parallel = c("no", "multicore", "snow"), ncpus = getOption("boot.ncpus", 
##         1L), cl = NULL) 
## NULL

where data is a data object (typically a vector or matrix, but it can be a time series object too), statistic is a user-supplied function that computes the statistic of interest, and R is the number of bootstrap replications. The remaining arguments are not important for the basic nonparametric bootstrap. The function assigned to the argument statistic has to be written in a specific form, which is illustrated in the next example. The boot() function returns an object of class boot for which there are print and plot methods.

The arguments to boot.ci() are

args(boot.ci)
## function (boot.out, conf = 0.95, type = "all", index = 1L:min(2L, 
##     length(boot.out$t0)), var.t0 = NULL, var.t = NULL, t0 = NULL, 
##     t = NULL, L = NULL, h = function(t) t, hdot = function(t) rep(1, 
##         length(t)), hinv = function(t) t, ...) 
## NULL

where boot.out is an object of class boot, conf specifies the confidence level, and type is a subset of c("norm", "basic", "stud", "perc", "bca") indicating the type of confidence interval to compute. The choices "norm", "perc", and "bca" compute the normal confidence interval (8.16), the percentile confidence interval (8.17), and the BCA confidence interval, respectively. The remaining arguments are not important for the computation of these bootstrap confidence intervals.

Example 2.39 (Nonparametric bootstrap of GWN model using the R package boot)

To use the boot() function to implement the bootstrap for \(\hat{\mu},\) a function must be specified to compute \(\hat{\mu}\) for each bootstrap sample. The function must have two arguments: x and idx. Here, x represents the original data and idx represents the random integer index (created internally by boot()) used to subset x for each bootstrap sample. For example, a function to be passed to boot() for \(\hat{\mu}\) is

mean.boot = function(x, idx) { 
# arguments: 
# x      data to be resampled 
# idx    vector of scrambled indices created by boot() function 
# value: 
# ans   mean value computed using resampled data      
    ans = mean(x[idx])      
    ans 
}

To implement the nonparametric bootstrap for \(\hat{\mu}\) with 999 samples use

set.seed(123)
muhat.boot = boot(msftRetS, statistic = mean.boot, R=999) 
class(muhat.boot) 
## [1] "boot"
names(muhat.boot)
##  [1] "t0"        "t"         "R"         "data"      "seed"      "statistic"
##  [7] "sim"       "call"      "stype"     "strata"    "weights"

The returned object muhat.boot is of class boot. The component t0 is the sample estimate \(\hat{\mu}\), and the component t is a \(999\times1\) matrix containing the bootstrap estimates \(\{\hat{\mu}_{1}^{*},\ldots,\hat{\mu}_{999}^{*}\}\). The print method shows the sample estimate, the bootstrap bias and the bootstrap standard error:

muhat.boot
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = msftRetS, statistic = mean.boot, R = 999)
## 
## 
## Bootstrap Statistics :
##     original   bias    std. error
## t1*  0.00915 0.000482     0.00758

These statistics can be computed directly from the components of muhat.boot:

c(muhat.boot$t0,mean(muhat.boot$t) - muhat.boot$t0, sd(muhat.boot$t))
## [1] 0.009153 0.000482 0.007584

A histogram and normal QQ-plot of the bootstrap values, shown in Figure 8.3, can be created using the plot method

plot(muhat.boot)

Figure 8.3: plot method for objects of class boot

Normal, percentile, and bias- and skewness-adjusted (BCA) percentile 95% confidence intervals can be computed using

boot.ci(muhat.boot, conf=0.95, type=c("norm", "perc", "bca"))
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 999 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = muhat.boot, conf = 0.95, type = c("norm", 
##     "perc", "bca"))
## 
## Intervals : 
## Level      Normal             Percentile            BCa          
## 95%   (-0.0062,  0.0235 )   (-0.0057,  0.0245 )   (-0.0062,  0.0239 )  
## Calculations and Intervals on Original Scale

Because the bootstrap distribution looks normal, the normal, percentile, and BCA confidence intervals are very similar.

The GWN model estimate \(\hat{\sigma}\) can be “bootstrapped” in a similar fashion. First, we write a function to compute \(\hat{\sigma}\) for each bootstrap sample

sd.boot = function(x, idx) {    
    ans = sd(x[idx])      
    ans 
}

Then we call boot() with statistic=sd.boot

set.seed(123)
sigmahat.boot = boot(msftRetS, statistic = sd.boot, R=999) 
sigmahat.boot
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = msftRetS, statistic = sd.boot, R = 999)
## 
## 
## Bootstrap Statistics :
##     original    bias    std. error
## t1*    0.101 -0.000783     0.00764

Figure 8.4: Bootstrap distribution for \(\hat{\sigma}\)

The bootstrap distribution, shown in Figure 8.4, looks a bit non-normal. The 95% confidence intervals are computed using

boot.ci(sigmahat.boot, conf=0.95, type=c("norm", "perc", "bca"))
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 999 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = sigmahat.boot, conf = 0.95, type = c("norm", 
##     "perc", "bca"))
## 
## Intervals : 
## Level      Normal             Percentile            BCa          
## 95%   ( 0.0873,  0.1173 )   ( 0.0864,  0.1162 )   ( 0.0892,  0.1200 )  
## Calculations and Intervals on Original Scale
## Some BCa intervals may be unstable

We see that all of these intervals are quite similar.

\(\blacksquare\)

Example 2.44 (Nonparametric bootstrap for example functions)

The real power of the bootstrap procedure comes when we apply it to plug-in estimates of functions like (8.1)-(8.4) for which there are no easy analytic formulas for estimating bias and standard errors. In this situation, the bootstrap procedure easily computes numerical estimates of the bias, standard error, and 95% confidence interval. All we need is an R function for computing the plug-in function estimate in a form suitable for boot(). The R functions for the example functions are:

f1.boot = function(x, idx, alpha=0.05) {
  # normal quantile of returns: q = muhat + sigmahat*qnorm(alpha)
  q = mean(x[idx]) + sd(x[idx])*qnorm(alpha)
  q
}

f2.boot = function(x, idx, alpha=0.05, w0=100000) {
  # normal VaR for simple returns: VaR = -w0*q
  q = mean(x[idx]) + sd(x[idx])*qnorm(alpha)
  VaR = -w0*q
  VaR
}

f3.boot = function(x, idx, alpha=0.05, w0=100000) {
  # normal VaR for continuously compounded returns: VaR = -(exp(q) - 1)*w0
  q = mean(x[idx]) + sd(x[idx])*qnorm(alpha)
  VaR = -(exp(q) - 1)*w0
  VaR
}

f4.boot = function(x, idx, r.f=0.0025) {
  # Sharpe ratio: SR = (muhat - r.f)/sigmahat
  SR = (mean(x[idx]) - r.f)/sd(x[idx])
  SR
}

The functions \(f_1\), \(f_2\), and \(f_3\) have the additional argument alpha, which specifies the tail probability for the quantile; the functions \(f_2\) and \(f_3\) have the additional argument w0 for the initial wealth invested; and the function \(f_4\) has the additional argument r.f for the risk-free rate.
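boot() passes any additional named arguments through to the statistic function, so these defaults can be overridden without rewriting the functions. For example, to bootstrap \(f_2\) with a 1% tail probability and $50,000 of initial wealth (illustrative values, not ones used elsewhere in this chapter):

set.seed(123)
# extra named arguments after R are forwarded to f2.boot's alpha and w0 arguments
f2.bs.01 = boot(msftRetS, statistic = f2.boot, R = 999, alpha = 0.01, w0 = 50000)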

To compute the nonparametric bootstrap for \(f_1\) with \(\alpha=0.05\) use:

set.seed(123)
f1.bs = boot(msftRetS, statistic = f1.boot, R=999) 
f1.bs
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = msftRetS, statistic = f1.boot, R = 999)
## 
## 
## Bootstrap Statistics :
##     original  bias    std. error
## t1*   -0.158 0.00177      0.0124

Here, the bootstrap estimate of bias is small and the bootstrap estimated standard error is similar to the delta method and jackknife standard errors.

plot(f1.bs)

The bootstrap distribution looks normal, and the three methods for computing 95% confidence intervals are essentially the same:

boot.ci(f1.bs, type=c("norm", "perc", "bca"))
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 999 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = f1.bs, type = c("norm", "perc", "bca"))
## 
## Intervals : 
## Level      Normal             Percentile            BCa          
## 95%   (-0.184, -0.135 )   (-0.181, -0.132 )   (-0.189, -0.138 )  
## Calculations and Intervals on Original Scale
## Some BCa intervals may be unstable

To compute the nonparametric bootstrap for \(f_2\) with \(\alpha=0.05\) and \(W_0=\$100,000\) use:

set.seed(123)
f2.bs = boot(msftRetS, statistic = f2.boot, R=999) 
f2.bs
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = msftRetS, statistic = f2.boot, R = 999)
## 
## 
## Bootstrap Statistics :
##     original  bias    std. error
## t1*    15780    -177        1244

The bootstrap estimate of bias is small and the bootstrap estimated standard error is similar to the delta method and jackknife standard errors.

plot(f2.bs)

The bootstrap distribution looks normal, and the three bootstrap 95% confidence intervals are all very similar:

boot.ci(f2.bs, type=c("norm", "perc", "bca"))
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 999 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = f2.bs, type = c("norm", "perc", "bca"))
## 
## Intervals : 
## Level      Normal             Percentile            BCa          
## 95%   (13518, 18395 )   (13161, 18089 )   (13801, 18940 )  
## Calculations and Intervals on Original Scale
## Some BCa intervals may be unstable

To compute the nonparametric bootstrap for \(f_3\) with \(\alpha=0.05\) and \(W_0=\$100,000\) use:

set.seed(123)
f3.bs = boot(msftRetC, statistic = f3.boot, R=999) 
f3.bs
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = msftRetC, statistic = f3.boot, R = 999)
## 
## 
## Bootstrap Statistics :
##     original  bias    std. error
## t1*    14846    -182        1223

The bias is small and the bootstrap estimated standard error is larger than the delta method standard error and is closer to the jackknife standard error.

plot(f3.bs)

The bootstrap distribution looks a little asymmetric and the BCA confidence interval is a bit different from the normal and percentile intervals:

boot.ci(f3.bs, type=c("norm", "perc", "bca"))
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 999 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = f3.bs, type = c("norm", "perc", "bca"))
## 
## Intervals : 
## Level      Normal             Percentile            BCa          
## 95%   (12631, 17426 )   (12295, 17155 )   (12994, 18061 )  
## Calculations and Intervals on Original Scale
## Some BCa intervals may be unstable

Finally, to compute the bootstrap for \(f_4\) with \(r_f = 0.0025\) use:

set.seed(123)
f4.bs = boot(msftRetS, statistic = f4.boot, R=999) 
f4.bs
## 
## ORDINARY NONPARAMETRIC BOOTSTRAP
## 
## 
## Call:
## boot(data = msftRetS, statistic = f4.boot, R = 999)
## 
## 
## Bootstrap Statistics :
##     original  bias    std. error
## t1*   0.0655 0.00391      0.0739

The bootstrap estimated bias is small and the standard error is similar to the delta method and jackknife standard errors.

plot(f4.bs)

Here, the bootstrap distribution is slightly asymmetric.

boot.ci(f4.bs, conf=0.95, type=c("norm", "perc", "bca"))
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 999 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = f4.bs, conf = 0.95, type = c("norm", "perc", 
##     "bca"))
## 
## Intervals : 
## Level      Normal             Percentile            BCa          
## 95%   (-0.0832,  0.2065 )   (-0.0821,  0.2132 )   (-0.0897,  0.2030 )  
## Calculations and Intervals on Original Scale

Notice that the percentile and BCA confidence intervals are a bit different from the normal approximation interval.

\(\blacksquare\)

8.6.5 Pros and cons of the nonparametric bootstrap

The nonparametric bootstrap is an extremely useful and powerful statistical technique. The main advantages (pros) are:

  • General procedure to estimate bias and standard errors, and to compute confidence intervals, that does not rely on asymptotic distributions.

  • Can be used for scalar and vector estimates.

  • Under general conditions, bootstrap distributions of estimators converge to the asymptotic distribution of the estimators.

  • In certain situations, bootstrap results can be more accurate than analytic results computed from asymptotic distributions.

There are some things to watch out for when using the bootstrap (cons):

  • It can be computationally intensive, but this is not much of an issue with today's fast computers.

  • With small \(B\), bootstrap results can vary substantially across simulations with different random number seeds.

  • There are situations where the bootstrap does not work. A leading case is when the bootstrap is applied to a function that can become unbounded (e.g., a ratio of means when the denominator mean is close to zero). In these cases, a trimmed version of the bootstrap is recommended. See Hansen (2011) for a discussion of this issue.

  • You have to be careful applying the bootstrap to time series data. If there is serial correlation in the data then the nonparametric bootstrap may give poor results. There are versions of the bootstrap (e.g., the block bootstrap) that work with serially correlated data. For example, the tseries function tsbootstrap() is designed to work with serially correlated time series data (see the sketch below).
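As a rough sketch of this last point, the code below applies a block bootstrap to the mean of a simulated AR(1) series. The call follows the tsbootstrap() interface from the tseries documentation (number of resamples nb, block length b); treat the argument details as an assumption to verify against your installed version of the package.

library(tseries)
set.seed(123)
x = arima.sim(model = list(ar = 0.5), n = 250)   # serially correlated AR(1) series
# block bootstrap of the sample mean: nb resamples built from blocks of length b
boot.means = tsbootstrap(x, nb = 999, statistic = mean, type = "block", b = 10)
sd(boot.means)   # block bootstrap standard error estimate of the mean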

References

Davison, A., and D. Hinkley. 1997. Bootstrap Methods and Their Application. Cambridge: Cambridge University Press.


  43. We can also bootstrap data from multiple assets by defining \(r_t\) as a vector of asset returns.

  44. The bootstrap percentile-t confidence interval is another bias- and skewness-adjusted confidence interval.

  45. We use set.seed(123) to initialize the random number generator in R and allow for reproducible results.