Appendix B: Alternative Bootstrap Code
Mosaic Package
library(ggplot2) # graphing functions
library(dplyr) # data summary tools
library(mosaic) # using Mosaic for iterations
# Set default behavior of ggplot2 graphs to be black/white theme
theme_set(theme_bw())
This method uses the mosaic package and works well when the data are stored in data frames.
# read the Lakes data set
Lakes <- read.csv('http://www.lock5stat.com/datasets/FloridaLakes.csv')
# create the Estimated Sampling Distribution of xbar
BootDist <- mosaic::do(10000) *
  mosaic::resample(Lakes) %>%
  summarise(xbar = mean(AvgMercury))
# what columns does the data frame "BootDist" have?
head(BootDist)
## xbar
## 1 0.5166038
## 2 0.5516981
## 3 0.5594340
## 4 0.4775472
## 5 0.5035849
## 6 0.4896226
# show a histogram of the estimated sampling distribution of xbar
ggplot(BootDist, aes(x=xbar)) +
  geom_histogram() +
  ggtitle('Estimated Sampling distribution of xbar')
# calculate a quantile-based confidence interval
quantile(BootDist$xbar, c(0.025, 0.975))
## 2.5% 97.5%
## 0.4375425 0.6190613
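As a rough illustration of what one resampling step amounts to, the sketch below draws the rows of a data frame with replacement using base R only and computes the statistic on that single draw. The data frame `toy`, its columns `id` and `y`, and the values in it are made up for the example and are not part of the Lakes data. It is meant to convey the idea behind resampling rows, not to reproduce the internals of mosaic.

```r
# Hypothetical toy data frame standing in for Lakes (values are made up)
toy <- data.frame(id = 1:6, y = c(0.2, 0.5, 0.9, 1.1, 0.4, 0.7))

set.seed(1)  # only so the sketch is reproducible

# One bootstrap resample: draw n row indices with replacement
idx <- sample(nrow(toy), size = nrow(toy), replace = TRUE)
one.resample <- toy[idx, , drop = FALSE]

# The statistic computed on that single resample
one.xbar <- mean(one.resample$y)
one.xbar
```

Repeating this step many times and collecting the resulting means gives the estimated sampling distribution that the mosaic code above stores in BootDist.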
Base R Code
Here, no packages are used and the steps of the bootstrap are more explicit.
AvgMerc <- Lakes$AvgMercury
Boot.Its <- 10000 ### Number of iterations, like `R` in `boot`
Sample.Size <- length(AvgMerc)
BS.means <- numeric(Boot.Its) ### where each estimate is saved
for(j in 1:Boot.Its) BS.means[j] <- mean(sample(AvgMerc, Sample.Size, replace=TRUE))
hist(BS.means)
The 95% confidence interval can then be found with quantile(), in the same manner as above.
quantile(BS.means, c(0.025, 0.975))
## 2.5% 97.5%
## 0.4362264 0.6205660
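The base R steps can also be wrapped in a small reusable function. The sketch below is one possible way to do that, using replicate() in place of the explicit for loop. The function name `boot.mean.ci`, its arguments `its` and `level`, and the example vector `x` are hypothetical and chosen only for illustration.

```r
# Hypothetical helper: percentile bootstrap interval for the mean of a numeric vector
boot.mean.ci <- function(x, its = 10000, level = 0.95) {
  n <- length(x)
  # replicate() repeats the resample-and-summarise step `its` times
  bs.means <- replicate(its, mean(sample(x, n, replace = TRUE)))
  alpha <- 1 - level
  # lower and upper quantiles of the bootstrap means
  quantile(bs.means, c(alpha / 2, 1 - alpha / 2))
}

set.seed(570)  # only so the sketch is reproducible
x <- c(0.2, 0.5, 0.9, 1.1, 0.4, 0.7)  # made-up values, not the Lakes data
ci <- boot.mean.ci(x, its = 2000)
ci
```

With the Lakes data loaded, a call such as boot.mean.ci(Lakes$AvgMercury) should give an interval close to the ones shown above, although the exact endpoints vary from run to run because the resampling is random.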