14 One Proportion
14.1 Comparing various confidence intervals
We consider a binomial experiment (i.e., a situation where the data consist of the outcomes of independent Bernoulli trials with a common success probability) and the task of building a confidence interval for that probability parameter.
n = 30 # sample size
theta = 0.1 # true value of the parameter
x = rbinom(n, 1, theta) # sample
y = sum(x) # number of 1's in the sample
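The Clopper–Pearson interval is “exact” in that it is derived from the binomial distribution itself rather than from an approximation; it is in fact conservative, meaning that its coverage probability is at least the nominal level. We compute below the two-sided 90% confidence interval.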
test = binom.test(y, n, conf.level = 0.90)
I = round(test$conf.int, 3)
cat(I)
0.028 0.239
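For reference, the Clopper–Pearson limits can be written as quantiles of beta distributions. The sketch below illustrates that closed form and checks it against binom.test. The helper name cp_interval is hypothetical, and the count y = 3 is an assumption chosen for illustration rather than the sample drawn above.

```r
# Hypothetical helper (not part of base R): Clopper-Pearson limits via beta quantiles.
cp_interval = function(y, n, conf.level = 0.90) {
  alpha = 1 - conf.level
  # The lower limit is 0 when no successes are observed.
  lower = if (y == 0) 0 else qbeta(alpha / 2, y, n - y + 1)
  # The upper limit is 1 when every trial is a success.
  upper = if (y == n) 1 else qbeta(1 - alpha / 2, y + 1, n - y)
  c(lower, upper)
}

n = 30
y = 3 # illustrative count (an assumption)
by_hand = cp_interval(y, n)
built_in = as.numeric(binom.test(y, n, conf.level = 0.90)$conf.int)
print(round(by_hand, 3))
print(all.equal(by_hand, built_in)) # checks agreement with binom.test
```

Writing the limits this way makes the conservativeness visible: each limit is chosen so that the corresponding one-sided binomial tail probability equals alpha/2, and the discreteness of y means the realized coverage can only exceed the nominal level.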
The Wilson interval is “approximate” in terms of level, relying on the normal approximation to the binomial distribution. (The two intervals tend to get closer as the sample size grows.)
test = prop.test(y, n, conf.level = 0.90)
I = round(test$conf.int, 3)
cat(I)
0.031 0.246
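To make the normal approximation concrete, the Wilson limits can also be computed directly from the score formula. The sketch below is a minimal illustration, with a hypothetical helper wilson_interval and an assumed count y = 3. Note that prop.test applies a continuity correction by default (correct = TRUE), so the uncorrected formula is compared against prop.test with correct = FALSE.

```r
# Hypothetical helper (not part of base R): Wilson score interval, no continuity correction.
wilson_interval = function(y, n, conf.level = 0.90) {
  z = qnorm(1 - (1 - conf.level) / 2) # two-sided normal quantile
  p_hat = y / n
  center = (p_hat + z^2 / (2 * n)) / (1 + z^2 / n)
  half = z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(center - half, center + half)
}

n = 30
y = 3 # illustrative count (an assumption)
by_hand = wilson_interval(y, n)
no_cc = as.numeric(prop.test(y, n, conf.level = 0.90, correct = FALSE)$conf.int)
with_cc = as.numeric(prop.test(y, n, conf.level = 0.90)$conf.int) # default: correct = TRUE
print(round(rbind(by_hand, no_cc, with_cc), 3))
```

The continuity-corrected interval is wider than the uncorrected one here, which is worth keeping in mind when reading coverage figures obtained with the default settings of prop.test.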
We compare these two intervals in simulations.
n = 30
theta = 0.1
B = 1e3 # number of replicates
cp_in = 0 # stores the number of times that the CP interval contains the true value
cp_len = numeric(B) # stores the length of the CP interval
w_in = 0 # stores the number of times that the W interval contains the true value
w_len = numeric(B) # stores the length of the W interval
for (b in 1:B){
  y = rbinom(1, n, theta) # we generate y directly to save time
  cp = binom.test(y, n, conf.level = 0.90)$conf.int # CP interval
  cp_in = cp_in + (cp[1] <= theta)*(cp[2] >= theta)
  cp_len[b] = cp[2] - cp[1]
  w = prop.test(y, n, conf.level = 0.90)$conf.int # W interval
  w_in = w_in + (w[1] <= theta)*(w[2] >= theta)
  w_len[b] = w[2] - w[1]
}
cp_in/B # coverage of the CP interval
[1] 0.938
w_in/B # coverage of the W interval
[1] 0.979
summary(cp_len) # length of the CP interval
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.09503 0.18335 0.21078 0.20463 0.23276 0.30613
summary(w_len) # length of the W interval
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.1110 0.1901 0.2148 0.2096 0.2349 0.3030
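Because y takes only n + 1 possible values, the coverage at a given theta can also be computed exactly, by summing the binomial probabilities of the counts whose interval contains theta, rather than estimated by simulation. The sketch below is one way to do this under the same n, theta, and 90% level as above; the helper covers is hypothetical.

```r
n = 30
theta = 0.1
ys = 0:n # all possible values of the count y

# Hypothetical helper: for each possible y, does the interval contain theta?
covers = function(interval_fun) {
  sapply(ys, function(y) {
    ci = interval_fun(y)
    (ci[1] <= theta) && (ci[2] >= theta)
  })
}

cp_covers = covers(function(y) binom.test(y, n, conf.level = 0.90)$conf.int)
w_covers = covers(function(y) prop.test(y, n, conf.level = 0.90)$conf.int)

pmf = dbinom(ys, n, theta) # probability of each count under the true theta
cp_cov = sum(pmf * cp_covers) # exact coverage of the CP interval
w_cov = sum(pmf * w_covers) # exact coverage of the W interval as computed by prop.test
print(c(cp = cp_cov, w = w_cov))
```

The simulated coverages are Monte Carlo estimates of these two quantities, so they should agree with them up to simulation error of order 1/sqrt(B).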