14 One Proportion

14.1 Comparing various confidence intervals

We consider a binomial experiment (i.e., a situation where the data consist of the realizations of Bernoulli trials) and the task of building a confidence interval for the probability parameter.

n = 30 # sample size
theta = 0.1 # true value of the parameter
x = rbinom(n, 1, theta) # sample
y = sum(x) # number of 1's in the sample

The Clopper–Pearson interval is “exact”, and in fact conservative: its coverage probability is at least the nominal level. We compute below the two-sided 90% confidence interval.

test = binom.test(y, n, conf.level = 0.90)
I = round(test$conf.int, 3)
cat(I)

0.028 0.239
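As a cross-check, the Clopper–Pearson limits can be written as quantiles of beta distributions. The following is a minimal sketch: the helper name cp_interval is ours (it is not part of base R), and the count y = 3 is an arbitrary value chosen for illustration.

```r
# Clopper-Pearson limits via beta quantiles (illustrative helper, not from base R)
cp_interval = function(y, n, conf.level = 0.90) {
  alpha = 1 - conf.level
  # the limits are set to 0 and 1 at the boundary cases y = 0 and y = n
  lower = if (y == 0) 0 else qbeta(alpha/2, y, n - y + 1)
  upper = if (y == n) 1 else qbeta(1 - alpha/2, y + 1, n - y)
  c(lower, upper)
}

# compare with binom.test for an arbitrary count
cp_manual = cp_interval(3, 30)
cp_builtin = as.numeric(binom.test(3, 30, conf.level = 0.90)$conf.int)
all.equal(cp_manual, cp_builtin)
```

The two results should agree up to numerical precision.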

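The Wilson interval is “approximate” in terms of level, relying on the normal approximation to the binomial distribution. Note that prop.test applies a continuity correction by default (correct = TRUE), which widens the interval compared with the plain Wilson interval. (The two intervals tend to be closer the larger the sample size is.)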

test = prop.test(y, n, conf.level = 0.90)
I = round(test$conf.int, 3)
cat(I)

0.031 0.246
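For reference, the Wilson interval without continuity correction has a closed form. The sketch below writes it out and compares it with prop.test called with correct = FALSE; by default prop.test uses correct = TRUE, so the interval printed above is the continuity-corrected variant. The helper name wilson_interval is ours (hypothetical, not part of base R), and y = 3 is an arbitrary count chosen for illustration.

```r
# Wilson score interval without continuity correction (illustrative helper)
wilson_interval = function(y, n, conf.level = 0.90) {
  z = qnorm((1 + conf.level)/2)  # standard normal quantile
  p = y/n                        # sample proportion
  center = (p + z^2/(2*n)) / (1 + z^2/n)
  half = z * sqrt(p*(1 - p)/n + z^2/(4*n^2)) / (1 + z^2/n)
  c(center - half, center + half)
}

# plain Wilson interval, by hand and via prop.test without correction
w_manual = wilson_interval(3, 30)
w_plain = as.numeric(prop.test(3, 30, conf.level = 0.90, correct = FALSE)$conf.int)
all.equal(w_manual, w_plain)

# default call (with continuity correction), for comparison of lengths
w_corrected = as.numeric(prop.test(3, 30, conf.level = 0.90)$conf.int)
diff(w_corrected) > diff(w_plain)
```

Both comparisons should return TRUE: the hand computation matches prop.test with correct = FALSE, and the corrected interval is the longer of the two.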

We compare the coverage and the length of these two intervals in a simulation.

n = 30
theta = 0.1
B = 1e3 # number of replicates
cp_in = 0 # stores the number of times that the CP interval contains the true value
cp_len = numeric(B) # stores the length of the CP interval
w_in = 0 # stores the number of times that the W interval contains the true value
w_len = numeric(B) # stores the length of the W interval
for (b in 1:B){
  y = rbinom(1, n, theta) # we generate y directly to save time
  cp = binom.test(y, n, conf.level = 0.90)$conf.int # CP interval
  cp_in = cp_in + (cp[1] <= theta)*(cp[2] >= theta)
  cp_len[b] = cp[2] - cp[1]
  w = prop.test(y, n, conf.level = 0.90)$conf.int # W interval
  w_in = w_in + (w[1] <= theta)*(w[2] >= theta)
  w_len[b] = w[2] - w[1]
}

cp_in/B # coverage of the CP interval

[1] 0.938

w_in/B # coverage of the W interval

[1] 0.979

summary(cp_len) # length of the CP interval

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.09503 0.18335 0.21078 0.20463 0.23276 0.30613

summary(w_len) # length of the W interval

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.1110  0.1901  0.2148  0.2096  0.2349  0.3030
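Since y can only take the values 0, ..., n, the coverage probabilities estimated above can also be computed exactly, by summing the binomial probabilities of those counts whose interval contains the true value. The following is a minimal sketch of that computation under the same settings (n = 30, theta = 0.1, level 90%); the names covers, cp_cover, w_cover, cp_cov and w_cov are ours.

```r
n = 30
theta = 0.1
y_all = 0:n # all possible values of the count

# does a given interval contain the true value?
covers = function(ci) (ci[1] <= theta) & (ci[2] >= theta)

# for each possible count, check whether each interval covers theta
cp_cover = sapply(y_all, function(y) covers(binom.test(y, n, conf.level = 0.90)$conf.int))
w_cover = sapply(y_all, function(y) covers(prop.test(y, n, conf.level = 0.90)$conf.int))

# exact coverage = total probability of the counts whose interval covers theta
prob = dbinom(y_all, n, theta)
cp_cov = sum(prob[cp_cover]) # exact coverage of the CP interval
w_cov = sum(prob[w_cover]) # exact coverage of the W interval (with continuity correction)
c(cp_cov, w_cov)
```

These exact values should be close to the simulated proportions, with the difference attributable to Monte Carlo error over the B replicates.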