Exercises

Exercise 5.1 The daily water consumption in the homes of a specific population can be assumed to be normally distributed with \(\sigma = 16.8\) liters. After an intense campaign to reduce water consumption, we want to estimate the average daily consumption per home in that population.

a. What sample size is required to obtain a \(\mathrm{CI}_{0.95}(\mu)\) with \(2\) liters of error margin?
b. If the value of \(\sigma\) increases, would the needed sample size be bigger or smaller than that required in part a?
c. If the significance level is \(\alpha=0.10,\) would the needed sample size be bigger or smaller than that required in part a?
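
A minimal sketch of how the sample size could be computed in R, assuming the usual interval for the mean of a normal population with known \(\sigma,\) whose error margin is \(z_{\alpha/2}\sigma/\sqrt{n}.\) The helper name `sample_size` and the value \(\sigma=20\) are illustrative assumptions, not part of the exercise.

```r
# Smallest n such that z_{alpha/2} * sigma / sqrt(n) <= margin
# (assumes a normal population with known sigma)
sample_size <- function(sigma, margin, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  ceiling((z * sigma / margin)^2)
}

sample_size(sigma = 16.8, margin = 2)                # alpha = 0.05
sample_size(sigma = 20, margin = 2)                  # larger sigma (illustrative)
sample_size(sigma = 16.8, margin = 2, alpha = 0.10)  # alpha = 0.10
```

Comparing the three calls shows how the required \(n\) reacts to changes in \(\sigma\) and \(\alpha.\)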

Exercise 5.2 During the quality control process carried out in a mobile phone company, it turned out that \(6\) out of \(120\) mobiles randomly selected were defective. Obtain a \(99\%\) confidence interval for the proportion of defective mobiles in the manufacturing process of this company.

Exercise 5.3 Assume a sample of size \(n=1\) from a \(\mathcal{U}(0,\theta)\) distribution, where \(\theta\) is unknown. Find the lower limit of a confidence interval at \(0.95\) confidence level for \(\theta.\)

Exercise 5.4 With the aim of assigning research grants to master's students in scientific programs, the Spanish Ministry of Education is analyzing the final grades of students in the scientific BSc’s of the last academic year.

a. Assuming that the final grades follow approximately a normal distribution, compute the confidence interval for the mean of the final grades if the following srs of \(15\) students is available: \(6.2,\) \(7.3,\) \(5.5,\) \(6.7,\) \(9.0,\) \(7.1,\) \(5.0,\) \(6.3,\) \(7.2,\) \(7.5,\) \(8.0,\) \(7.9,\) \(6.5,\) \(6.1,\) \(7.0.\)
b. Compute the confidence interval for the variance of the final grades.
c. Compute the confidence interval for the mean if the distribution of final grades is unknown and a srs of \(50\) students with \(\bar{X}=6.5\) and \(S'=1.3\) is available.
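
The three intervals can be sketched in R as follows. The exercise does not fix the confidence level, so \(\alpha=0.05\) below is an assumption; the variable names are illustrative.

```r
grades <- c(6.2, 7.3, 5.5, 6.7, 9.0, 7.1, 5.0, 6.3, 7.2, 7.5,
            8.0, 7.9, 6.5, 6.1, 7.0)
alpha <- 0.05  # assumed: the exercise does not state the level
n <- length(grades)
x_bar <- mean(grades)
s2 <- var(grades)  # quasivariance (denominator n - 1)

# Mean of a normal population with unknown variance: t-based interval
ci_mean <- x_bar + c(-1, 1) * qt(1 - alpha / 2, df = n - 1) * sqrt(s2 / n)

# Variance of a normal population: chi-square-based interval
ci_var <- (n - 1) * s2 / qchisq(c(1 - alpha / 2, alpha / 2), df = n - 1)

# Mean with unknown distribution and n = 50: asymptotic normal interval
ci_mean_asymp <- 6.5 + c(-1, 1) * qnorm(1 - alpha / 2) * 1.3 / sqrt(50)

ci_mean
ci_var
ci_mean_asymp
```

The first interval can be cross-checked against `t.test(grades)$conf.int`, which uses the same \(t\)-based construction.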

Exercise 5.5 In order to estimate the variance of the number of weekly study hours of the students in a master in statistics, a srs of \(20\) students is analysed, resulting in a quasistandard deviation of \(2.5\) hours. Determine a confidence interval for the variance at the confidence level \(0.90,\) assuming that the weekly study hours follow a normal distribution.

Exercise 5.6 The monthly gross salary of a certain group of professionals roughly follows a log-normal distribution (Exercise 5.19). A union gathers the following srs of salaries, measured in euros:

\[\begin{align*} 2072, 2726, 2254, 3029, 2283, 2673, 2401, 2463, 2416, 2909,\\ 2385, 2055, 2139, 2553, 2453, 2621, 2808, 3583, 2662, 2732,\\ 1996, 2164, 2036, 2915, 2507, 3245, 2337, 2672, 3089, 3012,\\ 2725, 2467, 2439, 2692, 1793, 3845, 2523, 2585, 1647, 2072,\\ 1988, 2767, 3679, 2280, 2642, 3112, 2512, 2435, 2820, 2142,\\ 3197, 3103, 2130, 3213, 2464, 2243, 2958, 2529, 2160, 2484. \end{align*}\]

The union is interested in reporting a \(95\%\)-confidence interval for the mean to inform its members about the average gross salary in the sector.

Exercise 5.7 The management of a medical clinic wants to estimate the average number of days required for the treatment of the patients with ages between \(25\) and \(34\) years. A srs of \(500\) patients of the clinic with these ages provided \(\bar{X}=5.4\) and \(S'=3.1\) days. Obtain a confidence interval at confidence level \(0.95\) for the mean stay time of the patients.

Exercise 5.8 A poll was conducted in the fall of 1979 by the Presidential Commission about the retirement policy in the USA. The poll revealed that a large proportion of citizens were very pessimistic about their retirement prospects. When asked whether they believed that their retirement pension was going to be sufficient, \(62.9\%\) of the \(6100\) interviewed citizens answered negatively. Compute a \(95\%\)-confidence interval for the proportion of citizens that believed that their pension would not be sufficient.

Exercise 5.9 Derive a confidence interval for \(\sigma,\) the standard deviation of a normal population, based on the derivation of Section 5.2.3.

Exercise 5.10 Consider \(X_1,\ldots,X_n\sim \mathcal{N}(\mu,\sigma^2)\) with known \(\sigma^2.\) Are you able to derive the following confidence intervals?

\(\mathrm{CI}_{1-\alpha}(e^{\mu}).\)
\(\mathrm{CI}_{1-\alpha}(a\mu+b).\)
\(\mathrm{CI}_{1-\alpha}(\mu^3).\)
\(\mathrm{CI}_{1-\alpha}(\mu^2).\)
\(\mathrm{CI}_{1-\alpha}(\sin(\mu)).\)

Reflect on your results.

Exercise 5.11 Replicate Figure 5.1 by coding in R the simulation study behind it.

Exercise 5.12 The oxygen consumption rate is a measure of the physiological activity of runners. Two groups of runners have been trained by two methods: one based on continuous training during a certain period of time each day, and another based on intermittent training with the same duration. Samples were taken from the oxygen consumption of the runners trained by both methods, obtaining the following descriptive statistics:

Continuous training	Intermittent training
\(n_1=9\)	\(n_2=7\)
\(\bar{X}_1=43.71\)	\(\bar{X}_2=39.63\)
\(S_1'^2=5.88\)	\(S_2'^2=7.68\)

It is assumed that the measurements come from independent normal populations with equal variance. Compute the confidence interval of the difference of oxygen consumption rates with a confidence level of \(0.95.\)
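
A possible sketch in R of the pooled-variance interval for \(\mu_1-\mu_2,\) assuming the standard \(t\)-based construction for two independent normal populations with equal variance. The variable names are illustrative.

```r
n1 <- 9; n2 <- 7
x_bar1 <- 43.71; x_bar2 <- 39.63
s2_1 <- 5.88; s2_2 <- 7.68  # quasivariances
alpha <- 0.05

# Pooled estimate of the common variance
s2_pooled <- ((n1 - 1) * s2_1 + (n2 - 1) * s2_2) / (n1 + n2 - 2)

# t-based interval with n1 + n2 - 2 degrees of freedom
half_width <- qt(1 - alpha / 2, df = n1 + n2 - 2) *
  sqrt(s2_pooled * (1 / n1 + 1 / n2))
ci_diff <- (x_bar1 - x_bar2) + c(-1, 1) * half_width
ci_diff
```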

Exercise 5.13 We want to compare the number of weekly study hours of the students graduated from a BSc in Statistics and a BSc in Economics. For that, we obtained a srs of \(20\) students graduated from the BSc in Statistics, with \(\bar{X}_1=3\) and \(S_1'=2.5,\) and another srs of \(30\) students graduated from the BSc in Economics, with \(\bar{X}_2=2.8\) and \(S_2'=2.7.\) Assume that the weekly study hours follow a normal distribution.

Compute a \(95\%\)-confidence interval for the ratio of variances of the weekly study hours for both types of students.
Assuming that the variances are equal, obtain a \(95\%\)-confidence interval for the difference of means of weekly study hours of the two types of students.
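
For the ratio of variances, one possible sketch in R relies on the fact that \((S_1'^2/\sigma_1^2)/(S_2'^2/\sigma_2^2)\) follows an \(F\) distribution with \(n_1-1\) and \(n_2-1\) degrees of freedom under normality. The variable names are illustrative.

```r
n1 <- 20; n2 <- 30
s1 <- 2.5; s2 <- 2.7  # quasistandard deviations
alpha <- 0.05

ratio <- s1^2 / s2^2

# Dividing by the upper F quantile gives the lower limit, and vice versa
ci_ratio <- ratio / qf(c(1 - alpha / 2, alpha / 2), df1 = n1 - 1, df2 = n2 - 1)
ci_ratio
```

Checking whether the resulting interval contains \(1\) gives an informal indication of whether the equal-variance assumption used for the difference of means is reasonable.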

Exercise 5.14 Consider \((X_{11},\ldots,X_{1n_1})\) and \((X_{21},\ldots,X_{2n_2})\) two independent srs of rv’s with \(\mathcal{N}(\mu_1,\sigma_1^2)\) and \(\mathcal{N}(\mu_2,\sigma_2^2)\) distributions, respectively. Prove or disprove that:

\(\bar{X}_1-\bar{X}_2\) and \(S^2_1/S^2_2\) are independent.
\(\bar{X}_1/\bar{X}_2\) and \(S^2_2/S^2_1\) are independent.
\(\bar{X}_1-\bar{X}_2\) and \(\bar{X}_1+\bar{X}_2\) are independent.
\(S^2_1/S^2_2\) and \(S^2_1S^2_2\) are independent.

Disprove the statements using simulations in R. Prove using mathematical reasoning.
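
A minimal simulation sketch for the disproving step. The parameter values are arbitrary assumptions. A sample correlation clearly away from zero contradicts independence, whereas a correlation close to zero is only compatible with it and does not replace a proof.

```r
set.seed(42)
M <- 5000
n1 <- 10; n2 <- 15
mu1 <- 1; mu2 <- 2; sigma1 <- 1; sigma2 <- 3

# Each column stores three statistics computed on one pair of samples.
# var() is the quasivariance; it differs from S^2 by a constant factor,
# which does not affect independence
stats <- replicate(M, {
  x1 <- rnorm(n1, mean = mu1, sd = sigma1)
  x2 <- rnorm(n2, mean = mu2, sd = sigma2)
  c(mean(x1) - mean(x2), mean(x1) + mean(x2), var(x1) / var(x2))
})

cor_diff_sum <- cor(stats[1, ], stats[2, ])    # difference vs. sum of means
cor_diff_ratio <- cor(stats[1, ], stats[3, ])  # difference of means vs. variance ratio
c(cor_diff_sum, cor_diff_ratio)
```

With these values, \(\sigma_1^2/n_1\neq\sigma_2^2/n_2,\) so the covariance between \(\bar{X}_1-\bar{X}_2\) and \(\bar{X}_1+\bar{X}_2\) is nonzero and the first correlation should be visibly negative.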

Exercise 5.15 Assume that the final grades of Exercise 5.4 are distributed as \(\Gamma(k,\beta)\) with shape \(k>0\) (known) and scale \(\beta>0\) (unknown).

a. Find the MLE of \(\beta\) and obtain its asymptotic distribution.
b. From the obtained asymptotic distribution, find a pivot and construct an asymptotic confidence interval for \(\beta.\)
c. Estimate the mean final grade and obtain an asymptotic confidence interval at significance level \(\alpha=0.05\) by using the results in parts a and b.
d. Compute the asymptotic confidence interval obtained in part c if we take a srs of \(50\) students with \(\bar{X}=6.5\) and \(S'=1.3.\) Compare the result with the one obtained in part c of Exercise 5.4. Assume \(k=50.\)

Exercise 5.16 Use R to validate by simulations the claimed coverage \(1-\alpha\) of the confidence interval for:

\(\mu\) given in Section 5.2.1.
\(\mu\) given in Section 5.2.2.
\(\sigma^2\) given in Section 5.2.3.

In order to accomplish the exercise efficiently, create a suitable R function that computes the confidence interval and checks if the parameter belongs to it. Then, approximate the coverage probability by the relative frequency of the event “parameter belongs to the confidence interval” using \(M=1000\) simulations. You may fix \(n=25.\) Set the remaining parameters at your convenience.
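
The suggested workflow can be sketched as follows for the interval for \(\mu\) in a normal population, assuming here that \(\sigma\) is known. The function name `covers_mu` and the parameter values \(\mu=0,\) \(\sigma=1,\) and \(\alpha=0.05\) are illustrative assumptions.

```r
# Draws one sample, builds the known-variance interval for mu,
# and returns TRUE if mu belongs to it
covers_mu <- function(n, mu, sigma, alpha) {
  x <- rnorm(n, mean = mu, sd = sigma)
  half_width <- qnorm(1 - alpha / 2) * sigma / sqrt(n)
  (mean(x) - half_width <= mu) && (mu <= mean(x) + half_width)
}

set.seed(123)
M <- 1000
coverage <- mean(replicate(M, covers_mu(n = 25, mu = 0, sigma = 1, alpha = 0.05)))
coverage  # relative frequency of "parameter belongs to the interval"
```

The other two intervals can be checked by replacing the body of the function with the corresponding \(t\)-based or \(\chi^2\)-based limits.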

Exercise 5.17 The number of needed corrections per page in a book follows a Poisson distribution with parameter \(\lambda.\) After proofreading \(30\) pages at random, an editor records the following numbers of required corrections: 1, 2, 2, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 2, 0, 1, 1, 2, 1, 0, 0, 0, 0, 2, 0, 2, 0, 0, 1. Compute a \(99\%\)-confidence interval for \(\lambda.\)
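
One possible approach in R is a Wald-type asymptotic interval, \(\hat{\lambda}\pm z_{\alpha/2}\sqrt{\hat{\lambda}/n},\) which plugs the MLE \(\hat{\lambda}=\bar{X}\) into the standard error. This is an assumption about the construction: a different pivot may be intended, in which case the limits would change slightly.

```r
corrections <- c(1, 2, 2, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 2,
                 0, 1, 1, 2, 1, 0, 0, 0, 0, 2, 0, 2, 0, 0, 1)
alpha <- 0.01
n <- length(corrections)

lambda_hat <- mean(corrections)  # MLE of lambda

# Wald-type interval: the variance of the sample mean is lambda / n
ci_lambda <- lambda_hat + c(-1, 1) * qnorm(1 - alpha / 2) * sqrt(lambda_hat / n)
ci_lambda
```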

Exercise 5.18 The pdf of the Rayleigh distribution is given by

\[\begin{align*} f(x;\lambda)=\frac{x}{\lambda^2}e^{-\frac{x^2}{2\lambda^2}},\quad x>0,\ \lambda>0. \end{align*}\]

Given a srs \((X_1,\ldots,X_n)\) from the distribution:

Find \(\hat{\lambda}_{\mathrm{MLE}}.\)
Derive \(\mathcal{I}(\lambda)\) or \(\hat{\mathcal{I}}(\lambda).\)
Find a pivot and construct an asymptotic confidence interval for \(\lambda.\)

Exercise 5.19 The pdf of the log-normal distribution \(\mathcal{LN}(\mu,\sigma^2)\) is given by

\[\begin{align*} f(x;\mu,\sigma^2)=\frac{1}{x\sqrt{2\pi}\sigma}e^{-\frac{(\log x-\mu)^2}{2\sigma^2}},\quad x>0,\,\mu\in\mathbb{R},\,\sigma>0. \end{align*}\]

Given a srs \((X_1,\ldots,X_n)\) from the distribution:

Find \((\hat{\mu}_{\mathrm{MLE}},\hat{\sigma}^2_{\mathrm{MLE}}).\)
Derive \(\mathcal{I}(\mu)\) and \(\mathcal{I}(\sigma^2).\)
Find pivots and construct asymptotic confidence intervals for \(\mu\) and \(\sigma^2.\)

Exercise 5.20 Use R to validate by simulations the claimed coverage \(1-\alpha\) of the asymptotic confidence interval for:

a. \(\lambda\) given in Example 5.9.
b. \(\lambda\) given in Example 5.11.
c. \(p_A-p_B\) given in Example 5.10.

In order to accomplish the exercise efficiently, create a suitable R function that computes the confidence interval and checks if the parameter belongs to it. Then, approximate the coverage probability by the relative frequency of the event “parameter belongs to the asymptotic confidence interval” using \(M=1000\) simulations. Use:

\(n=n_1=n_2=10\) (non-asymptotic case) and
\(n=n_1=n_2=100\) (asymptotic-like case).

Set the remaining parameters at your convenience.

Exercise 5.21 Compute bootstrap confidence intervals using boot_ci() for:

Part b in Exercise 5.4.
Exercise 5.6.

Exercise 5.22 In a referendum in a certain country, a famous news channel forecasted, using a poll of \(704\) people, that option A would obtain \(53.5\%\) of the votes and option B \(46.5\%.\) However, a political crisis broke out in the last days of the campaign, when no further polls are allowed, and made the voters of option A less prone to vote. On referendum day, the news channel interviews \(167\) voters after they have voted: \(77\) of them voted for option A and \(90\) for option B. While the votes are being counted, the news channel wants to anticipate whether a flip in the outcome of the referendum is to be expected. They want to be sure before the official results are announced.

Use boot_ci() to compute bootstrap \(99\%\)-confidence intervals for the percentage of votes for option A:

Before voting day.
On voting day.

What should the news channel report given the results?
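
The signature of boot_ci() is not reproduced here, so the following sketch uses a hypothetical stand-in, `percentile_boot_ci`, that implements a plain percentile bootstrap interval. It may differ from the bootstrap variant that boot_ci() implements. Only the voting-day sample is shown, since its counts are given exactly.

```r
# Hypothetical stand-in for boot_ci(): percentile bootstrap interval
percentile_boot_ci <- function(x, statistic, B = 2000, alpha = 0.01) {
  # Resample with replacement and recompute the statistic B times
  boot_stats <- replicate(B, statistic(sample(x, replace = TRUE)))
  unname(quantile(boot_stats, probs = c(alpha / 2, 1 - alpha / 2)))
}

set.seed(1)
# Votes coded as 1 (option A) or 0 (option B)
exit_poll <- rep(c(1, 0), times = c(77, 90))
ci_exit <- percentile_boot_ci(exit_poll, statistic = mean)
ci_exit  # interval for the proportion of votes for option A
```

Whether the interval lies entirely below \(0.5\) or straddles it is the relevant question for the channel's report.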

Exercise 5.23 Construct a bootstrap confidence interval based on the moment estimator \(\hat{\theta}_\mathrm{MM}\) in a \(\mathcal{U}(-\theta,\theta)\) distribution with \(\theta>0.\) For that purpose, use boot_ci(). Check with simulations that the coverage probability of the bootstrap interval is close to \(1-\alpha.\)

Exercise 5.24 The pmf of the negative binomial distribution \(\mathrm{NB}(r,p)\) is given by

\[\begin{align*} p(x;r,p)=\binom{x+r-1}{x}p^r(1-p)^x,\quad x=0,1,\ldots,\quad r>0,\ 0<p<1. \end{align*}\]

A negative binomial rv measures the number of failures before observing \(r\) successes in a sequence of independent Bernoulli trials with success probability \(p.\)

Use that \(\mathbb{E}[\mathrm{NB}(r,p)]=r(1-p)/p\) and \(\mathbb{V}\mathrm{ar}[\mathrm{NB}(r,p)]=r(1-p)/p^2\) to derive the moment estimators \(\hat{r}_\mathrm{MM}\) and \(\hat{p}_\mathrm{MM}.\) Then, using boot_ci(), construct a bootstrap confidence interval for \(r\) and check with simulations that the coverage probability of the bootstrap interval is close to \(1-\alpha.\)
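
As a sketch of the first step, equating the mean and variance to the sample mean \(\bar{X}\) and sample variance \(S^2\) gives \(\hat{p}_\mathrm{MM}=\bar{X}/S^2\) and \(\hat{r}_\mathrm{MM}=\bar{X}^2/(S^2-\bar{X}),\) which requires \(S^2>\bar{X}.\) The function name `nb_mm` and the simulation settings are illustrative assumptions. R's `rnbinom(n, size, prob)` uses the same parametrization (number of failures before `size` successes).

```r
# Moment estimators of (r, p) for the negative binomial distribution
nb_mm <- function(x) {
  m <- mean(x)
  s2 <- mean((x - m)^2)  # sample variance with denominator n
  c(r = m^2 / (s2 - m), p = m / s2)
}

set.seed(7)
x <- rnbinom(20000, size = 3, prob = 0.4)
est <- nb_mm(x)
est  # should be close to r = 3 and p = 0.4
```

The bootstrap interval for \(r\) can then be obtained by passing the first component of `nb_mm` as the statistic to be resampled.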

Exercise 5.25 Perform Exercise 5.20 using bootstrap confidence intervals instead of asymptotic confidence intervals. For that purpose, use the function boot_ci() conveniently in parts a and b, and modify it appropriately for part c.