A.7 Answer: TW 7 tutorial

Answers for Sect. 7.2

This is just one box of matches (one observation), but the claim is about the population mean. Some boxes will have more than \(45\), and some fewer.
Jake is correct in one sense: You can't have \(0.9\) of a match. But the value is the mean number, and that can be a decimal. Suppose \(10\) boxes had \(49\) matches, and \(10\) boxes had \(50\) matches... is the mean \(49\), or is it \(50\)? Neither is correct; the mean is \(49.5\).
Jake is confusing the sample and population mean. The claim is that the population mean is \(45\). The sample produced a mean of \(44.9\).
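The arithmetic in the twenty-box example above can be checked with a minimal Python sketch. The list `boxes` is hypothetical data that simply encodes the ten boxes of \(49\) and ten boxes of \(50\) described in the answer.

```python
from statistics import mean

# Hypothetical data from the example: 10 boxes with 49 matches, 10 with 50.
boxes = [49] * 10 + [50] * 10

# Every count is a whole number, but the mean need not be.
box_mean = mean(boxes)
print(box_mean)
```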

Why should the mean of two different things be the same? It's like expecting your height and your Mum's height to be the same: they are both heights, but of different things. Why should they be the same?

Of course, every sample will produce a different sample mean. This sample may just have an unusually low number of matches.
Either (1) the manufacturer is lying; or (2) the manufacturer is not lying, and this sample just happens to have a smaller number of matches: (bad) luck.
A CI gives some indication of the variation implied by the sample.
The standard error for the mean is \(0.124\div\sqrt{25} = 0.0248\). So the approximate \(95\)% CI is: \(44.9 \pm (2\times 0.0248)\), or \(44.9\pm 0.0496\), or from \(44.85\) to \(44.95\).
No. A \(95\)% CI may or may not contain the population mean. Of course, the manufacturer may indeed be lying... but we'd need to be cautious about making such a bold claim on just this evidence. Ideally, we would repeat this study a few times or take a larger sample. But it is looking suspicious...
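The standard error and approximate \(95\)% CI for the matches can be reproduced with a short Python sketch. It assumes, as the formula in the answer implies, that \(0.124\) is the sample standard deviation and that \(25\) boxes were sampled; the variable names are illustrative only.

```python
import math

# Values taken from the answer above (0.124 assumed to be the sample std. dev.).
sample_mean = 44.9   # mean number of matches per box in the sample
sample_sd = 0.124
n = 25               # number of boxes in the sample
claimed_mean = 45    # the manufacturer's claim

std_error = sample_sd / math.sqrt(n)
margin = 2 * std_error  # approximate 95% multiplier of 2, as used in the text
ci_lower = sample_mean - margin
ci_upper = sample_mean + margin

print(f"s.e. = {std_error:.4f}")
print(f"approx. 95% CI: {ci_lower:.2f} to {ci_upper:.2f}")

# Does the claimed population mean fall inside the interval?
claim_in_ci = ci_lower <= claimed_mean <= ci_upper
print(claim_in_ci)
```

The final check only reports whether \(45\) lies inside this particular interval; as the answer notes, that alone does not settle whether the claim is false.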

If we had many, many sets of \(25\) boxes of matches, \(95\)% of these sets of \(25\) would have a mean between \(44.85\) and \(44.95\).
\(\bar{x}\) is the mean of the sample, so \(\bar{x} = 44.9\).
\(\mu\) is the mean of the population; the true mean if you like. \(\mu\) is claimed to be \(45\), but the value of \(\bar{x}\) will, of course, vary.

Answers for Sect. 7.3

Answers implied by H5P.

\(\displaystyle \text{s.e.}(\hat{p}) = \sqrt{ \big(\hat{p}\times(1 - \hat{p}) \big)/n}\), where \(\hat{p}\) is the sample proportion; \(n\) the sample size; "s.e." the "standard error".

Answers for Sect. 7.4

\(123/(404 - 123) = 123/281 = 0.44\).
\(\hat{p} = 123/404 = 0.304455\).
The odds: The probability of surviving is about \(0.44\) times the probability of dying (i.e., it is lower). Or: For every \(100\) that die, about \(44\) survive.
No: sampling variation!
\(\text{s.e.}(\hat{p}) = \sqrt{0.304455 \times (1 - 0.304455)/404} = \sqrt{0.00052416} = 0.022894\), or about \(0.023\).
A definition can be found in the textbook Glossary. Essentially, each sample is likely to produce a different value for the sample proportion, \(\hat{p}\) (the estimate of the population proportion, \(p\)), and that is what we mean by "sampling variation".
(Not provided.)
The values of \(\hat{p}\) will have an approximate normal distribution, with a standard deviation equal to the standard error (\(0.023\)) and centred around the true proportion \(p\). A \(95\)% CI: \(0.304455 \pm (2\times0.022894)\), or \(0.30446\pm0.04579\), or \(0.26\) to \(0.35\).
One way of communicating this: “The population proportion of patients surviving after BVM treatment has a \(95\)% chance of lying between \(26\)% and \(35\)%.” This is not strictly correct, but acceptable and very, very commonly used (as explained in the textbook).
Larger, to get a tighter (more precise) CI than the one calculated.
The numbers of surviving and non-surviving patients both exceed \(5\).
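The odds, sample proportion, standard error, approximate \(95\)% CI and validity check from this section can be worked through together. The following is a minimal Python sketch using the counts quoted above (\(123\) survivors out of \(404\) patients); the variable names are illustrative only.

```python
import math

# Counts taken from the answers above.
survived = 123
total = 404
died = total - survived

odds = survived / died    # odds of surviving
p_hat = survived / total  # sample proportion surviving

# Standard error of the sample proportion.
std_error = math.sqrt(p_hat * (1 - p_hat) / total)

# Approximate 95% CI, using a multiplier of 2 as in the text.
margin = 2 * std_error
ci_lower = p_hat - margin
ci_upper = p_hat + margin

# Statistical validity: both counts should exceed 5.
valid = survived > 5 and died > 5

print(f"odds = {odds:.2f}, p-hat = {p_hat:.4f}, s.e. = {std_error:.3f}")
print(f"approx. 95% CI: {ci_lower:.2f} to {ci_upper:.2f}; valid: {valid}")
```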

Answers for Sect. 7.5

\(\mu\) is the population mean diameter size of all EB pizzas; \(\bar{x}\) is the mean diameter of the pizzas in the sample.
\(\bar{x} = 11.486\) inches; it's not sensible to quote the diameter to \(0.001\) of an inch; what would be sensible? We don't know the value of \(\mu\), and we never will. Our best estimate is the value of \(\bar{x}\).
\(s = 0.24658\) inches. It's not sensible to quote the standard deviation to this many decimal places though. \(\sigma\) is the standard deviation of the population. We don't know the value of \(\sigma\), and we never will.
\(\displaystyle\text{s.e.}(\bar{x}) = s/\sqrt{n} = 0.24658/\sqrt{125} = 0.02205\).
The first measures the variation in the diameters of individual pizzas; the second measures the precision of the sample mean when used to estimate the population mean.
Almost certainly not the same. Probably close to \(\bar{x} = 11.486\) inches. More precisely, probably within three standard errors (\(3\times 0.022\)) of \(\bar{x}\).
Normal; mean \(\mu\); standard deviation equal to the standard error, \(0.02205\).
The approximate \(95\)% CI is \(11.486\pm (2\times 0.02205)\) or \(11.486\pm0.044\), which is from \(11.44\) to \(11.53\) inches.
Based on the sample, a \(95\)% confidence interval for the population mean pizza diameter is from \(11.44\) to \(11.53\) inches.
Either \(n > 25\); or \(n\le 25\) and the population has a normal distribution.
We do not need to assume that \(n > 25\) because we know that it is. We do not require that the sample or the population has a normal distribution. We require that the sample means have an approximate normal distribution, which they will if \(n > 25\). So the CI is statistically valid, and the histogram is not needed.
Based on the CI, the population mean diameter is probably not \(12\) inches, since \(12\) lies outside the interval from \(11.44\) to \(11.53\) inches.
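The pizza calculations above can be collected into one small helper. This is a sketch only: the function name `approx_ci_for_mean` and its `multiplier` parameter are hypothetical, and the inputs are the summary values quoted in the answers (\(\bar{x} = 11.486\), \(s = 0.24658\), \(n = 125\)).

```python
import math


def approx_ci_for_mean(sample_mean, sample_sd, n, multiplier=2):
    """Return (standard error, lower limit, upper limit) for a sample mean.

    A multiplier of 2 gives the approximate 95% CI used in the text.
    """
    std_error = sample_sd / math.sqrt(n)
    margin = multiplier * std_error
    return std_error, sample_mean - margin, sample_mean + margin


# Summary values quoted in the answers above.
se, lower, upper = approx_ci_for_mean(11.486, 0.24658, 125)
print(f"s.e. = {se:.5f}")
print(f"approx. 95% CI: {lower:.2f} to {upper:.2f} inches")

# The sample size condition for statistical validity.
print(125 > 25)

# Is a population mean diameter of 12 inches consistent with the CI?
twelve_in_ci = lower <= 12 <= upper
print(twelve_in_ci)
```

Calling the helper with `multiplier=3` gives the "within three standard errors" range mentioned earlier in this section.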

Answers for Sect. 7.6

\(\sqrt{0.70\times(1 - 0.70)/25} = \sqrt{0.0084} = 0.09165151\), or about \(0.09165\).
\(\sqrt{0.25\times(1 - 0.25)/100} = \sqrt{0.001875} = 0.04330127\), or about \(0.04330\).
\(0.08724964\), or about \(0.08725\).
\(0.0534479\), or about \(0.05345\).
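The first two standard errors above can be verified with a short Python sketch. The function name `se_proportion` is hypothetical; it implements the formula \(\sqrt{\hat{p}\times(1 - \hat{p})/n}\) from Sect. 7.3, including the square root that is easy to forget.

```python
import math


def se_proportion(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


# The two cases whose inputs are shown in the answers above.
se_first = se_proportion(0.70, 25)
se_second = se_proportion(0.25, 100)

print(f"{se_first:.5f}")
print(f"{se_second:.5f}")
```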

All statistically valid.

Note: Students commonly forget to take the square root.

Note: If your calculator gives an answer something like 1.875 E-03 or similar, it is using scientific notation. It means \(1.875\times 10^{-3}\), or \(0.001875\).
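The same notation appears in software. As a small illustration (a sketch in Python, not tied to any particular calculator), the text `1.875E-03` converts to the ordinary decimal value, and the decimal value can be displayed back in scientific notation:

```python
# Convert scientific notation (as text) to an ordinary number.
value = float("1.875E-03")
print(value)

# Display the same number in scientific notation with 3 decimal places.
as_scientific = f"{value:.3e}"
print(as_scientific)
```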