## E.7 Answers to Lecture 7 tutorial

### E.7.1 Answers to Sect. 7.1

This is just one box (one *observation*), but the claim is about the *population mean*. Some boxes will have more than 45, and some fewer.

Jake is correct in one sense: you can’t have 0.9 of a match. But the value is the *mean* number, and that *can* be a decimal. Suppose 10 boxes had 49 matches, and 10 boxes had 50 matches: is the mean 49, or is it 50? Neither is correct; the mean is 49.5.

Jake is confusing the *sample* and *population* mean. The claim is that the mean is \(45\). The sample produced a mean of \(44.9\). The first is a *population* mean and the second is a *sample* mean. Why should the mean of two different things be the same? It’s like expecting your height and your Mum’s height to be the same: they are both heights, but of different things. Why should they be the same?

Of course, every sample is likely to produce a different sample mean. This sample may just have an unusually low number of matches.

Either the manufacturer is lying, or this sample just happens to have a smaller number of matches: (bad) luck.

A CI gives some indication of the variation implied by the sample.

The standard error for the mean is \(0.11\div\sqrt{20} = 0.0246\). So the approximate 95% CI is: \(44.9 \pm (2\times 0.0246)\), or \(44.9\pm 0.049\), or from \(44.85\) to \(44.95\).
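The arithmetic above can be checked with a short Python sketch. The function name `approx_ci_95` is ours (it is not from the tutorial), and the sketch assumes, as the calculation above implies, that \(0.11\) is the sample standard deviation and that \(n = 20\) boxes were sampled. The multiplier of 2 is the approximate 95% rule used in these answers, not an exact \(t\)-multiplier.

```python
import math

def approx_ci_95(xbar, s, n):
    """Approximate 95% CI for a mean: xbar plus or minus 2 standard errors."""
    se = s / math.sqrt(n)      # standard error of the sample mean
    margin = 2 * se            # approximate 95% multiplier of 2
    return se, (xbar - margin, xbar + margin)

# Values taken from the answer above (s = 0.11 is assumed to be the sample std dev)
se, (lower, upper) = approx_ci_95(xbar=44.9, s=0.11, n=20)
print(f"s.e. = {se:.4f}; CI from {lower:.2f} to {upper:.2f}")
# prints: s.e. = 0.0246; CI from 44.85 to 44.95
```

The claimed mean of \(45\) lies outside this interval, which is why the answers that follow treat the claim with some suspicion.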

No. A 95% CI may or may not contain the population mean. Of course, the manufacturer *may* indeed be lying… but we’d need to be cautious about making such a bold claim on just this evidence. Ideally, we would repeat this study a few times or take a larger sample. But it *is* looking suspicious…

If we had many, many sets of 20 boxes of matches, about 95% of these sets of 20 would have a sample mean between 44.85 and 44.95 (taking the sample mean of 44.9 as our best estimate of \(\mu\)).

\(\bar{x}\) is the mean of the sample, so \(\bar{x} = 44.9\).

\(\mu\) is the mean of the population; the true mean if you like. \(\mu\) is *claimed* to be 45, but the value of \(\bar{x}\) will, of course, vary.

### E.7.2 Answers to Sect. 7.2

\(\mu\) is the population mean diameter of *all* EB pizzas; \(\bar{x}\) is the mean diameter of the pizzas in the sample.

\(\bar{x} = 11.486\) inches. It’s not sensible to quote the diameter to \(0.001\) of an inch; what is sensible?

We don’t know the value of \(\mu\), and we never will. Our best *estimate* is the value of \(\bar{x}\).

\(s=0.24658\) inches. It’s not sensible to quote the standard deviation to \(0.00001\) of an inch though.

\(\sigma\) is the standard deviation of the population. We don’t know the value of \(\sigma\), and we never will.

\(\displaystyle\text{s.e.}(\bar{x}) = s/\sqrt{n} = 0.24658/\sqrt{125} = 0.02205\).

The first measures the variation in the diameters of individual pizzas; the second measures the precision of the sample mean when used to estimate the population mean.

Almost certainly not the same. Probably close to \(\bar{x}=11.486\) inches. More precisely, probably within three standard errors (\(3\times 0.022\)) of \(\bar{x}\).

Normal; mean \(\mu\); std. dev is the standard error of 0.02205.

The approximate 95% CI is \(11.486\pm (2\times 0.02205)\) or \(11.486\pm0.044\), which is from 11.44 to 11.53 inches.

Based on the sample, an approximate 95% confidence interval for the population mean pizza diameter is from \(11.44\) to \(11.53\) inches.

\(n>25\) **or** \(n\le 25\) and the population has a normal distribution.

We do not need to **assume** that \(n>25\) because we know that it is. (We do *not* require that the sample or the population has a normal distribution. We require that the *sample means* have an approximate normal distribution, which they will if \(n>25\).) So the CI is statistically valid.

The population mean diameter is probably not 12 inches, based on the CI.
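The pizza answers so far can be reproduced with a few lines of Python. This is only a sketch: the variable names are ours, the values are those quoted in the answers above, and the multiplier of 2 is the approximate 95% rule.

```python
import math

n, xbar, s = 125, 11.486, 0.24658   # sample size, sample mean and std dev (inches)
claimed = 12                        # the 12-inch diameter considered above

se = s / math.sqrt(n)               # precision of the sample mean
lower, upper = xbar - 2 * se, xbar + 2 * se

print(f"s = {s:.2f} in. (variation between individual pizzas)")
print(f"s.e. = {se:.3f} in. (precision of the sample mean)")
print(f"Approx. 95% CI: {lower:.2f} to {upper:.2f} in.")
print("n > 25, so the CI is statistically valid:", n > 25)
print("12 inches inside the CI:", lower <= claimed <= upper)
```

The last line prints `False` for the inside-the-CI check, which matches the conclusion that the population mean diameter is probably not 12 inches.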

Compute: \[n=\left(\frac{2\times 0.24658}{0.04}\right)^2 = 152.004\] at least, so we would need 153 pizzas.
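The sample-size calculation comes from requiring the margin of error, \(2\times s/\sqrt{n}\), to be no larger than some target value, then rearranging for \(n\) and rounding up. A minimal sketch follows, assuming that \(0.04\) inches is the desired margin of error (as the formula above suggests); the function name `sample_size_for_margin` is ours.

```python
import math

def sample_size_for_margin(s, margin):
    """Smallest n for which 2 * s / sqrt(n) is no larger than `margin`."""
    return math.ceil((2 * s / margin) ** 2)   # always round up

# s from the pizza sample; margin of 0.04 inches is assumed from the formula above
print(sample_size_for_margin(s=0.24658, margin=0.04))   # prints 153
```

Rounding up (rather than to the nearest whole number) matters here: 152 pizzas would give a margin of error very slightly larger than the target.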

### E.7.3 Answers to Sect. 7.3

- **Descriptive**: Every subject is treated the same way; we are not comparing two groups that have been treated differently.
- \(\mu_d\) is the mean difference in the target population; \(\bar{d}\) is the mean difference in this sample.
- Each measurement is taken both before and after on the same subject.
- \(32\); \(12\); \(24\); \(30\); \(8\); \(14\); \(14\); \(28\); \(38\); \(49\).
  (Differences in the other direction are also acceptable;
  this just changes the signs of these differences, and so on.
  **Importantly, the direction should be stated somewhere.**)
- It makes more sense to define the differences this way,
  so that the difference is the *increase* in 2MWT.
- \(\bar{d} = 24.9\); \(s_d=13.03372\)
- \(\text{s.e.}(\bar{d}) = s_d/\sqrt{n} = 13.03372/\sqrt{10} = 4.121623\). This is the standard deviation of the sample mean difference, a measurement of how precisely the sample mean difference measures the population mean difference.
- Almost impossible. The sample mean difference varies from sample to sample around the true mean difference, with a normal distribution with standard error \(4.12\). Since we don’t know \(\mu_d\), the best we can say is that the sample mean difference will vary about our best guess of the population mean difference; in other words, the sample mean differences vary around \(24.9\) with a standard deviation of about \(4.12\).
- Normal; mean \(\mu_d\); std. deviation is the standard error of \(4.122\).
- \(24.9\pm (2\times 4.121623)\), or \(24.9\pm 8.243246\), or from \(16.66\) to \(33.14\) m.
- We are 95% confident that the population 2MWT *increases* by a mean amount between \(16.7\) and \(33.1\) m.
- Either (or both) of these must be true:
  - the population has a normal distribution, and/or
  - the sample size is large enough so that the sample means have a normal distribution, say larger than about 25.

- Since \(n<25\), we need to assume the population of differences has a normal distribution. A stem-and-leaf plot suggests this is not unreasonable, so the sample means quite possibly have an approx. normal distribution.
- (Recall we haven’t done hypothesis testing in this context yet!) It looks pretty likely that the 2MWT distances are higher after receiving the implant.
- \(n=(2\times 13.03372\div 5)^2 = 27.18\), so need data from at least \(28\) amputees.
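The numerical answers in this section can be checked from the ten differences listed above. The sketch below uses only Python's standard library; the variable names are ours. It assumes the differences are defined as *after* minus *before* (so that they are increases), and that the \(5\) in the last calculation is the desired margin of error in metres.

```python
import math
import statistics

# Differences in 2MWT (assumed: after minus before), as listed in the answers above
diffs = [32, 12, 24, 30, 8, 14, 14, 28, 38, 49]

n = len(diffs)
d_bar = statistics.mean(diffs)     # sample mean difference
s_d = statistics.stdev(diffs)      # sample std dev of differences (n - 1 divisor)
se = s_d / math.sqrt(n)            # standard error of the mean difference

# Approximate 95% CI: mean difference plus or minus 2 standard errors
lower, upper = d_bar - 2 * se, d_bar + 2 * se

# Sample size for a margin of error of 5 m (assumed target), rounded up
n_needed = math.ceil((2 * s_d / 5) ** 2)

print(f"mean difference = {d_bar:.1f}; s_d = {s_d:.5f}; s.e. = {se:.4f}")
print(f"Approx. 95% CI: {lower:.1f} to {upper:.1f} m")
print(f"Amputees needed for a margin of 5 m: {n_needed}")
```

If the differences were defined in the other direction, the mean difference and both CI limits would simply change sign; the standard deviation, standard error and required sample size would be unchanged.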