## A.8 Answer: TW 8 tutorial

### Answers for Sect. 8.2

This is just one box of matches (one *observation*), but the claim is about the *population mean*. Some boxes will have more than 45, and some fewer.

Jake is correct in one sense: you can't have 0.9 of a match.

But the value is the *mean* number, and that *can* be a decimal. Suppose 10 boxes had 49 matches, and 10 boxes had 50 matches... is the mean 49, or is it 50? Neither is correct; the mean is 49.5.

Jake is confusing the *sample* and *population* mean. The claim is that the *population* mean is \(45\). The sample produced a mean of \(44.9\).

Why should the means of two different things be the same? It's like expecting your height and your Mum's height to be the same: they are both heights, but of different things. Why should they be the same?

Of course, every sample will produce a different sample mean. This sample may just have an unusually low number of matches.

Either (1) the manufacturer is lying; or (2) the manufacturer is *not* lying, and this sample just happens to have a smaller number of matches: (bad) luck.

A CI gives some indication of the variation implied by the sample.

The standard error for the mean is \(0.124\div\sqrt{25} = 0.0248\). So the approximate 95% CI is: \(44.9 \pm (2\times 0.0248)\), or \(44.9\pm 0.0496\), or from \(44.85\) to \(44.95\).
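The arithmetic above can be checked with a short script. This is only a sketch of the calculation as stated in the answer: it uses the approximate multiplier of 2 for a 95% CI, and the variable names are illustrative rather than taken from the text.

```python
import math

# Values quoted in the answer above
sample_mean = 44.9   # mean number of matches per box in the sample
sample_sd = 0.124    # sample standard deviation
n = 25               # number of boxes in the sample

# Standard error of the sample mean: s / sqrt(n)
std_error = sample_sd / math.sqrt(n)

# Approximate 95% CI: sample mean plus or minus 2 standard errors
multiplier = 2
margin = multiplier * std_error
ci_lower = sample_mean - margin
ci_upper = sample_mean + margin

print(f"Standard error: {std_error:.4f}")
print(f"Margin of error: {margin:.4f}")
print(f"Approximate 95% CI: {ci_lower:.2f} to {ci_upper:.2f}")
```

The claimed population mean of 45 lies outside the interval that this produces, which is why the result looks suspicious.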

No. A 95% CI may or may not contain the population mean. Of course, the manufacturer *may* indeed be lying... but we'd need to be cautious about making such a bold claim on just this evidence. Ideally, we would repeat this study a few times or take a larger sample. But it *is* looking suspicious...

If we had many, many sets of 25 match boxes, 95% of these sets of 25 would have a mean between 44.85 and 44.95.

\(\bar{x}\) is the mean of the sample, so \(\bar{x} = 44.9\).

\(\mu\) is the mean of the population; the true mean if you like. \(\mu\) is *claimed* to be 45, but the value of \(\bar{x}\) will, of course, vary.

### Answers for Sect. 8.3

The two groups are completely different.

The **parameter** of interest is the *difference between the population mean lifetimes*, say \(\mu_R - \mu_F\).

The 95% CI is the *bottom* one: from \(223.34\) to \(346.13\) days.

The best of these is Option (e)... but in practice, we usually think about CIs in terms of Option (d).

The CI explanation can be improved by:

- indicating *which* diet leads to larger average lifetimes; and
- providing sample summary info.

Here is a better answer:

"The 95% confidence interval for the difference between the population mean lifetimes of rats on the restricted diet (sample mean: 968.8 days; std dev: 284.6 days) and on the free-eating diet (sample mean: 684.0 days; std dev: 134.1 days) is that rats on a restricted diet live between \(223.34\) and \(346.13\) days longer."

Since the samples are large, we only need the two samples to be independent (which is reasonable). (**The figure is not needed.**)

The boxplots show the variation in the lifetimes of individual rats. The error bar chart displays the variation that the sample means would be expected to show from sample to sample.

### Answers for Sect. 8.4

Some answers embedded.

- See Table A.2.
- Use a side-by-side barchart, for example, if necessary.
- Odds of boys maturing late: \(352 \div (2\,864 - 352) = 0.1401\). Thus boys are 0.1401 times as likely to mature late as not.
- Odds of girls maturing late: \(336 \div (2\,664 - 336) = 0.1443\). Thus girls are 0.1443 times as likely to mature late as not.
- Hence, to compare boys to girls, the odds ratio is \(0.1401 \div 0.1443 = 0.971\).
- The **parameter** of interest is the *population odds ratio* of late maturing, comparing boys to girls.
- From software: OR is 0.971, and 95% CI is from 0.828 to 1.139.
- See Table A.3.
- The difference could be explained by sampling variation, or because there is a real difference...

| | Matured late | Did not mature late | Total |
|---|---|---|---|
| Males | 352 | 2512 | 2864 |
| Females | 336 | 2328 | 2664 |
| Total | 688 | 4840 | 5528 |

| | Percentage maturing late | Odds maturing late | Sample size |
|---|---|---|---|
| Males | 12.3 | 0.1401 | 2864 |
| Females | 12.6 | 0.1443 | 2664 |
| Odds ratio | | 0.97088 | |
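The percentages, odds and odds ratio in the summary table follow directly from the counts in the two-way table. The sketch below recomputes them; the dictionary layout and variable names are illustrative choices, not part of the original answer.

```python
# Counts from the two-way table of late maturing by sex
late = {"Males": 352, "Females": 336}
not_late = {"Males": 2512, "Females": 2328}

summary = {}
for group in late:
    total = late[group] + not_late[group]
    summary[group] = {
        # Percentage: late maturers out of everyone in the group
        "percent_late": 100 * late[group] / total,
        # Odds: late maturers relative to those who did not mature late
        "odds_late": late[group] / not_late[group],
        "n": total,
    }

# Odds ratio comparing males to females
odds_ratio = summary["Males"]["odds_late"] / summary["Females"]["odds_late"]

for group, row in summary.items():
    print(
        f"{group}: {row['percent_late']:.1f}% late, "
        f"odds {row['odds_late']:.4f}, n = {row['n']}"
    )
print(f"Odds ratio (males to females): {odds_ratio:.5f}")
```

Note that the percentage divides by the group total, while the odds divide by the number who did *not* mature late; mixing up the two denominators is a common slip.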

### Answers for Sect. 8.5

\(n=(2\times 7.145\div 0.5)^2 = 816.8\), so use guesses from 817 students.
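As a check on the sample size calculation, here is a minimal sketch. It reads the formula above as \(n = (2 \times s \div \text{margin})^2\), treating \(7.145\) as the standard deviation and \(0.5\) as the desired margin of error; that reading of the two numbers is an assumption based on the form of the formula.

```python
import math

std_dev = 7.145   # assumed: the standard deviation in the formula above
margin = 0.5      # assumed: the desired margin of error
multiplier = 2    # approximate multiplier for a 95% CI

# Sample size needed so that multiplier * std_dev / sqrt(n) equals the margin
n_exact = (multiplier * std_dev / margin) ** 2

# Always round *up*, so the margin of error is no larger than requested
n_required = math.ceil(n_exact)

print(f"Computed n: {n_exact:.1f}")
print(f"Sample size to use: {n_required}")
```

Rounding up rather than to the nearest whole number matters here: a sample of 816 would give a margin slightly larger than the one requested.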

### Answers for Sect. 8.6

Researchers (Nataraja et al. 1999) examined the strength of fibre reinforced concrete, by using a study design called an *experiment*.

In batch 1, a *sample* of size 30 was used; the sample mean number of blows till the first crack appeared in the test cylinders was 98, and the amount of variation in the number of blows was measured using the *standard* deviation as 54.

Because the data are a sample, the sample mean will estimate the population mean with some sampling *error*.

A type of study called an *experiment* compared the handwriting legibility for school children (Ryan et al., 2010) having cerebral palsy when using specialist school furniture with standard school furniture (which acted as a *control*).

They used a *random* sample of size 30 from children registered at their facility in Canada.

The *sample* mean for the difference in legibility was \(-0.1\), and a 95% *confidence* interval was from \(-0.8\) to \(0.6\).

Using the standard equipment, the smallest value recorded for legibility was 19, and the largest was 34, so the *range* was 15.

### Answers for Sect. 8.7

- Observational.
- Relational.
- Two completely separate samples are compared.
- \(-0.774\) to \(-0.560\) inches (differences are Dominos less Eagle Boys).
- The 95% CI for the difference in population mean pizza diameters between EB and DOM pizzas is from \(0.560\) to \(0.774\) inches, larger for EB.
- Since the sample sizes are large (both 125), we *do not require that the populations have normal distributions*.
- The sample sizes are large (\(n=125\) in each), so we don't need the populations to be normally distributed; **we don't need the histogram**.
- Probably yes. Amount of topping on the pizza? Which *tastes* better? Whether the samples were randomly selected or not?

### Answers for Sect. 8.8

- \(H_0\): No association between income level and opinion of GM foods in the population; \(H_1\): An association between income level and opinion of GM foods in the population.
- Odds HIE: \(263\div 151 = 1.742\). HIE is 1.74 times as likely to be for GM foods as against.
- Odds LIE: \(258\div 222 = 1.162\). LIE is 1.16 times as likely to be for GM foods as against.
- \(\text{Odds}(\text{HIE in favour})\div\text{Odds}(\text{LIE in favour}) = 1.742\div 1.162 = 1.5\) (\(1.499\) in table). The odds of HIE being for GM foods are 1.5 times the odds of LIE being for GM foods.
- From the sample, we estimate the OR in the population to be between \(1.145\) and \(1.961\). (Loosely, though technically incorrect: the true OR is likely to be between 1.145 and 1.961.) Importantly, this interval does not include 1.
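The interval quoted above comes from software output, and the answer does not say which method the software used. As a sketch only, the code below assumes the common large-sample (Wald) interval computed on the log-odds-ratio scale; with the counts from the odds calculations above it gives limits that agree with the quoted interval to three decimal places. All variable names are illustrative.

```python
import math

# Counts taken from the odds calculations above
hie_for, hie_against = 263, 151
lie_for, lie_against = 258, 222

odds_hie = hie_for / hie_against
odds_lie = lie_for / lie_against
odds_ratio = odds_hie / odds_lie

# Assumed method: Wald interval on the log scale, where the standard
# error is the square root of the sum of reciprocals of the four counts
se_log_or = math.sqrt(
    1 / hie_for + 1 / hie_against + 1 / lie_for + 1 / lie_against
)
z = 1.96  # multiplier for a 95% interval

ci_lower = math.exp(math.log(odds_ratio) - z * se_log_or)
ci_upper = math.exp(math.log(odds_ratio) + z * se_log_or)

# An OR of 1 corresponds to equal odds in both groups
contains_one = ci_lower <= 1 <= ci_upper

print(f"Odds ratio: {odds_ratio:.3f}")
print(f"95% CI: {ci_lower:.3f} to {ci_upper:.3f}")
print(f"Interval contains 1: {contains_one}")
```

The interval is built on the log scale because the sampling distribution of the log odds ratio is closer to normal than that of the odds ratio itself; this is also why the interval is not symmetric about the estimate.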