A.9 Answer: TW 9 tutorial

Answers for Sect. 9.2

Answers implied by H5P.

Null hypotheses:

Is the mean length of a $12$ -inch sub really $12$ inches? $\mu = 12$ .
Is the mean length of a $12$ -inch sub different for white and wholemeal subs? $\mu_{\text{white}} = \mu_{\text{wholemeal}}$ .
Is the proportion of $12$ -inch subs shorter than $12$ inches different for white and wholemeal subs? $p_{\text{white}} = p_{\text{wholemeal}}$ .
Is the mean length of a $12$ -inch sub longer for white (compared to wholemeal) subs? $\mu_{\text{white}} = \mu_{\text{wholemeal}}$ .

Alternative hypotheses:

Is the mean length of a $12$ -inch sub really $12$ inches? $\mu \ne 12$ .
Is the mean length of a $12$ -inch sub different for white and wholemeal subs? $\mu_{\text{white}} \ne \mu_{\text{wholemeal}}$ .
Is the proportion of $12$ -inch subs shorter than $12$ inches different for white and wholemeal subs? $p_{\text{white}} \ne p_{\text{wholemeal}}$ .
Is the mean length of a $12$ -inch sub longer for white (compared to wholemeal) subs? $\mu_{\text{white}} > \mu_{\text{wholemeal}}$ .

Answers for Sect. 9.3

Repeated-measures: Every subject has two 2MWT measurements recorded.
$\mu_d$ is the mean difference in the target population; $\bar{d}$ is the mean difference in this sample.
The 2MWT is measured both before and after on the same subject.
$32$ ; $12$ ; $24$ ; $30$ ; $8$ ; $14$ ; $14$ ; $28$ ; $38$ ; $49$ . (Differences in the other direction are also acceptable; it just changes the signs of these differences and so on. Importantly, the direction should be stated somewhere.)
It makes more sense to define directions this way, so that the difference is the increase in 2MWT.
$\bar{d} = 24.9\,\text{m}$ ; $s_d=13.03372\,\text{m}$ .
$\text{s.e.}(\bar{d}) = s_d/\sqrt{n} = 13.03372/\sqrt{10} = 4.121623\,\text{m}$ .
This is the standard deviation of the sample mean difference: a measure of how precisely the sample mean difference estimates the population mean difference.
Almost impossible. The sample mean difference would vary every time we took a sample, around the true mean difference $\mu_d$ , with a normal distribution having a standard error of about $4.12$ . Since we don't know $\mu_d$ , the best we can say is that the sample mean difference will vary about our best guess of the population mean difference; in other words, the sample mean differences vary around $24.9$ with a standard deviation of about $4.12$ .
Normal; mean $\mu_d$ , standard deviation is the standard error of $4.12$ .
$24.9\pm (2\times 4.121623)$ , or $24.9\pm 8.243246$ , or from $16.66$ to $33.14\,\text{m}$ .
$H_0$ : $\mu_d = 0$ , differences defined as 'with' minus 'without'. $H_1$ : $\mu_d > 0$ the way I defined differences.
$t = 6.041$ and since $t$ is very large, expect $P$ to be very small.
The differences have just been defined in the opposite directions. (Notice the $P$ -values are the same.)
Very strong evidence exists in the sample (paired $t = 6.041$ ; one-tailed $P$ less than $0.0005$ ) that the population mean 2MWT is higher after receiving the implant compared to without the implant (mean difference: $24.9\,\text{m}$ higher after receiving the implant; standard deviation: $13.034\,\text{m}$ ; $95$ % CI from $16.66$ to $33.14\,\text{m}$ ).
The population of differences has a normal distribution, and/or $n > 25$ or so.
Since $n < 25$ , we must assume the population of differences has a normal distribution. The histogram suggests this is not unreasonable.
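The calculations in this section can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative, and the multiplier of 2 for the interval is the approximation used in the answers above, not an exact $t$ multiplier.

```python
import math
import statistics

# Differences in metres, as listed in the answers above
differences = [32, 12, 24, 30, 8, 14, 14, 28, 38, 49]

n = len(differences)
d_bar = statistics.mean(differences)   # sample mean difference
s_d = statistics.stdev(differences)    # sample standard deviation (n - 1 divisor)
se = s_d / math.sqrt(n)                # standard error of the mean difference

# Approximate 95% CI using a multiplier of 2
ci_low, ci_high = d_bar - 2 * se, d_bar + 2 * se

# Test statistic under H0: mu_d = 0
t = (d_bar - 0) / se

print(f"mean = {d_bar:.1f} m, sd = {s_d:.5f} m, s.e. = {se:.6f} m")
print(f"approx. 95% CI: {ci_low:.2f} to {ci_high:.2f} m")
print(f"t = {t:.3f}, df = {n - 1}")
```

Reversing the direction of the differences changes the sign of the mean, the CI limits and $t$, but not the size of any of them.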

Answers for Sect. 9.5

The two groups are different rats.
The parameter of interest is the difference between the population mean lifetimes, say $\mu_R - \mu_{FE}$ .
Table not shown.
The $95$ % CI is the bottom one: from $223.34$ to $346.13$ days.
The best of these is Option (e)... but in practice, we usually think about CIs in terms of Option (d) so Option (d) is fine.
The CI explanation can be improved by (i) indicating which diet leads to larger average lifetimes; and (ii) providing sample summary info. Here is a better answer:

"The $95$ % confidence interval for the difference between the population mean lifetimes of rats on the restricted diet (sample mean: $968.8$ days; std dev: $284.6$ days) and on the free-eating diet (sample mean: $684.0$ days; std dev: $134.1$ days) is from $223.34$ to $346.13$ days longer for rats on the restricted diet."
Boxplots: show the variation in the lifetimes of individual rats. Error bar chart: displays the variation that the sample means would be expected to show from sample to sample.
The null hypothesis is that there is no difference in the mean lifetimes of the two groups of rats. In symbols (where $\mu$ represents the mean lifetime in the population):

$H_0$ : $\mu_R = \mu_{FE}$ ; and $H_1$ : $\mu_R > \mu_{FE}$ ; one-tailed, because of the RQ.
Either sampling variation explains the difference, or the population mean lifetimes really do differ between the diets.
Use the not-equal variance (Welch's test) row: $t = 9.161$ ; one-tailed $P < 0.0005$ .
$t = (284.73 - 0)/31.08 = 9.16$ , as per output.
The evidence contradicts $H_0$ . Very strong evidence exists in the sample ( $t = 9.161$ for two independent samples; $\text{df} = 154.94$ ; one-tailed $P < 0.0005$ ) that the population mean lifetime of rats on a restricted diet (mean lifetime: $968.75$ days; std. dev.: $284.6$ days) is greater than that of rats on a free-eating diet (mean lifetime: $684.01$ days; std. dev.: $134.1$ days) ( $95$ % CI for the difference from $223.3$ to $346.1$ days).
The sample means have a normal distribution. Since both sample sizes are greater than $24$ , this is true. The figure suggests any non-normality is not very severe.
Rats from the same litter are likely to be similar to each other. The litter would probably be the unit of analysis then, not the individual rat.
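The test statistic and interval for this section can be reproduced from the summary values quoted above. This is a minimal sketch; the variable names are illustrative, and the last calculation only infers the multiplier that the software appears to have used from the reported CI.

```python
# Summary values quoted in the answers above (days)
mean_diff = 284.73               # sample mean difference, restricted minus free-eating
se_diff = 31.08                  # standard error of the difference, from the output
ci_reported = (223.34, 346.13)   # 95% CI reported in the output

# Test statistic under H0: mu_R - mu_FE = 0
t = (mean_diff - 0) / se_diff

# Approximate 95% CI using a multiplier of 2
approx_ci = (mean_diff - 2 * se_diff, mean_diff + 2 * se_diff)

# Multiplier implied by the reported CI: half-width divided by the standard error
implied_multiplier = (ci_reported[1] - ci_reported[0]) / 2 / se_diff

print(f"t = {t:.2f}")
print(f"approx. 95% CI: {approx_ci[0]:.1f} to {approx_ci[1]:.1f} days")
print(f"implied multiplier: {implied_multiplier:.3f}")
```

The implied multiplier is slightly below 2, which is why the reported interval is a little narrower than the approximate one.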

Answers for Sect. 9.6.1

Before and after measurement on same runner.
Plasma $\beta$ is measured both before and after the race on the same runner.
Not shown.
$\bar{d} = 18.736$ pmol/litre; $s_d = 8.3297$ pmol/litre.
$\text{s.e.}(\bar{d}) = s_d/\sqrt{n} = 8.3297/\sqrt{11} = 2.5115$ .
Almost impossible. The sample means would vary every time we took a sample, around the true mean difference with a normal distribution having a standard error of about $2.51$ . Since we don't know the population mean, the best we can say is that the sample mean will vary about our best guess of the population mean. In other words, the sample means will vary around $18.74$ with a standard deviation of about $2.51$ .
$18.736\pm (2\times 2.5115)$ , or $18.736\pm5.023$ , or from $13.7$ to $23.8$ pmol/litre.
$H_0$ : $\mu_d = 0$ , differences defined as 'after' minus 'before'. $H_1$ : $\mu_d > 0$ the way I defined the differences.
We are explicitly seeking to test for an increase, so the test is one-tailed. (Diffs in the other direction are also OK; the signs of the differences and some other subsequent things change.)
$t = 7.46$ and $\text{df} = 11 - 1 = 10$ . Since $t$ is very large, expect $P$ to be very small.
Very strong evidence exists in the sample (paired $t = 7.46$ ; $\text{df} = 10$ ; one-tailed $P$ less than $0.0005$ ) that the population mean $\beta$ plasma concentration is higher after the race compared to before the race (mean difference: $18.74$ pmol/litre higher after the race; std. dev.: $8.33$ pmol/litre; $95$ % CI from $13.14$ to $24.33$ pmol/litre).
The population of differences has a normal distribution, and/or $n > 25$ or so.
Since $n < 25$ , we must assume the population of differences has a normal distribution. The histogram suggests this is not unreasonable.
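When only the summary statistics are available, the same paired calculations can be wrapped in a small helper. The function `paired_summary` below is a hypothetical name introduced for illustration, and the interval again uses the approximate multiplier of 2.

```python
import math

def paired_summary(d_bar, s_d, n):
    """Return (standard error, t statistic, approximate 95% CI) for paired data."""
    se = s_d / math.sqrt(n)                  # standard error of the mean difference
    t = (d_bar - 0) / se                     # test statistic under H0: mu_d = 0
    ci = (d_bar - 2 * se, d_bar + 2 * se)    # multiplier of 2, not the exact t value
    return se, t, ci

# Summary values from the answers above (pmol/litre); differences are after minus before
se, t, ci = paired_summary(d_bar=18.736, s_d=8.3297, n=11)

print(f"s.e. = {se:.4f}, t = {t:.2f}, df = {11 - 1}")
print(f"approx. 95% CI: {ci[0]:.1f} to {ci[1]:.1f} pmol/litre")
```

The interval quoted in the written conclusion ( $13.14$ to $24.33$ ) is wider than this approximate one, which is consistent with software using the exact $t$ multiplier for $10$ degrees of freedom (about $2.23$ ) in place of $2$ .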

Answers for Sect. 9.6.2

Because each method is used in each sea state.
$\bar{d} = 0.06167$ ; $s_d = 0.2901$ . The mean difference is positive: Method 1 measurements slightly higher (on average) than Method 2.
$\text{s.e.}({\bar{d}}) = s_d/\sqrt{n} = 0.2901/\sqrt{18} = 0.0684$ . $\text{s.e.}({\bar{d}})$ measures the precision with which the sample mean difference estimates the population mean difference.
$0.06167\pm(2\times 0.0684)$ , which is $0.06167 \pm 0.137$ , or from $-0.075$ to $0.199$ Newton--metres.
$t = 0.9016$ , so $P$ -value large: no evidence of difference.
Since $n < 25$ , we require that the differences in the population have a normal distribution.
The stem-and-leaf plot of the sample doesn't suggest the population is non-normal; probably OK.
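The link between the CI and the test in this section can be illustrated numerically: with a multiplier of 2, the approximate interval contains zero exactly when $|t| < 2$ . This is a minimal sketch with illustrative variable names; the last digit can differ slightly from the answers above because the standard error is not rounded before use.

```python
import math

# Summary values from the answers above (newton-metres); Method 1 minus Method 2
d_bar, s_d, n = 0.06167, 0.2901, 18

se = s_d / math.sqrt(n)        # standard error of the mean difference
t = (d_bar - 0) / se           # test statistic under H0: mu_d = 0

# Approximate 95% CI using a multiplier of 2
ci_low, ci_high = d_bar - 2 * se, d_bar + 2 * se

# The interval straddles zero exactly when |t| is below the multiplier
contains_zero = ci_low < 0 < ci_high

print(f"s.e. = {se:.4f}, t = {t:.3f}")
print(f"approx. 95% CI: {ci_low:.3f} to {ci_high:.3f}")
print(f"CI contains zero: {contains_zero}; |t| < 2: {abs(t) < 2}")
```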

Answers for Sect. 9.6.3

Observational.
Relational.
Two completely different groups of pizzas (stores) are compared.
$-0.774$ to $-0.560$ inches (differences are Dominos less Eagle Boys).
From output, $t = -14.6$ and $P < 0.001$ : very strong evidence of a difference.
Since the sample sizes are large (both $125$ ), we do not require that the populations have normal distributions.
The sample sizes are large ( $n = 125$ in each), so we don't need the populations to be normally distributed; we don't need the histogram.
Probably true. Amount of topping on the pizza? Which tastes better? Whether the samples were randomly selected or not? Are sizes before or after cooking?

Answers for Sect. 9.6.4

$H_0$ : $\mu_E = \mu_U$ (that is, the means are the same in the two populations) and $H_1$ : $\mu_E\ne\mu_U$ , which can also be expressed in words. Two-tailed.
The precision with which the sample mean battery life estimates the population mean battery life.
Use the second row (though it matters little here): $t = -0.486$ ; $\text{df} = 13.1$ and $P = 0.635$ .
The sample presents no evidence ( $t = -0.486$ ; $\text{df} = 13.1$ ; $P = 0.635$ ) of a difference in the mean lifetimes (mean difference: $-0.0544$ ; s.e.: $0.112$ ) of the batteries in the population. ( $95$ % CI for the difference from $-0.30$ to $0.19\,\text{mins}$ in favour of Ultracell.)
Both populations have a normal distribution, and/or $n > 25$ or so.
Since $n < 25$ for both samples, we must assume the populations both have a normal distribution. Stem-and-leaf plots aren't convincing (possible outliers) but the samples are too small to know anything for sure.

Answer for Sect. 9.6.5

The sample sizes are both much larger than $25$ , so neither population needs to be normally distributed for statistical validity: the sample means will have an approximate normal distribution. None of the graphs are needed.