30.7 Statistical validity conditions: Two independent means

As usual, these results apply under certain conditions, which are the same as those for forming a CI for the difference between two means.

The test above is statistically valid if one of these conditions is true:

Both sample sizes are at least 25; or
Either sample size is smaller than 25, and both populations have an approximate normal distribution.

The sample size of 25 is a rough figure here, and some books give other values (such as 30). We can explore the histograms of the samples to determine if normality of the populations seems reasonable.
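The condition above can be written as a small check. This is only a sketch: the function name `is_statistically_valid` and its parameters are hypothetical, the threshold of 25 is the rough figure used in this section, and whether the populations are approximately normal is a judgement the reader supplies (for example, after looking at histograms), not something the code can decide.

```python
def is_statistically_valid(n1, n2, populations_approx_normal=False, min_n=25):
    """Sketch of the statistical validity condition for two independent means.

    n1, n2: the two sample sizes.
    populations_approx_normal: the reader's judgement (e.g. from histograms).
    min_n: rough threshold; 25 here, though some books use other values such as 30.
    """
    # Condition 1: both sample sizes are at least the threshold.
    if n1 >= min_n and n2 >= min_n:
        return True
    # Condition 2: a sample is small, so both populations must be
    # approximately normal.
    return populations_approx_normal


print(is_statistically_valid(40, 50))   # both samples large enough
print(is_statistically_valid(12, 40))   # small sample, normality not assumed
print(is_statistically_valid(12, 40, populations_approx_normal=True))
```

Passing `min_n=30` applies the stricter figure that some books use.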

In addition to the statistical validity condition, the test will be

internally valid if the study was well designed; and
externally valid if the sample is a simple random sample and the study is internally valid.

Example 30.1 (Statistical validity) For the reaction-time data, both sample sizes are \(n=32\). Since both sample sizes are at least 25, the results will be statistically valid.

Explicitly, the data in each group do not need to be normally distributed, since both sample sizes are larger than 25.

Example 30.2 (Gray whales) A study of gray whales (Eschrichtius robustus) measured (among other things) the length of adult whales (Agbayani et al. 2020). The data are shown below.

Sex	Mean (in m)	Standard deviation (in m)	Sample size
Female	12.70	0.611	260
Male	12.07	0.705	139

Are adult female gray whales longer than males, on average?

Let’s define the difference as the mean length of female gray whales minus the mean length of male gray whales. Then we wish to estimate the difference \(\mu_F - \mu_M\), where \(F\) and \(M\) represent female and male gray whales respectively; this is the parameter of interest. The best estimate of this difference is \(\bar{x}_F - \bar{x}_M = 12.70 - 12.07 = 0.63\) m.

The hypotheses are:

\(H_0\): \(\mu_F - \mu_M = 0\)
\(H_1\): \(\mu_F - \mu_M \ne 0\)

We know that the difference between the sample means is likely to vary from sample to sample, and hence it has a standard error.

We cannot easily determine the standard error of this difference from the above information (though it is possible), so we must be given this information: \(\text{s.e.}(\bar{x}_F - \bar{x}_M) = 0.07079\).
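For readers curious about how the given value arises, the sketch below uses the standard formula for the standard error of the difference between two independent sample means, \(\sqrt{s_F^2/n_F + s_M^2/n_M}\), with the values from the table. The variable names are chosen here for illustration only.

```python
import math

# Summary statistics from the gray whale table (lengths in metres).
s_f, n_f = 0.611, 260   # females: standard deviation, sample size
s_m, n_m = 0.705, 139   # males: standard deviation, sample size

# Standard error of the difference between two independent sample means.
se_diff = math.sqrt(s_f**2 / n_f + s_m**2 / n_m)

print(f"s.e. of the difference: {se_diff:.5f} m")
```

The result agrees with the value given above to five decimal places.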

The test statistic is

\[ t = \frac{(\bar{x}_F - \bar{x}_M) - (\mu_F - \mu_M)}{\text{s.e.}(\bar{x}_F - \bar{x}_M)} = \frac{0.63 - 0}{0.07079} = 8.90, \] which is very large: the observed difference is almost nine standard errors from the hypothesised value of zero. This means that the \(P\)-value will be very small (using the 68–95–99.7 rule).
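The arithmetic can be checked with a short sketch. The two-tailed \(P\)-value here is approximated from the normal distribution (an assumption made for this illustration, in the spirit of the 68–95–99.7 rule, and reasonable because both samples are large); statistical software would normally use a \(t\)-distribution and may report a slightly different value.

```python
import math

# Values from the example (lengths in metres).
mean_f, mean_m = 12.70, 12.07
se_diff = 0.07079           # standard error given in the text
null_diff = 0               # difference under the null hypothesis

# Test statistic: (estimate - hypothesised value) / standard error.
t = ((mean_f - mean_m) - null_diff) / se_diff

# Approximate two-tailed P-value using the normal distribution:
# P(|Z| > |t|) = erfc(|t| / sqrt(2)).
p_two_tailed = math.erfc(abs(t) / math.sqrt(2))

print(f"t = {t:.2f}")
print(f"approximate two-tailed P = {p_two_tailed:.3g}")
```

The approximate \(P\)-value is far below 0.001, consistent with the reasoning above.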

We write:

There is very strong evidence (\(t = 8.90\); two-tailed \(P < 0.001\)) that the mean length of adult gray whales is different for females (mean: 12.70 m; standard deviation: 0.611 m) and males (mean: 12.07 m; standard deviation: 0.705 m; 95% CI for the difference: 0.49 m to 0.77 m).

Since both sample sizes (260 and 139) are at least 25, the test is statistically valid.

(Check that you can compute the correct CI!)
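One way to carry out this check is sketched below. It assumes an approximate 95% CI of the form estimate ± (2 × standard error), with the multiplier of 2 taken from the 68–95–99.7 rule; software that uses an exact \(t\)-multiplier would give slightly different limits.

```python
# Values from the example (lengths in metres).
mean_f, mean_m = 12.70, 12.07
se_diff = 0.07079           # standard error given in the text
multiplier = 2              # assumed approximate multiplier for a 95% CI

estimate = mean_f - mean_m
margin = multiplier * se_diff

ci_lower = estimate - margin
ci_upper = estimate + margin

# Limits are roughly 0.488 m and 0.772 m before rounding.
print(f"approximate 95% CI: {ci_lower:.3f} m to {ci_upper:.3f} m")
```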

References

Agbayani S, Fortune SME, Trites AW. Growth and development of North Pacific gray whales (Eschrichtius robustus). Journal of Mammalogy. 2020;101(3):742–54.