2.4 The t-statistic

Until now, we’ve assumed that we know the population variances of height among men, \(\sigma_{men}^2\), and among women, \(\sigma_{women}^2\). With this knowledge we can calculate the standard error of the difference in sample means, \(\sqrt{\frac{\sigma_{men}^2}{n_{men}} + \frac{\sigma_{women}^2}{n_{women}}}\), which in turn gives us \(Z\)-scores, accompanying p-values, and confidence intervals. In practice, however, we don’t know these population variances and have to estimate them from the data. The natural estimators are the sample variances, computed separately for each group: \(s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_{i} - \bar{x})^2\).
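To make the sample variance formula concrete, here is a minimal Python sketch. The function name `sample_variance` and the height values are made up for illustration; only the formula itself comes from the text.

```python
import statistics


def sample_variance(xs):
    """Sample variance with the n - 1 denominator, as in the formula for s^2."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    # Sum of squared deviations from the sample mean, divided by n - 1.
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Hypothetical heights in cm, invented for illustration only.
heights_men = [178.0, 182.5, 175.0, 180.0, 185.5]
s2_men = sample_variance(heights_men)

# The standard library computes the same quantity.
assert abs(s2_men - statistics.variance(heights_men)) < 1e-9
```

In practice `statistics.variance` (or `numpy.var` with `ddof=1`) does the same job; the hand-written version is only there to mirror the formula term by term.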

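Substituting the sample variances for the population variances, the estimator for the standard error, \(\sqrt{\frac{\sigma_{men}^2}{n_{men}} + \frac{\sigma_{women}^2}{n_{women}}}\), becomes:

\(\sqrt{\frac{s_{men}^2}{n_{men}} + \frac{s_{women}^2}{n_{women}}}\)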

important: equal variances assumed or not?
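The note above matters for how the standard error is estimated. As a rough sketch, the Python snippet below shows two common choices: the plug-in form, in which each population variance in the standard-error formula is replaced by its own sample variance (no equal-variance assumption), and a pooled form that assumes both groups share one variance. The function names and the height data are hypothetical, and the text does not say which variant is intended.

```python
import math
import statistics


def unpooled_standard_error(sample_a, sample_b):
    """Plug-in estimate: each group keeps its own sample variance."""
    s2_a = statistics.variance(sample_a)  # n - 1 denominator
    s2_b = statistics.variance(sample_b)
    return math.sqrt(s2_a / len(sample_a) + s2_b / len(sample_b))


def pooled_standard_error(sample_a, sample_b):
    """Pooled estimate: assumes both groups share a common variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    s2_a = statistics.variance(sample_a)
    s2_b = statistics.variance(sample_b)
    # Weighted average of the two sample variances by degrees of freedom.
    s2_pooled = ((n_a - 1) * s2_a + (n_b - 1) * s2_b) / (n_a + n_b - 2)
    return math.sqrt(s2_pooled * (1 / n_a + 1 / n_b))


# Hypothetical heights in cm, invented for illustration only.
heights_men = [178.0, 182.5, 175.0, 180.0, 185.5, 179.0]
heights_women = [165.0, 170.5, 162.0, 168.0]

se_unpooled = unpooled_standard_error(heights_men, heights_women)
se_pooled = pooled_standard_error(heights_men, heights_women)
print(f"unpooled: {se_unpooled:.3f}, pooled: {se_pooled:.3f}")
```

When the two groups have the same size, the two estimates coincide; they differ only when the sample sizes differ.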