Chapter 1 The \(t\)-distribution
Before we go much further into the topic of hypothesis testing, there is one problem we need to overcome: in the previous topic, when we derived the distribution of the sample mean \(\overline{X}\), we assumed that the population standard deviation \(\sigma\) was known. For example, if we knew that \(X\) was normally distributed with a mean of \(\mu = 5\) and a standard deviation of \(\sigma = 1\), and a random sample of size \(n = 72\) was taken, then the distribution of the sample mean would be:
\[\overline{X} \sim N\left(5, \displaystyle \frac{1^2}{72}\right) \approx N\left(5, \displaystyle 0.0139\right).\]
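For concreteness, here is a minimal sketch of this calculation in Python using scipy (not part of the original notes); the cut-off of 5.25 used for the tail probability is purely an arbitrary illustration.

```python
from scipy import stats

mu, sigma, n = 5, 1, 72           # population mean, population sd, sample size
var_xbar = sigma**2 / n           # variance of the sample mean: sigma^2 / n
print(round(var_xbar, 4))         # 0.0139

# With sigma known, Xbar ~ N(mu, sigma^2 / n); e.g. an upper-tail probability:
xbar_dist = stats.norm(loc=mu, scale=sigma / n**0.5)
print(round(xbar_dist.sf(5.25), 4))   # P(Xbar >= 5.25), roughly 0.017
```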
The problem here is that in order to arrive at this distribution for \(\overline{X}\), we assumed that \(\sigma\), the true population standard deviation, was known. In practice, however, it is rarely the case that the true population standard deviation \(\sigma\) is known. The best we normally have is the sample standard deviation, which was introduced in Topic 2:
\[s = \sqrt{\displaystyle \frac{1}{n - 1}\sum_{i = 1}^n (x_i - \bar{x})^2},\] where \(s\) is the sample standard deviation, and \(\bar{x}\) is the sample mean.
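As a quick illustration, the sketch below (the five data values are made up purely for demonstration) computes \(s\) directly from this formula and checks it against numpy's built-in calculation with the \(n - 1\) divisor.

```python
import numpy as np

x = np.array([4.8, 5.1, 5.4, 4.9, 5.2])                 # hypothetical sample
s = np.sqrt(np.sum((x - x.mean())**2) / (len(x) - 1))   # formula with the n - 1 divisor
print(s)
print(np.std(x, ddof=1))                                 # same result via numpy
```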
To account for the extra uncertainty introduced by using \(s\) in place of \(\sigma\), rather than using the normal distribution for the sample mean we will use Student's \(t\)-distribution, or simply the \(t\)-distribution.
We can think of the \(t\)-distribution as very similar to the Standard Normal Distribution in many ways: both distributions are symmetric and bell-shaped, and both have a mean of \(\mu = 0\). To understand the differences, consider the figure below.
The \(t\)-distribution is defined by one parameter: the degrees of freedom (df). In the examples we will be looking at in this topic, the degrees of freedom is equal to \(n - 1\). We can see above that with a small number of degrees of freedom, df = 1 (blue line), the distribution is more spread out, with fatter tails, than the Standard Normal Distribution (black line). We can also see that the higher the degrees of freedom (that is, the larger the sample size), the closer the \(t\)-distribution gets to the Standard Normal Distribution.
If a continuous random variable \(T\) follows a \(t\)-distribution, then we can write \(T \sim t_{\text{df}}\). So for example, the distribution of the blue line above is \(T \sim t_1\).
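To see this convergence towards the Standard Normal Distribution numerically, the sketch below (again scipy, with the degrees of freedom values chosen arbitrarily) computes the upper-tail probability \(P(T \geq 2.5)\) for increasing df and compares it with \(P(Z \geq 2.5)\).

```python
from scipy import stats

for df in (1, 5, 30, 100):
    print(df, round(stats.t.sf(2.5, df), 4))    # P(T >= 2.5) for T ~ t_df
print("normal", round(stats.norm.sf(2.5), 4))   # P(Z >= 2.5), approximately 0.0062
```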
Looking a bit more closely at the figure above, we can see shaded regions representing the area under each respective curve for \(z \geq 2.5\) or \(t \geq 2.5\). The shaded area under the blue curve is clearly the largest, and the area under the black curve the smallest. Recalling that these areas represent probabilities, they can be expressed as follows:
- \(P(Z \geq 2.5) = 0.0062\) for \(Z \sim N(0, 1)\)
- \(P(T \geq 2.5) = 0.1211\) for \(T \sim t_1\)
What does this mean? Well, within the context of the Standard Normal Distribution, a value of 2.5 would be considered quite extreme, because the probability of seeing a value of 2.5 or greater is only 0.0062. On the other hand, within the context of the \(t_1\) distribution, a value of 2.5 would not be considered extreme: there is a probability of 0.1211 of seeing a value of 2.5 or greater. For a value to be considered just as extreme (with a probability of 0.0062) within the context of the \(t_1\) distribution, it would have to be much larger: 51.33!
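That "equally extreme" value can be checked directly with software; the short sketch below (scipy again, not part of the original notes) inverts the upper-tail probability for the \(t_1\) distribution.

```python
from scipy import stats

# Value t* such that P(T >= t*) = 0.0062 for T ~ t_1 -- the "equally extreme" cut-off.
print(round(stats.t.isf(0.0062, df=1), 2))   # 51.33
```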
We can understand this to mean that the \(t\)-distribution is more conservative than the Standard Normal Distribution in terms of which values are considered extreme, and that the smaller the sample size, the more conservative the \(t\)-distribution becomes. These concepts will be important when it comes to carrying out hypothesis tests.
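As a final illustration of this conservativeness (another scipy sketch, with the df values chosen arbitrarily), we can compare the cut-off that leaves 2.5% in the upper tail of each \(t\)-distribution against the corresponding Standard Normal cut-off of about 1.96: the smaller the df, the further out the \(t\) cut-off sits, and as df grows it shrinks back towards 1.96.

```python
from scipy import stats

for df in (2, 5, 10, 30, 1000):
    print(df, round(stats.t.ppf(0.975, df), 3))   # upper 2.5% cut-off for t_df
print("normal", round(stats.norm.ppf(0.975), 3))  # about 1.96
```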