## 3.1 Explaining the one-sample \(t\)-test results

If we begin by assuming \(H_0\) is true, then we assume we have \(\mu = \mu_0 = 5\). To carry out the \(t\)-test, we use the \(t_{71}\) distribution, which is pictured below:

The above distribution is called the **distribution under \(H_0\)**. We can see that the mean of the above distribution is at \(t = 0\), which, due to *standardisation*, represents the value \(\mu_0 = 5\).

The **test statistic** can be thought of as a *standardised* version of the sample mean. The test statistic can be calculated as

\[t = \displaystyle \frac{\bar{x} - \mu_0}{\text{SE}} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}, \]

where:

- \(t\) denotes the *test statistic*
- \(\text{SE}\) refers to the *Standard Error*. The standard error is an *estimate* of the *standard deviation of the mean*, which, as we know from the previous topic, is \(\frac{s}{\sqrt{n}}\).

In our example, our test statistic can be calculated as

\[t = \displaystyle \frac{5.13 - 5}{0.5/\sqrt{72}} = 2.2062.\]
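This calculation can be reproduced from the summary statistics alone. As a minimal sketch in Python (the variable names are ours, not part of the original example):

```python
import math

x_bar = 5.13  # sample mean
mu_0 = 5      # hypothesised mean under H0
s = 0.5       # sample standard deviation
n = 72        # sample size

se = s / math.sqrt(n)          # standard error of the mean
t_stat = (x_bar - mu_0) / se   # standardised test statistic

print(round(t_stat, 4))  # 2.2062
```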
Now comes the all-important question: If it were true that \(\mu = 5\), what are the chances that, when we took our sample of 72 patients, we would have seen this sample mean of \(\bar{x} = 5.13\) (which translates to a test statistic of \(t = 2.2062\)), **or more extreme**? Is our test statistic *extreme* in the context of the above \(t\)-distribution, which assumes \(H_0\) is true? Let's have a look:
As it turns out, our test statistic **is** fairly *extreme* in the context of this distribution, because the probability of observing this test statistic, or one more extreme, *if \(H_0\) is true* is only \(p = 0.0306\). That is:

- \(P(T \leq -2.2062) + P(T \geq 2.2062) = 0.0306\)

This probability is our **\(p\)-value**.
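If SciPy is available, this two-sided \(p\)-value can be recovered from the upper tail of the \(t_{71}\) distribution. A sketch (SciPy is our assumption here, not something used in the original notes):

```python
from scipy import stats

t_stat = 2.2062  # test statistic calculated above
df = 71          # degrees of freedom: n - 1 = 72 - 1

# Two-sided p-value: probability mass in both tails beyond |t|,
# assuming H0 is true. stats.t.sf gives the upper-tail probability.
p_value = 2 * stats.t.sf(abs(t_stat), df)

print(round(p_value, 4))  # approximately 0.0306
```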

In the type of hypothesis test we have done here, we were only interested in whether \(\mu\) was **different from** 5, which is why we have included the probability of seeing a test statistic at least as extreme as what we have seen *in either direction*, that is, greater than 2.2062 or less than -2.2062. This is called a **two-sided test**. This point will be further explained in the following sections.

Because our \(p\)-value was small, this means we have enough evidence to **reject \(H_0\)**. Therefore, we have evidence to support the alternative hypothesis that \(\mu \neq 5\), i.e. this result is *statistically significant*.

How small does our \(p\)-value need to be for us to decide that the test statistic is extreme enough for us to reject \(H_0\)? The answer comes in the **level of significance, \(\alpha\)** (Greek letter, 'alpha'). In general, the standard level of significance is \(\alpha = 0.05\), although other levels of \(\alpha\) can be chosen. That is,

- if \(p < \alpha\), reject \(H_0\)
- if \(p > \alpha\), do not reject \(H_0\)

Note the wording above for when \(p > \alpha\): *do not reject \(H_0\)*. Just because we do not have enough evidence to reject \(H_0\) does not mean we have proven \(H_0\) is true. So it is best to avoid using terms like *accept \(H_0\)*.
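The decision rule above can be sketched directly, using the values from our worked example (a minimal illustration only):

```python
alpha = 0.05      # level of significance
p_value = 0.0306  # p-value from the test above

# Compare the p-value against the level of significance.
# Note: we say "do not reject H0" rather than "accept H0".
if p_value < alpha:
    decision = "reject H0"
else:
    decision = "do not reject H0"

print(decision)  # reject H0
```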

The method we have used above to carry out the hypothesis test is called the **\(p\)-value approach**.

There is another method we could use called the **critical region approach**. To understand this, let's consider the question, *if \(\alpha = 0.05\), how extreme would our test statistic need to be in order to reject \(H_0\)?*

To answer this question, we can find the quantiles \(\pm t\) such that \(P(T \leq -t) + P(T \geq t) = 0.05\), as represented below:

As we can see, \(P(T \leq -1.99) + P(T \geq 1.99) = 0.05\). This means that if our test statistic fell outside the range \((-1.99, 1.99)\) (i.e., was greater than 1.99 or less than -1.99), then we would say it falls in the **critical region** and we would reject \(H_0\), because any such value of \(t\) would result in \(p < 0.05\).
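The critical value itself can be found from the quantiles of the \(t_{71}\) distribution. A sketch using SciPy (again, our choice of tool, not the original's):

```python
from scipy import stats

alpha = 0.05  # level of significance
df = 71       # degrees of freedom

# Critical value t* such that P(T <= -t*) + P(T >= t*) = alpha,
# i.e. the upper (1 - alpha/2) quantile of the t distribution.
t_crit = stats.t.ppf(1 - alpha / 2, df)

print(round(t_crit, 2))  # 1.99
```

Since our test statistic \(t = 2.2062\) exceeds this critical value, it falls in the critical region, agreeing with the \(p\)-value approach.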