6.5 $p$-value of a test | A First Course on Statistical Inference

6.5 $p$-value of a test

Example 6.15 In Example 6.14:

Do the data support the decision of the foreman at the significance level $\alpha=0.05$?

In this case, the critical value of the normal is $z_{0.05}\approx 1.64$ and the observed value of the statistic was $Z=5/3\approx 1.67>1.64.$ Therefore, at the significance level $\alpha=0.05,$ it is concluded that there is evidence in favor of the foreman's decision.

What would be the lowest significance level for which the data would support the foreman's decision and, therefore, $H_0: p=0.10$ would be rejected?

The lowest significance level for which we would reject $H_0$ is

$\begin{align*} \mathbb{P}(Z>5/3|p=0.10)\approx0.0478. \end{align*}$

This probability is precisely the level $\alpha$ at which the decision of the test flips. It is the so-called $p$-value of the test.
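As a minimal sketch, the probability above can be evaluated in R with `pnorm()`, assuming, as in the example, that $Z$ is approximately $\mathcal{N}(0,1)$ under $H_0.$ The variable names `z_obs` and `p_value` are illustrative only.

```r
# Observed value of the statistic in Example 6.15
z_obs <- 5 / 3

# Upper-tail probability P(Z > z_obs) for Z ~ N(0, 1)
p_value <- pnorm(z_obs, lower.tail = FALSE)
p_value

# Critical value z_{0.05}, for comparison with z_obs
qnorm(0.05, lower.tail = FALSE)
```

Comparing `z_obs` with the critical value and comparing `p_value` with $0.05$ are two views of the same decision.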

The following result anticipates the key practical relation for using the $p$-value of a test to reach a decision.

Given a test for $H_0,$ the decision on whether to reject $H_0$ at significance level $\alpha$ depends on the $p$-value of the test:

$\begin{align*} \begin{cases} p\text{-value}<\alpha \iff \text{Reject $H_0$ at level $\alpha;$}\\ p\text{-value}\geq\alpha \iff \text{Do not reject $H_0$ at level $\alpha.$} \end{cases} \end{align*}$
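The rule above could be sketched in R as follows. The function `decide()` is a hypothetical helper introduced here for illustration, not part of base R, and the $p$-value used is the one from Example 6.15 under its normal approximation.

```r
# Hypothetical helper: maps a p-value and a significance level to a decision
decide <- function(p_value, alpha) {
  stopifnot(p_value >= 0, p_value <= 1, alpha > 0, alpha < 1)
  if (p_value < alpha) "Reject H0" else "Do not reject H0"
}

# p-value of Example 6.15 (upper tail of a standard normal at 5/3)
p_value <- pnorm(5 / 3, lower.tail = FALSE)

# The same p-value leads to different decisions for different alpha
decide(p_value, alpha = 0.05)
decide(p_value, alpha = 0.01)
```

A single $p$-value therefore summarizes the decision of the test for every possible $\alpha$ at once.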

Definition 6.5 ( $p$ -value) The $p$ -value of a hypothesis test is defined as the lowest significance level $\alpha$ for which the test rejects the null hypothesis $H_0.$

Remark. The $p$ -value can be informally regarded as a “measure of the degree of compatibility of $H_0$ with the data”. A valid interpretation in terms of probability is a restatement of that in Definition 6.5: “the probability of obtaining a test statistic at least as unfavorable to $H_0$ as the observed one, under $H_0.$ ”

Remark. The following are erroneous interpretations of the $p$ -value: (1) “the probability of $H_0$ being true given the data”; (2) “the probability of $H_0$ vs. $H_1$ ”; (3) “the probability of the data given $H_0$ ”.

The $p$-value is computed differently depending on the kind of hypothesis being tested. We distinguish three cases:

One-sided right tests; $H_0: \theta=\theta_0$ vs. $H_1:\theta>\theta_0.$ Assume that the test statistic is $T(X_1,\ldots,X_n)$ and the critical region is $\begin{align*} C=\{(x_1,\ldots,x_n)'\in\mathbb{R}^n:T(x_1,\ldots,x_n)>k\}. \end{align*}$ If the observed value of the test statistic is $T=t,$ then the $p$ -value is $\begin{align*} \text{$p$-value}:=\mathbb{P}(T\geq t|\theta=\theta_0). \end{align*}$
One-sided left tests; $H_0: \theta=\theta_0$ vs. $H_1:\theta<\theta_0.$ In this case the critical region is $\begin{align*} C=\{(x_1,\ldots,x_n)'\in\mathbb{R}^n:T(x_1,\ldots,x_n)<k\} \end{align*}$ and, therefore, the $p$ -value is $\begin{align*} \text{$p$-value}:=\mathbb{P}(T\leq t|\theta=\theta_0). \end{align*}$
Two-sided tests; $H_0: \theta=\theta_0$ vs. $H_1:\theta\neq\theta_0.$ In this case, the critical region is of the form $\begin{align*} C=\{(x_1,\ldots,x_n)'\in\mathbb{R}^n:T(x_1,\ldots,x_n)<k_1\ \text{or}\ T(x_1,\ldots,x_n)>k_2\}. \end{align*}$ The $p$-value is given by $\begin{align*} \text{$p$-value}:=2\min\left\{\mathbb{P}(T\leq t|\theta=\theta_0),\mathbb{P}(T\geq t|\theta=\theta_0)\right\}. \end{align*}$ Observe that if the distribution of the test statistic $T$ under $H_0$ is symmetric about $0,$ ⁷⁹ such as the standard normal or Student's $t$ distribution, then $\mathbb{P}(T\leq t|\theta=\theta_0) = \mathbb{P}(-T \leq t | \theta=\theta_0) = \mathbb{P}(T \geq -t | \theta=\theta_0)$ and the above minimum is not required: $\begin{align*} \text{$p$-value}=2\mathbb{P}(T\leq -|t||\theta=\theta_0)=2\mathbb{P}(T\geq |t||\theta=\theta_0). \end{align*}$
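The three cases can be sketched in a single R function. The helper `p_value_from_cdf()` and its arguments are hypothetical names chosen for this illustration. The sketch assumes that $T$ is continuous under $H_0,$ so that $\mathbb{P}(T\geq t)=1-\mathbb{P}(T\leq t).$ The values $t=-2.1$ and $10$ degrees of freedom are arbitrary.

```r
# Hypothetical helper: p-value from the observed statistic t_obs and the cdf
# of T under H0 (assumed continuous, so P(T >= t) = 1 - P(T <= t))
p_value_from_cdf <- function(t_obs, cdf,
                             type = c("right", "left", "two.sided")) {
  type <- match.arg(type)
  lower <- cdf(t_obs)  # P(T <= t | theta = theta_0)
  upper <- 1 - lower   # P(T >= t | theta = theta_0)
  switch(type,
         right = upper,
         left = lower,
         two.sided = 2 * min(lower, upper))
}

# Illustration with T ~ t_10 under H0 and an arbitrary observed value
cdf_t10 <- function(x) pt(x, df = 10)
p_value_from_cdf(-2.1, cdf_t10, type = "left")
p_value_from_cdf(-2.1, cdf_t10, type = "two.sided")

# For a distribution symmetric about 0, this matches 2 * P(T >= |t|)
2 * pt(abs(-2.1), df = 10, lower.tail = FALSE)
```

For a discrete test statistic, `upper` would instead have to be computed as $\mathbb{P}(T\geq t)$ directly, since then $\mathbb{P}(T\geq t)\neq 1-\mathbb{P}(T\leq t)$ in general.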

From the above definitions, it is clear that the $p$-value is a function of the test statistic $T.$ Therefore, before the sample is observed, the $p$-value is itself a rv. In addition, when the distribution of $T$ is continuous, the $p$-value is uniformly distributed in $[0,1]$ under $H_0$ (see Exercise 6.16). It is also not difficult to see that $p\text{-value}<\alpha$ if and only if the observed sample belongs to the critical region $C.$
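The uniformity of the $p$-value under $H_0$ can be checked empirically with a small simulation. The following sketch assumes normal samples and a one-sided left $t$-test of $H_0:\mu=0.$ The seed, sample size, and number of replicates are arbitrary choices.

```r
set.seed(42)  # arbitrary seed, for reproducibility

# Simulate M samples of size n from N(0, 1), so that H0: mu = 0 is true
M <- 1e4
n <- 20
p_values <- replicate(M, {
  x <- rnorm(n)
  t_obs <- mean(x) / (sd(x) / sqrt(n))  # one-sample t statistic
  pt(t_obs, df = n - 1)                 # one-sided left p-value
})

# Under H0, the proportion of p-values below alpha should be close to alpha
mean(p_values < 0.05)
mean(p_values < 0.10)

# The sample quartiles should be close to 0.25, 0.50, and 0.75
quantile(p_values, probs = c(0.25, 0.50, 0.75))
```

In particular, a test that rejects when $p\text{-value}<\alpha$ wrongly rejects a true $H_0$ in a proportion of samples close to $\alpha.$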

Example 6.16 Assume that in Example 6.3 it has been observed that $Y=3$ of the $n=15$ sampled voters support the candidate. Would that result indicate that the candidate is going to lose the elections (reject $H_0:p=0.5$ ) at significance level $\alpha=0.05$ ?

The hypothesis to test is

$\begin{align*} H_0:p=0.5\quad \text{vs.}\quad H_1:p<0.5. \end{align*}$

Since $Y\sim \mathrm{Bin}(n,0.5)$ under $H_0:p=0.5,$ the $p$-value is given by

$\begin{align*} \text{$p$-value} &=\mathbb{P}(Y\leq 3|p=0.5)=\sum_{y=0}^3 \binom{15}{y}(0.5)^{15} \\ &\approx0.018<\alpha=0.05. \end{align*}$

Equivalently, it can be computed as:

pbinom(3, size = 15, prob = 0.5)
## [1] 0.01757813

Therefore, $H_0: p=0.5$ is rejected in favor of $H_1:p<0.5;$ that is, at the significance level $\alpha=0.05,$ this result indicates that the candidate will not win the elections.
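As a cross-check, the same $p$-value can be obtained with the exact binomial test in R's `stats` package. This is only a sketch of an alternative route to the number above; the object name `test` is illustrative.

```r
# Exact binomial test of H0: p = 0.5 vs. H1: p < 0.5 with Y = 3 and n = 15
test <- binom.test(x = 3, n = 15, p = 0.5, alternative = "less")
test$p.value

# Same value as the direct computation with the binomial cdf
pbinom(3, size = 15, prob = 0.5)

# Decision at alpha = 0.05: TRUE means that H0 is rejected
test$p.value < 0.05
```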

Example 6.17 It is estimated that a particular flight is profitable if the average occupation rate during a year is at least $60\%.$ An airline is interested in determining whether it is profitable to keep a particular flight operative. For that, they record the occupation rates of $120$ random flights scattered around the year, resulting in a mean occupation rate of $58\%$ and a quasistandard deviation of $11\%.$ Considering that the occupation rates (in proportion) have an approximate normal distribution, is there enough evidence to cancel the flight because it is not profitable? Employ a significance level of $\alpha=0.10.$

Let $\mu$ be the average occupation rate of the flight in one year. It is desired to test

$\begin{align*} H_0:\mu=0.6\quad \text{vs.}\quad H_1:\mu<0.6. \end{align*}$

The test statistic is

$\begin{align*} T=\frac{\bar{X}-0.6}{S'/\sqrt{n}}=\frac{0.58-0.6}{0.11/\sqrt{120}}\approx-1.992. \end{align*}$

Under $H_0:\mu=0.6,$ the statistic is distributed as $t_{119},$ so the $p$-value is

$\begin{align*} \mathbb{P}(T\leq -1.992|\mu=0.6)\approx0.0243<\alpha=0.10. \end{align*}$

The last probability can be computed as

pt(-1.992, df = 119)
## [1] 0.02433035

Therefore, $H_0:\mu=0.6$ is rejected, that is, the sample indicates that the flight is not profitable.
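The whole computation can be reproduced from the summary statistics of the example. In this sketch the variable names are illustrative, and the statistic is not rounded before computing the $p$-value, so the result may differ slightly in the last digits from the one obtained with $-1.992.$

```r
# Summary statistics from Example 6.17
n <- 120       # number of sampled flights
x_bar <- 0.58  # sample mean of the occupation rate
s <- 0.11      # sample quasistandard deviation
mu_0 <- 0.6    # value of mu under H0
alpha <- 0.10  # significance level

# Test statistic and one-sided left p-value under t_{n - 1}
t_obs <- (x_bar - mu_0) / (s / sqrt(n))
p_value <- pt(t_obs, df = n - 1)
c(t_obs = t_obs, p_value = p_value)

# Decision: TRUE means that H0 is rejected at level alpha
p_value < alpha
```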

⁷⁹ In this case, the distributions of $T$ and $-T$ are equal!
