6.5 \(p\)-value of a test

Example 6.15 In Example 6.14:

  1. Will the data support the decision of the foreman at the significance level \(\alpha=0.05\)?

    In this case, the critical value of the normal is \(z_{0.05}\approx 1.64\) and the observed value of the statistic was \(Z=5/3\approx 1.67>1.64.\) Therefore, at the significance level \(\alpha=0.05\) it is concluded that there is evidence in favor of the foreman's decision.

  2. What would be the lowest significance level for which the data support the foreman's decision and, therefore, \(H_0: p=0.10\) is rejected?

    The lowest significance level for which we would reject \(H_0\) is

    \[\begin{align*} \mathbb{P}(Z>5/3|p=0.10)\approx0.0478. \end{align*}\]

    This probability is precisely the level \(\alpha\) from which the decision of the test flips. It is the so-called \(p\)-value of the test.
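As a quick numerical check of the two parts of this example, the following sketch uses base R's `pnorm()` and `qnorm()`. The variable names are illustrative, and the observed statistic \(Z=5/3\) is taken from the text above.

```r
# Observed value of the test statistic, as given in the example
z_obs <- 5 / 3
alpha <- 0.05

# Part 1: compare the statistic with the upper-alpha critical value
z_crit <- qnorm(1 - alpha)
z_obs > z_crit   # TRUE means that H0 is rejected at level alpha

# Part 2: the p-value is the upper-tail probability under H0
p_value <- pnorm(z_obs, lower.tail = FALSE)
p_value < alpha  # same decision as in Part 1
```

Both comparisons lead to the same decision, which is the point of the example: comparing the statistic with the critical value and comparing the \(p\)-value with \(\alpha\) are two views of the same rule.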

The following block states the key relation used to reach a decision from the \(p\)-value of a test.

Given a test for \(H_0,\) the rejection decision of \(H_0\) at significance level \(\alpha\) depends on the \(p\)-value of the test:

\[\begin{align*} \begin{cases} p\text{-value}<\alpha \iff \text{Reject $H_0$ at level $\alpha;$}\\ p\text{-value}\geq\alpha \iff \text{Do not reject $H_0$ at level $\alpha.$} \end{cases} \end{align*}\]
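The rule above can be sketched as a one-line R function. The name `reject_h0` and the \(p\)-value \(0.03\) are hypothetical and serve only to illustrate how the decision changes with \(\alpha.\)

```r
# Hypothetical helper: TRUE means "reject H0 at level alpha"
reject_h0 <- function(p_value, alpha) {
  p_value < alpha
}

# The same (made-up) p-value evaluated at three usual significance levels
alphas <- c(0.01, 0.05, 0.10)
decisions <- reject_h0(p_value = 0.03, alpha = alphas)
setNames(decisions, alphas)
```

With this \(p\)-value, \(H_0\) is not rejected at \(\alpha=0.01\) but it is rejected at \(\alpha=0.05\) and \(\alpha=0.10.\)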

Definition 6.5 (\(p\)-value) The \(p\)-value of a hypothesis test is defined as the lowest significance level \(\alpha\) for which the test rejects the null hypothesis \(H_0.\)

Remark. The \(p\)-value can be informally regarded as a “measure of the degree of compatibility of \(H_0\) with the data”. A valid interpretation in terms of probability, which restates Definition 6.5, is: “the probability of obtaining a test statistic at least as unfavorable to \(H_0\) as the observed one, under \(H_0\)”.

Remark. The following are erroneous interpretations of the \(p\)-value: (1) “the probability of \(H_0\) being true given the data”; (2) “the probability of \(H_0\) vs. \(H_1\)”; (3) “the probability of the data given \(H_0\)”.

Depending on the kind of hypothesis to test, the \(p\)-value is computed in a different way. We differentiate three cases:

  1. One-sided right tests; \(H_0: \theta=\theta_0\) vs. \(H_1:\theta>\theta_0.\) Assume that the test statistic is \(T(X_1,\ldots,X_n)\) and the critical region is \[\begin{align*} C=\{(x_1,\ldots,x_n)'\in\mathbb{R}^n:T(x_1,\ldots,x_n)>k\}. \end{align*}\] If the observed value of the test statistic is \(T=t,\) then the \(p\)-value is \[\begin{align*} \text{$p$-value}:=\mathbb{P}(T\geq t|\theta=\theta_0). \end{align*}\]

  2. One-sided left tests; \(H_0: \theta=\theta_0\) vs. \(H_1:\theta<\theta_0.\) In this case the critical region is \[\begin{align*} C=\{(x_1,\ldots,x_n)'\in\mathbb{R}^n:T(x_1,\ldots,x_n)<k\} \end{align*}\] and, therefore, the \(p\)-value is \[\begin{align*} \text{$p$-value}:=\mathbb{P}(T\leq t|\theta=\theta_0). \end{align*}\]

  3. Two-sided tests; \(H_0: \theta=\theta_0\) vs. \(H_1:\theta\neq\theta_0.\) In this case, the critical region is of the form \[\begin{align*} C=\{(x_1,\ldots,x_n)'\in\mathbb{R}^n:T(x_1,\ldots,x_n)<k_1\ \text{or}\ T(x_1,\ldots,x_n)>k_2\}. \end{align*}\] The \(p\)-value is given by \[\begin{align*} \text{$p$-value}:=2\min\left\{\mathbb{P}(T\leq t|\theta=\theta_0),\mathbb{P}(T\geq t|\theta=\theta_0)\right\}. \end{align*}\] Observe that if the distribution of the test statistic \(T\) under \(H_0\) is symmetric about \(0\) (see footnote 79), such as the standard normal or Student’s \(t\) distribution, then \(\mathbb{P}(T\leq t|\theta=\theta_0) = \mathbb{P}(-T \leq t | \theta=\theta_0) = \mathbb{P}(T \geq -t | \theta=\theta_0)\) and the above minimum is not required: \[\begin{align*} \text{$p$-value}=2\mathbb{P}(T\leq -|t|\,|\,\theta=\theta_0)=2\mathbb{P}(T\geq |t|\,|\,\theta=\theta_0). \end{align*}\]
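The three cases above can be sketched in R as follows. The helper `p_value()` and its arguments are hypothetical (they are not part of base R), and the sketch assumes a continuous test statistic whose cdf under \(H_0\) is available as a function, so that \(\mathbb{P}(T\geq t)=1-\mathbb{P}(T\leq t).\) The observed value \(t=-1.2\) and the \(t_{10}\) null distribution are made up for illustration.

```r
# Hypothetical helper: p-value from the null cdf of a continuous statistic
p_value <- function(t_obs, cdf, type = c("right", "left", "two.sided")) {
  type <- match.arg(type)
  lower <- cdf(t_obs)      # P(T <= t | H0)
  upper <- 1 - cdf(t_obs)  # P(T >= t | H0), valid because T is continuous
  switch(type,
    right = upper,
    left = lower,
    two.sided = 2 * min(lower, upper)
  )
}

# Illustration with a statistic that is t-distributed with 10 df under H0
null_cdf <- function(x) pt(x, df = 10)
t_obs <- -1.2

p_right <- p_value(t_obs, null_cdf, "right")
p_left <- p_value(t_obs, null_cdf, "left")
p_two <- p_value(t_obs, null_cdf, "two.sided")

# Symmetric case: the minimum is not needed
p_two_sym <- 2 * pt(-abs(t_obs), df = 10)
all.equal(p_two, p_two_sym)
```

Since the \(t_{10}\) distribution is symmetric about \(0,\) the general two-sided formula and the simplified one agree up to floating-point error.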

From the above definitions of the \(p\)-value, it is clear that the \(p\)-value is a function of the observed value of the test statistic \(T=t.\) Therefore, before the sample is observed, the \(p\)-value is a rv. In addition, if the distribution of \(T\) is continuous, the \(p\)-value is uniformly distributed in \([0,1]\) under \(H_0\) (see Exercise 6.16). It is also not difficult to see that \(p\text{-value}<\alpha\) if and only if the observed test statistic \(T=t\) belongs to \(C.\)
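A small simulation can illustrate both claims. The sketch below assumes a setting that is not in the text: normal samples with known unit variance, a one-sided right \(Z\)-test of \(H_0:\mu=0,\) and arbitrary choices of \(n=20,\) \(M=10000\) replicates, and the seed.

```r
set.seed(42)
n <- 20       # sample size (arbitrary)
M <- 10000    # number of simulated samples (arbitrary)
alpha <- 0.05

# Z-statistics for H0: mu = 0 vs. H1: mu > 0, with data generated under H0
z <- replicate(M, sqrt(n) * mean(rnorm(n)))

# One-sided right p-values
p <- pnorm(z, lower.tail = FALSE)

# If the p-value is uniform under H0, this proportion is close to alpha
rejection_rate <- mean(p < alpha)
rejection_rate

# The decisions based on the p-value and on the critical region coincide
same_decision <- all((p < alpha) == (z > qnorm(1 - alpha)))
same_decision
```

A histogram of `p` (for instance, with `hist(p)`) should look approximately flat, in agreement with the uniform distribution of the \(p\)-value under \(H_0.\)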

Example 6.16 Assume that in Example 6.3 it has been observed that \(Y=3\) of the \(n=15\) sampled voters support the candidate. Would that result indicate that the candidate is going to lose the elections (reject \(H_0:p=0.5\)) at significance level \(\alpha=0.05\)?

The hypothesis to test is

\[\begin{align*} H_0:p=0.5\quad \text{vs.}\quad H_1:p<0.5. \end{align*}\]

Under \(H_0:p=0.5,\) \(Y\sim \mathrm{Bin}(n,0.5),\) so the \(p\)-value is given by

\[\begin{align*} \text{$p$-value} &=\mathbb{P}(Y\leq 3|p=0.5)=\sum_{y=0}^3 \binom{15}{y}(0.5)^{15} \\ &\approx0.018<\alpha=0.05. \end{align*}\]

Equivalently, it can be computed as:

pbinom(3, size = 15, prob = 0.5)
## [1] 0.01757813

Therefore, \(H_0: p=0.5\) is rejected in favor of \(H_1:p<0.5;\) that is, at the significance level \(\alpha=0.05,\) this result indicates that the candidate will not win the elections.
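As a cross-check, the same \(p\)-value can be obtained in three ways in base R: summing the binomial pmf, evaluating the binomial cdf, and calling `binom.test()` with a one-sided left alternative. The variable names are illustrative.

```r
y <- 3     # observed number of supporters
n <- 15    # sample size
p0 <- 0.5  # value of p under H0

# (a) Sum of the pmf over y = 0, ..., 3
p_sum <- sum(dbinom(0:y, size = n, prob = p0))

# (b) Binomial cdf evaluated at the observed value
p_cdf <- pbinom(y, size = n, prob = p0)

# (c) Exact binomial test for H1: p < 0.5
p_test <- binom.test(y, n, p = p0, alternative = "less")$p.value

c(p_sum, p_cdf, p_test)
```

The three values coincide, since the exact test with `alternative = "less"` reports \(\mathbb{P}(Y\leq y|p=p_0)\) as its \(p\)-value.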

Example 6.17 It is estimated that a particular flight is profitable if the average occupation rate during a year is at least \(60\%.\) An airline is interested in determining whether it is profitable to keep a particular flight operative. For that, they record the occupation rates of \(120\) random flights scattered around the year, resulting in a mean occupation rate of \(58\%\) and a quasistandard deviation of \(11\%.\) Considering that the occupation rates (in proportion) have an approximate normal distribution, is there enough evidence to cancel the flight because it is not profitable? Employ a significance level of \(\alpha=0.10.\)

Let \(\mu\) be the average occupation rate of the flight in one year. It is desired to test

\[\begin{align*} H_0:\mu=0.6\quad \text{vs.}\quad H_1:\mu<0.6. \end{align*}\]

The test statistic is

\[\begin{align*} T=\frac{\bar{X}-0.6}{S'/\sqrt{n}}=\frac{0.58-0.6}{0.11/\sqrt{120}}\approx-1.992. \end{align*}\]

Under \(H_0:\mu=0.6,\) the statistic is distributed as \(t_{119},\) so the \(p\)-value is

\[\begin{align*} \mathbb{P}(T\leq -1.992|\mu=0.6)\approx0.0243<\alpha=0.10. \end{align*}\]

The last probability can be computed as

pt(-1.992, df = 119)
## [1] 0.02433035

Therefore, \(H_0:\mu=0.6\) is rejected, that is, the sample indicates that the flight is not profitable.
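The whole computation can also be reproduced from the summary statistics of the example. In the sketch below the variable names are illustrative, and the statistic is not rounded before computing the \(p\)-value, so the result may differ slightly from the one obtained with \(-1.992.\)

```r
x_bar <- 0.58  # sample mean of the occupation rates
s <- 0.11      # quasistandard deviation
n <- 120       # number of sampled flights
mu0 <- 0.6     # value of mu under H0
alpha <- 0.10

# Observed value of the test statistic
t_obs <- (x_bar - mu0) / (s / sqrt(n))
t_obs

# One-sided left p-value under the t distribution with n - 1 df
p_value <- pt(t_obs, df = n - 1)
p_value < alpha  # TRUE means that H0 is rejected at level alpha

# Equivalent check with the critical value of the t distribution
t_obs < qt(alpha, df = n - 1)
```

Both checks lead to rejecting \(H_0:\mu=0.6\) at \(\alpha=0.10.\)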


  79. In this case, the distributions of \(T\) and \(-T\) are equal.