## 6.5 $$p$$-value of a test

Example 6.15 In Example 6.14:

1. Will the data support the decision of the foreman at the significance level $$\alpha=0.05$$?

In this case, the critical value of the standard normal is $$z_{0.05}\approx 1.64$$ and the observed value of the statistic was $$Z=5/3\approx 1.67>1.64.$$ Therefore, at the significance level $$\alpha=0.05$$ it is concluded that there is evidence in favor of the foreman's decision.

2. What would be the lowest significance level for which the data will support the foreman's decision and, therefore, $$H_0: p=0.10$$ would be rejected?

The lowest significance level for which we would reject $$H_0$$ is

\begin{align*} \mathbb{P}(Z>5/3|p=0.10)\approx0.0478. \end{align*}

This probability is precisely the level $$\alpha$$ from which the decision of the test flips. It is the so-called $$p$$-value of the test.
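As a quick numerical check, the tail probability above can be evaluated in R with `pnorm()`. The sketch below assumes only the observed value $$Z=5/3$$ from the example; the variable names are illustrative. Software evaluates the tail at exactly $$5/3,$$ so the result may differ in the third decimal place from a value read off a normal table.

```r
# Observed value of the test statistic in Example 6.15
z_obs <- 5 / 3

# Upper-tail probability of a standard normal: P(Z > z_obs) under H_0
p_value <- pnorm(z_obs, lower.tail = FALSE)
p_value

# The test rejects H_0 at level alpha exactly when p_value < alpha
p_value < 0.05  # rejection at alpha = 0.05
p_value < 0.01  # no rejection at alpha = 0.01
```

Lowering $$\alpha$$ below the computed probability flips the decision from rejection to non-rejection, which is what the example describes.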

The following block states the key relation for using the $$p$$-value of a test to reach a decision.

Given a test for $$H_0,$$ the rejection decision of $$H_0$$ at significance level $$\alpha$$ depends on the $$p$$-value of the test:

\begin{align*} \begin{cases} p\text{-value}<\alpha \iff \text{Reject } H_0 \text{ at level } \alpha;\\ p\text{-value}\geq\alpha \iff \text{Do not reject } H_0 \text{ at level } \alpha. \end{cases} \end{align*}
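The decision rule above can be sketched as a small R function. The helper `decide()` and the $$p$$-value $$0.03$$ used to illustrate it are hypothetical and not part of the text.

```r
# Hypothetical helper (not from the text): turns a p-value into a decision
decide <- function(p_value, alpha) {
  stopifnot(p_value >= 0, p_value <= 1, alpha > 0, alpha < 1)
  if (p_value < alpha) "Reject H_0" else "Do not reject H_0"
}

# Illustrative p-value of 0.03 (made-up number)
decide(0.03, alpha = 0.05)  # p-value < alpha
decide(0.03, alpha = 0.01)  # p-value >= alpha

# Boundary case: p-value equal to alpha does not lead to rejection
decide(0.05, alpha = 0.05)
```

The same $$p$$-value can lead to different decisions depending on $$\alpha,$$ which is why reporting the $$p$$-value is more informative than reporting only the decision at a single level.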

Definition 6.5 ($$p$$-value) The $$p$$-value of a hypothesis test is defined as the lowest significance level $$\alpha$$ for which the test rejects the null hypothesis $$H_0.$$

Remark. The $$p$$-value can be informally regarded as a “measure of the degree of compatibility of $$H_0$$ with the data”. A valid interpretation in terms of probability, which restates Definition 6.5, is: “the probability of obtaining a test statistic at least as unfavorable to $$H_0$$ as the observed one, under $$H_0$$”.

Remark. The following are erroneous interpretations of the $$p$$-value: (1) “the probability of $$H_0$$ being true given the data”; (2) “the probability of $$H_0$$ vs. $$H_1$$”; (3) “the probability of the data given $$H_0$$”.

Depending on the kind of hypothesis to test, the $$p$$-value is computed in a different way. We differentiate three cases:

1. One-sided right tests; $$H_0: \theta=\theta_0$$ vs. $$H_1:\theta>\theta_0.$$ Assume that the test statistic is $$T(X_1,\ldots,X_n)$$ and the critical region is \begin{align*} C=\{(x_1,\ldots,x_n)'\in\mathbb{R}^n:T(x_1,\ldots,x_n)>k\}. \end{align*} If the observed value of the test statistic is $$T=t,$$ then the $$p$$-value is \begin{align*} \text{p-value}:=\mathbb{P}(T\geq t|\theta=\theta_0). \end{align*}

2. One-sided left tests; $$H_0: \theta=\theta_0$$ vs. $$H_1:\theta<\theta_0.$$ In this case the critical region is \begin{align*} C=\{(x_1,\ldots,x_n)'\in\mathbb{R}^n:T(x_1,\ldots,x_n)<k\} \end{align*} and, therefore, the $$p$$-value is \begin{align*} \text{p-value}:=\mathbb{P}(T\leq t|\theta=\theta_0). \end{align*}

3. Two-sided tests; $$H_0: \theta=\theta_0$$ vs. $$H_1:\theta\neq\theta_0.$$ In this case, the critical region is of the form \begin{align*} C=\{(x_1,\ldots,x_n)'\in\mathbb{R}^n:T(x_1,\ldots,x_n)<k_1\ \text{or}\ T(x_1,\ldots,x_n)>k_2\}. \end{align*} The $$p$$-value is given by \begin{align*} \text{p-value}:=2\min\left\{\mathbb{P}(T\leq t|\theta=\theta_0),\mathbb{P}(T\geq t|\theta=\theta_0)\right\}. \end{align*} Observe that if the distribution of the test statistic $$T$$ is symmetric about $$0$$ (see footnote 79), such as the standard normal or Student’s $$t$$ distribution, then $$\mathbb{P}(T\leq t|\theta=\theta_0) = \mathbb{P}(-T \leq t | \theta=\theta_0) = \mathbb{P}(T \geq -t | \theta=\theta_0)$$ and the above minimum is not required: \begin{align*} \text{p-value}=2\mathbb{P}(T\leq -|t|\,|\,\theta=\theta_0)=2\mathbb{P}(T\geq |t|\,|\,\theta=\theta_0). \end{align*}
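The three cases above can be sketched in R as follows. The helper `p_value_from_cdf()` is hypothetical: it assumes that the cdf of $$T$$ under $$H_0$$ is available as an R function and that $$T$$ is continuous, so that $$\mathbb{P}(T\geq t)=1-\mathbb{P}(T\leq t).$$ The observed value $$t=1.8$$ is a made-up number used only for illustration.

```r
# Hypothetical helper: p-value for an observed statistic t_obs, given the
# cdf of T under H_0 (assumed continuous), for the three kinds of tests
p_value_from_cdf <- function(t_obs, cdf,
                             alternative = c("greater", "less", "two.sided")) {
  alternative <- match.arg(alternative)
  lower <- cdf(t_obs)      # P(T <= t_obs | H_0)
  upper <- 1 - cdf(t_obs)  # P(T >= t_obs | H_0), valid for continuous T
  switch(alternative,
    greater = upper,                   # one-sided right test
    less = lower,                      # one-sided left test
    two.sided = 2 * min(lower, upper)  # two-sided test
  )
}

# Illustration with a standard normal statistic and t_obs = 1.8 (made-up value)
p_value_from_cdf(1.8, pnorm, "greater")
p_value_from_cdf(1.8, pnorm, "less")
p_value_from_cdf(1.8, pnorm, "two.sided")

# For a distribution symmetric about 0, the two-sided p-value is 2 * P(T >= |t|)
all.equal(p_value_from_cdf(-1.8, pnorm, "two.sided"),
          2 * pnorm(abs(-1.8), lower.tail = FALSE))
```

For a discrete statistic, such as a binomial count, `1 - cdf(t_obs)` would give $$\mathbb{P}(T>t)$$ rather than $$\mathbb{P}(T\geq t),$$ so the upper tail would need to be computed differently.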

From the above definitions of the $$p$$-value, it is clear that the $$p$$-value is a function of the observed value of the test statistic $$T=t.$$ Therefore, since $$T$$ is random, the $$p$$-value is itself a random variable (rv). In addition, if $$T$$ has a continuous distribution, the $$p$$-value is uniformly distributed in $$[0,1]$$ under $$H_0$$ (see Exercise 6.16). It is also not difficult to see that $$p\text{-value}<\alpha$$ if and only if the observed test statistic $$T=t$$ belongs to $$C.$$
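The uniformity of the $$p$$-value under $$H_0$$ can be explored with a small simulation. The sketch below is only an illustration under assumed settings that are not from the text: samples of size $$n=20$$ from a $$\mathcal{N}(0,1),$$ a one-sided right $$Z$$-test with known variance, and $$10000$$ replicates.

```r
set.seed(1)  # arbitrary seed, for reproducibility

M <- 10000     # number of simulated samples (arbitrary choice)
n <- 20        # sample size (arbitrary choice)
mu0 <- 0       # value of the mean under H_0
alpha <- 0.05

# Simulate M samples from N(mu0, 1) and compute the one-sided right p-value
# of the Z-test (known variance 1) for H_0: mu = mu0 vs. H_1: mu > mu0
p_values <- replicate(M, {
  x <- rnorm(n, mean = mu0, sd = 1)
  z <- (mean(x) - mu0) / (1 / sqrt(n))
  pnorm(z, lower.tail = FALSE)
})

# Under H_0, the proportion of p-values below alpha should be close to alpha
mean(p_values < alpha)

# The quartiles of the p-values should be close to 0.25, 0.50, and 0.75
quantile(p_values, probs = c(0.25, 0.5, 0.75))
```

The proportion of $$p$$-values below $$\alpha$$ estimates the probability of rejecting $$H_0$$ when it is true, which for a uniform $$p$$-value equals $$\alpha.$$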

Example 6.16 Assume that in Example 6.3 it has been observed that $$Y=3$$ of the $$n=15$$ sampled voters support the candidate. Would that result indicate that the candidate is going to lose the elections (reject $$H_0:p=0.5$$) at significance level $$\alpha=0.05$$?

The hypothesis to test is

\begin{align*} H_0:p=0.5\quad \text{vs.}\quad H_1:p<0.5. \end{align*}

Since under $$H_0:p=0.5,$$ $$Y\sim \mathrm{Bin}(n,0.5),$$ then the $$p$$-value is given by

\begin{align*} \text{p-value} &=\mathbb{P}(Y\leq 3|p=0.5)=\sum_{y=0}^3 \binom{15}{y}(0.5)^{15} \\ &\approx0.018<\alpha=0.05. \end{align*}

Equivalently, it can be computed as:

```r
pbinom(3, size = 15, prob = 0.5)
## [1] 0.01757813
```

Therefore, $$H_0: p=0.5$$ is rejected in favor of $$H_1:p<0.5;$$ that is, this result indicates that the candidate will not win the elections with a significance level of $$\alpha=0.05.$$
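As a cross-check, the same $$p$$-value can be obtained by summing the binomial probabilities directly or with `binom.test()` from base R, which for `alternative = "less"` reports the lower-tail binomial probability. The variable names in this sketch are illustrative.

```r
# Data and null value from Example 6.16
y <- 3
n <- 15
p0 <- 0.5
alpha <- 0.05

# p-value as the lower-tail probability P(Y <= 3 | p = 0.5)
p_direct <- pbinom(y, size = n, prob = p0)

# Same quantity, written as the explicit sum of binomial probabilities
p_sum <- sum(dbinom(0:y, size = n, prob = p0))

# Exact binomial test for H_0: p = 0.5 vs. H_1: p < 0.5
test <- binom.test(x = y, n = n, p = p0, alternative = "less")
test$p.value

# The three computations agree, and the p-value is below alpha
all.equal(p_direct, p_sum)
all.equal(p_direct, test$p.value)
p_direct < alpha
```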

Example 6.17 It is estimated that a particular flight is profitable if the average occupation rate during a year is at least $$60\%.$$ An airline is interested in determining whether it is profitable to keep a particular flight operative. For that, they record the occupation rates of $$120$$ random flights scattered around the year, resulting in a mean occupation rate of $$58\%$$ and a quasistandard deviation of $$11\%.$$ Considering that the occupation rates (in proportion) have an approximate normal distribution, is there enough evidence to cancel the flight because it is not profitable? Employ a significance level of $$\alpha=0.10.$$

Let $$\mu$$ be the average occupation rate of the flight in one year. It is desired to test

\begin{align*} H_0:\mu=0.6\quad \text{vs.}\quad H_1:\mu<0.6. \end{align*}

The test statistic is

\begin{align*} T=\frac{\bar{X}-0.6}{S'/\sqrt{n}}=\frac{0.58-0.6}{0.11/\sqrt{120}}\approx-1.992. \end{align*}

Under $$H_0:\mu=0.6,$$ the statistic is distributed as $$t_{119},$$ so the $$p$$-value is

\begin{align*} \mathbb{P}(T\leq -1.992|\mu=0.6)\approx0.0243<\alpha=0.10. \end{align*}

The last probability can be computed as

```r
pt(-1.992, df = 119)
## [1] 0.02433035
```

Therefore, $$H_0:\mu=0.6$$ is rejected, that is, the sample indicates that the flight is not profitable.
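The whole computation of Example 6.17 can be reproduced from the summary statistics. The sketch below uses illustrative variable names and the unrounded statistic, so its $$p$$-value may differ slightly in the last decimals from the one obtained with the rounded value $$-1.992.$$

```r
# Summary statistics from Example 6.17 (rates expressed as proportions)
x_bar <- 0.58   # sample mean occupation rate
s_quasi <- 0.11 # quasistandard deviation
n <- 120        # number of sampled flights
mu0 <- 0.6      # mean under H_0
alpha <- 0.10

# Observed value of the t statistic
t_obs <- (x_bar - mu0) / (s_quasi / sqrt(n))
t_obs

# One-sided left p-value: P(T <= t_obs) with T ~ t_{n - 1} under H_0
p_value <- pt(t_obs, df = n - 1)
p_value

# Reject H_0 at level alpha if the p-value is below alpha
p_value < alpha
```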

79. When the distribution of $$T$$ is symmetric about $$0,$$ the distributions of $$T$$ and $$-T$$ are equal.