6.5 p-value of a test
Example 6.15 In Example 6.14:
Will the data support the decision of the foreman at the significance level α=0.05?
In this case, the critical value of the normal is z0.05≈1.645 and the observed value of the statistic was Z=5/3≈1.67>1.645. Therefore, at the significance level α=0.05, it is concluded that there is evidence in favor of the foreman's decision.
What would be the lowest significance level for which the data would support the foreman's decision and, therefore, H0:p=0.10 would be rejected?
The lowest significance level for which we would reject H0 is
P(Z>5/3|p=0.10)≈0.0478.
This probability is precisely the level α at which the decision of the test flips. It is the so-called p-value of the test.
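As a numerical check, the tail probability above can be evaluated with a minimal Python sketch. It assumes, as in the example, that Z is approximately standard normal under H0; the helper name `normal_sf` is illustrative and not part of the text.

```python
import math

def normal_sf(z):
    """Upper-tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Observed value of the statistic in Example 6.15.
z_obs = 5.0 / 3.0
p_value = normal_sf(z_obs)
print(round(p_value, 4))
```

Only the standard library is used here, so the sketch does not depend on any statistics package.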
The following relation is the key operative rule for using the p-value of a test to reach a decision.
Given a test for H0, the rejection decision of H0 at significance level α depends on the p-value of the test:
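\begin{align*} \begin{cases} \text{$p$-value}<\alpha \iff \text{Reject $H_0$ at level $\alpha$},\\ \text{$p$-value}\geq\alpha \iff \text{Do not reject $H_0$ at level $\alpha$}. \end{cases} \end{align*}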
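The decision rule above can be written as a one-line function. This is a minimal sketch; the function name `reject_h0` and the p-value 0.03 in the usage lines are hypothetical, chosen only for illustration.

```python
def reject_h0(p_value, alpha):
    """Return True when H0 is rejected at level alpha, i.e., when p-value < alpha."""
    return p_value < alpha

# Hypothetical p-value, for illustration only.
p = 0.03
for alpha in (0.10, 0.05, 0.01):
    print(alpha, reject_h0(p, alpha))
```

The same p-value leads to rejection at the levels 0.10 and 0.05 but not at 0.01, which is why reporting the p-value is more informative than reporting a single reject/do-not-reject decision.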
Definition 6.5 (p-value) The p-value of a hypothesis test is defined as the lowest significance level α for which the test rejects the null hypothesis H0.
Remark. The p-value can be informally regarded as a “measure of the degree of compatibility of H0 with the data”. A valid interpretation in terms of probability is a restatement of that in Definition 6.5: “the probability of obtaining a test statistic at least as unfavorable to H0 as the observed one, under H0.”
Remark. The following are erroneous interpretations of the p-value: (1) “the probability of H0 being true given the data”; (2) “the probability of H0 vs. H1”; (3) “the probability of the data given H0”.
Depending on the kind of hypothesis to test, the p-value is computed in a different way. We differentiate three cases:
One-sided right tests; H0:θ=θ0 vs. H1:θ>θ0. Assume that the test statistic is T(X1,…,Xn) and the critical region is C={(x1,…,xn)′∈Rn:T(x1,…,xn)>k}. If the observed value of the test statistic is T=t, then the p-value is p-value:=P(T≥t|θ=θ0).
One-sided left tests; H0:θ=θ0 vs. H1:θ<θ0. In this case the critical region is C={(x1,…,xn)′∈Rn:T(x1,…,xn)<k} and, therefore, the p-value is p-value:=P(T≤t|θ=θ0).
Two-sided tests; H0:θ=θ0 vs. H1:θ≠θ0. In this case, the critical region is of the form C={(x1,…,xn)′∈Rn:T(x1,…,xn)<k1 or T(x1,…,xn)>k2}. The p-value is given by
\begin{align*} \text{$p$-value}:=2\min\{\mathbb{P}(T\leq t|\theta=\theta_0),\mathbb{P}(T\geq t|\theta=\theta_0)\}. \end{align*}
Observe that if the distribution of the test statistic T is symmetric about 0 (see footnote 79), such as the normal or Student's t distribution, then \mathbb{P}(T\leq t|\theta=\theta_0) = \mathbb{P}(-T \leq t | \theta=\theta_0) = \mathbb{P}(T \geq -t | \theta=\theta_0) and the above minimum is not required:
\begin{align*} \text{$p$-value}=2\mathbb{P}(T\leq -|t||\theta=\theta_0)=2\mathbb{P}(T\geq |t||\theta=\theta_0). \end{align*}
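The three cases can be sketched in a single Python function. This is an illustration under stated assumptions, not a general implementation: the test statistic is assumed continuous (so that P(T≥t)=1−P(T≤t)), its null cdf is passed as the argument `cdf`, and the names `p_value`, `normal_cdf`, and the labels `"greater"`, `"less"`, `"two-sided"` are illustrative choices.

```python
import math

def normal_cdf(x):
    """Cdf of the standard normal, used here as an example null distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def p_value(t, cdf, alternative):
    """p-value for an observed statistic t whose cdf under H0 is `cdf`.

    Assumes a continuous test statistic, so P(T >= t) = 1 - P(T <= t).
    """
    lower = cdf(t)        # P(T <= t | theta = theta_0)
    upper = 1.0 - lower   # P(T >= t | theta = theta_0)
    if alternative == "greater":    # one-sided right test
        return upper
    if alternative == "less":       # one-sided left test
        return lower
    if alternative == "two-sided":  # two-sided test
        return 2.0 * min(lower, upper)
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Observed statistic of Example 6.15 (one-sided right test).
t_obs = 5.0 / 3.0
print(p_value(t_obs, normal_cdf, "greater"))
print(p_value(t_obs, normal_cdf, "two-sided"))
```

For a null distribution that is symmetric about 0, the two-sided value is exactly twice the smaller one-sided value, in agreement with the simplification stated above.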
From the above definitions of the p-value, it is clear that the p-value is a function of the observed value of the test statistic T=t. Therefore, before the sample is observed, the p-value is a random variable. In addition, when the test statistic has a continuous distribution, the p-value is uniformly distributed in [0,1] under H_0 (see Exercise 6.16). It is also not difficult to see that p\text{-value}<\alpha if and only if the observed test statistic T=t belongs to C.
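The uniformity of the p-value under H_0 can be illustrated with a small simulation. The sketch below assumes a one-sided right test whose statistic is exactly standard normal under H_0; the seed, the number of replicates, and the helper name `normal_sf` are arbitrary illustrative choices.

```python
import math
import random

def normal_sf(z):
    """Upper-tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

random.seed(42)   # arbitrary seed, for reproducibility only
n_sim = 100_000   # number of simulated tests under H0

# Each replicate draws the statistic under H0 and computes its p-value.
p_values = [normal_sf(random.gauss(0.0, 1.0)) for _ in range(n_sim)]

# If the p-value is U(0, 1), the proportion below alpha should be close to alpha.
for alpha in (0.01, 0.05, 0.10):
    rate = sum(p < alpha for p in p_values) / n_sim
    print(alpha, round(rate, 4))
```

The printed proportions are the empirical rejection rates at each level; under uniformity they should approach α as the number of replicates grows.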
Example 6.16 Assume that in Example 6.3 it has been observed that Y=3 of the n=15 sampled voters support the candidate. Would that result indicate that the candidate is going to lose the elections (reject H_0:p=0.5) at significance level \alpha=0.05?
The hypothesis to test is
\begin{align*} H_0:p=0.5\quad \text{vs.}\quad H_1:p<0.5. \end{align*}
Since under H_0:p=0.5, Y\sim \mathrm{Bin}(n,0.5), then the p-value is given by
\begin{align*} \text{$p$-value} &=\mathbb{P}(Y\leq 3|p=0.5)=\sum_{y=0}^3 \binom{15}{y}(0.5)^{15} \\ &\approx0.018<\alpha=0.05. \end{align*}
Equivalently, it can be computed as:
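A minimal Python sketch of this computation, using only the standard library (the function name `binom_cdf` is an illustrative choice, not taken from the text):

```python
from math import comb

def binom_cdf(k, n, p):
    """P(Y <= k) for Y ~ Bin(n, p), summing the probability mass function."""
    return sum(comb(n, y) * p**y * (1 - p)**(n - y) for y in range(k + 1))

# p-value of Example 6.16: P(Y <= 3 | p = 0.5) with n = 15.
p_value = binom_cdf(3, 15, 0.5)
print(round(p_value, 4))  # 0.0176
```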
Therefore, H_0: p=0.5 is rejected in favor of H_1:p<0.5; that is, at significance level \alpha=0.05, this result indicates that the candidate will not win the elections.
Example 6.17 It is estimated that a particular flight is profitable if the average occupation rate during a year is at least 60\%. An airline is interested in determining whether it is profitable to keep a particular flight operative. For that, they record the occupation rates of 120 random flights scattered around the year, resulting in a mean occupation rate of 58\% and a quasistandard deviation of 11\%. Considering that the occupation rates (in proportion) have an approximate normal distribution, is there enough evidence to cancel the flight because it is not profitable? Employ a significance level of \alpha=0.10.
Let \mu be the average occupation rate of the flight in one year. It is desired to test
\begin{align*} H_0:\mu=0.6\quad \text{vs.}\quad H_1:\mu<0.6. \end{align*}
The test statistic is
\begin{align*} T=\frac{\bar{X}-0.6}{S'/\sqrt{n}}=\frac{0.58-0.6}{0.11/\sqrt{120}}\approx-1.992. \end{align*}
Under H_0:\mu=0.6, the statistic is distributed as t_{119}, so the p-value is
\begin{align*} \mathbb{P}(T\leq -1.992|\mu=0.6)\approx0.0243<\alpha=0.10. \end{align*}
The last probability can be computed as
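A minimal Python sketch of this computation follows. Since the standard library has no Student's t cdf, the sketch integrates the t density numerically with the composite Simpson's rule and uses the symmetry of the distribution about 0; the function names `t_pdf` and `t_cdf` and the number of subintervals are illustrative assumptions, and a statistical library routine would normally be preferred.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, n=2000):
    """P(T <= x), from Simpson's rule on [0, |x|] (n even) plus symmetry about 0."""
    a = abs(x)
    h = a / n
    s = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = s * h / 3.0  # approximates P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

# p-value of Example 6.17: P(T <= -1.992) with T ~ t_119.
p_value = t_cdf(-1.992, 119)
print(round(p_value, 4))
```

With one degree of freedom the t distribution is the Cauchy distribution, for which P(T≤1)=0.75 exactly; evaluating `t_cdf(1.0, 1)` is a quick sanity check of the integration.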
Therefore, H_0:\mu=0.6 is rejected, that is, the sample indicates that the flight is not profitable.
79. In this case, the distributions of T and -T are equal.