28.6 About interpreting \(P\)-values

A \(P\)-value is the likelihood of observing the sample statistic (or something even more extreme) over repeated sampling, under the assumption that the null hypothesis about the population parameter is true.

\(P\)-values can be computed because the sampling distribution often has an approximate normal distribution.
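As an illustration (not from the text), here is a minimal Python sketch of this idea, assuming the sampling distribution is approximately standard normal and using a hypothetical test statistic of \(z = 2.10\):

```python
import math

# Hypothetical test statistic on the standard normal sampling distribution
z = 2.10

# Two-tailed P-value: the probability of observing a statistic at least this
# far from zero (in either direction) if the null hypothesis is true.
# For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
p = math.erfc(abs(z) / math.sqrt(2))
print(round(p, 4))  # about 0.036
```

The same calculation is what software reports when it quotes a two-tailed \(P\)-value from a \(z\)-test.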

TABLE 28.1: A guideline for interpreting \(P\)-values. \(P\)-values should be interpreted in context.

  If the \(P\)-value is…     Write the conclusion as…
  Larger than 0.10           Insufficient evidence to support \(H_1\)
  Between 0.05 and 0.10      Slight evidence to support \(H_1\)
  Between 0.01 and 0.05      Moderate evidence to support \(H_1\)
  Between 0.001 and 0.01     Strong evidence to support \(H_1\)
  Smaller than 0.001         Very strong evidence to support \(H_1\)
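The guideline in Table 28.1 can be expressed as a small Python function (an illustrative sketch, not part of the text; the wording follows the table, and these remain guidelines only):

```python
def evidence(p):
    """Map a P-value to the guideline wording of Table 28.1."""
    if p > 0.10:
        return "Insufficient evidence to support H1"
    if p > 0.05:
        return "Slight evidence to support H1"
    if p > 0.01:
        return "Moderate evidence to support H1"
    if p > 0.001:
        return "Strong evidence to support H1"
    return "Very strong evidence to support H1"

print(evidence(0.036))  # Moderate evidence to support H1
```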

Conclusions are always about the population values.

No-one needs a \(P\)-value to see whether the sample values are the same: we can just look at them and see.

\(P\)-values are needed to determine what we learn about the unknown population values, based on what we see in the sample values.

Commonly, a \(P\)-value smaller than 5% is considered ‘small,’ but this is arbitrary. More reasonably, \(P\)-values should be interpreted as giving varying degrees of evidence in support of the alternative hypothesis (Table 28.1), but these are only guidelines. Conclusions should be written in the context of the problem. Sometimes, authors will write that the results are ‘statistically significant’ when \(P<0.05\).

Definition 28.3 (\(P\)-value) A \(P\)-value is the likelihood of observing the sample statistic (or something more extreme) over repeated sampling, under the assumption that the null hypothesis about the population parameter is true.

\(P\)-values are never exactly zero. When SPSS reports that ‘\(P=0.000\),’ it means that the \(P\)-value is less than 0.001, which we write as ‘\(P<0.001\).’

jamovi usually reports very small \(P\)-values as ‘\(P<0.001\).’
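The reporting convention above can be mimicked with a short Python helper (an illustrative sketch, not part of the text or of either software package):

```python
def format_p(p):
    """Report a P-value; very small values are written as 'P < 0.001'."""
    if p < 0.001:
        return "P < 0.001"
    return f"P = {p:.3f}"

print(format_p(0.0004))  # P < 0.001
print(format_p(0.036))   # P = 0.036
```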

\(P\)-values are commonly used in research, but they need to be used and interpreted correctly (Greenland et al. 2016). Specifically:

  • A \(P\)-value is not the probability that the null hypothesis is true.
  • A \(P\)-value does not prove anything.
  • A big \(P\)-value does not mean that the null hypothesis \(H_0\) is true, or that \(H_1\) is false.
  • A small \(P\)-value does not mean that the null hypothesis \(H_0\) is false, or that \(H_1\) is true.
  • A small \(P\)-value does not indicate that the results are practically important (Sect. 28.8).
  • A small \(P\)-value does not mean a large difference between the statistic and parameter; it means that the difference could not reasonably be attributed to sampling variation (chance).

Sometimes, the results from hypothesis tests are called “significant” or “statistically significant.”

This means that the \(P\)-value is small (traditionally, but arbitrarily, \(P < 0.05\)), and hence the evidence supports the alternative hypothesis.

To avoid confusion, the word “significant” should be avoided in writing about research unless “statistical significance” is actually what is meant. In other situations, consider using words like “substantial.”

References

Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, et al. Statistical tests, \(P\) values, confidence intervals, and power: A guide to misinterpretations. European Journal of Epidemiology. Springer; 2016;31(4):337–50.