C Informal review on hypothesis testing

The process of hypothesis testing has an interesting analogy with a trial. The analogy helps to understand the elements present in a formal hypothesis test in an intuitive way.[1]

| Hypothesis test | Trial |
|---|---|
| Null hypothesis \(H_0\) | The defendant: an individual accused of committing a crime. He[2] is backed up by the presumption of innocence, which means that he is not guilty until there is enough evidence to support his guilt. |
| Sample \(X_1,\ldots,X_n\) | Collection of evidence supporting the innocence and the guilt of the defendant. This evidence contains a certain degree of uncontrollable randomness due to how it is collected and the context of the case.[3] |
| Test statistic[4] \(T_n\) | Summary of the evidence presented by the prosecutor and the defense lawyer. |
| Distribution of \(T_n\) under \(H_0\) | The judge conducting the trial. He evaluates and measures the evidence presented by both sides and presents a verdict for the defendant. |
| Significance level \(\alpha\) | \(1-\alpha\) is the strength of the evidence required by the judge for condemning the defendant. The judge allows evidence that, on average, condemns \(100\alpha\%\) of the innocents, due to the randomness inherent to the evidence collection process. \(\alpha=0.05\) is considered to be a reasonable level.[5] |
| \(p\)-value | Decision of the judge that measures the degree of compatibility, on a scale \(0\)–\(1\), of the presumption of innocence with the summary of the evidence presented. If \(p\text{-value}<\alpha\), the defendant is declared guilty. Otherwise, he is declared not guilty. |
| \(H_0\) is rejected | The defendant is declared guilty: there is strong evidence supporting his guilt. |
| \(H_0\) is not rejected | The defendant is declared not guilty: either he is innocent or there is not enough evidence supporting his guilt. |

More formally, the \(p\)-value of a hypothesis test about \(H_0\) is defined as:

The \(p\)-value is the probability of obtaining a test statistic at least as unfavorable to \(H_0\) as the one observed, assuming that \(H_0\) is true.

Therefore, if the \(p\)-value is small (smaller than the chosen level \(\alpha\)), it is unlikely that the evidence against \(H_0\) is due to randomness. As a consequence, \(H_0\) is rejected. If the \(p\)-value is large (larger than \(\alpha\)), then it is plausible that the evidence against \(H_0\) is merely due to the randomness of the data. In this case, we do not reject \(H_0\).
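The decision rule above can be illustrated with a minimal sketch. It assumes a setting that is not taken from the text: a two-sided test of \(H_0:\mu=\mu_0\) for normal data with known standard deviation, where \(T_n\) is the standardized sample mean and follows a \(\mathcal{N}(0,1)\) under \(H_0\). The function name `z_test_p_value` and the sample values are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean


def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: mu = mu0 (assumes normal data, known sigma)."""
    n = len(sample)
    # Test statistic: standardized distance between the sample mean and mu0.
    t_n = (fmean(sample) - mu0) / (sigma / sqrt(n))
    # Under H0, T_n ~ N(0, 1). Values at least as unfavorable to H0 as the
    # observed one are those with |T_n| >= |t_n|.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_n)))
    return t_n, p_value


# Hypothetical sample, used only to illustrate the decision rule.
sample = [0.3, 1.2, -0.4, 0.9, 1.5, 0.7, 0.1, 1.1]
alpha = 0.05

t_n, p_value = z_test_p_value(sample, mu0=0.0, sigma=1.0)
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"t_n = {t_n:.3f}, p-value = {p_value:.4f}: {decision}")
```

In terms of the analogy, `t_n` is the summary of the evidence, the \(\mathcal{N}(0,1)\) distribution plays the role of the judge, and the comparison with `alpha` is the verdict.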

If \(H_0\) holds, then the \(p\)-value (which is a random variable) is uniformly distributed in \((0,1)\), provided that the test statistic is continuous. If \(H_0\) does not hold, then the distribution of the \(p\)-value is not uniform but concentrated towards \(0\) (where the rejections of \(H_0\) take place).
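This behavior of the \(p\)-value can be checked with a small simulation. The sketch below assumes, purely for illustration, the same kind of two-sided test on the mean of normal data with known unit variance. The sample size, the number of replicates, the alternative \(\mu=0.5\), and the function names are all hypothetical choices. Under \(H_0\), about \(100\alpha\%\) of the simulated \(p\)-values should fall below \(\alpha\), while under the alternative the proportion should be much larger.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value for H0: mu = mu0 (assumes normal data, known sigma)."""
    t_n = (fmean(sample) - mu0) / (sigma / sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(t_n)))


def simulate_p_values(mu, n=25, reps=2000, seed=42):
    """p-values of `reps` samples of size n drawn from N(mu, 1), testing mu0 = 0."""
    rng = random.Random(seed)
    return [p_value([rng.gauss(mu, 1.0) for _ in range(n)]) for _ in range(reps)]


alpha = 0.05
p_h0 = simulate_p_values(mu=0.0)  # H0 holds: p-values approximately uniform
p_h1 = simulate_p_values(mu=0.5)  # H0 fails: p-values pile up towards 0

# Proportion of rejections at level alpha in each scenario.
rate_h0 = sum(p < alpha for p in p_h0) / len(p_h0)
rate_h1 = sum(p < alpha for p in p_h1) / len(p_h1)
print(f"Rejection rate under H0: {rate_h0:.3f}")
print(f"Rejection rate under H1: {rate_h1:.3f}")
```

The rejection rate under \(H_0\) estimates the proportion of innocents condemned by the judge, and the rate under the alternative estimates the judge's power to condemn a guilty defendant.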


Molina-Peralta, I., and E. García-Portugués. 2021. A First Course on Statistical Inference. Lecture notes. https://bookdown.org/egarpor/inference/.

  1. This informal review is not intended to replace a formal introduction to hypothesis tests. The interested reader can find one, e.g., in Chapter 6 of Molina-Peralta and García-Portugués (2021).

  2. The masculine pronoun in no case indicates gender ascription. It is used as a neutral form and could be replaced by any personal pronoun.

  3. Think about phenomena that may randomly support the defendant’s innocence or guilt, irrespective of his true condition. For example: spurious coincidences (“happening to be in the wrong place at the wrong time”), loss of evidence during the case, past statements of the defendant, dubious identification by a witness, imprecise witness testimonies, unverifiable alibis, etc.

  4. Usually simply referred to as statistic.

  5. A compromise is needed because the judge must retain the power to condemn a guilty defendant: setting \(\alpha=0\) (no innocent is declared guilty) would result in a judge who systematically declares everybody not guilty.