2.4 The likelihood principle

In the Bayesian paradigm, all the relevant experimental information for making inferences or decisions about the state of nature is contained in the observed data, and it enters the analysis through the likelihood function. Hence the relevance of the likelihood principle. The Bayesian framework is conditional on the observed data, whereas the Frequentist approach is not (J. Berger 1993).

We follow (J. Berger 1993), who in turn followed (Lindley and Phillips 1976), to illustrate the likelihood principle.

We are given a coin and are interested in the probability, \(\theta\), that it comes up heads when flipped. We want to test \(H_0: \ \theta=1/2\) versus \(H_1: \ \theta>1/2\). An experiment is conducted by flipping the coin (independently) in a series of trials, the result of which is the observation of 9 heads and 3 tails.

This is not yet enough information to specify \(f(y|\theta)\), since the design of the series of trials has not been described. There are two possibilities:

  • The experiment consisted of a predetermined 12 flips (\(n=12\)), so that \(Y\), the number of heads, is \(\mathcal{B}(12,\theta)\). Then, \(l_1(\theta)=f_1(y|\theta)=\binom{n}{y}\theta^y(1-\theta)^{n-y}=220\times\theta^9(1-\theta)^{3}.\)

  • The experiment consisted of flipping the coin until 3 tails were observed (\(r=3\)). Then, \(Y\), the number of failures (heads) before getting 3 tails, is \(\mathcal{N}\mathcal{B}(3,1-\theta)\). Then, \(l_2(\theta)=f_2(y|\theta)=\binom{y+r-1}{r-1}(1-(1-\theta))^y(1-\theta)^{r}=55\times\theta^9(1-\theta)^{3}.\)
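The two likelihoods differ only by a constant that does not involve \(\theta\). The following is a minimal sketch of a numerical check, assuming an arbitrary grid of \(\theta\) values; the names `theta`, `lik_binom`, `lik_nbinom` and `lik_ratio` are illustrative and not part of the original example.

```r
# Grid of theta values (illustrative choice; any values in (0, 1) would do)
theta <- seq(0.05, 0.95, by = 0.05)
y <- 9   # Observed number of heads
n <- 12  # Total number of flips (binomial design)
r <- 3   # Target number of tails (negative binomial design)

# Binomial likelihood: 220 * theta^9 * (1 - theta)^3
lik_binom <- dbinom(y, size = n, prob = theta)
# dnbinom counts failures before the r-th success; here a "success" is a
# tail, which has probability 1 - theta: 55 * theta^9 * (1 - theta)^3
lik_nbinom <- dnbinom(y, size = r, prob = 1 - theta)

# The ratio is free of theta: choose(12, 9) / choose(11, 2) = 220 / 55 = 4
lik_ratio <- lik_binom / lik_nbinom
range(lik_ratio)
```

Both ends of the range equal 4 up to floating-point error, so the two likelihoods are proportional as functions of \(\theta\).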

Using a Frequentist approach, the significance level of \(y=9\) using the Binomial model against \(\theta=1/2\) would be:

\[\begin{equation*} \alpha_1=P_{1/2}(Y\geq 9)=f_1(9|1/2)+f_1(10|1/2)+f_1(11|1/2)+f_1(12|1/2)=0.073. \end{equation*}\]

success <- 9 # Number of observed successes (heads) in n trials
n <- 12 # Number of trials
# P(Y >= 9) under theta = 0.5: sum the binomial pmf over 9, ..., 12
siglevel <- sum(dbinom(success:n, n, 0.5))
paste("Significance level from binomial model is", round(siglevel, 4))
## [1] "Significance level from binomial model is 0.073"

For the Negative Binomial model, the significance level would be:

\[\begin{equation*} \alpha_2=P_{1/2}(Y\geq 9)=f_2(9|1/2)+f_2(10|1/2)+\ldots=0.0327. \end{equation*}\]

success <- 3 # Target number of successes (tails)
failures <- 9 # Number of failures (heads) observed before the third tail
# P(Y >= 9) = 1 - P(Y <= 8) under theta = 0.5
siglevel <- 1 - pnbinom((failures - 1), success, 0.5)
paste("Significance level from negative binomial model is", round(siglevel, 4))
## [1] "Significance level from negative binomial model is 0.0327"
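The negative binomial tail probability can be cross-checked with a binomial calculation: observing at least 9 heads before the third tail is the same event as observing at most 2 tails in the first 11 flips. The sketch below is only a sanity check under that equivalence; the variable names `alpha_nb` and `alpha_check` are illustrative.

```r
r <- 3      # Target number of tails
y_obs <- 9  # Observed number of heads before the third tail

# P(Y >= 9) under the negative binomial model with theta = 0.5
alpha_nb <- pnbinom(y_obs - 1, size = r, prob = 0.5, lower.tail = FALSE)

# Equivalent event: at most r - 1 = 2 tails in the first y_obs + r - 1 = 11 flips
alpha_check <- pbinom(r - 1, size = y_obs + r - 1, prob = 0.5)

round(c(alpha_nb, alpha_check), 4)
```

Both entries equal \(67/2048\approx 0.0327\), which agrees with \(\alpha_2\).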

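Using a significance level equal to 5%, we arrive at different conclusions: we do not reject \(H_0\) under the Binomial model (\(\alpha_1=0.073\)), but we reject it under the Negative Binomial model (\(\alpha_2=0.0327\)), even though the observed data are the same. In contrast, we obtain the same outcomes using a Bayesian approach because the kernels of both distributions are the same, \(\theta^9(1-\theta)^{3}\), so both models lead to the same posterior distribution for a given prior.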
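As a rough illustration of why the Bayesian answer does not depend on the stopping rule, the sketch below computes the posterior on a grid under both sampling models. The uniform prior on \(\theta\) is an assumption made only for this illustration (the text does not specify a prior), and the grid and variable names are likewise illustrative.

```r
# Midpoint grid on (0, 1); the step size is an arbitrary choice
theta <- seq(0.0005, 0.9995, by = 0.001)
# ASSUMPTION: uniform prior on theta, chosen only for illustration
prior <- rep(1, length(theta))

lik_binom <- dbinom(9, size = 12, prob = theta)       # 12 flips fixed in advance
lik_nbinom <- dnbinom(9, size = 3, prob = 1 - theta)  # Flip until 3 tails

# Normalizing likelihood x prior cancels the constants 220 and 55
post_binom <- lik_binom * prior / sum(lik_binom * prior)
post_nbinom <- lik_nbinom * prior / sum(lik_nbinom * prior)

# The largest difference between the two posteriors is numerically zero
max(abs(post_binom - post_nbinom))

# Under the uniform prior the posterior is Beta(10, 4), so a posterior
# summary such as P(theta > 1/2 | y) is the same for both designs
post_prob_H1 <- pbeta(0.5, 10, 4, lower.tail = FALSE)
round(post_prob_H1, 4)
```

Any other prior would give the same agreement between the two designs, since the prior multiplies the same kernel \(\theta^9(1-\theta)^{3}\) in both cases.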


Berger, J. 1993. Statistical Decision Theory and Bayesian Analysis. Third Edition. Springer.
Lindley, D. V., and L. D. Phillips. 1976. “Inference for a Bernoulli Process (a Bayesian View).” American Statistician 30: 112–19.