2.7 A simple working example
We will illustrate some conceptual differences between the Bayesian and Frequentist statistical approaches when performing inference given a random sample $y=[y_1,y_2,\dots,y_N]$, where $y_i\stackrel{iid}{\sim}N(\mu,\sigma^2)$, $i=1,2,\dots,N$.
In particular, we set $\pi(\mu,\sigma)=\pi(\mu)\pi(\sigma)\propto\frac{1}{\sigma}$. This is a standard non-informative improper prior (Jeffreys prior, see Chapter 3), that is, this prior is perfectly compatible with sample information. In addition, we are assuming independent priors for $\mu$ and $\sigma$. Then,
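\begin{align*}
\pi(\mu,\sigma|y)&\propto\frac{1}{\sigma}\times(\sigma^2)^{-N/2}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^N(y_i-\mu)^2\right\}\\
&=\frac{1}{\sigma}\times(\sigma^2)^{-N/2}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^N\left((y_i-\bar{y})-(\mu-\bar{y})\right)^2\right\}\\
&=\frac{1}{\sigma}\exp\left\{-\frac{N}{2\sigma^2}(\mu-\bar{y})^2\right\}\times(\sigma)^{-N}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^N(y_i-\bar{y})^2\right\}\\
&=\frac{1}{\sigma}\exp\left\{-\frac{N}{2\sigma^2}(\mu-\bar{y})^2\right\}\times(\sigma)^{-(\alpha_n+1)}\exp\left\{-\frac{\alpha_n\hat{\sigma}^2}{2\sigma^2}\right\},
\end{align*}
where $\bar{y}=\frac{\sum_{i=1}^N y_i}{N}$, $\alpha_n=N-1$ and $\hat{\sigma}^2=\frac{\sum_{i=1}^N(y_i-\bar{y})^2}{N-1}$. The third line uses $\sum_{i=1}^N(y_i-\bar{y})=0$, so the cross term in the expanded square vanishes.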
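The first term in the last expression is the kernel of a normal density, $\mu|\sigma,y\sim N(\bar{y},\sigma^2/N)$. The second term is the kernel of an inverted gamma density (Zellner 1996, p. 371), $\sigma|y\sim IG(\alpha_n,\hat{\sigma}^2)$. Therefore, $\pi(\mu|\sigma,y)=(2\pi\sigma^2/N)^{-1/2}\exp\left\{-\frac{N}{2\sigma^2}(\mu-\bar{y})^2\right\}$ and $\pi(\sigma|y)=\frac{2}{\Gamma(\alpha_n/2)}\left(\frac{\alpha_n\hat{\sigma}^2}{2}\right)^{\alpha_n/2}\frac{1}{\sigma^{\alpha_n+1}}\exp\left\{-\frac{\alpha_n\hat{\sigma}^2}{2\sigma^2}\right\}$.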
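The factorization above suggests a simple way to simulate from the joint posterior: draw $\sigma$ from $\pi(\sigma|y)$ and then $\mu$ from $\pi(\mu|\sigma,y)$. The following is a minimal sketch using only the Python standard library. The data, the seed, and the helper name `draw_posterior` are hypothetical choices made for illustration. It relies on the fact that, under the inverted gamma density written above, $\alpha_n\hat{\sigma}^2/\sigma^2$ follows a chi-square distribution with $\alpha_n$ degrees of freedom.

```python
import math
import random
import statistics

random.seed(0)

# Hypothetical data: N draws from a normal distribution (illustrative values only).
N = 20
y = [random.gauss(2.0, 1.5) for _ in range(N)]

ybar = statistics.fmean(y)
alpha_n = N - 1
sigma2_hat = statistics.variance(y)  # divides by N - 1, matching sigma-hat squared


def draw_posterior(n_draws):
    """Composition sampling: sigma | y first, then mu | sigma, y."""
    draws = []
    for _ in range(n_draws):
        # alpha_n * sigma2_hat / sigma^2 ~ chi-square(alpha_n) = Gamma(alpha_n / 2, scale 2)
        chi2 = random.gammavariate(alpha_n / 2, 2.0)
        sigma = math.sqrt(alpha_n * sigma2_hat / chi2)
        # mu | sigma, y ~ N(ybar, sigma^2 / N)
        mu = random.gauss(ybar, sigma / math.sqrt(N))
        draws.append((mu, sigma))
    return draws


draws = draw_posterior(20000)
mu_draws = [m for m, _ in draws]
sigma_draws = [s for _, s in draws]

print("sample mean ybar:            ", ybar)
print("Monte Carlo posterior mean mu:", statistics.fmean(mu_draws))
```

With enough draws, the Monte Carlo average of the $\mu$ draws should be close to $\bar{y}$, in line with $E[\mu|\sigma,y]=\bar{y}$ for every $\sigma$.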
Observe that $E[\mu|\sigma,y]=\bar{y}$, which is also the maximum likelihood (Frequentist) point estimate of $\mu$ in this setting. In addition, for a given $\sigma$, the Frequentist $100(1-\alpha)\%$ confidence interval and the Bayesian $100(1-\alpha)\%$ credible interval have exactly the same form, $\bar{y}\pm|z_{\alpha/2}|\frac{\sigma}{\sqrt{N}}$, where $z_{\alpha/2}$ is the $\alpha/2$ quantile of a standard normal distribution. However, the interpretations are totally different. The confidence interval has a probabilistic interpretation under sampling variability of $\bar{Y}$: in repeated sampling, $100(1-\alpha)\%$ of the intervals $\bar{Y}\pm|z_{\alpha/2}|\frac{\sigma}{\sqrt{N}}$ would include $\mu$. Given an observed realization of $\bar{Y}$, say $\bar{y}$, the probability that $\bar{y}\pm|z_{\alpha/2}|\frac{\sigma}{\sqrt{N}}$ includes $\mu$ is either 1 or 0, which is why we speak of a $100(1-\alpha)\%$ confidence interval rather than a probability. On the other hand, $\bar{y}\pm|z_{\alpha/2}|\frac{\sigma}{\sqrt{N}}$ has a simple probabilistic interpretation in the Bayesian framework: there is a $100(1-\alpha)\%$ probability that $\mu$ lies in this interval.
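The repeated-sampling interpretation of the confidence interval can be illustrated by simulation. The sketch below is only an illustration under assumed values: `mu_true`, `sigma`, `N`, the seed, and the number of replications are all hypothetical, and $\sigma$ is treated as known. It counts how often the interval $\bar{Y}\pm|z_{\alpha/2}|\sigma/\sqrt{N}$ contains the fixed value of $\mu$.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(1)

# Hypothetical "true" values; sigma is treated as known here.
mu_true, sigma, N = 5.0, 2.0, 25
alpha = 0.05

z = abs(NormalDist().inv_cdf(alpha / 2))  # |z_{alpha/2}|
half_width = z * sigma / math.sqrt(N)


def interval(sample):
    """Interval ybar +/- |z_{alpha/2}| * sigma / sqrt(N) for one sample."""
    ybar = fmean(sample)
    return ybar - half_width, ybar + half_width


n_rep = 10000
hits = 0
for _ in range(n_rep):
    sample = [random.gauss(mu_true, sigma) for _ in range(N)]
    lo, hi = interval(sample)
    hits += lo <= mu_true <= hi  # each realized interval either covers mu or it does not

coverage = hits / n_rep
print("share of intervals covering mu:", coverage)
```

Each realized interval either contains `mu_true` or does not. The coverage statement refers to the long-run share across replications, which should be close to $1-\alpha$.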
If we want to get the marginal posterior density of μ,
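\begin{align*}
\pi(\mu|y)&=\int_0^{\infty}\pi(\mu,\sigma|y)\,d\sigma\\
&\propto\int_0^{\infty}\frac{1}{\sigma}\times(\sigma^2)^{-N/2}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^N(y_i-\mu)^2\right\}d\sigma\\
&=\int_0^{\infty}\left(\frac{1}{\sigma}\right)^{N+1}\exp\left\{-\frac{N}{2\sigma^2}\frac{\sum_{i=1}^N(y_i-\mu)^2}{N}\right\}d\sigma\\
&=\left[\frac{2}{\Gamma(N/2)}\left(\frac{N\sum_{i=1}^N(y_i-\mu)^2}{2N}\right)^{N/2}\right]^{-1}\\
&\propto\left[\sum_{i=1}^N(y_i-\mu)^2\right]^{-N/2}\\
&=\left[\sum_{i=1}^N\left((y_i-\bar{y})-(\mu-\bar{y})\right)^2\right]^{-N/2}\\
&=\left[\alpha_n\hat{\sigma}^2+N(\mu-\bar{y})^2\right]^{-N/2}\\
&\propto\left[1+\frac{1}{\alpha_n}\left(\frac{\mu-\bar{y}}{\hat{\sigma}/\sqrt{N}}\right)^2\right]^{-(\alpha_n+1)/2}.
\end{align*}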
The fourth line follows because the integrand is the kernel of an inverted gamma density with $N$ degrees of freedom, so the integral equals the inverse of its normalizing constant.
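The last expression is the kernel of a Student's t density function with $\alpha_n=N-1$ degrees of freedom, expected value equal to $\bar{y}$, and variance $\frac{\hat{\sigma}^2}{N}\left(\frac{\alpha_n}{\alpha_n-2}\right)$, which requires $\alpha_n>2$. Then, $\mu|y\sim t\left(\bar{y},\frac{\hat{\sigma}^2}{N}\left(\frac{\alpha_n}{\alpha_n-2}\right),\alpha_n\right)$.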
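The marginalization over $\sigma$ can be checked numerically. The sketch below uses a small hypothetical data set and a plain trapezoid rule, where the step size and the truncation point of the integral are assumptions made for illustration. It compares the numerical integral with the closed form from the inverted gamma normalizing constant and with the Student's t kernel. If the derivation holds, the ratio to the t kernel should be the same constant at every value of $\mu$.

```python
import math

# Hypothetical data, chosen only for illustration.
y = [1.2, 0.4, 2.1, 1.7, 0.9, 1.5, 2.4, 0.8]
N = len(y)
ybar = sum(y) / N
alpha_n = N - 1
sigma2_hat = sum((yi - ybar) ** 2 for yi in y) / alpha_n


def marginal_numeric(mu, h=0.002, n=25000):
    """Trapezoid rule for the integral over sigma in (0, n*h] of
    sigma^-(N+1) * exp(-S / (2 sigma^2)), with S = sum (y_i - mu)^2."""
    S = sum((yi - mu) ** 2 for yi in y)

    def f(sigma):
        return sigma ** (-(N + 1)) * math.exp(-S / (2 * sigma ** 2))

    # The integrand tends to 0 as sigma -> 0, so the left endpoint adds nothing.
    total = 0.5 * f(n * h)
    for k in range(1, n):
        total += f(k * h)
    return total * h


def marginal_closed_form(mu):
    """Inverse of the inverted gamma normalizing constant with N degrees of freedom."""
    S = sum((yi - mu) ** 2 for yi in y)
    return 0.5 * math.gamma(N / 2) * (S / 2) ** (-N / 2)


def t_kernel(mu):
    """Student's t kernel with alpha_n degrees of freedom, location ybar."""
    z = (mu - ybar) / math.sqrt(sigma2_hat / N)
    return (1 + z ** 2 / alpha_n) ** (-(alpha_n + 1) / 2)


grid = [0.5, 1.0, ybar, 2.0, 2.5]
rel_err = [abs(marginal_numeric(m) / marginal_closed_form(m) - 1) for m in grid]
ratios = [marginal_numeric(m) / t_kernel(m) for m in grid]

print("max relative error vs closed form:", max(rel_err))
print("ratio to t kernel at each mu:     ", ratios)
```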
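Observe that a $100(1-\alpha)\%$ confidence interval and a $100(1-\alpha)\%$ credible interval have exactly the same expression, $\bar{y}\pm|t_{\alpha/2}^{\alpha_n}|\frac{\hat{\sigma}}{\sqrt{N}}$, where $t_{\alpha/2}^{\alpha_n}$ is the $\alpha/2$ quantile of a Student's t distribution with $\alpha_n$ degrees of freedom. But again, the interpretations are totally different.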
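As a worked illustration, the interval $\bar{y}\pm|t_{\alpha/2}^{\alpha_n}|\hat{\sigma}/\sqrt{N}$ can be computed for a small hypothetical sample. To keep the sketch self-contained, the Student's t quantile is obtained here by integrating the density with Simpson's rule and inverting the distribution function by bisection. The helper names `t_pdf`, `t_cdf` and `t_quantile` are hypothetical, and in practice a statistical library's quantile function would normally be used instead.

```python
import math


def t_pdf(x, nu):
    """Standard Student's t density with nu degrees of freedom."""
    log_c = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2) - 0.5 * math.log(nu * math.pi)
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))


def t_cdf(x, nu, n=2000):
    """CDF via symmetry: 0.5 plus the signed Simpson integral from 0 to |x| (n even)."""
    if x == 0:
        return 0.5
    a = abs(x)
    h = a / n
    s = t_pdf(0.0, nu) + t_pdf(a, nu)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * t_pdf(k * h, nu)
    return 0.5 + math.copysign(s * h / 3, x)


def t_quantile(p, nu, lo=-50.0, hi=50.0):
    """Invert the CDF by bisection on [lo, hi]."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical data, chosen only for illustration.
y = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.5, 4.9]
N = len(y)
ybar = sum(y) / N
alpha_n = N - 1
sigma_hat = math.sqrt(sum((yi - ybar) ** 2 for yi in y) / alpha_n)

alpha = 0.05
t_q = t_quantile(alpha / 2, alpha_n)  # alpha/2 quantile, a negative number
half_width = abs(t_q) * sigma_hat / math.sqrt(N)
lower, upper = ybar - half_width, ybar + half_width

print("t quantile:", t_q)
print("interval:  ", (lower, upper))
```

The same pair of numbers is read as a confidence interval in the Frequentist approach and as a credible interval for $\mu|y$ in the Bayesian approach under the prior used in this section.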
The mathematical similarity between the Frequentist and Bayesian expressions in this example is due to using a non-informative improper prior.