2.6 A simple working example

We illustrate some conceptual differences between the Bayesian and Frequentist statistical approaches by performing inference on a random sample $Y=[Y_1,Y_2,\dots,Y_N]$, where $Y_i\stackrel{iid}{\sim}N(\mu,\sigma^2)$ for $i=1,2,\dots,N$.

For the prior, we set $\pi(\mu,\sigma)=\pi(\mu)\pi(\sigma)\propto\frac{1}{\sigma}$, which assumes independent priors for $\mu$ and $\sigma$. This is a standard non-informative improper prior (Jeffreys prior, see Chapter 3). That is, this prior is perfectly compatible with the sample information.

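Combining this prior with the normal likelihood, the joint posterior density is

$$
\begin{aligned}
\pi(\mu,\sigma \mid y) &\propto \frac{1}{\sigma}\times(\sigma^2)^{-N/2}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^N (y_i-\mu)^2\right\}\\
&= \frac{1}{\sigma}\times(\sigma^2)^{-N/2}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^N \left((y_i-\bar{y})-(\mu-\bar{y})\right)^2\right\}\\
&= \frac{1}{\sigma}\exp\left\{-\frac{N}{2\sigma^2}(\mu-\bar{y})^2\right\}\times(\sigma)^{-N}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^N (y_i-\bar{y})^2\right\}\\
&= \frac{1}{\sigma}\exp\left\{-\frac{N}{2\sigma^2}(\mu-\bar{y})^2\right\}\times(\sigma)^{-(\alpha_n+1)}\exp\left\{-\frac{\alpha_n\hat{\sigma}^2}{2\sigma^2}\right\},
\end{aligned}
$$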

where $\bar{y}=\frac{\sum_{i=1}^N y_i}{N}$, $\alpha_n=N-1$, and $\hat{\sigma}^2=\frac{\sum_{i=1}^N (y_i-\bar{y})^2}{N-1}$.
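The step that separates the exponent into a term in $\mu$ and a term in the data alone relies on a standard sum-of-squares decomposition. As a sketch in the notation just defined, the cross term vanishes because $\sum_{i=1}^N (y_i-\bar{y})=0$:

```latex
\begin{aligned}
\sum_{i=1}^N \left((y_i-\bar{y})-(\mu-\bar{y})\right)^2
&= \sum_{i=1}^N (y_i-\bar{y})^2 - 2(\mu-\bar{y})\sum_{i=1}^N (y_i-\bar{y}) + N(\mu-\bar{y})^2\\
&= \alpha_n\hat{\sigma}^2 + N(\mu-\bar{y})^2 .
\end{aligned}
```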

The first term in the last expression is the kernel of a normal density, $\mu \mid \sigma,y\sim N(\bar{y},\sigma^2/N)$. The second term is the kernel of an inverted gamma density (Zellner 1996), $\sigma \mid y\sim IG(\alpha_n,\hat{\sigma}^2)$. Therefore,

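$$
\pi(\mu \mid \sigma,y)=\left(2\pi\sigma^2/N\right)^{-1/2}\exp\left\{-\frac{N}{2\sigma^2}(\mu-\bar{y})^2\right\},
$$

and

$$
\pi(\sigma \mid y)=\frac{2}{\Gamma(\alpha_n/2)}\left(\frac{\alpha_n\hat{\sigma}^2}{2}\right)^{\alpha_n/2}\frac{1}{\sigma^{\alpha_n+1}}\exp\left\{-\frac{\alpha_n\hat{\sigma}^2}{2\sigma^2}\right\}.
$$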
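The factorization into $\pi(\sigma \mid y)$ and $\pi(\mu \mid \sigma,y)$ suggests a simple way to draw from the joint posterior: first draw $\sigma$, then draw $\mu$ given that $\sigma$. The following is a minimal sketch, not part of the original example. The data are simulated with hypothetical values, and the draw of $\sigma$ uses the fact that, under this inverted gamma density, $\alpha_n\hat{\sigma}^2/\sigma^2$ follows a chi-squared distribution with $\alpha_n$ degrees of freedom.

```r
# Minimal sketch: composition sampling from pi(mu, sigma | y).
# The data are simulated purely for illustration (hypothetical values).
set.seed(10)
N <- 50
y <- rnorm(N, mean = 100, sd = 3)

y_bar <- mean(y)        # Sample mean
alpha_n <- N - 1        # alpha_n = N - 1
s2_hat <- var(y)        # Sample variance (divides by N - 1)

S <- 10000              # Number of posterior draws
# Step 1: sigma | y. Here alpha_n * s2_hat / sigma^2 is chi-squared
# with alpha_n degrees of freedom, so we invert that relationship.
sigma_draws <- sqrt(alpha_n * s2_hat / rchisq(S, df = alpha_n))
# Step 2: mu | sigma, y ~ N(y_bar, sigma^2 / N)
mu_draws <- rnorm(S, mean = y_bar, sd = sigma_draws / sqrt(N))

# The average of the mu draws should be close to y_bar
c(mean(mu_draws), y_bar)
```

Each pair `(mu_draws[s], sigma_draws[s])` is one draw from the joint posterior, so `mu_draws` on its own is a sample from the marginal posterior of $\mu$.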

Observe that $E[\mu \mid \sigma,y]=\bar{y}$; this is also the maximum likelihood (Frequentist) point estimate of $\mu$ in this setting. In addition, conditional on $\sigma$, the Frequentist $100(1-\alpha)\%$ confidence interval and the Bayesian $100(1-\alpha)\%$ credible interval have exactly the same form, $\bar{y}\pm|z_{\alpha/2}|\frac{\sigma}{\sqrt{N}}$, where $z_{\alpha/2}$ is the $\alpha/2$ quantile of a standard normal distribution. However, the interpretations are entirely different. The confidence interval has a probabilistic interpretation under sampling variability of $\bar{Y}$: in repeated sampling, $100(1-\alpha)\%$ of the intervals $\bar{Y}\pm|z_{\alpha/2}|\frac{\sigma}{\sqrt{N}}$ would include $\mu$. However, given an observed realization of $\bar{Y}$, say $\bar{y}$, the probability that $\bar{y}\pm|z_{\alpha/2}|\frac{\sigma}{\sqrt{N}}$ includes $\mu$ is either 1 or 0. This is why we refer to it as a $100(1-\alpha)\%$ confidence interval rather than a probability interval. On the other hand, $\bar{y}\pm|z_{\alpha/2}|\frac{\sigma}{\sqrt{N}}$ has a straightforward probabilistic interpretation in the Bayesian framework: given the data and $\sigma$, there is a $100(1-\alpha)\%$ posterior probability that $\mu$ lies within this interval.
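The repeated-sampling interpretation of the confidence interval can be illustrated by simulation. The sketch below is an illustration only: the true values of $\mu$ and $\sigma$, the sample size, and the number of replications are all assumed for the purpose of the exercise, and $\sigma$ is treated as known.

```r
# Minimal sketch: repeated-sampling coverage of
# y_bar +/- |z| * sigma / sqrt(N) with sigma treated as known.
# All numeric values are illustrative assumptions.
set.seed(10)
mu <- 100; sigma <- 3; N <- 50
a <- 0.05                      # alpha, so the nominal level is 95%
z <- abs(qnorm(a / 2))         # |z_{alpha/2}|
R <- 10000                     # Number of repeated samples

covered <- replicate(R, {
  y_bar <- mean(rnorm(N, mean = mu, sd = sigma))
  half <- z * sigma / sqrt(N)  # Half-width of the interval
  (y_bar - half <= mu) && (mu <= y_bar + half)
})

# Share of intervals that include the true mu; should be close to 0.95
mean(covered)
```

Any single element of `covered` is either `TRUE` or `FALSE`, which mirrors the point that a realized interval either contains $\mu$ or does not; only the long-run share is close to the nominal level.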

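The marginal posterior density of $\mu$ is obtained by integrating $\sigma$ out of the joint posterior,

$$
\begin{aligned}
\pi(\mu \mid y) &= \int_0^{\infty}\pi(\mu,\sigma \mid y)\,d\sigma\\
&\propto \int_0^{\infty}\frac{1}{\sigma}\times(\sigma^2)^{-N/2}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^N (y_i-\mu)^2\right\}d\sigma\\
&= \int_0^{\infty}\left(\frac{1}{\sigma}\right)^{N+1}\exp\left\{-\frac{N}{2\sigma^2}\frac{\sum_{i=1}^N (y_i-\mu)^2}{N}\right\}d\sigma\\
&= \left[\frac{2}{\Gamma(N/2)}\left(\frac{N\sum_{i=1}^N (y_i-\mu)^2}{2N}\right)^{N/2}\right]^{-1}\\
&\propto \left[\sum_{i=1}^N (y_i-\mu)^2\right]^{-N/2}\\
&= \left[\sum_{i=1}^N \left((y_i-\bar{y})-(\mu-\bar{y})\right)^2\right]^{-N/2}\\
&= \left[\alpha_n\hat{\sigma}^2+N(\mu-\bar{y})^2\right]^{-N/2}\\
&\propto \left[1+\frac{1}{\alpha_n}\left(\frac{\mu-\bar{y}}{\hat{\sigma}/\sqrt{N}}\right)^2\right]^{-(\alpha_n+1)/2}.
\end{aligned}
$$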

The fourth line arises from the kernel of an inverted gamma density with N degrees of freedom in the integral (Zellner 1996).
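To make that step explicit, the following sketch states the identity being used. It follows from the fact that an inverted gamma density in $\sigma$ integrates to one; the symbols $\nu$ and $s^2$ are generic placeholders introduced here for illustration, with $\nu=N$ and $s^2=\sum_{i=1}^N (y_i-\mu)^2/N$ in this derivation.

```latex
\int_0^{\infty}\frac{1}{\sigma^{\nu+1}}\exp\left\{-\frac{\nu s^2}{2\sigma^2}\right\}d\sigma
=\left[\frac{2}{\Gamma(\nu/2)}\left(\frac{\nu s^2}{2}\right)^{\nu/2}\right]^{-1}
```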

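The last expression represents the kernel of a Student’s t-distribution with $\alpha_n=N-1$ degrees of freedom, expected value equal to $\bar{y}$, and variance $\frac{\hat{\sigma}^2}{N}\left(\frac{\alpha_n}{\alpha_n-2}\right)$. Therefore, $\mu \mid y\sim t\left(\bar{y},\frac{\hat{\sigma}^2}{N}\left(\frac{\alpha_n}{\alpha_n-2}\right),\alpha_n\right)$.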
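As a numerical sanity check on this result, one can compare the unnormalized kernel $[\alpha_n\hat{\sigma}^2+N(\mu-\bar{y})^2]^{-N/2}$ with a Student's t density that has location $\bar{y}$, scale $\hat{\sigma}/\sqrt{N}$, and $\alpha_n$ degrees of freedom. The sketch below uses assumed summary statistics chosen only for illustration; if the two functions are proportional, their log difference is constant over a grid of $\mu$ values.

```r
# Minimal sketch: the kernel [alpha_n * s2_hat + N * (mu - y_bar)^2]^(-N/2)
# should be proportional to a location-scale Student's t density.
# The summary statistics below are illustrative assumptions.
N <- 20; y_bar <- 5; s2_hat <- 4
alpha_n <- N - 1
se <- sqrt(s2_hat / N)                  # Scale: sigma_hat / sqrt(N)

mu_grid <- seq(y_bar - 3, y_bar + 3, length.out = 7)

# Log of the unnormalized marginal posterior kernel
log_kernel <- -(N / 2) * log(alpha_n * s2_hat + N * (mu_grid - y_bar)^2)
# Log density of the location-scale t, built from the standard t density
log_t <- dt((mu_grid - y_bar) / se, df = alpha_n, log = TRUE) - log(se)

# Proportional functions differ by a constant on the log scale
diffs <- log_kernel - log_t
range(diffs)
```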

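Observe that a $100(1-\alpha)\%$ confidence interval and a $100(1-\alpha)\%$ credible interval have exactly the same form, $\bar{y}\pm|t^{\alpha_n}_{\alpha/2}|\frac{\hat{\sigma}}{\sqrt{N}}$, where $t^{\alpha_n}_{\alpha/2}$ is the $\alpha/2$ quantile of a Student’s t-distribution with $\alpha_n$ degrees of freedom. However, the interpretations are entirely different.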

The mathematical similarity between the Frequentist and Bayesian expressions in this example arises from the use of an improper prior.

Example: Math test

You have a random sample of math scores of size $N=50$ from a normal distribution, $Y_i\sim N(\mu,\sigma^2)$. The sample mean and variance are equal to 102 and 10, respectively. Assuming an improper prior proportional to $\frac{1}{\sigma}$, we proceed with the following tasks:

  • Compute the 95% confidence and credible intervals for μ.
  • Determine the posterior probability that μ>103.

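Because $\mu \mid y\sim t\left(\bar{y},\frac{\hat{\sigma}^2}{N}\left(\frac{\alpha_n}{\alpha_n-2}\right),\alpha_n\right)$, the confidence and credible intervals for $\mu$ are given by

$$
\bar{y}\pm|t^{\alpha_n}_{\alpha/2}|\frac{\hat{\sigma}}{\sqrt{N}},
$$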

where $\bar{y}=102$, $\hat{\sigma}^2=10$, and $\alpha_n=49$. Thus, the 95% confidence and credible intervals for $\mu$ are the same, namely $(101.1, 102.9)$, and the posterior probability that $\mu>103$ is approximately 1.5% given the sample information.

N <- 50 # Sample size
y_bar <- 102 # Sample mean
s2 <- 10 # Sample variance
alpha <- N - 1 # Degrees of freedom
serror <- sqrt(s2 / N) # Scale: sigma_hat / sqrt(N)
# Lower bound of the 95% interval
LimInf <- y_bar - abs(qt(0.025, alpha)) * serror
LimInf
## [1] 101.1013
# Upper bound of the 95% interval
LimSup <- y_bar + abs(qt(0.025, alpha)) * serror
LimSup
## [1] 102.8987
# Posterior probability of mu greater than y.cut
# (pt.scaled comes from the metRology package, which must be installed)
y.cut <- 103
P <- 1 - metRology::pt.scaled(y.cut, df = alpha, mean = y_bar, sd = serror)
P
## [1] 0.01496694
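If the metRology package is not available, the same tail probability can be obtained with base R by standardizing the cutoff with the posterior location and scale and calling `pt()`. This is a minimal sketch of that alternative, reusing the values from the example; the name `P_base` is introduced here only for illustration.

```r
# Minimal sketch: posterior probability that mu > y.cut using base R only.
N <- 50                 # Sample size
y_bar <- 102            # Sample mean
s2 <- 10                # Sample variance
alpha <- N - 1          # Degrees of freedom
serror <- sqrt(s2 / N)  # Scale: sigma_hat / sqrt(N)
y.cut <- 103

# Standardize the cutoff and take the upper tail of a t with alpha df
P_base <- pt((y.cut - y_bar) / serror, df = alpha, lower.tail = FALSE)
P_base
```

Using `lower.tail = FALSE` computes the upper tail directly instead of subtracting from one, which is the usual recommendation when tail probabilities are small.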

References

Zellner, Arnold. 1996. “Introduction to Bayesian Inference in Econometrics.”