Chapter 3 Multiparameter Models
Example 1: Independent Beta-binomial model
Assume an independent binomial model, $Y_s \overset{ind}{\sim} \text{Bin}(n_s,\theta_s)$, i.e.
$$p(y\mid\theta)=\prod_{s=1}^S p(y_s\mid\theta_s)=\prod_{s=1}^S \binom{n_s}{y_s}\theta_s^{y_s}(1-\theta_s)^{n_s-y_s},$$
and assume independent beta prior distributions:
$$p(\theta)=\prod_{s=1}^S p(\theta_s)=\prod_{s=1}^S \frac{\theta_s^{a_s-1}(1-\theta_s)^{b_s-1}}{\text{Beta}(a_s,b_s)}\,\mathrm{I}(0<\theta_s<1).$$
Then we have $p(\theta\mid y)\propto\prod_{s=1}^S \text{Beta}(\theta_s\mid a_s+y_s,\,b_s+n_s-y_s)$.
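Because the groups are independent, the update is just a per-group Beta update. A minimal sketch in Python (the counts `n`, `y` and the uniform Beta(1, 1) priors below are made-up illustration values):

```python
import numpy as np
from scipy import stats

# Hypothetical data for S = 3 groups: trials n_s and successes y_s
n = np.array([10, 20, 15])
y = np.array([3, 12, 5])

# Independent Beta(a_s, b_s) priors; here Beta(1, 1) (uniform) for each group
a = np.ones(3)
b = np.ones(3)

# Conjugate update: theta_s | y ~ Beta(a_s + y_s, b_s + n_s - y_s)
a_post = a + y
b_post = b + n - y

# Posterior means and 95% equal-tail credible intervals, one per group
post_mean = a_post / (a_post + b_post)
ci_lo, ci_hi = stats.beta.interval(0.95, a_post, b_post)
```

Everything is vectorized over groups, mirroring the product form of the posterior.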
Example 2: Normal model with unknown mean and variance
Scaled-inverse $\chi^2$ distribution: If $\sigma^2\sim IG(a,b)$, then $\sigma^2\sim\text{Inv-}\chi^2(v,s^2)$ where
- $a=v/2$ and $b=vs^2/2$,
- or equivalently, $v=2a$ and $s^2=b/a$.
Location-scale t-distribution: $t_v(m,s^2)\xrightarrow{\;v\to\infty\;}N(m,s^2)$.
Normal-Inv-$\chi^2$ distribution: If $\mu\mid\sigma^2\sim N(m,\sigma^2/k)$ and $\sigma^2\sim\text{Inv-}\chi^2(v,s^2)$, then the kernel of the joint density is
$$p(\mu,\sigma^2)\propto(\sigma^2)^{-(v+3)/2}\exp\!\left(-\frac{1}{2\sigma^2}\left[k(\mu-m)^2+vs^2\right]\right).$$
In addition, the marginal distribution for $\mu$ is $t_v(m,s^2/k)$.
The Jeffreys prior can be shown to be $p(\mu,\sigma^2)\propto(1/\sigma^2)^{3/2}$, but the reference-prior approach finds that $p(\mu,\sigma^2)\propto1/\sigma^2$ is more appropriate. Under the reference prior, the posterior is
$$\mu\mid\sigma^2,y\sim N(\bar y,\sigma^2/n),\qquad \sigma^2\mid y\sim\text{Inv-}\chi^2(n-1,S^2),$$
and the marginal posterior for $\mu$ is $\mu\mid y\sim t_{n-1}(\bar y,S^2/n)$.
To predict $\tilde y\sim N(\mu,\sigma^2)$, we can write $\tilde y=\mu+\epsilon$ with $\mu\mid\sigma^2,y\sim N(\bar y,\sigma^2/n)$ and $\epsilon\mid\sigma^2,y\sim N(0,\sigma^2)$. Thus
$$\tilde y\mid\sigma^2,y\sim N(\bar y,\sigma^2[1+1/n]).$$
Because $\sigma^2\mid y\sim\text{Inv-}\chi^2(n-1,S^2)$, we have $\tilde y\mid y\sim t_{n-1}(\bar y,S^2[1+1/n])$.
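The two-step decomposition above is exactly composition sampling: draw $\sigma^2$ from its marginal posterior, then draw $\tilde y$ given that $\sigma^2$. A sketch with simulated data (the data-generating values 5.0 and 2.0 are arbitrary), using the fact that $\text{Inv-}\chi^2(n-1,S^2)$ draws can be obtained as $(n-1)S^2/\chi^2_{n-1}$:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical observed data
y = rng.normal(5.0, 2.0, size=30)
n, ybar, S2 = len(y), y.mean(), y.var(ddof=1)

# Step 1: sigma2 | y ~ Inv-chi2(n-1, S2), i.e. (n-1) * S2 / chi2_{n-1}
sigma2 = (n - 1) * S2 / rng.chisquare(n - 1, size=100_000)

# Step 2: y_tilde | sigma2, y ~ N(ybar, sigma2 * (1 + 1/n))
y_tilde = rng.normal(ybar, np.sqrt(sigma2 * (1 + 1 / n)))

# Marginally, y_tilde | y ~ t_{n-1}(ybar, S2 * (1 + 1/n)), so the draws should
# have mean ybar and variance S2 * (1 + 1/n) * (n - 1) / (n - 3)
```

Comparing the Monte Carlo moments of `y_tilde` to the $t_{n-1}$ moments is a quick sanity check on the derivation.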
The conjugate prior for $\mu$ and $\sigma^2$ is
$$\mu\mid\sigma^2\sim N(m,\sigma^2/k),\qquad \sigma^2\sim\text{Inv-}\chi^2(v,s^2),$$
where $s^2$ serves as a prior guess about $\sigma^2$ and $v$ controls how certain we are about that guess. The posterior under this prior is
$$\mu\mid\sigma^2,y\sim N(m',\sigma^2/k'),\qquad \sigma^2\mid y\sim\text{Inv-}\chi^2(v',(s')^2),$$
where $k'=k+n$, $m'=[km+n\bar y]/k'$, $v'=v+n$, and
$$v'(s')^2=vs^2+(n-1)S^2+\frac{kn}{k'}(\bar y-m)^2.$$
The marginal posterior for $\mu$ is $\mu\mid y\sim t_{v'}(m',(s')^2/k')$.
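The update formulas above translate directly into a small helper; a sketch (the function name is ours, not from the notes):

```python
import numpy as np

def normal_invchi2_update(y, m, k, v, s2):
    """Conjugate Normal-Inv-chi2 update, following the formulas in the notes.

    Prior:  mu | sigma2 ~ N(m, sigma2/k),  sigma2 ~ Inv-chi2(v, s2).
    Returns the posterior hyperparameters (m', k', v', (s')^2).
    """
    y = np.asarray(y, dtype=float)
    n, ybar = len(y), y.mean()
    S2 = y.var(ddof=1) if n > 1 else 0.0   # sample variance
    k_new = k + n
    m_new = (k * m + n * ybar) / k_new      # precision-weighted mean
    v_new = v + n
    # v'(s')^2 = v s^2 + (n-1) S^2 + (k n / k') (ybar - m)^2
    s2_new = (v * s2 + (n - 1) * S2 + k * n / k_new * (ybar - m) ** 2) / v_new
    return m_new, k_new, v_new, s2_new
```

For example, `normal_invchi2_update([1, 2, 3], m=0, k=1, v=1, s2=1)` combines a weak prior at 0 with data averaging 2.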
Example 3: Multinomial-Dirichlet
Suppose $Y=(Y_1,\dots,Y_K)\sim\text{Mult}(n,\pi)$ with pmf
$$p(y)=n!\prod_{k=1}^K\frac{\pi_k^{y_k}}{y_k!},$$
and let $\pi\sim\text{Dir}(a)$ with concentration parameter $a=(a_1,\dots,a_K)$, where $a_k>0$ for all $k$.
Dirichlet distribution: The pdf of $\pi$ is
$$p(\pi)=\frac{1}{\text{Beta}(a)}\prod_{k=1}^K\pi_k^{a_k-1}$$
with $\sum_{k=1}^K\pi_k=1$, where $\text{Beta}(a)$ is the multinomial beta function, i.e.
$$\text{Beta}(a)=\frac{\prod_{k=1}^K\Gamma(a_k)}{\Gamma\!\left(\sum_{k=1}^K a_k\right)}.$$
The moments are $E(\pi_k)=a_k/a_0$ and $V(\pi_k)=a_k(a_0-a_k)/[a_0^2(a_0+1)]$, where $a_0=\sum_{k=1}^K a_k$.
Marginally, each component of a Dirichlet distribution is a beta distribution, with $\pi_k\sim\text{Be}(a_k,a_0-a_k)$.
The conjugate prior for a multinomial distribution with unknown probability vector $\pi$ is a Dirichlet distribution. The Jeffreys prior is a Dirichlet distribution with $a_k=0.5$ for all $k$.
The posterior under a Dirichlet prior is
$$p(\pi\mid y)\propto\prod_{k=1}^K\pi_k^{a_k+y_k-1}\quad\Rightarrow\quad\pi\mid y\sim\text{Dir}(a+y).$$
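The Dirichlet update is just elementwise addition of counts to concentrations. A sketch using the Jeffreys prior from above (the counts `y` are made-up illustration values):

```python
import numpy as np
from scipy import stats

# Hypothetical multinomial counts for K = 4 categories
y = np.array([12, 7, 3, 8])

# Jeffreys prior Dir(0.5, ..., 0.5); conjugacy gives pi | y ~ Dir(a + y)
a = np.full(4, 0.5)
a_post = a + y

# Posterior means E(pi_k | y) = (a_k + y_k) / a0', with a0' = sum of a_post
post_mean = a_post / a_post.sum()

# Posterior draws, e.g. for credible intervals on each pi_k
draws = stats.dirichlet.rvs(a_post, size=10_000, random_state=0)
```

The marginal-beta fact above means each `pi_k` could equivalently be summarized with `stats.beta(a_post[k], a_post.sum() - a_post[k])`.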
Example 4: Multivariate Normal
$$p(y)=(2\pi)^{-K/2}|\Sigma|^{-1/2}\exp\!\left(-\frac{1}{2}(y-\mu)^\top\Sigma^{-1}(y-\mu)\right)$$
Let $Y\sim N(\mu,\Sigma)$ with precision matrix $\Omega=\Sigma^{-1}$.
- If $\Sigma_{k,k'}=0$, then $Y_k$ and $Y_{k'}$ are independent of each other.
- If $\Omega_{k,k'}=0$, then $Y_k$ and $Y_{k'}$ are conditionally independent of each other given $Y_j$ for all $j\ne k,k'$.
Conjugate inference: let $Y_i\overset{iid}{\sim}N_K(\mu,S)$ with known covariance matrix $S$ and conjugate prior $\mu\sim N_K(m,C)$. The posterior is $\mu\mid y\sim N(m',C')$ where $C'=[C^{-1}+nS^{-1}]^{-1}$ and $m'=C'[C^{-1}m+nS^{-1}\bar y]$.
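Since precisions add, the posterior covariance and mean are two linear-algebra lines. A sketch with simulated 2-D data (the covariance `S`, prior `m`, `C`, and true mean are arbitrary illustration values):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical known covariance S and simulated data
S = np.array([[2.0, 0.5],
              [0.5, 1.0]])
y = rng.multivariate_normal([1.0, -1.0], S, size=50)
n, ybar = len(y), y.mean(axis=0)

# Vague conjugate prior mu ~ N(m, C)
m = np.zeros(2)
C = 10.0 * np.eye(2)

# Precision-weighted update: C' = (C^{-1} + n S^{-1})^{-1},
# m' = C' (C^{-1} m + n S^{-1} ybar)
C_inv, S_inv = np.linalg.inv(C), np.linalg.inv(S)
C_post = np.linalg.inv(C_inv + n * S_inv)
m_post = C_post @ (C_inv @ m + n * S_inv @ ybar)
```

With a vague prior and $n=50$, `m_post` lands essentially on the sample mean, and `C_post` is close to $S/n$.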
Let $\Sigma$ have an inverse Wishart distribution, i.e. $\Sigma\sim IW(v,W^{-1})$, with degrees of freedom $v>K-1$ and positive-definite scale matrix $W$. A multivariate generalization of the normal-scaled-inverse-$\chi^2$ distribution is the normal-inverse-Wishart distribution: for a vector $\mu\in\mathbb{R}^K$ and $K\times K$ matrix $\Sigma$,
$$\mu\mid\Sigma\sim N(m,\Sigma/c),\qquad \Sigma\sim IW(v,W^{-1}).$$
The marginal distribution for $\mu$ is a multivariate t-distribution, i.e. $\mu\sim t_{v-K+1}(m,W/[c(v-K+1)])$. Under the standard noninformative prior $p(\mu,\Sigma)\propto|\Sigma|^{-(K+1)/2}$, the posterior distribution is
$$\mu\mid\Sigma,y\sim N(\bar y,\Sigma/n),\qquad \Sigma\mid y\sim IW(n-1,S^{-1}),$$
where $S=\sum_{i=1}^n(y_i-\bar y)(y_i-\bar y)^\top$ is the sum-of-squares matrix.
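This posterior can also be sampled by composition: draw $\Sigma$ from its marginal, then $\mu$ given $\Sigma$. A sketch with simulated data, assuming $S$ is the sum-of-squares matrix; note that `scipy.stats.invwishart` is parameterized by the scale matrix itself, so $IW(v, W^{-1})$ in the notation above corresponds to `invwishart(df=v, scale=W)`:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Hypothetical data, K = 2 dimensions
K, n = 2, 40
y = rng.multivariate_normal([0.0, 2.0], np.eye(2), size=n)
ybar = y.mean(axis=0)
Smat = (y - ybar).T @ (y - ybar)   # sum-of-squares matrix S

# Composition sampling:
#   Sigma | y ~ IW(n - 1, S^{-1})  -> scipy: invwishart(df=n-1, scale=S)
#   mu | Sigma, y ~ N(ybar, Sigma / n)
mu_draws = []
for _ in range(1000):
    Sigma = stats.invwishart.rvs(df=n - 1, scale=Smat, random_state=rng)
    mu_draws.append(rng.multivariate_normal(ybar, Sigma / n))
mu_draws = np.array(mu_draws)
```

The draws of `mu_draws` should center on the sample mean, consistent with the multivariate-t marginal for $\mu$.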