4.4 Multivariate linear regression: The conjugate normal-normal/inverse Wishart model

Let’s study the multivariate regression setting where there are $M$ $N$-dimensional vectors $y_m$, $m=1,2,\dots,M$, such that $y_m=X\beta_m+\mu_m$, where $X$ is the set of common regressors and $\mu_m$ is the $N$-dimensional vector of stochastic errors for each equation, such that $U=[\mu_1 \ \mu_2 \ \cdots \ \mu_M]\sim MN_{N,M}(0,I_N,\Sigma)$, that is, a matrix variate normal distribution where $\Sigma$ is the covariance matrix of each $i$-th row of $U$, $i=1,2,\dots,N$, and we assume independence between the rows. Then, $\operatorname{vec}(U)\sim N_{N\times M}(0,\Sigma\otimes I_N)$.¹
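As a quick numerical sanity check of this vectorization identity, a minimal sketch: draw the rows of $U$ independently from $N(0,\Sigma)$ and verify that the empirical covariance of $\operatorname{vec}(U)$ approaches $\Sigma\otimes I_N$. The dimensions, the particular $\Sigma$, and the number of draws below are illustrative choices, not from the text.

```python
import numpy as np

# Monte Carlo check that iid rows U_i ~ N(0, Sigma) imply
# cov(vec(U)) = Sigma (kron) I_N, with vec stacking columns.
# N, M, Sigma, and the number of draws are illustrative choices.
rng = np.random.default_rng(0)
N, M = 3, 2
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

draws = 200_000
U = rng.multivariate_normal(np.zeros(M), Sigma, size=(draws, N))  # (draws, N, M)
vecU = U.transpose(0, 2, 1).reshape(draws, N * M)  # stacks the columns of each U

emp_cov = np.cov(vecU, rowvar=False)
print(np.allclose(emp_cov, np.kron(Sigma, np.eye(N)), atol=0.05))  # True
```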

This framework can be written in matrix form as

$$
\underbrace{\begin{bmatrix}
y_{11} & y_{12} & \cdots & y_{1M}\\
y_{21} & y_{22} & \cdots & y_{2M}\\
\vdots & \vdots & \ddots & \vdots\\
y_{N1} & y_{N2} & \cdots & y_{NM}
\end{bmatrix}}_{Y}
=
\underbrace{\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1K}\\
x_{21} & x_{22} & \cdots & x_{2K}\\
\vdots & \vdots & \ddots & \vdots\\
x_{N1} & x_{N2} & \cdots & x_{NK}
\end{bmatrix}}_{X}
\underbrace{\begin{bmatrix}
\beta_{11} & \beta_{12} & \cdots & \beta_{1M}\\
\beta_{21} & \beta_{22} & \cdots & \beta_{2M}\\
\vdots & \vdots & \ddots & \vdots\\
\beta_{K1} & \beta_{K2} & \cdots & \beta_{KM}
\end{bmatrix}}_{B}
+
\underbrace{\begin{bmatrix}
\mu_{11} & \mu_{12} & \cdots & \mu_{1M}\\
\mu_{21} & \mu_{22} & \cdots & \mu_{2M}\\
\vdots & \vdots & \ddots & \vdots\\
\mu_{N1} & \mu_{N2} & \cdots & \mu_{NM}
\end{bmatrix}}_{U}
$$

Therefore, $Y\sim N_{N\times M}(XB,\Sigma\otimes I_N)$.²
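One direct way to simulate from this sampling model is scipy's matrix normal, whose row/column covariance parameterization matches $N_{N\times M}(XB,\Sigma\otimes I_N)$ with row covariance $I_N$ and column covariance $\Sigma$. A minimal sketch; dimensions and parameter values are illustrative:

```python
import numpy as np
from scipy.stats import matrix_normal

# Simulate Y ~ N_{N x M}(XB, Sigma kron I_N), i.e. a matrix normal with
# row covariance I_N and column covariance Sigma. Values are illustrative.
rng = np.random.default_rng(1)
N, K, M = 50, 3, 2
X = np.column_stack([np.ones(N), rng.standard_normal((N, K - 1))])
B = rng.standard_normal((K, M))
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

Y = matrix_normal.rvs(mean=X @ B, rowcov=np.eye(N), colcov=Sigma,
                      random_state=rng)
print(Y.shape)  # (50, 2)
```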

The likelihood function is

$$
\begin{aligned}
p(Y\mid B,\Sigma)&\propto|\Sigma|^{-N/2}\exp\left\{-\frac{1}{2}\operatorname{tr}\left[(Y-XB)^{\top}(Y-XB)\Sigma^{-1}\right]\right\}\\
&=|\Sigma|^{-N/2}\exp\left\{-\frac{1}{2}\operatorname{tr}\left[\left(S+(B-\hat{B})^{\top}X^{\top}X(B-\hat{B})\right)\Sigma^{-1}\right]\right\},
\end{aligned}
$$

where $S=(Y-X\hat{B})^{\top}(Y-X\hat{B})$ and $\hat{B}=(X^{\top}X)^{-1}X^{\top}Y$ (see Exercise 7).
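In code, $\hat{B}$ and $S$ are one linear solve away. A sketch with simulated data; the `X` and `Y` below are illustrative stand-ins, not the text's data:

```python
import numpy as np

# Compute B_hat = (X'X)^{-1} X'Y and S = (Y - X B_hat)'(Y - X B_hat)
# for an illustrative simulated design; any N x K X and N x M Y work.
rng = np.random.default_rng(2)
N, K, M = 50, 3, 2
X = np.column_stack([np.ones(N), rng.standard_normal((N, K - 1))])
Y = X @ rng.standard_normal((K, M)) + rng.standard_normal((N, M))

Bhat = np.linalg.solve(X.T @ X, X.T @ Y)  # K x M, equation-by-equation OLS
S = (Y - X @ Bhat).T @ (Y - X @ Bhat)     # M x M residual cross-product
```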

The conjugate prior for this model is $\pi(B,\Sigma)=\pi(B\mid\Sigma)\pi(\Sigma)$, where $\pi(B\mid\Sigma)\sim N_{K\times M}(B_0,\Sigma\otimes V_0)$ and $\pi(\Sigma)\sim IW(\Psi_0,\alpha_0)$, that is,

$$
\begin{aligned}
\pi(B,\Sigma)&\propto|\Sigma|^{-K/2}\exp\left\{-\frac{1}{2}\operatorname{tr}\left[(B-B_0)^{\top}V_0^{-1}(B-B_0)\Sigma^{-1}\right]\right\}\\
&\quad\times|\Sigma|^{-(\alpha_0+M+1)/2}\exp\left\{-\frac{1}{2}\operatorname{tr}\left[\Psi_0\Sigma^{-1}\right]\right\}.
\end{aligned}
$$
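Drawing from this prior is a two-step composition: $\Sigma\sim IW(\Psi_0,\alpha_0)$, then $B\mid\Sigma\sim N_{K\times M}(B_0,\Sigma\otimes V_0)$. scipy's `invwishart` uses the same $|\Sigma|^{-(\alpha_0+M+1)/2}\exp\{-\operatorname{tr}(\Psi_0\Sigma^{-1})/2\}$ kernel as above. A minimal sketch; all hyperparameter values are illustrative assumptions:

```python
import numpy as np
from scipy.stats import invwishart, matrix_normal

# One joint draw from pi(B, Sigma) = pi(B | Sigma) pi(Sigma).
# B0, V0, Psi0, alpha0 are illustrative hyperparameter choices.
K, M = 3, 2
B0 = np.zeros((K, M))
V0 = 10.0 * np.eye(K)   # row covariance of B given Sigma
Psi0 = np.eye(M)        # inverse-Wishart scale matrix
alpha0 = M + 2          # inverse-Wishart degrees of freedom (> M - 1)

rng = np.random.default_rng(3)
Sigma = invwishart.rvs(df=alpha0, scale=Psi0, random_state=rng)
B = matrix_normal.rvs(mean=B0, rowcov=V0, colcov=Sigma, random_state=rng)
```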

The posterior distribution is given by

$$
\begin{aligned}
\pi(B,\Sigma\mid Y,X)&\propto p(Y\mid B,\Sigma,X)\pi(B\mid\Sigma)\pi(\Sigma)\\
&\propto|\Sigma|^{-\frac{N+K+\alpha_0+M+1}{2}}\exp\left\{-\frac{1}{2}\operatorname{tr}\left[\left(\Psi_0+S+(B-B_0)^{\top}V_0^{-1}(B-B_0)+(B-\hat{B})^{\top}X^{\top}X(B-\hat{B})\right)\Sigma^{-1}\right]\right\}.
\end{aligned}
$$

Completing the square on $B$ and collecting the remaining terms in the bracket yields

$$
\Psi_0+S+(B-B_0)^{\top}V_0^{-1}(B-B_0)+(B-\hat{B})^{\top}X^{\top}X(B-\hat{B})=(B-B_n)^{\top}V_n^{-1}(B-B_n)+\Psi_n,
$$

where

$$
\begin{aligned}
B_n&=(V_0^{-1}+X^{\top}X)^{-1}(V_0^{-1}B_0+X^{\top}Y)=(V_0^{-1}+X^{\top}X)^{-1}(V_0^{-1}B_0+X^{\top}X\hat{B}),\\
V_n&=(V_0^{-1}+X^{\top}X)^{-1},\\
\Psi_n&=\Psi_0+S+B_0^{\top}V_0^{-1}B_0+\hat{B}^{\top}X^{\top}X\hat{B}-B_n^{\top}V_n^{-1}B_n.
\end{aligned}
$$

Thus, the posterior distribution can be written as

$$
\begin{aligned}
\pi(B,\Sigma\mid Y,X)&\propto|\Sigma|^{-K/2}\exp\left\{-\frac{1}{2}\operatorname{tr}\left[(B-B_n)^{\top}V_n^{-1}(B-B_n)\Sigma^{-1}\right]\right\}\\
&\quad\times|\Sigma|^{-\frac{N+\alpha_0+M+1}{2}}\exp\left\{-\frac{1}{2}\operatorname{tr}\left[\Psi_n\Sigma^{-1}\right]\right\}.
\end{aligned}
$$

That is, $\pi(B,\Sigma\mid Y,X)=\pi(B\mid\Sigma,Y,X)\pi(\Sigma\mid Y,X)$, where $\pi(B\mid\Sigma,Y,X)\sim N_{K\times M}(B_n,\Sigma\otimes V_n)$ and $\pi(\Sigma\mid Y,X)\sim IW(\Psi_n,\alpha_n)$ with $\alpha_n=N+\alpha_0$.
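These updating equations map directly into code. The sketch below (the function name `posterior_draws` and all of its inputs are our own illustrative choices, not from the text) computes $B_n$, $V_n$, $\Psi_n$, and $\alpha_n$, then returns exact joint draws by composition: $\Sigma\sim IW(\Psi_n,\alpha_n)$ followed by $B\mid\Sigma\sim N_{K\times M}(B_n,\Sigma\otimes V_n)$. scipy's `invwishart` and `matrix_normal` use the same parameterizations as the densities above.

```python
import numpy as np
from scipy.stats import invwishart, matrix_normal

def posterior_draws(Y, X, B0, V0, Psi0, alpha0, n_draws=1000, seed=0):
    """Exact draws from pi(B, Sigma | Y, X) in the conjugate
    normal-normal/inverse-Wishart multivariate regression model."""
    N, _ = X.shape
    XtX = X.T @ X
    Bhat = np.linalg.solve(XtX, X.T @ Y)
    S = (Y - X @ Bhat).T @ (Y - X @ Bhat)

    # Posterior hyperparameters from the updating equations above.
    V0inv = np.linalg.inv(V0)
    Vninv = V0inv + XtX
    Vn = np.linalg.inv(Vninv)
    Bn = Vn @ (V0inv @ B0 + X.T @ Y)
    Psin = (Psi0 + S + B0.T @ V0inv @ B0
            + Bhat.T @ XtX @ Bhat - Bn.T @ Vninv @ Bn)
    Psin = 0.5 * (Psin + Psin.T)  # symmetrize against floating-point drift
    alphan = N + alpha0

    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n_draws):
        Sigma = invwishart.rvs(df=alphan, scale=Psin, random_state=rng)
        B = matrix_normal.rvs(mean=Bn, rowcov=Vn, colcov=Sigma,
                              random_state=rng)
        samples.append((B, Sigma))
    return samples
```

Because the model is fully conjugate, these draws are independent and exact; no Gibbs sampling or other MCMC is required.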

The marginal posterior for B is …

The marginal likelihood is …

The predictive density is …


  1. $\operatorname{vec}$ denotes the vectorization operation, and $\otimes$ denotes the Kronecker product.↩︎

  2. We can write down the former expression in a more familiar way using vectorization properties: $\underbrace{\operatorname{vec}(Y)}_{y}=\underbrace{(I_M\otimes X)}_{Z}\underbrace{\operatorname{vec}(B)}_{\beta}+\underbrace{\operatorname{vec}(U)}_{\mu}$, where $y\sim N_{N\times M}(Z\beta,\Sigma\otimes I_N)$.↩︎