Appendix

Deriving $x_2 \mid x_1$ when $(x_1, x_2)^T$ is a block-normal multivariate random variable

Recall our block normal system:

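$$
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\sim N\left(
\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix},
\begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}
\right)
$$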

Assuming that $\Sigma_{11}$ is invertible (which holds unless $x_1$ contains degenerate terms), we then have

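$$
p(x_2 \mid x_1) = \frac{p(x_1, x_2)}{p(x_1)}
\propto \exp\left(-\frac{1}{2}\left[
\begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{bmatrix}^T
\begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}^{-1}
\begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{bmatrix}
- (x_1 - \mu_1)^T \Sigma_{11}^{-1} (x_1 - \mu_1)
\right]\right)
$$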

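Now, one of the expressions we may use to invert the block covariance matrix is:

$$
\begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}^{-1}
=
\begin{bmatrix}
\Sigma_{11}^{-1} + \Sigma_{11}^{-1}\Sigma_{12}\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}\Sigma_{21}\Sigma_{11}^{-1}
& -\Sigma_{11}^{-1}\Sigma_{12}\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1} \\
-\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}\Sigma_{21}\Sigma_{11}^{-1}
& \left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}
\end{bmatrix}
$$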
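The block-inverse identity can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `block_inverse` and the random test matrix are illustrative choices, not part of the derivation.

```python
import numpy as np

def block_inverse(S11, S12, S21, S22):
    """Invert [[S11, S12], [S21, S22]] using the Schur complement of S11.

    Hypothetical helper for illustration; assumes S11 and the Schur
    complement S22 - S21 S11^{-1} S12 are both invertible.
    """
    S11_inv = np.linalg.inv(S11)
    schur_inv = np.linalg.inv(S22 - S21 @ S11_inv @ S12)
    top_left = S11_inv + S11_inv @ S12 @ schur_inv @ S21 @ S11_inv
    top_right = -S11_inv @ S12 @ schur_inv
    bottom_left = -schur_inv @ S21 @ S11_inv
    return np.block([[top_left, top_right], [bottom_left, schur_inv]])

# Build a random symmetric positive-definite covariance and split it into blocks.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
Sigma = A @ A.T + 5 * np.eye(5)
S11, S12 = Sigma[:2, :2], Sigma[:2, 2:]
S21, S22 = Sigma[2:, :2], Sigma[2:, 2:]

# The block formula should agree with a direct inverse up to rounding error.
assert np.allclose(block_inverse(S11, S12, S21, S22), np.linalg.inv(Sigma))
```

The explicit inverses here mirror the formula for readability; in numerical work one would usually prefer linear solves over forming inverses.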

Hence,

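$$
\begin{aligned}
p(x_2 \mid x_1) &\propto \exp\Big(-\frac{1}{2}\Big[
(x_1 - \mu_1)^T \Sigma_{11}^{-1}\Sigma_{12}\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}\Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1) \\
&\qquad\qquad - 2(x_1 - \mu_1)^T \Sigma_{11}^{-1}\Sigma_{12}\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}(x_2 - \mu_2) \\
&\qquad\qquad + (x_2 - \mu_2)^T \left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}(x_2 - \mu_2)
\Big]\Big) \\
&\propto \exp\Big(-\frac{1}{2}\Big[
\big((x_2 - \mu_2) - \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1)\big)^T
\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}
\big((x_2 - \mu_2) - \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1)\big)
\Big]\Big)
\end{aligned}
$$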

i.e. $x_2 \mid x_1 \sim N\left(\mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1),\ \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)$.
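As an illustration of the result, the conditional mean and covariance can be computed directly from the blocks. This is a minimal sketch assuming NumPy; the function name `conditional_normal` and the bivariate numbers are made up for the example.

```python
import numpy as np

def conditional_normal(mu1, mu2, S11, S12, S21, S22, x1):
    """Return the mean and covariance of x2 | x1 for a block-normal (x1, x2).

    Hypothetical helper for illustration; assumes S11 is invertible.
    """
    # Solve linear systems rather than forming S11^{-1} explicitly.
    cond_mean = mu2 + S21 @ np.linalg.solve(S11, x1 - mu1)
    cond_cov = S22 - S21 @ np.linalg.solve(S11, S12)
    return cond_mean, cond_cov

# Bivariate example: Var(x1) = 4, Var(x2) = 1, Cov(x1, x2) = 1.2.
mu1, mu2 = np.array([0.0]), np.array([1.0])
S11, S12 = np.array([[4.0]]), np.array([[1.2]])
S21, S22 = np.array([[1.2]]), np.array([[1.0]])

mean, cov = conditional_normal(mu1, mu2, S11, S12, S21, S22, x1=np.array([2.0]))

# By hand: mean = 1 + (1.2 / 4) * 2 = 1.6, variance = 1 - 1.2**2 / 4 = 0.64.
assert np.allclose(mean, [1.6])
assert np.allclose(cov, [[0.64]])
```

In the bivariate case this reduces to the familiar regression form: the conditional mean shifts linearly with $x_1$, and the conditional variance does not depend on the observed value of $x_1$.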