Appendix
Deriving $x_2 \mid x_1$ when $(x_1 \;\, x_2)^T$ is a block-normal multivariate random variable
Recall our block normal system:
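$$
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\sim N\left(
\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix},
\begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}
\right)
$$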
Assuming that $\Sigma_{11}$ is invertible (which holds unless $x_1$ contains degenerate components), and using the marginal $x_1 \sim N(\mu_1, \Sigma_{11})$, we have
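$$
p(x_2 \mid x_1) = \frac{p(x_1, x_2)}{p(x_1)}
\propto \exp\left(-\frac{1}{2}\left[
\begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{bmatrix}^T
\begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}^{-1}
\begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{bmatrix}
- (x_1 - \mu_1)^T \Sigma_{11}^{-1} (x_1 - \mu_1)
\right]\right)
$$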
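Now, one of the expressions we may use to invert the block covariance matrix is:

$$
\begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}^{-1}
=
\begin{bmatrix}
\Sigma_{11}^{-1} + \Sigma_{11}^{-1}\Sigma_{12}\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}\Sigma_{21}\Sigma_{11}^{-1}
& -\Sigma_{11}^{-1}\Sigma_{12}\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1} \\
-\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}\Sigma_{21}\Sigma_{11}^{-1}
& \left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}
\end{bmatrix}
$$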
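As a quick sanity check on this identity (an illustration, not part of the derivation itself), consider the scalar case in which every block is a single number. The symbols $a$, $b$, $d$ and $s$ are introduced here purely for illustration: take $\Sigma_{11} = a$, $\Sigma_{12} = \Sigma_{21} = b$, $\Sigma_{22} = d$, with $a > 0$ and $ad - b^2 > 0$, and write $s = d - b^2/a = (ad - b^2)/a$ for the bracketed term. The block formula then gives

$$
\begin{aligned}
\frac{1}{a} + \frac{b^2}{a^2 s} &= \frac{(ad - b^2) + b^2}{a\,(ad - b^2)} = \frac{d}{ad - b^2}, \\
-\frac{b}{a s} &= -\frac{b}{ad - b^2}, \\
\frac{1}{s} &= \frac{a}{ad - b^2},
\end{aligned}
$$

which agrees entry by entry with the familiar $2 \times 2$ inverse

$$
\begin{bmatrix} a & b \\ b & d \end{bmatrix}^{-1}
= \frac{1}{ad - b^2}\begin{bmatrix} d & -b \\ -b & a \end{bmatrix}.
$$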
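Hence, expanding the quadratic form and completing the square in $x_2$ (using $\Sigma_{12}^T = \Sigma_{21}$),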
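$$
\begin{aligned}
p(x_2 \mid x_1) &\propto \exp\Big(-\frac{1}{2}\Big[
(x_1 - \mu_1)^T \Sigma_{11}^{-1}\Sigma_{12}\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}\Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1) \\
&\qquad\qquad - 2\,(x_1 - \mu_1)^T \Sigma_{11}^{-1}\Sigma_{12}\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}(x_2 - \mu_2) \\
&\qquad\qquad + (x_2 - \mu_2)^T \left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}(x_2 - \mu_2)
\Big]\Big) \\
&= \exp\Big(-\frac{1}{2}\Big[
\big((x_2 - \mu_2) - \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1)\big)^T
\left(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)^{-1}
\big((x_2 - \mu_2) - \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1)\big)
\Big]\Big)
\end{aligned}
$$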
i.e. $x_2 \mid x_1 \sim N\left(\mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1),\; \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)$. ◻
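The result above translates fairly directly into code. The following is a minimal NumPy sketch rather than a reference implementation: the function name `conditional_gaussian`, the argument `n1` (the number of leading coordinates that make up x1), and the example numbers are all assumptions made for illustration. It computes the conditional mean and covariance, then cross-checks the covariance against the lower-right block of the joint precision matrix, which by the block-inverse expression used in the derivation should be its inverse.

```python
import numpy as np


def conditional_gaussian(mu, Sigma, n1, x1):
    """Return the mean and covariance of x2 | x1 for a block-normal vector.

    mu and Sigma describe the joint distribution of (x1, x2); the first n1
    coordinates form x1 and the remaining ones form x2.
    """
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    mu1, mu2 = mu[:n1], mu[n1:]
    S11, S12 = Sigma[:n1, :n1], Sigma[:n1, n1:]
    S21, S22 = Sigma[n1:, :n1], Sigma[n1:, n1:]
    # Solve Sigma11 @ W = Sigma12 instead of forming Sigma11^{-1} explicitly.
    # Because Sigma11 is symmetric, W.T equals Sigma21 @ Sigma11^{-1}.
    W = np.linalg.solve(S11, S12)
    cond_mean = mu2 + W.T @ (x1 - mu1)
    cond_cov = S22 - S21 @ W
    return cond_mean, cond_cov


# Illustrative numbers only (hypothetical, not taken from the text).
mu = np.array([0.0, 1.0, -1.0])
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])
x1 = np.array([1.0])

cond_mean, cond_cov = conditional_gaussian(mu, Sigma, 1, x1)

# Cross-check: the lower-right block of the joint precision matrix is the
# inverse of the conditional covariance.
precision = np.linalg.inv(Sigma)
assert np.allclose(cond_cov, np.linalg.inv(precision[1:, 1:]))
```

Using `np.linalg.solve` avoids an explicit inverse of $\Sigma_{11}$, which is usually the better-conditioned choice; it still requires $\Sigma_{11}$ to be invertible, as assumed in the derivation.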