Chapter 4 BART
4.1 A BART version of our hierarchical trees model
We have a variable of interest for which we assume:
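$$
y_{ij} = \sum_{k=1}^{m} \overbrace{G}^{\text{Tree look up function}}(\underbrace{X_{ij}}_{\text{Covariates}}, \overbrace{T_k}^{\text{Tree structure}}, \overbrace{M_k}^{\text{Terminal node parameters}}) + \underbrace{\epsilon_{ij}}_{\text{Noise}}
$$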
for observation $i = 1, \dots, n_j$ in group $j = 1, \dots, a$. We also have that:
$$
\epsilon_{ij} \sim N(0, \tau^{-1}),
$$
where $\tau$ is the residual precision, so $\tau^{-1}$ is the residual variance. In this setting, $M_k$ will represent the terminal node parameters plus the individual group parameters for tree $k$.
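To make the sum-of-trees mean concrete, here is a minimal Python sketch of the look-up function $G$ and of $\sum_k G(X_{ij}, T_k, M_k)$. The `Stump` class, its field names, and the toy values are assumptions made for illustration only: each tree is reduced to a single split, and each terminal node stores one parameter per group to mimic the role of $M_k$.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    """Hypothetical one-split tree; not the structure used by any package."""
    feature: int      # index of the covariate used by the split
    threshold: float  # rows with x[feature] <= threshold fall in the left node
    left: dict        # group label -> group-level parameter in the left node
    right: dict       # group label -> group-level parameter in the right node


def G(x, group, tree):
    """Tree look-up: return the parameter of the terminal node x falls into."""
    node = tree.left if x[tree.feature] <= tree.threshold else tree.right
    return node[group]


def predict(x, group, trees):
    """Sum-of-trees mean, i.e. the model without the noise term."""
    return sum(G(x, group, tree) for tree in trees)


trees = [
    Stump(0, 0.5, {"a": 1.0, "b": 2.0}, {"a": -1.0, "b": 0.0}),
    Stump(1, 2.0, {"a": 0.25, "b": 0.5}, {"a": 0.75, "b": 1.5}),
]
fitted = predict([0.3, 3.0], "a", trees)  # left node of tree 1, right node of tree 2
```

Each tree contributes one group-specific value, so the fitted mean for an observation is the sum of $m$ small contributions rather than the output of a single tree.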
For a single terminal node, let:
$$
R_{ijk}^{(1)} = Y_{ij}^{(1)} - \sum_{t \neq k} G(X_{ij}^{(1)}, T_t, M_t)
$$
which represents the partial residuals for observation i, in group j, for tree k in terminal node 1. Now, let
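$$
\underset{\sim}{R_j} = \{R_{ij}, \dots, j = 1, \dots, a\}
$$
then
$$
\begin{aligned}
\underset{\sim}{R_j} &\sim N(\mu_j, \tau^{-1}), \\
\mu_j &\sim N(\mu, k_1 \tau^{-1}/m), \quad (m = \text{number of trees}) \\
\mu &\sim N(0, k_2 \tau^{-1}/m)
\end{aligned}
$$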
using the same marginalisation as for a single tree:
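$$
\underset{\sim}{R_j} \sim MVN(\mu \mathbf{1}, \tau^{-1}(k_1 m^{-1} M M^T + I)), \quad (M = \text{group model matrix})
$$
using the same trick as before and $\Psi = k_1 m^{-1} M M^T + I$:
$$
\underset{\sim}{R_j} \sim MVN(\mathbf{0}, \tau^{-1}(\Psi + k_2 m^{-1} \mathbf{1}\mathbf{1}^T)),
$$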
which is used to get the marginal distribution of a new tree. The new posterior updates will be:
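$$
\mu \mid \cdots \sim MVN\left(\frac{\mathbf{1}^T \Psi^{-1} R}{\mathbf{1}^T \Psi^{-1} \mathbf{1} + (k_2/m)^{-1}}, \; \tau^{-1}\left(\mathbf{1}^T \Psi^{-1} \mathbf{1} + (k_2/m)^{-1}\right)^{-1}\right),
$$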
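$$
\mu_j \mid \cdots \sim MVN\left(\frac{\sum_{i} R_{ij} + (k_1/m)^{-1}\mu}{n_j + (k_1/m)^{-1}}, \; \tau^{-1}\left(n_j + (k_1/m)^{-1}\right)^{-1}\right)
$$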
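The steps from partial residuals to the two conditional draws can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the implementation behind these notes: the function names are hypothetical, all observations are taken to sit in the same terminal node, and the posterior means and variances are worked out by normal-normal conjugacy from the priors $\mu_j \sim N(\mu, k_1\tau^{-1}/m)$ and $\mu \sim N(0, k_2\tau^{-1}/m)$. Because $\Psi$ is block diagonal with blocks $I + (k_1/m)\mathbf{1}\mathbf{1}^T$, the Sherman-Morrison identity gives $\Psi^{-1}\mathbf{1} = \mathbf{1}/(1 + k_1 n_j/m)$ within group $j$, so no matrix inversion is needed.

```python
import random


def partial_residuals(y, tree_preds, k):
    """R_i = y_i minus the predictions of every tree except tree k."""
    return [
        y_i - sum(p[i] for t, p in enumerate(tree_preds) if t != k)
        for i, y_i in enumerate(y)
    ]


def mu_posterior(resid_by_group, k1, k2, m, tau):
    """Mean and variance of mu | R, tau with the group effects integrated out."""
    c = k1 / m
    sizes_and_sums = [(len(r), sum(r)) for r in resid_by_group.values()]
    one_psi_one = sum(n_j / (1 + c * n_j) for n_j, _ in sizes_and_sums)  # 1' Psi^-1 1
    one_psi_r = sum(s_j / (1 + c * n_j) for n_j, s_j in sizes_and_sums)  # 1' Psi^-1 R
    denom = one_psi_one + m / k2
    return one_psi_r / denom, 1.0 / (tau * denom)


def mu_j_posterior(resid_j, mu, k1, m, tau):
    """Mean and variance of mu_j | mu, R_j, tau for one group."""
    denom = len(resid_j) + m / k1
    return (sum(resid_j) + (m / k1) * mu) / denom, 1.0 / (tau * denom)


# Toy data: three observations in one terminal node, groups a, a, b, three trees.
y = [3.0, 4.0, 3.5]
groups = ["a", "a", "b"]
tree_preds = [[1.0, 0.5, 1.0], [9.0, 9.0, 9.0], [1.0, 0.5, 0.5]]
resid = partial_residuals(y, tree_preds, k=1)  # tree 1 is the one being updated

by_group = {}
for g, r in zip(groups, resid):
    by_group.setdefault(g, []).append(r)

mu_mean, mu_var = mu_posterior(by_group, k1=3.0, k2=3.0, m=3, tau=1.0)
rng = random.Random(1)
mu = rng.gauss(mu_mean, mu_var ** 0.5)  # gauss takes a standard deviation
mu_a_mean, mu_a_var = mu_j_posterior(by_group["a"], mu, k1=3.0, m=3, tau=1.0)
mu_a = rng.gauss(mu_a_mean, mu_a_var ** 0.5)
```

Drawing $\mu$ first from its marginalised conditional and then each $\mu_j$ given $\mu$ follows the order in which the two updates are listed above.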
The update for $\tau$ will be a little different. Let $\hat{f}_{ij}$ be the overall prediction for observation $R_{ij}$ at the current iteration:
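$$
\tau \mid \cdots \sim \text{Ga}\left(\frac{n + m + 1}{2} + \alpha, \; \frac{\sum_{i,j}(y_{ij} - \hat{f}_{ij})^2}{2} + \frac{\sum_{j,k}(\mu_{jk} - \mu_k)^2}{2} + \frac{\sum_{j,k}\mu_k^2}{2} + \beta\right)
$$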
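As a small illustration of how this draw might be coded, the Python sketch below transcribes the shape and rate exactly as written above and takes the three sums of squares as precomputed inputs. The function and argument names are hypothetical, and $n$ and $m$ are passed through as given rather than interpreted. The one practical point it shows is that the expression above is in shape-rate form, while Python's `random.gammavariate` expects a shape and a scale, so the rate has to be inverted.

```python
import random


def tau_shape_rate(n, m, alpha, beta, ss_resid, ss_group, ss_mu):
    """Shape and rate of the Gamma full conditional, copied term by term.

    ss_resid: sum of (y_ij - f_hat_ij)^2
    ss_group: sum of (mu_jk - mu_k)^2
    ss_mu:    sum of mu_k^2
    """
    shape = (n + m + 1) / 2 + alpha
    rate = ss_resid / 2 + ss_group / 2 + ss_mu / 2 + beta
    return shape, rate


def draw_tau(shape, rate, rng):
    # gammavariate(alpha, beta) uses beta as a scale, hence 1 / rate.
    return rng.gammavariate(shape, 1.0 / rate)


shape, rate = tau_shape_rate(
    n=10, m=2, alpha=1.0, beta=1.0, ss_resid=4.0, ss_group=2.0, ss_mu=2.0
)
tau = draw_tau(shape, rate, random.Random(0))
```

With these toy inputs the conditional mean of $\tau$ is shape divided by rate, so a run of draws should average close to that ratio.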