Chapter 9 Algorithm to fit HBART (stump)
Algorithm type: Metropolis-within-Gibbs for a hierarchical BART model
Reason: Closed-form posteriors are available for most parameters, but not for the tree structure
Data: Target variable y, groups j=1,…,J
Result: Posterior distributions for all parameters
Initialisation:
Hyper-parameter values for $\alpha$, $\beta$, $k_2$;
Number of groups $J$;
Number of trees $P$;
Total number of nodes $N_p$;
Number of observations $N = \sum_{j=1}^{J} n_j$;
Number of iterations $I$;
Define $\boldsymbol{\mu} = \mathbf{0}$, $\mu_0$, $\tau_0$, $\mu_{0j}$ for $j = 1,\dots,J$, and $k_{01}$ as the initial parameter values
for i from 1 to I do:
for p from 1 to P do:
sample $\mu_p$ from the posterior $N\left(\dfrac{\mathbf{1}^T\Psi^{-1}R_p}{\mathbf{1}^T\Psi^{-1}\mathbf{1} + (k_2/P)^{-1}},\ \tau^{-1}\left(\mathbf{1}^T\Psi^{-1}\mathbf{1} + (k_2/P)^{-1}\right)^{-1}\right)$
- for j in 1:J do:
- sample $\mu_{jp}$ from the posterior $N\left(\dfrac{P\mu_p/k_1 + \bar{R}_{pj}\,n_j}{n_j + P/k_1},\ \tau^{-1}\left(n_j + P/k_1\right)^{-1}\right)$
- end
- for j in 1:J do:
set $R_{ijp} = y_{ij} - \sum_{t=1}^{p}\mu_{jt}$; in other words, the residuals for tree $p$ are the vector $y$ minus the sum of the $\mu_{jt}$ values up to tree $p$
end
Define $\hat{f}_{ij}$ as the current overall prediction for observation $y_{ij}$, which will be $\sum_{p \neq p^{\ast}}^{P}\mu_{jp}$, where $p^{\ast}$ represents the current tree index.
sample $\tau$ from the posterior $\mathrm{Ga}\left(\dfrac{N + JN_p + N_p}{2} + \alpha,\ \dfrac{\sum_{i=1}^{N}(y_i - \hat{f}_i)^2}{2} + \dfrac{P\sum_{j,l,p}(\mu_{j,l,p} - \mu_{l,p})^2}{2k_1} + \dfrac{P\sum_{l,p}\mu_{l,p}^2}{2k_2} + \beta\right)$
sample a candidate $k_1'$ from a Uniform$(0, u)$ proposal
calculate $p_{\text{old}}$ as the full conditional density evaluated at the current $k_1$
calculate $p_{\text{new}}$ as the full conditional density evaluated at the candidate $k_1'$
sample $U$ from a Unif$(0, 1)$ and set the acceptance ratio $a = p_{\text{new}}/p_{\text{old}}$
if $U < a$, accept the candidate $k_1'$; otherwise keep the current $k_1$
end
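As a concrete illustration, the iteration above can be sketched in Python/NumPy. This is a minimal sketch under stated assumptions, not the book's implementation: the model form ($y_{ij} = \sum_p \mu_{jp} + \varepsilon$ with $\mu_{jp} \sim N(\mu_p, k_1\tau^{-1}/P)$ and $\mu_p \sim N(0, k_2\tau^{-1}/P)$), the block-diagonal structure of $\Psi$, the Uniform$(0, u)$ prior for $k_1$, and the function name `fit_stump_hbart` are all assumptions not fully spelled out in the algorithm box.

```python
import numpy as np

rng = np.random.default_rng(42)

def fit_stump_hbart(y, groups, J, P=5, n_iter=200,
                    alpha=1.0, beta=1.0, k2=1.0, u=5.0):
    """Metropolis-within-Gibbs sampler for a stump-only HBART model (sketch).

    Assumed model:
        y_ij = sum_p mu_jp + e_ij,    e_ij ~ N(0, tau^-1)
        mu_jp | mu_p ~ N(mu_p, k1 tau^-1 / P)
        mu_p        ~ N(0,   k2 tau^-1 / P)
    """
    N = len(y)
    n_j = np.bincount(groups, minlength=J)   # group sizes n_j
    mu = np.zeros(P)                          # overall stump means mu_p
    mu_j = np.zeros((J, P))                   # group-level means mu_jp
    tau, k1 = 1.0, 1.0                        # initial values (assumed)
    keep = {"mu": [], "tau": [], "k1": []}

    for _ in range(n_iter):
        for p in range(P):
            # Partial residuals for tree p: y minus all other trees.
            R = y - (mu_j[groups].sum(axis=1) - mu_j[groups, p])
            # Psi = I + (k1/P) Z Z^T is block-diagonal by group, so
            # 1^T Psi^-1 1 and 1^T Psi^-1 R have per-group closed forms.
            w = n_j / (1.0 + (k1 / P) * n_j)
            Rbar = np.bincount(groups, weights=R, minlength=J) / n_j
            prec = w.sum() + P / k2           # 1^T Psi^-1 1 + (k2/P)^-1
            mu[p] = rng.normal((w * Rbar).sum() / prec,
                               np.sqrt(1.0 / (tau * prec)))
            # Group-level means: conditionally independent across groups.
            prec_j = n_j + P / k1
            mean_j = (P * mu[p] / k1 + n_j * Rbar) / prec_j
            mu_j[:, p] = rng.normal(mean_j, np.sqrt(1.0 / (tau * prec_j)))

        # Full conditional for tau (Gamma, shape/rate form); with one node
        # per tree the total node count N_p equals P.
        fhat = mu_j[groups].sum(axis=1)
        shape = (N + J * P + P) / 2 + alpha
        rate = (np.sum((y - fhat) ** 2) / 2
                + P * np.sum((mu_j - mu) ** 2) / (2 * k1)
                + P * np.sum(mu ** 2) / (2 * k2) + beta)
        tau = rng.gamma(shape, 1.0 / rate)

        # Metropolis step for k1 with a Uniform(0, u) proposal (assumed flat
        # prior on (0, u), so the ratio only involves the likelihood terms).
        def log_cond(k):
            return (-J * P / 2 * np.log(k)
                    - tau * P * np.sum((mu_j - mu) ** 2) / (2 * k))
        k1_new = rng.uniform(0, u)
        if np.log(rng.uniform()) < log_cond(k1_new) - log_cond(k1):
            k1 = k1_new

        keep["mu"].append(mu.copy())
        keep["tau"].append(tau)
        keep["k1"].append(k1)
    return keep
```

The accept/reject comparison is done on the log scale, which is the numerically stable way to implement the $U < a$ test from the last step of the algorithm.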