# Chapter 4 BART

## 4.1 A BART version of our hierarchical trees model

We have a variable of interest for which we assume:

$$$y_{ij} = \sum_{k = 1}^{m} \overbrace{\mathbb{G}}^\text{Tree look up function}(\underbrace{X_{ij}}_\text{Covariates}, \overbrace{T_{k}}^\text{Tree structure}, \overbrace{M_{k}}^\text{Terminal node parameters}) + \underbrace{\epsilon_{ij}}_\text{Noise}$$$

for observation $$i = 1, \dots, n_j$$ in group $$j = 1, \dots, a$$. We also have that:

$$$\epsilon_{ij} \sim N(0, \tau^{-1}),$$$

where $$\tau$$ is the residual precision (so $$\tau^{-1}$$ is the residual variance). In this setting, $$M_{k}$$ collects the terminal node parameters together with the individual group parameters for tree $$k$$.
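To make the sum-of-trees structure concrete, here is a minimal simulation sketch. It assumes every tree is a single split (a stump) and that $$M_k$$ is stored as a `(2, a)` array holding one parameter per terminal node and group. The names `tree_lookup`, `sum_of_trees`, `splits` and `params` are illustrative only, and passing the group index as a separate argument is a convenience of the sketch rather than part of the notation above.

```python
import numpy as np

def tree_lookup(X, group, split, node_params):
    """Stand-in for G(X, T_k, M_k) when tree k is a single split.

    split       : (feature index, threshold), playing the role of T_k
    node_params : array of shape (2, a); entry [node, j] is the parameter
                  of group j in terminal node `node`, playing the role of M_k
    """
    feature, threshold = split
    node = (X[:, feature] > threshold).astype(int)  # 0 = left, 1 = right
    return node_params[node, group]

def sum_of_trees(X, group, splits, params):
    """Sum the contributions of all m trees for every observation."""
    return sum(tree_lookup(X, group, s, p) for s, p in zip(splits, params))

rng = np.random.default_rng(1)
n, a, m, tau = 120, 3, 4, 4.0          # observations, groups, trees, precision
X = rng.uniform(size=(n, 2))
group = rng.integers(0, a, size=n)
splits = [(int(rng.integers(0, 2)), float(rng.uniform(0.2, 0.8))) for _ in range(m)]
params = [rng.normal(scale=0.5, size=(2, a)) for _ in range(m)]

f = sum_of_trees(X, group, splits, params)
y = f + rng.normal(scale=tau ** -0.5, size=n)  # noise variance is 1 / tau
```

Real BART trees are deeper than one split; the stump is only there to keep the lookup readable.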

For a single terminal node, let:

$$$R_{ijk1} = Y_{ij}^{(1)} - \sum_{t \neq k} \mathbb{G}(X_{ij}^{(1)}, T_{t}, M_{t})$$$

which represents the partial residual for observation $$i$$ in group $$j$$, for tree $$k$$, in terminal node 1. Now, let

$$$\underset{\sim}{R_j} = \{R_{ij}, \; i = 1, \dots, n_j \}, \quad j = 1,\dots, a$$$

then

$$$\underset{\sim}{R_j} \sim N(\mu_j, \tau^{-1}), \\ \mu_j \sim N(\mu, k_1\tau^{-1}/m), \text{(m = number of trees)} \\ \mu \sim N(0, k_2 \tau^{-1}/m)\\$$$
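As an illustration, the three-level hierarchy can be simulated from the top down. This is only a sketch for one terminal node: it reads the second argument of each normal distribution as a variance, and `simulate_node` and `group_sizes` are names introduced here.

```python
import numpy as np

def simulate_node(rng, group_sizes, k1, k2, m, tau):
    """Draw mu, the group means mu_j, and the residuals R_ij for one node."""
    mu = rng.normal(0.0, np.sqrt(k2 / (tau * m)))            # mu ~ N(0, k2 / (tau m))
    mu_j = rng.normal(mu, np.sqrt(k1 / (tau * m)),
                      size=len(group_sizes))                 # mu_j ~ N(mu, k1 / (tau m))
    groups = np.repeat(np.arange(len(group_sizes)), group_sizes)
    R = rng.normal(mu_j[groups], np.sqrt(1.0 / tau))         # R_ij ~ N(mu_j, 1 / tau)
    return mu, mu_j, groups, R

rng = np.random.default_rng(0)
mu, mu_j, groups, R = simulate_node(rng, [5, 3, 4], k1=2.0, k2=3.0, m=5, tau=2.0)
```

Under these assumptions a single residual has marginal variance $$(1 + k_1/m + k_2/m)/\tau$$, which is a quick way to sanity-check a simulation.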

using the same marginalisation as for a single tree:

$$$\underset{\sim}{R_j} \sim MVN(\mu \mathbf{1}, \tau^{-1} (k_1 m^{-1} MM^{T} + \mathbb{I})), \text{(M = group model matrix, m = number of trees)}\\ \text{using the same trick as before and } \Psi = k_1 m^{-1} MM^{T} + \mathbb{I}: \\ \underset{\sim}{R_j} \sim MVN(0, \tau^{-1} (\Psi + k_2 m^{-1} \mathbf{1}\mathbf{1}^{T})),$$$
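The marginal covariance can be assembled directly from a group indicator matrix. The sketch below reads the scaling factor on $$k_1$$ and $$k_2$$ as $$1/m$$ (the number of trees), in line with the prior variances $$k_1\tau^{-1}/m$$ and $$k_2\tau^{-1}/m$$; the function names are illustrative.

```python
import numpy as np

def marginal_covariance(groups, n_groups, k1, k2, m, tau):
    """Return Psi and the marginal covariance of the node residuals."""
    n = len(groups)
    M = np.zeros((n, n_groups))
    M[np.arange(n), groups] = 1.0            # one-hot group model matrix
    psi = (k1 / m) * M @ M.T + np.eye(n)     # mu_j integrated out
    cov = (psi + (k2 / m) * np.ones((n, n))) / tau  # mu integrated out as well
    return psi, cov

def log_marginal(R, cov):
    """Log density of a zero-mean multivariate normal evaluated at R."""
    _, logdet = np.linalg.slogdet(cov)
    quad = R @ np.linalg.solve(cov, R)
    return -0.5 * (len(R) * np.log(2 * np.pi) + logdet + quad)

psi_demo, cov_demo = marginal_covariance(np.array([0, 0, 1]), 2,
                                         k1=2.0, k2=3.0, m=1, tau=1.0)
```

Summing `log_marginal` over terminal nodes gives the quantity needed to compare a proposed tree with the current one. For large nodes it is cheaper to exploit the block structure of $$\Psi$$ than to form the full matrix.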

which is used to get the marginal distribution of a new tree. The new posterior updates will be:

$$$\mu | \dots \sim N\left( \frac{\mathbf{1}^{T} \Psi^{-1} R }{\mathbf{1}^{T} \Psi^{-1} \mathbf{1} + m k_2^{-1}}, \; \tau^{-1} (\mathbf{1}^{T} \Psi^{-1} \mathbf{1} + m k_2^{-1})^{-1} \right),$$$

$$$\mu_j | \dots \sim N\left( \frac{m k_1^{-1} \mu + \sum_{i = 1}^{n_j} R_{ij}}{n_j + m k_1^{-1}}, \; \tau^{-1} (n_j + m k_1^{-1})^{-1} \right)$$$
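A minimal sketch of the two conjugate updates for a single terminal node. It assumes the priors stated earlier, $$\mu_j \sim N(\mu, k_1\tau^{-1}/m)$$ and $$\mu \sim N(0, k_2\tau^{-1}/m)$$, so the prior precisions enter as $$m/k_1$$ and $$m/k_2$$ after factoring out $$\tau$$. The function names are hypothetical.

```python
import numpy as np

def mu_posterior(R, psi, k2, m, tau):
    """Posterior mean and variance of mu given R ~ MVN(mu 1, Psi / tau)."""
    ones = np.ones(len(R))
    psi_inv_one = np.linalg.solve(psi, ones)        # Psi^{-1} 1
    precision_scale = ones @ psi_inv_one + m / k2   # 1' Psi^{-1} 1 + m / k2
    mean = (psi_inv_one @ R) / precision_scale      # equals 1' Psi^{-1} R (Psi symmetric)
    var = 1.0 / (tau * precision_scale)
    return mean, var

def mu_j_posterior(R_j, mu, k1, m, tau):
    """Posterior mean and variance of one group mean given its residuals."""
    precision_scale = len(R_j) + m / k1             # n_j + m / k1
    mean = (R_j.sum() + mu * m / k1) / precision_scale
    var = 1.0 / (tau * precision_scale)
    return mean, var

rng = np.random.default_rng(0)
mean_j, var_j = mu_j_posterior(np.array([1.0, 3.0]), mu=0.0, k1=1.0, m=1, tau=2.0)
mu_j_draw = rng.normal(mean_j, np.sqrt(var_j))      # one Gibbs draw for mu_j
```

The group mean is pulled towards $$\mu$$ more strongly when $$n_j$$ is small or when $$m/k_1$$ is large.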

The update for $$\tau$$ will be a little different. Let $$\hat f_{ij}$$ be the overall prediction for observation $$y_{ij}$$ at the current iteration:

$$$\tau | \dots \sim Ga( \frac{n + m + 1}{2} + \alpha, \frac{\sum_{i, j} (y_{ij} - \hat f_{ij})^2}{2} + \frac{\sum_{j, k} (\mu_{jk} - \mu_{k})^2}{2} + \frac{\sum_{j, k} \mu_{k}^2}{2} + \beta )$$$
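The draw for $$\tau$$ can be sketched as follows. The code follows the expression exactly as written, including the double sum over $$j$$ and $$k$$ in the last quadratic term, and assumes $$Ga(\cdot, \cdot)$$ is parameterised by shape and rate. The arrays `mu_jk` (shape `(a, K)`) and `mu_k` (shape `(K,)`) are a storage layout chosen for this example.

```python
import numpy as np

def tau_update_params(y, f_hat, mu_jk, mu_k, m, alpha, beta):
    """Shape and rate of the Gamma full conditional for tau, as written above."""
    n = len(y)
    a = mu_jk.shape[0]                               # number of groups
    shape = (n + m + 1) / 2 + alpha
    rate = (0.5 * np.sum((y - f_hat) ** 2)           # residual sum of squares
            + 0.5 * np.sum((mu_jk - mu_k) ** 2)      # group means around mu_k
            + 0.5 * a * np.sum(mu_k ** 2)            # sum over j and k of mu_k^2
            + beta)
    return shape, rate

def draw_tau(rng, shape, rate):
    """NumPy's gamma sampler takes a scale argument, so pass 1 / rate."""
    return rng.gamma(shape, 1.0 / rate)

shape_demo, rate_demo = tau_update_params(
    y=np.array([1.0, 2.0]), f_hat=np.zeros(2),
    mu_jk=np.eye(2), mu_k=np.zeros(2), m=2, alpha=1.0, beta=0.5)
tau_draw = draw_tau(np.random.default_rng(0), shape_demo, rate_demo)
```

Mixing up rate and scale is a common source of error here, which is why the conversion is isolated in `draw_tau`.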