# Chapter 7 Model Comparison with IS^{2}

During this tutorial, we have referred to several model comparison examples, and the importance of model comparison is clear. However, so far we have only used criterion-based methods, namely DIC and BPIC, which are simple to compute. In this section, we work through further model comparison examples and introduce Importance Sampling Squared (IS^{2}), a method for estimating the marginal likelihood of a model. Below, we discuss why marginal likelihood estimates (MLEs) are useful, how they are calculated, and why priors matter for them.

By estimating the marginal likelihood of a model given data, we obtain an unbiased likelihood estimate for the model, which integrates over the prior space. This means that both model fit and model complexity are accounted for. For example, if we compare a model with a large number of parameters to a model with fewer parameters, the complex model is implicitly “penalised” because its prior space is also larger, which can limit the likelihood. However, if the model with fewer parameters fits the data poorly, or its parameter estimates are unlikely given the prior, then the complex model can still win, provided the extra complexity was necessary.

To obtain the marginal likelihood estimate, we use importance sampling at both the subject and the group level. For this IS^{2} procedure, we first construct a proposal distribution (a multivariate Student's t-distribution) of all the estimated cells. On each iteration, we draw a proposal from this distribution, which informs the hyper (group) level; we then use this draw for importance sampling at the individual level (i.e., separately for each individual). After sampling, we obtain a vector of marginal likelihood samples, on which we can make inferences.
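To make the idea of estimating a marginal likelihood by importance sampling concrete, here is a minimal sketch in base R. It is not the IS^{2} algorithm itself: it uses a single level only, and a toy model chosen for illustration (normal data with known unit variance and a standard normal prior on the mean) so that the estimate can be checked against the exact answer. The proposal is a shifted and scaled Student's t-distribution centred on the posterior, and all object names are hypothetical.

```r
set.seed(1)

# Toy data (assumption for illustration): y_i ~ N(theta, 1), prior theta ~ N(0, 1)
n <- 20
y <- rnorm(n, mean = 0.5, sd = 1)

# Exact log marginal likelihood for this conjugate toy model
log_ml_exact <- -n / 2 * log(2 * pi) - 0.5 * log(1 + n) -
  0.5 * (sum(y^2) - sum(y)^2 / (1 + n))

# Proposal: heavy-tailed t-distribution centred on the (known) posterior
post_mean <- sum(y) / (n + 1)
post_sd   <- sqrt(1 / (n + 1))
t_df      <- 3
n_samples <- 10000
theta <- post_mean + post_sd * rt(n_samples, df = t_df)

# Log density of each draw under the proposal (change of variables for the scaling)
log_q <- dt((theta - post_mean) / post_sd, df = t_df, log = TRUE) - log(post_sd)

# Log likelihood and log prior for each draw
log_lik   <- sapply(theta, function(th) sum(dnorm(y, mean = th, sd = 1, log = TRUE)))
log_prior <- dnorm(theta, mean = 0, sd = 1, log = TRUE)

# Importance weights on the log scale, averaged with the log-mean-exp trick
log_w     <- log_lik + log_prior - log_q
log_ml_is <- max(log_w) + log(mean(exp(log_w - max(log_w))))

print(c(exact = log_ml_exact, importance_sampling = log_ml_is))
```

In this sketch the weights are averaged on the log scale to avoid numerical underflow. IS^{2} applies the same principle twice, once at the group level and once for each individual.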

## 7.1 IS^{2}

Let’s look at two examples of IS^{2} in action. The first compares five different LBA models, and the second compares the best-fitting models from the DDM, LBA and RDM. It is important to keep the research question in mind when thinking about model comparison. In the first example, we are most concerned with model inclusion effects, that is, which parameters are most important to the model. In the second example, we are most concerned with which model best describes the data. The first example has the most direct scientific relevance, as it allows us to answer a question that we may have had since the outset of the design. The second example is more theoretically relevant: the question is less about a manipulation and more about the theoretical account (model) of the data.

### 7.1.1 Comparing LBAs

The code to run IS^{2} is simple to follow; it is shown below, after a note on run time.

**Note:** Running this code requires a significant amount of computing power (the example uses 32 CPU cores) and may take several hours, depending on the computer. If you are following along with the code, we recommend loading the supplied examples instead. To do so, save the samples from OSF to your desktop and use:

`load("Desktop/osfstorage-archive/IS2/is2_lba_B.RData")`

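```
rm(list=ls())
source("emc/emc.R")
source("models/RACE/LBA/lbaB.R")

# Number of CPU cores used for the IS2 run
n_cores <- 32

# Load the previously fitted samples; print() shows the names of the loaded objects
print(load("~/Desktop/emc2/models/RACE/LBA/examples/samples/sPNAS_B.RData"))

# Run IS2 and keep the `finished` element of the result
is2_lba_B <- run_IS2(sPNAS_B, n_cores = n_cores, IS_samples = 30*n_cores,
                     max_particles = 5000)$finished
save(is2_lba_B, file="is2_lba_B.RData")
```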

Evidently, only one function needs to be called to run IS^{2}: `run_IS2`. Its arguments are the sampled object, the number of cores (`n_cores`), the number of importance samples (`IS_samples`) and the maximum number of particles (`max_particles`). Here, particles refer to the subject-level importance sampling. A maximum is set because we include an algorithm that gradually increases the number of particles to try to reach the ideal efficiency. The number of importance samples relates to the group level, and the final output vector has that length. For this, we recommend x and y.

```
setwd("~/")
load("~/Desktop/emc2/models/RACE/LBA/examples/samples/is2_lba_B.RData")
median(is2_lba_B)
```

`## [1] 7393.085`

```
load("~/Desktop/emc2/models/RACE/LBA/examples/samples/is2_lba_Bvt0.RData")
median(is2_lba_Bvt0)
```

`## [1] 7436.502`

```
load("~/Desktop/emc2/models/RACE/LBA/examples/samples/is2_lba_Bt0_sv.RData")
median(is2_lba_Bt0_sv)
```

`## [1] 8024.084`

```
load("~/Desktop/emc2/models/RACE/LBA/examples/samples/is2_lba_Bv_sv.RData")
median(is2_sPNAS_Bv_sv)
```

`## [1] 8123.95`

```
load("~/Desktop/emc2/models/RACE/LBA/examples/samples/is2_lba_Bvt0_sv_NOa_n.RData")
median(is2_lba_Bvt0_sv_NOa_n)
```

`## [1] 8166.1`

From this output we see…
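One way to compare the five medians printed above is to collect them in a named vector and look at the differences between models. The sketch below assumes that these values are on the log marginal likelihood scale (so that larger is better and differences are log Bayes factors) and that all models have equal prior probability; the vector names are shortened from the file names and are used for illustration only.

```r
# Medians of the IS2 samples, copied from the output above
# (assumed here to be log marginal likelihoods)
log_ml <- c(B             = 7393.085,
            Bvt0          = 7436.502,
            Bt0_sv        = 8024.084,
            Bv_sv         = 8123.95,
            Bvt0_sv_NOa_n = 8166.1)

# Rank the models from largest to smallest value
sorted <- sort(log_ml, decreasing = TRUE)

# Log Bayes factor of the top-ranked model against each model
log_bf <- max(log_ml) - sorted

# Posterior model probabilities under equal prior model probabilities,
# computed relative to the maximum to avoid overflow
post_prob <- exp(log_ml - max(log_ml))
post_prob <- post_prob / sum(post_prob)

print(log_bf)
print(round(post_prob, 3))
```

Working with differences relative to the largest value keeps the exponentials in a safe numerical range, which matters here because the raw values are in the thousands.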