6 Testing models and methods

6.0.1 Objectives

  • Understand the distinction between model fit and model adequacy
  • Identify and avoid pitfalls in evaluating methods
  • Be able to identify methods that have been tested well

6.0.2 Model fit and accuracy

When we use models to understand biology, it helps if they are appropriate for the data. Most importantly, an appropriate model gives meaningful parameter estimates. Suppose the true process is one of constant diversification rates except for a single pulse of extinction at the KT boundary, and the data sample only 25% of current diversity. We could still fit a logistic diversification model, and it would give us an estimate of carrying capacity, perhaps even complete with uncertainty, but in reality there is no carrying capacity. If the question were simply one of comparing models, such as testing whether a logistic or a Yule model fits the data better, we would get an answer, but it would not help us understand reality: neither model is correct in this case.

Table 1: Results from simulating a 2000-taxon tree under a pure-birth model plus one mass extinction, then sampling tips at random down to a 500-taxon tree.
Model     deltaAIC  birth.rate  carrying.capacity
Yule        60.385       0.044                 NA
Logistic     0.000       0.064               1000

In the example above, the best-fitting model is logistic growth, with a carrying capacity of 1000. However, remember that the full tree had 2000 tips (they were subsampled to get the 500-taxon observed tree), so the clade had already reached twice the estimated carrying capacity. Neither the model nor the parameter estimate is right, so this exercise would tell us little about biology. It is likely publishable.
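To see how decisive this misleading comparison looks, the deltaAIC values in Table 1 can be converted to Akaike weights, which rescale each model's relative likelihood, exp(-deltaAIC / 2), so that the weights sum to one. The sketch below is only an illustration: the function name `akaike_weights` is our own, and the only inputs are the two deltaAIC values from the table.

```python
import math


def akaike_weights(delta_aic):
    """Convert a dict of {model: deltaAIC} into Akaike weights.

    Each weight is exp(-deltaAIC / 2), normalized so the weights sum to 1.
    """
    rel_lik = {model: math.exp(-0.5 * d) for model, d in delta_aic.items()}
    total = sum(rel_lik.values())
    return {model: value / total for model, value in rel_lik.items()}


# deltaAIC values taken from Table 1.
delta_aic = {"Yule": 60.385, "Logistic": 0.0}
weights = akaike_weights(delta_aic)

for model, w in weights.items():
    print(f"{model}: Akaike weight = {w:.3g}")
```

Nearly all of the weight falls on the logistic model, even though we know it did not generate the data. Strong relative support among the candidate models says nothing about whether any of them is adequate.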

There are thus three questions to answer when thinking about models:

  1. Are the approximations in my models biologically reasonable?
  2. Which model(s) fit best?
  3. Are my models adequate?
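One common way to approach the third question is by simulation: generate many datasets under the fitted model, compute a summary statistic on each, and ask whether the observed value looks like a typical draw. The following is a minimal sketch under strong assumptions, not a recommended test. It uses a Yule model with the birth rate from Table 1 (0.044), takes the time needed to grow from 2 to 500 lineages as the summary statistic, and compares it with an observed tree age of 60 time units, which is a purely hypothetical value chosen for illustration. The function names are our own.

```python
import random


def simulate_yule_age(birth_rate, n_tips, rng):
    """Time for a pure-birth (Yule) process to grow from 2 to n_tips lineages."""
    t = 0.0
    for k in range(2, n_tips):
        # With k lineages alive, the wait until the next speciation
        # event is exponential with rate k * birth_rate.
        t += rng.expovariate(k * birth_rate)
    return t


def adequacy_tail_prob(observed, simulated):
    """Two-sided proportion of simulated values at least as extreme as observed."""
    n = len(simulated)
    lower = sum(s <= observed for s in simulated) / n
    upper = sum(s >= observed for s in simulated) / n
    return min(1.0, 2 * min(lower, upper))


rng = random.Random(1)  # fixed seed so the sketch is repeatable
sims = [simulate_yule_age(0.044, 500, rng) for _ in range(200)]

observed_age = 60.0  # hypothetical observed value, not from any real tree
p = adequacy_tail_prob(observed_age, sims)
print(f"mean simulated age: {sum(sims) / len(sims):.1f}")
print(f"tail probability for observed age: {p:.3f}")
```

A small tail probability would suggest that the fitted model cannot reproduce this aspect of the data, however well it compares with the alternatives. In practice one would choose summary statistics that capture the features of the data the biological question depends on.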