8.6 Summary

I created four classification trees using the ISLR::OJ data set and four regression trees using the ISLR::Carseats data set. Let’s compare their performance.

8.6.1 Classification Trees

The caret::resamples() function summarizes the resampling performance on the final model produced in train(). It creates summary statistics (mean, min, max, etc.) for each performance metric (ROC, RMSE, etc.) for a list of models.

oj_resamples <- resamples(list("Single Tree (caret)" = oj_mdl_cart2,
                               "Bagged" = oj_mdl_bag,
                               "Random Forest" = oj_mdl_rf,
                               "GBM" = oj_mdl_gbm,
                               "XGBoost" = oj_mdl_xgb))
summary(oj_resamples)

## 
## Call:
## summary.resamples(object = oj_resamples)
## 
## Models: Single Tree (caret), Bagged, Random Forest, GBM, XGBoost 
## Number of resamples: 10 
## 
## ROC 
##                     Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## Single Tree (caret) 0.74    0.84   0.86 0.85    0.88 0.92    0
## Bagged              0.79    0.84   0.86 0.86    0.88 0.89    0
## Random Forest       0.80    0.87   0.88 0.87    0.89 0.91    0
## GBM                 0.82    0.87   0.91 0.89    0.91 0.94    0
## XGBoost             0.82    0.88   0.90 0.89    0.92 0.94    0
## 
## Sens 
##                     Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## Single Tree (caret) 0.77    0.85   0.85 0.86    0.89 0.92    0
## Bagged              0.77    0.83   0.83 0.84    0.85 0.92    0
## Random Forest       0.79    0.83   0.83 0.84    0.85 0.92    0
## GBM                 0.79    0.85   0.87 0.86    0.88 0.88    0
## XGBoost             0.81    0.87   0.89 0.88    0.90 0.92    0
## 
## Spec 
##                     Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## Single Tree (caret) 0.61    0.63   0.76 0.73    0.82 0.85    0
## Bagged              0.62    0.67   0.71 0.72    0.76 0.82    0
## Random Forest       0.58    0.65   0.74 0.72    0.81 0.85    0
## GBM                 0.61    0.68   0.79 0.76    0.84 0.91    0
## XGBoost             0.61    0.64   0.78 0.75    0.82 0.88    0

The mean ROC value is the performance measure I used to evaluate the models in train() (you can compare the Mean column to each model’s object (e.g., print(oj_mdl_cart2))). The best performing model on resamples based on the mean ROC score was XGBoost. It also had the highest mean sensitivity. GBM had the highest specificity. Here is a box plot of the distributions.

bwplot(oj_resamples)

One way to evaluate the box plots is with a post-hoc test of differences. The single tree was about the same as the random forest and bagged models. GBM and XGBoost were also statistically equivalent.

(oj_diff <- diff(oj_resamples))

## 
## Call:
## diff.resamples(x = oj_resamples)
## 
## Models: Single Tree (caret), Bagged, Random Forest, GBM, XGBoost 
## Metrics: ROC, Sens, Spec 
## Number of differences: 10 
## p-value adjustment: bonferroni

# summary(oj_diff)
dotplot(oj_diff)

8.6.2 Regression Trees

Here is the summary of resampling metrics for the Carseats models.

cs_resamples <- resamples(list("Single Tree (caret)" = cs_mdl_cart2,
                               "Bagged" = cs_mdl_bag,
                               "Random Forest" = cs_mdl_rf,
                               "GBM" = cs_mdl_gbm,
                               "XGBoost" = cs_mdl_xgb))
summary(cs_resamples)

## 
## Call:
## summary.resamples(object = cs_resamples)
## 
## Models: Single Tree (caret), Bagged, Random Forest, GBM, XGBoost 
## Number of resamples: 10 
## 
## MAE 
##                     Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## Single Tree (caret) 1.37    1.59   1.73 1.70    1.80  2.1    0
## Bagged              0.93    1.13   1.35 1.34    1.57  1.7    0
## Random Forest       0.85    1.04   1.26 1.21    1.39  1.5    0
## GBM                 0.72    0.86   0.92 0.94    1.04  1.1    0
## XGBoost             0.66    0.79   0.90 0.88    0.96  1.1    0
## 
## RMSE 
##                     Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## Single Tree (caret) 1.69    1.90    2.0  2.1     2.2  2.5    0
## Bagged              1.21    1.41    1.7  1.7     2.0  2.1    0
## Random Forest       1.07    1.28    1.6  1.5     1.7  1.9    0
## GBM                 0.93    1.08    1.2  1.2     1.3  1.4    0
## XGBoost             0.86    0.98    1.1  1.1     1.2  1.3    0
## 
## Rsquared 
##                     Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## Single Tree (caret) 0.39    0.50   0.52 0.50    0.53 0.59    0
## Bagged              0.53    0.63   0.68 0.68    0.73 0.78    0
## Random Forest       0.68    0.70   0.76 0.75    0.80 0.83    0
## GBM                 0.82    0.83   0.85 0.85    0.86 0.87    0
## XGBoost             0.84    0.86   0.87 0.86    0.87 0.88    0

The best performing model on resamples based on the mean RMSE score was XGBoost. It also had the lowest mean absolute error (MAE) and highest R-squared. Here is a box plot of the distributions.

bwplot(cs_resamples)

The post-hoc test indicates all models differed from each other except for GBM and XGBoost.

(cs_diff <- diff(cs_resamples))

## 
## Call:
## diff.resamples(x = cs_resamples)
## 
## Models: Single Tree (caret), Bagged, Random Forest, GBM, XGBoost 
## Metrics: MAE, RMSE, Rsquared 
## Number of differences: 10 
## p-value adjustment: bonferroni

# summary(cs_diff)
dotplot(cs_diff)