5 FITTING RANDOM FOREST
See GitHub (JohnAtMill) for the code.
varImpPlot(fit.rf) ##Variable Importance Plot
importance(fit.rf)
##              0          1 MeanDecreaseAccuracy MeanDecreaseGini
## Age   6.628537  2.6221908             6.396106        4.0651846
## Sex   2.126077  0.1640683             2.113024        0.5217098
## ALB   4.050238  4.3900099             5.903508        4.5770239
## ALP  17.049621 20.0925066            21.759320       14.2668463
## ALT  22.306967 15.0344834            22.939803       16.8378335
## AST  23.730627 31.1180542            30.953254       28.3869541
## BIL   6.958578 17.0192625            16.788004        8.9267996
## CHE  11.300715  6.0612761            12.054497        7.6879541
## CHOL  5.891190  5.7992778             8.245351        6.1280041
## CREA  9.771611  4.3855076            10.316794        4.9983135
## GGT  12.588276 16.2410833            18.945824       11.3569600
## PROT 11.744644  0.9581044            10.933361        5.7894273
The importance ranking shows that AST, ALT, and ALP are, in that order, the three most important variables in the random forest, under both MeanDecreaseAccuracy and MeanDecreaseGini.
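The ranking can be checked directly from the importance matrix. The following is a minimal sketch that re-enters by hand the MeanDecreaseAccuracy and MeanDecreaseGini columns printed above and sorts them. The object name `imp` is hypothetical; with the fitted model available, `importance(fit.rf)` would supply the same matrix.

```r
# Importance values copied from the importance(fit.rf) output above.
# `imp` is a hypothetical stand-in for the matrix returned by importance(fit.rf).
vars <- c("Age", "Sex", "ALB", "ALP", "ALT", "AST",
          "BIL", "CHE", "CHOL", "CREA", "GGT", "PROT")
imp <- cbind(
  MeanDecreaseAccuracy = c(6.396106, 2.113024, 5.903508, 21.759320,
                           22.939803, 30.953254, 16.788004, 12.054497,
                           8.245351, 10.316794, 18.945824, 10.933361),
  MeanDecreaseGini = c(4.0651846, 0.5217098, 4.5770239, 14.2668463,
                       16.8378335, 28.3869541, 8.9267996, 7.6879541,
                       6.1280041, 4.9983135, 11.3569600, 5.7894273)
)
rownames(imp) <- vars

# Rank variables from most to least important under each measure.
rank_acc  <- rownames(imp)[order(imp[, "MeanDecreaseAccuracy"], decreasing = TRUE)]
rank_gini <- rownames(imp)[order(imp[, "MeanDecreaseGini"], decreasing = TRUE)]

head(rank_acc, 3)   # top three by permutation (accuracy) importance
head(rank_gini, 3)  # top three by Gini importance
```

Comparing the two rankings is a quick way to see whether the permutation-based and Gini-based measures agree on the leading predictors.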
Average.Auc.rf <- print(paste("Average of AUC is ", mean(err_vec1)))
## [1] "Average of AUC is 0.984390558461209"
Average.mis.rf <- print(paste("Average of Miss is ", mean(missclass.rate)))
## [1] "Average of Miss is 0.0300734884932523"
AUC.RF <- mean(err_vec1)
miss.rate.RF <- mean(missclass.rate)
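The code that fills `err_vec1` and `missclass.rate` is not shown in this section. As a rough sketch only, assuming each element holds the AUC or misclassification rate from one cross-validation fold, the per-fold metrics could be computed in base R as follows. The helper names `auc_rank` and `misclass_rate`, the 0.5 cutoff, and the toy labels and scores are all hypothetical; they are not the author's code or data.

```r
# Hypothetical helpers; not the author's implementation.

# AUC via the rank-sum (Mann-Whitney) identity: the probability that a
# randomly chosen positive case scores above a randomly chosen negative case.
auc_rank <- function(y, score) {
  r <- rank(score)                 # average ranks handle tied scores
  n_pos <- sum(y == 1)
  n_neg <- sum(y == 0)
  (sum(r[y == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Share of observations whose thresholded prediction differs from the label.
misclass_rate <- function(y, score, cutoff = 0.5) {
  mean(as.integer(score > cutoff) != y)
}

# Toy fold: made-up labels and predicted probabilities, for illustration only.
y     <- c(0, 0, 0, 1, 1, 0, 1, 1)
score <- c(0.10, 0.30, 0.60, 0.80, 0.40, 0.20, 0.90, 0.70)

fold_auc  <- auc_rank(y, score)
fold_miss <- misclass_rate(y, score)
```

Collecting `fold_auc` and `fold_miss` across folds into two vectors and calling `mean()` on each would give averages of the kind reported above.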
Note 3: Results are similar to those of the logistic model.
Random Forest AUC: 0.98, Misclassification Rate: 0.027