5 FITTING RANDOM FOREST

See GitHub (JohnAtMill) for the code.

varImpPlot(fit.rf)    # variable importance plot

importance(fit.rf)  
##              0          1 MeanDecreaseAccuracy MeanDecreaseGini
## Age   6.628537  2.6221908             6.396106        4.0651846
## Sex   2.126077  0.1640683             2.113024        0.5217098
## ALB   4.050238  4.3900099             5.903508        4.5770239
## ALP  17.049621 20.0925066            21.759320       14.2668463
## ALT  22.306967 15.0344834            22.939803       16.8378335
## AST  23.730627 31.1180542            30.953254       28.3869541
## BIL   6.958578 17.0192625            16.788004        8.9267996
## CHE  11.300715  6.0612761            12.054497        7.6879541
## CHOL  5.891190  5.7992778             8.245351        6.1280041
## CREA  9.771611  4.3855076            10.316794        4.9983135
## GGT  12.588276 16.2410833            18.945824       11.3569600
## PROT 11.744644  0.9581044            10.933361        5.7894273

The importance ranking shows that AST, ALT, and ALP are, in that order, the three most important variables in the random forest, by both MeanDecreaseAccuracy and MeanDecreaseGini.
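The ranking can be checked by sorting the importance table rather than reading it by eye. The sketch below is a minimal illustration in base R: it re-enters the two summary columns from the `importance(fit.rf)` output above as a matrix (the object names `imp`, `by_accuracy`, and `by_gini` are introduced here for illustration and are not part of the original code) and orders the variables on each measure.

```r
# Importance values copied from the importance(fit.rf) output above
# (only the MeanDecreaseAccuracy and MeanDecreaseGini columns are kept).
imp <- matrix(
  c( 6.396106,  4.0651846,   # Age
     2.113024,  0.5217098,   # Sex
     5.903508,  4.5770239,   # ALB
    21.759320, 14.2668463,   # ALP
    22.939803, 16.8378335,   # ALT
    30.953254, 28.3869541,   # AST
    16.788004,  8.9267996,   # BIL
    12.054497,  7.6879541,   # CHE
     8.245351,  6.1280041,   # CHOL
    10.316794,  4.9983135,   # CREA
    18.945824, 11.3569600,   # GGT
    10.933361,  5.7894273),  # PROT
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("Age", "Sex", "ALB", "ALP", "ALT", "AST",
      "BIL", "CHE", "CHOL", "CREA", "GGT", "PROT"),
    c("MeanDecreaseAccuracy", "MeanDecreaseGini")
  )
)

# Variable names sorted from most to least important on each measure.
by_accuracy <- rownames(imp)[order(imp[, "MeanDecreaseAccuracy"], decreasing = TRUE)]
by_gini     <- rownames(imp)[order(imp[, "MeanDecreaseGini"], decreasing = TRUE)]

head(by_accuracy, 3)  # three top-ranked variables by accuracy decrease
head(by_gini, 3)      # three top-ranked variables by Gini decrease
```

With the fitted model at hand, the same two `order()` calls should work directly on `importance(fit.rf)`, since that call returns a matrix with these column names.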

Average.Auc.rf<-print(paste("Average of AUC is ", mean(err_vec1)))
## [1] "Average of AUC is  0.984390558461209"
Average.mis.rf<-print(paste("Average of Miss is ", mean(missclass.rate)))
## [1] "Average of Miss is  0.0300734884932523"
AUC.RF<-mean(err_vec1)
miss.rate.RF<-mean(missclass.rate)
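The code that fills `err_vec1` and `missclass.rate` is not shown in this section. As a rough sketch only, assuming each element holds one fold's AUC and one fold's misclassification rate computed from predicted class-1 probabilities, the per-fold values could be obtained along the following lines. The helpers `auc_rank` and `misclass_rate`, the 0.5 cutoff, and the toy fold data are all hypothetical and are not taken from the original analysis.

```r
# Hypothetical helper: AUC via the rank-sum (Mann-Whitney) identity.
# labels: 0/1 vector; scores: predicted probability of class 1.
auc_rank <- function(labels, scores) {
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  r <- rank(scores)  # average ranks, so tied scores count as half
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical helper: share of cases misclassified at a given cutoff
# (the 0.5 default is an assumption, not the original setting).
misclass_rate <- function(labels, scores, cutoff = 0.5) {
  mean(as.integer(scores > cutoff) != labels)
}

# Toy data for two folds, made up purely for illustration.
folds <- list(
  list(labels = c(0, 0, 1, 1), scores = c(0.1, 0.4, 0.35, 0.8)),
  list(labels = c(0, 1, 0, 1), scores = c(0.2, 0.9, 0.6, 0.7))
)

# One value per fold, analogous in role to err_vec1 and missclass.rate.
auc_by_fold  <- sapply(folds, function(f) auc_rank(f$labels, f$scores))
miss_by_fold <- sapply(folds, function(f) misclass_rate(f$labels, f$scores))

mean(auc_by_fold)   # average AUC across folds
mean(miss_by_fold)  # average misclassification rate across folds
```

Averaging the per-fold values, as the code above does with `mean()`, gives a single summary figure per metric that can be compared across models.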

Note 3: These results are similar to those of the logistic model.


  1. Random forest AUC: 0.98, misclassification rate: 0.027