Table 5.
Performance summary of full- and reduced-variable random forest models on simulated threenorm data sets with d relevant predictors and k noise predictors. Performance metrics were computed on independent test sets of size 1000 (Test) and on the out-of-bag data (OOB), then averaged over 20 simulation runs. PCC is the percent of observations correctly classified, sensitivity is the percent of class 1 observations correctly classified, specificity is the percent of class 2 observations correctly classified, and AUC is the area under the receiver operating characteristic curve. All values are reported as proportions.
| Data | d | k | Model | PCC | Sens. | Spec. | AUC |
|---|---|---|---|---|---|---|---|
| Test | 50 | 150 | Full | 0.805 | 0.807 | 0.806 | 0.893 |
| OOB | 50 | 150 | Full | 0.794 | 0.793 | 0.793 | 0.877 |
| Test | 50 | 150 | Reduced | 0.828 | 0.829 | 0.828 | 0.909 |
| OOB | 50 | 150 | Reduced | 0.840 | 0.838 | 0.842 | 0.909 |
| Test | 150 | 50 | Full | 0.768 | 0.763 | 0.777 | 0.859 |
| OOB | 150 | 50 | Full | 0.748 | 0.739 | 0.753 | 0.833 |
| Test | 150 | 50 | Reduced | 0.755 | 0.755 | 0.757 | 0.838 |
| OOB | 150 | 50 | Reduced | 0.795 | 0.798 | 0.790 | 0.865 |
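
The four metrics defined in the caption can be computed from class labels and predicted scores. The following is a minimal sketch, not the authors' code. It assumes labels are coded 1 and 2 as in the caption, that `scores` holds a predicted probability of class 1 (for a random forest, for example, the fraction of tree votes), and that a cutoff of 0.5 assigns the class. The function name `classification_summary` and the `threshold` parameter are hypothetical.

```python
def classification_summary(y_true, scores, threshold=0.5):
    """Return PCC, sensitivity, specificity and AUC as proportions.

    y_true : sequence of class labels, 1 or 2 (assumed coding)
    scores : sequence of predicted probabilities of class 1 (assumed)
    """
    # Split the scores by true class.
    class1 = [s for y, s in zip(y_true, scores) if y == 1]
    class2 = [s for y, s in zip(y_true, scores) if y == 2]

    # An observation is predicted as class 1 when its score reaches the cutoff.
    correct1 = sum(s >= threshold for s in class1)
    correct2 = sum(s < threshold for s in class2)

    pcc = (correct1 + correct2) / len(y_true)
    sensitivity = correct1 / len(class1)
    specificity = correct2 / len(class2)

    # AUC as the probability that a random class 1 observation receives a
    # higher score than a random class 2 observation (ties count one half).
    wins = sum((a > b) + 0.5 * (a == b) for a in class1 for b in class2)
    auc = wins / (len(class1) * len(class2))

    return {"PCC": pcc, "Sens": sensitivity, "Spec": specificity, "AUC": auc}


# Toy example with six observations (made-up numbers, not from the table).
example = classification_summary(
    [1, 1, 1, 2, 2, 2],
    [0.9, 0.8, 0.4, 0.6, 0.3, 0.2],
)
```

The pairwise AUC computation is quadratic in the number of observations, which is fine for test sets of size 1000. A rank-based formula would be preferable for much larger samples.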