Table 2.
Out-of-sample RF misclassification percentages for the binary-classification datasets
| Method | SPECTF [80 × 44; 187 × 44] | Spambase [4,601 × 57] | Relathe [1,427 × 4,322] | Ovarian cancer [72 × 592] |
|---|---|---|---|---|
| GSO | 24.60 (23) | 4.37 ± 0.87 (22) | 14.25 ± 2.66 (28) | 27.78 (3) |
| mRMR Peng | 24.60 (20) | 4.39 ± 0.57 (29) | 16.48 ± 3.26 (28) | 26.39 (15) |
| mRMR Spearman | 23.53 (18) | 4.78 ± 1.67 (27) | 20.14 ± 3.56 (27) | 20.83 (15) |
| Information gain | 24.06 (16) | 4.65 ± 0.59 (29) | 20.63 ± 3.93 (23) | 27.78 (4) |
| RELIEF | 22.99 (27) | 5.61 ± 1.00 (24) | 29.30 ± 3.99 (27) | 29.17 (7) |
| CFS | 25.13 (12) | 5.85 ± 0.53 (17) | 17.18 ± 2.33 (30) | 29.17 (3) |
| CBF | 28.34 (2) | 5.07 ± 0.48 (18) | 16.06 ± 2.30 (27) | 31.94 (3) |
| SIMBA | 25.13 (25) | 4.89 ± 0.83 (25) | 32.25 ± 3.23 (26) | 37.50 (20) |
| LOGO | 25.67 (21) | 4.87 ± 0.97 (28) | 23.66 ± 4.24 (29) | 27.78 (4) |
| L1-LSMI | 22.46 (10) | 4.33 ± 1.10 (17) | 27.54 ± 2.24 (25) | 26.39 (5) |
| IAMB | 28.88 (6) | 4.80 ± 0.89 (24) | 14.73 ± 2.31 (30) | 27.78 (3) |
| HITON | 24.06 (22) | 4.26 ± 0.66 (28) | 28.24 ± 5.26 (23) | 29.17 (3) |
| JMI | 24.06 (30) | 4.63 ± 0.58 (30) | 20.70 ± 3.36 (16) | 18.06 (14) |
| DISR | 24.06 (22) | 5.00 ± 0.83 (29) | 19.86 ± 2.69 (21) | 19.44 (4) |
| QPFS | 24.60 (25) | 4.37 ± 0.74 (28) | 20.21 ± 1.97 (23) | 23.61 (12) |
| CMIM | 24.06 (15) | 4.61 ± 0.88 (27) | 16.06 ± 2.95 (30) | 16.67 (24) |
| CIFE | 23.53 (15) | 5.89 ± 1.61 (30) | 16.83 ± 3.26 (27) | 29.17 (11) |
| MIQ | 24.06 (16) | 4.83 ± 0.84 (19) | 18.52 ± 3.17 (28) | 22.22 (15) |
| SPECCMI | 24.06 (27) | 4.61 ± 1.14 (28) | 20.14 ± 4.67 (29) | 19.44 (14) |
| RRCT | 22.99 (18) | 4.72 ± 0.56 (29) | 13.87 ± 2.48 (27) | 13.89 (20) |
The design matrices are summarized in the form N × M [number of samples × number of features]. When two design matrices are listed, the first was used for training (selecting features and training the statistical learner) and the second for testing performance. The results are misclassification percentages, and the number in parentheses is the number of features that gave the best performance when searching over 1 … min(M, 30) features (see Figure 2 for details). When the assessment was done using 10-fold cross-validation, the results are presented in the form mean ± SD.
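
The reporting convention described in this note can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`candidate_counts`, `best_feature_count`, `format_cell`) are hypothetical, the per-fold error values in the usage example are made up for illustration, and two details that the note does not specify are assumed here, namely that SD is the sample standard deviation and that ties between feature counts are broken in favor of the smaller count.

```python
from statistics import mean, stdev


def candidate_counts(num_features, cap=30):
    """Feature-subset sizes to search: 1 ... min(M, cap)."""
    return range(1, min(num_features, cap) + 1)


def best_feature_count(errors_by_k):
    """Pick the feature count with the lowest mean misclassification.

    errors_by_k maps a feature count k to a list of misclassification
    percentages: one value per cross-validation fold, or a single value
    when a fixed train/test split is used.
    """
    summary = {}
    for k, fold_errors in errors_by_k.items():
        # Assumption: sample SD; a single value has no spread to report.
        sd = stdev(fold_errors) if len(fold_errors) > 1 else 0.0
        summary[k] = (mean(fold_errors), sd)
    # Assumption: ties go to the smaller number of features.
    best_k = min(summary, key=lambda k: (summary[k][0], k))
    best_mean, best_sd = summary[best_k]
    return best_k, best_mean, best_sd


def format_cell(best_k, best_mean, best_sd, cross_validated=True):
    """Render a table cell as 'mean ± SD (k)' or 'value (k)'."""
    if cross_validated:
        return f"{best_mean:.2f} ± {best_sd:.2f} ({best_k})"
    return f"{best_mean:.2f} ({best_k})"


# Illustrative (made-up) per-fold errors for three candidate feature counts.
example = {1: [10.0, 12.0], 2: [5.0, 7.0], 3: [6.0, 6.0]}
cell = format_cell(*best_feature_count(example))
print(cell)
```

For a dataset with a separate test set, such as SPECTF, each list would hold a single value and `format_cell(..., cross_validated=False)` would yield the "value (k)" form used in that column.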