Table 4.
Measure and validation scheme | Unweighted | Weighted | Random forest | SVM |
---|---|---|---|---|
Accuracy | | | | |
LOOCV | | | | |
Top 1% | 68.5^a | 76.5^b | 92.7^g | 93.5^g |
Top 5% | 74.5^c | 81.9^c | 91.6^h | 92.4^h |
60:40 Partitions | | | | |
Top 1% | 86.3^d | 88.2^c | 91.6^i | 89.7^i |
Top 5% | 86.7^e | 89.9^f | 90.5^j | 86.6^j |
Sensitivity | | | | |
LOOCV | | | | |
Top 1% | 74.4 | 76.9 | 87.2 | 84.6 |
Top 5% | 89.7 | 79.5 | 87.2 | 74.4 |
60:40 Partitions | | | | |
Top 1% | 74.4 | 76.9 | 82.0 | 65.0 |
Top 5% | 69.2 | 74.4 | 84.6 | 64.1 |
Positive predictive value | | | | |
LOOCV | | | | |
Top 1% | 43.9 | 53.6 | 70.8 | 75.0 |
Top 5% | 50.7 | 62.0 | 66.7 | 74.4 |
60:40 Partitions | | | | |
Top 1% | 52.7 | 57.7 | 68.1 | 66.7 |
Top 5% | 54 | 61.7 | 63.5 | 54.3 |
Notes: Accuracy, sensitivity, and positive predictive value (all in %) of the voting classifiers (unweighted and weighted), random forest, and SVM applied to independent data sets from the prostate cancer data set. The features included in the classifiers were the top 1% or 5% of features ranked by t-statistic, selected through a jackknife procedure on the training sets in leave-one-out cross-validation (LOOCV) or multiple random validation (60:40 partitions).
^a Highest accuracy achieved with 37 features in classifier;
^b Highest accuracy achieved with 43 features in classifier;
^c Highest accuracy achieved with 45 features in classifier;
^d Highest accuracy achieved with 49 features in classifier;
^e Highest accuracy achieved with 47 features in classifier;
^f Highest accuracy achieved with 27 features in classifier.
The numbers of features used in developing the SVM and random forest classifiers were:
^g 685 features;
^h 2,553 features;
^i 9,890 features;
^j 14,843 features.
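For readers who want to see how the three measures reported in Table 4 relate to one another, the following is a minimal sketch that computes accuracy, sensitivity, and positive predictive value from true and predicted class labels, using the standard confusion-matrix definitions. The function name `classification_metrics`, the label coding (1 for the positive class), and the toy labels are assumptions made for illustration; they are not taken from the study and do not reproduce any value in the table.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and PPV as percentages.

    Hypothetical helper for illustration; not the authors' code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    def pct(numerator, denominator):
        # Report NaN rather than dividing by zero when a class is absent.
        return 100.0 * numerator / denominator if denominator else float("nan")

    return {
        "accuracy": pct(tp + tn, len(pairs)),   # correct calls / all samples
        "sensitivity": pct(tp, tp + fn),        # true positives / actual positives
        "ppv": pct(tp, tp + fp),                # true positives / predicted positives
    }


# Toy labels invented for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because accuracy counts true negatives while positive predictive value does not, a classifier can score well on accuracy and still have a much lower positive predictive value when the positive class is the minority, which is one plausible reading of the gap between those rows in the table.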