Table 3.
Measure and validation scheme | Unweighted voting | Weighted voting | Random forest | SVM |
---|---|---|---|---|
Accuracy | ||||
LOOCV | ||||
Top 1% | 99.3 (a) | 100 (b) | 98.7 (e) | 100 (e) |
Top 5% | 99.3 (a) | 100 (b) | 98.7 (f) | 100 (f) |
60:40 partitions | ||||
Top 1% | 98.7 (c) | 100 (d) | 99.3 (g) | 94.6 (g) |
Top 5% | 98.7 (c) | 100 (d) | 100 (h) | 84.6 (h) |
Sensitivity | ||||
LOOCV | ||||
Top 1% | 100 | 100 | 99.2 | 100 |
Top 5% | 100 | 100 | 99.2 | 100 |
60:40 partitions | ||||
Top 1% | 99.2 | 100 | 100 | 94.0 |
Top 5% | 99.2 | 100 | 100 | 82.8 |
Positive predictive value | ||||
LOOCV | ||||
Top 1% | 99.3 | 100 | 99.2 | 100 |
Top 5% | 99.3 | 100 | 99.2 | 100 |
60:40 partitions | ||||
Top 1% | 99.2 | 100 | 99.3 | 100 |
Top 5% | 99.2 | 100 | 100 | 100 |
Notes: Accuracy, sensitivity, and positive predictive value of voting classifiers (unweighted and weighted), random forest and SVM applied to independent data sets from the lung cancer data set. Features to include in the classifiers were derived using the top 1% or 5% of features based on t-statistics through a jackknife procedure using training sets in leave-one-out cross validation (LOOCV) or multiple random validation (60:40 partitions).
(a) Highest accuracy achieved with 37 features in the classifier;
(b) Highest accuracy achieved with 23 features in the classifier;
(c) Highest accuracy achieved with 15 features in the classifier;
(d) Highest accuracy achieved with 49 features in the classifier. The numbers of features used in developing the SVM and random forest classifiers were:
(e) 452 features;
(f) 1,791 features;
(g) 4,172 features;
(h) 9,628 features.
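
For readers who want to reproduce the three measures reported in Table 3, the following is a minimal sketch of how accuracy, sensitivity, and positive predictive value can be computed from true and predicted class labels. The function name `classification_metrics`, the choice of `1` as the positive class, and the toy labels are illustrative assumptions; the source does not state which class was treated as positive or how the values were computed in practice.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and PPV as percentages.

    `positive` marks the class treated as positive; which class the
    study used is not stated in the table, so this is an assumption.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = 100.0 * (tp + tn) / len(pairs)
    # Sensitivity: share of true positives that were detected.
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else float("nan")
    # PPV: share of positive calls that were correct.
    ppv = 100.0 * tp / (tp + fp) if (tp + fp) else float("nan")
    return {"accuracy": accuracy, "sensitivity": sensitivity, "ppv": ppv}


# Toy labels for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Under leave-one-out cross validation, `y_pred` would hold the single held-out prediction from each fold; under 60:40 partitioning, the measures would typically be computed on each 40% test split and then summarized across the random partitions.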