2011 Apr 27;10:133–147. doi: 10.4137/CIN.S7111

Table 3.

Performance of classifiers applied to the independent validation set of the lung cancer data set.

                               Voting        Voting      Random
                               (unweighted)  (weighted)  forest    SVM
Accuracy (%)
  LOOCV
    Top 1%                     99.3a         100b        98.7e     100e
    Top 5%                     99.3a         100b        98.7f     100f
  60:40 partitions
    Top 1%                     98.7c         100d        99.3g     94.6g
    Top 5%                     98.7c         100d        100h      84.6h
Sensitivity (%)
  LOOCV
    Top 1%                     100           100         99.2      100
    Top 5%                     100           100         99.2      100
  60:40 partitions
    Top 1%                     99.2          100         100       94.0
    Top 5%                     99.2          100         100       82.8
Positive predictive value (%)
  LOOCV
    Top 1%                     99.3          100         99.2      100
    Top 5%                     99.3          100         99.2      100
  60:40 partitions
    Top 1%                     99.2          100         99.3      100
    Top 5%                     99.2          100         100       100

Notes: Accuracy, sensitivity, and positive predictive value of voting classifiers (unweighted and weighted), random forest and SVM applied to independent data sets from the lung cancer data set. Features to include in the classifiers were derived using the top 1% or 5% of features based on t-statistics through a jackknife procedure using training sets in leave-one-out cross validation (LOOCV) or multiple random validation (60:40 partitions).

a Highest accuracy achieved with 37 features in classifier.
b Highest accuracy achieved with 23 features in classifier.
c Highest accuracy achieved with 15 features in classifier.
d Highest accuracy achieved with 49 features in classifier.

The numbers of features used in developing the SVM and random forest classifiers were:
e 452 features;
f 1,791 features;
g 4,172 features;
h 9,628 features.
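
For readers who want to reproduce the three measures reported in the table, the following is a minimal sketch of how accuracy, sensitivity, and positive predictive value are conventionally computed from true and predicted class labels. The function name `classification_metrics`, the `positive` argument, and the toy labels are assumptions made for illustration; they are not taken from the study and do not reproduce its data or code.

```python
import math


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and PPV as percentages.

    Hypothetical helper: `positive` marks the class treated as "positive"
    (an assumption; the table does not say which class that was).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Count the four cells of the 2x2 confusion matrix.
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred == positive:
            fp += 1
        elif truth == positive and pred != positive:
            fn += 1
        else:
            tn += 1

    total = tp + fp + fn + tn
    accuracy = 100.0 * (tp + tn) / total
    # Sensitivity: share of true positives among all actual positives.
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else math.nan
    # PPV: share of true positives among all predicted positives.
    ppv = 100.0 * tp / (tp + fp) if (tp + fp) else math.nan
    return {"accuracy": accuracy, "sensitivity": sensitivity, "ppv": ppv}


# Toy labels for illustration only (not data from the study).
toy_true = [1] * 8 + [0] * 2
toy_pred = [1] * 7 + [0, 0, 1]
print(classification_metrics(toy_true, toy_pred))
# {'accuracy': 80.0, 'sensitivity': 87.5, 'ppv': 87.5}
```

Under these definitions a classifier can reach 100% positive predictive value while its sensitivity is lower, as in the SVM column for the 60:40 partitions: every sample it calls positive is positive, but some positives are missed.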