PLoS One. 2014 May 8;9(5):e96984. doi: 10.1371/journal.pone.0096984

Table 2. The accuracy of four tree induction models (each run with four criteria: Accuracy, Gain Ratio, Gini Index and Info Gain) on 11 datasets [the original protein features dataset (FCdb) and 10 datasets generated by trimming (filtering) the original FCdb dataset with attribute weighting algorithms], computed by 10-fold cross-validation.

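| Dataset filtered by | Decision Tree: Accuracy | Decision Tree: Gain Ratio | Decision Tree: Gini Index | Decision Tree: Info Gain | Decision Tree Parallel: Accuracy | Decision Tree Parallel: Gain Ratio | Decision Tree Parallel: Gini Index | Decision Tree Parallel: Info Gain | Decision Tree Stump: Accuracy | Decision Tree Stump: Gain Ratio | Decision Tree Stump: Gini Index | Decision Tree Stump: Info Gain | Decision Tree Random Forest: Accuracy | Decision Tree Random Forest: Gain Ratio | Decision Tree Random Forest: Gini Index | Decision Tree Random Forest: Info Gain |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chi Squared | 99.52 | 99.52 | 99.52 | 99.52 | 94.22 | 99.27 | 99.46 | 99.47 | 71.63 | 48.48 | 71.18 | 71.98 | 88.90 | 91.72 | 99.51 | 99.54 |
| Info Gain | 99.33 | 99.33 | 99.33 | 99.33 | 89.30 | 99.56 | 99.51 | 99.33 | 71.63 | 49.80 | 71.18 | 71.98 | 86.18 | 95.92 | 99.52 | 99.44 |
| Deviation | 70.73 | 70.73 | 70.73 | 70.73 | 55.01 | 70.73 | 96.78 | 96.80 | 50.65 | 50.73 | 58.85 | 58.85 | 55.13 | 70.96 | 96.65 | 96.80 |
| Gini Index | 99.24 | 99.24 | 99.24 | 99.24 | 89.19 | 99.39 | 99.29 | 99.32 | 71.63 | 49.80 | 71.18 | 71.98 | 82.47 | 94.45 | 99.43 | 99.14 |
| Info Gain Ratio | 99.43 | 99.43 | 99.43 | 99.43 | 91.69 | 99.40 | 99.35 | 99.25 | 71.63 | 48.48 | 71.18 | 71.98 | 87.37 | 93.12 | 99.01 | 99.33 |
| PCA | 70.73 | 70.73 | 70.73 | 70.73 | 55.01 | 70.73 | 96.78 | 96.80 | 50.65 | 50.73 | 58.85 | 58.85 | 55.13 | 70.96 | 96.65 | 96.80 |
| Relief | 99.35 | 99.35 | 99.35 | 99.35 | 87.58 | 99.31 | 99.37 | 99.22 | 71.63 | 49.80 | 71.18 | 71.98 | 87.88 | 96.64 | 99.24 | 98.87 |
| Rule | 99.47 | 99.47 | 99.47 | 99.47 | 91.36 | 99.37 | 99.24 | 99.39 | 71.63 | 49.80 | 71.18 | 71.86 | 82.51 | 93.81 | 99.36 | 98.86 |
| Uncertainty | 99.40 | 99.40 | 99.40 | 99.40 | 91.70 | 99.40 | 99.32 | 99.25 | 71.63 | 49.80 | 71.18 | 71.98 | 88.62 | 95.01 | 99.70 | 99.65 |
| SVM | 98.99 | 98.99 | 98.99 | 98.99 | 93.64 | 92.48 | 99.09 | 99.09 | 70.99 | 46.40 | 70.05 | 72.03 | 86.13 | 89.00 | 98.56 | 98.06 |
| FCdb (original protein features dataset) | 99.06 | 96.87 | 74.36 | 46.04 | 76.13 | 96.87 | 74.93 | 74.50 | 48.48 | 90.00 | 58.38 | 58.38 | 97.65 | 99.31 | 98.34 | 97.73 |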

This table presents the accuracy (%) of four tree induction models (Decision Tree, Decision Tree Parallel, Decision Stump and Random Forest), each run with four different criteria (Gain Ratio, Information Gain, Gini Index and Accuracy). The lowest and highest accuracies (46.04 and 99.70, respectively) were highlighted in the original table.
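
As a rough illustration of how accuracies like those in the table can be obtained, the sketch below scores a one-level tree (a decision stump) under each of the four split criteria and estimates its accuracy by 10-fold cross-validation. It is not the authors' implementation: the function names (`split_score`, `fit_stump`, `cross_val_accuracy`), the toy data, and the exact definition of the "accuracy" criterion are assumptions made for this example only.

```python
import math
import random
from collections import Counter

CRITERIA = ("accuracy", "gain_ratio", "gini_index", "info_gain")


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_score(left, right, criterion):
    """Score a binary split (two non-empty label lists); higher is better."""
    parent = left + right
    n = len(parent)
    wl, wr = len(left) / n, len(right) / n
    if criterion == "accuracy":
        # Assumed definition: accuracy when each side predicts its majority class.
        return (max(Counter(left).values()) + max(Counter(right).values())) / n
    if criterion == "gini_index":
        return gini(parent) - (wl * gini(left) + wr * gini(right))
    gain = entropy(parent) - (wl * entropy(left) + wr * entropy(right))
    if criterion == "info_gain":
        return gain
    if criterion == "gain_ratio":
        split_info = -(wl * math.log2(wl) + wr * math.log2(wr))
        return gain / split_info
    raise ValueError(f"unknown criterion: {criterion}")


def fit_stump(rows, labels, criterion):
    """Return (feature, threshold, left_class, right_class) for the best split."""
    best = None
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = split_score(left, right, criterion)
            if best is None or score > best[0]:
                best = (score, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    return best[1:]


def cross_val_accuracy(rows, labels, criterion, k=10, seed=0):
    """Accuracy (%) pooled over the held-out samples of k folds."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        f, t, left_cls, right_cls = fit_stump(
            [rows[i] for i in train], [labels[i] for i in train], criterion)
        for i in test:
            pred = left_cls if rows[i][f] <= t else right_cls
            correct += int(pred == labels[i])
    return 100.0 * correct / len(rows)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
rows = [(i, i % 3) for i in range(20)] + [(100 + i, i % 3) for i in range(20)]
labels = ["A"] * 20 + ["B"] * 20

for criterion in CRITERIA:
    print(criterion, cross_val_accuracy(rows, labels, criterion))
```

Because the two toy classes are widely separated on the first feature, every criterion finds the same split here. On real data the criteria can choose different splits, which is why the columns of the table differ.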