Table 1.
| Model | Data set | P/N | N | ACC | SE | SP | AUC | MCC |
|---|---|---|---|---|---|---|---|---|
| C4.5 Decision tree | Training set | 43/133 | 176 | 0.95 | 0.90 | 0.96 | 0.95 | 0.86 |
| | Test set | 20/44 | 64 | 0.92 | 0.83 | 0.98 | 0.95 | 0.83 |
| Logistic regression | Training set | 43/133 | 176 | 0.86 | 0.73 | 0.90 | 0.77 | 0.56 |
| | Test set | 20/44 | 64 | 0.80 | 0.73 | 0.82 | 0.77 | 0.58 |
| Support vector machine (Polynomial kernel) | Training set | 43/133 | 176 | 0.85 | 0.74 | 0.87 | 0.76 | 0.61 |
| | Test set | 20/44 | 64 | 0.83 | 0.80 | 0.84 | 0.77 | 0.50 |
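Note: Predictive accuracy was reflected by four indices: sensitivity ($\mathrm{SE} = \mathrm{TP}/(\mathrm{TP} + \mathrm{FN})$), specificity ($\mathrm{SP} = \mathrm{TN}/(\mathrm{TN} + \mathrm{FP})$), overall predictive accuracy ($\mathrm{ACC} = (\mathrm{TP} + \mathrm{TN})/N$), and Matthews correlation coefficient ($\mathrm{MCC} = (\mathrm{TP} \cdot \mathrm{TN} - \mathrm{FP} \cdot \mathrm{FN})/\sqrt{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}$). The area under the receiver operating characteristic curve (AUC) is a measure of how well a model distinguishes positive and negative data points; the model AUC indicated high classification power. FN, false negatives; FP, false positives; N, number of data points in the data set; P/N, ratio of positive/negative data points; TN, true negatives; TP, true positives.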
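The four indices named in the note can be computed directly from confusion-matrix counts. The following is a minimal sketch, assuming the standard textbook definitions of sensitivity, specificity, accuracy, and the Matthews correlation coefficient. The function name `classification_metrics` and the example counts are hypothetical, chosen only to match a 20/44 positive/negative split; they are not taken from the study.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Return SE, SP, ACC and MCC from confusion-matrix counts.

    Hypothetical helper using the standard definitions of each index.
    """
    n = tp + fn + tn + fp                      # N: total number of data points
    se = tp / (tp + fn)                        # sensitivity: recall on positives
    sp = tn / (tn + fp)                        # specificity: recall on negatives
    acc = (tp + tn) / n                        # overall predictive accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SE": se, "SP": sp, "ACC": acc, "MCC": mcc}


# Illustrative counts only (20 positives, 44 negatives); not data from Table 1.
metrics = classification_metrics(tp=18, fn=2, tn=40, fp=4)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

MCC is reported alongside ACC because the data sets are imbalanced (roughly one positive for every two to three negatives). On such data, accuracy alone can look high even when the minority class is poorly predicted.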