Table 3.
Method | Matthews’ correlation coefficient (MCC) | Accuracy | Sensitivity | Specificity | AUC | P-value |
---|---|---|---|---|---|---|
Amino acid composition (AAC) | 0.219 ± 0.024 | 0.609 ± 0.012 | 0.645 ± 0.023 | 0.573 ± 0.019 | 0.641 ± 0.016 | 0.006 |
Dipeptide composition (DPC) | 0.269 ± 0.018 | 0.635 ± 0.009 | 0.635 ± 0.012 | 0.634 ± 0.016 | 0.683 ± 0.009 | 0.491 |
Composition–transition–distribution (CTD) | 0.182 ± 0.030 | 0.591 ± 0.015 | 0.579 ± 0.019 | 0.603 ± 0.020 | 0.621 ± 0.016 | 0.0003 |
Physicochemical properties (PCP) | 0.172 ± 0.020 | 0.585 ± 0.010 | 0.523 ± 0.035 | 0.648 ± 0.027 | 0.620 ± 0.012 | 0.0002 |
Amino acid index (AAI) | 0.228 ± 0.015 | 0.613 ± 0.007 | 0.650 ± 0.018 | 0.577 ± 0.017 | 0.642 ± 0.010 | 0.008 |
Hybrid | 0.218 ± 0.020 | 0.609 ± 0.010 | 0.602 ± 0.014 | 0.616 ± 0.018 | 0.647 ± 0.013 | 0.015 |
Ensemble learning (EL) | 0.298 ± 0.022 | 0.649 ± 0.011 | 0.618 ± 0.018 | 0.679 ± 0.009 | 0.697 ± 0.011 | – |
The first column lists the individual feature groups, the hybrid feature, and ensemble learning. Columns 2–6 report the MCC, accuracy, sensitivity, specificity, and AUC, respectively; each value is shown as the mean ± SD over 10 alternative balanced datasets. The last column gives the P-value of a pairwise comparison of AUC between EL and each of the other methods, using a two-tailed t-test. P ≤ 0.05 indicates a statistically significant difference between EL and the compared method (shown in bold).
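As an illustration of how the threshold-based metrics in columns 2–5 relate to a confusion matrix, the following is a minimal sketch using the standard definitions. It is not the authors' evaluation code: the function names `binary_metrics` and `summarize` and the example counts are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the table does not state which SD estimator was used. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math
from statistics import mean, stdev


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)           # true positive rate
    specificity = tn / (tn + fp)           # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "mcc": mcc,
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


def summarize(values):
    """Mean and sample SD across repeated datasets (assumed estimator)."""
    return mean(values), stdev(values)


# Hypothetical counts for one balanced dataset (100 positives, 100 negatives).
example = binary_metrics(tp=62, fp=32, tn=68, fn=38)

# Hypothetical per-dataset AUC values, summarized as mean ± SD.
auc_mean, auc_sd = summarize([0.69, 0.70, 0.71])
```

On a balanced dataset, accuracy equals the average of sensitivity and specificity, which is consistent with the rows of the table above up to rounding.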