Table 7.
Performance of SVM-based models developed using different features on the additional dataset (Mod_AMP_similar).
| Features (parameters) | Sen (%) | Spc (%) | Acc (%) | MCC | AUROC |
|---|---|---|---|---|---|
| Atom composition (g = 0.1, c = 9, j = 1) | 89.47 | 43.68 | 66.58 | 0.37 | 0.77 |
| Diatom composition (g = 0.05, c = 15, j = 2) | 88.42 | 71.58 | 80.00 | 0.61 | 0.88 |
| 2D descriptors (g = 0.1, c = 1, j = 1) | 84.74 | 32.63 | 58.68 | 0.20 | 0.66 |
| Fingerprints (g = 0.001, c = 4, j = 1) | 93.16 | 87.37 | 90.26 | 0.81 | 0.97 |
| Hybrid features (2D + fingerprints) (g = 0.005, c = 7, j = 2) | 84.74 | 58.95 | 71.84 | 0.45 | 0.81 |
| N100C100 Binary profile (only atoms) (g = 0.01, c = 6, j = 1) | 90.51 | 89.44 | 89.66 | 0.80 | 0.97 |
| N100C100 Binary profile (only symbols) (g = 0.005, c = 7, j = 2) | 76.98 | 91.10 | 84.21 | 0.60 | 0.94 |
| N200C200 Binary profile (atom + symbols) (g = 0.005, c = 1, j = 2) | 89.29 | 89.12 | 89.20 | 0.78 | 0.96 |
Sen, Sensitivity; Spc, Specificity; Acc, Accuracy; MCC, Matthews Correlation Coefficient; AUROC, Area Under the Receiver Operating Characteristic curve; N100C100, first 100 elements from N-terminus and C-terminus, respectively; N200C200, first 200 elements from N-terminus and C-terminus, respectively.
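
For reference, the threshold-dependent measures in the table can be derived from a confusion matrix. The following is a minimal sketch using the standard textbook definitions. The function name `binary_metrics` and the counts in the usage line are hypothetical, chosen only for illustration, and are not taken from the dataset. AUROC is omitted because it requires the ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return Sen, Spc, Acc (as percentages) and MCC from confusion-matrix counts.

    tp/fn: positives predicted correctly/incorrectly
    tn/fp: negatives predicted correctly/incorrectly
    """
    sen = 100.0 * tp / (tp + fn)                   # sensitivity (recall on positives)
    spc = 100.0 * tn / (tn + fp)                   # specificity (recall on negatives)
    acc = 100.0 * (tp + tn) / (tp + fn + tn + fp)  # overall accuracy

    # Matthews correlation coefficient; defined as 0 here when the denominator is 0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"Sen": sen, "Spc": spc, "Acc": acc, "MCC": mcc}


# Hypothetical counts for illustration only (not from the paper)
example = binary_metrics(tp=90, fn=10, tn=80, fp=20)
print({name: round(value, 2) for name, value in example.items()})
```

When the positive and negative sets are the same size, Acc is simply the mean of Sen and Spc, which gives a quick consistency check on a reported row.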