Table 1.
Features | Threshold | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC | AUROC | Parameters |
---|---|---|---|---|---|---|---|
AAC | −0.1 | 94.49 ± 0.80 | 92.38 ± 1.33 | 93.30 ± 0.84 | 0.87 ± 0.01 | 0.98 ± 0.00 | g = 0.001, c = 3, j = 3 |
N5 AAC | 0 | 88.54 ± 0.75 | 90.25 ± 1.87 | 89.44 ± 1.26 | 0.79 ± 0.02 | 0.94 ± 0.00 | g = 0.0005, c = 2, j = 1 |
C5 AAC | 0 | 91.13 ± 1.42 | 92.94 ± 1.20 | 92.08 ± 1.05 | 0.84 ± 0.02 | 0.97 ± 0.00 | g = 0.001, c = 9, j = 1 |
N5C5 AAC | −0.2 | 93.73 ± 0.60 | 92.83 ± 0.76 | 93.26 ± 0.40 | 0.87 ± 0.00 | 0.98 ± 0.00 | g = 0.0005, c = 1, j = 1 |
DPC | 0 | 93.79 ± 1.12 | 95.68 ± 0.78 | 94.84 ± 0.72 | 0.90 ± 0.01 | 0.99 ± 0.01 | g = 0.0005, c = 1, j = 2 |
N5 DPC | −0.1 | 83.42 ± 1.77 | 87.73 ± 2.00 | 85.69 ± 1.10 | 0.71 ± 0.02 | 0.93 ± 0.00 | g = 1e−05, c = 9, j = 1 |
C5 DPC | −0.1 | 90.21 ± 0.91 | 93.62 ± 0.96 | 92.00 ± 0.50 | 0.84 ± 0.01 | 0.97 ± 0.00 | g = 0.0005, c = 1, j = 2 |
N5C5 DPC | −0.2 | 93.60 ± 0.72 | 92.67 ± 1.16 | 93.11 ± 0.70 | 0.86 ± 0.01 | 0.98 ± 0.00 | g = 0.0001, c = 1, j = 1 |
N5 bin | −0.1 | 86.91 ± 0.82 | 88.81 ± 1.47 | 87.91 ± 0.73 | 0.76 ± 0.01 | 0.94 ± 0.00 | g = 0.5, c = 2, j = 1 |
C5 bin | −0.2 | 91.18 ± 0.92 | 86.61 ± 1.68 | 88.80 ± 1.14 | 0.78 ± 0.02 | 0.96 ± 0.00 | g = 0.5, c = 1, j = 2 |
N5C5 bin | 0.2 | 89.20 ± 1.11 | 91.14 ± 1.61 | 90.22 ± 1.05 | 0.80 ± 0.02 | 0.96 ± 0.00 | g = 0.05, c = 1, j = 4 |
N10 bin | −0.2 | 86.39 ± 2.73 | 89.68 ± 1.79 | 88.42 ± 1.05 | 0.76 ± 0.02 | 0.94 ± 0.01 | g = 0.1, c = 2, j = 2 |
C10 bin | −0.2 | 79.87 ± 2.30 | 86.49 ± 2.43 | 83.96 ± 1.91 | 0.66 ± 0.03 | 0.90 ± 0.01 | g = 0.05, c = 3, j = 1 |
N10C10 bin | −0.4 | 86.89 ± 2.70 | 91.62 ± 2.92 | 89.83 ± 1.31 | 0.79 ± 0.02 | 0.96 ± 0.00 | g = 0.1, c = 1, j = 1 |
AAC + motif | −0.1 | 95.51 ± 0.86 | 95.35 ± 0.85 | 95.42 ± 0.77 | 0.91 ± 0.01 | 0.99 ± 0.00 | g = 0.001, c = 6, j = 1 |
DPC + motif | 0 | 94.15 ± 0.92 | 96.94 ± 0.49 | 95.71 ± 0.38 | 0.91 ± 0.00 | 0.99 ± 0.00 | g = 0.0005, c = 1, j = 2 |
This table shows the average performance (mean ± standard deviation) of the models on randomly generated training datasets (bagging).
MCC: Matthews correlation coefficient; AUROC: area under the receiver operating characteristic curve; AAC: amino acid composition; DPC: dipeptide composition; N5: first 5 residues from the N terminus; C5: first 5 residues from the C terminus; N5C5: first 5 residues from each of the N and C termini; bin: binary profile; AAC + motif: amino acid composition with MERCI motif score; DPC + motif: dipeptide composition with MERCI motif score. SVM parameters: g, gamma parameter of the radial basis function; c, trade-off between training error and margin; j, regularization parameter (cost factor by which training errors on positive examples outweigh errors on negative examples).