2018 Jul 3;16:181. doi: 10.1186/s12967-018-1560-1

Table 1.

The performance of SVM-based models developed using various features; models were evaluated on training dataset using fivefold cross-validation (internal cross-validation)

| Features | Threshold | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC | AUROC | Parameters |
|---|---|---|---|---|---|---|---|
| AAC | −0.1 | 94.49 ± 0.80 | 92.38 ± 1.33 | 93.30 ± 0.84 | 0.87 ± 0.01 | 0.98 ± 0.00 | g = 0.001, c = 3, j = 3 |
| N5 AAC | 0 | 88.54 ± 0.75 | 90.25 ± 1.87 | 89.44 ± 1.26 | 0.79 ± 0.02 | 0.94 ± 0.00 | g = 0.0005, c = 2, j = 1 |
| C5 AAC | 0 | 91.13 ± 1.42 | 92.94 ± 1.20 | 92.08 ± 1.05 | 0.84 ± 0.02 | 0.97 ± 0.00 | g = 0.001, c = 9, j = 1 |
| N5C5 AAC | −0.2 | 93.73 ± 0.60 | 92.83 ± 0.76 | 93.26 ± 0.40 | 0.87 ± 0.00 | 0.98 ± 0.00 | g = 0.0005, c = 1, j = 1 |
| DPC | 0 | 93.79 ± 1.12 | 95.68 ± 0.78 | 94.84 ± 0.72 | 0.90 ± 0.01 | 0.99 ± 0.01 | g = 0.0005, c = 1, j = 2 |
| N5 DPC | −0.1 | 83.42 ± 1.77 | 87.73 ± 2.00 | 85.69 ± 1.10 | 0.71 ± 0.02 | 0.93 ± 0.00 | g = 1e−05, c = 9, j = 1 |
| C5 DPC | −0.1 | 90.21 ± 0.91 | 93.62 ± 0.96 | 92.00 ± 0.50 | 0.84 ± 0.01 | 0.97 ± 0.00 | g = 0.0005, c = 1, j = 2 |
| N5C5 DPC | −0.2 | 93.60 ± 0.72 | 92.67 ± 1.16 | 93.11 ± 0.70 | 0.86 ± 0.01 | 0.98 ± 0.00 | g = 0.0001, c = 1, j = 1 |
| N5 bin | −0.1 | 86.91 ± 0.82 | 88.81 ± 1.47 | 87.91 ± 0.73 | 0.76 ± 0.01 | 0.94 ± 0.00 | g = 0.5, c = 2, j = 1 |
| C5 bin | −0.2 | 91.18 ± 0.92 | 86.61 ± 1.68 | 88.80 ± 1.14 | 0.78 ± 0.02 | 0.96 ± 0.00 | g = 0.5, c = 1, j = 2 |
| N5C5 bin | 0.2 | 89.20 ± 1.11 | 91.14 ± 1.61 | 90.22 ± 1.05 | 0.80 ± 0.02 | 0.96 ± 0.00 | g = 0.05, c = 1, j = 4 |
| N10 bin | −0.2 | 86.39 ± 2.73 | 89.68 ± 1.79 | 88.42 ± 1.05 | 0.76 ± 0.02 | 0.94 ± 0.01 | g = 0.1, c = 2, j = 2 |
| C10 bin | −0.2 | 79.87 ± 2.30 | 86.49 ± 2.43 | 83.96 ± 1.91 | 0.66 ± 0.03 | 0.90 ± 0.01 | g = 0.05, c = 3, j = 1 |
| N10C10 bin | −0.4 | 86.89 ± 2.70 | 91.62 ± 2.92 | 89.83 ± 1.31 | 0.79 ± 0.02 | 0.96 ± 0.00 | g = 0.1, c = 1, j = 1 |
| AAC + motif | −0.1 | 95.51 ± 0.86 | 95.35 ± 0.85 | 95.42 ± 0.77 | 0.91 ± 0.01 | 0.99 ± 0.00 | g = 0.001, c = 6, j = 1 |
| DPC + motif | 0 | 94.15 ± 0.92 | 96.94 ± 0.49 | 95.71 ± 0.38 | 0.91 ± 0.00 | 0.99 ± 0.00 | g = 0.0005, c = 1, j = 2 |

This table shows average performance (mean ± standard deviation) of models on randomly generated training datasets (bagging)
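As a rough illustration of how the threshold-dependent columns of such a table can be derived, the sketch below thresholds classifier scores, computes sensitivity, specificity, accuracy and MCC from the resulting confusion counts, and summarizes repeated runs as mean ± standard deviation. The function names, the rule "score ≥ threshold means positive", and the use of the sample standard deviation are assumptions made for this example; the table does not state how the authors computed these values.

```python
import math
from statistics import mean, stdev


def confusion_counts(scores, labels, threshold=0.0):
    """Count TP, FP, TN, FN, calling a sample positive when score >= threshold (assumed rule)."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and accuracy (in %) plus MCC."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + fp + tn + fn),
        # MCC is conventionally set to 0 when any marginal count is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


def summarize(values):
    """Mean and sample standard deviation over repeated runs (assumed convention)."""
    return mean(values), stdev(values)
```

For example, `metrics(*confusion_counts(scores, labels, threshold=-0.1))` would score a single run at the threshold listed for the AAC model, and `summarize` would then be applied to each metric across runs.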

MCC: Matthews correlation coefficient; AUROC: area under the receiver operating characteristic curve; AAC: amino acid composition; DPC: dipeptide composition; N5: first 5 residues from the N terminus; C5: first 5 residues from the C terminus; N5C5: first 5 residues from the N and C termini, respectively; bin: binary profile; AAC + motif: amino acid composition with MERCI motif score; DPC + motif: dipeptide composition with MERCI motif score. SVM parameters: g, gamma parameter of the radial basis function; c, trade-off between training error and margin; j, regularization parameter (cost factor by which training errors on positive examples outweigh errors on negative examples).
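The feature types abbreviated above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: compositions are expressed as percentages, the C-terminal window is taken as the last `n` residues in their original order, and all function names are hypothetical. The MERCI motif score is not covered.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def aac(seq):
    """Amino acid composition: percentage of each of the 20 residues."""
    seq = seq.upper()
    return [100.0 * seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: percentage of each of the 400 adjacent residue pairs."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [100.0 * pairs.count(d) / len(pairs) for d in DIPEPTIDES]


def terminal_window(seq, n=5, end="N"):
    """Residues at one terminus: first n for 'N', last n for 'C' (assumed reading of N5/C5)."""
    return seq[:n] if end == "N" else seq[-n:]


def n_c_window(seq, n=5):
    """Concatenate the N- and C-terminal windows (assumed reading of N5C5)."""
    return terminal_window(seq, n, "N") + terminal_window(seq, n, "C")


def binary_profile(seq):
    """One-hot encode each residue as a 20-dimensional 0/1 vector, concatenated."""
    profile = []
    for residue in seq.upper():
        profile.extend(1 if residue == a else 0 for a in AMINO_ACIDS)
    return profile
```

Under these assumptions, an "N5C5 AAC" feature vector would be `aac(n_c_window(seq, 5))`, and an "N10 bin" vector would be `binary_profile(terminal_window(seq, 10, "N"))`, giving 200 values for a sequence of at least 10 residues.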