Table 3. Performance of SVM models built on different peptide properties, evaluated by 10-fold cross-validation with the negative dataset taken from UniProt.
 | Training/Testing dataset (T200p+200n) | | | Validation dataset (V20p+20n) | | |
---|---|---|---|---|---|---|
Properties | Accuracy (%) | MCC | ROC | Accuracy (%) | MCC | ROC |
AAC | 89.00 | 0.78 | 0.95 | 82.50 | 0.65 | 0.94 |
DPC | 91.00 | 0.82 | 0.95 | 87.50 | 0.75 | 0.94 |
AAC+DPC | 89.80 | 0.80 | 0.96 | 85.71 | 0.72 | 0.95 |
N5Bin | 84.25 | 0.69 | 0.92 | 85.00 | 0.70 | 0.95 |
C5Bin | 86.00 | 0.72 | 0.92 | 77.50 | 0.55 | 0.92 |
N5C5Bin | 87.25 | 0.75 | 0.95 | 90.00 | 0.80 | 0.93 |
Physico | 93.00 | 0.86 | 0.98 | 90.00 | 0.82 | 0.97 |
AAC+DPC+N5C5Bin | 91.00 | 0.82 | 0.96 | 92.50 | 0.86 | 0.95 |
AAC+DPC+N5C5Bin+Physico | 91.25 | 0.83 | 0.96 | 90.00 | 0.80 | 0.95 |
AAC, Amino Acid Composition; DPC, Dipeptide Composition; N5AAC, Amino Acid Composition of 5 N-terminal residues; C5AAC, Amino Acid Composition of 5 C-terminal residues; N5Bin, Binary pattern of 5 N-terminal residues; C5Bin, Binary pattern of 5 C-terminal residues; N5C5Bin, Binary pattern of 5 N-terminal and 5 C-terminal residues; Physico, top 10 physicochemical properties; SVM, Support Vector Machine; MCC, Matthews correlation coefficient; ROC, Receiver Operating Characteristic; AUC, Area Under the ROC Curve (the value reported in the ROC columns).
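For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how accuracy and MCC are conventionally computed from a binary confusion matrix. The function names and the confusion counts are hypothetical: the source does not report per-model confusion matrices, so the counts below were chosen only to illustrate a balanced 20 positive + 20 negative set and are not taken from the study.

```python
import math


def accuracy_percent(tp, tn, fp, fn):
    """Percentage of correctly classified peptides."""
    total = tp + tn + fp + fn
    return 100.0 * (tp + tn) / total


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: return 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a 20 positive + 20 negative validation set
# (illustrative only; not reported in the source).
tp, tn, fp, fn = 18, 18, 2, 2

acc = accuracy_percent(tp, tn, fp, fn)
m = mcc(tp, tn, fp, fn)
print(f"Accuracy = {acc:.2f}%, MCC = {m:.2f}")
```

With these illustrative counts, 36 of 40 peptides are classified correctly, so an accuracy of 90.00 with an MCC of 0.80 is one combination that a balanced 40-peptide set can produce. Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why two models with equal accuracy can differ in MCC.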