2015 Nov 26;104(6):753–763. doi: 10.1002/bip.22703

Table 1.

Performance Evaluation During 10‐Fold Cross Validation

PCC for training/testing on T683 (10×) and for validation on V76:

| S. No. | Feature | No. of Features | SVM, T683 (10×) | RF, T683 (10×) | IBk, T683 (10×) | K*, T683 (10×) | SVM, V76 | RF, V76 | IBk, V76 | K*, V76 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Amino acid composition (Mono) | 20 | 0.59 | 0.61 | 0.44 | 0.41 | 0.64 | 0.64 | 0.42 | 0.41 |
| 2 | Di‐peptide composition (Di) | 400 | 0.61 | 0.60 | 0.47 | 0.43 | 0.66 | 0.62 | 0.47 | 0.45 |
| 3 | C8 Binary profile (C8 Bin) | 160 | 0.56 | 0.57 | 0.45 | 0.42 | 0.59 | 0.60 | 0.43 | 0.41 |
| 4 | N8 Binary profile (N8 Bin) | 160 | 0.51 | 0.54 | 0.45 | 0.43 | 0.48 | 0.60 | 0.45 | 0.43 |
| 5 | Physicochemical properties (Physico) | 315 | 0.59 | 0.54 | 0.46 | 0.44 | 0.63 | 0.68 | 0.46 | 0.45 |
| 6 | Solvent accessibility (SA) | 21 | 0.22 | 0.20 | 0.18 | 0.19 | 0.21 | 0.18 | 0.15 | 0.16 |
| 7 | Secondary structure (SS) | 3 | 0.18 | 0.18 | 0.16 | 0.17 | 0.19 | 0.16 | 0.17 | 0.18 |
| 8 | 1 + 2 | 420 | 0.60 | 0.61 | 0.47 | 0.45 | 0.67 | 0.62 | 0.48 | 0.48 |
| 9 | 3 + 4 | 320 | 0.59 | 0.62 | 0.51 | 0.48 | 0.62 | 0.65 | 0.52 | 0.50 |
| 10 | 1 + 2 + 5 | 735 | 0.63 | 0.61 | 0.52 | 0.51 | 0.70 | 0.64 | 0.54 | 0.51 |
| 11 | 3 + 4 + 5 | 635 | 0.63 | 0.60 | 0.51 | 0.50 | 0.72 | 0.67 | 0.52 | 0.50 |
| 12 | 1 + 2 + 3 + 4 | 740 | 0.61 | 0.62 | 0.51 | 0.49 | 0.67 | 0.63 | 0.51 | 0.50 |
| 13 | 1 + 2 + 3 + 4 + 5 | 1055 | 0.62 | 0.61 | 0.50 | 0.51 | 0.66 | 0.64 | 0.54 | 0.53 |
| 14 | 6 + 7 | 23 | 0.22 | 0.20 | 0.18 | 0.21 | 0.23 | 0.19 | 0.20 | 0.18 |
| 15 | 1 + 2 + 5 + 6 + 7 | 758 | 0.66 | 0.63 | 0.55 | 0.54 | 0.74 | 0.68 | 0.59 | 0.57 |
| 16 | 3 + 4 + 5 + 6 + 7 | 658 | 0.65 | 0.64 | 0.56 | 0.55 | 0.73 | 0.70 | 0.58 | 0.56 |

10‐Fold cross validation performance of the predictive models on the AVP dataset of 683 sequences (T683), and performance of the same models on the validation dataset of 76 peptides (V76), using the SVM, RF, IBk, and K* MLTs.

Abbreviations: SVM: support vector machine; RF: random forest; IBk: instance‐based classifier (Weka); K*: KStar (Weka); T683: training dataset of 683 AVPs; 10×: 10‐fold cross validation; V76: independent dataset of 76 AVPs.
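
For readers who want to reproduce the kind of score reported in the table, the following is a minimal sketch of the two generic ingredients: a correlation coefficient between observed and predicted values, and a 10-fold partition of a 683-sample dataset. It assumes that PCC denotes the Pearson correlation coefficient. The function names `pearson_cc` and `k_fold_indices` are hypothetical, and the sketch does not reproduce the original Weka/SVM pipeline or state whether the reported PCC was computed per fold or over pooled predictions.

```python
import math
import random


def pearson_cc(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Covariance and variances (unnormalised; the 1/n factors cancel).
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("PCC is undefined when either sequence is constant")
    return cov / math.sqrt(var_o * var_p)


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


if __name__ == "__main__":
    folds = k_fold_indices(683, k=10)
    # Each fold serves once as the test set; the other nine form the training set.
    for test_fold in folds:
        held_out = set(test_fold)
        train = [i for i in range(683) if i not in held_out]
        assert len(train) + len(test_fold) == 683
    print([len(f) for f in folds])
```

In use, a model would be trained on the nine training folds, its predictions for the held-out fold collected, and `pearson_cc` applied to the observed and predicted values; the same function applies unchanged to predictions on an independent set such as V76.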