2017 Jul 6;7(9):2931–2943. doi: 10.1534/g3.117.044024

Table 1. Performance of different predictive models on the training/testing dataset of 737 sequences (T737) during 10-fold cross-validation, and evaluation of the same models on an independent validation dataset (V185).

PCC on Training/Testing Sets (T737) and Independent Validation Sets (V185) Using 10nCV

| Predictive Model No. | siRNA Feature Name | No. of Features | T737 (PCC) | V185 (PCC) |
| --- | --- | --- | --- | --- |
| 1 | Mononucleotide composition | 4 | 0.53 | 0.54 |
| 2 | Dinucleotide composition | 16 | 0.68 | 0.64 |
| 3 | Trinucleotide composition | 64 | 0.70 | 0.66 |
| 4 | Tetranucleotide composition | 256 | 0.69 | 0.65 |
| 5 | Pentanucleotide composition | 1024 | 0.68 | 0.63 |
| 6 | Binary | 76 | 0.55 | 0.56 |
| 7 | 1+2 | 20 | 0.67 | 0.63 |
| 8 | 1+2+3 | 84 | 0.70 | 0.63 |
| 9 | 1+2+3+4 | 340 | 0.71 | 0.65 |
| 10 | 1+2+3+4+5 | 1364 | 0.71 | 0.65 |
| 11 | 1+2+3+4+6 (ASPsiPredSVM) | 416 | 0.71 | 0.65 |
| 12 | 1+2+3+4+5+6 | 1440 | 0.71 | 0.65 |
| 13 | Thermodynamic feature | 21 | 0.41 | 0.30 |
| 14 | Secondary structure | 19 | 0.24 | 0.07 |
| 15 | 13+14 | 40 | 0.35 | 0.23 |
| 16 | 12+13 | 437 | 0.71 | 0.65 |
| 17 | 12+14 | 435 | 0.71 | 0.65 |
| 18 | 12+13+14 | 456 | 0.71 | 0.65 |
| 19 | ASPsiPredmatrix | Matrix based | Developed on rules-based studies | 0.63 |

PCC, Pearson correlation coefficient; 10nCV, 10-fold cross-validation; T737, training/testing dataset for 10-fold cross-validation; V185, independent validation dataset. PCC is computed between the actual and predicted Effmut. The training/testing dataset was used to train the different predictive models, whereas the independent validation dataset was not used at any point during training/testing of the algorithm.
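
The evaluation protocol described in the footnote (PCC under 10-fold cross-validation) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names (`pearson`, `k_fold_indices`, `cross_validated_pcc`, `fit_line`, `predict_line`) are hypothetical, a one-variable least-squares line stands in for the SVM models of the table, and pooling the out-of-fold predictions before computing a single PCC is an assumption (averaging per-fold PCCs would be an equally plausible reading).

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_pcc(X, y, fit, predict, k=10, seed=0):
    """Pool out-of-fold predictions over k folds, then compute one PCC.

    Assumption: predictions are pooled across folds; the source does not
    say whether PCC was pooled or averaged per fold.
    """
    preds = [None] * len(y)
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            preds[i] = predict(model, X[i])
    return pearson(y, preds)


def fit_line(xs, ys):
    """Toy stand-in model (not an SVM): least-squares line for 1-D inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def predict_line(model, x):
    """Evaluate the fitted line at a single input value."""
    slope, intercept = model
    return slope * x + intercept
```

A call such as `cross_validated_pcc(X, y, fit_line, predict_line)` yields the cross-validated PCC on the training/testing set. The independent-validation figure would instead come from fitting once on all training data and calling `pearson` on predictions for data never seen during training.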