Table 1. Performance of the predictive models on the training/testing dataset of 737 sequences (T737) under 10-fold cross-validation, and their evaluation on an independent validation dataset (V185).
| Predictive Model No. | siRNA Feature Name | No. of Features | PCC, T737 | PCC, V185 |
|---|---|---|---|---|
| 1 | Mononucleotide composition | 4 | 0.53 | 0.54 |
| 2 | Dinucleotide composition | 16 | 0.68 | 0.64 |
| 3 | Trinucleotide composition | 64 | 0.70 | 0.66 |
| 4 | Tetranucleotide composition | 256 | 0.69 | 0.65 |
| 5 | Pentanucleotide composition | 1024 | 0.68 | 0.63 |
| 6 | Binary | 76 | 0.55 | 0.56 |
| 7 | 1+2 | 20 | 0.67 | 0.63 |
| 8 | 1+2+3 | 84 | 0.70 | 0.63 |
| 9 | 1+2+3+4 | 340 | 0.71 | 0.65 |
| 10 | 1+2+3+4+5 | 1364 | 0.71 | 0.65 |
| 11 | 1+2+3+4+6 (ASPsiPredSVM) | 416 | 0.71 | 0.65 |
| 12 | 1+2+3+4+5+6 | 1440 | 0.71 | 0.65 |
| 13 | Thermodynamic feature | 21 | 0.41 | 0.30 |
| 14 | Secondary structure | 19 | 0.24 | 0.07 |
| 15 | 13+14 | 40 | 0.35 | 0.23 |
| 16 | 11+13 | 437 | 0.71 | 0.65 |
| 17 | 11+14 | 435 | 0.71 | 0.65 |
| 18 | 11+13+14 | 456 | 0.71 | 0.65 |
| 19 | ASPsiPredmatrix | Matrix based | Developed on rules-based studies | 0.63 |
PCC, Pearson correlation coefficient; 10nCV, 10-fold cross-validation; T737, training/testing dataset for 10-fold cross-validation; V185, independent validation dataset. PCC is computed between actual and predicted Effmut. The training/testing dataset was used to train the predictive models, whereas the independent validation dataset was not used at any stage of training or testing.
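
The feature counts in the table follow from the size of the nucleotide alphabet: a k-nucleotide composition has 4^k entries (4, 16, 64, 256, 1024). The sketch below illustrates how such feature vectors and the PCC could be computed. It is not the authors' implementation. The function names, the example sequence, the normalization by the number of windows, and the reading of the 76 binary features as a one-hot encoding of a 19-nt sequence (19 × 4) are all assumptions made for illustration.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGU"  # RNA alphabet (assumption: sequences are given as RNA)


def kmer_composition(seq, k):
    """Relative frequency of every possible k-mer, giving 4**k values."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1  # number of overlapping windows
    if total <= 0:
        return [0.0] * len(kmers)
    for i in range(total):
        window = seq[i:i + k]
        if window in counts:  # windows with unknown characters are skipped
            counts[window] += 1
    return [counts[m] / total for m in kmers]


def binary_pattern(seq):
    """One-hot encoding: 4 values per position (assumed meaning of 'Binary')."""
    return [1 if base == letter else 0 for base in seq for letter in ALPHABET]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("PCC is undefined for a constant input")
    return cov / (sd_x * sd_y)


# Hypothetical 19-nt sequence, used only to check the vector lengths.
seq = "GCAUGCUAGCUAGGCUAAC"

# Hybrid feature vector in the style of model 11 (1+2+3+4+6).
hybrid = []
for k in (1, 2, 3, 4):
    hybrid.extend(kmer_composition(seq, k))
hybrid.extend(binary_pattern(seq))
print(len(hybrid))
```

Under these assumptions the hybrid vector has 4 + 16 + 64 + 256 + 76 = 416 entries, matching the feature count listed for model 11. In practice, `pearson` would be applied to the actual and predicted Effmut values of the held-out sequences.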