Table 1. Number of sequences in the training and testing sets.
Species | Train Pos | Train Neg r | Train Neg sno | Train Neg sn | Train Neg t | Train Neg SINE | Train Neg ps | Test Pos | Test Neg r | Test Neg sno | Test Neg sn | Test Neg t | Test Neg SINE | Test Neg ps |
Homo sapiens | 584 | 98 | 98 | 98 | 86 | 5 | 250 | 500 | 97 | 97 | 97 | 86 | 5 | 225 |
Canis familiaris | 158 | 34 | 34 | 34 | 0 | 0 | 55 | 159 | 33 | 35 | 35 | 0 | 0 | 53 |
Rattus norvegicus | 195 | 43 | 45 | 44 | 0 | 0 | 60 | 196 | 30 | 30 | 30 | 0 | 0 | 84 |
Drosophila melanogaster | 112 | 12 | 24 | 11 | 25 | 0 | 40 | 113 | 13 | 24 | 11 | 25 | 0 | 40 |
Caenorhabditis elegans | 108 | 5 | 25 | 25 | 25 | 0 | 25 | 109 | 5 | 25 | 25 | 25 | 0 | 25 |
Mus musculus | 348 | 75 | 75 | 75 | 0 | 0 | 123 | 348 | 75 | 75 | 75 | 0 | 0 | 123 |
Abbreviations: Pos, positive; Neg, negative; r, rRNA sequences; sno, snoRNA sequences; sn, snRNA sequences; t, tRNA sequences; SINE, short interspersed nuclear element sequences; ps, pseudohairpin sequences from mRNAs.
All possible sources of redundancy were eliminated while preparing the datasets. The test sets contained only instances not seen during training.
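
The disjointness requirement stated above can be illustrated with a minimal sketch. The procedure actually used to remove redundancy is not described here, so the following assumes the simplest case: only exact duplicate sequences are dropped (after normalising case and T/U), and the remainder is split at random. The function name `dedupe_and_split`, the `test_fraction` and `seed` parameters, and the toy sequences are hypothetical and serve illustration only.

```python
import random


def dedupe_and_split(sequences, test_fraction=0.5, seed=0):
    """Drop exact duplicates, then split into disjoint training and testing lists.

    Assumption: redundancy means exact sequence identity. A similarity-based
    filter would need a clustering step that is not shown here.
    """
    # Normalise case and T/U so the same sequence written two ways counts once.
    unique = sorted(
        {s.strip().upper().replace("T", "U") for s in sequences if s.strip()}
    )
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(unique)
    n_test = int(len(unique) * test_fraction)
    # Slicing one shuffled list guarantees the two sets share no sequence.
    return unique[n_test:], unique[:n_test]


# Toy input: "acgt" duplicates "ACGU" after normalisation; "GGCA" appears twice.
train, test = dedupe_and_split(["ACGU", "acgt", "GGCA", "UUAG", "GGCA", "CCAU"])
assert not set(train) & set(test)  # no test instance was seen in training
```

Because both sets are slices of a single de-duplicated list, no test sequence can also occur in the training set; per-class counts such as those in Table 1 would be obtained by applying the same split to each class separately.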