Table 2.
Exon Prediction Using SVM
Splice sites | Flanks | Exon bodies | 32/37 true positives | 35/37 true positives | 37/37 true positives |
---|---|---|---|---|---|
— | — | — | 1225 | 1225 | 1225 |
— | + | — | 164 | 259 | 668 |
— | — | + | 108 | 232 | 383 |
+ | — | + | 58 | 111 | 180 |
+ | + | + | 19 | 53 | 90 |
Eight human genes (3 to 42 kb, average 13 kb) were scanned for potential exons using criteria relaxed enough to capture all 37 real internal exons (acceptor and donor splice site scores of 70, lengths from 18 to 300 nt); this scan also generated 1225 pseudo exons. An SVM was then asked to classify each candidate as a real or pseudo exon. The weights given to the various SVM components were varied, yielding different degrees of success in recognizing the 37 real internal exons (32, 35, or all 37 detected). For each of these levels, the table gives the number of false positives (pseudo exons chosen as real exons) for each combination of splice site sequence, flank sequence, and/or exon body sequence information, with + marking the information that was included. Note that no reading frame information was included.
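The counts in Table 2 can be turned into rates to make the rows easier to compare. The following is a minimal sketch, not part of the original analysis: the names `FALSE_POSITIVES` and `summarize` are hypothetical, and the derived quantities (sensitivity, precision, false positive rate) are this sketch's own framing of the tabulated counts, assuming the 37 real exons and 1225 pseudo exons make up the full candidate set.

```python
# Counts copied from Table 2. Keys are (splice sites, flanks, exon bodies)
# inclusion flags; inner keys are the number of true positives detected.
TOTAL_REAL = 37      # real internal exons among the candidates
TOTAL_PSEUDO = 1225  # pseudo exons among the candidates

FALSE_POSITIVES = {
    (False, False, False): {32: 1225, 35: 1225, 37: 1225},
    (False, True, False): {32: 164, 35: 259, 37: 668},
    (False, False, True): {32: 108, 35: 232, 37: 383},
    (True, False, True): {32: 58, 35: 111, 37: 180},
    (True, True, True): {32: 19, 35: 53, 37: 90},
}


def summarize(true_positives, false_positives):
    """Derive rates from one table cell (assumed definitions, see lead-in)."""
    return {
        "sensitivity": true_positives / TOTAL_REAL,
        "precision": true_positives / (true_positives + false_positives),
        "false_positive_rate": false_positives / TOTAL_PSEUDO,
    }


if __name__ == "__main__":
    for features, row in FALSE_POSITIVES.items():
        # Build a label such as "+/+/+" for splice sites / flanks / exon bodies.
        label = "/".join("+" if used else "-" for used in features)
        for tp, fp in row.items():
            stats = summarize(tp, fp)
            print(
                f"{label}  TP={tp}/{TOTAL_REAL}  FP={fp}  "
                f"precision={stats['precision']:.3f}  "
                f"FPR={stats['false_positive_rate']:.3f}"
            )
```

Read this way, the table shows the trade-off between detecting more real exons and admitting more pseudo exons: within every row, the false positive count rises as the number of true positives goes from 32 to 37.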