2009 Oct 28;3:141–154. doi: 10.4137/bbi.s3030

Table 3.

CDS/intron classification by UFM with filters 1, 2 and 3 for CDS and intron sequences of H. sapiens, D. melanogaster and A. thaliana, with sequence sizes ranging from 150 to 400 bp.

Species          Filters  Seq.      Size, bp
                                    150    200    250    300    350    400
H. sapiens       0        CDS*      100    100    100    100    100    100
                          Intron**  57.7   40.6   27.5   20.5   15     11.7
                 1        CDS       96.5   96.9   97.5   97.8   98.4   98.9
                          Intron    47.0   29.8   17.1   10.5   7.0    4.2***
                 1 + 3    CDS       85.3   87.9   89.2   92.6   93.5   94.5
                          Intron    28.4   18.3   10.7   7.6    4.8    2.7
                 2 + 3    CDS       78.1   81.0   82.6   85.0   86.5   87.8
                          Intron    20.3   11.0   4.6    3.3    1.5    0.8
D. melanogaster  0        CDS       100    99.8   100    99.9   99.9   99.9
                          Intron    43.6   25.8   12.2   5.6    3.7    2.6
                 1        CDS       97.7   98.3   98.7   98.5   98.6   98.8
                          Intron    43.6   25.8   12.2   5.6    3.7    2.6
                 1 + 3    CDS       87.2   89.7   90.9   92.3   93.4   94.1
                          Intron    25.6   13.2   8.4    3.9    2.3    1.8
                 2 + 3    CDS       81.3   82.2   83.6   86.0   87.4   88.5
                          Intron    21.5   11.5   5.8    3.1    1.8    1.4
A. thaliana      0        CDS       99.7   99.8   99.9   100    99.8   99.9
                          Intron    27.1   9.1    2.6    0.8    0      0
                 1        CDS       99.7   99.7   99.8   100    99.8   99.9
                          Intron    27.1   9.1    2.6    0.8    0      0
                 1 + 3    CDS       86.9   88.3   90.9   92.7   94     94.4
                          Intron    16.2   5.3    1.8    0.7    0      0
                 2 + 3    CDS       76.2   77.0   79.3   83.2   85.3   85.9
                          Intron    16.2   5.3    1.8    0.7    0      0
* “CDS” indicates the proportion (%) of CDS that were correctly classified by the corresponding algorithm, i.e. the true positives. The CDS that are not detected, i.e. the false negatives, are missing from the CDS output list.

** “Intron” indicates the proportion (%) of introns that were wrongly classified, i.e. the false positives. The non-coding sequences that are correctly classified do not appear in the output list. All entries whose value is >0 contain an ORF whose purine bias is typical of a CDS for the size threshold considered.

*** Gray areas indicate cases where the false positive rate of coding ORF diagnosis is below or close to 5%.
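
As an illustration of how the 5% criterion in the last footnote can be read off the table, the following minimal Python sketch takes the intron false positive rates for filter 1 (copied from Table 3) and reports the smallest sequence size at which the rate falls to 5% or less. The function name `smallest_size_within` and the strict `<= 5.0` cut-off are assumptions made for this example only; the footnote itself says "below or close to 5%", so the shaded cells in the original table may include values slightly above the cut-off.

```python
# Sequence sizes (bp) and intron false positive rates (%) copied from Table 3,
# filter 1 rows only. Nothing here is recomputed from sequence data.
SIZES_BP = [150, 200, 250, 300, 350, 400]

INTRON_FP_FILTER_1 = {
    "H. sapiens": [47.0, 29.8, 17.1, 10.5, 7.0, 4.2],
    "D. melanogaster": [43.6, 25.8, 12.2, 5.6, 3.7, 2.6],
    "A. thaliana": [27.1, 9.1, 2.6, 0.8, 0.0, 0.0],
}


def smallest_size_within(rates, sizes=SIZES_BP, cutoff=5.0):
    """Return the smallest size whose rate is <= cutoff, or None if none qualifies.

    The strict cutoff is an assumption for this sketch, not the authors' rule.
    """
    for size, rate in zip(sizes, rates):
        if rate <= cutoff:
            return size
    return None


if __name__ == "__main__":
    for species, rates in INTRON_FP_FILTER_1.items():
        print(species, smallest_size_within(rates))
```

Under this strict reading, the threshold is reached at 400 bp for H. sapiens, 350 bp for D. melanogaster and 250 bp for A. thaliana with filter 1.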