Table 2.
Data set | EasyGene | Glim | rbs-Glim | Orpheus | Gm24 | GmS | Gmhmm | Frame |
A'-% found | 98.4 | 98.9/98.9 | 98.9 | 98.0/95.3 | 91.5 | 97.2 | 98.1 | 97.0 |
A'-% exact | 93.8 | 98.9/95.3 | 84.1 | 95.1/92.4 | 41.6 | 88.0 | 85.7 | 93.2 |
B'-% found | 98.4 | 98.5/98.6 | 98.6 | 95.9/96.5 | 90.2 | 96.6 | 97.2 | 96.4 |
T-% found | 98.1(98.0) | 98.3/98.4 | 98.4 | 96.5/95.6 | 89.8 | 96.3 | 97.1 | 96.1 |
Genome | 4145 | 6827/5756 | 5756 | 9333/7543 | 3552 | 4064 | 4230 | 4064 |
zero order | 7 | 169/211 | 211 | 6761/5430 | 6 | 153 | 1459 | 0 |
first order | 7 | 545/723 | 723 | 6836/4804 | 13 | 241 | 830 | 0 |
third order | 1 | 2423/2694 | 2694 | 6582/4817 | 43 | 659 | 866 | 1 |
shadows | 0 | 19/21 | 21 | 22/9 | 1 | 0 | 2 | 0 |
The upper part shows the percentage of genes found exactly (both 5' and 3' ends correct) and partially (only the 3' end correct) by different gene finders on sets of high-confidence genes in E. coli. For Glimmer and Orpheus, the numbers before the "/" are based exclusively on their ORF scores and recommended threshold, whereas the numbers after the "/" are based on their post-processing procedures. The "Genome" row shows the number of genes predicted in the whole genome, which should be compared to the 4288 annotated genes in E. coli. The lower part of the table shows the number of false positives predicted in random sequences generated by Markov chains of order 0, 1 and 3, and the last row shows the number of false predictions in the shadows of the high-confidence genes in data set A. All values listed for EasyGene are based on an R-value threshold of R = 2.
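
To make the random-sequence test in the lower part of the table concrete, the following is a minimal sketch of how a random sequence could be drawn from a Markov chain of a given order. It is an illustration under stated assumptions, not the procedure used to produce the numbers above: the function names `train_markov` and `generate` are hypothetical, the transition counts are assumed to be estimated from a training sequence, and contexts never seen in training are assumed to fall back to a uniform choice.

```python
import random
from collections import Counter, defaultdict


def train_markov(seq, order):
    """Count which nucleotide follows each context of length `order`.

    Hypothetical helper: assumes the chain is estimated from `seq`.
    For order 0 the only context is the empty string.
    """
    counts = defaultdict(Counter)
    for i in range(order, len(seq)):
        counts[seq[i - order:i]][seq[i]] += 1
    return counts


def generate(counts, order, length, seed_context, rng):
    """Sample a sequence of `length` symbols from the counted transitions."""
    out = list(seed_context)
    while len(out) < length:
        context = "".join(out[len(out) - order:])
        followers = counts.get(context)
        if not followers:
            # Assumption: unseen context, fall back to a uniform choice.
            out.append(rng.choice("ACGT"))
            continue
        symbols, weights = zip(*sorted(followers.items()))
        out.append(rng.choices(symbols, weights=weights)[0])
    return "".join(out[:length])


if __name__ == "__main__":
    rng = random.Random(0)
    training = "ATGGCGTACGTTAGCCGATAGCTAGGCTAACGTTGCA" * 20  # toy data
    for order in (0, 1, 3):
        model = train_markov(training, order)
        random_seq = generate(model, order, 60, training[:order], rng)
        print(order, random_seq)
```

A gene finder would then be run on such sequences, and every gene it reports counted as a false positive, since the sequences contain no real genes by construction.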