Table 2.
Variable (columns: similarity category, dataset) | 10⁻⁵ > P > 10⁻²⁰, Draft | 10⁻⁵ > P > 10⁻²⁰, Finish | 10⁻⁴⁰ > P > 10⁻⁸⁰, Draft | 10⁻⁴⁰ > P > 10⁻⁸⁰, Finish | 10⁻¹²⁰ > P > 10⁻¹⁸⁰, Draft | 10⁻¹²⁰ > P > 10⁻¹⁸⁰, Finish |
---|---|---|---|---|---|---|
No. of genes in dataset | 174 | 174 | 151 | 151 | 93 | 93 |
% of fragmented genes | 42 | 0 | 43 | 0 | 55 | 0 |
No. of predicted genes* | 186 | 172 | 205 | 159 | 152 | 104 |
Genes completely covered (%) | 38 | 58 | 48 | 71 | 57 | 73 |
Genes partially covered (%) | 49 | 32 | 51 | 28 | 42 | 27 |
Genes missed (%) | 13 | 10 | 1 | 1 | 1 | 0 |
No. of “extra” predicted genes* | 18 | 14 | 19 | 10 | 8 | 11 |
Sequences were grouped according to the level of similarity between the encoded protein and the available database proteins used in the predictions, as described in the legend to Fig. 3. All known genes in the FinishGene set are complete (all coding exons are present in a single sequence). Some genes in the DraftGene set are represented by multiple “partial genes” in different draft contigs; these are listed as fragmented genes. Known genes were classified as completely covered if all exons were covered by GenomeScan-predicted exons; partially covered if some (but not all) exons were covered by GenomeScan-predicted exons; and missed if no exon was covered by a GenomeScan-predicted exon. GenomeScan-predicted genes that did not overlap any known gene are listed as “extra” predicted genes.
*Includes predicted partial genes as well as complete genes.
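
The coverage categories defined in the legend can be illustrated with a minimal sketch. It assumes that an exon counts as "covered" when it overlaps a predicted exon by at least one base and that exons are given as inclusive (start, end) coordinates on the same sequence; the legend does not state the exact overlap criterion, and the function names `overlaps`, `classify_gene`, and `is_extra_prediction` are hypothetical, not part of GenomeScan.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive; an assumed representation


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify_gene(known_exons: List[Interval],
                  predicted_exons: List[Interval]) -> str:
    """Classify a known gene by how many of its exons are covered.

    Assumption: "covered" means overlapping any predicted exon.
    """
    if not known_exons:
        raise ValueError("a known gene needs at least one exon")
    covered = sum(
        any(overlaps(exon, pred) for pred in predicted_exons)
        for exon in known_exons
    )
    if covered == len(known_exons):
        return "completely covered"
    if covered > 0:
        return "partially covered"
    return "missed"


def is_extra_prediction(predicted_gene: List[Interval],
                        known_genes: List[List[Interval]]) -> bool:
    """A predicted gene is "extra" if none of its exons overlaps any known exon."""
    return not any(
        overlaps(pred, exon)
        for pred in predicted_gene
        for gene in known_genes
        for exon in gene
    )


# Toy coordinates, invented purely for illustration.
known = [(1, 10), (20, 30)]
print(classify_gene(known, [(5, 8), (25, 40)]))   # completely covered
print(classify_gene(known, [(5, 8)]))             # partially covered
print(classify_gene(known, [(100, 200)]))         # missed
```

A stricter criterion, such as requiring exact exon boundaries, would change only `overlaps`; the three-way classification would stay the same.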