Table II. Statistics of manual versus automated rice genome annotation

Statistic | Manual | Automated |
---|---|---|
Number of BACs | 286 | 3,617 |
Total length (bp)<sup>a</sup> | 38,489,150 | 476,318,197 |
Average BAC GC content (%) | 43.5 | 43.5 |
Average intergenic GC content (%) | 40.9 | 41.4 |
Average exon GC content (%) | 54.3 | 53.1 |
Average intron GC content (%) | 38.7 | 38.7 |
No. of genes<sup>b</sup> | 6,717 (1,311) | 78,950 (17,665) |
Average gene size (bp)<sup>c</sup> | 2,411 (3,111) | 2,519 (3,383) |
Total gene length (bp) | 16,197,893 (42.1%) | 198,925,545 (41.8%) |
Gene density (kb/gene) | 5.7 | 6.2 |
Known/putative genes | 3,668 (54.6%) | 44,287 (56.1%) |
Expressed genes | 702 (10.5%) | 6,468 (8.2%) |
Hypothetical genes | 2,347 (34.9%) | 28,195 (35.7%) |
Total no. of gene models<sup>d</sup> | 7,232 (1,315) | 82,921 (17,730) |
Average exon no. per model | 4.2 | 4.2 |
Average exon size (bp) | 289 | 312 |
Average intron size (bp) | 375 | 364 |

<sup>a</sup> Total length is summed over all BAC/PAC clones and includes the overlapping regions between clones.

<sup>b</sup> Genes are counted at the BAC/PAC level, so the numbers include genes duplicated in the overlap regions. Numbers in parentheses are the numbers of transposable element (TE)-related genes.

<sup>c</sup> Gene size is averaged over all genes; the value in parentheses is the average size of TE-related genes.

<sup>d</sup> Gene models are counted at the BAC/PAC level, so the totals include models duplicated in the overlap regions. Numbers in parentheses are the numbers of TE-related gene models.
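
The percentages reported in Table II can be recomputed from the raw counts. The following is a minimal sketch in Python, not the authors' pipeline: the dictionary layout, the key names, and the helper functions `pct` and `summarize` are assumptions made for illustration, while the numeric values are copied from the table.

```python
# Counts copied from Table II. The layout and key names are illustrative only.
TABLE_II = {
    "manual": {
        "total_bp": 38_489_150,      # total BAC/PAC length, overlaps included
        "gene_bp": 16_197_893,       # total gene length
        "genes": 6_717,
        "known_putative": 3_668,
        "expressed": 702,
        "hypothetical": 2_347,
    },
    "automated": {
        "total_bp": 476_318_197,
        "gene_bp": 198_925_545,
        "genes": 78_950,
        "known_putative": 44_287,
        "expressed": 6_468,
        "hypothetical": 28_195,
    },
}


def pct(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


def summarize(stats: dict) -> dict:
    """Recompute the percentage columns of Table II for one annotation set."""
    return {
        # Fraction of the sequenced length covered by annotated genes.
        "gene_length_pct": pct(stats["gene_bp"], stats["total_bp"]),
        # Share of each gene category among all annotated genes.
        "known_putative_pct": pct(stats["known_putative"], stats["genes"]),
        "expressed_pct": pct(stats["expressed"], stats["genes"]),
        "hypothetical_pct": pct(stats["hypothetical"], stats["genes"]),
    }


if __name__ == "__main__":
    for name, stats in TABLE_II.items():
        print(name, summarize(stats))
```

In both columns the three gene categories (known/putative, expressed, hypothetical) sum to the total number of genes, so their percentages are shares of a single whole.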