Table I.
Evidence supporting annotated Arabidopsis gene models
Support was determined by BLAST-based similarity between the current, complete set of gene models/proteins and the following datasets: non-Arabidopsis proteins parsed from the TIGR nonredundant protein set (February 1, 2003); the Arabidopsis protein set as represented in TIGR's annotation database (February 1, 2003); and Arabidopsis cDNAs, Arabidopsis ESTs, and non-Arabidopsis plant ESTs downloaded from GenBank (January 28, 2003). Matches between an Arabidopsis protein and itself were excluded from the count for Arabidopsis protein support.
Each gene was counted only once, in the category associated with the highest overall confidence. Gene models supported by both Arabidopsis cDNA and protein similarity to another organism are considered to be highest confidence. Genes with no cDNA/EST support and no protein support are based solely on gene predictions and form the lowest-confidence set.
| | Non-Arabidopsis Protein | Arabidopsis Protein | No Protein | Total |
---|---|---|---|---|
Arabidopsis cDNA | 9,017 | 2,055 | 1,664 | 12,736 |
Arabidopsis EST | 2,607 | 1,041 | 1,033 | 4,681 |
Other Plant EST | 3,691 | 1,365 | 1,032 | 6,088 |
No cDNA/EST | 247 | 2,280 | 1,352 | 3,879 |
Total | 15,562 | 6,741 | 5,081 | 27,384 |
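
The "counted only once" rule in the legend can be sketched as a small classifier that places each gene in a single cell of the table. This is a minimal illustration, not the procedure actually used: the priority order within rows and columns is an assumption inferred from the order in which the table lists them, and the gene names and evidence sets are hypothetical.

```python
from collections import Counter

# ASSUMPTION: priority (strongest first) is inferred from the row and column
# order of Table I; the legend does not state the full ranking explicitly.
TRANSCRIPT_PRIORITY = ["Arabidopsis cDNA", "Arabidopsis EST", "Other Plant EST"]
PROTEIN_PRIORITY = ["Non-Arabidopsis Protein", "Arabidopsis Protein"]


def best_category(evidence, priority, fallback):
    """Return the first (strongest) category in `priority` found in `evidence`."""
    for category in priority:
        if category in evidence:
            return category
    return fallback


def classify(evidence):
    """Map a gene's set of evidence types to exactly one (row, column) cell."""
    row = best_category(evidence, TRANSCRIPT_PRIORITY, "No cDNA/EST")
    col = best_category(evidence, PROTEIN_PRIORITY, "No Protein")
    return row, col


# Hypothetical genes with made-up evidence sets, for illustration only.
genes = {
    "gene_a": {"Arabidopsis cDNA", "Arabidopsis EST", "Non-Arabidopsis Protein"},
    "gene_b": {"Other Plant EST", "Arabidopsis Protein"},
    "gene_c": set(),
}

# Each gene contributes to exactly one cell, so the counts sum to len(genes).
counts = Counter(classify(evidence) for evidence in genes.values())
for cell, n in counts.items():
    print(cell, n)
```

Because every gene lands in exactly one cell, the row and column totals of such a table each sum to the number of genes, which is the property the Total row and column of Table I reflect.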