letter
2004 Dec;14(12):2424–2429. doi: 10.1101/gr.3158304

Table 1.

Description of gene categories

| Category | Definition | No. of genes in category | No. of genes successfully amplified by 5′ RACE (%) | No. of RACE sequences that differed from the 5′ gene annotation (%) |
|---|---|---|---|---|
| EPD | Genes in the Eukaryotic Promoter Database having experimentally verified transcriptional start sites | 13 | 13 (100%) | 4 (31%) |
| RefSeq | NCBI's curated non-redundant gene set | 27 | 20 (74%) | 8 (40%) |
| B | Automated NCBI predictions covered by multiple ESTs | 23 | 15 (65%) | 7 (47%) |
| C | Gene predictions that are covered by a single EST only and do not overlap any mRNA, cDNA, ENSEMBL, or GENIE evidence | 169 | 40 (24%) | 30 (75%) |
| D | Gene predictions that do not overlap any EST, mRNA, cDNA, ENSEMBL, or GENIE evidence | 68 | 18 (26%) | 12 (67%) |
| Total | | 300 | 106 (35%) | 61 (58%) |

Three hundred mouse genes or gene predictions were classified into five categories based on the quality of the associated evidence. The Definition column describes the basis for the classification. Genes in the EPD category have the highest-quality evidence and were used as an internal positive control for all experiments. Genes in category D were considered to be based on the lowest-confidence evidence. 5′ RACE–PCR was performed on 15 mouse tissues/stages as described in Methods. The number of genes in each category that were successfully amplified and satisfied the criteria described in the Methods section is listed. For each category, the final column gives the number of 5′ RACE sequences for which the reference sequence annotation was found to be incomplete at the 5′ end; its percentage is calculated relative to the number of successfully amplified genes.
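As a quick consistency check, the percentages in Table 1 can be recomputed from the raw counts. The sketch below is illustrative only and is not part of the original study. It assumes, in line with the reported values, that the amplification percentage is taken relative to the number of genes in the category and that the percentage of differing sequences is taken relative to the number of amplified genes. The names `counts`, `pct`, and `summarize` are hypothetical.

```python
# Counts transcribed from Table 1:
# (genes in category, genes amplified by 5' RACE, RACE sequences differing from annotation)
counts = {
    "EPD": (13, 13, 4),
    "RefSeq": (27, 20, 8),
    "B": (23, 15, 7),
    "C": (169, 40, 30),
    "D": (68, 18, 12),
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


def summarize(counts):
    """Recompute per-category and total percentages from the raw counts."""
    rows = {}
    for name, (n_genes, n_amplified, n_differ) in counts.items():
        # Assumption: amplified % is relative to genes in the category,
        # differing % is relative to amplified genes.
        rows[name] = (pct(n_amplified, n_genes), pct(n_differ, n_amplified))
    # Column-wise sums give the Total row.
    totals = tuple(sum(col) for col in zip(*counts.values()))
    rows["Total"] = (pct(totals[1], totals[0]), pct(totals[2], totals[1]))
    return totals, rows


totals, rows = summarize(counts)
for name, (amp_pct, diff_pct) in rows.items():
    print(f"{name}: amplified {amp_pct}%, differing {diff_pct}%")
```

Under these assumptions the column sums are 300, 106, and 61, and the recomputed percentages agree with every row of the table, which supports reading the last column as a fraction of amplified genes and not of all genes in the category.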