Table 1.
Cluster set | Clusters (with ≥2 ESTs) | Ensembl | SWISS-PROT | InterPro |
---|---|---|---|---|
TCLAG | 21,478 (13,173) | 11,608 (8048) | 14,131 (9944) | 8015 (6368) |
NCLAG | 46,560 (20,219)<sup>a</sup> | 2079 (1105) | 6971 (4247) | 2456 (1485) |
UCLAG | 3881 (82) | n.a. | 1236 (52) | 199 (25) |
Numbers refer to overlaps with the current Ensembl gene set (14,364 genes in total), homology to known proteins in the SWISS-PROT Knowledgebase (matching 9791 sequences), and hits to protein domains in the InterPro database (matching 2144 distinct domains). Numbers in parentheses refer to clusters supported by at least two sequences. The numbers of distinct Ensembl genes overlapping T-clusters and N-clusters are 9639 (8020) and 1821 (1076), respectively, because some Ensembl genes overlap more than one EST cluster.
<sup>a</sup>Of the N-clusters, 35,660 (17,143), i.e., 77% (85%), are contributed by only 863 ESTs, each aligned to between 50 and 191 distinct genomic loci.
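The percentages in the footnote follow directly from the N-cluster counts in the table. As a minimal check, the sketch below recomputes them; the variable names and the `percent` helper are illustrative and not part of the original analysis.

```python
# Counts taken from Table 1 and its footnote; names are illustrative only.
n_clusters_all = 46_560     # all N-clusters
n_clusters_multi = 20_219   # N-clusters supported by >=2 ESTs
repeat_all = 35_660         # N-clusters contributed by the 863 multi-locus ESTs
repeat_multi = 17_143       # same, restricted to clusters with >=2 ESTs


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


pct_all = percent(repeat_all, n_clusters_all)
pct_multi = percent(repeat_multi, n_clusters_multi)

# N-clusters left once the multi-locus contribution is set aside.
remaining_all = n_clusters_all - repeat_all
remaining_multi = n_clusters_multi - repeat_multi

print(f"{pct_all}% ({pct_multi}%)")
print(f"remaining N-clusters: {remaining_all} ({remaining_multi})")
```

The subtraction at the end is only a derived illustration of how many N-clusters remain when the contribution of these 863 ESTs is excluded; the source does not report that figure itself.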