Table 1.
Data sets | GenBank annotation | ||
---|---|---|---|
Helicobacter pylori (1,667,867 nt) | |||
 | (+) Coding (722,915) | (−) Coding (780,576) | Other (164,376) |
DB1 (565,176) | 537,242 (95%) | 8,710 (1.5%) | 19,224 (3.5%) |
DB2 (254,346) | 50,294 (21.5%) | 110,225 (43.5%) | 93,827 (37%) |
DB3 (544,572) | 9,152 (2%) | 513,606 (94%) | 21,814 (4%) |
Methanococcus jannaschii (1,664,977 nt) | |||
DB1 (553,666) | 6,997 (1%) | 514,406 (93%) | 32,263 (6%) |
DB2 (187,380) | 28,258 (15%) | 57,650 (31%) | 101,472 (54%) |
DB3 (665,857) | 619,818 (93%) | 4,877 (1%) | 41,162 (6%) |
Upon convergence, sequence segments are collected into three automatically defined data sets: DB1, DB2, and DB3. For each data set, the table gives the number of nucleotides annotated in GenBank as “coding” (+), “reverse coding” (−), or “other” for H. pylori (top) and M. jannaschii (bottom); percentages are relative to the size of the data set. The numbers corresponding to the cognate DB/annotation matches are in bold. In total, 1,364,094 nt (82%) of the H. pylori genome and 1,406,903 nt (84.5%) of the M. jannaschii genome are classified (i.e., collected in DB1–DB3).
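
The classified totals quoted in the caption can be recomputed from the table entries. The following is a minimal sketch for checking that arithmetic only, not part of the original analysis; the counts are transcribed from Table 1, and the names `TABLE` and `classified_summary` are hypothetical, chosen for illustration.

```python
# Counts (nt) transcribed from Table 1, in the order
# (+) coding, (−) coding, other. "genome" is the full genome length.
TABLE = {
    "H. pylori": {
        "genome": 1_667_867,
        "DB1": (537_242, 8_710, 19_224),
        "DB2": (50_294, 110_225, 93_827),
        "DB3": (9_152, 513_606, 21_814),
    },
    "M. jannaschii": {
        "genome": 1_664_977,
        "DB1": (6_997, 514_406, 32_263),
        "DB2": (28_258, 57_650, 101_472),
        "DB3": (619_818, 4_877, 41_162),
    },
}


def classified_summary(entry):
    """Return (classified nt, percent of genome) for one organism."""
    total = sum(sum(entry[db]) for db in ("DB1", "DB2", "DB3"))
    return total, 100 * total / entry["genome"]


for organism, entry in TABLE.items():
    total, pct = classified_summary(entry)
    print(f"{organism}: {total:,} nt classified ({pct:.1f}% of genome)")
```

Summing the three data sets per organism gives 1,364,094 nt and 1,406,903 nt, matching the totals stated in the caption.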