Table 1. Start codon usage distribution of the predicted putative coding ORFs.
Start codon | ENSEMBL annotation | All predictions | Matching annotated | Extensions | Truncations | Novel |
---|---|---|---|---|---|---|
Salmonella | | | | | | |
ATG | 4093 (88.0%) | 2942 (86%) | 2576 (91.2%) | 166 (58%) | 139 (64%) | 61 (62%) |
GTG | 429 (9.20%) | 344 (10%) | 206 (7.4%) | 64 (24%) | 52 (24%) | 18 (18%) |
TTG | 126 (2.70%) | 135 (4%) | 40 (1.4%) | 51 (18%) | 25 (12%) | 19 (20%) |
E. coli | | | | | | |
ATG | 3747 (90.1%) | 2776 (87%) | 2591 (91%) | 76 (42%) | 49 (70%) | 60 (62%) |
GTG | 386 (9.2%) | 284 (9%) | 209 (7%) | 44 (25%) | 9 (13%) | 22 (22%) |
TTG | 71 (2.0%) | 142 (4%) | 55 (2%) | 59 (33%) | 12 (17%) | 16 (16%) |
Bacillus | | | | | | |
ATG | 3253 (77.7%) | 2502 (73%) | 2176 (80%) | 141 (41%) | 75 (56%) | 110 (49%) |
GTG | 386 (9.2%) | 413 (12%) | 237 (9%) | 108 (31%) | 29 (22%) | 39 (17%) |
TTG | 529 (12.6%) | 520 (15%) | 320 (12%) | 94 (29%) | 30 (22%) | 76 (34%) |
The predicted ORFs in all three species follow the start codon usage distributions of the corresponding species annotations. In the case of Salmonella and E. coli, only ORFs from the high-confidence set were considered.
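
As an illustration of how the percentages in Table 1 relate to the raw counts, the sketch below converts per-codon ORF counts into usage shares. The counts are the Salmonella ENSEMBL annotation values from the table; the function name `start_codon_usage` is hypothetical, and the denominator is assumed to be the sum of the three listed start codons only. The published percentages may have been computed against a slightly different total, so small differences in the last digit are possible.

```python
# Counts copied from the "ENSEMBL annotation" column for Salmonella in Table 1.
salmonella_annotated = {"ATG": 4093, "GTG": 429, "TTG": 126}


def start_codon_usage(counts):
    """Return each start codon's share of the total, in percent.

    Assumption: the total is the sum of the codons passed in; ORFs with
    other start codons, if any, are not part of the denominator.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no ORFs counted")
    return {codon: 100.0 * n / total for codon, n in counts.items()}


usage = start_codon_usage(salmonella_annotated)
for codon, pct in usage.items():
    print(f"{codon}: {pct:.1f}%")
```

The same function can be applied to any other column (for example the extensions or novel ORFs) to compare its distribution against the annotation.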