Table 1. Identification of genes and reading frame length^a
677 previously annotated reading frames | |
+, 6 | Intergenic hits to a hypothetical protein |
MPN605(MP236–237), MPN482(MP359–360), MPN418(MP421–422), MPN388(MP450–451), MPN270(MP564–565), MPN254(MP579.1), | |
(MP313–314)^b, (MP365–366)^b, (MP383–384)^b, (MP384–385)^b | |
+, 4 | New complete protein genes |
MPN069(MP085–086) 50S ribosomal protein L33 | |
MPN495(MP346.1) PTS pentitol phosphotransferase EIIB | |
MPN296(MP540–541) 30S ribosomal protein S21 | |
MPN242(MP590–591) SecG | |
+, 2 | Short hypothetical proteins |
–, 1 | The original MP237 was a different, too-short reading frame and was deleted |
688 protein reading frames (after our re-annotation) | |
Re-examination of protein reading frame lengths^c | |
+, 12 | N-terminal extensions^d |
MPN118(MP037), MPN077(MP078), MPN033(MP121), MPN661(MP181), MPN651(MP191), MPN475(MP365), MPN448(MP392), MPN396(MP443), MPN395(MP444), MPN345(MP492), MPN336(MP501), MPN306(MP531) | |
For MPN033(MP121) (uracil phosphotransferase; P75081) and MPN395(MP444) (adenine phosphoribosyltransferase) the 2-dimensional gel molecular weights confirm the predicted extension | |
+, 4 | C-terminal extensions^d |
MPN111(MP044), MPN108(MP047), MPN032(MP122), MPN520(MP322) | |
–, 8 | Proteins shortened at the N-terminus |
The following protein reading frames are shorter (the N-terminus begins later) than in the previous M. pneumoniae GenBank annotation | |
MPN073(MP082), MPN643(MP199), MPN639(MP203), MPN611(MP231), MPN444(MP395), MPN432(MP408), MPN320(MP517), MPN170(MP662) |
^a All intergenic regions between the previously annotated protein reading frames were re-screened by sequence analysis to identify hitherto overlooked reading frames (top). Similarly, previously unrecognized extensions, as well as shortened reading frames, became apparent by sequence comparison (bottom).
^b These four reading frames contain in-frame stops and are not counted.
^c Data on these reading frame modifications were shared with SwissProt and either have been or will very soon be updated there.
^d The C-terminal extensions are supported by sequence alignment to related protein reading frames from other organisms. However, they are only possible with frame shifting or mutation of stop codons. This indicates either pseudogenes or sequencing errors in these regions. In addition to the N- and C-terminal extensions listed, there is a potential intergenic extension: the adjacent ORFs MPN347(MP490) and MPN345(MP492) may be connected with MPN346(MP491) via the intergenic regions to form one gene, but this would again require some frame shifting [hsdR restriction enzyme (pseudo)gene, sequencing error or gene fragments].
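The bookkeeping in Table 1 can be checked with a short script. The following is a minimal sketch in Python, not part of the original analysis: it recomputes the re-annotated total from the counts in the upper half of the table and confirms that the identifier lists in the lower half contain as many entries as their row labels state. All names (`PREVIOUS_FRAMES`, `CHANGES`, `reannotated_total`, `count_entries`) are hypothetical and introduced only for illustration; the numbers and identifiers are transcribed from the table.

```python
import re

# Counts transcribed from the upper half of Table 1.
PREVIOUS_FRAMES = 677
CHANGES = {
    "intergenic hits to a hypothetical protein": +6,
    "new complete protein genes": +4,
    "short hypothetical proteins": +2,
    "deleted reading frame (original MP237)": -1,
}

# Identifier lists transcribed from the lower half of Table 1.
N_TERMINAL_EXTENSIONS = (
    "MPN118(MP037), MPN077(MP078), MPN033(MP121), MPN661(MP181), "
    "MPN651(MP191), MPN475(MP365), MPN448(MP392), MPN396(MP443), "
    "MPN395(MP444), MPN345(MP492), MPN336(MP501), MPN306(MP531)"
)
C_TERMINAL_EXTENSIONS = "MPN111(MP044), MPN108(MP047), MPN032(MP122), MPN520(MP322)"
N_TERMINAL_SHORTENED = (
    "MPN073(MP082), MPN643(MP199), MPN639(MP203), MPN611(MP231), "
    "MPN444(MP395), MPN432(MP408), MPN320(MP517), MPN170(MP662)"
)

# Matches entries of the form MPNxxx(MPyyy), the paired new/old identifiers.
ENTRY_PATTERN = re.compile(r"MPN\d+\(MP[^)]*\)")


def reannotated_total(previous, changes):
    """Apply the signed changes to the previous number of reading frames."""
    return previous + sum(changes.values())


def count_entries(row):
    """Count the MPN(MP) identifier pairs listed in one table row."""
    return len(ENTRY_PATTERN.findall(row))


if __name__ == "__main__":
    print(reannotated_total(PREVIOUS_FRAMES, CHANGES))  # 688
    print(count_entries(N_TERMINAL_EXTENSIONS))          # 12
    print(count_entries(C_TERMINAL_EXTENSIONS))          # 4
    print(count_entries(N_TERMINAL_SHORTENED))           # 8
```

The four reading frames marked with footnote b are deliberately left out of `CHANGES`, since the table states they contain in-frame stops and are not counted.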