Abstract
The ThyX class of thymidylate synthases was previously characterized by a common ThyX motif, RHRX7S. We report bacterial ThyX sequences having distinctive ThyX motifs, suggesting a more general ThyX motif, R/THRX7-8S. One ThyX sequence contains an intein within its ThyX motif, which was shown to be capable of protein splicing, and its gene also contains a group II intron, suggesting a hot spot for these self-splicing mobile elements.
The recently discovered ThyX class of thymidylate synthases is not similar to ThyA in its structure (16) or in its reductive mechanism (10, 19), revealing complexity in the evolution of thymidine production in DNA synthesis. ThyX exists in organisms lacking ThyA and shows a sporadic phylogenetic distribution indicating lateral gene transfers (19). Its presence in many pathogenic bacteria and absence in humans make ThyX an attractive target for potential antibacterial drugs. To realize this potential, knowledge is needed regarding important structural elements including the common ThyX motif found in previously known ThyX sequences.
Inteins and introns are rare in bacterial genes and have not been found previously in thyX genes. An intein is a protein intervening sequence that can self-excise and concomitantly splice together its flanking extein sequences (21). Many inteins also harbor an endonuclease domain that initiates intein homing, which confers genetic mobility on the intein (3, 5) and explains its sporadic phylogenetic distribution (20). Group II introns are another type of self-splicing mobile element, and most bacterial group II introns encode a reverse transcriptase-like (RTL) protein (23) that assists intron splicing and mobility, including retrotransposition (2, 11, 12). Group II introns are believed to be evolutionary ancestors of nuclear spliceosomal introns (4), but they are strongly excluded from conserved protein-coding genes in bacteria (6-8), although some bacteriophage genes encode both inteins and group I introns (9, 13, 14). Surprisingly, a bacterial ribonucleotide reductase (RIR)-encoding gene was found recently to encode multiple inteins and group II introns (18). To explore and understand this new phenomenon, we searched for similar inteins and introns in related genes.
An intein- and intron-encoding thyX gene was found during a BLAST search (1) of the GenBank database. This gene is from the oceanic N2-fixing cyanobacterium Trichodesmium erythraeum. As illustrated in Fig. 1, the three exon and extein coding sequences are 196, 80, and 444 bp long, respectively, and together they predict a 240-amino-acid ThyX sequence that is very similar to known ThyX sequences. The intron was identified by its strong sequence similarity to known group II introns in this organism, including the T.er.I4 intron in an RIR-encoding gene (18) and the Tr.e.I1 intron in an intergenic sequence (6). It is more than 80% identical to the other introns in the ∼680-nucleotide folded region (data not shown), although it lacks the RTL coding sequence present in domain IV of the other introns.
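The length arithmetic in this paragraph can be checked with a short sketch. It assumes that the three reported segment lengths cover only sense codons (that is, the stop codon is not counted); the function name is hypothetical and used only for illustration.

```python
# Exon/extein coding segment lengths (bp) reported for the T. erythraeum thyX gene.
SEGMENT_LENGTHS_BP = (196, 80, 444)


def spliced_orf_length(segment_lengths):
    """Return (total_bp, codons) for the joined coding segments.

    Assumption: the segments contain sense codons only, so the codon
    count equals the predicted protein length in amino acids.
    """
    total_bp = sum(segment_lengths)
    if total_bp % 3 != 0:
        raise ValueError("joined coding sequence is not a whole number of codons")
    return total_bp, total_bp // 3


total_bp, codons = spliced_orf_length(SEGMENT_LENGTHS_BP)
print(total_bp, codons)  # 720 240
```

The joined segments give 720 bp, or 240 codons, which agrees with the predicted 240-amino-acid ThyX sequence once the intron and the intein coding sequence are removed.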
The intein was recognized by its intein sequence motifs (Fig. 2), and its boundaries were readily identified through comparisons with inteinless ThyX sequences. The intein clearly has sequence motifs (A, B, F, and G) for a splicing domain, but it either lacks or has incomplete sequence motifs (C, D, E, and H) for a homing endonuclease domain. It showed less than 15% sequence identity and no insertion site similarity to other known inteins. For example, it showed 14% sequence identity and 28% sequence similarity to the Synechocystis sp. strain PCC6803 DnaB intein. Nevertheless, it showed efficient protein splicing in a recombinant protein in Escherichia coli (Fig. 3).
The T. erythraeum thyX gene is only the second bacterial gene (except for phage genes) known to encode inteins and introns, following an RIR-encoding gene (18). It is not certain why these two genes appear to be hot spots for intein and intron insertions, which most likely involve intein homing and group II intron retrotransposition. The ThyX intein likely had a homing endonuclease domain and later lost it, judging from the putative remnants of this domain observed in the intein. The ThyX intron and the T.er.I4 intron of the RIR-encoding gene, which show strong sequence identity, are likely related through recent retrotransposition, and it is tempting to speculate that the RTL-less ThyX intron is assisted in retrotransposition by the RTL protein encoded in the T.er.I4 intron. It is interesting that the thyX gene and the RIR-encoding gene both encode proteins involved in nucleic acid metabolism, because such genes have been recognized as favored homes of inteins and introns for various reasons (8, 17). We further noticed that the ThyX intein is inserted in the conserved ThyX motif, which prompted further analysis of ThyX motifs.
New ThyX sequences having distinctive ThyX motifs were found through BLAST searches of the GenBank database. Although not having inteins or introns, they showed ThyX motifs that are similar to but distinct from the previously defined ThyX motif RHRX7S (Table 1 and Fig. 4). Most of the new ThyX sequences (T. erythraeum, Nostoc punctiforme, Nostoc sp. strain PCC7120, Synechococcus sp. strain WH8102, mycobacteriophage Bxz1, Streptomyces coelicolor, and Streptomyces avermitilis MA-4680) are readily identified because they are more than 20% identical and 30% similar to the functionally identified Helicobacter pylori J99 ThyX protein and to the structurally determined Thermotoga maritima ThyX protein. ThyX sequences from the two thermophilic sources (Thermosynechococcus elongatus and thermophilic bacteriophage RM378), however, were less easy to identify because of their striking differences from other ThyX sequences at the N and C termini (Fig. 4B). Nevertheless, they are approximately 18% identical and 34% similar to T. erythraeum ThyX over four-fifths of the T. erythraeum ThyX sequence, which is quite significant in light of the generally low levels of sequence conservation among ThyX proteins. For example, the ThyX sequences of two cyanobacterial species (T. erythraeum and Synechocystis sp. strain PCC6803) are only 17% identical and 33% similar. The shortened N-terminal sequence of the T. elongatus and thermophilic bacteriophage RM378 ThyX proteins, relative to that of other ThyX proteins, may be compensated for by the extended C-terminal sequence, although the extended C-terminal sequence does not show similarity to that of other ThyX sequences. T. elongatus ThyX is also the only recognizable thymidylate synthase (which is functionally essential) that could be predicted from the complete genome sequence of this organism.
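For readers who wish to reproduce comparisons of this kind, percent identity can be computed from a pairwise alignment as sketched below. The text does not state which alignment program or which denominator was used, so this sketch assumes identity is counted over aligned, non-gap columns only; the function name and the two short aligned strings are hypothetical and are not ThyX sequences. Percent similarity would additionally require a substitution matrix and is not shown.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment and
    therefore have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment: 8 ungapped columns, 6 of them identical.
print(percent_identity("RHRAAS-LK", "THRAGSQLK"))  # 75.0
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give different values, so the denominator should be stated when figures such as 18% or 20% are compared.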
TABLE 1.
ThyX motif | Organism | GenBank accession number |
---|---|---|
THRX8S | Trichodesmium erythraeum | NZ_AABK02000059 |
THRX8S | Nostoc punctiforme | ZP_00106512 |
THRX8S | Nostoc sp. strain PCC7120 | BAB72473 |
THRX8S | Synechococcus sp. strain WH8102 | ZP_00114730 |
THRX7S | Thermosynechococcus elongatus | NP_681946 |
THRX7S | Thermophilic bacteriophage RM378 | NP_835722 |
RHRX8S | Mycobacteriophage Bxz1 | NP_818164 |
RHRX8S | Streptomyces coelicolor | 086840 |
RHRX8S | Streptomyces avermitilis MA-4680 | NP_823693 |
RHRX7S | Synechocystis sp. strain PCC6803 | NP_440394 |
RHRX7S | Helicobacter pylori J99 | NP_224139 |
RHRX7S | Thermotoga maritima | NP_228259 |
These ThyX sequences, when grouped by their ThyX motifs as in Table 1, showed higher sequence identities within a group than between groups. In particular, the ThyX sequences of four cyanobacteria (T. erythraeum, N. punctiforme, Nostoc sp. strain PCC7120, and Synechococcus sp. strain WH8102), which share the ThyX motif THRX8S, are more than 60% identical to each other. However, they are less than 20% identical to the ThyX sequences of other cyanobacteria (e.g., Synechocystis sp. strain PCC6803) that have a different ThyX motif. The previously identified ThyX motif RHRX7S has the widest distribution, encompassing bacteria, archaea, and eucarya; therefore, it is likely the original ThyX motif present in the common ancestor of ThyX proteins. The new ThyX motifs THRX8S and THRX7S, found in some cyanobacterial species, likely represent later divergence from the presumed original ThyX motif, and they could also have been acquired through lateral gene transfer. Interestingly, three of the four distinctive ThyX motifs have been found in bacteriophages, which could have facilitated lateral transfer of thyX genes.
We suggest a more general ThyX motif, R/THRX7-8S, in order to accommodate all four of the distinctive ThyX motifs (Table 1). While this report was under review, others reported the functional characterization of two residues of the ThyX motif RHRX7-8S located at the catalytic site (15). The absolutely conserved S residue (S84) was shown to function as a nucleophile, and the first R residue (R74) was shown to participate in flavin adenine dinucleotide and dUMP binding. Our findings show, however, that the first R residue is not absolutely conserved. When a ThyX motif begins with a T instead of an R, it either has an R immediately before the T (in T. erythraeum, N. punctiforme, Nostoc sp. strain PCC7120, and Synechococcus sp. strain WH8102) or has a second R internally (in T. erythraeum and thermophilic bacteriophage RM378), which could suggest functional replacement by alternative R residues.
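The general motif R/THRX7-8S can be written as a regular expression for scanning protein sequences, as in the minimal sketch below. The pattern is a direct transcription of the motif, with X taken to mean any residue; the function name and the test sequences are hypothetical and are not real ThyX sequences. A match indicates only a candidate motif and does not by itself identify a ThyX protein.

```python
import re

# R or T, then H, R, any 7 or 8 residues, then S.
GENERAL_THYX_MOTIF = re.compile(r"[RT]HR.{7,8}S")


def find_thyx_motifs(protein_seq):
    """Return (start, matched_text, variant) for each non-overlapping match.

    The variant label (e.g. "THRX8S") is derived from the first residue
    and the spacer length. The quantifier is greedy, so if both the X7
    and X8 readings end in S, the X8 reading is the one reported.
    """
    hits = []
    for m in GENERAL_THYX_MOTIF.finditer(protein_seq):
        text = m.group()
        spacer = len(text) - 4  # subtract the [RT], H, R, and final S
        hits.append((m.start(), text, f"{text[0]}HRX{spacer}S"))
    return hits


# Hypothetical toy sequences, built by concatenation to make the spacing explicit.
seq_x8 = "MK" + "THR" + "A" * 8 + "S" + "LL"
seq_x7 = "GG" + "RHR" + "A" * 7 + "S" + "GG"

for seq in (seq_x8, seq_x7):
    for start, text, variant in find_thyx_motifs(seq):
        print(start, text, variant)
```

Grouping hits by the variant label corresponds to the four motif classes listed in Table 1.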
Acknowledgments
This work was supported by research grants from the Natural Sciences and Engineering Research Council of Canada.
REFERENCES
- 1. Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
- 2. Belfort, M., and P. S. Perlman. 1995. Mechanisms of intron mobility. J. Biol. Chem. 270:30237-30240.
- 3. Belfort, M., and R. J. Roberts. 1997. Homing endonucleases: keeping the house in order. Nucleic Acids Res. 25:3379-3388.
- 4. Cavalier-Smith, T. 1991. Intron phylogeny: a new hypothesis. Trends Genet. 7:145-148.
- 5. Cooper, A. A., and T. H. Stevens. 1995. Protein splicing: self-splicing of genetically mobile elements at the protein level. Trends Biochem. Sci. 20:351-356.
- 6. Dai, L., N. Toor, R. Olson, A. Keeping, and S. Zimmerly. 2003. Database for mobile group II introns. Nucleic Acids Res. 31:424-426.
- 7. Dai, L., and S. Zimmerly. 2002. Compilation and analysis of group II intron insertions in bacterial genomes: evidence for retroelement behavior. Nucleic Acids Res. 30:1091-1102.
- 8. Edgell, D. R., M. Belfort, and D. A. Shub. 2000. Barriers to intron promiscuity in bacteria. J. Bacteriol. 182:5281-5289.
- 9. Ghim, S. Y., S. K. Choi, B. S. Shin, and S. H. Park. 1998. An 8 kb nucleotide sequence at the 3′ flanking region of the sspC gene (184 degrees) on the Bacillus subtilis 168 chromosome containing an intein and an intron. DNA Res. 5:121-126.
- 10. Giladi, M., G. Bitan-Banin, M. Mevarech, and R. Ortenberg. 2002. Genetic evidence for a novel thymidylate synthase in the halophilic archaeon Halobacterium salinarum and in Campylobacter jejuni. FEMS Microbiol. Lett. 216:105-109.
- 11. Ichiyanagi, K., A. Beauregard, S. Lawrence, D. Smith, B. Cousineau, and M. Belfort. 2002. Retrotransposition of the Ll.LtrB group II intron proceeds predominantly via reverse splicing into DNA targets. Mol. Microbiol. 46:1259-1272.
- 12. Lambowitz, A. M., M. G. Caprara, S. Zimmerly, and P. S. Perlman. 1999. Group I and group II ribozymes as RNPs: clues to the past and guides to the future, p. 451-485. In R. F. Gesteland, T. R. Cech, and J. F. Atkins (ed.), The RNA world. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
- 13. Lazarevic, V. 2001. Ribonucleotide reductase genes of Bacillus prophages: a refuge to introns and intein coding sequences. Nucleic Acids Res. 29:3212-3218.
- 14. Lazarevic, V., B. Soldo, A. Dusterhoft, H. Hilbert, C. Mauel, and D. Karamata. 1998. Introns and intein coding sequence in the ribonucleotide reductase genes of Bacillus subtilis temperate bacteriophage SPβ. Proc. Natl. Acad. Sci. USA 95:1692-1697.
- 15. Leduc, D., S. Graziani, G. Lipowski, C. Marchand, P. Le Marechal, U. Liebl, and H. Myllykallio. 2004. Functional evidence for active site location of tetrameric thymidylate synthase X at the interphase of three monomers. Proc. Natl. Acad. Sci. USA 101:7252-7257.
- 16. Lesley, S. A., P. Kuhn, A. Godzik, A. M. Deacon, I. Mathews, A. Kreusch, G. Spraggon, H. E. Klock, D. McMullan, T. Shin, J. Vincent, A. Robb, L. S. Brinen, M. D. Miller, T. M. McPhillips, M. A. Miller, D. Scheibe, J. M. Canaves, C. Guda, L. Jaroszewski, T. L. Selby, M. A. Elsliger, J. Wooley, S. S. Taylor, K. O. Hodgson, I. A. Wilson, P. G. Schultz, and R. C. Stevens. 2002. Structural genomics of the Thermotoga maritima proteome implemented in a high-throughput structure determination pipeline. Proc. Natl. Acad. Sci. USA 99:11664-11669.
- 17. Liu, X. Q. 2000. Protein-splicing intein: genetic mobility, origin, and evolution. Annu. Rev. Genet. 34:61-76.
- 18. Liu, X. Q., J. Yang, and Q. Meng. 2003. Four inteins and three group II introns encoded in a bacterial ribonucleotide reductase gene. J. Biol. Chem. 278:46826-46831.
- 19. Myllykallio, H., G. Lipowski, D. Leduc, J. Filee, P. Forterre, and U. Liebl. 2002. An alternative flavin-dependent mechanism for thymidylate synthesis. Science 297:105-107.
- 20. Perler, F. B. 2002. InBase: the intein database. Nucleic Acids Res. 30:383-384.
- 21. Perler, F. B., E. O. Davis, G. E. Dean, F. S. Gimble, W. E. Jack, N. Neff, C. J. Noren, J. Thorner, and M. Belfort. 1994. Protein splicing elements: inteins and exteins—a definition of terms and recommended nomenclature. Nucleic Acids Res. 22:1125-1127.
- 22. Wu, H., M. Q. Xu, and X. Q. Liu. 1998. Protein trans-splicing and functional mini-inteins of a cyanobacterial dnaB intein. Biochim. Biophys. Acta 1387:422-432.
- 23. Zimmerly, S., G. Hausner, and X. Wu. 2001. Phylogenetic relationships among group II intron ORFs. Nucleic Acids Res. 29:1238-1250.