Abstract
A coenzyme B12-dependent ribonucleotide reductase was purified from the archaebacterium Thermoplasma acidophila and partially sequenced. Using probes derived from the sequence, the corresponding gene was cloned, completely sequenced, and expressed in Escherichia coli. The deduced amino acid sequence shows that the catalytic domain of the B12-dependent enzyme from T. acidophila, some 400 amino acids, is related by common ancestry to the diferric tyrosine radical iron(III)-dependent ribonucleotide reductase from E. coli, yeast, mammalian viruses, and man. The critical cysteine residues in the catalytic domain that participate in the thiyl radical-dependent reaction have been conserved even though the cofactor that generates the radical is not. Evolutionary bridges created by the T. acidophila sequence and that of a B12-dependent reductase from Mycobacterium tuberculosis establish homology between the Fe-dependent enzymes and the catalytic domain of the Lactobacillus leichmannii B12-dependent enzyme as well. These bridges are confirmed by a predicted secondary structure for the Lactobacillus enzyme. Sequence similarities show that the N-terminal domain of the T. acidophila ribonucleotide reductase is also homologous to the anaerobic ribonucleotide reductase from E. coli, which uses neither B12 nor Fe cofactors. A predicted secondary structure of the N-terminal domain suggests that it is predominantly helical, as is the domain in the aerobic E. coli enzyme depending on Fe, extending the homologous family of proteins to include anaerobic ribonucleotide reductases, B12 ribonucleotide reductases, and Fe-dependent aerobic ribonucleotide reductases. A model for the evolution of the ribonucleotide reductase family is presented; in this model, the thiyl radical-based reaction mechanism is conserved, but the cofactor is chosen to best adapt the host organism to its environment. 
This analysis illustrates how secondary structure predictions can assist evolutionary analyses, each important in “post-genomic” biochemistry.
Keywords: post-genomic science, evolution
The deoxyribonucleotides required for DNA replication are biosynthesized via a pathway involving reduction of the corresponding ribonucleotides, a step catalyzed by the enzyme ribonucleotide reductase (RNR) (for reviews, see refs. 1 and 2). Ribonucleotide reduction presents to the biochemist one of the more interesting evolutionary conundrums in central metabolism. The reduction pathway is almost certainly as old as the divergence of archaebacteria, eubacteria, and eukaryotes (3), as it is found in all modern organisms studied to date (1–4). It is unlikely that this reflects convergent evolution, however. Direct removal of oxygen from a ribonucleotide has none of the chemical inevitability that would make different organisms invent it independently (4). Further, it would seem to be strategically acceptable to prepare deoxyribose derivatives more simply by an aldol condensation between acetaldehyde and the appropriate derivatives of glyceraldehyde (4, 5).
The catalysts for the reduction, the RNRs themselves, are not obviously homologous in various organisms, however. Although they all generate a 3′-radical on their nucleotide substrate (6), RNRs from different organisms differ greatly in how they initiate the radical reaction (1–4). At least four radical generation schemes are known. The RNR from humans generates the radical using a non-heme iron in a separate protein cofactor. Certain RNRs from eubacteria use non-heme iron as well. Other eubacterial RNRs use vitamin B12, however, whereas still others have a poorly defined manganese cofactor or employ glycyl radicals in the reaction (1–4).
Different classes of RNRs have intriguing sequence “motifs” involving cysteines that appear to be important for catalysis (in Escherichia coli, Cys-439, the putative radical site, and Cys-225 and Cys-462, which deliver two electrons and a proton) (7–9). These motifs offer tantalizing suggestions that all RNRs are related by common ancestry but underwent divergent evolution so massive that only traces of evidence for homology remain in the sequences themselves (1, 2, 6–8). The motifs alone do not provide a statistically significant case for homology, however, and motifs are notoriously inadequate for confirming homology in general (10).
To better understand the evolutionary history of the RNR complex of proteins, we have examined RNRs from archaebacteria. We report here that the RNRs from three organisms in two branches of the Archaea kingdom, Halobacterium cutirubrum, Haloferax volcanii, and Thermoplasma acidophila, all require B12 as a cofactor. Further, we have purified the RNR from T. acidophila, cloned and sequenced the gene, and expressed the archaebacterial RNR in active form in E. coli. Through these analyses and the detection of sequence “bridges” connecting various classes of RNRs, we have convincingly confirmed the suggestions of Stubbe (1) and Reichard (2) that the core catalytic subunits of many (and perhaps all) RNRs are homologous. Recently developed tools for predicting secondary structure from sequence data (11) also confirmed suggestions of long-distance homology (12).
MATERIALS AND METHODS
RNR assays were performed with [2,8-3H]ADP [NEN, 10 μM, 0.5 μCi (1 Ci = 37 GBq)] as substrate in 50 μl of Tris·HCl buffer (100 mM Tris·HCl/1 mM dGTP/10 mM MgSO4/10 mM DTT/100 μM B12). Reactions were started by adding a mixture of nucleotides, effectors, and coenzymes to protein (≈0.5 μg); incubated (1 h, 55°C); and terminated (10–15% conversion) by boiling in a water bath (10 min). The mixtures were cooled to room temperature and diluted with 1 volume of Tris·OH buffer (100 mM, pH 10.0). Reactants and products were dephosphorylated (alkaline phosphatase, 1 unit, 1 h, 37°C), diluted with cold 2′-deoxyadenosine (50 nmol), separated by silica gel thin layer chromatography (MeOH saturated with potassium borate tetrahydrate), and counted.
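The quantities quoted for the assay can be turned into a few back-of-envelope numbers. The sketch below is only an arithmetic illustration under the stated conditions (10 μM ADP in 50 μl, 0.5 μCi of label, about 0.5 μg of protein, 1 h, 10–15% conversion); the function names are hypothetical, and the "implied specific activity" is a simple estimate, not a value reported in the text.

```python
# Assay conditions as quoted in the text.
ADP_CONC_M = 10e-6      # 10 uM [3H]ADP
VOLUME_L = 50e-6        # 50 ul reaction volume
LABEL_UCI = 0.5         # 0.5 uCi of tritium label per assay
PROTEIN_UG = 0.5        # about 0.5 ug of protein per assay
HOURS = 1.0             # 1 h incubation
BQ_PER_UCI = 37_000.0   # 1 Ci = 37 GBq, so 1 uCi = 37 kBq


def substrate_pmol(conc_m=ADP_CONC_M, volume_l=VOLUME_L):
    """Total ADP in the assay, in pmol."""
    return conc_m * volume_l * 1e12


def label_bq(label_uci=LABEL_UCI):
    """Radioactivity in the assay, in becquerel."""
    return label_uci * BQ_PER_UCI


def implied_specific_activity(conversion, protein_ug=PROTEIN_UG, hours=HOURS):
    """Rough nmol product per hour per mg protein at a given fractional conversion."""
    product_nmol = substrate_pmol() * conversion / 1000.0
    protein_mg = protein_ug / 1000.0
    return product_nmol / hours / protein_mg


if __name__ == "__main__":
    print(f"ADP per assay: {substrate_pmol():.0f} pmol")
    print(f"Label per assay: {label_bq():.0f} Bq")
    for conversion in (0.10, 0.15):
        rate = implied_specific_activity(conversion)
        print(f"{conversion:.0%} conversion -> about {rate:.0f} nmol/h per mg")
```

Each assay therefore contains 0.5 nmol of ADP, so stopping at 10–15% conversion means only tens of picomoles of product are being measured.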
Cells (≈2 g/10 liters) of T. acidophila (DSM 1728) were grown at 55°C to late log phase (13), harvested by centrifugation (10 min at 2500 × g), washed [25 ml of 100 mM (NH4)2SO4/25 mM KH2PO4/10 mM MgSO4/5 mM CaCl2, pH 1.5], and stored at −80°C. To isolate RNR, cells (2.5 g) were freeze-thawed (two times, 5 min per cycle) in Tris (25 ml of 100 mM Tris, pH 8.0, with HCl) and sonicated (5 min). The supernatant (obtained by centrifugation for 10 min at 30,000 × g) was treated (10 min at 4°C) with a 20% volume of cold polyethylene glycol 8000 (200 g/kg) in Tris buffer. The pellet (resulting from centrifugation for 10 min at 4°C at 30,000 × g) was dissolved in Tris. The solution was applied to a Q-Sepharose column (2.5 × 30 cm, pre-equilibrated with Tris·HCl buffer, 100 mM, pH 8.0), and the column was eluted with a linear gradient of NaCl (0–1 M, 250 ml/h) in Tris. Fractions with highest activity were treated at 3°C with (NH4)2SO4 (saturated at 4°C, pH 7.0, with NaOH) to 30% saturation and incubated (10 min), and the pellet (centrifugation for 10 min at 30,000 × g) was redissolved in Tris (5 ml). This solution was applied to a hydroxyapatite column (2.5 × 5 cm, Bio-Rad, pre-equilibrated with Tris). The column was washed (Tris, 5 column volumes) and eluted with a K2HPO4 gradient (0–250 mM in Tris, RNR elution at ≈100 mM phosphate). Fractions containing enzyme were treated with a saturated (NH4)2SO4 solution (to 55% saturation) and incubated overnight. The protein was recovered by centrifugation (10 min at 30,000 × g), redissolved in Tris (10 mM), and fractionated by HPLC gel filtration (LKB GlasPak TSK-G3000SW, 8 × 300 mm) at room temperature (10 mM Tris, pH 8.0, 0.5 ml/min). Fractions with activity were concentrated (ultrafiltration, Centricon 10, for 30 min at 3500 × g) to 0.75 ml. The RNR was essentially pure by gel electrophoresis (90-fold purification).
An additional 1.1- to 1.3-fold purification was possible with HPLC-DEAE anion exchange chromatography (Biogel DEAE TSK 5PW, Bio-Rad), and the RNR was eluted (at room temperature, 0.75 ml/min) with a gradient of (NH4)2SO4 (0–350 mM over 20 min, Tris) (Table 1).
Table 1.
Step | Activity, nmol/h | Volume, ml | Total protein, mg | Specific activity, nmol/h·mg | Yield, % | Purification factor |
---|---|---|---|---|---|---|
Crude extract | 2050 | 22 | 246 | 8.3 | 100 | 1 |
PEG precipitation | 1590 | 10 | 115 | 13.8 | 78 | 1.7 |
Q-Sepharose | 2310 | 40 | 41 | 56 | 113 | 6.7 |
(NH4)2SO4 precipitate | 2160 | 5 | 11.3 | 191 | 105 | 23 |
BioGel HTP | 2060 | 0.5 | 4.48 | 460 | 100 | 55 |
HPLC sizing | 255 | 2.1 | 0.33 | 773 | 12.4 | 93 |
HPLC-DEAE | 109 | 2.75 | 0.12 | 908 | 5.3 | 110 |
From a 10-liter culture of T. acidophila in Tris·HCl buffer (100 mM, pH 8.0, 3°C). Sonicated extracts from 2 g of cells were fractionated (4-8%, wt/vol) with polyethylene glycol (PEG) 8000. The enriched protein was loaded onto a 150-ml Q-Sepharose column and eluted with a linear gradient of NaCl (0-1 M). After fractionation [30% saturation (NH4)2SO4], protein was loaded onto a 25-ml BioGel HTP column and eluted with a linear gradient of potassium phosphate (0-0.25 M, in 100 mM Tris·HCl, pH 8.0). RNR was precipitated [30-55% saturated (NH4)2SO4], and the concentrated protein was purified by HPLC size exclusion chromatography (LKB TSK-G3000, 8 × 300 mm, room temperature).
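The derived columns of Table 1 follow from the measured activity and total protein at each step. As a minimal sketch (the function name and data layout are illustrative choices, not part of the original work), specific activity, yield, and purification factor can be recomputed from the two measured columns:

```python
# (step, total activity in nmol/h, total protein in mg), copied from Table 1.
TABLE_1 = [
    ("Crude extract", 2050, 246),
    ("PEG precipitation", 1590, 115),
    ("Q-Sepharose", 2310, 41),
    ("(NH4)2SO4 precipitate", 2160, 11.3),
    ("BioGel HTP", 2060, 4.48),
    ("HPLC sizing", 255, 0.33),
    ("HPLC-DEAE", 109, 0.12),
]


def purification_table(steps):
    """Return (step, specific activity, yield %, purification factor) per step.

    Yield and purification factor are expressed relative to the first step.
    """
    _, first_activity, first_protein = steps[0]
    first_specific = first_activity / first_protein
    rows = []
    for name, activity, protein in steps:
        specific = activity / protein                 # nmol/h per mg
        yield_pct = 100.0 * activity / first_activity
        factor = specific / first_specific
        rows.append((name, specific, yield_pct, factor))
    return rows


if __name__ == "__main__":
    for name, specific, yield_pct, factor in purification_table(TABLE_1):
        print(f"{name:24s} {specific:8.1f} {yield_pct:7.1f} {factor:7.1f}")
```

The recomputed values agree with the tabulated ones to within rounding. Yields above 100% after the Q-Sepharose step simply reflect more total activity being measured in that fraction than in the crude extract.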
Peptides were obtained from purified RNR by CNBr digestion. Protein following the hydroxyapatite column was precipitated [(NH4)2SO4], the precipitate was dissolved in trifluoroacetic acid (TFA; 100 μl, 70% in water), and the mixture was treated with CNBr (few grains) and incubated (12 h at room temperature in the dark). The mixture was diluted (water, 0.6 ml) and dried (vacuum). The residue was dissolved in TFA (0.1 ml, 50%), and the peptides were resolved by C18-HPLC chromatography (Vydac 218TP54, A = 0.1% TFA in water, B = 0.05% TFA in CH3CN, 0–60% B over 60 min, then 60–100% B over 15 min, detection at 215 nm) (14) and then sequenced (each 2 μg) by automated Edman degradation. An N-terminal decapeptide and an interior sequence of 23 amino acids were obtained.
A subgenomic library for T. acidophila was constructed in pUC18 (15). A nondegenerate 62-mer hybridization probe (RnrP1), based on the internal peptide sequence, was used to screen the library by Southern blotting, yielding a 2.8-kbp HindIII fragment containing the N-terminal region of the RNR gene. A second probe (RnrP3) was designed from the DNA sequence of this fragment and used to recover a 3.8-kbp HindIII fragment containing the C-terminal end of the gene. Chromosomal DNA from T. acidophila was digested separately with EcoRI, HindIII, or BamHI. Fragments were resolved by agarose electrophoresis (1%). Long nondegenerate probes were designed from peptide sequences obtained by Edman degradation, using codons chosen based on an analysis of the existing data base of T. acidophila encoded sequence, and used in Southern blots. Probe RnrP1 (5′-AAGATAGATGGCAAGACGTTCACGGATGTGGCGAAGTCGTACATACTGTACAGGGAGAAGAG, from the N-terminal sequence) hybridized to 10-, 2.8-, and 1.7-kbp fragments in the three digests, respectively. The 1.7-kbp BamHI fragment was extracted from the gel, mixed with pUC18 digested with BamHI, ligated, and used to transform JM105 cells, which were screened by colony hybridization with RnrP1 at 50°C. Ten plasmids were isolated from the positive clones and restriction-mapped, identifying eight with the 1.7-kbp segment inserted in both orientations. The insert was sequenced and compared with the peptide sequences obtained above.
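A quick consistency check on the probe quoted above is to confirm its length and see which peptide it would encode. The sketch below assumes the standard genetic code and that the probe is read in frame from its first nucleotide; both are assumptions made here for illustration, and the helper names are hypothetical.

```python
# Probe sequence as printed in the text (5' to 3').
PROBE = "AAGATAGATGGCAAGACGTTCACGGATGTGGCGAAGTCGTACATACTGTACAGGGAGAAGAG"

# Standard genetic code, built in TCAG order (assumed applicable here).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate complete codons of `dna` starting at offset `frame`."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def gc_fraction(dna):
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in dna) / len(dna)


if __name__ == "__main__":
    print("length:", len(PROBE))
    print("frame-1 peptide:", translate(PROBE))
    print(f"GC fraction: {gc_fraction(PROBE):.2f}")
```

Under these assumptions the 62 nucleotides give 20 complete codons with no stop codon, plus two trailing bases.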
The RNR from T. acidophila was heterologously expressed in E. coli. Chromosomal DNA (3 ng) from T. acidophila containing the RNR gene was amplified (PCR, Amplitaq) (16). The DNA was digested with NotI and NdeI, the fragments were resolved by agarose (1%) gel electrophoresis, and the 2.7-kbp band was ligated into a pET23b vector, similarly digested, behind its lac promoter. The product was used to transform JM105 cells. Plasmid was isolated from positive colonies, and the RNR gene was sequenced. The RNR gene was induced (isopropyl β-D-thiogalactoside) at 37°C (4 h, with time points) and purified as above.
Secondary structure predictions were made using the heuristics developed in these laboratories (17) and implemented in the DARWIN server (http://cbrg.inf.ethz.ch/).
RESULTS
Based on size exclusion chromatography, active RNR isolated from T. acidophila was estimated to have a molecular weight of 100,000. SDS/PAGE gave a similar weight, indicating that the active protein is a monomer, as are the B12-dependent RNRs from Lactobacillus (7), Thermus aquaticus X-1 (18), and Anabaena sp. (7119) (19). However, like the Fe-dependent RNR from E. coli and mammals, the preferred substrates are nucleoside diphosphates.
The temperature optimum for the reduction is 55°C, corresponding to the habitat temperature of T. acidophila. Reduction of ADP follows pseudo Michaelis–Menten kinetics (a Km for ADP of 64 μM at 100 μM coenzyme B12 and 1 mM 2′-dGTP). Reduction of ADP depends on added DTT as a reducing agent and dGTP as an effector; TTP is a less potent effector, whereas dATP, ATP, and dCTP show no effector potency (data not shown). The enzyme requires added 5′-deoxyadenosyl cobalamin and catalyzes release of tritium from 5′-[3H]deoxyadenosyl cobalamin to water, establishing that the enzyme uses B12 (20). Hydroxyurea (50 mM), an inhibitor of iron-dependent RNRs, does not inhibit the RNR. These data suggest that the Thermoplasma RNR is not iron-dependent.
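To put the reported Km in context, the Michaelis–Menten rate law v = Vmax·[S]/(Km + [S]) gives the fraction of the maximal rate at any substrate concentration. The sketch below reads the reported Km as a concentration of 64 μM and treats the kinetics as simple Michaelis–Menten, which is an idealization of the "pseudo" behavior described; Vmax is not given in the text, so only the ratio v/Vmax is computed. The function name is hypothetical.

```python
KM_ADP_UM = 64.0  # Km for ADP, read as a concentration in uM


def fraction_of_vmax(substrate_um, km_um=KM_ADP_UM):
    """Return v/Vmax for a simple Michaelis-Menten enzyme."""
    if substrate_um < 0:
        raise ValueError("substrate concentration must be non-negative")
    return substrate_um / (km_um + substrate_um)


if __name__ == "__main__":
    # 10 uM is the ADP concentration used in the standard assay.
    for conc in (10.0, 64.0, 640.0):
        print(f"[ADP] = {conc:6.1f} uM -> v/Vmax = {fraction_of_vmax(conc):.3f}")
```

At the 10 μM ADP used in the standard assay, the enzyme would operate well below half-saturation under this idealization, which may contribute to the modest specific activities measured.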
Because the RNRs from Halobacterium cutirubrum and Haloferax volcanii were unstable at NaCl concentrations lower than those found physiologically, we were unable to obtain these proteins in homogeneous, active form. Assays on partially purified fractions suggested that these RNRs also use B12 as a cofactor. Coenzyme B12 (but not vitamin B12) stimulated the activity of both proteins, and neither was inhibited by hydroxyurea (data not shown). TTP increased the rate of nucleotide reduction in both cases as well.
The recombinant RNR expressed in E. coli had a specific activity (1100 nmol/h·mg) slightly higher than the highest obtained with the protein isolated directly from T. acidophila cells. This specific activity is rather low; it is not clear whether this indicates a correspondingly low rate in vivo or whether the in vitro assay conditions fail to reproduce in vivo conditions adequately.
The N-terminal untranslated region contains an A+T rich putative box A 24 bases upstream from a putative box B (not shown), fitting the consensus archaebacterial box A and box B sequences (21). A ribosome binding site complementary to the 16S rRNA from T. acidophila (22) lies close to the initiating ATG. The archaebacterial protein was readily expressed by E. coli (23). The open reading frame encodes a protein with a calculated isoelectric point of 7.05 and molecular weight of 97,000, in good agreement with the biochemical experiments.
Most remarkable is the sequence of the T. acidophila RNR protein (Fig. 1). The protein displays sequence similarity that indisputably establishes homology of one segment to the glycyl radical-dependent anaerobic RNR (residues 1–150) from E. coli, and of another segment to the iron-dependent RNR from E. coli and mammals (residues 250–680). The second homology is supported by 25% sequence identity in the catalytic domain that includes the redox active cysteines of the Fe-dependent enzyme (Cys-225 and Cys-462), and Cys-439, the putative thiyl radical (6). Homology was also indicated by the correspondence between the predicted secondary structure of the RNR with the experimental structure of the Fe-dependent RNR from E. coli (24). Further, the regions of sequence similarity corresponded to domains observed in the crystal structure of the Fe-dependent RNR. Thus, the entire alpha–beta domain of the E. coli RNR is found in homologous form in the RNR from T. acidophila, except for the final two helix-strand elements, which may dock the radical-generating domain of the heterodimer in the E. coli enzyme (24). The T. acidophila RNR presumably has its radical-generating domain fused in one peptide chain (see below) and therefore does not need docking elements.
Pursuing this observation, the database was probed using the sequence from the Lactobacillus RNR (B12-dependent). This sequence is similar (30.9% identical, similarity score 957, PAM 126, over 700 amino acids) to an open reading frame from mycobacteriophage L5, which infects the Gram-positive bacterium Mycobacterium tuberculosis. This similarity indisputably establishes homology between Lactobacillus and Mycobacterium RNRs. The Mycobacterium sequence was successfully aligned as well with the T. acidophila RNR sequence (Fig. 1). While no sequence similarity adequate to establish homology exists between the B12-dependent Lactobacillus sequence and any of the Fe-dependent RNR sequences, the statistically significant connections of the Lactobacillus sequence via the Mycobacterium sequence to the T. acidophila sequence permit a “bridge” (27) to be made between the catalytic domain of the B12-dependent Lactobacillus RNR and the catalytic domain of the Fe-dependent E. coli enzyme (Fig. 1). This bridge was confirmed by the correspondence between the predicted secondary structure obtained from the Lactobacillus–Mycobacterium pair and the secondary structure determined by crystallography in the core domain of the E. coli RNR (24). Thus, the additional sequences unify the catalytic domains of all RNRs except the anaerobic enzyme from E. coli (dependent on the glycyl radical). This implies in turn that the preference for substrate (NDP versus NTP for the Lactobacillus RNR), which is presumably specified in the catalytic domain, is not evolutionarily conserved but rather diverged during the divergence of the catalytic domain from a common ancestor.
Anaerobic RNR from E. coli (using a glycyl radical) does not stand in evolutionary isolation, however. The N-terminal domain of the anaerobic RNR was significantly similar in sequence to the N-terminal region of T. acidophila RNR, and the Fe-dependent RNRs also displayed significant similarity to the last part of this domain (T. acidophila residues 110–150). Further, a predicted secondary structure of the N-terminal domain suggests that it is predominantly helical, as is the domain in the E. coli enzyme (24). This prediction therefore suggests an evolutionary bridge between the glycyl radical, dinuclear iron, and T. acidophila B12-dependent RNRs, at least in the N-terminal domain.
A search of the sequence data base left only two regions of the T. acidophila RNR without obvious homologs. The first is a 100-aa segment (residues 150–247) connecting the first (presumably regulatory) domain to the catalytic domain. DARWIN failed to find any sequence in the data base alignable to this segment. The segment lies in a region of the fold that should tolerate insertions, and its presence does not compromise the conclusion that the T. acidophila B12-dependent and Fe-dependent RNRs are homologous.
DARWIN also found no statistically significant similarities between any protein in the data base and residues 650–857. Based on analogy with other RNRs, the region was tentatively assigned as a B12-binding domain. No significant alignment could be obtained with this domain in the sequence of the B12-dependent RNR from Lactobacillus, however. Nor was statistically significant sequence similarity found to B12-binding domains from other B12-dependent proteins. The region contained, however, two signature sequences (DXHXXG and SXL, but not GG) found in the B12-binding domain of an entirely different B12-dependent enzyme, the methionine synthase from E. coli, proposed from the crystal structure to be important for cofactor binding (29, 30).
DISCUSSION
There has been much discussion concerning biochemical markers that might be used to define distant evolutionary relationships between macromolecules, especially after the similarities in sequences that might provide a statistically significant statement of homology have vanished. It is generally accepted that the tertiary structure of a protein is conserved after sequence similarity has vanished, and can be used as an indicator of homology (31). Unfortunately, the tertiary structure is a difficult feature of a biomolecule to determine experimentally. Therefore, other properties, including mechanism, substrate specificity, stereospecificity, quaternary structure, codon selection, regulatory patterns, disulfide connectivity, the conservation of short sequence motifs, and the order of presumed catalytic residues in the polypeptide chain, have been used to define family relationships among proteins for which three-dimensional models do not exist (32). We have argued against (4) and others have argued for (2, 6) evolutionary homology within the ribonucleotide reductase family based on these and other nonstructural grounds.
Similar discussions surround the assignment of “primitive” biomolecular behaviors to a common ancestor in a protein family. The most direct approach requires reconstructing the ancestral protein in the laboratory and studying its behavior (33). This is, unfortunately, also experimentally challenging, and many have suggested alternative criteria for inferring primitive behaviors. With RNRs, some have argued that glycyl radical-dependent RNRs are primitive (2), as these operate anaerobically, and life presumably originated in an anaerobic environment before oxygen emerged 2.5 billion years ago (4). Others suggest that the Fe-dependent enzymes are primitive, because ancient organisms lived on an earth that was warmer, and Fe-dependent enzymes avoid unstable cofactors (34). B12 might also be viewed as a cofactor that contains elements of structure that might have arisen prebiotically (35), implying perhaps that B12-dependent RNRs are ancient (4).
Some time ago, we systematically examined biochemical data to learn whether we could find examples of nonstructural behaviors of proteins that could be shown to have diverged within a pair of proteins that was indisputably homologous by sequence analysis (32). Many examples established that most, if not all, of these nonstructural features can diverge in a biomolecular family well before sequence similarity ceases to provide a reliable indicator of homology. This implies, in turn, that nonstructural features will not generally be reliable indicators of homology (or nonhomology) in proteins that have no statistically significant sequence similarity.
The reason for this is simple enough to state in its general form. Most nonstructural properties of a protein are intricately tied to its selected function, and diverge (or are conserved) in response to changes in (or conservation of) the environmental parameters to which natural selection responds. Nonstructural features are unreliable as evolutionary markers for the same reason that the molecular clock often fluctuates episodically (36), metabolic behaviors of microorganisms are poor indicators of phylogeny (37), and functional macroscopic features of an animal are not as useful as nonfunctional features for deducing patterns of ancestry (38). Natural selection perturbs selected behaviors in a fashion that is difficult to describe using analytically simple models.
The ribonucleotide reductase provides a fascinating illustration of these problems. The choice of cofactor to generate a radical in an RNR proves to be an inadequate criterion for assessing homology. Instead, the radical generator appears to have diverged to reflect the availability of the cofactor in the environment. When vitamin B12 is scarce, selection pressure has evidently favored its elimination as a cofactor. Accordingly, plants, which lack B12, have a non-B12 RNR (39), as do mammals, which do not biosynthesize B12. Both rely on dinuclear iron as the ultimate source of the radical in the RNR reaction, which in turn requires oxygen, which is present in the environments of plants and animals. In E. coli growing under anaerobic conditions, dinuclear iron is not an option as a cofactor, and E. coli also lacks B12 biosynthesis (it has alternative pathways to biosynthesize essential biomolecules such as methionine if B12 is unavailable); hence the use of a glycyl radical RNR. In contrast, in a metabolic context where vitamin B12 is plentiful, selection pressure evidently favors its use. Archaebacteria are rich in B12; they biosynthesize it and use it to biosynthesize components of their membranes. Thus, a purely adaptive model explains the choice of cofactor in the RNR family, which in turn makes the choice of cofactor a poor indicator of homology.
At least two of the three critical thiols (Cys-439 and Cys-462) are conserved and embedded in a homologous protein scaffolding in all of the RNRs discussed here. These cysteines are believed to participate in a reaction mechanism involving a thiyl radical at position 439 (2, 6). This suggests that a thiyl mechanism was the initial solution chosen by nature, and that this solution was sufficiently efficient, and sufficiently superior to any others that nature might have tried, to have persisted throughout the period of natural history that separates archaebacteria, eubacteria, and eukaryotes. The alternative explanation, that nature tried no other mechanistic alternatives, cannot be ruled out, however. Thus, it will be interesting to learn whether a similar evolutionary connection might be found that will join the catalytic domains of the anaerobic E. coli RNR and the manganese-dependent enzymes into the evolutionary family defined by the RNR from T. acidophila and its homologs.
Cys-225, which participates in the delivery of the two electrons and a proton to carbon-2 of the substrate, remains paradoxical. It is found in a region of acceptable sequence similarity between the T. acidophila enzyme and the E. coli aerobic enzyme. It is not aligned, however, in the Lactobacillus and Mycobacterium phage sequences (Fig. 1). DARWIN will exclude a motif from an alignment if it does not meet rigorous standards of significance. Even by eye, however, no cysteine residue is found 20–40 residues upstream of the start of the global alignment (E. coli position 249 in Fig. 1) in either the phage or the Lactobacillus sequence. The first conserved cysteine appears 108 residues (for the Lactobacillus sequence) and 126 residues (for the phage sequence) before the start of the global alignment (sequences not shown). Both cysteines are embedded in a segment indicative of an active site. If the universal mechanism is to hold, these cysteines must be the equivalent of Cys-225 in the E. coli RNR, with an additional segment of approximately 100 aa predicted to be spliced into the folded structure between these cysteines and the remainder of the barrel.
In this context, it is important to note that although the Mycobacterium phage RNR is presumably B12-dependent based on its sequence, this has not been established experimentally. Further, an RNR from M. tuberculosis, a presumed host of the phage, is an iron-dependent RNR (40). This itself is intriguing, as viruses generally share metabolic similarities with their hosts. Still more intriguing is the fact that the M. tuberculosis RNR is most closely related to the nrdEF gene from Salmonella typhimurium (41), which in turn is unusual as an organism for having two RNRs belonging to the same biochemical class (42).
A structural model for the divergence of cofactor usage is also apparent from this model. The final two alpha–beta units in the Thermoplasma RNR, the putative sites for docking of the Fe-subunit in the E. coli enzyme (24), are missing from the catalytic domain of the Thermoplasma enzyme. Replacing it is the putative B12-binding domain fused as part of a single polypeptide chain. The Fe-radical is some distance from the critical thiol residues at the active center in the E. coli RNR, implying that long range electron transfer is necessary for the reaction to proceed (43). Tyr-730 and Tyr-731, conserved in the Fe-dependent enzymes, have been proposed to be involved in this process. These tyrosines are not found in the T. acidophila RNR. His-707, the presumptive ligand to cobalt in the B12 cofactor, comes close to the place where these tyrosines might be aligned, however. It is intriguing to speculate that the distance over which the electron is transferred in the T. acidophila enzyme is not as long as in the Fe-dependent enzymes, because the radical generating unit, here the B12, can approach the active center of the catalytic domain more closely.
Two tools used in this evolutionary analysis should be applicable generally. First, while experimental determination of folded structure remains difficult, prediction of folded structure has become almost routine when more than a single homologous sequence is available (11, 25, 26). Analyses of patterns of variation and conservation among homologous sequences have proven to provide reliable predictions of secondary structure, as demonstrated by bona fide predictions made and announced before an experimental structure is known (11, 25, 26). Although predictions with smaller numbers of sequences are less reliable (and, normally, insufficiently reliable to serve as a starting point for building tertiary structural models), they can indicate whether two proteins are homologous following the loss of clear sequence similarity (12). In the RNR family, predictions of secondary structure suggest that at least part of the helical first domain of the Thermoplasma and glycyl radical RNRs is homologous to the analogous domain of the Fe-dependent RNRs. Other predictions confirm the proposal that the Lactobacillus enzyme is homologous to the Fe-dependent RNRs in the central barrel of the catalytic domain. The second prediction finds close analogy as a prediction problem to the prediction of the phospho-β-d-galactosidase barrel (44), where prediction was possible for the conserved, core elements of the beta barrel but not for the external elements. Predicted secondary structures should become important tools for detecting long distance homology, especially as genome sequencing projects make sequence data more abundant.
The second evolutionary tool illustrated by this analysis is the use of sequence bridges to connect distant sequences into “connected” components (12). Here, the sequences from Thermoplasma and Mycobacterium establish an evolutionary bridge that argues for homology between the B12-dependent enzyme from Lactobacillus and the Fe-dependent enzyme from Escherichia. A simple comparison of the sequences of the enzymes from Lactobacillus and Escherichia individually is unable to detect statistically significant sequence similarity. The theoretical reasons for such bridges remain unclear, even as their value in an evolutionary analysis is obvious. As genomic data bases become completed, bridges should become easier to find.
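The bridging argument can be pictured as a graph problem: sequences are nodes, statistically significant pairwise similarities are edges, and two sequences are assigned to one family if a path of such edges connects them, even when the direct comparison fails. The sketch below is a minimal illustration of that idea; the edge list is one reading of the catalytic-domain relationships described in the Results, the node labels are shorthand, and this is not the procedure used by DARWIN.

```python
from collections import defaultdict

# Pairwise similarities treated as significant for the catalytic domain
# (an interpretation of the Results; the direct Lactobacillus/E. coli
# comparison is deliberately absent because it is not significant).
SIGNIFICANT_PAIRS = [
    ("Lactobacillus B12", "Mycobacteriophage L5 ORF"),
    ("Mycobacteriophage L5 ORF", "T. acidophila B12"),
    ("T. acidophila B12", "E. coli Fe"),
]


def connected_components(pairs):
    """Group sequences into components linked by significant similarities."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:                      # depth-first traversal
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


def bridged(a, b, pairs):
    """True if a and b fall in the same connected component."""
    return any(a in comp and b in comp for comp in connected_components(pairs))


if __name__ == "__main__":
    print(bridged("Lactobacillus B12", "E. coli Fe", SIGNIFICANT_PAIRS))
```

Removing either intermediate sequence from the edge list disconnects the Lactobacillus and E. coli enzymes, which is why each newly sequenced intermediate can matter so much for the inference.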
The B12 binding site of the T. acidophila RNR is not, however, securely identified. Sequence motifs (such as the DXHXXG sequence) are generally unreliable in identifying long distance homology (10). They become reliable only when embedded in a secondary structure prediction that shows that the same motif in two protein families is flanked by the same secondary structural elements. The best prediction assigns an alpha–beta structure in this region (as is seen in the B12-binding domain of methionine synthase), but with a large segment (some 70 aa) inserted between the critical motifs, and little correspondence between predicted and experimental secondary structure elements elsewhere. A prediction based on a single sequence is, however, little more than an educated guess (45). Therefore, further sequence data will be necessary to learn whether the B12-binding domain of the T. acidophila RNR is homologous to the same domain in methionine synthase. With the completion of archaebacterial genomes imminent, these additional data should soon be available.
Acknowledgments
We are indebted to Prof. Joanne Stubbe and her coworkers (Massachusetts Institute of Technology) for providing labeled 5′-deoxyadenosyl cobalamin and for many discussions, and to Prof. Peter Reichard and two referees for helpful comments. We are indebted to the Swiss National Science Foundation and Sandoz for partial support of this work.
Footnotes
Abbreviation: RNR, ribonucleotide reductase.
References
- 1. Stubbe J. J Biol Chem. 1990;265:5329–5332.
- 2. Reichard P. Science. 1993;260:1773–1777. doi: 10.1126/science.8511586.
- 3. Harder J. FEMS Microbiol Rev. 1993;12:273–292. doi: 10.1111/j.1574-6976.1993.tb00023.x.
- 4. Benner S A, Ellington A D, Tauer A. Proc Natl Acad Sci USA. 1989;86:7054–7058. doi: 10.1073/pnas.86.18.7054.
- 5. Hoffee O M, Rosen O M, Horecker B L. J Biol Chem. 1965;240:1512–1516.
- 6. Mao S S, Holler T P, Yu G X, Bollinger J M J, Booker S, Johnston M I, Stubbe J. Biochemistry. 1992;31:9733–9743. doi: 10.1021/bi00155a029.
- 7. Booker S, Stubbe J. Proc Natl Acad Sci USA. 1993;90:8352–8356. doi: 10.1073/pnas.90.18.8352.
- 8. Lin A-N, Ashley G W, Stubbe J. Biochemistry. 1987;26:6905–6909. doi: 10.1021/bi00396a006.
- 9. Sun X, Harder J, Krook M, Jörnvall H, Sjöberg B-M. Proc Natl Acad Sci USA. 1993;90:577–581. doi: 10.1073/pnas.90.2.577.
- 10. Bork P. Curr Opin Struct Biol. 1992;2:413–421.
- 11. Benner S A, Gerloff D L, Jenny T F. Science. 1994;265:1642–1644. doi: 10.1126/science.8085149.
- 12. Benner S A, Cohen M A, Gonnet G H, Berkowitz D B, Johnsson K. In: The RNA World. Gesteland R, Atkins J, editors. Plainview, NY: Cold Spring Harbor Lab. Press; 1993. pp. 27–70.
- 13. Darland G, Brock T D. Science. 1970;170:1416–1418. doi: 10.1126/science.170.3965.1416.
- 14. Matsudaira P. Methods Enzymol. 1990;182:602–613. doi: 10.1016/0076-6879(90)82047-6.
- 15. Yanisch-Perron C, Vieira J, Messing J. Gene. 1985;33:103–119. doi: 10.1016/0378-1119(85)90120-9.
- 16. Innis M A, Gelfand D H. PCR Protocols: A Guide to Methods and Applications. New York: Academic; 1990.
- 17. Benner S A, Badcoe I, Cohen M A, Gerloff D L. J Mol Biol. 1994;235:926–958. doi: 10.1006/jmbi.1994.1049.
- 18. Sando G N, Hogenkamp H P C. Biochemistry. 1973;12:3316–3322. doi: 10.1021/bi00741a025.
- 19. Gleason F K, Frick T D. J Biol Chem. 1980;255:7728–7733.
- 20. Abeles R H, Beck W S. J Biol Chem. 1967;242:3589–3593.
- 21. Hain J, Reiter W-D, Zillig W. Nucleic Acids Res. 1992;20:5423–5428. doi: 10.1093/nar/20.20.5423.
- 22. Ree H K, Cao K, Thurlow D L, Zimmermann R A. Can J Microbiol. 1989;35:124–133. doi: 10.1139/m89-019.
- 23. Sutherland K J, Henneke C M, Towner P, Hough D W, Danson M J. Eur J Biochem. 1990;194:839–844. doi: 10.1111/j.1432-1033.1990.tb19477.x.
- 24. Uhlin U, Eklund H. Nature (London) 1994;370:533–539. doi: 10.1038/370533a0.
- 25. Benner S A. In: Protein Engineering: A Guide to Design and Production. Craik C S, Cleland J, editors. New York: Wiley–Liss; 1996. pp. 71–99.
- 26. Benner S A, Chelvanayagam G, Turcotte M. Chem Rev. 1996; in press.
- 27. Gonnet G H, Cohen M A, Benner S A. Science. 1992;256:1443–1445. doi: 10.1126/science.1604319.
- 28. Eriksson S, Sjöberg B-M, Jörnvall H, Carlquist M. J Biol Chem. 1986;261:1878–1882.
- 29. Drennan C L, Matthews R G, Ludwig M L. Curr Opin Struct Biol. 1994;4:919–929. doi: 10.1016/0959-440x(94)90275-5.
- 30. Drennan C L, Huang S, Drummond J T, Matthews R G, Ludwig M L. Science. 1994;266:1669–1674. doi: 10.1126/science.7992050.
- 31. Chothia C, Lesk A. EMBO J. 1986;5:823–826. doi: 10.1002/j.1460-2075.1986.tb04288.x.
- 32. Benner S A, Ellington A D. Bioorg Chem Front. 1990;1:1–70.
- 33. Jermann T M, Opitz J G, Stackhouse J, Benner S A. Nature (London) 1995;374:57–59. doi: 10.1038/374057a0.
- 34. Daniel R M, Danson M J. J Mol Evol. 1995;40:559–563.
- 35. Eschenmoser A. Angew Chem Int Ed Engl. 1988;27:5–39.
- 36. Kreitman M, Akashi H. Annu Rev Ecol Syst. 1995;26:403–422.
- 37. Woese C R. Microbiol Rev. 1987;51:221–271. doi: 10.1128/mr.51.2.221-271.1987.
- 38. Gould S J. The Panda’s Thumb. New York: Norton; 1980.
- 39. Philipps G, Clément B, Gigot C. FEBS Lett. 1995;358:67–70. doi: 10.1016/0014-5793(94)01397-j.
- 40. Yang F, Lu G, Rubin H. J Bacteriol. 1994;176:6738–6743. doi: 10.1128/jb.176.21.6738-6743.1994.
- 41. Jordan A, Gibert I, Barbé J. J Bacteriol. 1994;176:3420–3427. doi: 10.1128/jb.176.11.3420-3427.1994.
- 42. Jordan A, Gibert I, Barbé J. Gene. 1995;167:75–79. doi: 10.1016/0378-1119(95)00656-7.
- 43. Sjöberg B-M. Structure (London) 1994;2:793–796. doi: 10.1016/s0969-2126(94)00080-8.
- 44. Gerloff D L, Chelvanayagam G, Benner S A. Proteins Struct Funct Genet. 1995;23:446–453. doi: 10.1002/prot.340230318.
- 45. Benner S A. Adv Enzyme Regul. 1989;28:219–236. doi: 10.1016/0065-2571(89)90073-3.