Abstract
Analysis of the haloalkanoate dehalogenase superfamily (HADSF) has uncovered homologues within the same organism that possess broad, overlapping substrate specificities and low catalytic efficiencies. Here we compare the HADSF phosphatase BT1666 from Bacteroides thetaiotaomicron VPI-5482 to BT4131, a known hexose-phosphate phosphatase from the same organism with high sequence identity (40%) to BT1666. The goal is to determine whether these enzymes represent duplicated or paralogous activities. The X-ray crystal structure of BT1666 was determined to 1.82 Å resolution. Superposition of the BT1666 and BT4131 structures revealed a conserved fold and identical active sites, suggestive of a common physiological substrate. The steady-state kinetic constants for BT1666 were determined for a diverse panel of phosphorylated metabolites to define its substrate specificity profile and overall level of catalytic efficiency. Whereas BT1666 and BT4131 are both promiscuous, their substrate specificity profiles are distinct. The catalytic efficiency of BT1666 (kcat/Km = 4.4 × 10² M⁻¹ s⁻¹ for the best substrate, fructose 1,6-(bis)phosphate) is an order of magnitude lower than that of BT4131 (kcat/Km = 6.7 × 10³ M⁻¹ s⁻¹ for 2-deoxyglucose 6-phosphate). The seemingly identical active-site structures point to sequence variation outside the active site that causes differences in conformational dynamics or subtle catalytic positioning effects, which in turn drive the divergence in catalytic efficiency and selectivity. The overlapping substrate profiles may be understood in terms of differential regulation of expression of the two enzymes or an advantage in metabolic housekeeping functions conferred by having a larger range of possible metabolites as substrates.
Keywords: Haloalkanoate dehalogenase superfamily, phosphohydrolase, phosphatase, Bacteroides thetaiotaomicron, substrate promiscuity, housekeeping, divergent evolution, BT1666, BT4131
Introduction
The haloalkanoate dehalogenase superfamily (HADSF) is a large, ubiquitous and diverse superfamily of enzymes, the vast majority of which are phosphatases and ATPases. Multiple HADSF phosphatases exist within any given organism (for instance, 28 in Escherichia coli; 30 in Mycobacterium tuberculosis, 84 in Caenorhabditis elegans, 169 in Arabidopsis thaliana, and 183 in Homo sapiens) where they assume roles in biosynthetic and biodegradation pathways or perform a wide variety of housekeeping functions (1). Although phosphatases have evolved in other enzyme families, these families have not expanded to occupy the biological niches of the cell to nearly the extent seen with the HADSF phosphatases. A simple count of the phosphatases in E. coli shows that the HADSF phosphatases have been able to secure more biochemical jobs than the phosphatases from all six other phosphatase families together.
Earlier work on the HADSF phosphatases showed that the physiological substrates run the full gamut in size and shape, from phosphoproteins, nucleic acids, and phospholipids to phosphorylated disaccharides, sialic acids, and terpenes to the smallest of the organophosphate metabolites, phosphoglycolate (2). It is through the acquisition of structural accessories to the catalytic domain that the HADSF has succeeded in covering this vast range of substrate structure (3). Moreover, the modular design of the HADSF phosphatase might play a key role in its high evolvability. The vast majority of the HADSF phosphatases consist of a conserved Rossmann domain (the catalytic domain) and a tethered “cap” domain (variable in fold) (Figure 1A). The catalytic scaffold, which is formed by four backbone segments located at the C-terminal end of the Rossmann-fold central sheet (Figure 1B), conserves the catalytic residues that mediate phosphate ester hydrolysis (Figure 1C). The mechanism conserved among HADSF phosphatases involves nucleophilic attack of an aspartate on the phosphorus atom of the substrate phosphoryl group with general acid catalysis, followed by hydrolysis of the aspartyl-phosphate intermediate. The cap domain is inserted into either one of two loops of the catalytic scaffold (Figure 1A). The catalytic scaffold alone binds the transferring phosphoryl group, whereas the cap domain alone binds the leaving group. Thus, HADSF phosphatase catalytic residues positioned by the catalytic scaffold are physically separated from the substrate recognition residues positioned on the cap domain. A priori, the substrate recognition site can evolve independently of the catalytic site. In addition, the cap domain provides a large surface, which can accommodate numerous substrate recognition residues. Consequently, the potential to recognize more than one substrate exists, consistent with the observation that the HADSF phosphatases tend to be promiscuous (4-5).
Figure 1.
(A) The HADSF subfamilies C1 and C2 are distinguished by the insertion point of the cap (gold) in the Rossmann fold (blue): C1 after β-strand 1, C2 after β-strand 2; C0 has a minimal insert in either site. (B) Liganded β-PGM with catalytic segments colored as in (C), substrate in purple, one of the substrate-binding cap residues in black, and Mg²⁺ in cyan. (C) ChemDraw representation of the HADSF conserved catalytic scaffold.
Substrate promiscuity underlies high evolvability. When a gene encoding a promiscuous enzyme is duplicated, the opportunity exists for a novel function to evolve within one of the two genes through mutations that enhance a low, intrinsic activity towards a novel substrate. In earlier work, we identified two HADSF type C0 phosphatases in Bacteroides thetaiotaomicron with 25% sequence identity, which undoubtedly are related by a gene duplication event. The 2-keto-3-deoxy-8-phospho-D-manno-octulosonic acid (KDO-8-P) phosphatase of the lipid A pathway is posited to be the progenitor, while the “new” enzyme is posited to be 2-keto-3-deoxy-D-glycero-D-galacto-9-phosphonononic acid (KDN-9-P) phosphatase. The KDO-8-P phosphatase displays a kcat/Km = 1.5 × 10⁴ M⁻¹ s⁻¹ towards its physiological substrate KDO-8-P and a low level of activity (kcat/Km = 6.7 × 10² M⁻¹ s⁻¹) towards KDN-9-P. Conversely, KDN-9-P phosphatase displays a kcat/Km = 1.1 × 10⁴ M⁻¹ s⁻¹ towards its physiological substrate KDN-9-P and a low level of activity (kcat/Km = 2.0 × 10² M⁻¹ s⁻¹) towards KDO-8-P (6-7).
The work described in this paper examines a second pair of B. thetaiotaomicron HADSF phosphatase homologues with high sequence similarity, BT1666 and BT4131 with the goal of finding whether they have paralogous functions. Previously, our laboratories had reported the structure and function of BT4131(5). BT4131 possesses a C2b type cap domain, which supports phosphatase activity towards hexose-phosphates generated in the course of chitin recycling. BT1666 shares 40% sequence identity and 75% sequence similarity with BT4131. Bioinformatic analysis was carried out to define the biological range and co-occurrence of BT1666 and BT4131 orthologues. The X-ray crystal structure and substrate specificity profile of recombinant BT1666 was determined for the purpose of relating change in substrate recognition with change in the structure of the substrate binding site. Whereas the two substrate binding sites proved to be essentially identical, the substrate specificity profiles are not and, most curious is the finding that the level of activity in BT1666 as defined by the magnitude of the kcat/Km value is substantially lower than that observed with BT4131. The implications of these findings in relation to structure based function assignment within the HADSF are examined. In addition, the distribution of one or the other paralogue versus both paralogues among Bacteroides species is explored within the context of the minimal level of catalytic efficiency required of a HADSF housekeeper, and the potential advantage bestowed by an additional housekeeper.
Materials and Methods
Gene cloning, construct design, protein expression, and purification
The cDNA encoding the BT1666 gene from B. thetaiotaomicron VPI-5482 was amplified by PCR using the genomic DNA of B. thetaiotaomicron VPI-5482 (a kind gift from Dr. J. Gordon, Washington University, St. Louis, MO), Pfu Turbo DNA polymerase (Stratagene) and oligonucleotide primers (5'-CTTTGTAGCCCAAAAATATAGATCATATGAT and 5'-GCTTTGTAGCCAAGGATCCTTAAGACATTT) containing the restriction endonuclease cleavage sites for NdeI and BamHI. The pET15B vector, cut by restriction enzymes NdeI and BamHI, was ligated to the PCR product that had been purified and then digested with the same restriction enzymes. The ligation product was used to transform E. coli JM109 competent cells (Stratagene), which were grown on an ampicillin-containing agar plate. The selected colony was used to prepare plasmid DNA (Mini-Prep Kit from Qiagen). The gene sequence was confirmed by DNA sequencing carried out by the Tufts University Core Facility. The plasmid was transformed into E. coli Rosetta(DE3)pLysS cells for protein expression. The transformed E. coli cells were used to inoculate 4 L of Luria-Bertani medium containing 100 μg/mL ampicillin at 37 °C. Following growth of the culture to an OD600 value of 0.6-1.0, gene expression was induced at 15 °C with 0.4 mM isopropyl β-D-thiogalactopyranoside. Following a 4 h induction period, the E. coli cells (10 g) were harvested and suspended in 100 mL of buffer A (300 mM NaCl, 50 mM phosphate (pH 7.5)). The cells were disrupted by sonication on ice and the lysates were clarified by centrifugation at 48,384 × g at 4 °C. BT1666 was purified by affinity chromatography on a Ni-NTA column (Qiagen). Protein purity was assessed by SDS-PAGE analysis. The desired protein fractions were pooled and applied to a 120 mL Sephacryl-200 gel filtration column (GE Healthcare) equilibrated with buffer A. Purified BT1666 was concentrated at 4 °C using a 10 kDa cutoff Amicon Ultra Centrifugal filter (Millipore) in buffer A and stored at -80 °C. The final yield was 10 mg protein/1 g wet cells. The protein concentration was determined using the Bradford method.
BT1666 molecular weight determination
The theoretical subunit molecular mass of recombinant BT1666 including the His6-tag and thrombin cleavage site was calculated as 30,111 Da by using the amino acid composition derived from the gene sequence and the EXPASY Molecular Biology Server program Compute pI/MW. The subunit mass was determined to be 30,019 Da by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. The molecular weight of native BT1666 was estimated by FPLC gel filtration column chromatography against protein standards using a 1.6 cm × 60 cm Sephacryl S-200HR column (GE Healthcare) eluted at 4 °C with 50 mM HEPES, 100 mM NaCl, pH 7.5 buffer at a flow rate of 1 mL/min. A molecular weight of ~30 kDa for BT1666 was derived from the measured elution volume by extrapolation of the plot of the elution volume of the molecular weight standard vs. log molecular weight (range 13.7-220 kDa; GE Healthcare).
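The gel-filtration estimate described above rests on a linear relationship between elution volume and log molecular weight for the standards. A minimal sketch of that calibration follows; the standards are synthetic values generated from an arbitrary line (they are not the elution volumes measured in this work), and the function names are illustrative only.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate_mw_kda(standards, sample_volume_ml):
    """Estimate molecular weight (kDa) from an elution volume (mL).

    standards: list of (elution_volume_mL, molecular_weight_kDa) pairs.
    """
    volumes = [v for v, _ in standards]
    log_mws = [math.log10(mw) for _, mw in standards]
    slope, intercept = fit_line(volumes, log_mws)
    return 10 ** (slope * sample_volume_ml + intercept)

# Synthetic standards (assumption): log10(MW) = 3.5 - 0.03 * Ve
standards = [(ve, 10 ** (3.5 - 0.03 * ve)) for ve in (40.0, 50.0, 60.0, 70.0, 80.0)]
sample_ve = 67.4  # hypothetical elution volume for the sample, in mL
estimated_mw = estimate_mw_kda(standards, sample_ve)
print(f"Estimated molecular weight: {estimated_mw:.1f} kDa")
```

With real data, the standards list would hold the measured elution volumes of the calibration proteins, and a sample whose estimate matches the subunit mass would be interpreted as a monomer.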
Steady-state kinetic constant determination
The purified recombinant enzyme was concentrated with an Amicon Ultrafiltration apparatus (PM10) or Centricon-10 (Millipore), and dialyzed against buffer B (1 mM HEPES (pH 7.5)) before use in kinetic studies. The steady-state kinetic parameters (Km and kcat) for phosphorylated substrates were determined from initial reaction velocities measured at varying substrate concentrations (0.5-5 Km). The initial velocities were measured for reaction mixtures containing 5 mM MgCl2 in 50 mM HEPES buffer (pH 7.0) at 37 °C. The assay methods used for the various substrates are described below. Absorbance measurements were performed with a Beckman DU800 UV-vis spectrophotometer. Data were fitted to the Michaelis-Menten equation with KinetAsystI.
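As an illustration of how Km and kcat can be extracted from initial velocities, the sketch below fits the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]), by Gauss-Newton least squares seeded with a Hanes-Woolf linear estimate. This is a stand-in for the fitting program named above, not a description of its algorithm. The velocities are noise-free synthetic values generated from the fructose 1,6-(bis)phosphate constants in Table 2 and the 0.023 μM enzyme concentration given in the assay description; all function names are illustrative.

```python
def michaelis_menten(s, vmax, km):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)

def hanes_woolf_estimate(s_values, v_values):
    """Starting estimates from the linear form S/v = S/Vmax + Km/Vmax."""
    ys = [s / v for s, v in zip(s_values, v_values)]
    n = len(s_values)
    mean_s = sum(s_values) / n
    mean_y = sum(ys) / n
    slope = (sum((s - mean_s) * (y - mean_y) for s, y in zip(s_values, ys))
             / sum((s - mean_s) ** 2 for s in s_values))
    intercept = mean_y - slope * mean_s
    vmax = 1.0 / slope
    return vmax, intercept * vmax

def fit_michaelis_menten(s_values, v_values, max_iter=50, tol=1e-12):
    """Refine Vmax and Km by Gauss-Newton nonlinear least squares."""
    vmax, km = hanes_woolf_estimate(s_values, v_values)
    for _ in range(max_iter):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for s, v in zip(s_values, v_values):
            d_vmax = s / (km + s)                # dv/dVmax
            d_km = -vmax * s / (km + s) ** 2     # dv/dKm
            resid = v - michaelis_menten(s, vmax, km)
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            b1 += d_vmax * resid
            b2 += d_km * resid
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-30:
            break  # normal equations are singular; keep current estimates
        step_vmax = (b1 * a22 - a12 * b2) / det
        step_km = (a11 * b2 - a12 * b1) / det
        vmax += step_vmax
        km += step_km
        if abs(step_vmax) < tol and abs(step_km) < tol:
            break
    return vmax, km

ENZYME_UM = 0.023                      # enzyme concentration (uM), from the assay description
TRUE_KCAT, TRUE_KM = 1.1, 2.5          # s^-1 and mM, taken from Table 2 to build synthetic data
substrate_mM = [1.25, 2.5, 5.0, 7.5, 12.5]   # spans 0.5-5 Km
velocity_uM_per_s = [michaelis_menten(s, TRUE_KCAT * ENZYME_UM, TRUE_KM)
                     for s in substrate_mM]

vmax_fit, km_fit = fit_michaelis_menten(substrate_mM, velocity_uM_per_s)
kcat_fit = vmax_fit / ENZYME_UM            # s^-1
kcat_over_km = kcat_fit / (km_fit * 1e-3)  # M^-1 s^-1 (Km converted from mM to M)
print(f"kcat = {kcat_fit:.2f} s^-1, Km = {km_fit:.2f} mM, kcat/Km = {kcat_over_km:.0f} M^-1 s^-1")
```

With real, noisy data the linear estimate only provides the starting point and the Gauss-Newton refinement does the actual fitting; standard errors would require an additional covariance calculation that is omitted here.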
Activity assays
Phosphate ester hydrolysis for all substrates was monitored using Biomol green (BIOMOL International) to measure the concentration of inorganic phosphate formed in the reaction solutions. The 1 mL assay mixture, containing 50 mM HEPES (pH 7.0), 5 mM MgCl2, various concentrations of substrate, and 0.023 μM BT1666, was incubated at 37 °C and aliquots were removed at various times for phosphate determination. Only time points corresponding to <10% substrate consumption were used, to ensure that initial rates were measured. In parallel, the background level of phosphate release was measured using a control reaction mixture that excluded BT1666. For analysis, 100 μL of the mixture was added to 1 mL of Biomol green reagent. After 60 min incubation at room temperature, the absorbance of the solution at 620 nm was measured.
The rate of p-nitrophenyl phosphate (pNPP) hydrolysis was also determined by monitoring the increase in absorbance at 410 nm (Δε = 18.4 mM-1 cm-1) at 37 °C. The 0.5 mL assay mixtures contained 50 mM HEPES (pH 7.0), 5 mM MgCl2, and various concentrations of pNPP. The kinetic constants determined from this assay agreed with those determined using the fixed time phosphate assay described above.
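The continuous pNPP assay converts an absorbance slope into a reaction rate through the Beer-Lambert law, rate = (ΔA/Δt)/(Δε × l). The short sketch below shows the arithmetic using the Δε value given above; the 1 cm path length, the absorbance slope, and the enzyme concentration are hypothetical inputs chosen for illustration, not measurements from this work.

```python
DELTA_EPSILON = 18.4  # mM^-1 cm^-1 at 410 nm, value given in the text

def absorbance_slope_to_rate(dA_per_min, path_cm=1.0, delta_eps=DELTA_EPSILON):
    """Convert an absorbance slope (A/min) to a product formation rate in uM/min."""
    rate_mM_per_min = dA_per_min / (delta_eps * path_cm)
    return rate_mM_per_min * 1000.0

def turnover_per_s(rate_uM_per_min, enzyme_uM):
    """Convert a rate (uM/min) to an apparent turnover number (s^-1)."""
    return rate_uM_per_min / enzyme_uM / 60.0

# Hypothetical inputs (assumptions): slope of 0.0184 A/min, 0.5 uM enzyme, 1 cm cuvette
rate = absorbance_slope_to_rate(0.0184)
turnover = turnover_per_s(rate, enzyme_uM=0.5)
print(f"rate = {rate:.2f} uM/min, apparent turnover = {turnover:.3f} s^-1")
```

The apparent turnover equals kcat only when the substrate is saturating; at lower substrate concentrations the value has to be fed into a Michaelis-Menten fit.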
pH-rate profile determination
All reactions were carried out at 37 °C using the following buffers and pH values/ranges: 50 mM sodium acetate, pH 4.5; 50 mM Bis-Tris, pH 5.0-6.5; 50 mM HEPES, pH 7.0-7.5; 50 mM Tris, pH 8.0-8.5; and 50 mM bicine, pH 9.0. The substrate mannose 6-phosphate was used at 3 mM (~Km) to achieve subsaturating substrate (Vmax/Km) conditions. All reaction solutions contained 5 mM MgCl2. The initial reaction rates were measured using the discontinuous assay as described above.
Crystallization and data collection
Purified BT1666 was concentrated to 68 mg/mL in 1 mM HEPES (pH 7.5) for crystallization. Crystals of BT1666 were grown by the hanging-drop vapor-diffusion method where drops were formed by mixing 1 μL of protein solution with 1 μL of well solution. The initial BT1666 crystals were obtained from the Index Screen (Hampton Research). The refined conditions consisted of 0.1 M Bis-Tris pH 6.5, 0.2 M ammonium sulfate and 25% w/v PEG 3350, at 25 °C. Crystals appeared in approximately one week. The crystals were protected for low-temperature data collection by transfer to 100% Paratone-N (Hampton Research) for 5-10 min before data collection at 100 K. Data were collected on a Rigaku- RU-300 X-ray generator and R-Axis IV++ image plate located at Boston University School of Medicine. Data from a single BT1666 crystal, 0.3 × 0.2 × 0.05 in dimension, were collected and processed to 1.82 Å resolution using HKL2000 (8). BT1666 crystallized in space group P212121 with unit cell dimensions a=52.8 Å, b=70.7 Å, c=72.4 Å. The data collection statistics are reported in Table 1.
Table 1.
Crystallographic Data Collection and Refinement Statisticsa
| Data Collection Statistics | |
| Resolution (highest resolution shell) (Å) | 50.00-1.82(1.89-1.82) |
| X-ray Source | CuKα |
| Wavelength (Å) | 1.5418 |
| Space group | P212121 |
| Cell dimension (Å) | a=52.8 b=70.7 c= 72.4 |
| Reflections Observed (unique) | 24886 (2449) |
| Completeness (%) | 99.9 (100) |
| Rmergea (%) | 10.8 (55.7) |
| I/σ (I) | 15.5 (3.0) |
| Redundancy | 7.2 (7.0) |
| Refinement Statistics | |
| No. of protein residues/water atoms per asu | 2063/332 |
| No. Mg2+/sulfate ions per asu | 1/3 |
| Number of reflections (work/free) | 24832/1264 |
| Rwork/Rfree (%) | 16.4/19.8 |
| Resolution (Å) | 18.1-1.82 |
| Average B-factor (Å2) | 25.1 |
| Protein | 23.4 |
| Mg2+ | 8.5 |
| Sulfate | 37.8 |
| Water | 35.0 |
| Root mean square deviation | |
| Bond length (Å) | 0.008 |
| Bond angle (°) | 1.127 |
a Rmerge = Σhkl Σi |Ihkl,i − <Ihkl>| / Σhkl Σi |Ihkl,i|, where <Ihkl> is the mean intensity of the multiple Ihkl,i observations for symmetry-related reflections.
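For readers unfamiliar with the merging statistic in Table 1, Rmerge is conventionally the summed absolute deviation of each observation from the mean intensity of its symmetry-related group, divided by the summed intensities. The sketch below applies that conventional definition to a toy set of observations; the intensities are invented for illustration, and processing programs may differ in details such as how singly measured reflections are treated.

```python
def r_merge(observations):
    """Compute Rmerge from repeated intensity measurements.

    observations: dict mapping a unique reflection index (h, k, l)
    to the list of intensities measured for its symmetry mates.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(abs(i) for i in intensities)
    return numerator / denominator

# Toy data (assumption): two unique reflections with 3 and 2 observations
toy_observations = {
    (1, 0, 0): [100.0, 110.0, 90.0],
    (0, 1, 0): [50.0, 50.0],
}
toy_r_merge = r_merge(toy_observations)
print(f"Rmerge = {100 * toy_r_merge:.1f}%")
```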
Molecular replacement and crystallographic refinement
The phase problem was solved via the molecular replacement method using BT4131 (5) as the search model (40% sequence identity and 75% sequence similarity to BT1666). The program MOLREP (9) in the CCP4 program suite was used to solve the rotation and translation functions, yielding a solution with a correlation coefficient of 37 % and an R-factor of 59 % at 2.1 Å resolution. The asymmetric unit contains one BT1666 molecule. Alternating rounds of manual rebuilding and refinement were performed using the molecular graphics program COOT (10) followed by minimization and simulated annealing in PHENIX (11). During refinement, the water molecules and ligands were added once the Rfree was below 30%. The final protein model was analyzed using composite-omit electron-density maps calculated with CNS (12). The stereochemistry was monitored with PROCHECK (13). Analysis of the Ramachandran plot showed that 97.2% of the residues fall in the most favored regions with 2.8% in the additionally allowed regions and with no residues falling in the generously allowed or disallowed regions. The refinement statistics are reported in Table 1.
The final model is well ordered, and comprises all 258 amino acid residues. The construct used for crystallization contained an N-terminal His6-tag and thrombin cleavage site (MGSSHHHHHHSSGLVPRGSH). The final protein model also included the last 10 residues of the His6-tag/thrombin cleavage site (SSGLVPRGSH). Refinement and final model statistics are given in Table 1. The graphic figures of the structure were produced using PyMOL, Molscript (14) and rendered by POVRAY (15).
Bioinformatic Analysis
BT1666 and BT4131 orthologues were identified by carrying out BLAST searches of the NCBI genome databank (http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi). The cutoff used for assigning orthologues was >51% sequence identity as determined by a pair-wise amino acid sequence alignment. Multiple sequence alignments were carried out in ClustalW (http://www.ch.embnet.org/software/ClustalW.html) and rendered in ESPript (http://espript.ibcp.fr/ESPript/ESPript/).
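The orthologue cutoff above depends on a percent-identity value computed from a pairwise alignment. A minimal sketch of that calculation is shown below. It assumes identity is counted over aligned, non-gap columns, which is one common convention; the text does not state which denominator was used. The two short aligned sequences are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap columns are excluded from the denominator
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared

ORTHOLOGUE_CUTOFF = 51.0  # percent identity cutoff stated in the text

# Invented aligned fragments (assumption), not real BT1666/BT4131 sequence
identity = percent_identity("MKT-LLVA", "MRTALL-A")
is_orthologue = identity > ORTHOLOGUE_CUTOFF
print(f"{identity:.1f}% identity, orthologue: {is_orthologue}")
```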
Results and Discussion
Substrate Specificity of BT1666
The gene encoding BT1666 was expressed from a pET15B vector bearing an N-terminal His6-tag (BT1666-pET15B plasmid) in Rosetta(DE3)pLysS competent cells with induction at low temperature (15 °C) overnight. The (>95%) pure His6-tagged BT1666 was stable and soluble in 300 mM NaCl and 50 mM phosphate buffer, pH 7.5. Gel-filtration chromatography demonstrated that His6-tagged BT1666 is a monomer in solution (see the Materials and Methods section for details), as is its paralogue BT4131 (5). Purified BT1666 phosphohydrolase activity was screened against a number of phosphorylated small-molecule substrates (Table 2). Of the compounds tested, BT1666 shows similar activity toward D-fructose 1,6-(bis)phosphate, trehalose 6-phosphate, N-acetyl-glucosamine 6-phosphate, D-glucosamine 6-phosphate, mannose 6-phosphate, ribose 5-phosphate, and D-arabinose 5-phosphate, with kcat/Km values ranging from 1.1 × 10² M⁻¹ s⁻¹ to 4.4 × 10² M⁻¹ s⁻¹. The activity for the best substrate is low compared to that expected for enzymes involved in primary metabolism (kcat/Km > 1 × 10⁵ M⁻¹ s⁻¹) and is also lower than that for the best substrates of BT4131 (kcat/Km = 1.8 × 10³ M⁻¹ s⁻¹ to 6.7 × 10³ M⁻¹ s⁻¹) and other HADSF enzymes involved in secondary metabolism (e.g., ~1 × 10⁴ M⁻¹ s⁻¹ for E. coli NagD (4)). Thus, BT1666 does not appear to be a very efficient catalyst. Because low catalytic efficiency might be an artifact of the tethered His6-tag associated with the recombinant protein, BT1666 was prepared without the tag and assayed. The activity of the native BT1666 was no greater. We also measured the pH rate profile (using V/K conditions with substrate mannose 6-phosphate, Figure S1) to determine if the optimal pH for BT1666 catalysis might significantly deviate from the solution pH used in the substrate activity screen (pH 7.5, similar to that used previously for BT4131 (pH 7.0)). Although the activity of BT1666 is higher at slightly acidic pH, the activity at pH 7.0 and pH 7.5 is only three-fold lower. Thus, the catalytic efficiency of BT1666 is less than that of BT4131, even for the best substrates.
Table 2.
Steady-state kinetic constants for BT1666- and BT4131-catalyzed hydrolysis of phosphorylated small-molecule substrates in 50 mM HEPES containing 5 mM MgCl2 (pH 7.0, 37 °C). The last three columns give the BT1666/BT4131 ratio of each constant.
| substrate | BT1666 kcat (s⁻¹) | BT1666 Km (mM) | BT1666 kcat/Km (M⁻¹ s⁻¹) | BT4131ᵃ kcat (s⁻¹) | BT4131ᵃ Km (mM) | BT4131ᵃ kcat/Km (M⁻¹ s⁻¹) | kcat ratio | Km ratio | kcat/Km ratio |
|---|---|---|---|---|---|---|---|---|---|
| D-glucose 6-P | 2.9 (± 0.3) × 10⁻¹ | 1.2 (± 0.3) × 10¹ | 2.4 × 10¹ | 3.6 ± 0.2 | 1.1 (± 0.2) × 10¹ | 3.3 × 10² | 0.08 | 1.1 | 0.072 |
| 2-deoxyglucose 6-P | 5.4 (± 0.3) × 10⁻¹ | 1.3 (± 0.2) × 10¹ | 4.2 × 10¹ | 2.6 (± 0.1) × 10¹ | 3.9 ± 0.4 | 6.7 × 10³ | 0.02 | 3.3 | 0.006 |
| glucosamine 6-P | 1.2 ± 0.3 | 5 ± 3 | 2.4 × 10² | 2.7 (± 0.2) × 10⁻¹ | 6.7 ± 0.9 | 4.0 × 10¹ | 4.4 | 0.75 | 5.3 |
| NAc-glucosamine 6-P | 1.3 (± 0.1) × 10⁻¹ | 1.1 ± 0.2 | 1.2 × 10² | 3.6 ± 0.5 | 1.7 (± 5) × 10¹ | 2.1 × 10² | 0.036 | 0.065 | 0.57 |
| mannose 6-P | 3.0 (± 0.3) × 10⁻¹ | 2.5 ± 0.5 | 1.2 × 10² | 1.3 ± 0.1 | 1.9 ± 0.3 | 7.0 × 10² | 0.23 | 1.3 | 0.18 |
| fructose 6-P | 3.3 (± 0.3) × 10⁻¹ | 8 ± 2 | 4.1 × 10¹ | 9 (± 1) × 10⁻¹ | 4 ± 1 | 2.7 × 10² | 0.37 | 2 | 0.18 |
| fructose 1,6-(bis)P | 1.1 ± 0.1 | 2.5 ± 0.5 | 4.4 × 10² | 1.4 (± 0.2) × 10⁻² | 4 ± 1 | 3.5 | 78 | 0.62 | 125 |
| arabinose 5-P | 2.6 (± 0.1) × 10⁻¹ | 1.5 ± 0.1 | 1.7 × 10² | 9.6 ± 0.7 | 3.5 ± 0.5 | 2.7 × 10³ | 0.027 | 0.43 | 0.063 |
| ribose 5-P | 4.2 (± 0.6) × 10⁻¹ | 2.1 ± 0.5 | 2.0 × 10² | 8.7 ± 0.4 | 4.9 ± 0.6 | 1.8 × 10³ | 0.048 | 0.43 | 0.11 |
| sorbitol 6-P | 4.2 (± 0.6) × 10⁻¹ | 6.5 ± 3 | 6.5 × 10¹ | 5.4 ± 0.2 | 6.8 ± 0.5 | 8.0 × 10² | 0.078 | 0.96 | 0.081 |
| DL-α-glycerol 3-P | 4.2 (± 0.3) × 10⁻¹ | 5.3 ± 0.7 | 7.9 × 10¹ | 1.04 (± 0.02) × 10¹ | 8.7 ± 0.4 | 1.2 × 10³ | 0.040 | 0.61 | 0.066 |
| sucrose 6'-P | 1.1 (± 0.1) × 10⁻¹ | 1.4 ± 0.3 | 7.9 × 10¹ | 1.3 (± 0.1) × 10⁻² | 3.1 ± 0.1 | 4.2 | 8.5 | 0.45 | 19 |
| trehalose 6-P | 3.0 (± 0.3) × 10⁻¹ | 2.2 ± 0.4 | 1.4 × 10² | ~6 × 10⁻² | ----- | ----- | 5 | ----- | ----- |
| ADP | 4.5 (± 0.6) × 10⁻¹ | 1.0 (± 2) × 10¹ | 4.5 × 10¹ | 1.6 (± 0.1) × 10⁻² | 1.6 ± 0.4 | 1.0 × 10¹ | 28 | 6.3 | 4.5 |
| pyridoxal 5'-P | 2.9 (± 0.1) × 10⁻¹ | 4.1 ± 0.5 | 7.0 × 10¹ | 1.8 ± 0.1 | 0.88 ± 0.08 | 2 × 10³ | 0.16 | 4.7 | 0.035 |
| pNPP | 5.5 (± 0.7) × 10⁻² | 0.51 ± 0.1 | 1.1 × 10² | 8.3 (± 0.2) × 10⁻² | 0.77 ± 0.05 | 1.1 × 10² | 0.66 | 0.66 | 1 |
ᵃ From (5).
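The derived columns of Table 2 follow directly from the measured kcat and Km values: kcat/Km is kcat divided by Km expressed in molar units, and each ratio divides the BT1666 value by the BT4131 value. The sketch below reproduces that arithmetic for three rows, using the central values from the table and ignoring the reported uncertainties; the dictionary layout and function name are illustrative choices.

```python
def specificity_constant(kcat_per_s, km_mM):
    """Return kcat/Km in M^-1 s^-1 from kcat (s^-1) and Km (mM)."""
    return kcat_per_s / (km_mM * 1e-3)

# (kcat in s^-1, Km in mM) for BT1666 and BT4131, central values from Table 2
kinetics = {
    "fructose 1,6-(bis)P": {"BT1666": (1.1, 2.5), "BT4131": (1.4e-2, 4.0)},
    "2-deoxyglucose 6-P": {"BT1666": (0.54, 13.0), "BT4131": (26.0, 3.9)},
    "pNPP": {"BT1666": (5.5e-2, 0.51), "BT4131": (8.3e-2, 0.77)},
}

efficiency_ratio = {}
for substrate, enzymes in kinetics.items():
    bt1666 = specificity_constant(*enzymes["BT1666"])
    bt4131 = specificity_constant(*enzymes["BT4131"])
    efficiency_ratio[substrate] = bt1666 / bt4131
    print(f"{substrate}: BT1666 {bt1666:.3g}, BT4131 {bt4131:.3g}, "
          f"ratio {efficiency_ratio[substrate]:.3g}")
```

Small differences from the tabulated ratios are expected because the table entries were rounded before the ratios were taken.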
BT1666 shows similar specificity constants toward seven of the sugar substrates assessed. However, BT1666 does discriminate among substrates, as the kcat/Km for substrates such as β-D-glucose 6-phosphate is ~100 fold lower than that of the best substrates. Comparison of the specificity profile previously reported for BT4131 (Table 2) (5) to that of BT1666 reveals some divergence in substrate recognition. For instance, the substrates D-fructose 1,6-(bis)phosphate and sucrose 6'-phosphate show 125 and 19 times higher kcat/Km for BT1666 compared to BT4131, respectively, but 2-deoxy-D-glucose 6-phosphate and D-arabinose 5-phosphate show ~150 and ~20 fold higher specificity, respectively, for BT4131 compared to BT1666 (Table 2). Notably, the greatest contribution to the difference in specificity is in kcat, not Km. Whereas the BT4131 gene context (juxtaposed to a chitobiase gene) together with the preference for hexose phosphates implicates a possible role in chitin recycling, the broad substrate range also raised the possibility of a housekeeping role. Gene context gives no clue as to the possible physiological function of BT1666; it is most frequently co-located with a putative phosphate acetyltransferase and a putative asparaginyl-tRNA synthetase. Overall, BT1666 is more active towards larger substrates than is BT4131, and the two together cover a larger substrate range than either alone can provide, which may confer a selective advantage in a general housekeeping function. The housekeepers would act by removing inhibitory phosphorylated metabolites, which could prove to be cytotoxic if they build up to high concentrations.
Structure determination
In order to understand the structural basis for their overlapping, but distinct, substrate profiles, the structure of BT1666 was determined. The final model of BT1666, including all 258 residues and the last 10 residues from the N-terminal His6-tag (SSGLVPRGSH), was refined to 1.82 Å resolution with a Rwork of 16.4% and Rfree of 19.8%. All residues are well-defined (see Figure S2 for sample electron density) with an average B-factor for the protein of 23.4 Å2. The final BT1666 model includes one Mg2+ ion, which occupies the cofactor binding site found in all HAD phosphotransferases (16). Because the crystallization buffer contained 0.2 M sulfate, a sulfate ion was modeled in electron density observed at the BT1666 active site where a phosphate ion or the phosphoryl group of the substrate binds in the structures of other HADSF members(17). Two sulfate ions were also modeled at the enzyme surface.
BT1666, which is composed of two distinct domains connected by two linker regions (Figure 2A), belongs to the C2b class of HAD members (Figure 1A) (5). The larger domain is the “conserved core” domain that houses the phosphoryl-transfer site. The slightly smaller domain is the cap domain, which in other C2b members docks on the core domain to cover the active site and provide determinants of substrate specificity (18-21). The core domain is a modified Rossmann fold similar to that found in BT4131, including six parallel β-strands surrounded by six α-helices. The cap is a mixed α/β fold with αββ(αβαβ)αββ topology. As compared to other HAD members, BT1666 shows a unique feature in that a peptide from a neighboring molecule penetrates the active site (Figure 3). Analysis of the peptide shows that it comprises the final ten residues of the N-terminal His6-tag of the adjacent symmetry-related BT1666 monomer. The peptide is well ordered, with B-factors of 25.2 Å² compared to those of the remaining protein structure (23.3 Å²; calculated excluding the ten residues of the peptide). This finding may give some insight into the mode of substrate binding of capped members of the HADSF that show phosphoprotein phosphohydrolase activity, including the cofilin-activating phosphatase chronophin (22) and the Eyes absent (Eya) phosphatase responsible for dephosphorylating the C-terminal tyrosyl residue of histone H2A.X (23).
Figure 2.
(A) Ribbon diagram of BT1666 liganded to the cofactor Mg²⁺ (magenta sphere) and sulfate ion (ball and stick). The core domain is colored green and the cap domain is colored blue, with the two interdomain linkers colored red. (B) An overlay of the structures of BT1666 (green) and BT4131 (grey). Ribbon diagram of (C) BT1666 and (D) BT4131 with the sequence variability calculated from a multiple sequence alignment of each enzyme mapped onto the structure from blue (conserved) to red (variable).
Figure 3.
The active site of BT1666 depicted as a ribbon diagram with the N-terminal 10 residues of the adjacent symmetry related monomer depicted as ball and stick.
Based on similarity in sequence and structure, BT1666 and BT4131 clearly share common ancestry and might be the products of gene duplication (Figure 2B; Figures S3-S5 for sequence alignments). Comparison of BT4131 and BT1666 shows that the protein structures have 1.93 Å rmsd and 40% sequence identity overall, 1.95 Å rmsd and 25% sequence identity for the cap domain, and 0.84 Å rmsd and 48% sequence identity for the core domain (notably, the rmsd is biased by the fact that BT4131 was the model used for molecular replacement). The finding that the sequence conservation between the cap domains is considerably lower than that observed between the core domains indicates that BT4131 and BT1666 are diverging in function, as it is the cap that determines the physiological substrate(s). It is thus the cap domain that should be examined to understand the differences in substrate specificity between the two paralogues (vide infra).
The relative disposition of the cap and core domains differs between BT1666 and BT4131. Specifically, BT1666 is in a cap-open conformation whereas BT4131 is in a cap-closed conformation. The cap-open conformation allows substrate to bind and product to dissociate. The cap-closed conformation is required for catalytic turnover (24-25). Superposition of the BT1666 and BT4131 domains separately (to remove the effect of open and closed conformations) revealed that the cap and core domain residues that comprise the active site are identical (Figure 4). In view of the differences in the specificity profiles and in the catalytic efficiencies of the two enzymes, this was an unexpected finding. Differences in “second-shell” residues in the cap could change the size and shape of the active site, either through interactions with active-site residues or by affecting the fit between cap and core. In BT4131, because of the closed cap conformation, there is an enclosed cavity that resembles the possible substrates in size and shape (5). However, because of the cap-open conformation, the size and shape of the active-site cavity cannot be assessed for BT1666. Thus, it is not possible to ascertain whether the differences in substrate specificity are due to differences in the active-site volume.
Figure 4.
Overlay of the active site of BT1666 with that of BT4131 colored as in Figure 2B. Residues W171/W174 and F175/F178 are contributed by the cap domain.
In addition to “static” effects of sequence variation on the structure of the active sites of the two paralogues, it is possible that the dynamics of the two differ. Protein dynamics at the level of large substrate-induced conformational changes as well as more subtle rearrangements as the enzyme approaches the transition-state have been implicated in both the catalytic efficiency of enzymes and in the selectivity between two substrates(26-28). To examine patterns of conservation that might be linked to protein scaffold dynamics, the sequence similarity of each protein was mapped onto each structure (29). The variation in conservation follows the expected pattern with highest conservation of catalytic residues in the core domain and those few cap domain residues that help form the active site (Figure 2C and 2D). Although the lower catalytic efficiency and differing specificity profiles of BT4131 and BT1666 may well be due to conformational dynamics, the pattern of residue conservation does not highlight underlying structural changes to which these differences can be ascribed.
Conclusion
Comparison of the protein structures and overlapping substrate profiles highlights the similarities more than the differences between BT1666 and BT4131. The question remains, why would the bacterium retain two such similar catalysts? First, the substrates profiles between these enzymes, though overlapping, do differ. Although the kcat/Km values for these substrates are low, and hence might have little physiological relevance, the catalytic proficiency ((kcat/Km) / kuncat) and rate acceleration (kcat/kuncat) are high and thus can provide some advantage under selection processes. In addition, the retention of two similar enzymes under the control of different promoters for expression could allow the recruitment of these activities under differing conditions(4).
It has been hypothesized that ancient enzymes possessed broad specificities such that a small number of catalysts could fill multiple metabolic roles in the cell (30). Divergence of these generalist enzymes, through gene duplication, mutation, and natural selection (31), has led to the formation of enzyme families with individual members that are highly specific and proficient (although the order of events is unclear (32)). Based on their high overall sequence identity (40%), one can speculate that BT4131 and BT1666 represent a gene duplication event followed by limited divergence and honing toward specific substrates. The sequence conservation among orthologues of each enzyme (33% for BT1666, 27% for BT4131) does not allow speculation as to which was the “original” and which was the “copy”. Neither does the distribution of the two enzymes (Table S1), which shows that out of the 28 Bacteroides species possessing either enzyme, 15 bear BT4131 alone, 6 bear BT1666 alone and 7 bear both enzymes. Indeed, given the exceedingly high frequency of lateral gene transfer among bacteria, both within the genus Bacteroides and between Bacteroides species and gram-positive bacteria in the human colon (33), these paralogues may have been acquired together or in separate events. Such lateral gene transfer events may introduce enzyme activities isofunctional to those already encoded by the recipient genome (34). Notably, it has been demonstrated that high- and low-efficiency forms of phosphoglycerate mutase stemming from two different enzyme families (isologous enzymes) present in E. coli have overlapping and complementary roles in the cell. Overexpression of the low-efficiency form of phosphoglycerate mutase can complement deletion of the high-efficiency form. This functional redundancy is thought to be best tolerated in larger bacterial genomes (>3.7 Mb), and B. thetaiotaomicron falls in this category at 4.6 Mb (35).
Where genes are essential, functional redundancy is a prerequisite for any subsequent selective loss of one gene. In the case of phosphoglycerate mutase, this necessity leads to the detection of non-homologous genes in the ancestral genome that encode enzymes capable of carrying out the same reaction. In the case of BT4131 and BT1666, it appears that divergence of a common ancestor or gene duplication, in some combination with lateral gene transfer, has led to closely functioning, if not isofunctional, paralogues in Bacteroides thetaiotaomicron, where gene regulation can allow either selective usage or enhanced production of one paralogue to fill a role in secondary metabolism.
Supplementary Material
Abbreviations used are
- Bis-Tris: bis-(2-hydroxy-ethyl)-amino-tris(hydroxymethyl)aminomethane
- Eya: Eyes absent
- HADSF: haloalkanoate dehalogenase superfamily
- HEPES: 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid
- KDN-9-P: 2-keto-3-deoxy-D-glycero-D-galacto-9-phosphonononic acid
- KDO-8-P: 2-keto-3-deoxy-8-phospho-D-manno-octulosonic acid
- pNPP: p-nitrophenyl phosphate
- rmsd: root-mean-square deviation
Footnotes
This work was supported by N.I.H. grants GM61099 and U54 GM093342.
The coordinates for the X-ray crystal structure of BT1666 have been deposited in the Protein Data Bank under the ID code 3R4C.
References
- 1. Allen KN, Dunaway-Mariano D. Markers of fitness in a successful enzyme superfamily. Curr Opin Struct Biol. 2009;19:658–665. doi: 10.1016/j.sbi.2009.09.008.
- 2. Allen KN, Dunaway-Mariano D. Phosphoryl group transfer: evolution of a catalytic scaffold. Trends Biochem Sci. 2004;29:495–503. doi: 10.1016/j.tibs.2004.07.008.
- 3. Burroughs AM, Allen KN, Dunaway-Mariano D, Aravind L. Evolutionary genomics of the HAD superfamily: understanding the structural adaptations and catalytic diversity in a superfamily of phosphoesterases and allied enzymes. J Mol Biol. 2006;361:1003–1034. doi: 10.1016/j.jmb.2006.06.049.
- 4. Tremblay LW, Dunaway-Mariano D, Allen KN. Structure and activity analyses of Escherichia coli K-12 NagD provide insight into the evolution of biochemical function in the haloalkanoic acid dehalogenase superfamily. Biochemistry. 2006;45:1183–1193. doi: 10.1021/bi051842j.
- 5. Lu Z, Dunaway-Mariano D, Allen KN. HAD superfamily phosphotransferase substrate diversification: structure and function analysis of HAD subclass IIB sugar phosphatase BT4131. Biochemistry. 2005;44:8684–8696. doi: 10.1021/bi050009j.
- 6. Lu Z, Wang L, Dunaway-Mariano D, Allen KN. Structure-function analysis of 2-keto-3-deoxy-D-glycero-D-galactonononate-9-phosphate phosphatase defines specificity elements in type C0 haloalkanoate dehalogenase family members. J Biol Chem. 2009;284:1224–1233. doi: 10.1074/jbc.M807056200.
- 7. Wang L, Lu Z, Allen KN, Mariano PS, Dunaway-Mariano D. Human symbiont Bacteroides thetaiotaomicron synthesizes 2-keto-3-deoxy-D-glycero-D-galacto-nononic acid (KDN). Chem Biol. 2008;15:893–897. doi: 10.1016/j.chembiol.2008.08.005.
- 8. Otwinowski Z, Minor W. Processing of X-ray diffraction data collected in oscillation mode. Methods in Enzymology. 1997;276:307–326. doi: 10.1016/S0076-6879(97)76066-X.
- 9. Vagin A, Teplyakov A. MOLREP: an Automated Program for Molecular Replacement. Journal of Applied Crystallography. 1997;30(Part 6):1022–1025.
- 10. Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr. 2004;60:2126–2132. doi: 10.1107/S0907444904019158.
- 11. Adams PD, Grosse-Kunstleve RW, Hung LW, Ioerger TR, McCoy AJ, Moriarty NW, Read RJ, Sacchettini JC, Sauter NK, Terwilliger TC. PHENIX: building new software for automated crystallographic structure determination. Acta Crystallogr D Biol Crystallogr. 2002;58:1948–1954. doi: 10.1107/s0907444902016657.
- 12. Brunger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, Jiang JS, Kuszewski J, Nilges M, Pannu NS, Read RJ, Rice LM, Simonson T, Warren GL. Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr D Biol Crystallogr. 1998;54:905–921. doi: 10.1107/s0907444998003254.
- 13. Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Cryst. 1993;26(part 2):283–291.
- 14. Kraulis PJ. MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Cryst. 1991;24:946–950.
- 15. Fenn TD, Ringe D, Petsko GA. POVScript+: a program for model and data visualization using persistence of vision ray-tracing. Journal of Applied Crystallography. 2003;36:944–947.
- 16. Zhang G, Morais MC, Dai J, Zhang W, Dunaway-Mariano D, Allen KN. Investigation of metal ion binding in phosphonoacetaldehyde hydrolase identifies sequence markers for metal-activated enzymes of the HAD enzyme superfamily. Biochemistry. 2004;43:4990–4997. doi: 10.1021/bi036309n.
- 17. Lu Z, Dunaway-Mariano D, Allen KN. The catalytic scaffold of the haloalkanoic acid dehalogenase enzyme superfamily acts as a mold for the trigonal bipyramidal transition state. Proc Natl Acad Sci U S A. 2008;105:5687–5692. doi: 10.1073/pnas.0710800105.
- 18. Kim Y, Yakunin AF, Kuznetsova E, Xu X, Pennycooke M, Gu J, Cheung F, Proudfoot M, Arrowsmith CH, Joachimiak A, Edwards AM, Christendat D. Structure- and function-based characterization of a new phosphoglycolate phosphatase from Thermoplasma acidophilum. J Biol Chem. 2004;279:517–526. doi: 10.1074/jbc.M306054200.
- 19. Rao KN, Kumaran D, Seetharaman J, Bonanno JB, Burley SK, Swaminathan S. Crystal structure of trehalose-6-phosphate phosphatase-related protein: biochemical and biological implications. Protein Sci. 2006;15:1735–1744. doi: 10.1110/ps.062096606.
- 20. Shin DH, Roberts A, Jancarik J, Yokota H, Kim R, Wemmer DE, Kim SH. Crystal structure of a phosphatase with a unique substrate binding domain from Thermotoga maritima. Protein Sci. 2003;12:1464–1472. doi: 10.1110/ps.0302703.
- 21. Silvaggi NR, Zhang C, Lu Z, Dai J, Dunaway-Mariano D, Allen KN. The X-ray crystal structures of human alpha-phosphomannomutase 1 reveal the structural basis of congenital disorder of glycosylation type 1a. J Biol Chem. 2006;281:14918–14926. doi: 10.1074/jbc.M601505200.
- 22. Gohla A, Birkenfeld J, Bokoch GM. Chronophin, a novel HAD-type serine protein phosphatase, regulates cofilin-dependent actin dynamics. Nat Cell Biol. 2005;7:21–29. doi: 10.1038/ncb1201.
- 23. Krishnan N, Jeong DG, Jung SK, Ryu SE, Xiao A, Allis CD, Kim SJ, Tonks NK. Dephosphorylation of the C-terminal tyrosyl residue of the DNA damage-related histone H2A.X is mediated by the protein phosphatase eyes absent. J Biol Chem. 2009;284:16066–16070. doi: 10.1074/jbc.C900032200.
- 24. Dai J, Finci L, Zhang C, Lahiri S, Zhang G, Peisach E, Allen KN, Dunaway-Mariano D. Analysis of the structural determinants underlying discrimination between substrate and solvent in beta-phosphoglucomutase catalysis. Biochemistry. 2009;48:1984–1995. doi: 10.1021/bi801653r.
- 25. Dai J, Wang L, Allen KN, Radstrom P, Dunaway-Mariano D. Conformational cycling in beta-phosphoglucomutase catalysis: reorientation of the beta-D-glucose 1,6-(Bis)phosphate intermediate. Biochemistry. 2006;45:7818–7824. doi: 10.1021/bi060136v.
- 26. Kellinger MW, Johnson KA. Nucleotide-dependent conformational change governs specificity and analog discrimination by HIV reverse transcriptase. Proc Natl Acad Sci U S A. 2010;107:7734–7739. doi: 10.1073/pnas.0913946107.
- 27. Hammes-Schiffer S, Benkovic SJ. Relating protein motion to catalysis. Annual Review of Biochemistry. 2006;75:519–541. doi: 10.1146/annurev.biochem.75.103004.142800.
- 28. Fraser JS, Clarkson MW, Degnan SC, Erion R, Kern D, Alber T. Hidden alternative structures of proline isomerase essential for catalysis. Nature. 2009;462:669–673. doi: 10.1038/nature08615.
- 29. Garcia-Boronat M, Diez-Rivero CM, Reinherz EL, Reche PA. PVS: a web server for protein sequence variability analysis tuned to facilitate conserved epitope discovery. Nucleic Acids Res. 2008;36:W35–41. doi: 10.1093/nar/gkn211.
- 30. Jensen RA. Enzyme recruitment in evolution of new function. Annu Rev Microbiol. 1976;30:409–425. doi: 10.1146/annurev.mi.30.100176.002205.
- 31. Ohno S, editor. Evolution by gene duplication. xv. Allen & Unwin/Springer-Verlag; London-New York: 1970.
- 32. Khersonsky O, Tawfik DS. Enzyme promiscuity: a mechanistic and evolutionary perspective. Annu Rev Biochem. 2010;79:471–505. doi: 10.1146/annurev-biochem-030409-143718.
- 33. Shoemaker NB, Vlamakis H, Hayes K, Salyers AA. Evidence for extensive resistance gene transfer among Bacteroides spp. and among Bacteroides and other genera in the human colon. Appl Environ Microbiol. 2001;67:561–568. doi: 10.1128/AEM.67.2.561-568.2001.
- 34. Foster JM, Davis PJ, Raverdy S, Sibley MH, Raleigh EA, Kumar S, Carlow CK. Evolution of bacterial phosphoglycerate mutases: non-homologous isofunctional enzymes undergoing gene losses, gains and lateral transfers. PLoS One. 2010;5:e13576. doi: 10.1371/journal.pone.0013576.
- 35. Shaheduzzaman SM, Akimoto S, Kuwahara T, Kinouchi T, Ohnishi Y. Genome analysis of Bacteroides by pulsed-field gel electrophoresis: chromosome sizes and restriction patterns. DNA Res. 1997;4:19–25. doi: 10.1093/dnares/4.1.19.