Author manuscript; available in PMC: 2016 Dec 8.
Published in final edited form as: Biochemistry. 2015 Nov 24;54(48):7142–7155. doi: 10.1021/acs.biochem.5b01143

Structural Studies of Geosmin Synthase, a Bifunctional Sesquiterpene Synthase with Alpha-Alpha Domain Architecture that Catalyzes a Unique Cyclization-Fragmentation Reaction Sequence

Golda G Harris , Patrick M Lombardi †,§, Travis A Pemberton , Tsutomu Matsui &, Thomas M Weiss &, Kathryn E Cole †,, Mustafa Köksal †,#, Frank V Murphy IV $, L Sangeetha Vedula , Wayne KW Chou , David E Cane , David W Christianson †,¢,*
PMCID: PMC4674366  NIHMSID: NIHMS738513  PMID: 26598179

Abstract

Geosmin synthase from Streptomyces coelicolor (ScGS) catalyzes an unusual, metal-dependent terpenoid cyclization and fragmentation reaction sequence. Two distinct active sites are required for catalysis: the N-terminal domain catalyzes the ionization and cyclization of farnesyl diphosphate to form germacradienol and inorganic pyrophosphate (PPi), and the C-terminal domain catalyzes the protonation, cyclization, and fragmentation of germacradienol to form geosmin and acetone through a retro-Prins reaction. A unique αα domain architecture is predicted for ScGS based on amino acid sequence: each domain contains the metal-binding motifs typical of a class I terpenoid cyclase, and each domain requires Mg2+ for catalysis. Here, we report the X-ray crystal structure of the unliganded N-terminal domain of ScGS and the structure of its complex with 3 Mg2+ ions and alendronate. These structures highlight conformational changes required for active site closure and catalysis. Although neither full-length ScGS nor constructs of the C-terminal domain could be crystallized, homology models of the C-terminal domain were constructed based on ~36% sequence identity with the N-terminal domain. Small-angle X-ray scattering experiments yield low resolution molecular envelopes into which the N-terminal domain crystal structure and the C-terminal domain homology model were fit, suggesting possible αα domain architectures as frameworks for bifunctional catalysis.


With over 75,000 members identified to date, terpenoids (also known as terpenes or isoprenoids) comprise the largest known family of natural products.1 Their structural and stereochemical diversity enables myriad biological functions across all domains of life, such as defense against parasites and other predators, inter- and intra-species communication, and intracellular signaling.2–4 Many terpenoids are used as flavorings, such as menthol5 and the cannabinoid β-caryophyllene;6 fragrances, such as limonene7 and sclareol;8 advanced biofuels, such as farnesane9 and bisabolane;10 and pharmaceuticals, such as artemisinin11 and Taxol.12 Thus, the broad commercial importance of this family of natural products spans diverse chemical industries worldwide.13–16

The vast chemodiversity of terpenoid natural products belies simple biosynthetic roots in the C5 building blocks isopentenyl diphosphate and dimethylallyl diphosphate, which are substrates for chain elongation reactions that produce acyclic isoprenoids such as the C15 sesquiterpene farnesyl diphosphate (FPP) (Figure 1).17,18 Isoprenoid diphosphates such as FPP then serve as substrates for terpenoid cyclases that generate diverse products containing multiple fused rings and stereocenters.19–24 Impressively, more than half of the carbon atoms of the isoprenoid diphosphate substrate typically undergo changes in bonding and/or hybridization during the course of a complex, multi-step cyclization cascade. Given their biosynthetic roots in C5 isoprenoid precursors, cyclic terpenoids usually contain 5n carbon atoms (n = 1, 2, 3…), unless the terpenoid has been subject to additional biosynthetic modification preceding or following cyclization (e.g., methylation, acetylation, benzoylation, etc.).

Figure 1.


Cyclization sequence catalyzed by bifunctional geosmin synthase from Streptomyces coelicolor (ScGS). The FPP cyclization reaction indicated in black is catalyzed by the N-terminal α domain, and the retro-Prins fragmentation reaction indicated in green is catalyzed by the C-terminal α domain.

The bicyclic natural product geosmin (Figure 1) is a noncanonical terpenoid, in that it contains 12 carbon atoms instead of 5n carbon atoms.25,26 Remarkably, the chemistry that yields this noncanonical terpenoid with C12 ≠ C5n is mediated by the terpenoid cyclase itself, rather than an upstream or downstream processing enzyme in the geosmin biosynthetic pathway.27,28 Geosmin is a powerful odorant with an extremely low human detection threshold of less than 10 parts-per-trillion, and is mainly responsible for the characteristic odor of freshly turned earth.29,30 Although geosmin contributes to the pleasant, earthy flavor of beets,31 it is also a commonly occurring contaminant of musty-tasting water, wine, and fish.32–35 Geosmin is not known to cause human disease, but its detection and elimination from potable water sources is a critical environmental and water quality issue.

Geosmin is produced by essentially all known species of the Gram-positive bacterial genus Streptomyces. Geosmin synthase from the soil bacterium Streptomyces coelicolor (ScGS) is a 726-residue, bifunctional sesquiterpene cyclase that catalyzes tandem metal ion-dependent cyclization and fragmentation reactions utilizing the C15 substrate FPP to yield C12 geosmin, C3 acetone, and inorganic pyrophosphate (PPi) (Figure 1).27,28,36,37 The active site in the N-terminal domain of ScGS catalyzes the ionization-dependent cyclization of FPP to form PPi and two cyclic products: germacradienol (major product, 85%) and germacrene D (minor product, 15%). After dissociation from the N-terminal domain, germacradienol is rebound to the active site of the C-terminal domain where it is converted to geosmin in a protonation-dependent cyclization reaction accompanied by the elimination of acetone through a retro-Prins reaction.28,38 Analysis of the dependence of the ratio of geosmin to germacradienol on protein concentration has established that dissociation of the germacradienol intermediate is mandatory, suggesting that there is no direct channel for transfer of the intermediate from the active site of the N-terminal domain to that of the C-terminal domain.28 This diffusive transfer of the germacradienol intermediate is reminiscent of the established mode of transfer of the copalyl diphosphate intermediate between the class II and class I terpenoid synthase domains of abietadiene synthase and other labdane synthases.39 The observed fragmentation chemistry catalyzed by the C-terminal domain of ScGS is unprecedented in terpenoid cyclase reactions. Even more curious is the Mg2+ requirement for catalysis by the C-terminal domain, since the role for metal ion(s) is unclear in the absence of an isoprenoid diphosphate substrate; moreover, the fragmentation reaction does not require inorganic pyrophosphate.28 These results are supported by experiments with the individual recombinant N-terminal and C-terminal domains as well as reconstituted mixtures of the two.28

Because the tandem cyclization-fragmentation reactions catalyzed by ScGS require two distinct active sites, a unique αα domain architecture is predicted for ScGS based on primary structure analysis.28 Although bacterial terpenoid cyclases usually consist of a single α domain, as first observed for the sesquiterpene cyclase pentalenene synthase,40 plant cyclases often include multiple domains. For example, αβ domain architecture is found in 5-epi-aristolochene synthase from Nicotiana tabacum,41 and αβγ domain architecture is found in taxadiene synthase from Taxus brevifolia42 (where α, β, and γ domains represent distinct folds, as classified by Oldfield43). In contrast, ScGS is believed to adopt αα domain architecture: the N-terminal domain and the C-terminal domain are separated by a 41-residue linker and share 28% and 29% amino acid sequence identity, respectively, with pentalenene synthase.28,36 Each domain contains characteristic metal ion-binding motifs of class I terpenoid cyclases:44 the aspartate-rich motif is found as D86DHFLE91 and D455DYYP459, and the “NSE/DTE” motif is found as N229DLFSYQRE237 and N598DVFSYQKE606.
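
As a quick consistency check on the motif numbering quoted above, the following minimal sketch strips the residue numbers from each motif label and confirms that the numbered span matches the number of residues listed. The regular expression used for the NSE/DTE motif, (N/D)D(L/I/V)X(S/T)XXXE, is the commonly cited consensus and is an assumption here rather than something stated in this paper; the helper name `parse_motif` is purely illustrative.

```python
import re

# Motif labels as quoted in the text: first residue + number, remaining residues, last number.
MOTIFS = ["D86DHFLE91", "D455DYYP459", "N229DLFSYQRE237", "N598DVFSYQKE606"]

# Assumed consensus for the NSE/DTE motif (not taken from this paper).
NSE_DTE = re.compile(r"[ND]D[LIV].[ST]...E")


def parse_motif(label):
    """Split e.g. 'D86DHFLE91' into (86, 91, 'DDHFLE')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z]+)(\d+)", label)
    first, start, rest, end = match.groups()
    return int(start), int(end), first + rest


for label in MOTIFS:
    start, end, seq = parse_motif(label)
    # The numbered span must contain exactly as many residues as are listed.
    numbering_ok = (end - start + 1) == len(seq)
    print(label, seq, "numbering consistent:", numbering_ok,
          "| NSE/DTE-like:", bool(NSE_DTE.fullmatch(seq)))
```

Both NSE/DTE segments match the assumed consensus, while the two aspartate-rich segments do not, as expected for a different motif class.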

Here, we report the X-ray crystal structure of the unliganded N-terminal cyclase domain of ScGS and the structure of its complex with three Mg2+ ions and alendronate. These structures show that this cyclase domain adopts the characteristic α fold of a class I terpenoid synthase, and also highlight the ligand-induced conformational changes required for complete active site closure and catalysis. Although neither full-length ScGS nor constructs of the C-terminal domain could be crystallized, the C-terminal domain is also predicted to adopt an α fold homologous to that of the N-terminal domain based on approximately 36% amino acid sequence identity between these domains. In the absence of crystal structure data for the full-length enzyme, small-angle X-ray scattering (SAXS) experiments with full-length ScGS and a nearly full-length construct containing residues 1–690 (ScGS690) suggest possible αα domain architectures that are consistent with and indeed may facilitate bifunctional catalysis.

MATERIALS AND METHODS

Expression and purification of full-length ScGS and ScGS690

The plasmid encoding full-length ScGS was prepared previously in the Cane laboratory, and this protein was expressed and purified as described.36 However, we were unable to crystallize full-length ScGS. Because the last 36 residues were predicted to be disordered using the DISOPRED server45 and could possibly hinder crystallization, we prepared a new construct in which these residues were deleted to yield a 690-residue protein. This truncated construct was prepared by PCR mutagenesis using forward primer 5′-GGCATCCTCAACTGGCACCGGTAGTAGCCCCGTTACAAGGCCGAGTACC-3′ and reverse primer 5′-AGGTACTCGGCCTTGTAACGGGGCTACTACCGGTGCCAGTTGAGGATGCC-3′ (underlined nucleotides denote the mutagenic codons). PCR products were purified by gel electrophoresis and extracted according to manufacturer instructions (QIAGEN, QIAQuick Gel Extraction Kit). Purified PCR products were digested with DpnI at 37 °C for 1 h, transformed into XL1-Blue Escherichia coli, and plated on Lysogeny Broth (LB) agar medium typically containing 50 μg/mL ampicillin. Single colonies were used to inoculate 5-mL cultures of LB media with antibiotic (50 μg/mL ampicillin). These 5-mL cultures were grown overnight (16–18 h) at 37 °C with 250 rpm shaking, pelleted by centrifugation (~4000 × g, 20 min, 4 °C), and plasmid DNA was extracted using a QIAGEN QIAprep Spin Miniprep Kit. The plasmid was confirmed to encode the desired 690-residue protein by sequencing at the DNA Sequencing Facility at the Perelman School of Medicine, University of Pennsylvania.
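
The mutagenic primer pair quoted above can be sanity-checked with a short sketch. It verifies that the forward primer's reverse complement is contained in the reverse primer, and it locates the tandem TAGTAG in the forward primer. Treating the first nucleotide of the forward primer as the start of a codon is an assumption made here for illustration; under that assumption the TAGTAG would read as two consecutive stop codons.

```python
FORWARD = "GGCATCCTCAACTGGCACCGGTAGTAGCCCCGTTACAAGGCCGAGTACC"
REVERSE = "AGGTACTCGGCCTTGTAACGGGGCTACTACCGGTGCCAGTTGAGGATGCC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


rc_forward = reverse_complement(FORWARD)

# Complementary mutagenic primers should anneal to each other over their shared region.
overlap_ok = rc_forward in REVERSE

# Position of the tandem TAG codons, counted from the 5' end of the forward primer.
stop_index = FORWARD.find("TAGTAG")
in_assumed_frame = stop_index % 3 == 0  # frame assumed to start at primer nucleotide 1

print("reverse primer contains revcomp(forward):", overlap_ok)
print("TAGTAG offset:", stop_index, "| in assumed frame:", in_assumed_frame)
```

As printed here, the reverse primer carries one extra 5′ nucleotide relative to the exact reverse complement of the forward primer, so the check uses containment rather than equality.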

A His6-tag was added to the C-terminus of this protein to facilitate its purification. Thus, although 36 residues were deleted from the C-terminus of full-length ScGS, 8 residues were added to the C-terminus of this construct (the His6 motif plus a 2-residue linker) such that the ScGS690 construct has a net loss of 28 residues from the C-terminus. PCR amplification (using PfuUltra DNA polymerase and 7% DMSO) of an NdeI-KpnI fragment was conducted using forward primer 5′-GCAGCAGCACATATGACGCAACAGCCCTTCCAACTCCCGCAC-3′ and reverse primer 5′-GCAGCAGCAGGTACCCCGGTGCCAGTTGAGGATGCCGGC-3′ (underlined nucleotides denote restriction sites for NdeI and KpnI, respectively). The PCR-amplified insert was purified by agarose gel electrophoresis and extracted. The insert was digested with NdeI and KpnI at 37 °C for 3 h, gel purified, and extracted to prepare the final insert fragment for ligation.

A variant of the pET22b vector (pET22bMV; Novagen) was used for ligation. A KpnI restriction site immediately upstream of the C-terminal His6-tag was created using PCR with the following forward and reverse mutagenic primers: 5′-CGTCGACAAGCTTGCGGCCGCAGGTACCCACCACCACCACCACCACTGAG-3′ and 5′-CTCAGTGGTGGTGGTGGTGGTGGGTACCTGCGGCCGCAAGCTTGTCGACG-3′ (underlined nucleotides denote the restriction site for KpnI). The vector was digested with NdeI, KpnI, and calf intestinal alkaline phosphatase at 37 °C for 3 h, gel purified, and extracted. The insert and vector were ligated using DNA ligase with 1:1, 2:1, and 5:1 insert:vector molar ratios at 15 °C overnight. Ligation products were transformed into XL1-Blue E. coli and plated on LB agar medium containing 50 μg/mL ampicillin. Single colonies were used to inoculate 5-mL cultures of LB media with antibiotic (50 μg/mL ampicillin), and plasmid DNA was extracted from the overnight cultures. DNA sequencing confirmed that the protein encoded by this plasmid, ScGS690, included residues 1–690 of ScGS and a C-terminal His6-tag to facilitate purification.
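
The residue bookkeeping for ScGS690 can be reproduced from the vector primer quoted above. The sketch below translates the forward mutagenic primer starting at the KpnI site and then repeats the arithmetic for the construct length. Reading the KpnI site as two in-frame codons is an assumption, although it is consistent with the stated 2-residue linker; the codon table contains only the four codons needed for this stretch.

```python
VECTOR_PRIMER = "CGTCGACAAGCTTGCGGCCGCAGGTACCCACCACCACCACCACCACTGAG"
KPNI_SITE = "GGTACC"

# Minimal codon table: only the codons that occur downstream of the KpnI site.
CODONS = {"GGT": "G", "ACC": "T", "CAC": "H", "TGA": "*"}


def translate_from(seq, start):
    """Translate codon by codon from `start` until a stop codon is reached."""
    peptide = []
    for i in range(start, len(seq) - 2, 3):
        residue = CODONS[seq[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


# Assumption: the KpnI site is read in frame as the 2-residue linker.
tag = translate_from(VECTOR_PRIMER, VECTOR_PRIMER.index(KPNI_SITE))

FULL_LENGTH = 726
DELETED = 36
scgs_residues = FULL_LENGTH - DELETED        # residues 1-690
total_residues = scgs_residues + len(tag)    # ScGS residues plus linker and His6
net_loss = FULL_LENGTH - total_residues

print(tag, scgs_residues, total_residues, net_loss)
```

Under these assumptions the appended segment is a Gly-Thr linker followed by six histidines, and the construct ends at position 698, which matches the H698 endpoint used later for homology modeling.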

This plasmid was transformed into BL21(DE3)pLysS E. coli, and cells were plated on LB agar medium containing 34 μg/mL chloramphenicol and 100 μg/mL ampicillin. Single colonies were used to inoculate 5-mL cultures of LB media with antibiotic (34 μg/mL chloramphenicol and 50 μg/mL ampicillin). These 5-mL cultures were grown overnight (16–18 h) at 37 °C with 250 rpm shaking. Overnight cultures were used to inoculate 1-L cultures of LB media with antibiotic (34 μg/mL chloramphenicol and 50 μg/mL ampicillin). These 1-L cultures were grown at 37 °C with 250 rpm shaking until reaching an OD600 of 0.5–0.8. Cultures were cooled to 18–20 °C and induced with 1 mL of 1 M isopropyl-β-D-1-thiogalactopyranoside (IPTG). Induction continued overnight at 18–20 °C with 250 rpm shaking.

Cells were harvested by centrifugation (4200 × g, 10 min, 4 °C) and resuspended in buffer A [50 mM Tris (pH 8.2), 50 mM NaCl, 5 mM MgCl2, 20% glycerol, 5 mM β-mercaptoethanol (BME)] with protease inhibitor cocktail added. Cells were lysed by sonication and the cell lysate was clarified by centrifugation (30,000 × g, 1 h, 4 °C). Supernatant was loaded onto a Ni-IDA affinity column that had been equilibrated with buffer A. The column was washed with 90% buffer A/10% buffer B [50 mM Tris (pH 8.2), 50 mM NaCl, 5 mM MgCl2, 250 mM imidazole, 20% glycerol, 5 mM BME], and protein was eluted with 20% buffer A/80% buffer B. Protein purity was assessed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). Unfortunately, we were unable to crystallize ScGS690, so we resorted to proteolytic treatment to generate a protein sample that would crystallize.
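
Because buffers A and B differ only in imidazole (0 mM versus 250 mM), the imidazole concentration in the wash and elution steps follows directly from the mixing ratios. The helper `blend` below is an illustrative name, and the calculation assumes ideal volumetric mixing.

```python
def blend(conc_a, conc_b, fraction_b):
    """Concentration after mixing buffers A and B, with B at volume fraction `fraction_b`."""
    return conc_a * (1.0 - fraction_b) + conc_b * fraction_b


IMIDAZOLE_A_MM = 0.0     # buffer A contains no imidazole
IMIDAZOLE_B_MM = 250.0   # buffer B

wash_mM = blend(IMIDAZOLE_A_MM, IMIDAZOLE_B_MM, 0.10)     # 90% A / 10% B
elution_mM = blend(IMIDAZOLE_A_MM, IMIDAZOLE_B_MM, 0.80)  # 20% A / 80% B

print(f"wash: {wash_mM:.0f} mM imidazole, elution: {elution_mM:.0f} mM imidazole")
```

All other buffer components (Tris, NaCl, MgCl2, glycerol, BME) are identical in A and B, so their concentrations are unchanged by the blend.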

Limited proteolysis of ScGS690

Fractions containing ScGS690 were concentrated to 1 mg/mL and subjected to limited proteolysis by proteinase K using the following conditions: 1:1000 protease:substrate in the presence of 5 mM CaCl2, incubated at room temperature for 30 minutes. The proteolysis reaction was quenched with 5 mM phenylmethanesulfonyl fluoride. The limited proteolysis mixture was concentrated to approximately 5 mL by centrifugation and loaded onto a Superdex-200 size exclusion column that had been equilibrated with size exclusion chromatography buffer [25 mM Tris (pH 8.2), 5 mM MgCl2, 5 mM BME]. Two protein peaks were observed on the size exclusion chromatogram, corresponding to the molecular weight of a complex of two domains, and the molecular weight of a single domain. Fractions containing these two species were separated and concentrated to approximately 10 mg/mL. Protein purity was assessed by SDS-PAGE.

Expression and purification of the N-terminal domain, ScGS366

Plasmid pRW22, containing ScGS residues 1–366 cloned into pET-21d, was prepared as previously described.28 The plasmid pRW22 was transformed into BL21(DE3)pLysS E. coli cells and the transformants were plated on LB agar medium containing 34 μg/mL chloramphenicol and 100 μg/mL ampicillin. Cell colonies were used to inoculate four 5-mL cultures of LB media and antibiotic (34 μg/mL chloramphenicol and 100 μg/mL ampicillin) that were grown to saturation overnight at 37 °C with 250 rpm shaking. Overnight cultures were used to inoculate four 250-mL cultures of LB media and antibiotic (34 μg/mL chloramphenicol and 100 μg/mL ampicillin) that were grown at 37 °C with 250 rpm shaking until reaching an OD600 of approximately 0.5. Protein expression was induced by the addition of 100 μL of 1 M IPTG and allowed to continue overnight at 25 °C.
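
The final IPTG concentrations implied by the induction volumes in this section follow from a simple dilution. The sketch below applies C_final = C_stock × V_added / V_total to the 1-L ScGS690 cultures (1 mL of 1 M IPTG) and to the 250-mL ScGS366 cultures (100 μL of 1 M IPTG). It assumes that the stated culture volumes are exact and that the 100 μL was added to each 250-mL culture; the function name is illustrative.

```python
def final_iptg_mM(stock_M, added_mL, culture_mL):
    """Final concentration in mM after adding `added_mL` of stock to `culture_mL` of culture."""
    return stock_M * 1000.0 * added_mL / (culture_mL + added_mL)


scgs690_mM = final_iptg_mM(stock_M=1.0, added_mL=1.0, culture_mL=1000.0)
scgs366_mM = final_iptg_mM(stock_M=1.0, added_mL=0.1, culture_mL=250.0)

print(f"ScGS690 induction: ~{scgs690_mM:.2f} mM IPTG")
print(f"ScGS366 induction: ~{scgs366_mM:.2f} mM IPTG")
```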

Cells were harvested by centrifugation at 6,100 × g for 15 minutes and resuspended in lysis buffer [50 mM Bis-tris (pH 7.0), 20% glycerol, and 10 mM BME] with a protease inhibitor cocktail tablet added. The cells were lysed by sonication and then centrifuged at 57,400 × g at 4 °C for 1 h. Following centrifugation, the supernatant was passed through a 0.22-μm filter and then loaded onto a Q-sepharose anion exchange column that had been equilibrated with buffer C [50 mM Bis-tris (pH 6.95), 20% glycerol, 5 mM MgCl2, 10 mM BME, and 50 mM NaCl]. After washing with 50 mL of buffer C, the protein was eluted by a 375 mL linear gradient of buffer C to buffer D [50 mM Bis-tris (pH 6.95), 20% glycerol, 5 mM MgCl2, 10 mM BME, and 500 mM NaCl]. Fractions containing ScGS366, as determined by SDS-PAGE, were collected and ammonium sulfate was added to a final concentration of 1.5 M. The solution was passed through a 0.22-μm filter and loaded onto a methyl column that had been equilibrated in buffer E [50 mM Bis-tris (pH 6.95), 20% glycerol, 5 mM MgCl2, 10 mM BME, and 1.5 M ammonium sulfate]. After washing with 250 mL of buffer E, the protein was eluted using a 150 mL linear gradient from buffer E to buffer F [50 mM Bis-tris (pH 6.95), 20% glycerol, 5 mM MgCl2, and 10 mM BME]. The purest fractions, as determined by SDS-PAGE, were pooled, concentrated to approximately 5 mL by ultrafiltration over a YM-10 membrane, and loaded onto a Superdex-200 size exclusion column that had been equilibrated in buffer G [25 mM Tris (pH 8.2), 5 mM MgCl2, and 10 mM BME]. The purest fractions, as determined by SDS-PAGE, were collected and concentrated by ultrafiltration over a YM-10 membrane. A Bradford assay was conducted to determine the final protein concentration, and aliquots of 5–15 mg/mL ScGS366 were prepared for crystallization trials.

Expression attempts with the C-terminal domain of ScGS

Three constructs containing varying segments of the C-terminal domain of ScGS were prepared, each with a C-terminal His6-tag: (1) residues 390–690, (2) residues 327–690, and (3) residues 327–726. Plasmids containing these constructs were transformed into BL21(DE3)pLysS E. coli and expressed as described above for ScGS690, except that 400 μL of 1.0 M IPTG was used for induction. We attempted to purify these constructs by Ni-IDA affinity chromatography as with ScGS690, but found that no expressed protein was in the soluble fraction of the lysate. It was determined that these C-terminal domain constructs were expressed primarily in inclusion bodies.

X-ray crystal structure determinations

For ScGS690 after limited proteolysis, fractions from the size exclusion column containing two-domain and single-domain proteins were isolated and used for separate crystallization trials using the sitting-drop vapor diffusion method. Only single-domain protein samples yielded crystals. Typically, a 0.6-μL drop of protein solution [7 mg/mL protein, 25 mM Tris (pH 8.2), 5 mM MgCl2, 10 mM BME, and 1.5 mM sodium alendronate] was added to a 0.6-μL drop of precipitant solution [0.2 M sodium acetate trihydrate (pH 7.0), and 20% (w/v) polyethylene glycol 3,350] and equilibrated against a 100-μL reservoir of precipitant solution at room temperature. Crystals formed from these conditions contained only the N-terminal domain of ScGS, residues 1–338, as confirmed by mass spectrometry. Hence, this protein was designated ScGS338. Crystals of ScGS338 belonged to space group P43212 (a = b = 67.223 Å, c = 345.454 Å; α = β = γ = 90°) with two molecules in the asymmetric unit. These crystals diffracted X-rays to 2.11 Å resolution at the National Synchrotron Light Source, beamline X-29, Brookhaven National Laboratory.

For ScGS366, the hanging drop vapor diffusion method was used for crystallization. A 2.0-μL drop of protein solution [5–15 mg/mL protein, 25 mM Tris (pH 8.2), 5 mM MgCl2, and 10 mM BME] was added to a 2.0-μL drop of precipitant solution [0.1 M Tris (pH 8.2), 0.5 M MgCl2, and 28% PEG 400] and equilibrated against a 1-mL reservoir of precipitant solution at 4 °C. Showers of needles, approximately 0.005 mm × 0.005 mm × 0.040 mm, appeared in less than a week. X-ray diffraction data were collected from single needles using a 20-μm wide beam at the Northeastern Collaborative Access Team beamline 24-ID-E, Advanced Photon Source, Argonne National Laboratory. Crystals belonged to space group P43212 (a = b = 76.8 Å, c = 130.6 Å; α = β = γ = 90°) with one molecule in the asymmetric unit, and initially diffracted X-rays to 2.4 Å resolution.
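
The reported cell dimensions and asymmetric unit contents can be checked for plausibility with a Matthews coefficient estimate. The sketch below is a rough calculation only: it assumes an average residue mass of 110 Da (the actual molecular weights are not given here) and uses the common approximation that the solvent fraction is 1 − 1.23/Vm. Space group P43212 has 8 asymmetric units per unit cell.

```python
ASU_PER_CELL = 8            # general positions in space group P4(3)2(1)2
AVG_RESIDUE_MASS_DA = 110.0  # rough assumption, not a value from the paper


def matthews(a, b, c, n_residues, molecules_per_asu):
    """Return (Vm in cubic angstroms per Da, approximate solvent fraction)."""
    cell_volume = a * b * c  # valid because all cell angles are 90 degrees
    protein_mass = (n_residues * AVG_RESIDUE_MASS_DA
                    * molecules_per_asu * ASU_PER_CELL)
    vm = cell_volume / protein_mass
    solvent_fraction = 1.0 - 1.23 / vm
    return vm, solvent_fraction


vm_338, solvent_338 = matthews(67.223, 67.223, 345.454, 338, molecules_per_asu=2)
vm_366, solvent_366 = matthews(76.8, 76.8, 130.6, 366, molecules_per_asu=1)

print(f"ScGS338: Vm = {vm_338:.2f}, solvent ~ {solvent_338:.0%}")
print(f"ScGS366: Vm = {vm_366:.2f}, solvent ~ {solvent_366:.0%}")
```

Both estimates fall within the range typically seen for protein crystals, which is consistent with two molecules and one molecule per asymmetric unit, respectively.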

Data collection from ScGS366 needles was hindered by rapid crystal decay in the X-ray beam, even when crystals were frozen in liquid nitrogen. High-resolution X-ray diffraction disappeared after irradiating the ScGS366 needle crystals for just a few seconds. To produce the most complete dataset of reflections achievable (88.7% complete overall, 77.6% complete in the highest resolution shell), X-ray diffraction was recorded from several positions along the long axis of individual ScGS366 needle crystals and then merged with diffraction data from crystals looped in different orientations. While the Rmerge for the data is relatively high (Rmerge = 0.201 overall), the reflections exhibit a strong signal-to-noise ratio (I/σ = 4.00 in the highest resolution shell) and are internally consistent as judged by CC1/2 (CC1/2 = 0.896 in the highest resolution shell), a metric that is considered superior to Rmerge in assessing data quality.46,47
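
To make the two data quality metrics concrete, the following sketch computes Rmerge and CC1/2 for a small set of replicate intensities. The intensities are invented for illustration and are not data from this study, and this is not how HKL2000 computes these statistics internally. Rmerge sums the absolute deviation of each observation from its reflection mean; CC1/2 is taken here as the Pearson correlation between mean intensities from two random halves of the observations.

```python
import random
from statistics import mean


def r_merge(observations):
    """observations: dict mapping a reflection index to a list of replicate intensities."""
    numerator = sum(abs(i - mean(obs)) for obs in observations.values() for i in obs)
    denominator = sum(i for obs in observations.values() for i in obs)
    return numerator / denominator


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cc_half(observations, seed=0):
    """Correlate per-reflection means from two random halves of the observations."""
    rng = random.Random(seed)
    half1, half2 = [], []
    for obs in observations.values():
        if len(obs) < 2:
            continue  # a single observation cannot be split
        shuffled = list(obs)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        half1.append(mean(shuffled[:mid]))
        half2.append(mean(shuffled[mid:]))
    return pearson(half1, half2)


# Hypothetical replicate intensities, for illustration only.
toy = {
    (1, 0, 0): [100.0, 110.0, 90.0, 105.0],
    (0, 1, 1): [20.0, 25.0, 15.0, 22.0],
    (2, 1, 0): [55.0, 50.0, 60.0, 52.0],
}

print("Rmerge:", round(r_merge(toy), 3))
print("CC1/2:", round(cc_half(toy), 3))
```

The toy example also shows why the two metrics can diverge: Rmerge grows with the scatter among replicates, whereas CC1/2 stays high as long as the half-dataset means still track each other.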

X-ray diffraction data sets were indexed, integrated, and scaled using HKL2000.48 The crystal structure of ScGS366 was determined by molecular replacement using the program phenix.mr_rosetta49 with a model of pentalenene synthase40 edited to match the ScGS366 sequence and refined with Rosetta prior to rotation and translation function calculations. For the liganded ScGS338 structure crystallized after limited proteolysis, the lower resolution ScGS366 structure was used as a search model. Manual model building and refinement were performed with COOT and PHENIX, respectively.50,51 Structure validation of each final model was performed using MolProbity.52 Data collection and refinement statistics for the unliganded and liganded structures of the N-terminal domain of ScGS are recorded in Table 1. Disordered segments characterized by uninterpretable broken or missing electron density were excluded from the final models. In the structure of the ScGS338-Mg2+3-alendronate complex, these segments included D116-A119; in the structure of unliganded ScGS366, these segments included D116-M121 and S167-A175.

Table 1.

Data collection and refinement statistics

| Complex | Unliganded ScGS366 | Liganded ScGS338 |
| --- | --- | --- |
| A. Data Collection | | |
| Resolution limits (Å) | 34.3 − 3.20 | 48.2 − 2.11 |
| Total/unique reflections measured | 22930/6094 | 567185/46408 |
| Space group | P43212 | P43212 |
| Unit cell: a, b, c (Å) | 76.8, 76.8, 130.6 | 67.2, 67.2, 345.4 |
| Rmerge^a,b | 0.201 (0.326) | 0.145 (0.918) |
| I/σ(I)^a | 6.99 (4.00) | 14.64 (1.68) |
| Completeness (%)^a | 88.7 (77.6) | 98.8 (98.6) |
| CC1/2 | 0.969 (0.896) | 0.791 (0.391) |
| B. Refinement | | |
| Reflections used in refinement/test set | 10938/534 | 43984/2282 |
| Rwork^a,c | 0.240 (0.292) | 0.207 (0.301) |
| Rfree^a,c | 0.288 (0.317) | 0.250 (0.352) |
| Protein atoms^d | 2353 | 5099 |
| Ligand atoms^d | 0 | 28 |
| Mg2+ ions^d | 0 | 6 |
| Solvent atoms^d | 6 | 145 |
| R.m.s. deviations | | |
| bonds (Å) | 0.002 | 0.006 |
| angles (deg) | 0.7 | 0.9 |
| Average B factors (Å²) | 43 | 44 |
| Main chain | 44 | 43 |
| Side chain | 43 | 45 |
| Ligand | | 37 |
| Solvent | 14 | 44 |
| Ramachandran Plot (%)^e | | |
| Allowed | 94.7 | 92.7 |
| Additionally allowed | 5.3 | 7.0 |
| Generously allowed | 0.0 | 0.4 |
| Disallowed | 0.0 | 0.0 |

^a Numbers in parentheses refer to the highest resolution shell of data.

^b Rmerge for replicate reflections, R = Σ|Ih − 〈Ih〉|/Σ〈Ih〉; Ih = intensity measure for reflection h; and 〈Ih〉 = average intensity for reflection h calculated from replicate data.

^c Rwork = Σ||Fo| − |Fc||/Σ|Fo| for reflections contained in the working set. Rfree = Σ||Fo| − |Fc||/Σ|Fo| for reflections contained in the test set held aside during refinement (5% of total). |Fo| and |Fc| are the observed and calculated structure factor amplitudes, respectively.

^d Per asymmetric unit.

^e Evaluated with PROCHECK.

Homology modeling

Four online servers were used to create homology models of the C-terminal domain of ScGS: SWISS-MODEL,53 I-TASSER,54 Phyre2,55 and HHPred/MODELLER.56,57 For all models, residues G374-H698 (as expressed in the ScGS690 construct containing a C-terminal His-tag) or the full-length C-terminal domain (residues G374-H726) were provided as the sequences to be modeled. For I-TASSER, Phyre2, and HHPred/MODELLER, no template was specified for modeling. Phyre2 created one ScGS C-terminal domain model using chain A of selinadiene synthase (PDB ID 4OKM)58 as a structural template. I-TASSER created three different ScGS C-terminal models using threading templates identified by LOMETS59 with the highest significance; the top 10 threading alignments included structures of epi-isozizaene synthase (PDB IDs 3KB9 and 4LTV),60,61 chain A of selinadiene synthase (PDB ID 4OKM),58 chain A of hedycaryol synthase (PDB ID 4MC3),62 and chain B of N219L pentalenene synthase (PDB ID 1HM7).63 HHPred performed a multiple sequence alignment, and MODELLER created one ScGS C-terminal domain model using chain A of hedycaryol synthase (PDB ID 4MC3)62 as a template. When chain A of the ScGS338 structure was specified as a template, I-TASSER and SWISS-MODEL created a sixth and seventh ScGS C-terminal domain model. For homology modeling of the full-length C-terminal domain (residues G374-H726), I-TASSER created five homology models based on chain A of the ScGS338 structure as a template. These models corresponded to the 5 largest clusters of predicted structures generated by I-TASSER.

For all C-terminal domain homology models, significant errors in geometry (e.g. clashes, poor rotamers, poor Ramachandran angles, cis-peptide linkages) were detected by MolProbity. Phenix was used to normalize the geometry of these models using the program phenix.geometry_minimization.51 MolProbity was then used to assess the quality of each model. The side chains of certain asparagine, glutamine, and histidine residues were flipped as recommended to optimize hydrogen bond interactions. Final models were evaluated on the basis of their MolProbity statistics. We judged the C-terminal domain models generated by I-TASSER, with the N-terminal domain serving as a template, as the best models. The two C-terminal domain homology models – one based on the truncated construct ScGS690, the other based on full-length ScGS – were used in rigid body modeling with small-angle X-ray scattering data.

Small-angle X-ray scattering (SAXS)

Samples of ScGS690 after Ni-IDA purification were used for macromolecular SAXS experiments at the Stanford Synchrotron Radiation Lightsource (SSRL), beamline 4-2. Protein samples were concentrated to 21 mg/mL, flash-cooled in liquid nitrogen, and shipped to the beamline on dry ice. Aggregation was observed upon thawing at the beamline; samples were centrifuged to pellet the aggregated protein, and the supernatant was used for SAXS experiments. Protein samples were run on a Superdex 200 PC3.2 column (GE Healthcare) in SAXS buffer [25 mM Tris (pH 8.2), 5 mM MgCl2, 5 mM dithiothreitol] in the FPLC-coupled solution SAXS system. X-ray exposure times of 1.0 s were used throughout the data collection. Scattering curves were corrected using scattering data from the size exclusion chromatography effluent as background. PRIMUS was used to analyze the scattering curve, Guinier plot, Kratky plot, and Porod-Debye plot, and to calculate the radius of gyration (Rg) and Porod volume for each data set.64 GNOM was used to calculate the pair distribution function P(r) and to estimate the maximum dimension (Dmax) for the scattering particle.65 Molecular envelopes for the scattering particles were generated using the ShapeUp server.66
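
The radius of gyration is obtained from the Guinier approximation, ln I(q) = ln I(0) − (Rg²/3)q², which holds at low q (commonly taken as q·Rg below about 1.3). The sketch below fits a straight line to ln I versus q² by ordinary least squares on synthetic data. It is not the PRIMUS implementation, and the Rg and I(0) values used to generate the curve are arbitrary choices, not results from this study.

```python
import math


def guinier_fit(q, intensity):
    """Least-squares fit of ln I versus q^2; returns (Rg, I0)."""
    x = [qi ** 2 for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    rg = math.sqrt(-3.0 * slope)  # slope = -Rg^2 / 3
    return rg, math.exp(intercept)


# Synthetic, noise-free curve with arbitrary parameters (illustration only).
TRUE_RG, TRUE_I0 = 30.0, 1000.0
q_values = [0.005 + 0.002 * k for k in range(20)]  # max q * Rg = 1.29
i_values = [TRUE_I0 * math.exp(-(qi * TRUE_RG) ** 2 / 3.0) for qi in q_values]

rg, i0 = guinier_fit(q_values, i_values)
print(f"Rg = {rg:.2f}, I(0) = {i0:.1f}")
```

With real data the fit would be weighted by the experimental errors and restricted to the linear low-q region; upward curvature in that region is a common sign of aggregation.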

Samples of full-length ScGS were used for macromolecular SAXS experiments at the SIBYLS beamline 12.3.1, Advanced Light Source, Lawrence Berkeley National Laboratory. Samples of full-length ScGS were analyzed at three concentrations (2 mg/mL, 5 mg/mL, and 10 mg/mL) in a 96-well plate, and shipped to the beamline at 4 °C. At the beamline, SAXS data for each sample were collected at exposure times of 0.5 s, 1.0 s, 2.0 s, and 4.0 s. The scattering curves for each sample were corrected for background by subtracting the scattering curve of the reference buffer. A merged data set for a sample of full-length ScGS at 10 mg/mL was created by combining the scattering at low q from the 0.5 s exposure scattering curve, and the scattering at high q from the 2.0 s exposure scattering curve. Data analysis was performed as described above for SAXS experiments with ScGS690. Data collection statistics for all SAXS experiments are recorded in Table S1, Supporting Information.
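
One way to combine a short and a long exposure is sketched below: the long-exposure curve is scaled onto the short-exposure curve over an overlap window, and the two are spliced at a cutoff q. The cutoff, the overlap window, and the least-squares scaling are all assumptions made for illustration; the text does not state how the merge was performed, and the function name is hypothetical.

```python
def merge_exposures(short_exp, long_exp, q_cut, overlap):
    """Splice low-q points from `short_exp` with scaled high-q points from `long_exp`.

    Both curves are lists of (q, I) tuples sampled on the same q grid.
    `overlap` is a (q_low, q_high) window used to put the curves on a common scale.
    """
    q_low, q_high = overlap
    pairs = [(s[1], l[1]) for s, l in zip(short_exp, long_exp)
             if q_low <= s[0] <= q_high]
    # Least-squares scale factor k minimizing sum (I_short - k * I_long)^2.
    scale = sum(a * b for a, b in pairs) / sum(b * b for _, b in pairs)
    merged = [(q, i) for q, i in short_exp if q < q_cut]
    merged += [(q, scale * i) for q, i in long_exp if q >= q_cut]
    return merged, scale


# Hypothetical curves: the long exposure is four times more intense.
q_grid = [0.01 * k for k in range(1, 11)]
short_curve = [(q, 100.0 / (1.0 + 50.0 * q)) for q in q_grid]
long_curve = [(q, 4.0 * i) for q, i in short_curve]

merged_curve, scale_factor = merge_exposures(short_curve, long_curve,
                                             q_cut=0.05, overlap=(0.04, 0.07))
print("scale factor:", round(scale_factor, 3), "| points:", len(merged_curve))
```

The rationale for such a merge is that short exposures limit radiation damage, which mainly distorts the low-q region, while longer exposures improve counting statistics at high q.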

SAXS rigid body modeling of the ScGS338 dimer

Since the structure of liganded ScGS338 contains a dimer in the asymmetric unit, this αα quaternary structure was used as a starting point for fitting the αα tertiary structure of ScGS690 into the molecular envelope calculated from SAXS data. The scattering profile for the ScGS338 dimer calculated with the FoXS server67 was compared to the experimental scattering profile of ScGS690 and the two-domain complex resulting from limited proteolysis using the FoXS χ value. Manual rotation of the two molecules of ScGS338 was guided by visual inspection and fit within the molecular envelope calculated by the ShapeUp server;66 the FoXS χ value was used to score the fits of the different poses of the subunits obtained by manual rotation to the experimental scattering profile.
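
The χ value used to score each pose measures the error-weighted misfit between a calculated profile and the experimental profile after scaling. The sketch below is a simplified version that fits only a linear scale factor c; the FoXS server additionally adjusts excluded-volume and hydration-layer parameters, which are omitted here. The numerical profiles in the usage lines are invented for illustration.

```python
import math


def chi_score(i_exp, i_calc, sigma):
    """Return (chi, c) where c is the least-squares scale applied to `i_calc`."""
    weights = [1.0 / s ** 2 for s in sigma]
    scale = (sum(w * e * m for w, e, m in zip(weights, i_exp, i_calc))
             / sum(w * m * m for w, m in zip(weights, i_calc)))
    chi_squared = sum(w * (e - scale * m) ** 2
                      for w, e, m in zip(weights, i_exp, i_calc)) / len(i_exp)
    return math.sqrt(chi_squared), scale


# Hypothetical profiles: a perfect model and a slightly wrong one.
experimental = [20.0, 10.0, 4.0, 1.0]
errors = [0.5, 0.4, 0.3, 0.2]
good_model = [10.0, 5.0, 2.0, 0.5]   # exactly half the experimental intensity
poor_model = [10.0, 6.0, 1.5, 0.7]

print("good model:", chi_score(experimental, good_model, errors))
print("poor model:", chi_score(experimental, poor_model, errors))
```

A lower χ indicates better agreement, so candidate poses can be ranked by this value as long as they are compared against the same experimental curve and error estimates.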

SAXS rigid body modeling of the ScGS338 monomer with homology models of the ScGS C-terminal domain

The unliganded and liganded structures of the N-terminal domain of ScGS were each paired with the 7 homology models of the C-terminal domain of ScGS, resulting in 14 combinations representing possible models for the association of the N-terminal domain and the C-terminal domain in the full length protein. Each of the 14 domain pairs was modeled against the small-angle X-ray scattering curve for ScGS690 using the web service CORAL;68 this process created 14 models for the association of the N- and C-terminal domains, with a linker of the appropriate length modeled as “dummy” residues connecting the two domains. This linker was manually deleted in the coordinate file, and the resulting coordinate file was compared to the scattering curve of ScGS690 with the ShapeUp server66 and the FoXS server67 in the same manner as described above for the ScGS338 dimer. The FoXS χ value was used to score each model against the SAXS data measured from ScGS690. The two-domain models containing the unliganded and liganded N-terminal domain, each paired with the ScGS690 C-terminal domain homology model from I-TASSER, were used for further analysis.
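
The 14 domain pairs described above are simply the Cartesian product of the two N-terminal domain structures and the seven C-terminal domain homology models. The sketch below enumerates them and shows how the best-scoring pair could be selected once a score (for example, a FoXS χ value) is available for each. The model labels are shorthand introduced here, and no scores from the study are reproduced.

```python
from itertools import product

N_TERMINAL = ["unliganded ScGS366", "liganded ScGS338"]

# Shorthand labels for the seven C-terminal domain homology models.
C_TERMINAL = [
    "Phyre2",
    "I-TASSER 1",
    "I-TASSER 2",
    "I-TASSER 3",
    "HHPred/MODELLER",
    "I-TASSER (ScGS338 template)",
    "SWISS-MODEL (ScGS338 template)",
]

domain_pairs = list(product(N_TERMINAL, C_TERMINAL))


def best_pair(pairs, score):
    """Return the pair with the lowest score; `score` maps a pair to a number."""
    return min(pairs, key=score)


print(len(domain_pairs), "domain pairs")
for n_model, c_model in domain_pairs[:3]:
    print(n_model, "+", c_model)
```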

To model ScGS from SAXS data collected from the full-length enzyme, the unliganded and liganded N-terminal domain structures were each paired with the full-length C-terminal domain homology model from I-TASSER and modeled into the SAXS data using CORAL68 in the same manner as described above. These two full-length ScGS models were similarly analyzed and evaluated using the FoXS χ value.

RESULTS

Crystal structure of the ScGS338-Mg2+3-alendronate complex

The N-terminal cyclase domain of ScGS resulting from proteolytic treatment of ScGS690, designated ScGS338, adopts the α-helical class I terpenoid synthase fold as first observed in the crystal structure of avian farnesyl diphosphate synthase.69 Designated as the “α fold” by Oldfield,43 key features of this fold include characteristic metal ion-binding motifs D86DHFLE91 and N229DLFSYQRE237 in which underlined residues coordinate to a cluster of 3 Mg2+ ions in complex with the bisphosphonate inhibitor alendronate (Figure 2; alendronate is formulated as the drug Fosamax used to treat osteoporosis70). Water molecules complete octahedral coordination polyhedra for all three Mg2+ ions. In addition to metal ion coordination, the bisphosphonate moiety is also stabilized by hydrogen bonds with R184, R236, R325, and Y326. Thus, 3 metal ions, 3 basic residues, and the phenolic hydroxyl group of a tyrosine stabilize the anionic bisphosphonate moiety. These interactions are also expected to accompany the binding of the diphosphate group of substrate FPP.

Figure 2.


(A) Simulated annealing omit maps of the Mg2+ ions (magenta map, contoured at 2.5σ) and the bisphosphonate inhibitor alendronate (gray map, contoured at 2.5σ) bound in the active site of the N-terminal domain construct ScGS338 (structure determined at 2.11 Å resolution). Alendronate atoms are color-coded as follows: C = green, N = blue, O = red, P = orange. Metal ligands in the aspartate-rich segment (red) on helix D and the NSE segment (orange) on helix H are labeled. (B) Metal coordination interactions (solid lines) and hydrogen bond interactions (dashed lines) in the ScGS338-Mg2+3-alendronate complex. Atoms are color-coded as follows: C = red (aspartate-rich metal-binding motif), orange (NSE metal-binding motif), cyan (diphosphate recognition motif), and green (alendronate); N = blue, O = red, P = yellow; Mg2+ ions are large magenta spheres, water molecules are small red spheres.

Interestingly, in other cyclase structures the 3 basic residues that donate hydrogen bonds to the substrate diphosphate group comprise a combination of arginine and lysine side chains.44 However, in ScGS338 the 3 basic residues are all arginine side chains. Additionally, in some but not all cyclase active sites there is a tyrosine that donates a hydrogen bond to a diphosphate oxygen of inorganic pyrophosphate (PPi) or a similar anionic ligand such as a bisphosphonate.44 In addition to providing a sufficient driving force to trigger FPP ionization, these hydrogen bond interactions along with metal coordination interactions help maintain the active site in a closed conformation, inaccessible to bulk solvent, for the duration of the cyclization cascade. As a consequence, carbocation intermediates in catalysis are thereby protected from premature quenching by bulk solvent.

The pendant propylamino group of alendronate binds in the predominantly nonpolar region of the active site, along with 8 solvent molecules. While the pKa of this amino group is approximately 10.9 in aqueous solution,70 this pKa may be sufficiently perturbed by the hydrophobic environment so that the amino group is not protonated in the bound complex. The amino group donates a hydrogen bond to the backbone carbonyl oxygen of V82. It is interesting to note that 4 aromatic residues contribute significant surface area to the active site contour: F83, W192, F221, and W312. Although these residues may stabilize carbocation intermediates in catalysis through cation-π interactions, they are not sufficiently well oriented to similarly stabilize the buried propylamino group of alendronate.

The ScGS338-Mg2+3-alendronate complex crystallizes with an isologous dimer in the asymmetric unit. Accordingly, the active sites of the two protein molecules are oriented in antiparallel fashion. Single α-domain terpenoid synthases generally crystallize as monomers or dimers; those that crystallize as dimers usually contain monomers oriented in antiparallel fashion, as observed here. Occasionally, dimers comprised of parallel monomers are observed, for example, in the crystal structures of avian farnesyl diphosphate synthase69 or methyl isoborneol synthase from S. coelicolor.71 Assembly of the ScGS338 dimer buries a total of 819 Å2 surface area and the dimer interface is mainly comprised of helices G1, H1, α-1, and I. Dimer assembly is most similar to that observed in pentalenene synthase.

In terms of primary structure, the N-terminal domain of ScGS is most similar to hedycaryol synthase (27%/36% sequence identity/similarity).62 Pentalenene synthase,40 while having a slightly lower sequence identity (26%) with the N-terminal domain of ScGS, has a higher sequence similarity (41%). ScGS338 has nearly identical helix topology to that found in the structures of pentalenene synthase and epi-isozizaene synthase (EIZS).40,60 Nonetheless, the tertiary structure of ScGS338 most closely resembles that of selinadiene synthase58 (25%/38% sequence identity/similarity; PDB ID 4OKZ) based on analysis with Dali.72 The r.m.s. deviation of 275 Cα atoms between ScGS338 and selinadiene synthase is 1.6 Å (as calculated with Coot), indicating a closer structural match than ScGS338 with 275 Cα atoms of pentalenene synthase (2.1 Å), 293 Cα atoms of EIZS (2.0 Å), or 267 Cα atoms of hedycaryol synthase (2.6 Å), even though ScGS338 has a slightly higher sequence identity with these three enzymes than with selinadiene synthase. It is noteworthy that the three-dimensional structures of selinadiene synthase and the N-terminal domain of ScGS, each of which catalyzes the 1,10-cyclization of FPP to yield a (E,E)-germacradienyl cation intermediate, are more similar to each other than to structures of enzymes that catalyze alternative cyclization reactions. For example, pentalenene synthase catalyzes an initial 1,11-ring closure reaction of FPP, while EIZS and hedycaryol synthase catalyze net 1,6- and 1,10-ring closure reactions, respectively, of the rearranged intermediate nerolidyl diphosphate. Evidently, similarities in tertiary structure rather than primary structure more accurately indicate similar template functions that determine the manner by which these enzymes enforce the initial cyclization of FPP.
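For readers less familiar with the r.m.s. deviation values quoted above, the quantity can be sketched as follows. This minimal example assumes that the two sets of Cα coordinates have already been paired residue-by-residue and superimposed; programs such as Coot also carry out the superposition step, which is omitted here. The function name and the toy coordinates are illustrative assumptions only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes the coordinate sets are already superimposed and that the i-th
    atom in coords_a corresponds to the i-th atom in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (not taken from any structure): two atoms, each displaced
# by 2 Å along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(rmsd(a, b))
```

Because the value depends on which atoms are paired, r.m.s. deviations computed over different numbers of Cα atoms (for example 275 versus 293) are not strictly comparable, which is why the atom counts are reported alongside each value.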

There are also notable distinctions between the quaternary structures of these cyclases. Selinadiene synthase crystallizes as a tetramer, or dimer of parallel dimers, in the asymmetric unit. EIZS forms crystallographic dimers that are oriented in anti-parallel fashion, but the crystallographic dimer interface differs from that of the ScGS338 dimer. Pentalenene synthase appears to be most similar to ScGS338, as it crystallizes as a dimer in the asymmetric unit. The ScGS338 dimer is reminiscent of the pentalenene synthase dimer, as both structures have a quasi-antiparallel orientation of subunits, and share a similar dimer interface along helices H and I.

Crystal structure of unliganded ScGS366

Crystals of the unliganded N-terminal cyclase domain ScGS366 were very poor in quality, as summarized in Materials and Methods. Crystals formed as very small needles, approximately 0.005 mm × 0.005 mm × 0.040 mm. X-ray diffraction data collected from these crystals were not highly complete, mainly due to rapid crystal decay in the X-ray beam. However, X-ray diffraction data collected from several crystals could be merged together with reasonably good agreement (CC1/2 = 0.896 in the highest resolution shell) to create a dataset of reflections with 88.7% overall completeness. These data were sufficiently useful to determine a moderate 3.2 Å resolution structure of the unliganded N-terminal domain of ScGS to facilitate analysis of structural differences with the liganded enzyme. The ScGS366 model was refined to Rwork/Rfree values of 0.240/0.288. A representative electron density map of the unliganded metal-binding motifs D86DHFLE91 and N229DLFSYQRE237 is shown in Figure 3.

Figure 3.


Electron density map calculated with Fourier coefficients 2|Fo|−|Fc| contoured at 1.0σ showing the active site cleft of unliganded ScGS366 at 3.2 Å resolution. Selected residues in the aspartate-rich and NSE metal-binding motifs are indicated (in the absence of bound Mg2+ ions, the side chain of E237 is disordered and hence truncated at Cβ). Despite technical challenges with crystal decay, refinement yielded a reasonable structure illustrating the open active site conformation of the cyclase.

The overall structure of unliganded ScGS366 is similar to that of ScGS338 in its complex with 3 Mg2+ ions and alendronate, with an r.m.s. deviation of 1.2 Å for 301 Cα atoms. However, in comparison with the liganded structure, notable conformational changes are evident that reflect active site closure upon ligand binding (Figure 4). Specifically, most of the helices shift 2–3 Å inward towards the active site upon ligand binding; in particular helix D, bearing the aspartate-rich motif, moves closer to helix H to optimize the geometry for coordination of Mg2+A and Mg2+C by D86. In contrast, metal-coordinating residues in the “NSE” motif on helix H adopt nearly identical conformations in the liganded and unliganded structures, so the Mg2+B binding site is essentially pre-formed for function. The most significant structural change upon ligand binding is the disorder-order transition for N168-A175 in helix F. This conformational transition contributes significant surface area to the enclosed active site contour and serves to fully close the active site during catalysis.

Figure 4.


Structural comparison of unliganded and liganded N-terminal domain structures of ScGS reveals conformational changes that accompany active site closure triggered by the binding of 3 Mg2+ ions and alendronate. These conformational changes are also expected to accompany the binding of 3 Mg2+ ions and substrate FPP.

A significant conformational change in the ligand-binding residues R325 and Y326 is also observed upon ligand binding to ScGS338. While these residues are oriented away from the active site in the unliganded state and make crystal contacts with E182 from a neighboring protein molecule, these residues both donate hydrogen bonds to the bisphosphonate moiety of alendronate in the liganded structure. These conformational changes are similar to those observed upon ligand binding to EIZS.60 In the structure of EIZS complexed with Mg2+3-PPi and the benzyltriethylammonium cation, conserved residues R338 and Y339 donate hydrogen bonds to the PPi anion. By contrast, in structures of unliganded EIZS, these residues are either disordered and not observed in the electron density, or are flipped outward, away from the active site, similar to the unliganded structure of ScGS366. Other conformational changes generally observed upon ligand binding to class I terpenoid cyclases, such as the ordering of the H-α-1 loop, α-1 helix, and the J-K loop (observed in EIZS), are not observed upon ligand binding to ScGS. Both the H-α-1 loop and the α-1 helix are ordered and in very similar positions in the structures of unliganded ScGS366 and liganded ScGS338. Electron density for the J-K loop is observed for almost identical portions of the sequence of ScGS in the unliganded and liganded structures (to K329 in the liganded structure, and to N328 in the unliganded structure), and in fact helix J is slightly longer in the unliganded structure, whereas the corresponding sequence uncoils to form the J-K loop in the liganded structure.

Homology model of the C-terminal fragmentation domain

Homology models of the C-terminal domain of ScGS based on the ScGS690 truncation calculated by SWISS-MODEL, I-TASSER, Phyre2, and HHPred/MODELLER are similar to one another (Figure S1, Supporting Information), with pairwise r.m.s. deviations ranging from 0.8–3.4 Å for 252–317 Cα atoms. Homology models of the full-length C-terminal domain calculated by I-TASSER are also similar (Figure S2, Supporting Information), with pairwise r.m.s. deviations ranging from 1.2–2.3 Å for 267–311 Cα atoms.

The α-helical topology of each model based on the ScGS690 truncation is nearly identical to that of ScGS338, as expected based on the approximately 36% amino acid sequence identity between the N-terminal and C-terminal domains. However, some minor differences are observed. Of the 7 homology models calculated, additional, short α-helices are predicted in the N-terminal segment preceding helix A (1 model), loop segments that connect helices A and B (1 model), helices D and E1 (2 models), helices G2 and H (5 models), helices I and J (2 models), and the C-terminal segment following helix J (3 models). Some helices present in ScGS338 are absent in the SWISS-MODEL homology model (helix C), or shorter in the HHPred/MODELLER model (helix H). Helix D is shorter in several of the models (three models generated by I-TASSER without using a template, one model generated by HHPred/MODELLER, and one model generated by Phyre2); consequently, the aspartate-rich metal-binding motif is located at the beginning of a loop segment in these models. In the SWISS-MODEL homology model, helix J is broken into three shorter α-helices connected by loop segments.

Following energy minimization using subroutines in PHENIX51 and evaluation with MolProbity,52 we judged the homology model generated by I-TASSER using the structure of ScGS338 as a template as the highest quality model based on molecular geometry statistics (Figure 5A). This model comprises a general reference point for understanding structure-activity relationships. Intact metal-binding motifs are located at the mouth of the active site: the aspartate-rich segment D455DYYP459 on helix D and the NSE motif N598DVFSYQKE606 on helix H. Although the aspartate-rich motif lacks the third aspartate residue that typically characterizes this motif, only the first aspartate coordinates to Mg2+ ions in the crystal structure of ScGS338, so this could also be the case for D455 in the C-terminal domain. On the other hand, one or both aspartates of the D455DYYP459 motif may well play a mechanistically distinct role as a general acid for proton-initiated cyclization of the rebound germacradienol intermediate, as discussed below.

Figure 5.


Homology models of the C-terminal domain of ScGS690 (G374–R690) (A) and full-length ScGS (G374–H726) (B) generated by I-TASSER using the crystal structure of ScGS338 as a template. For reference, the locations of the aspartate-rich motif (D455DYYP, helix D) and NSE motif (N598DVFS602YQKE606, helix H) are red and orange, respectively.

Since I-TASSER yielded what we judged to be the best homology model of the C-terminal domain based on the truncated construct ScGS690, we also used I-TASSER exclusively to generate 5 homology models of the full-length C-terminal domain. These models correspond to the 5 largest clusters of structures predicted by I-TASSER, and these models were similarly evaluated and optimized using MolProbity and PHENIX. Each model exhibits similar α-helical topology with that of ScGS338, with minor differences: additional, short α-helices are predicted in the N-terminal segment before helix A (1 model), loop segments that connect helices G2 and H (4 models), and the C-terminal segment after helix J (3 models). In one of these models (model 5), helix I is shorter than in ScGS338, and an additional α-helix is inserted between helices I and J. Two of the α-helices found in ScGS338 are broken into two α-helices in the full-length C-terminal domain homology models: helix J (model 3) and helix D (model 1 and model 4). Consequently, the aspartate-rich motif of model 1 is partially located on a loop segment at the break in helix D. Of the 5 models generated, we judged model 1 to be the best based on MolProbity statistics (Figure 5B). This model contains two additional short α-helices when compared with ScGS338, lying in the loop segment between helices G2 and H and in the C-terminal segment after helix J, near the very end of the sequence (P722-T725). The presence or absence of the 28-residue C-terminal segment does not appear to significantly impact the overall structure prediction, as might be expected for a disordered segment.

In the homology models of the C-terminal domain based on full-length ScGS or the truncated ScGS690 construct, the chemical nature of the active site pocket is mainly nonpolar, as observed for ScGS338. The ring faces of two aromatic residues (F556 and W688) in part define the active site contour, and these residues could potentially stabilize carbocation intermediate(s) in the cyclization cascade through cation-π interactions. Some partially polar residues are located in the active site pocket (C428, T452, and T561), but it is not clear that these residues could serve as general base/general acid residues in the retro-Prins fragmentation reaction catalyzed in this domain.

Low resolution SAXS structures of ScGS690 and full-length ScGS

Hypothesizing that the predicted disorder of the C-terminus of ScGS hindered crystallization of the full-length protein, the truncated construct ScGS690 was prepared and used for crystallization trials. Gas chromatographic-mass spectrometric analysis of cyclization products indicated that deletion of 36 residues from the C-terminus of ScGS and replacement with a His6-tag abolished catalytic activity in the C-terminal domain: the ScGS690 construct generated increased concentrations of germacradienol (the cyclization product of the N-terminal domain) but no geosmin (data not shown). Thus, even though the C-terminus of ScGS is most likely disordered, it is absolutely required for catalytic activity in the C-terminal domain. Even so, both ScGS690 and full-length ScGS proved to be excellent samples for small-angle X-ray scattering (SAXS) measurements.

SAXS data collected for ScGS690 and full-length ScGS show that both proteins are well-folded and monodisperse in solution, and scattering in the Guinier region is linear (Figures 6 and 7; structural parameters are recorded in Table S1, Supporting Information). The radius of gyration (Rg) calculated by PRIMUS using the Guinier approximation for ScGS690 is 31.3 Å, and Rg for full-length ScGS is 32.5 Å. The slight increase in Rg for full-length ScGS is consistent with a slightly larger scattering particle due to 28 additional residues at the C-terminus. Since the FoXS calculated Rg for ScGS338 is 18.8 Å, the measured Rg values for ScGS690 and full-length ScGS indicate a protein larger than a single α domain, i.e., a monomeric protein containing two α domains. Calculation of the pair-distance distribution function yields Dmax values of 110 Å and 117 Å, respectively, for ScGS690 and full-length ScGS. These Dmax values cannot distinguish between a monomer and a dimer. Inspection of the SAXS molecular envelopes for ScGS690 and full-length ScGS, however, strongly suggests a monomeric protein containing two α domains (Figures 6 and 7).
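The Guinier analysis referred to above rests on the low-angle approximation ln I(q) ≈ ln I(0) − (Rg²/3)q², so Rg follows from the slope of a straight line fitted to ln I versus q². The sketch below illustrates this with an ordinary least-squares fit written in plain Python; it is not the PRIMUS implementation, and the synthetic, noise-free intensities are generated purely for demonstration (only the target Rg of 31.3 Å is borrowed from the text).

```python
import math


def guinier_fit(q, intensity):
    """Fit ln I versus q^2 by least squares; return (Rg, I0).

    Assumes the supplied points lie within the Guinier region
    (commonly taken as q * Rg below about 1.3 for globular particles).
    """
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative Guinier slope; Rg is undefined")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic demonstration data (not experimental): an ideal Guinier curve
# with Rg = 31.3 Å and I(0) = 100, sampled so that q * Rg stays near 1 or below.
rg_true = 31.3
q = [0.005 + 0.001 * i for i in range(30)]  # in 1/Å
intensity = [100.0 * math.exp(-((qi * rg_true) ** 2) / 3.0) for qi in q]
rg_fit, i0_fit = guinier_fit(q, intensity)
print(round(rg_fit, 3), round(i0_fit, 3))
```

With real data, curvature in the ln I versus q² plot at the lowest angles would signal aggregation or interparticle effects, which is why linearity of the Guinier region is reported as a quality check.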

Figure 6.


(A) Best fit of the N-terminal domain (ScGS366, blue) and ScGS690 C-terminal domain homology model (light green) to the molecular envelope calculated from SAXS data collected from ScGS690, generated by the ShapeUp server. (B) Experimental SAXS scattering profile for ScGS690 (black circles) superimposed on the theoretical scattering profile for the αα domain model in (A) (red). For this model of αα domain assembly, the χ value calculated by FoXS is 4.47. Inset: The Guinier region is linear for SAXS data collected from ScGS690.

Figure 7.


(A) Best fit of the N-terminal domain (ScGS366, blue) and the full-length C-terminal domain homology model (forest green) to the molecular envelope calculated from SAXS data collected from full-length ScGS, generated by the ShapeUp server. (B) Experimental SAXS scattering profile for full-length ScGS (black circles) superimposed on the theoretical scattering profile for the αα domain model in (A) (red). For this model of αα domain assembly, the χ value calculated by FoXS is 2.64. Inset: The Guinier region is linear for SAXS data collected from full-length ScGS.

Using the experimentally determined crystal structures of the liganded and unliganded N-terminal domain of ScGS, and the best homology models of the C-terminal domain of ScGS690 (G374–R690) and full-length ScGS (G374–H726) generated by I-TASSER, we performed rigid body modeling with CORAL68 to generate 4 models illustrating potential domain assembly modes consistent with SAXS profiles. Using the FoXS server, the best models of ScGS690 and full-length ScGS were judged to be those with the lowest χ value (a lower χ value indicates a closer match between the experimental scattering profile and the calculated scattering profile for a given structure), and these models are illustrated in Figures 6 (χ = 4.47) and 7 (χ = 2.64), respectively. The FoXS calculated Rg values for the models in Figures 6 and 7 are 29.4 Å and 29.1 Å, respectively, consistent with the dimensions of a monomeric protein containing two α domains. The orientation of one α domain with respect to the other differs in these models. Thus, the active sites in the N-terminal and C-terminal domains are oriented in roughly parallel fashion in the model of ScGS690, but are oriented approximately 90° away from each other in the model of full-length ScGS. It is nonetheless interesting to note that the same general faces of the N-terminal and C-terminal domains appear to mediate αα domain assembly. Regardless of the specific orientation of one α domain with respect to the other, the molecular envelopes of ScGS690 and full-length ScGS are consistent with a model of αα domain assembly in which the active sites of the two domains are oriented away from each other rather than toward each other. Such a model would account for the lack of direct channeling of germacradienol between the two active sites. In fact, the observation of free germacradienol in solution is consistent with mandatory release of germacradienol from the N-terminal domain and diffusive rebinding to the C-terminal domain of the same or another geosmin synthase protein prior to conversion to geosmin.28
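To make the meaning of the χ score concrete, a simplified version of the goodness-of-fit statistic can be sketched as follows. The sketch assumes the common form χ = sqrt[(1/M) Σ ((I_exp − c·I_calc)/σ)²] with a single least-squares scale factor c; the FoXS server additionally adjusts excluded-volume and hydration-layer parameters, which are omitted here. The function name and the toy numbers are illustrative assumptions, not experimental data.

```python
import math


def saxs_chi(i_exp, i_calc, sigma):
    """Return (chi, c) for experimental vs. calculated scattering profiles.

    c is the scale factor minimizing the error-weighted squared residuals;
    chi is the square root of the mean weighted squared residual.
    All three lists must be sampled at the same q values.
    """
    if not (len(i_exp) == len(i_calc) == len(sigma)) or not i_exp:
        raise ValueError("profiles must be non-empty and equal in length")
    weights = [1.0 / (s * s) for s in sigma]
    numerator = sum(w * e * m for w, e, m in zip(weights, i_exp, i_calc))
    denominator = sum(w * m * m for w, m in zip(weights, i_calc))
    c = numerator / denominator
    chi_sq = sum(
        w * (e - c * m) ** 2 for w, e, m in zip(weights, i_exp, i_calc)
    ) / len(i_exp)
    return math.sqrt(chi_sq), c


# Toy profiles (not experimental data): the "experimental" curve is exactly
# twice the model curve, so the fit is perfect after scaling.
model = [4.0, 3.0, 2.0, 1.0]
measured = [8.0, 6.0, 4.0, 2.0]
errors = [0.1, 0.1, 0.1, 0.1]
chi, scale = saxs_chi(measured, model, errors)
print(chi, scale)
```

Because the residuals are divided by the experimental errors, χ values depend on the error estimates of each data set, so scores are most meaningfully compared among models fitted to the same scattering curve.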

DISCUSSION

Because the first step of the cyclization cascade catalyzed by a class I terpenoid cyclase is always the ionization of the diphosphate group of FPP, the three-dimensional contour of the active site encodes all the information required to direct a unique sequence of carbon-carbon bond-forming reactions. The most important step in such a sequence is the initial macrocyclization reaction, since this sets the stage for all subsequent steps. The initial binding conformation of FPP is enforced by the active site contour and determines whether a 1,6- or 1,7-cyclization reaction occurs via an ionization-isomerization-reionization sequence, whether a 1,10-cyclization reaction occurs with or without allylic isomerization, or whether a direct 1,11-cyclization reaction will take place. The active site contours of enzymes that direct initial 1,6-cyclizations, such as epi-isozizaene synthase60 and trichodiene synthase,73 tend to be somewhat narrow and deep. In contrast, enzymes that direct 1,10-cyclization reactions, such as selinadiene synthase58 (via direct formation of a (E,E)-germacradienyl cation intermediate) and hedycaryol synthase62 (via initial isomerization to nerolidyl diphosphate followed by formation of a (Z,E)-helminthogermacradienyl cation), as well as an enzyme that directs a 1,11-cyclization reaction, pentalenene synthase,40 all appear to have wider, shallower active sites (Figure S3, Supporting Information).

Intriguingly, the N-terminal domain of ScGS has an active site contour that is deeper than that of other enzymes that catalyze initial 1,10-cyclization reactions, and it also has a wider neck than that of enzymes that catalyze initial 1,6-isomerization-cyclization reactions (Figure S3, Supporting Information). Thus, the three-dimensional contour of the ScGS N-terminal domain active site appears to be a structural hybrid in that it exhibits features of both general active site shapes. Additionally, the active site contour of the N-terminal domain branches at its base, forming a shape somewhat like that of a “mitten”, a feature that is not observed in the 1,10-cyclization active sites of either selinadiene synthase (which generates a germacradienyl cation from FPP) or hedycaryol synthase (which generates a helminthogermacradienyl cation via nerolidyl diphosphate). Otherwise, the chemical nature of the N-terminal domain active site is similar to that of other terpenoid cyclases in that it is mainly hydrophobic. The active site contains a few aromatic residues, one of which in particular (F83) has its ring face oriented so as to enable stabilization of carbocation intermediates by cation-π interactions.

In the active site of the N-terminal domain of ScGS, the diphosphate group of FPP is presumed to bind in a similar manner to the bisphosphonate group of alendronate as observed in its complex with ScGS338. With FPP locked in this orientation, there is sufficient active site volume for FPP to adopt the conformation required for the initial 1,10-cyclization reaction following ionization of the diphosphate group (Figure 8). The initially formed germacradienyl cation intermediate must be deprotonated to generate the novel isolepidozene intermediate. Proton-initiated ring opening of the vinylcyclopropyl moiety and quenching of the resultant homoallylic cation by water then affords germacradienol, the major product of the cyclization reaction catalyzed by the N-terminal domain. Possible general acids and/or general bases that may participate in this reaction might include E161, which is located near Mg2+C at the neck of the active site. Although E161 forms a salt link with R184, it appears to be ideally located to deprotonate Hb from the C1 atom to form isolepidozene. Other potential catalytic groups include H226 and H320, which are situated more deeply in the active site. Finally, since water is a co-substrate in the reaction, one or more of the active site water molecules observed in the ScGS338-Mg2+3-alendronate complex may remain trapped in the active site upon FPP binding. There is sufficient extra volume in the active site of the modeled enzyme-substrate complex to accommodate trapped solvent (Figure 8).

Figure 8.


Model of substrate FPP bound in the active site of the N-terminal domain of ScGS; the position of the FPP diphosphate group is based on the position of the bisphosphonate group of alendronate in the ScGS338-Mg2+3-alendronate complex. Protein atoms are color coded as follows: C = dark blue, N = blue, O = red, Mg2+ = magenta; for the FPP stick figure, C = yellow, P = orange, O = red. The active site contour is indicated by light gray meshwork, and a red dashed line indicates the trajectory of initial carbon-carbon bond formation between C1 and C10 of FPP. The additional volume in the lower active site may accommodate a trapped solvent molecule that quenches the final carbocation intermediate. Active site contour in meshwork created with VOIDOO.80

The C-terminal domain of ScGS (Figure 5) is predicted to adopt the fold of a class I terpenoid cyclase based on its high level of amino acid sequence identity with the N-terminal domain. However, the C-terminal domain of ScGS catalyzes an unprecedented cyclization-fragmentation involving a retro-Prins reaction resulting in fragmentation of the C15 substrate germacradienol to yield C12 geosmin and C3 acetone. The proton-initiated cyclization-fragmentation may be initiated by one or both of the conserved aspartate residues of the D455DYYP459 motif, consistent with the observation that the D455N/D456N double mutant of geosmin synthase no longer produces geosmin but accumulates only germacradienol and germacrene D, the characteristic products of the N-terminal domain. The aspartates of the DDYYP motif may therefore play a role in proton-initiated polyene cyclization similar to that of the conserved DXDD motif (general acid residue underlined) of typical class II diterpene synthases such as ent-copalyl diphosphate synthase,74,75 the triterpene cyclase squalene-hopene synthase,76 and oxidosqualene-lanosterol synthase, which contains a closely related DCTA motif.77

The cyclization-fragmentation reaction catalyzed by the C-terminal domain of geosmin synthase does not involve the ionization of an isoprenoid diphosphate such as FPP to generate the initial carbocation, the hallmark of catalysis by a class I terpenoid cyclase, yet cyclization-fragmentation nevertheless has an absolute requirement for Mg2+ ion(s).28 Catalysis by a class I terpenoid cyclase typically involves coordination and activation of the substrate diphosphate group by 3 Mg2+ ions to form a Mg2+3-diphosphate/PPi complex that remains bound for the duration of the cyclization reaction; the closed conformation of the enzyme active site is exemplified by that of the Mg2+3-alendronate complex with ScGS338 (Figure 4). Since germacradienol, the substrate for the C-terminal domain of ScGS, lacks a diphosphate group altogether and the reaction has been shown not to require inorganic pyrophosphate, neither the mechanistic nor the structural basis for the requirement for Mg2+ by the C-terminal domain is as yet apparent.

The crystal structure of the ScGS338-Mg2+3-alendronate complex and the amino acid sequence identity between the N- and C-terminal domains of ScGS suggest a possible structural basis for the Mg2+ requirement for catalysis by the C-terminal domain. Specifically, residues that coordinate to catalytic Mg2+ ions in the N-terminal domain are conserved in the C-terminal domain as D455 in the aspartate-rich motif, and N598, S602, and E606 in the “NSE” motif. Additionally, the three basic residues and the tyrosine residue that hydrogen bond with the bisphosphonate anion in the N-terminal domain are conserved in the C-terminal domain as R552, K605, R694, and Y695. Metal-binding residues as well as residues that hydrogen bond with diphosphate/PPi are highly conserved in all class I terpenoid cyclases.44 Conservation of these structural elements in the C-terminal domain of ScGS strongly suggests that this domain similarly binds one or more Mg2+ ions. The binding of Mg2+ might facilitate full active site closure of the C-terminal domain, just as it does for the N-terminal domain (Figure 4), but without interactions with a diphosphate group. As for all class I terpenoid cyclases, complete active site closure ensures the protection of reactive carbocation intermediates from premature quenching by bulk solvent. Notably, the more than 70 deduced or verified Streptomyces geosmin synthase sequences that have been reported to date in the protein databases exhibit an exceptionally high level of sequence conservation (60–85% sequence identity over more than 720 amino acids), with essentially 100% identity of the conserved aspartate-rich and NSE motifs in both the C-terminal and N-terminal domains.

The fragmentation reaction catalyzed in the C-terminal domain of ScGS utilizes germacradienol as a substrate. The first step is a proton-assisted cyclization, covalently linking C-2 and C-7 along with retro-Prins fragmentation to eliminate the 2-propanol side chain as acetone. The resulting bicyclic intermediate, octalin, has a trans-decalin-like configuration and subsequently undergoes protonation and a hydride shift to yield a tertiary carbocation, which is quenched by a water molecule to yield the final product alcohol, geosmin, which similarly adopts a trans-decalin-like configuration. It is highly unusual for a cyclization reaction to be initiated by protonation in a class I terpenoid cyclase; ordinarily, such a protonation is the chemical strategy for cyclization adopted by a class II terpenoid cyclase, which adopts a completely unrelated protein fold.21

Both a general acid and general base are required for the retro-Prins fragmentation reaction catalyzed by the C-terminal domain of ScGS (Figure 1), yet analysis of the homology model reveals a relative dearth of chemical functionality within the mainly nonpolar active site pocket beyond the conserved D455 and D456 residues. On the other hand, in the crystal structure of the ScGS338-Mg2+3-alendronate complex with the N-terminal domain, several ordered solvent molecules are trapped in the active site. Corresponding solvent molecules might conceivably remain bound in the active site of the C-terminal domain upon the binding of the substrate germacradienol, with one of these waters ultimately quenching the final carbocation intermediate, as shown in Figure 1.

Finally, rigid body modeling with SAXS data provides a plausible model for the assembly of the N- and C-terminal α domains of ScGS. Manual docking of the N-terminal domain structure and the homology model of the C-terminal domain yielded poor fits to the scattering profile. However, CORAL generated models with better fits, although not all of these models made good chemical sense. For example, domain interactions in some models appear to be mediated solely by loops rather than secondary structural elements (data not shown). Intriguingly, depending on the flexibility of the linker connecting the N-terminal domain and the C-terminal domain, multiple domain orientations could be feasible. While a single αα domain orientation cannot be definitively established, it is clear that the two α domains interact with each other through a substantial interface based on the cross-sections of the molecular envelopes shown in Figures 6 and 7. Although it could be argued that αα domain assembly in full-length ScGS ought to mimic αα dimer assembly as observed in the crystal structure of ScGS338, this particular αα assembly mode does not yield a satisfactory fit to the scattering data and yields a much higher χ value of 9.96 (Figure S4, Supporting Information).

Regardless of the orientation of one domain to the other in the αα assembly, there is no channel specifically formed between the two active sites. Although bifunctional catalysis might be facilitated by a simple proximity or clustering effect,78,79 it is not clear whether germacradienol generated in the N-terminal domain is captured by the C-terminal domain of the same or a different geosmin synthase molecule. Future studies will continue to probe these aspects of catalysis and the role of domain architecture in bifunctional catalysis by ScGS.

Acknowledgments

This work is based upon research conducted at the Northeastern Collaborative Access Team beamlines, which are funded by the National Institute of General Medical Sciences from the National Institutes of Health (P41 GM103403). This research used resources of the Advanced Photon Source, a U.S. Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory under Contract No. DE-AC02-06CH11357. We also thank the Stanford Synchrotron Radiation Lightsource for access to beamline 4–2 for SAXS experiments, and the SIBYLS beamline at the Advanced Light Source for additional SAXS measurements. The Advanced Light Source is supported by the Director, Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Finally, D.W.C. thanks the Radcliffe Institute for Advanced Study for the Elizabeth S. and Richard M. Cashin Fellowship.

Funding

Supported by National Institutes of Health (NIH) Grants GM56838 to D.W.C. and GM30301 to D.E.C., and an NIH Structural Biology and Molecular Biophysics Training Grant to G.G.H.

Abbreviations

BME, β-mercaptoethanol
EIZS, epi-isozizaene synthase
FPP, farnesyl diphosphate
IPTG, isopropyl-β-D-1-thiogalactopyranoside
LB, Lysogeny Broth
PPi, inorganic pyrophosphate
SAXS, small-angle X-ray scattering
ScGS, full-length geosmin synthase from Streptomyces coelicolor
ScGS690, recombinant geosmin synthase construct containing residues 1–690
ScGS338, geosmin synthase N-terminal domain containing residues 1–338 resulting from proteolysis of ScGS690
ScGS366, recombinant geosmin synthase N-terminal domain construct containing residues 1–366

Footnotes

Supporting Information

Table S1, SAXS data collection statistics and structural parameters; Figure S1, homology models of the C-terminal domain of ScGS690 generated by various modeling programs; Figure S2, homology models of the C-terminal domain of full-length ScGS generated by I-TASSER; Figure S3, comparison of sesquiterpene cyclase active site contours; Figure S4, best fit of the ScGS338 dimer to the SAXS-derived molecular envelope of full-length ScGS. This information is available free of charge via the Internet at http://pubs.acs.org.

Accession Codes

The atomic coordinates and structure factors of the ScGS338-Mg2+3-alendronate complex and unliganded ScGS366 have been deposited in the Protein Data Bank (www.rcsb.org) with accession codes 5DZ2 and 5DW7, respectively.

Notes

The authors declare no competing financial interests.

References

1. Dictionary of Natural Products. http://dnp.chemnetbase.com.
2. Trapp SC, Croteau RB. Genomic organization of plant terpene synthases and molecular evolutionary implications. Genetics. 2001;158:811–832. doi: 10.1093/genetics/158.2.811.
3. Tholl D. Terpene synthases and the regulation, diversity and biological roles of terpene metabolism. Curr Opin Plant Biol. 2006;9:297–304. doi: 10.1016/j.pbi.2006.03.014.
4. Gershenzon J, Dudareva N. The function of terpene natural products in the natural world. Nat Chem Biol. 2007;3:408–414. doi: 10.1038/nchembio.2007.5.
5. Farco JA, Grundmann O. Menthol–pharmacology of an important naturally medicinal “cool”. Mini Rev Med Chem. 2013;13:124–131.
6. Gertsch J, Leonti M, Raduner S, Racz I, Chen JZ, Xie XQ, Altmann KH, Karsak M, Zimmer A. Beta-caryophyllene is a dietary cannabinoid. Proc Natl Acad Sci USA. 2008;105:9099–9104. doi: 10.1073/pnas.0803601105.
7. Cometto-Muñiz JE, Cain WS, Abraham MH, Kumarsingh R. Trigeminal and olfactory chemosensory impact of selected terpenes. Pharmacol Biochem Behav. 1998;60:765–770. doi: 10.1016/s0091-3057(98)00054-9.
8. Schalk M, Pastore L, Mirata MA, Khim S, Schouwey M, Deguerry F, Pineda V, Rocci L, Daviet L. Toward a biosynthetic route to sclareol and amber odorants. J Am Chem Soc. 2012;134:18900–18903. doi: 10.1021/ja307404u.
9. Renninger N, McPhee D. Fuel compositions comprising farnesane and farnesane derivatives and method of making and using same. 7,399,323 US patent. 2008.
10. Peralta-Yahya PP, Ouellet M, Chan R, Mukhopadhyay A, Keasling JD, Lee TS. Identification and microbial production of a terpene-based advanced biofuel. Nat Commun. 2011;2:483. doi: 10.1038/ncomms1494.
11. Miller LH, Su X. Artemisinin: Discovery from the Chinese herbal garden. Cell. 2011;146:855–858. doi: 10.1016/j.cell.2011.08.024.
12. Schiff PB, Fant J, Horwitz SB. Promotion of microtubule assembly in vitro by taxol. Nature. 1979;277:665–667. doi: 10.1038/277665a0.
13. Ajikumar PK, Tyo K, Carlsen S, Mucha O, Phon TH, Stephanopoulos G. Terpenoids: Opportunities for biosynthesis of natural product drugs using engineered microorganisms. Mol Pharm. 2008;5:167–190. doi: 10.1021/mp700151b.
14. Bohlmann J. Terpenoid synthases – from chemical ecology and forest fires to biofuels and bioproducts. Structure. 2011;19:1730–1731. doi: 10.1016/j.str.2011.11.009.
15. Peralta-Yahya P, Zhang F, del Cardayre SB, Keasling JD. Microbial engineering for the production of advanced biofuels. Nature. 2012;488:320–328. doi: 10.1038/nature11478.
16. Smanski MJ, Peterson RM, Huang SX, Shen B. Bacterial diterpene synthases: New opportunities for mechanistic enzymology and engineered biosynthesis. Curr Opin Chem Biol. 2012;16:132–141. doi: 10.1016/j.cbpa.2012.03.002.
17. Poulter CD, Rilling HC. The prenyl transfer reaction. Enzymatic and mechanistic studies of the 1′-4 coupling reaction in the terpene biosynthetic pathway. Acc Chem Res. 1978;11:307–313.
18. Poulter CD. Farnesyl diphosphate synthase. A paradigm for understanding structure and function relationships in E-polyprenyl diphosphate synthases. Phytochem Rev. 2006;5:17–26.
19. Cane DE. Isoprenoid biosynthesis. Stereochemistry of the cyclization of allylic pyrophosphates. Acc Chem Res. 1985;18:220–226.
20. Cane DE. Enzymatic formation of sesquiterpenes. Chem Rev. 1990;90:1089–1103.
21. Christianson DW. Structural biology and chemistry of the terpenoid cyclases. Chem Rev. 2006;106:3412–3442. doi: 10.1021/cr050286w.
22. Christianson DW. Unearthing the roots of the terpenome. Curr Opin Chem Biol. 2008;12:141–150. doi: 10.1016/j.cbpa.2007.12.008.
23. Austin MB, O’Maille PE, Noel JP. Evolving biosynthetic tangos negotiate mechanistic landscapes. Nat Chem Biol. 2008;4:217–222. doi: 10.1038/nchembio0408-217.
24. Gao Y, Honzatko RB, Peters RJ. Terpenoid synthase structures: A so far incomplete view of complex catalysis. Nat Prod Rep. 2012;29:1153–1175. doi: 10.1039/c2np20059g.
25. Gerber NN. Geosmin, from microorganisms, is trans-1,10-dimethyl-trans-9-decalol. Tet Lett. 1968:2971–2974.
26. Bentley R, Meganathan R. Geosmin and methylisoborneol biosynthesis in streptomyces. Evidence for an isoprenoid pathway and its absence in non-differentiating isolates. FEBS Lett. 1981;125:220–222. doi: 10.1016/0014-5793(81)80723-5.
27. Jiang J, He X, Cane DE. Geosmin biosynthesis. Streptomyces coelicolor germacradienol/germacrene D synthase converts farnesyl diphosphate to geosmin. J Am Chem Soc. 2006;128:8128–8129. doi: 10.1021/ja062669x.
28. Jiang J, He X, Cane DE. Biosynthesis of the earthy odorant geosmin by a bifunctional Streptomyces coelicolor enzyme. Nat Chem Biol. 2007;3:711–715. doi: 10.1038/nchembio.2007.29.
29. Gerber NN. Volatile substances from actinomyces: Their role in the odor pollution of water. CRC Crit Rev Microbiol. 1979;7:191–214. doi: 10.3109/10408417909082014.
30. Buttery RG, Garibaldi JA. Geosmin and methylisoborneol in garden soil. J Agric Food Chem. 1976;24:1246–1247.
31. Tyler LD, Acree TE, Becker RF, Nelson RR, Butts RM. Effect of maturity, cultivar, field history, and the operations of peeling and coring on the geosmin content of Beta vulgaris. J Agric Food Chem. 1978;26:1466–1469.
32. Heil TP, Lindsay RC. Volatile compounds in flavor-tainted fish from the Upper Wisconsin River. J Environ Sci Health B. 1988;23:489–512. doi: 10.1080/03601238809372621.
33. Jardine CG, Gibson N, Hrudey SE. Detection of odour and health risk perception of drinking water. Water Sci Technol. 1999;40:91–98.
34. Schrader KK, Dennis ME. Cyanobacteria and earthy/musty compounds found in commercial catfish (Ictalurus punctatus) ponds in the Mississippi Delta and Mississippi-Alabama Blackland Prairie. Water Res. 2005;39:2807–2814. doi: 10.1016/j.watres.2005.04.044.
35. Darriet P, Pons M, Lamy S, Dubourdieu D. Identification and quantification of geosmin, an earthy odorant contaminating wines. J Agric Food Chem. 2000;48:4835–4838. doi: 10.1021/jf0007683.
36. Cane DE, Watt RM. Expression and mechanistic analysis of a germacradienol synthase from Streptomyces coelicolor implicated in geosmin biosynthesis. Proc Natl Acad Sci USA. 2003;100:1547–1551. doi: 10.1073/pnas.0337625100.
37. He X, Cane DE. Mechanism and stereochemistry of the germacradienol/germacrene D synthase of Streptomyces coelicolor A3(2). J Am Chem Soc. 2004;126:2678–2679. doi: 10.1021/ja039929k.
38. Jiang J, Cane DE. Geosmin biosynthesis. Mechanism of the fragmentation-rearrangement in the conversion of germacradienol to geosmin. J Am Chem Soc. 2008;130:428–429. doi: 10.1021/ja077792i.
39. Peters RJ, Ravn MM, Coates RM, Croteau RB. Bifunctional abietadiene synthase: free diffusive transfer of the (+)-copalyl diphosphate intermediate between two distinct active sites. J Am Chem Soc. 2001;123:8974–8978. doi: 10.1021/ja010670k.
40. Lesburg CA, Zhai G, Cane DE, Christianson DW. Crystal structure of pentalenene synthase: Mechanistic insights on terpenoid cyclization reactions in biology. Science. 1997;277:1820–1824. doi: 10.1126/science.277.5333.1820.
41. Starks CM, Back K, Chappell J, Noel JP. Structural basis for cyclic terpene biosynthesis by tobacco 5-epi-aristolochene synthase. Science. 1997;277:1815–1820. doi: 10.1126/science.277.5333.1815.
42. Köksal M, Jin Y, Coates RM, Croteau R, Christianson DW. Taxadiene synthase structure and evolution of modular architecture in terpene biosynthesis. Nature. 2011;469:116–120. doi: 10.1038/nature09628.
43. Oldfield E, Lin FY. Terpene biosynthesis: Modularity rules. Angew Chem Int Ed. 2012;51:1124–1137. doi: 10.1002/anie.201103110.
44. Aaron JA, Christianson DW. Trinuclear metal clusters in catalysis by terpenoid synthases. Pure Appl Chem. 2010;82:1585–1597. doi: 10.1351/PAC-CON-09-09-37.
45. Ward JJ, McGuffin LJ, Bryson K, Buxton BF, Jones DT. The DISOPRED server for the prediction of protein disorder. Bioinformatics. 2004;20:2138–2139. doi: 10.1093/bioinformatics/bth195.
46. Karplus PA, Diederichs K. Linking crystallographic model and data quality. Science. 2012;336:1030–1033. doi: 10.1126/science.1218231.
47. Diederichs K, Karplus PA. Better models by discarding data? Acta Crystallogr. 2013;D69:1215–1222. doi: 10.1107/S0907444913001121.
48. Otwinowski Z, Minor W. Processing of X-ray diffraction data collected in oscillation mode. In: Carter CW Jr, Sweet RM, editors. Methods in Enzymology Macromolecular Crystallography (Part A) Vol. 276. Academic Press; New York: 1997. pp. 307–326.
49. Terwilliger TC, DiMaio F, Read RJ, Baker D, Bunkoczi G, Adams PD, Grosse-Kunstleve RW, Afonine PV, Echols N. phenix.mr_rosetta: molecular replacement and model rebuilding with Phenix and Rosetta. J Struct Funct Genomics. 2012;13:81–90. doi: 10.1007/s10969-012-9129-3.
50. Emsley P, Lohkamp B, Scott WG, Cowtan K. Features and development of Coot. Acta Crystallogr. 2010;D66:486–501. doi: 10.1107/S0907444910007493.
51. Adams PD, Afonine PV, Bunkoczi G, Chen VB, Davis IW, Echols N, Headd JJ, Hung LW, Kapral GJ, Grosse-Kunstleve RW, McCoy AJ, Moriarty NW, Oeffner R, Read RJ, Richardson DC, Richardson JS, Terwilliger TC, Zwart PH. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. 2010;D66:213–221. doi: 10.1107/S0907444909052925.
52. Chen VB, Arendall WB III, Headd JJ, Keedy DA, Immormino RM, Kapral GJ, Murray LW, Richardson JS, Richardson DC. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr. 2010;D66:12–21. doi: 10.1107/S0907444909042073.
53. Biasini M, Bienert S, Waterhouse A, Arnold K, Studer G, Schmidt T, Kiefer F, Cassarino TG, Bertoni M, Bordoli L, Schwede T. SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res. 2014;42:W252–W258. doi: 10.1093/nar/gku340.
54. Roy A, Kucukural A, Zhang Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc. 2010;5:725–738. doi: 10.1038/nprot.2010.5.
55. Kelley LA, Mezulis S, Yates CM, Wass MN, Sternberg MJE. The Phyre2 web portal for protein modeling, prediction and analysis. Nat Protoc. 2015;10:845–858. doi: 10.1038/nprot.2015.053.
56. Söding J, Biegert A, Lupas AN. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 2005;33:W244–W248. doi: 10.1093/nar/gki408.
57. Šali A, Potterton L, Yuan F, van Vlijmen H, Karplus M. Evaluation of comparative protein modeling by MODELLER. Proteins. 1995;23:318–326. doi: 10.1002/prot.340230306.
58. Baer P, Rabe P, Fischer K, Citron CA, Klapschinski TA, Groll M, Dickschat JS. Induced-fit mechanism in class I terpene cyclases. Angew Chem Int Ed. 2014;53:7652–7656. doi: 10.1002/anie.201403648.
59. Wu S, Zhang Y. LOMETS: A local meta-threading-server for protein structure prediction. Nucleic Acids Res. 2007;35:3375–3382. doi: 10.1093/nar/gkm251.
60. Aaron JA, Lin X, Cane DE, Christianson DW. Structure of Epi-Isozizaene Synthase from Streptomyces coelicolor A3(2), a Platform for New Terpenoid Cyclization Templates. Biochemistry. 2010;49:1787–1797. doi: 10.1021/bi902088z.
61. Li R, Chou WKW, Himmelberger JA, Litwin KM, Harris GG, Cane DE, Christianson DW. Reprogramming the Chemodiversity of Terpenoid Cyclization by Remolding the Active Site Contour of epi-Isozizaene Synthase. Biochemistry. 2014;53:1155–1168. doi: 10.1021/bi401643u.
62. Baer P, Rabe P, Citron CA, de Oliveira Mann CC, Kaufmann N, Groll M, Dickschat JS. Hedycaryol synthase in complex with nerolidol reveals terpene cyclase mechanism. ChemBioChem. 2014;15:213–216. doi: 10.1002/cbic.201300708.
63. Seemann M, Zhai G, de Kraker JW, Paschall CM, Christianson DW, Cane DE. Pentalenene Synthase. Analysis of Active Site Residues by Site-Directed Mutagenesis. J Am Chem Soc. 2002;124:7681–7689. doi: 10.1021/ja026058q.
64. Konarev PV, Volkov VV, Sokolova AV, Koch MHJ, Svergun DI. PRIMUS: a Windows PC-based system for small-angle scattering data analysis. J Appl Crystallogr. 2003;36:1277–1282.
65. Svergun DI. Determination of the regularization parameter in indirect-transform methods using perceptual criteria. J Appl Crystallogr. 1992;25:495–503.
66. Liu H, Hexemer A, Zwart PH. The Small Angle Scattering ToolBox (SASTBX): an open-source software for biomolecular small-angle scattering. J Appl Crystallogr. 2012;45:587–593.
67. Schneidman-Duhovny D, Hammel M, Sali A. FoXS: a web server for rapid computation and fitting of SAXS profiles. Nucleic Acids Res. 2010;38:W540–W544. doi: 10.1093/nar/gkq461.
68. Petoukhov MV, Franke D, Shkumatov AV, Tria G, Kikhney AG, Gajda M, Gorba C, Mertens HDT, Konarev PV, Svergun DI. New developments in the ATSAS program package for small-angle scattering data analysis. J Appl Crystallogr. 2012;45:342–350. doi: 10.1107/S0021889812007662.
69. Tarshis LC, Yan M, Poulter CD, Sacchettini JC. Crystal structure of recombinant farnesyl diphosphate synthase at 2.6-Å resolution. Biochemistry. 1994;33:10871–10877. doi: 10.1021/bi00202a004.
70. Ezra A, Golomb G. Administration routes and delivery systems of bisphosphonates for the treatment of bone resorption. Adv Drug Delivery Rev. 2000;42:175–195. doi: 10.1016/s0169-409x(00)00061-2.
71. Köksal M, Chou WKW, Cane DE, Christianson DW. Structure of 2-methylisoborneol synthase from Streptomyces coelicolor and implications for the cyclization of a noncanonical C-methylated monoterpenoid substrate. Biochemistry. 2012;51:3011–3020. doi: 10.1021/bi201827a.
72. Holm L, Rosenström P. Dali server: conservation mapping in 3D. Nucleic Acids Res. 2010;38:W545–W549. doi: 10.1093/nar/gkq366.
73. Rynkiewicz MJ, Cane DE, Christianson DW. Structure of trichodiene synthase from Fusarium sporotrichioides provides mechanistic inferences on the terpene cyclization cascade. Proc Natl Acad Sci USA. 2001;98:13543–13548. doi: 10.1073/pnas.231313098.
74. Prisic S, Xu J, Coates RM, Peters RJ. Probing the role of the DXDD motif in class II diterpene cyclases. ChemBioChem. 2007;8:869–874. doi: 10.1002/cbic.200700045.
75. Köksal M, Hu H, Coates RM, Peters RJ, Christianson DW. Structure and mechanism of the diterpene cyclase ent-copalyl diphosphate synthase. Nat Chem Biol. 2011;7:431–433. doi: 10.1038/nchembio.578.
76. Wendt KU, Poralla K, Schulz GE. Structure and function of a squalene cyclase. Science. 1997;277:1811–1815. doi: 10.1126/science.277.5333.1811.
77. Thoma R, Schulz-Gasch T, D’Arcy B, Benz J, Aebi J, Dehmlow H, Hennig M, Stihle M, Ruf A. Insight into steroid scaffold formation from the structure of human oxidosqualene cyclase. Nature. 2004;432:118–122. doi: 10.1038/nature02993.
78. Brodelius M, Lundgren A, Mercke P, Brodelius PE. Fusion of farnesyldiphosphate synthase and epi-aristolochene synthase, a sesquiterpene cyclase involved in capsidiol biosynthesis. Eur J Biochem. 2002;269:3570–3577. doi: 10.1046/j.1432-1033.2002.03044.x.
79. Castellana M, Wilson MZ, Xu Y, Joshi P, Cristea IM, Rabinowitz JD, Gitai Z, Wingreen NS. Enzyme clustering accelerates processing of intermediates through metabolic channeling. Nat Biotechnol. 2014;32:1011–1018. doi: 10.1038/nbt.3018.
80. Kleywegt GJ, Jones TA. Detection, Delineation, Measurement and Display of Cavities in Macromolecular Structures. Acta Crystallogr. 1994;D50:178–185. doi: 10.1107/S0907444993011333.
