Author manuscript; available in PMC: 2025 Mar 20.
Published in final edited form as: Nat Prod Rep. 2024 Mar 20;41(3):402–433. doi: 10.1039/d3np00033h

Class II terpene cyclases: structures, mechanisms, and engineering

Xingming Pan a, Jeffrey D Rudolf b, Liao-Bin Dong a
PMCID: PMC10954422  NIHMSID: NIHMS1953056  PMID: 38105714

Abstract

Terpene cyclases (TCs) catalyze some of the most complicated reactions in nature and are responsible for creating the skeletons of more than 95,000 terpenoid natural products. The canonical TCs are divided into two classes according to their structures, functions, and mechanisms. The class II TCs mediate acid-base-initiated cyclization reactions of isoprenoid diphosphates, terpenes without diphosphates (e.g., squalene or oxidosqualene), and prenyl moieties on meroterpenes. The past twenty years witnessed the emergence of many class II TCs, their reactions and their roles in biosynthesis. Class II TCs often act as one of the first steps in the biosynthesis of biologically active natural products including the gibberellin family of phytohormones and fungal meroterpenoids. Due to their mechanisms and biocatalytic potential, TCs elicit fervent attention in the biosynthetic and organic communities and provide great enthusiasm for enzyme engineering to construct novel and bioactive molecules. To engineer and expand the structural diversities of terpenoids, it is imperative to fully understand how these enzymes generate, precisely control, and quench the reactive carbocation intermediates. In this review, we summarize class II TCs from nature, including sesquiterpene, diterpene, triterpene, and fungal meroterpenoid cyclases (MTCs) as well as noncanonical class II TCs and inspect their sequences, structures, mechanisms, and structure-guided engineering studies.

Graphical Abstract

This review offers an overview of the canonical and noncanonical class II TCs, including sesquiterpene, diterpene, triterpene, and meroterpenoid cyclases. It delves into their sequences, structures, mechanisms, and engineering studies.

1. Class II TCs

Terpenoids constitute the largest family of natural products with over 95,000 known compounds.1 Backed by their complex structures and polycyclic systems, terpenoids possess impressive biological activities, as exemplified by the anticancer drug taxol,2 the antimalarial agent artemisinin,3 and the antifungal polygodial.4,5 Terpenoids are synthesized from the foundational C5 isoprenoid building blocks, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). These precursors originate from one of two primary pathways. The mevalonate (MVA) pathway begins with acetyl-CoA and encompasses seven distinct steps.6,7 Conversely, the non-mevalonate pathway, or methyl erythritol phosphate (MEP) pathway, starts with the fusion of pyruvate and d-glyceraldehyde 3-phosphate and consists of eight steps.8 IPP and DMAPP serve as direct alkylating agents or substrates for extending the prenyl diphosphates in increments of five carbon atoms.9,10 This polymerization creates the acyclic allylic diphosphates geranyl diphosphate (GPP, C10), farnesyl diphosphate (FPP, C15), and geranylgeranyl diphosphate (GGPP, C20). Terpenoid classification is based on carbon count, and while the aforementioned diphosphates are common, others such as geranylfarnesyl diphosphate (GFPP, C25),11 hexaprenyl diphosphate (HexPP, C30),12 and heptaprenyl diphosphate (HepPP, C35),13 or even longer chains, also enrich the vast terpenoid spectrum.14
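
The C5 bookkeeping described above can be illustrated with a minimal Python sketch that derives the carbon count of each prenyl diphosphate from its number of isoprene units. The abbreviations are those used in the text; the dictionary and the helper name `carbon_count` are illustrative assumptions, not part of any published tool.

```python
# Number of C5 isoprene units in each prenyl diphosphate (abbreviations from the text).
ISOPRENE_UNITS = {
    "GPP": 2,
    "FPP": 3,
    "GGPP": 4,
    "GFPP": 5,
    "HexPP": 6,
    "HepPP": 7,
}
CARBONS_PER_UNIT = 5  # IPP and DMAPP are both C5 building blocks


def carbon_count(abbreviation: str) -> int:
    """Return the carbon count of a prenyl diphosphate (hypothetical helper)."""
    return CARBONS_PER_UNIT * ISOPRENE_UNITS[abbreviation]


if __name__ == "__main__":
    for name in ISOPRENE_UNITS:
        print(f"{name}: C{carbon_count(name)}")
```

Each chain-extension step adds one IPP unit, so moving one entry down the dictionary corresponds to adding five carbons.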

At the center of the biosynthetic routes of these fascinating molecules are the terpene cyclases (TCs, also referred to as terpene synthases); in this review, we specifically address enzymes that catalyze cyclization reactions, thus favouring the term 'terpene cyclases' over the broader concept of terpene synthases.9 TCs are responsible for cyclizing the acyclic isoprenoid precursors into (poly)cyclic skeletons.1,15 By abstracting the diphosphate group of GPP, FPP, and GGPP, respectively, class I TCs commonly forge the corresponding monoterpenes, sesquiterpenes, and diterpenes.1,16,17 Some class I TCs can cyclize GFPP and HexPP, although such enzymes are rare.11,12,18 In contrast to class I TCs, canonical class II TCs initiate cyclization by protonating either a double bond or an epoxide of the substrate, leaving any diphosphate group intact. Class II TCs can also act on terpene moieties that were previously appended onto non-terpenoids such as polyketides or indoles. These appropriately named meroterpenoid cyclases (MTCs) differ from canonical class II TCs and frequently act on products of post-prenylation, that is, prenylation of a non-terpenoid scaffold rather than the aforementioned prenylation for chain extension.1,19-21

The superfamily of class II TCs encompasses a diversity of sequences and functions including sesquiterpene, diterpene, triterpene, and MTCs as well as noncanonical class II TCs. To exemplify the evolutionary relationships amongst class II TCs, a phylogenetic tree was constructed (Fig. 1).22 Class II TCs are collectively defined by a dominant functional β domain, a central acidic residue, and a protonation-initiated catalytic mechanism. Whereas class I TCs possess Asp-rich (commonly DDxxD) and NSE/DTE motifs to assist in abstraction of the diphosphate, class II TCs rely on a central acidic residue, often found in a DxDD motif, to initiate their reactions by donating a proton to the substrate when acting directly on olefins. However, this motif has been heavily modified in TCs acting on epoxidized substrates, such as the oxidosqualene cyclases (OSCs).20 Most canonical class II TCs adopt the central Asp mechanism, although some noncanonical class II TCs, such as cytochrome P450s (P450s) and V5+-dependent haloperoxidases (VHPOs), perform terpene cyclization via class II-like mechanisms without protonation from an acidic Asp.9
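
The DxDD and DDxxD consensus strings mentioned above lend themselves to a simple sequence scan. The following is a minimal sketch, assuming that "x" stands for any residue; the function name `find_motifs` and the toy sequence are hypothetical and are not taken from any real enzyme.

```python
import re

# Regular expressions written from the consensus strings quoted in the text
# ("x" = any residue). The dictionary itself is an illustrative assumption.
MOTIFS = {
    "DxDD": "D.DD",    # central-acid motif typical of class II TCs
    "DDxxD": "DD..D",  # Asp-rich motif typical of class I TCs
}


def find_motifs(sequence: str) -> dict[str, list[int]]:
    """Return 1-based start positions of each motif, including overlapping hits."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A zero-width lookahead reports overlapping occurrences as separate hits.
        hits[name] = [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]
    return hits


if __name__ == "__main__":
    toy_sequence = "MKTAADIDDGSLLVFDDAAD"  # made-up sequence for illustration only
    print(find_motifs(toy_sequence))
```

A pattern hit only flags a candidate: as the text notes, the motif is heavily modified in OSCs, and a motif can also be present in a domain that is not catalytically active.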

Fig. 1.

The primary amino acid sequences of representative class II TCs were used to construct a neighbor-joining phylogenetic tree. The primary sequence alignment was generated through ClustalW and the tree was built in MEGA 11.0, further modified in tvBOT.173 The accession numbers are indicated in parentheses. UniProt accession numbers: AoDMS (Q2UEK4), CVBPO (Q8LLW7), VrtK (D7PHY9), NapH1 (A7KH27), DmtA1 (A0A343VTS1), MacJ (A0A2P1DP74), OlcD (S7ZWB0), Trt1 (Q0C8A7), AdrI (A0A1Y0BRF5), AusL (Q5AR23), PrhH (A0A1E1FFM9), AtS2B (A0A1L9NBU5), AfB (B8NIK0), Pyr4 (Q4WLD2), CdmG (A0A3G9H8P0), AtS5B1 (A0A1L9N065), PaxB (E3UBL6), MstE (A0A2D1CM82), AncC (AncC), AsDMS (A0A1M6CXF0), AcDMS (A0A0U5GNT1), SsDMS (A0A2P2GK84), ScDMS (F8JSS1), HsOSC (P48449), AaSHC (P33247), ScOSC (P38604), AgAS (Q38710), OsCPS2 (Q5MQ85), SmCPS (B8PQ84), AtCPS (Q38802), Rv3377c (O50406), PvCPS (A0A348FUE1), Tpn2 (A0A2M9LDX2), PtmT2 (D8L2U6), SdCPS2 (A0A1S5RW73), TwTPS14 (A0A0M5L832), CpPS (A0A2L0VXR0), OsCPS1 (Q6ET36), PpCPS/KS (A5A8G0), MvPPPS (A0A075FAK4), LCD (A0A125SXN2), LCC (A0A125SXN1), TylF (A0A1Y0RDB4), AscF (A0A455RAK9), AscI (A0A455R4Z0), CtvD (Q0C9L5), NtnI (A0A455LRW3), AtnI (A0A455LN86). NCBI accession numbers: EtCPS (WP_020322919.1), GfCPS (CAA75244.1), HomoS (XP_025551549.1), HomoB (XP_025551548.1), FumiS1 (EDP47976.1), FumiB (EDP47980.1), AlliS (XP_031902717.1), AlliB (XP_031902718.1), SAD1 (CAC84558.1). The others were retrieved from the original papers: LCE,121 InsB2,133 InsA7,159 TelSHC.157 The TTC clade is coloured in blue; the DTC clade is coloured in red; the STC clade is shown in purple; the MTC clades are coloured in green; the noncanonical cyclase clade is coloured in sand.
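
Neighbor-joining operates on a matrix of pairwise distances computed from the alignment. The sketch below shows one common choice, the p-distance (fraction of differing residues over gap-free columns). This is an assumption for illustration only: the distance model and settings used for Fig. 1 are not stated in the caption, and the function names and toy sequences are hypothetical.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of mismatches over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment (equal length)")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances for every ordered pair of named, aligned sequences."""
    return {
        (m, n): p_distance(aligned[m], aligned[n])
        for m in aligned
        for n in aligned
    }


if __name__ == "__main__":
    # Toy alignment; these are not real cyclase sequences.
    toy = {"seqA": "ACD-GH", "seqB": "ACEFGH", "seqC": "TCEFGK"}
    for pair, d in distance_matrix(toy).items():
        print(pair, round(d, 3))
```

A tree-building program would then join the closest pairs iteratively; that step is omitted here.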

The determination of a trio of TC structures, including two class I TCs (5-epi-aristolochene synthase and pentalenene synthase) and squalene hopene cyclase (SHC), in 1997 was a milestone in terpene biochemistry, revealing the complexity of TC-catalyzed terpene skeleton formation.16,17,19 All canonical TCs, whether belonging to class I or II, exhibit all-α-helical folds, as revealed by the first three X-ray crystal structures.23 These structures generally consist of α, β, and γ domains in varying combinations.1 It should be noted that the membrane-bound UbiA-type cyclases also share all-α-helical folds similar to those of some class I TCs.24-28 The catalytic domain of class I TCs is designated as the α domain and contains the metal-binding and catalytic DDxxD and NSE/DTE motifs;16,17,23,29 class II TCs, in contrast, typically comprise two α-helical domains, designated β and γ, and position the substrate in the cleft between them, with catalysis carried out by the functional β domain, which contains the DxDD motif.19,23,29 So far, the only known monodomain class II TC is the MDTC MstE, which has its active site located in the interior of the β domain.30 Some TCs contain all three domains and can be monofunctional (either class I or class II activity) or bifunctional (both class I and class II activities).2,31,32

This intricate process of TC-catalyzed cyclization encompasses the dynamic generation of carbocations and their eventual quenching, which involves a series of complex changes in the active site. Through the exploration of class II TC structures, a deeper understanding of these mechanisms can be attained.31,33,34 Prior to protonation, the active site undergoes changes to accommodate the substrate, such as movement of aromatic residues, the approach of a diphosphate sensor, and alterations in the binding site of Mg2+ (ref. 33). During the cyclization process, aromatic residues help to stabilize the carbocationic intermediates. Finally, a catalytic base, which is typically either a water molecule or a residue within the active site, quenches the final carbocation via deprotonation; in some cases, a water molecule quenches the carbocation via direct attack.35,36

Due to their mechanisms and biocatalytic potential, class II TCs attract great attention in both the biosynthesis and organic synthesis communities. Class II TCs typically act early in natural product biosynthetic routes.5,37-42 The diphosphate group present in the cyclized ring system generated by class II diterpene cyclases (DTCs) or sesquiterpene cyclases (STCs) is known to facilitate the formation of carbocations, which can be further leveraged in synthetic chemistry approaches. This property makes a cyclized terpene skeleton bearing a diphosphate group an attractive starting point for carbocation chemistry and subsequent functional group decorations using both chemical strategies and enzymatic transformations. Notably, SHC, as well as MTCs, are known to trigger ionization on substrates without a diphosphate moiety, making them attractive starting points for organic synthetic applications. In this review, we summarize the sequences, structures, catalytic mechanisms, and engineering outcomes of the fascinating family of class II TCs.

2. Class II TC structures

The structural intricacies of class II TCs are pivotal in determining their functional capabilities. The publication of three TC structures, including one class II TC and two class I TCs, in 1997 established the basis for comprehending the enzymes that are predominantly responsible for the diversity of terpenoid natural products.1,17,19,43 Class II TCs primarily function at the cleft between the β and γ domains. A detailed examination of their active sites indicates that enzymes producing larger terpene molecules possess a more expansive cavity and additional interactive residues. Dynamic shifts in the active site during substrate binding promote the reaction. As more enzyme structures are elucidated, opportunities arise to modify these active sites, enabling the synthesis of new terpenoid skeletons.33 To date, class I TCs have been more extensively characterized, with over 30 structures; only twelve class II TC structures have been reported.1,30,33,44-48 In this section, we review these twelve class II TCs with determined structures (Table 1).

Table 1.

Structurally determined class II TCs

Name | Organism | Biosynthetic pathway | UniProt ID | PDB ID (ref.)
AaSHC | Alicyclobacillus acidocaldarius subsp. acidocaldarius ATCC 27009 | hopene | P33247 | 1SQC (8), 2SQC (52), 1UMP (53)
OSC | Homo sapiens | cholesterol | P48449 | 1W6K, 1W6J (20)
AtCPS | Arabidopsis thaliana | gibberellin | Q38802 | 3PYA, 3PYB (31), 4LIX (63)
PtmT2 | Streptomyces platensis CB00739 | (thio)platensimycin/(thio)platencin | D8L2U6 | 5BP8 (34)
Rv3377c | Mycobacterium tuberculosis | 1-tuberculosinyl-adenosine | O50406 | 6VPT (45)
Tpn2 | Kitasatospora sp. CB02891 | terpentecin | A0A2M9LDX2 | 7XKX (44)
SsDMS | Streptomyces showdoensis | drimenol | A0A2P2GK84 | 7XQ4, 7XQZ, 7XR7, 7XRA, 7XRU (33)
PvCPS | Penicillium verruculosum | CPP | A0A348FUE1 | 6V0K (48)
AgAS | Abies grandis | abietadiene | Q38710 | 3S9V (32)
MstE | Scytonema sp. PCC 10023 | merosterol A | A0A2D1CM82 | 6SBB, 6SBC, 6SBD, 6SBE, 6SBF, 6SBG (30)
CVBPO | Corallina officinalis | snyderols | Q8LLW7 | 1QHB (47)
VCPO (NapH1) | Streptomyces sp. CNQ-525 | napyradiomycins | A7KH27 | 3W35, 3W36 (46)

2.1. Triterpene cyclases (TTCs)

Triterpenoids, a diverse group of naturally occurring compounds that are formed by six C5 isoprene units, serve as precursors to vitamins and bioactive metabolites and function in membrane fluidity regulation.49 In bacteria, SHC is responsible for protonating the terminal alkene of squalene to initiate the formation of hopene (also called diploptene), the fundamental pentacyclic hopanoid core.50 The incorporation of hopanoids into the lipid membrane confers enhanced stability and reduced permeability, helping to maintain overall membrane integrity. Moreover, owing to their remarkable chemical stability, hopanoids serve as reliable biomarkers for identifying microbial species and assessing sedimentary environments.51

Similarly, OSC serves an indispensable function in eukaryotic biosynthesis by catalyzing the production of lanosterol, a critical intermediate in the pathway to cholesterol. Despite its negative connotation, cholesterol plays a vital role in the formation of cell membranes and serves as a precursor to bile acids, vitamin D, and steroid hormones, rendering it indispensable to the physiology of both animals and plants.37,49 Elucidating the crystal structure and molecular mechanism of human OSC represents a key step in the design of potent inhibitors targeted at reducing cholesterol levels in humans.20,37

In 1997, SHC from Alicyclobacillus acidocaldarius was the first structurally determined class II TC, determined at a resolution of 2.9 Å by X-ray diffraction (PDB ID: 1SQC).19 This monotopic membrane protein displayed an overall dumbbell shape consisting of β and γ domains, the γ domain of which is inserted between the first and second helices of the β domain. Later analysis revealed that almost all the class II TCs featuring the βγ didomain follow this rule (Fig. 2A).1 Both domains share an α66 barrel and form a large active site cavity of 1200 Å3 at the interface of the two domains. The active site is predominantly hydrophobic, displaying a polarity gradient that extends from a polar patch featuring the catalytic acid located at the top to nonpolar aromatic residues that line the active site and provide stabilization of the cationic intermediates; the catalytic base is expected to reside at the distal end of the cavity. In the active site, the competitive inhibitor N,N-dimethyldodecylamine-N-oxide was bound to the nonpolar surface, indicating a potential binding cleft for the substrate. The large and mobile nonpolar plateau, located on the protein surface surrounding the entrance of a nonpolar channel between the helices of the γ domain, provides access to the central cavity. Notably, the plateau is encompassed by cationic residues, implying its insertion into the hydrophobic core of the membrane, thereby elucidating the mechanism underlying the translocation of squalene from its site of solvation within the membrane interior into the central cavity of SHC. 
In 1999, a novel crystalline form (PDB ID: 2SQC) was reported, which substantially enhanced the resolution of SHC to 2.0 Å.52 Then in 2004, an additional aza-substrate bound structure (PDB ID: 1UMP) was reported, revealing a meandering conformation that is strictly enforced by the shape of the cavity.53 The structure illustrated that the tertiary amine of azasqualene creates a salt bridge with D376, positioned at the top of the active site, while the last isoprene unit interacts with F343 at the bottom of the cavity. Both structures (2SQC and 1UMP) disclose the presence of a water molecule between D376 and Y495 increasing the acidity of D376 and likely reprotonating D376, which is also hydrogen bonded with H451.52,53

Fig. 2.

Structures and functions of squalene-hopene cyclase (SHC) and oxidosqualene cyclase (OSC). (A) SHC with the DxDD motif coloured in pink and the membrane anchor helix coloured in yellow; SHC transforms squalene into hopene; the inhibitor azasqualene is shown in magenta. (B) OSC with the VxDC motif coloured in pink and the membrane anchor helix coloured in yellow; OSC cyclizes 2,3-oxidosqualene into lanosterol (magenta).

Although the sources of OSC and SHC differ significantly, they share 25% sequence identity, indicative of essential structural similarity.20 Like bacterial SHC, OSC from Homo sapiens is a monotopic membrane protein comprised of two α66 barrel domains (β and γ) that are linked by loops and three β-strands (PDB ID: 1W6K) (Fig. 2).20,54 Similar to SHC, the active site of OSC at the domain interface accommodates lanosterol, while the helix anchors the protein in the membrane. Additionally, a 25 Å diameter plateau is present on the membrane-inserted surface that includes a channel leading to the active site, which facilitates the entry of the substrate oxidosqualene from the membrane into the active site cavity.20 In contrast to the DxDD motif present in SHC, human OSC (hsOSC) employs a single central Asp residue, D455, in a VxDC motif (referred to as the DCTAE motif in other OSCs),55 whose acidity is augmented by two peripheral Cys residues, C456 and C533.20 The catalytic base, H232, plays a crucial role in terminating the cyclization reaction. H232 is the only basic residue that is close enough to accept the proton required for the specific deprotonation of the C8/C9 lanosteryl cation, which ultimately leads to the termination of catalysis. Furthermore, the hydroxy group of Y503 is hydrogen-bonded to H232 and positioned in such a way that it may assist in the final deprotonation step (Fig. 2B).20,56

2.2. Diterpene cyclases (DTCs)

The first committed step in gibberellin biosynthesis, with routes known in plants, fungi, and bacteria, is the conversion of GGPP into ent-copalyl diphosphate (ent-CPP).39,57-62 The structure of ent-CPP synthase (ent-CPS) remained undetermined until the structural elucidation of a homolog from the model plant Arabidopsis thaliana (AtCPS, also known as ent-kaurene synthetase A) involved in gibberellin biosynthesis at 2.25 Å in 2011 (PDB IDs: 3PYA and 3PYB).31,38 The overall structure of AtCPS is different from that of SHC and OSC, with its domain architecture exhibiting an αβγ tridomain and no membrane-associated helix. The three-domain architecture was also seen in taxadiene synthase; however, the βγ domains in taxadiene synthase are nonfunctional as taxadiene synthase only catalyzes a class I TC reaction.2 The opposite holds for AtCPS, whose α domain lacks the DDxxD and NSE/DTE motifs conserved in class I TCs and is an evolutionary vestige (Fig. 3A). Like SHC and OSC, the hydrophobic active site of AtCPS is located between the β and γ domains, as indicated by the structures in complex with two analogs, (S)-15-aza-14,15-dihydrogeranylgeranyl thiolodiphosphate and 13-aza-13,14-dihydro-CPP. A 1.55 Å high-resolution structure of AtCPS was determined three years later (PDB ID: 4LIX)63 and shows a single and unambiguous substrate conformation. Unfortunately, 15-aza-GGPP in this structure of AtCPS is not in a catalytically competent conformation as its aza group ion pairs with the central Asp, potentially due to the absence of Mg2+ in the structure.33 Analysis of this structure also proposed a series of “proton wires”, constructed by the hydrogen-bonded solvent network between D379 and bulk solvent, that might facilitate catalysis. The central Asp is hydrogen bonded with N425 and T421 to ensure that the side chain Oδ2 atom of D379 is anti-oriented toward the substrate. 
It is also known that divalent ions such as Mg2+ improve the catalytic efficiency of class II TCs by facilitating binding since their substrates bear negatively charged diphosphate groups, and therefore a divalent metal binding site was proposed to reside in class II DTCs.64 Within the plant class II DTCs, an anticipated Mg2+-binding motif, EDxxD was identified.29 However, the corresponding motif in AtCPS is situated 18 Å from the diphosphate group of the substrate. This spatial disposition implies the EDxxD motif may not be involved in binding Mg2+. Furthermore, no electron density pertaining to divalent ions was discernible in any of the AtCPS structures.31,63

Fig. 3.

Structures of class II DTCs. (A) Overlay of class II DTCs including AtCPS, PtmT2, Rv3377c and Tpn2 showing conserved active sites located in the cleft between the β and γ domains and that the α domain is vestigial (AtCPS, cyan; PtmT2, orange; Rv3377c, blue; Tpn2, green). (B–D) Active site overlays of AtCPS with PtmT2 (B), Rv3377c (C), and Tpn2 (D). The AtCPS active site (cyan sticks) with the bound GGPP analog (magenta) is used to indicate the likely substrate location in the active sites (shown as sticks) of the other DTCs.

While the plant ent-CPS revealed the three-domain architecture, the sequences of bacterial class II DTCs suggested that they were architecturally like SHC and OSC.29 This was experimentally confirmed by the structure of PtmT2 (PDB ID: 5BP8),34 an ent-CPS found in the biosynthesis of (thio)platensimycin and (thio)platencin, promising antibiotics produced by several Streptomyces strains.65-68 The structure of PtmT2 was determined at a resolution of 1.8 Å and closely resembles the β and γ domains of AtCPS, but PtmT2 lacks the α domain (Fig. 3A and 3B). Its central catalytic D313 is hydrogen bonded with H359 and connected to Y409 for reprotonation by a water molecule. In the structure of PtmT2, the divalent ion(s) remain unobserved, even after adding 5 mM MgSO4 and keeping the pH value of the crystallization condition neutral.34 Despite the absence of a ligand in PtmT2, the active site was found in a closed conformation with the W399–S406 loop moved down toward the binding pocket. The application of structure-based mutagenesis using docking models with GGPP enabled the identification of essential catalytic residues as well as important stabilizing aromatic and aliphatic residues that envelop the substrate during catalysis. The catalytically inert variant K402A and the catalytically impaired variants D128A and E133A implied that K402 might bind the diphosphate moiety and that D128 and E133 might bind Mg2+ ion(s).29 Somewhat surprisingly, these mutagenesis results supported the prediction that the EDxxD region in AtCPS binds Mg2+, despite its distance from the diphosphate group. At this point, the Mg2+-binding site in class II TCs remained enigmatic.

Rv3377c was the second structurally characterized class II DTC from bacteria, determined at a resolution of 2.7 Å in 2020 (PDB ID: 6VPT).45 This enzyme is involved in the biosynthetic pathway of 1-tuberculosinyl-adenosine, a natural product produced by Mycobacterium tuberculosis (Mtb).45,69 At the bottom of the active site sits the general acid D295, which is hydrogen bonded with H341, as was seen in PtmT2 (with H359) (Fig. 3C). Rv3377c contains two QW motifs composed of QxxDGSWG, one located in the β domain and the other in the γ domain, both situated on the outer surface of the protein.45 Although QW motifs are found in other class II TCs, the presence of two may not be universally conserved among all class II DTCs. Following the determination of Rv3377c, the structure of Tpn2 from Kitasatospora sp. CB02891, a class II DTC responsible for terpentedienyl diphosphate (TPP) formation in the biosynthesis of the antitumor antibiotic terpentecin, was determined at a resolution of 2.57 Å (PDB ID: 7XKX).44 This was also the first structure of a clerodienyl diphosphate synthase and showed the same βγ didomain as in the other bacterial class II DTCs. Different from the halimadienyl diphosphate producer Rv3377c, Tpn2 forms a narrower active site with W384 and L291 protruding into the domain interface. The corresponding central Asp in Tpn2 is D296, which is also hydrogen bonded to a His, H342, indicating that this Asp–His pair is conserved among bacterial class II DTCs. Unfortunately, there were no bound divalent ions found in the electron density maps of Rv3377c or Tpn2 (Fig. 3D).

2.3. Sesquiterpene cyclase (STC)

Although structures of class II TTCs and DTCs were known and well-studied, and it was logical that members of the class II TC family may accept and cyclize FPP, no class II STCs were reported until 2022. The first class II STC, a drimenyl diphosphate (DPP) synthase from Streptomyces showdoensis (SsDMS) was identified, functionally characterized, and structurally determined at a resolution of 1.58 Å (PDB ID: 7XQ4).33 Resembling the didomain architecture of PtmT2, SsDMS places its active site in the cleft of the two domains. The active site is lined by several aliphatic and aromatic residues, enveloping the ligand as shown in the ligand-bound structures and creating an active site with a volume of 536 Å3, significantly smaller than that of the TTC (SHC, 1036 Å3) and DTC (AtCPS, 940 Å3). The central Asp, D303, is hydrogen bonded with R403 and likely reprotonated by Y307 via a water molecule.
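
To put the quoted cavity sizes in perspective, the following minimal sketch relates each active-site volume to the carbon count of the enzyme's substrate. The volumes and substrates are those given in the text; the dictionary layout and the helper name `volume_per_carbon` are illustrative assumptions, and the ratio is a crude descriptor rather than an established metric.

```python
# Active-site volumes (in cubic angstroms) and substrates as quoted in the text.
ACTIVE_SITES = {
    "SsDMS": {"substrate": "FPP", "carbons": 15, "volume": 536},
    "AtCPS": {"substrate": "GGPP", "carbons": 20, "volume": 940},
    "SHC": {"substrate": "squalene", "carbons": 30, "volume": 1036},
}


def volume_per_carbon(enzyme: str) -> float:
    """Cavity volume divided by substrate carbon count (hypothetical helper)."""
    entry = ACTIVE_SITES[enzyme]
    return entry["volume"] / entry["carbons"]


if __name__ == "__main__":
    for name, entry in ACTIVE_SITES.items():
        print(f"{name} ({entry['substrate']}, C{entry['carbons']}): "
              f"{entry['volume']} A^3, {volume_per_carbon(name):.1f} A^3 per carbon")
```

The absolute volumes rise with substrate size, as the text describes, while the per-carbon ratio is not monotonic, which is one reason to treat it only as a rough comparison.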

Excitingly, the number and location of divalent ions were identified in the active site of SsDMS, solving one of the lingering questions of class II TCs (Fig. 4A). In complex structures of the SsDMS D303E or SsDMS D303A variants with FPP or 2-fluoro-FPP, two Mg2+ ions were found anchoring the diphosphate group near the active site entrance. Both ions were coordinated by E169. E169, which spatially aligns with E211 in AtCPS and E133 in PtmT2, appears to be the only residue needed to bind Mg2+ (Fig. 4D and 4E). Two other residues appear to play key roles in controlling substrate binding and catalysis. W165 acts as a gate keeper residue that blocks the active site in the open state and R132 acts as a diphosphate sensor that turns its side chain toward the diphosphate group upon substrate binding (Fig. 4B and 4C). Canonical class I TCs bind a trinuclear Mg2+ cluster with their DDxxD and NSE/DTE motifs to facilitate diphosphate abstraction,1,70 whereas class II TCs bind a binuclear Mg2+ cluster with a single Glu residue for diphosphate binding.

Fig. 4.

Structure of the STC SsDMS reveals the class II TC metal-binding mode. (A) SsDMS complex shows the Mg2+-binding site and the substrate location. The Mg2+ ions are shown in green spheres; SsDMS transforms FPP into DPP. (B) W165, as the gate keeper, blocks the active site in the open state and rotates away along with L166 when the substrate binds. The closed state of SsDMS is coloured in sand; the open state of SsDMS is coloured in gray. (C) The diphosphate sensor R132 undergoes a conformational change in which its side chain is rotated, resulting in the positioning of the guanidinium moiety within 3.1 Å of the diphosphate moiety. (D) E169 undergoes a positional shift of 2.6 Å, thereby facilitating the formation of the substrate-ions-enzyme coordinated complex. (E) Overlay of AtCPS and PtmT2 with the structure of SsDMS indicates the true Mg2+-bound residue might be E211 (AtCPS, cyan) and E133 (PtmT2, orange). (F) Proposed catalytic base group deprotonates the final carbocation intermediate in SsDMS. The water molecule is shown in red sphere.

2.4. Bifunctional cyclases

Three-domain architectures, with either monofunctional or bifunctional activities, are not commonly observed in class II TCs.71,72 Two monofunctional TCs with the three-domain architecture are found in plants: the class I taxadiene synthase from Taxus brevifolia (TbTS) and the class II AtCPS described above.2,31 TbTS lacks a functional DxDD motif in its β domain and AtCPS lacks the functional DDxxD and NSE/DTE motifs in its α domain (Fig. 5C and 5D). A well-studied three-domain, bifunctional TC is abietadiene synthase from Abies grandis (AgAS), whose structure was determined at a resolution of 2.30 Å in 2012 (PDB ID: 3S9V).32,73 All three canonical motifs, the class I DDxxD and NTE motifs and the class II DxDD motif, are present in the α and β domains of AgAS (Fig. 5A). GGPP is first cyclized into CPP via protonation of the terminal alkene by D404, which is coordinated to N451 (Fig. 5B and 5D). CPP is then transferred via diffusion out of the βγ-interfacial active site and into the α-domain active site for the second cyclization to abietadiene (Fig. 5C). The two active sites of AgAS are catalytically independent, but truncation experiments showed that they do not function when the domains are separated.74 Distinct from the canonical class II TCs discussed above, AgAS was seen in the loop-out conformation, where loop 482–492 was located at the exterior of the protein (Fig. 5A); the corresponding loop of AtCPS penetrated into the active site.31,63 Molecular dynamics (MD) simulations on this loop showed a key residue, D277, that acted as a gate by correlating with the loop and preventing it from moving into the active site.32 The loop-in conformer presented in the MD study plays a counterintuitive role by reducing the volume of the active site crevice while increasing the conformational flexibility of GGPP.32
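
The relationship between motif content and activity in the three plant enzymes discussed above can be summarized in a short sketch. It assumes, purely for illustration, that activity follows from whether functional class I motifs (α domain) and a functional DxDD motif (β domain) are present; the class name `ThreeDomainTC` and the function `predicted_activity` are hypothetical, and real assignments require biochemical characterization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreeDomainTC:
    """Minimal record for an alpha-beta-gamma terpene cyclase (illustrative only)."""
    name: str
    functional_class_i_motifs: bool  # DDxxD plus NSE/DTE-type motifs in the alpha domain
    functional_dxdd_motif: bool      # DxDD motif in the beta domain


def predicted_activity(tc: ThreeDomainTC) -> str:
    """Map motif content to the activity described in the text for these enzymes."""
    if tc.functional_class_i_motifs and tc.functional_dxdd_motif:
        return "bifunctional (class I and class II)"
    if tc.functional_class_i_motifs:
        return "monofunctional class I"
    if tc.functional_dxdd_motif:
        return "monofunctional class II"
    return "no canonical cyclase activity predicted"


if __name__ == "__main__":
    # Motif assignments follow the descriptions of TbTS, AtCPS, and AgAS in the text.
    enzymes = [
        ThreeDomainTC("TbTS", functional_class_i_motifs=True, functional_dxdd_motif=False),
        ThreeDomainTC("AtCPS", functional_class_i_motifs=False, functional_dxdd_motif=True),
        ThreeDomainTC("AgAS", functional_class_i_motifs=True, functional_dxdd_motif=True),
    ]
    for enzyme in enzymes:
        print(f"{enzyme.name}: {predicted_activity(enzyme)}")
```

This two-flag model does not cover enzymes whose α domain has a different function altogether, so it should be read only as a summary of the three cases named here.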

Fig. 5.

Structure of the bifunctional diterpene cyclase AgAS with three functional domains. (A) Overview of AgAS (coloured in magenta) shows the loop-out conformation, which is coloured in blue. (B) AgAS catalyzes the conversion of GGPP into CPP. (C) Comparison of the active site of the class I domain of AgAS with that of TbTS (coloured in grey). The active site residues are presented as sticks; Mg2+ ions are shown as green spheres; the GGPP analog is presented as white sticks and the diphosphate is coloured in orange. (D) Comparison of the active site of the class II domain of AgAS with that of AtCPS (coloured in grey). The water molecule is shown as a red sphere.

The bifunctional TC PvCPS, a CPS from Penicillium verruculosum TPU1311, adopts a three-domain architecture that comprises an α domain functioning as a prenyltransferase (PT), which generates GGPP for subsequent cyclization in the βγ domain. The βγ domain acts as a class II DTC, catalyzing the cyclization of GGPP to produce CPP.75 Unfortunately, the full-length structure is not available and only the PT domain was structurally characterized at a resolution of 2.41 Å (PDB ID: 6V0K).48 However, the full-length sequence reveals a significantly longer linker of 170 residues between the α and β domains; in the plant three-domain TCs, only 33–40 residues link these two domains.2,31,32 Different from AgAS, the individual PT and TC domains of PvCPS are still active in the absence of the linker or other domains.48,71,75

2.5. Meroditerpene cyclases (MDTCs)

MstE, a specialized class II TC, was identified as being responsible for forming an ent-sterol-like skeleton fused with an aryl moiety during the biosynthesis of cytotoxic merosterol A from Scytonema sp. PCC 10023.76,77 MstE cyclizes 5-geranylgeranyl-3,4-dihydroxybenzoate (GG-DHB) into merosterolic acid A, an established biosynthetic precursor of merosterol A, via a class II cyclization mechanism (Fig. 6D). The structure of MstE shows a unique single β domain, the first of its kind. Like the β domains of the class II TCs, MstE shows the canonical α66 barrel fold (Fig. 6A).30 Due to the lack of a γ domain and thus an interfacial cleft for substrate binding, the substrate is instead harbored in the inner cavity of the β domain (Fig. 6B). As in SHC and OSC, MstE possesses two QW motifs in its C-terminal region of outer-Helix 4 and outer-Helix 6 (see Section 3.4). Upon substrate binding, residues 157–165 fold around the entrance in a substrate-induced fit, mediated by interactions between the DHB carboxylate and Y157 and between D162 and R337, to prevent solvent from accessing the cationic intermediates (Fig. 6C). Unlike terrestrial MTCs, which lack a recognizable Asp-rich motif, MstE possesses a D107ADT motif, an alternative to the DxDD motif (Fig. 6B), with the central D109 serving as the catalytic acid.

Fig. 6.

Structure of MDTC MstE. (A) Top view of MstE shows the canonical α66 barrel fold; the outer helices are coloured in blue; the inner helices are coloured in pink. (B) The hydrophobic active site of MstE that binds GG-DHB is shown as a pink surface with GG-DHB depicted as blue sticks, and the DxDT motif is coloured in orange. (C) Upon substrate binding, the side chain of Y157 engages in an interaction with GG-DHB, while D162 and R337 ionically interact. Residues in the open and closed conformations of MstE are coloured in grey and pink, respectively. (D) MstE catalyzes the cyclization of GG-DHB into merosterolic acid A.

2.6. Noncanonical class II TCs

Haloperoxidases are a family of enzymes that catalyze the two-electron oxidation of a halide to the corresponding hypohalous acid, which can further react with a wide range of nucleophilic acceptors to form halogenated natural products.78-80 These enzymes are named after the most electronegative halide they can oxidize, i.e., chloro-, bromo-, or iodoperoxidases. V5+-dependent haloperoxidases (VHPOs) are one of the three known groups of haloperoxidases.81 One subclass of VHPOs, known as V5+-dependent bromoperoxidases (VBPOs), can catalyze halide introduction and subsequent cyclization of terpenes in a class II TC-like manner (Fig. 7A).47,82,83 A VBPO from the red alga Corallina officinalis, CVBPO, was structurally characterized at a resolution of 2.3 Å (PDB ID: 1QHB) (Fig. 7A).47 The crystal structure of CVBPO, which consists of 12 subunits of 595 amino acids each, differs from those of canonical class II TCs. The V5+-binding site of CVBPO is located at the bottom of the active site cleft, which is approximately 20 Å deep and 14 Å wide. In the structure, the V5+-binding locus is occupied by inorganic phosphate rather than the vanadate cofactor; inorganic phosphate is known to compete with vanadate for binding in the enzyme active site (Fig. 7A).84 This site involves specific residues that are essential for V5+ coordination and stabilization. Interestingly, the V5+-binding site sits at the cavity bottom, which is reminiscent of the position of the DxDD motif found in canonical class II TCs, although the actual residues involved in the two sites are distinct (Fig. 7A). 
Although the V5+-binding site is highly conserved in the two algal bromoperoxidases and the fungal vanadium chloroperoxidase from Curvularia inaequalis, the remaining active site residues display considerable variability.85-87 This variability suggests different substrate recognition capabilities that may contribute to additional cyclization functions.

Fig. 7.

Structures of VBPO and NapH1. The active site pocket is shown as a blue surface. (A) Cartoon representation of the overall structure of CVBPO (yellow); CVBPO reaction scheme; CVBPO active site and inorganic phosphate (Pi) shown as yellow and orange sticks, respectively; (B) Cartoon representation of the overall structure of NapH1 (green); NapH1 reaction scheme; NapH1 active site and vanadate cofactor (grey) shown as sticks; two key residues S427 and K324 are highlighted in magenta and with bold font.

The biosynthesis of napyradiomycins involves a sequence of reactions in which NapH1 from Streptomyces sp. CNQ-525 catalyzes the conversion of naphthomevalin into the tricyclic napyradiomycin A1 (Fig. 7B). The crystal structures of NapH1 were determined at a resolution of 2.4 Å (PDB IDs: 3W35 and 3W36) (Fig. 7B).46 NapH1 is a homodimer; each subunit contains both N- and C-terminal helical bundles with an interconnecting region, as well as a vanadate-binding site. In NapH1, the vanadate cofactor is stabilized by the dipole interaction from helix 12 and forms charge-charge interactions with K372, R379, and R488. The vanadate cofactor coordinates to H494 and is also hydrogen bonded to S427, which corresponds to a conserved His residue in other structurally characterized VHPOs (Fig. 7B). Furthermore, the orientation of K324 towards the vanadate cofactor and S427 on helix 12 suggests that these residues may have a critical role in halide oxidation and cyclization (Fig. 7B). Mutation of K324 abolished the chlorocyclization of naphthomevalin. Moreover, the S427A variant exhibited reduced halide oxidation activity, indicating the importance of the hydrogen bond between S427 and the vanadate cofactor.

3. Functional motifs in class II TCs

The product outcome of a TC is governed by carbocation chemistry. The overall mechanistic logic of TCs comprises three stages: initiation of the reaction via generation of a carbocation; control of the active site environment to stabilize carbocations throughout the cyclization and prevent premature cation quench; and termination via deprotonation to form an alkene or nucleophilic quench of the ultimate carbocation by a water molecule.30,31,34,63 TCs achieve these delicately controlled mechanisms through the use of Asp-rich motifs, divalent metal ions, hydrophobic and aromatic active sites, and general bases.33,88,89

It is well established that Asp-rich motifs differentiate the functions of the TC classes. Class I TCs abstract the diphosphate group of the substrate through the divalent cation-binding Asp-rich regions, most commonly DDxxD and NSE/DTE motifs, to generate the cyclization-triggering carbocation.2,70,90 The Asp-rich region of class II TCs, typically a DxDD motif, acts as a general acid that protonates an alkene and generates the initial carbocation. Yet, variations of this motif are observed in class II TCs, such as OSCs that function on epoxidized substrates.1,15,20 The number of Asp residues in this region varies from one to three among TCs, depending on the substrate involved and the acidic strength needed to generate the initial carbocation intermediate. In certain instances, TCs may employ Glu as the native general acid, thereby obviating the need for an Asp-rich motif.91,92 The conservation of the general base dyad in plant DTCs involved in gibberellin biosynthesis can be attributed to a shared intermediate, ent-CPP, which constrains the shape and properties of the active site.93,94 Bacterial class II TCs, owing to their heterogeneous product skeletons, are far less conserved. Their general bases are found in distinct places in the active site, or could be a water molecule, depending on the location of the final carbocation intermediate; in some cases, the general acid itself may be utilized.33,34,45,89,93 Although the activity of numerous class II TCs is well known to rely on divalent cations that bind the diphosphate moiety of the substrate, the precise identity and location of these ions were not ascertained until recently.33

Other sequences, such as the QW motif, were bioinformatically identified as conserved motifs in class II TCs, used as predictive indicators of function,29 and recently proposed to play a role in stabilizing the enzyme during product release.19,29 A detailed understanding of the sequence-function relationships of class II TCs will clarify their mechanisms, enable prediction of function from sequence alone, and provide logical routes for protein engineering.

3.1. Catalytic acid

A general acid is required to initiate the multistep cyclization reactions catalyzed by class II TCs. Nature empowers each enzyme with an appropriately powerful acidic group depending on the properties of its preferred substrate (Fig. 8). In TCs that protonate the (frequently terminal) alkene of the acyclic terpene or isoprenyl diphosphate, a complete DxDD motif is required. Mediated by the nearby conserved Arg/His/Asn in bacterial and plant TCs, the more acidic anti-oriented proton of the central Asp is positioned toward the alkene;33,34,63 however, this anti-configuration is unstable without a suitable environment.95-97 In SHC, D376 is adjacent to C3 of squalene and is regarded as the general catalytic acid; this was supported by site-directed mutagenesis (Fig. 2A).19,52 The negative charges on the other two Asp residues, D313 and D447, help raise the pKa of D376, thus creating a microenvironment that allows D376 to act as an acid. In AtCPS, a hydrogen bond network created by D379, N425, and T421 ensures that the Oδ2 atom of D379 is oriented toward GGPP. Site-directed mutagenesis supported that N425 acts to orient D379 while T421 reactivates the general acid after the cyclization (Fig. 3).63 In PtmT2, Rv3377c, Tpn2, and SHC, the residue corresponding to N425 is His. In the class II STC SsDMS, an Arg is used to orient the general acid for protonation.33 However, the residue that corresponds to T421 has not been identified, which may be attributed to structural differences and the lack of related mutagenesis data.
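
As a rough illustration of why raising the pKa of a catalytic Asp matters, the Henderson-Hasselbalch relation gives the fraction of a carboxylic acid that remains protonated, and therefore able to donate a proton, at a given pH. The sketch below is not taken from the cited studies: the pH of 7.0 and the two shifted pKa values are hypothetical, and the value of about 3.9 for a solvent-exposed Asp side chain is only a typical textbook approximation.

```python
def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch: fraction of an acid present in its protonated (COOH) form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PH = 7.0  # assumed pH, for illustration only

# Only the first pKa is a typical literature-style value; the others are hypothetical.
examples = [
    ("solvent-exposed Asp (approx. pKa 3.9)", 3.9),
    ("hypothetical perturbed Asp (pKa 6.0)", 6.0),
    ("hypothetical perturbed Asp (pKa 8.0)", 8.0),
]

for label, pka in examples:
    print(f"{label}: {fraction_protonated(pka, PH):.4f} protonated at pH {PH}")
```

Under these assumptions, an unperturbed Asp would be almost entirely deprotonated at neutral pH, whereas a pKa shifted above the ambient pH leaves most of the population protonated and competent as a general acid.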

Fig. 8.

Overview of acid catalysis in class II TCs.

Epoxides have increased basicity and ring strain compared with alkenes and therefore need less acidic functional groups to be protonated.98 Human OSC, which accepts 2,3-oxidosqualene, therefore employs a single Asp residue in a VxDC motif to initiate the reaction (Fig. 2).20,55 This variation is also seen in other class II TCs whose substrate is an epoxyterpene.21 As would be expected, enzymes possessing a DxDD motif can protonate both alkenes and epoxides, expanding their substrate potential. This concept was confirmed in both class II DTCs and TTCs. Mutation of the central Asp in AtCPS (D379A) abolishes activity toward all substrates, whereas the D380A variant accepts only epoxy-GGPP.98 In SHC, either the first or the middle Asp of the DxDD motif can react with oxidosqualene, indicating the chemically reactive nature of the epoxide.99

Similar to OSC, DTCs that act on epoxidized substrates have a distinct catalytic acid motif.92 PlaT2 from the phenalinolactone producer Streptomyces sp. Tü6071 possesses a DSAN motif and was proposed to target (S)-epoxy-GGPP for cyclization.100 Bra4 from Nocardia brasiliensis IFM 0406, which shares 42% sequence identity with PlaT2, has an ESAE motif. In the biosynthesis of brasilicardin A, Bra4 was hypothesized to cyclize (R)-epoxy-GGPP; the similar motifs of Bra4 and PlaT2 suggest similar catalytic mechanisms.92,101 Similarly, TnlT2 from Streptomyces sp. CB03234 was proposed to convert (S)-epoxy-GGPP into a diphosphate-bearing 3S-hydroxyl-dodecahydrophenanthrene scaffold, thus facilitating the production of tiancilactones.92 In general, these DTCs, whose substrate is epoxidized GGPP, share a catalytic motif of (E/D)(T/S)xE, which is distinct from that of the canonical class II TCs.92 However, further biochemical evidence is needed to support the role of this unique catalytic motif in these class II TCs.

Additionally, the recently determined single-domain class II TC MstE also showed a DADT motif at a location similar to that of the catalytic motifs in OSC and the canonical DTCs, suggesting an evolutionary route shared with the TTCs and DTCs.30 However, unlike canonical class II TCs, Pyr4, a type of MTC, lacks a conserved DxDD motif.91 Two residues (E63 and D218) in Pyr4, the first MTC discovered from Aspergillus fumigatus Af293, were proposed to facilitate the protonation of an epoxide.91 Although the marked reduction in activity observed upon mutagenesis of E63 provides evidence for its involvement in catalysis, the exact functions of both E63 and D218 in Pyr4 remain ambiguous. Furthermore, it is uncertain whether this trend applies to other types of MTCs.
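
The catalytic acid motifs discussed in this section can be written as simple sequence patterns, which is how such motifs are typically used as a first-pass indicator of function. The following is a minimal sketch, assuming only the motif strings given above (DxDD, VxDC, DxDT, and (E/D)(T/S)xE); the function name and the toy sequence are hypothetical and do not correspond to any real enzyme. A hit on such a short pattern is at best suggestive, since four-residue motifs also occur by chance.

```python
import re

# Motifs transcribed from the text; "x" (any residue) becomes "." in the regex.
ACID_MOTIFS = {
    "DxDD": r"D.DD",                # canonical class II general acid motif
    "VxDC": r"V.DC",                # OSC-type motif for epoxidized substrates
    "DxDT": r"D.DT",                # MstE-type variant (DADT)
    "(E/D)(T/S)xE": r"[ED][TS].E",  # proposed motif of DTCs acting on epoxy-GGPP
}


def find_acid_motifs(sequence):
    """Return {motif name: [(1-based start position, matched residues), ...]}."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in ACID_MOTIFS.items():
        # A lookahead wrapper lets overlapping occurrences be reported as well.
        found = [(m.start() + 1, m.group(1))
                 for m in re.finditer(f"(?=({pattern}))", seq)]
        if found:
            hits[name] = found
    return hits


# Hypothetical toy sequence used purely for illustration.
toy_sequence = "MKLGADIDDTAWLVESAEGG"
print(find_acid_motifs(toy_sequence))
```

For the toy sequence, the scan reports a DxDD-type hit (DIDD at position 6) and an (E/D)(T/S)xE-type hit (ESAE at position 15); in practice, such hits would need to be checked against structural context and mutagenesis data.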

3.2. Catalytic base

After cyclization occurs, the reaction is commonly terminated by the deprotonation of a carbon adjacent to the ultimate cation. The responsible base has been elusive in many class II TCs as its identity and position vary among enzymes. The catalytic base in OSC has attracted the most attention. Early mutagenesis studies with OSC from Saccharomyces cerevisiae (ScOSC) supported that H234 acted as the final base;102,103 correspondingly, in HsOSC, H232 is the catalytic base.20,56

The final deprotonation step in HsOSC was also investigated using quantum mechanical/molecular mechanical (QM/MM) molecular dynamics simulations. The results showed that the C8-β-H points to the hydroxy group of Y503, which is hydrogen-bonded to H232. This suggests that Y503 helps position H232 and mediates the proton transfer from the last carbocation intermediate to H232. H232 is highly conserved in OSC homologs and plays the role of proton acceptor in catalysis, thereby acting as the general base for deprotonation.56 Alternatively, SHC from A. acidocaldarius might use a polarized water molecule as the catalytic base instead of a residue to quench the reaction (Fig. 2A).104

The catalytic base of class II DTCs was first investigated in the bifunctional DTC AgAS.105,106 H348, which is ligated with Y287 and was originally proposed as the catalytic base, is conserved in homologous gymnosperm TCs and resides across the active site from the general acid (Fig. 5D).105,106 However, substitution of the His with Asp resulted in the formation of 8α-hydroxy-CPP; Y287F can also produce 8α-hydroxy-CPP.106 The conclusion was that the catalytic base is in fact a water molecule, which is coordinated by H348. Disruption of this residue negatively impacts the orientation of the water, resulting in its addition to the copalyl diphosphate cation rather than deprotonation at C7.105

While H348 may function as the catalytic base in closely related bifunctional DTCs found in gymnosperms, including AgAS from A. grandis, this residue is limited to gymnosperms and is not found in other class II DTCs, including the ent-CPS from Picea glauca.35 The catalytic base landscape of class II DTCs involved in gibberellin biosynthesis and specialized metabolism was investigated through a sequence alignment focusing on the catalytic base dyad, which revealed two highly conserved motifs in these enzymes, LHS and PNV (the His and Asn residues constitute the catalytic base dyad).93,107,108 In the case of AtCPS, mutagenesis studies revealed that a water molecule coordinated to His and Asn directs the quenching of the reaction (Fig. 3).35,93 For the purpose of this review, the residues corresponding to the His and Asn positions are designated as the first and second positions, respectively.35,93,109

Sequence alignment and mutagenesis of the fungal CPS/KSs showed a highly conserved His/Thr pair as the catalytic base dyad.93 Furthermore, the plant (−)-clerodienyl diphosphate synthase SdCPS2 from Salvia divinorum exhibited a Phe/Asn pair. Altering Phe to His resulted in the reciprocal effect to H263F in AtCPS, which is also seen in the Y265H mutant of KPP synthase TwTPS14 from Tripterygium wilfordii.94,109,110 Bacterial class II TCs are a diverse group of enzymes with varying catalytic mechanisms, as evidenced by the different products they form (see Section 5.2). The significant variation in terpene skeletons produced by bacterial TCs suggests a corresponding divergence in the makeup and location of their catalytic bases. Some use a water molecule as the catalytic base, while others use unique catalytic base dyads. The ent-CPS PtmT2 employs a His/Asp dyad (Fig. 3).34 The conservation of the His residue in PtmT2 may result from its functional similarity to plant DTCs, despite the overall divergence of bacterial TCs.34,93 Rv3377c possesses the non-basic Phe/Ala pair at the first and second position, though its true catalytic base was determined by mutagenesis to be Y479, which was found proximal to C6 of the final intermediate halima-13-en-5-yl+ (Fig. 3C).89 Tpn2 proves to be an interesting example of the catalytic acid possibly acting as the catalytic base. The location of protonation, C14 of GGPP, ends up being the same carbon for deprotonation, C3 of the final intermediate syn-cleroda-13E-en-4-yl+. Thus, assuming no significant changes in binding orientation between protonation and deprotonation, the catalytic acid D296 likely ends up being the catalytic base.44 The recently characterized SsDMS provides insight into the catalytic base of bacterial class II STCs. The Tyr/Gln pair, which is highly conserved among most putative bacterial class II STCs, is believed to facilitate the catalytic base role of a ligated water molecule. 
This water molecule also forms hydrogen bonds with the backbone carbonyl oxygens of A498 and W393 (Fig. 4F).33 Given that many bacterial TCs involved in terpene production remain unmined, the wide variety of catalytic bases among these enzymes will undoubtedly add further complexity to the bacterial terpene family.

3.3. Metal binding motif

Metal ions, most commonly Mg2+, are integral to the reactions of TCs and PTs.1,70 In canonical class I TCs, the conserved catalytic motifs DDxxD and (N,D)D(L,I,V)x(S,T)xxxE (bold letters denote Mg2+-binding residues) bind to the trinuclear Mg2+ complex to abstract the diphosphate group via an electrophilic driving force.1,111 The class II DTCs and STCs do not abstract the diphosphate, but Mg2+ ions participate in substrate binding, although their absence does not typically render these TCs completely nonfunctional.64 Before the structure of SsDMS was determined, the metal-binding motif remained enigmatic.33

Early experiments examined the role of divalent metal ions in class II TCs. In AtCPS, Mg2+ supported higher enzymatic activity than other divalent ions, including Ni2+, Co2+, Ca2+, and Cu2+;64 similar results were seen with PtmT2 and SsDMS.33,34 Also, increased Mg2+ concentration impaired the reaction rate of AtCPS.64 Three possible Mg2+-binding modes were proposed based on the concentration of Mg2+: 1) a productive mode where Mg2+ binds the diphosphate group and promotes the reaction, 2) an Mg2+ inhibition mode where Mg2+ binds both the diphosphate group and the DxDD motif, hindering the reaction, and 3) a synergistic inhibition mode where a double substrate-ion complex inhibits the reaction.64 The data suggested that at least one ion binds in the active site with either the DxDD motif or the diphosphate group or both.

A bioinformatic prediction, based on its high conservation and the co-occurrence with the DxDD motif, proposed an EDxxD motif as the Mg2+-binding region in plant class II DTCs.29 This highly acidic motif resembled DDxxD in the class I TCs, suggesting the capability to bind Mg2+ ions.29 Mutation of the EDxxD motif in AgAS to the corresponding amides (i.e., Gln and Asn) resulted in a dramatic decrease in activity.29 Subsequent mutagenesis studies in AtCPS showed that the proximal E211 plays an essential role in the catalytic activity of the enzyme, as evidenced by the 500-fold decrease in kcat and only a slight reduction in KM when mutated to Ala. While the exact mechanism is still under investigation, it has been postulated that Mg2+, in conjunction with E211, may stabilize the competent conformation of the substrate in the transition state.31,63 In PtmT2, the EDxxD motif appeared to be replaced with a DxxxxE motif. Mutagenesis confirmed this motif as a key player in the active site, and the loop containing it was proposed to lock the Mg2+ ion(s) in place. It is noteworthy that E133 in PtmT2 corresponds to E211 in AtCPS.34

Recently, the structure of bacterial STC SsDMS revealed the Mg2+-binding region, which consists of two Mg2+ ions coordinated by a single Glu residue, E169, located 5 Å away from the diphosphate group of FPP.33 Subsequent mutagenesis studies demonstrated that enzyme activity was completely abolished by both E169A and E169D changes. This suggested that a well-positioned carboxylate is crucial for substrate binding and catalysis, which is further supported by the observed 2.6 Å shift of the E169 residue (Fig. 4E). Notably, this Glu is highly conserved in other putative bacterial class II STCs.33

3.4. QW motif repeat

QW motifs, also known as QxxDGSWG or the Q-X5-W repeats, are widely distributed in class II TCs. The QW repeat was first discovered in 1994 through a sequence alignment of SHCs from A. acidocaldarius and Zymomonas mobilis, as well as the OSCs from Candida albicans and A. thaliana.112 SHC and OSC share this repeated fingerprint sequence with eight and five QW motifs, respectively.19,20,52,112 The QW motif is composed of Q and W residues that stack to form hydrogen bonds with the N-terminal end of the adjacent outer barrel helix and with the C-terminal end of the preceding outer barrel helix.19 These interactions fortify the α66 barrels and provide additional stability for the exergonic reaction.19 Bioinformatics also showed that the QW motif is present in both bacterial and plant class II DTCs, and this feature was used as the original feature to predict the βγ didomain architecture of these TCs.29 Following the structural determination of AtCPS (plant) and PtmT2 (bacteria), this prediction was confirmed.34,63

Bacterial class II DTCs frequently possess two QW motifs, with one motif found in each of the β and γ domains, indicating the existence of the βγ didomain structure in bacterial class II DTCs.29 There is only one QW motif connecting the α18 and α19 helices in the β domain of SsDMS.33 The single β domain in MstE also presented two QW motifs in the carboxyl termini of outer-Helix 4 and outer-Helix 6, generating additional stability by absorbing the released enthalpy in the product formation.30
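
Because the number of QW motifs differs between enzyme families (eight in SHC, five in OSC, two in bacterial class II DTCs and MstE, one in SsDMS), counting QW-like matches in a sequence is a simple way to use this feature predictively. The sketch below assumes the two forms quoted above, the loose Q-X5-W repeat and the stricter QxxDGSWG consensus; the function name and the toy sequence are hypothetical, and real QW motifs often deviate from either pattern, so counts obtained this way are only approximate.

```python
import re

# Lookahead-wrapped patterns so that overlapping matches are not skipped.
QW_LOOSE = re.compile(r"(?=(Q.{5}W))")     # Q-X5-W, as written in the text
QW_STRICT = re.compile(r"(?=(Q..DGSWG))")  # QxxDGSWG consensus, as written in the text


def qw_sites(sequence, pattern=QW_LOOSE):
    """Return the 1-based start positions of QW-like matches in a protein sequence."""
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence used purely for illustration.
toy_sequence = "AAQGEDGSWGLLQAAAAAWPP"
print("loose matches: ", qw_sites(toy_sequence))
print("strict matches:", qw_sites(toy_sequence, QW_STRICT))
```

The loose pattern is deliberately permissive and will over-count in long sequences; comparing it with the strict pattern gives a feel for how much of the signal depends on the full consensus.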

4. Catalytic mechanisms

The formation of terpenoids involves the generation of highly reactive carbocation intermediates that require precise positioning to create specific stereoselective and regioselective products.1 TCs provide a nonpolar environment that is decorated with aliphatic and aromatic residues to stabilize these carbocation intermediates, thus enabling selective product formation. Aromatic residues play a critical role in the catalytic mechanism of TCs through cation-π interactions.63,104 In this section, we will explore the catalytic mechanism of TCs in detail, focusing on the roles of nonpolar active sites and aromatic residues in enabling precise cyclization.

4.1. Mechanisms of the TTCs

After the determination of the structure of SHC, a ligand-bound structure of SHC was shown to be the molecular template for hopene formation.53 After C3 (squalene numbering) of squalene is protonated by D376, bond formations between C2─C7 and C6─C11 lead to the A and B rings. The formation of hopene can occur through two different routes (Fig. 9A). The first route involves Markovnikov addition via C10─C14 bond formation to yield intermediate 1 followed by ring expansion. The second route involves direct anti-Markovnikov formation of the cyclohexane C ring through C10─C15 bond formation with subsequent C14−C18 bond formation to yield an initial 5-membered D ring of intermediate 2 in Markovnikov fashion (Fig. 9A). The cyclization is incomplete until D ring expansion to form the tetracyclic 6-6-6-6 cationic intermediate and Markovnikov addition to form the E ring.36,113 The general base, which is proposed to be a polarized water molecule, deprotonates C29 of the hopenyl cation to release hopene (path I); a side product hopanol is also formed by water quench at C22 (path II) (Fig. 9A).19

Fig. 9.

Catalytic mechanism of SHC cyclization. (A) The initiation, cyclization, and cation quench of the reaction catalyzed by SHC. The numbering on the squalene backbone is according to squalene numbering; the numbering on the intermediate prior to the products is according to hopene numbering. (B) Active site of SHC. The key residues are presented as green sticks; the squalene analog is shown as magenta sticks; the numbering on the azasqualene (magenta) backbone is according to hopene numbering.

Several aromatic residues are important to guide this cationic cascade. The roles of these residues were examined through shunt products resulting from premature cation quench after mutagenesis of the active site (Fig. 9).104 Initially, the catalytic acid D376, which is hydrogen bonded with D447 and D313 and interacts with H451 to increase its acidity, protonates the terminal alkene of squalene. D377 and the π electrons of Y612 stabilize the first cyclized carbocation at C10 (hopene numbering). The following cation at C8 is stabilized by Y420 as well as F365, which has its π-electron density amplified by Y612 and Y609. In contrast to the canonical cation stabilization function, aromatic residues W312 and W169 may be responsible only for substrate binding, while W489 contributes to both cation stabilization and substrate binding (Fig. 9B). F601 and F605 stabilize the final C13/C17 and C17/C22 carbocations, respectively. Terminated by a polarized water molecule, the cyclization cascade affords the final product. Y495 may reprotonate D376 for the next cyclization (Fig. 9B).104

OSC relies on only a single residue, D455, for initial ionization. This protonation at C2 (2,3-oxidosqualene numbering) is facilitated by two peripheral Cys.20,36,114 As in SHC, many aromatic residues in the active site pocket stabilize the carbocation intermediates via cation-π interactions, including F696 for the C20 protosteryl cation (protosterol numbering) (Fig. 10A).36 The A ring is formed in concert with epoxide ring opening, the B ring is in a pre-boat conformation, and the C ring is closed in a chair conformation through an anti-Markovnikov reaction. Following a series of three 1,2-hydride and two 1,2-methyl shifts to afford the lanosteryl cation, lanosterol is formed by deprotonation of the β-H on C8 by H232 (Fig. 10B).115 The final deprotonation was further validated through MD simulations (Fig. 10B).56

Fig. 10.

Catalytic mechanism of OSC cyclization. (A) Active site of OSC. The key residues are presented in green sticks. The lanosterol is shown in magenta sticks. The numbering on the lanosterol backbone is according to lanosterol numbering. (B) The initiation, cyclization, and deprotonation of OSC. The numbering on the 2,3-oxidosqualene backbone is according to 2,3-oxidosqualene numbering.

4.2. Mechanisms of the DTCs

DTCs employ GGPP as a substrate to generate a diphosphate-bearing product, in contrast to their triterpenoid counterparts (TTCs). This divergence leads to subtle distinctions in their respective catalytic mechanisms, notably the inclusion of charge-compensating residues proximal to the diphosphate group to mitigate its negative charge. Elucidation of the catalytic mechanisms of DTCs was expedited by multiple high-resolution structures of AtCPS and structure-guided site-directed mutagenesis.63

The most crucial role in AtCPS catalysis is played by the general acid D379, whose function is augmented by several interactions with surrounding residues. The mutations T421A and N425A lead to a decrease in kcat, suggesting that T421 plays a role in orienting D379 while N425 facilitates protonation through a hydrogen bond to D379. R340, which hydrogen bonds with D336, is crucial for blocking the solvent channel and mediating proton transfer to D379, as demonstrated by an 850-fold decrease in kcat when substituted with alanine (Fig. 11A).63 Enhanced by these interactions, D379 protonates the C14 of GGPP to initiate the reaction (Fig. 11A). E211 is 5 Å away from the diphosphate group and is supported as a divalent ion binding residue by a 500-fold reduction in kcat and a slight reduction in KM, suggesting that divalent ions are crucial for stabilizing the cyclization-competent conformation in the transition state (Fig. 11A).31,63 Additionally, the two positively charged Lys residues, K463 and K245, are responsible for the stabilization of the diphosphate group by neutralizing its negative charge (Fig. 11A). The cationic intermediates are stabilized by several aromatic residues during cyclization before the C17 (ent-CPP numbering) methyl is deprotonated by a water molecule fully ligated by side chains of N322 and H263, main chain carbonyl group of N322, and main chain amide of K508 ultimately leading to ent-CPP (Fig. 3 and Fig. 11).63 The bacterial DTC PtmT2 follows a similar catalytic mechanism as AtCPS, albeit with some variations in the identity of the specific catalytic acid and base groups utilized.34
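
To put the reported fold changes in kcat on an energetic scale, transition-state theory relates a fold decrease in rate constant to an apparent increase in activation free energy, ΔΔG‡ = RT ln(fold), provided the pre-exponential factor is assumed to be unaffected by the mutation. The sketch below applies this relation to the 850-fold (R340 to Ala) and 500-fold (E211 to Ala) decreases quoted in the text; the temperature of 298.15 K and the function name are assumptions made for illustration and are not taken from the cited work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_fold_decrease(fold, temperature_k=298.15):
    """Apparent increase in activation free energy (kcal/mol) for a fold decrease in kcat.

    Uses ddG = R * T * ln(fold), which follows from transition-state theory
    when the pre-exponential factor is assumed unchanged by the mutation.
    """
    return R_KCAL * temperature_k * math.log(fold)


# Fold decreases in kcat quoted in the text; the temperature is an assumption.
for variant, fold in [("R340A", 850), ("E211A", 500)]:
    ddg = ddg_from_fold_decrease(fold)
    print(f"{variant}: {fold}-fold lower kcat ~ {ddg:.1f} kcal/mol")
```

Under these assumptions, the two substitutions correspond to roughly 4.0 and 3.7 kcal/mol of lost transition-state stabilization, respectively, which is on the order of one or two strong hydrogen bonds.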

Fig. 11.

Catalytic mechanism of AtCPS cyclization. (A) Active site of AtCPS. The key residues are presented in cyan sticks; the GGPP analog is shown in magenta sticks; the diphosphate group is shown in orange stick; the numbering on the GGPP analog backbone is according to GGPP numbering. (B) The initiation, cyclization, and deprotonation of the AtCPS enzyme. The numbering on the ent-CPP backbone is according to ent-CPP numbering.

While most products of class II DTCs share a trans-decalin core, one DTC forms a unique bicyclic skeleton via a ring A contraction. Premutilin synthase from Clitopilus passeckerianus (CpPS) is a bifunctional DTC, consisting of canonical class I motifs and a DxDD variation (DxD311M) in its class II domain, that forms a "propellane-like" tricyclic structure in the biosynthesis of pleuromutilin.116,117 Through mutagenesis experiments, D311 and D649 were identified as catalytically essential in the class II and class I TC domains, respectively.118 A D649L variant revealed the presence of an intermediate, mutildienyl-PP (MPP), that has a 5/6-bicyclic skeleton (Fig. 12). The currently proposed mechanism, supported by both mutation and computational data, is that the bicyclization of GGPP results in the formation of halima-13E-en-15-PP-5-yl+, which undergoes a series of 1,2-methyl and 1,2-hydride shifts prior to A ring contraction and deprotonation to produce MPP. Subsequent transformations of MPP within the class I TC domain of CpPS, including further cyclization, a 1,5-hydride shift, and water addition, lead to premutilin.118

Fig. 12.

Catalytic mechanism of CpPS-mediated cyclization. The bifunctional CpPS begins the reaction by protonating GGPP, which is followed by an A-ring contraction with MPP serving as the intermediate.

4.3. Mechanisms of the STC

A comprehensive and dynamic mechanism for class II TCs was proposed based on the crystal structures of both apo-SsDMS and its complexes.33 Notably, this mechanism incorporates the first definitive evidence of the Mg2+-binding site, offering a complete catalytic model for this important enzyme class. The active site of the STC SsDMS is smaller than those of the DTCs AtCPS and PtmT2, which is consistent with the differences in substrate chain length. Despite its smaller size, the active site of SsDMS is highly functional and contains residues that are crucial for catalysis. Before FPP binds, the enzyme is in an open form with W165 and L166 positioned in the active site and R132 rotated away from the active site. As FPP and the two Mg2+ ions bind, W165 and L166 move away to create space for FPP, E169 coordinates two Mg2+ ions to anchor the diphosphate moiety, and R132 rotates to function as the diphosphate sensor. The positively charged K133 and R501 also contribute to substrate binding by stabilizing the diphosphate group. FPP is oriented with distances of 3.9 and 3.3 Å between C2─C7 (FPP numbering) and C6─C11, respectively, providing a catalytically competent conformation for cyclization (Fig. 13A). To initiate the reaction, the general acid D303, located 2.3 Å away from C10 of FPP, acts as a proton donor for protonation at C10, leading to the formation of a farnesyl cationic intermediate (Fig. 13A). This intermediate is stabilized by several aromatic residues through cation-π interactions and is cyclized into a trans-decalin bicycle (Fig. 13A). Finally, deprotonation occurs at C7 (DPP numbering) to form DPP, presumably by a water molecule that is positioned 4.4 Å away from C7. This water molecule is fully ligated with the side chains of Y505 and Q497 as well as the backbone carbonyl oxygens of A498 and W393. 
This binding mode is also observed in AtCPS, suggesting that a fully ligated state of a water molecule ensures that it performs final deprotonation instead of quenching the carbocation with the addition of water (Fig. 13B).33,63

Fig. 13.

Catalytic mechanism of SsDMS. (A) Precyclization conformation of FPP and the relative location of the catalytic acid group in the D303E mutant of SsDMS. The numbering on the FPP backbone is according to FPP numbering; the Mg2+ ions are shown as green spheres. (B) Proposed mechanism of SsDMS. The numbering on the DPP analog backbone is according to DPP numbering.

4.4. Atypical TTCs

Onoceroids, a category of triterpenes, include the onocerane, serratane, ambrane, and colysane skeletons. The first onoceroid synthase (OnS), BmeTC, was identified from Bacillus megaterium. Initial in vivo experiments suggested that the function of BmeTC mirrored that of SHC, leading to 8α-hydroxypolypoda-13,17,21-triene from squalene.13,119 In further experiments, squalene was incubated with Escherichia coli cell-free extract containing overexpressed BmeTC. This resulted in the production of symmetrical onoceranoxide and asymmetric onoceranol (no defined name in original literature) with the labdane-like skeleton at a ratio of 64:36.120

This unique double cyclization was proposed to be a subsequent reaction following the transformation of squalene into 8α-hydroxypolypoda-13,17,21-triene. BmeTC initiates cyclization through protonation at the terminal isopropylidene moiety. Ultimately, this process is terminated by either nucleophilic addition of a water molecule (path a yielding onoceranoxide) or via deprotonation by H26 (path b affording the onoceranol) at the C8 carbocation. It is suggested that BmeTC positions the substrate in the active site again after the initial cyclized product (8α-hydroxypolypoda-13,17,21-triene) is released. Interestingly, BmeTC can also cyclize 3-deoxyachilleol A produced by AaSHCD377C,99 resulting in the synthesis of (+)-ambrein, a precursor to a major component of ambergris, which traditionally requires 19–35 steps of chemical synthesis (Fig. 14A).120

Fig. 14.

Catalytic mechanism of bacterial, plant, and fungal OnSs. (A) Cyclization process of bacterial BmeTC to yield the onoceranoxide, onoceranol and (+)-ambrein. (B) Plant OnSs LCD and LCE function as onocerin synthase and serratane synthase, respectively, in the biosynthesis of tohogenol, serratenediol, and α-onocerin. (C) Biosynthesis of homomonoceroid A, alliaonoceroid A, and fumionoceroid C. Fungal OnSs (FumiS1, HomoS, and AlliS) and Pyr4-like cyclases (FumiB, HomoB, and AlliB) were identified coexisting in the same genome for the first time.

An alternate substrate, (3S,22S)-2,3,22,23-dioxidosqualene, can be used to form serratane-type onoceroids, as evidenced by the discovery of LCE in Lycopodium clavatum.121 In the α-onocerin biosynthesis route, LCC from L. clavatum, an OSC-like dioxidosqualene cyclase, was identified to convert (3S,22S)-2,3,22,23-dioxidosqualene into pre-α-onocerin.122 This compound is subsequently transformed into α-onocerin by LCD. Interestingly, another OnS, LCE, with moderate sequence identity (58%) to LCD and the hallmark DCTAE and QW motifs characteristic of OSCs, was later discovered from the genome.55 LCE was found to cyclize pre-α-onocerin into the serratane-type onoceroids tohogenol and serratenediol (Fig. 14B).

The initial reaction of LCE with pre-α-onocerin begins with a proton attack on the epoxide moiety. This is followed by bicyclization and formation of a C8 cation. The following cyclization, involving the neighbouring C14─C27 exo-methylene, bridges two decalin structures, constructing a seven-membered C-ring and producing a pentacyclic C14 carbocation intermediate (Fig. 14B). This intermediate can either undergo nucleophilic addition of water, resulting in tohogenol, or undergo deprotonation by H15 to produce serratenediol. Considering these findings, the active site of LCE may possess an expanded space around the C─D rings of the substrate. This space could potentially accommodate a solvent water molecule, leading to product variations between LCD (onocerin synthase) and LCE (serratane synthase).121

A recently developed genome mining strategy, primarily focusing on the Pyr4 homologues, unveiled fungal genomes harbouring both a Pyr4-like protein and an OnS. This coexistence, previously unobserved, suggests a two-round cyclization process in fungal onoceroid biosynthesis. Three selected biosynthetic gene clusters (BGCs) from Aspergillus homomorphus CBS 101889, Aspergillus fumigatus A1163 (CBS 144.89), and Aspergillus alliaceus CBS 536.65, named the homo, fumi, and alli clusters, respectively, were scrutinized.123 This led to a clearer understanding of the fungal onoceroid biosynthesis pathway.

All three BGCs contain characteristic core genes encoding the Pyr4-like proteins (AlliB, FumiB, and HomoB), triterpene synthases (AlliS, FumiS1, and HomoS), and the FAD-dependent monooxygenases (AlliM, FumiM, and HomoM). Initially, HomoS and FumiS1 convert squalene into α-polypodatetraene. This precursor is then epoxidized by HomoM/FumiM to form the OnS intermediate 1. Subsequently, the pathways diverge under the influence of HomoB and FumiB: HomoB employs water quenching to form homomonoceroid A, while FumiB utilizes a combination of a 1,2-hydride shift, a 1,2-methyl shift, and deprotonation at C19 to generate fumionoceroid A, which is later oxidized by FumiP to produce fumionoceroid C. In another pathway, AlliS transforms squalene into 8α-hydroxypolypoda-13,17,21-triene. Following its epoxidation by AlliM, OnS intermediate 2 is transformed by AlliB, resulting in alliaonoceroid A (Fig. 14C).

4.5. Mechanisms of the MTCs

Meroterpenoids, natural products composed of a terpenoid and a non-terpenoid moiety, feature complex structural architectures and are known to be selective inhibitors of several biological targets and drug leads.21,124-127 MTCs are distinguished from canonical class II TCs by their noncanonical sequences, which lack the conserved DxDD motif region. Similar to TTCs, MTCs do not rely on Mg2+ ions for catalysis due to the absence of a diphosphate group in their substrates.

Meroterpenoid biosynthesis generally follows a specific order. First, the terpenoid moiety is transferred to a non-terpenoid moiety, commonly a shikimate-derived moiety, polyketide system, or indole ring.21,126 The acyclic terpenoid moiety then often, but not always, undergoes epoxidation and is subsequently cyclized by the MTCs in an OSC-like catalytic manner. The variations in cyclases, substrates, and location of initial protonation all shape the early steps of meroterpenoid biosynthesis.

4.5.1. 4-Hydroxy-6-(3-pyridinyl)-2H-pyran-2-one (HPPO)-derived MTCs

The nicotinic acid-derived polyketide 4-hydroxy-6-(3-pyridinyl)-2H-pyran-2-one (HPPO) can be transformed into either farnesyl-HPPO or geranylgeranyl-HPPO by PTs.91,128 The newly added isoprenoid tails are commonly cyclized by class II-like TCs, known as integral membrane cyclases, to diversify the structures of these meroterpenoids. While these MTCs are not as comprehensively understood as the TCs described above, Pyr4 and its homolog OlcD are the best studied.91,128 Despite sharing 56% sequence similarity, Pyr4 and OlcD display distinct substrate specificities and product outcomes.128 Pyr4 accepts epoxyfarnesyl-HPPO and forms three additional 6-membered rings that are fused to the pyranone in deacetyl pyripyropene E. OlcD, on the other hand, selectively catalyzes the cyclization of epoxygeranylgeranyl-HPPO to form predecaturin E, which also contains a 6/6/6 tricycle but one that is not fused to the pyranone moiety (Fig. 15).91,128 MTCs differ from canonical class II TCs in that they currently lack a clearly defined catalytic acid. Previous studies indicated that when the conserved acidic residues E63 and D218 of Pyr4 are mutated, the cyclase activity is lost, supporting that both residues are crucial for catalysis and that one of them likely initiates the reaction by protonating the substrate. The hydrogen bond that putatively forms between them could further enhance the acidity of the catalytic acid.91 However, structural data for these integral membrane cyclases are needed to ascertain key sequence and structural motifs.91

Fig. 15.

Pathway towards HPPO-derived meroterpenoids (pyripyropene A and deoxyoxalicine A) catalyzed by Pyr4 and OlcD.

4.5.2. 3,5-Dimethylorsellinic acid (DMOA)-derived MTCs

In contrast to Pyr4 and OlcD, which possess distinct substrates, the Pyr4-like MTCs AusL, Trt1, AdrI, PrhH, InsB2, and InsA7 share a common substrate, 3,5-dimethylorsellinic acid (DMOA). However, AusL, Trt1, AdrI, and PrhH demonstrate diversity in their final deprotonation. After the 6-6-6-6 cationic intermediate is formed, the biosynthetic route diverges to form three different hydrocarbon skeletons. In route a towards the terretonins, Trt1 affords preterretonin A, a 6-6-6-5 tetracyclic system with an exocyclic double bond.129 In route b, AdrI forms the 6-6-6-5 tetracyclic system containing a cyclohexene moiety in andrastin E.130 In route c, simple deprotonation by AusL and PrhH results in the preservation of the 6-6-6-6 tetracycle as protoaustinoid A (Fig. 16A).131,132

Fig. 16.

(A) Pathway towards DMOA-derived meroterpenoids, and related depiction of the MTC reactions catalyzed by Trt1, AdrI, AusL/PrhH, InsB2 and InsA7. (B) Catalytic mechanism for the Pyr4-like cyclases with AdrI and InsA7 as representatives.

Interestingly, despite being derived from the same organism, Aspergillus insuetus, InsB2 and InsA7 exhibit dissimilar modes of cyclization. While InsB2 shares 40% sequence identity with AdrI, it nevertheless adopts an all-chair conformation for epoxy-farnesyl-DMOA comparable to that of the aforementioned enzymes, leading to the synthesis of insuetusin B1. InsA7 mediates a rare cyclization reaction whereby the substrate epoxy-farnesyl-DMOA adopts an unusual pre-boat-chair conformation, leading to the formation of insuetusin A1, a [3.3.1] bicyclic system with an axial C3 hydroxy group. In contrast to certain SHCs that use substrates like (3R)-oxidosqualene to yield axially hydroxylated cyclized products in unnatural reactions, InsA7 represents a TC that adopts a pre-boat-chair conformation to process an epoxyprenyl substrate as its natural reaction, generating a cyclized product with an axial hydroxy group (Fig. 16A).133,134

Recent research has provided insights into the catalytic mechanisms of InsA7 and AdrI, thereby solving a longstanding question in MTCs, i.e., which residue initiates the reaction.135 Despite utilizing the identical DMOA-derived substrate, InsA7 and AdrI generate products (insuetusin A1 and andrastin E) with opposite stereochemistries in their terpenoid moieties, except at the C3 hydroxy group, which is the same in both (Fig. 16B). To elucidate the inherent mechanism of these membrane-bound enzymes, AlphaFold2 was used to obtain structural models. In the model of AdrI, residues E63 and D218, which are crucial in Pyr4, are spatially separated.91 While E63 is highly conserved in the Pyr4-family cyclases and proximal to the C3-hydroxy group of andrastin E in the docking model, it does not act as the catalytic acid, as evidenced by the retained activity of the E63D and E63Q variants. Also, D218 forms a hydrogen bond with R177, emphasizing its role in upholding the protein structure. Instead, D59 of AdrI was identified as the acid responsible for initiating the reaction, with H88 enhancing its acidity. Conversely, in InsA7, E63 seems to take the lead in catalysis, hydrogen bonded to the neighbouring residue Y36. This distinct mechanism might be attributed to the phylogenetically distinct positions of InsA7 and AdrI (Fig. 1). Additionally, the stereocourse of the cyclization by InsA7 resembles that of enzymes such as Pyr4, which accept substrates with S-configured epoxides. In contrast, AdrI and its relatives accept substrates with R-configured epoxides. The role of the aromatic residues W32 and W62 in substrate recognition further deepens the understanding of these mechanisms (Fig. 16B).

4.5.3. Indole-derived MDTCs

Fungal indole meroditerpenoids comprise a multifaceted and variegated natural product family. The substrate commonly employed by indole-derived MDTCs is 3-geranylgeranyl-indole (GGI). Remarkably, indole-derived MDTCs utilize a unique catalytic mechanism to achieve a broad spectrum of hydrocarbon skeletons.

PaxB represents a paradigmatic example of an enzyme that can engage in iterative cyclizations if there is an oxidase available to act after the primary cyclization has concluded. The cyclization process of paspaline involves two epoxidations and four cyclizations, achieved through iterative utilization of the epoxidase PaxM and PaxB. This internal epoxidation and the internal epoxide opening are rarely seen in other class II TCs (Fig. 17).136

Fig. 17.

Pathway towards indole-derived meroterpenoids, and related depiction of the MTC reactions catalyzed by PaxB, AtS5B1, AtS2B, AfB, DesB and EstB1.

GGI, which is formed by the geranylgeranylation of 1-(3-indolyl)-glycerol-3-phosphate, undergoes epoxidation at the C10─C11 olefin by PaxM. The oxide ring is then protonated by PaxB, leading to the formation of a tertiary cation (IDT intermediate) that undergoes cyclization. Another cyclization occurs through anti-Markovnikov addition to the C2─C3 alkene of the IDT intermediate, followed by a Wagner-Meerwein rearrangement, yielding emindole SB. After a second epoxidation of the terminal alkene, PaxB predominantly catalyzes another cyclization reaction producing paspaline (Fig. 17).136

Similarly, DesB and EstB1, originating from Aspergillus desertorum and Aspergillus striatus, respectively, catalyze an unusual Markovnikov-like cyclization cascade on the IDT intermediate, resulting in emindole DB and emindole SA with exceptional regio- and stereospecificities.137 Specifically, through a mechanism targeting the Re face of the C2 olefin in the IDT intermediate, DesB produces emindole DA. In contrast, EstB1, which targets the Si face, leads to the formation of emindole SA. Meanwhile, the biosynthesis of emindole DB mirrors that of paspaline, involving another sequential epoxidation and cyclization, catalyzed by DesM (PaxM homologue) and DesB (Fig. 17).

AtS2B, AtS5B1, and AfB each accept epoxy-GGI as a substrate, but they differ in their catalytic processes.138 Specifically, AtS5B1 directs a 1,2-methyl migration at C18 to the C3 carbocation, followed by deprotonation at C8, resulting in an intermediate that deviates from the PaxB mechanism, which then undergoes epoxidation by AtmM and another cyclization by AtS5B1, yielding emindole (Fig. 17). Conversely, AtS2B utilizes a cyclization mechanism akin to PaxB for epoxy-GGI, yet deviates in the subsequent steps. After a sequence of 1,2-methyl and 1,2-hydride shifts, the cationic intermediate is terminated directly, producing anominine (path a). In contrast, 10,23-dihydro-24,25-dehydroaflavinine is created via a preceding 1,3-hydride shift and subsequent double bond attack (path b) (Fig. 17). Unlike AtS2B, AfB mediates an extra 1,2-hydride shift to yield the C15 cation, which is deprotonated at C14, resulting in the formation of aflavinine. The absence of structural data pertaining to indole-derived MDTCs poses significant challenges to a comprehensive understanding of their ability to control carbocation intermediates and mediate cyclization cascades. It is noteworthy that PaxB and AtS5B1 cluster in the same branch of the phylogenetic tree (Fig. 1), implying similar product selectivity in the formation of emindole-type products. In contrast, the aflavinine skeletons are produced by AtS2B and AfB, which also cluster together in one branch (Fig. 1).

4.5.4. Merosesquiterpene cyclases (MSTCs) and MDTC

There are a few known exceptions to the general meroterpene biosynthetic order of epoxidation, epoxide opening, and cyclization. Among them, MacJ from Penicillium terrestris LM2 produces the immunosuppressive macrophorins and DmtA1 from Streptomyces youssoufiensis OUC68199 produces the cytotoxic drimentine G. Both MacJ and DmtA1 perform cyclization on substrates without epoxides, as do the canonical class II TCs (Fig. 18A and 18B).139,140

Fig. 18.

Pathway towards macrophorins A and drimentine G that are catalyzed by MSTCs of MacJ (A), DmtA1 (B), and TylF (C), respectively.

Although the cyclization mechanism of MacJ appears similar to that of ent-CPS and SHC, its sequence does not bear any resemblance to those of class II TCs; however, it shares 30% identity with Pyr4. A potent Brønsted acid is required for the protonation of the olefin. Site-directed mutagenesis demonstrated that E72, D96, and D229 are indispensable for cyclization, with E72 and D229 corresponding to E63 and D218 in Pyr4.91 The latest research on AdrI and InsA7 has provided a fresh perspective.135 It suggests that the residue corresponding to D218 in Pyr4 may not be crucial for cyclization. A single Asp is typically unable to protonate the alkene directly. Drawing from this insight, D96 and the adjacent E72 may assume a role similar to the DxDD motif found in canonical class II TCs, potentially protonating the terminal olefin directly.98,141 Although DmtA1 is an indole-derived MTC (only 19% identity with Pyr4), it also initiates polyene cyclization by protonation of an alkene, leading to the formation of the drimane unit (Fig. 18B). Its dyad E60/D214 is analogous to that observed in MacJ and Pyr4.91 Similarly, taking into account the aforementioned considerations on MacJ and the modeled structure of DmtA1, mutation experiments revealed a uniquely conserved and important residue, D94.141 This residue may mirror the role of D96 in MacJ. Combined with the previous results, D94 and E60 in DmtA1 may cooperate to facilitate protonation.

Recently, a distinct MDTC, TylF, from Brasilonema sp. HT-58-2, was found to cyclize 3-geranylgeranyl-4-hydroxybenzoic acid, ultimately leading to the production of tolypodiol (Fig. 18C).142 Although TylF shares a similar biological origin with the MDTC MstE and both substrates possess similar non-terpene moieties, specifically p-hydroxybenzoic acid and p-hydroxybenzoate/hydroquinone, their sequence identity is notably low.76 Instead, TylF shows greater resemblance to the fungal MTCs, e.g., Trt1, whose substrate is DMOA.129

Based on the proposed mechanism of Trt1-like enzymes, the catalytic acid in TylF was not found to correspond to D59 in AdrI; instead it aligned with E51, mirroring E63 in InsA7.135 This Glu is highly conserved in enzymes analogous to TylF and is present in the well-conserved sequence L44SANIAWEFLF54. Subsequent MD simulations indicated significant roles played by R62 and R126. Upon substrate entry, R62 and R126 at the top of the cavity define the boundary of the active site. During the MD simulations, the distance between R62 and the carboxylate of the substrate fluctuated between 2.95 and 11.87 Å; for R126, this range was between 3.17 and 11.11 Å. This suggests potential hydrogen bonding between these two residues and the substrate during catalysis. After cyclization, R62 and R126 no longer associate closely with the carboxylic acid, allowing the top of the protein to open and provide an exit route for the cyclized product from the presumed active site cavity.
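The residue numbering of the conserved motif quoted above can be checked with a few lines of Python. This is only an illustrative sketch: it assumes that the flanking numbers in L44SANIAWEFLF54 are the sequence positions of the first and last motif residues, and the helper name `residue_positions` is hypothetical, not taken from the cited work.

```python
# Motif as written in the text; the flanking numbers (44 and 54) are assumed
# to be the sequence positions of its first and last residues.
MOTIF = "LSANIAWEFLF"
MOTIF_START = 44  # position of the first motif residue (L44)


def residue_positions(motif: str, start: int, residue: str) -> list[int]:
    """Return the 1-based sequence positions at which `residue` occurs in `motif`."""
    return [start + offset for offset, aa in enumerate(motif) if aa == residue]


# Position of the last motif residue, derived from the start and the length.
motif_end = MOTIF_START + len(MOTIF) - 1
# Position(s) of glutamate within the motif.
glu_positions = residue_positions(MOTIF, MOTIF_START, "E")

print(motif_end)      # 54
print(glu_positions)  # [51]
```

Under that assumption, the only Glu in the motif falls at position 51, in agreement with the assignment of E51 as the putative catalytic acid of TylF.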

4.5.5. Atypical MTCs

The discovery of two MTCs, AscF and AscI, in Acremonium egyptiacum F-1392, has illuminated novel enzymatic processes for producing ascochlorin and ascofuranone.143 This discovery is intriguing as it contradicts the prevailing consensus that most Pyr4-like MTCs conventionally yield tetracyclic ring systems (Fig. 19A).21

Fig. 19.

Catalytic mechanism of AscF, AscI, CtvD and AtnI/NtnI. (A) Cyclization catalyzed by AscF and AscI. (B) Cyclization catalyzed by CtvD. (C) Cyclization catalyzed by AtnI/NtnI.

AscF plays a pivotal role in transforming ilicicolin A epoxide into ilicicolin C. This transformation begins with protonation of the terminal epoxide of ilicicolin A epoxide, followed by cyclization, a series of 1,2-hydride shifts, and one 1,2-methyl shift. The process concludes with deprotonation of the alcohol, leading to the creation of the (14S,15R,19R)-trimethylcyclohexanone ring structure of ilicicolin C. Although it yields a different ring system, AscF retains the seven transmembrane helices and conserved motifs that are characteristic of the Pyr4-like family (Fig. 19A).

AscI presents a contrasting structure and function, sharing no sequence homology with Pyr4 and possessing an eight-transmembrane helical structure.143 AscI acts upon the P450-catalyzed (AscH) hydroxylated product of ilicicolin A epoxide. AscI guides the attack from the hydroxyl group to the terminal epoxide resulting in ascofuranol. It is worth noting that CtvD, a hydrolase from Aspergillus terreus var. aureus, shares a notable 52% sequence similarity with AscI (Fig. 19A). This enzyme plays a pivotal role in citreoviridin biosynthesis. Specifically, CtvD protonates the product formed by CtvA, CtvB, and CtvC, subsequently attacking another epoxide. This sequence of reactions concludes with a water attack, resulting in the formation of the pentacyclic moiety characteristic of citreoviridin (Fig. 19B).144

AtnI/NtnI, key for arthripenoid C biosynthesis in Arthrinium sp. NF2194 and Nectria sp. Z14-w, also lack sequence homology with the Pyr4-like protein family.145 In fact, AtnI has 43% sequence identity with fungal OSC.146 AtnI/NtnI catalyze a two-step reaction involving epoxide opening and cyclization of the product derived from AtnH/NtnH-AtnA/NtnA (Fig. 19C).

5. Engineering in class II TCs

With the elucidation of the intricate catalytic mechanisms of class II TCs, the prospect of engineering these enzymes is now within reach. TCs provide several possibilities for leveraging substrate promiscuity, therefore offering potential utilities for organic synthesis. As with class I TCs, mutation of the hydrophobic/aromatic residues in and around the active site may yield premature products or alter the cyclization routes to generate novel skeletons. The final deprotonation step is governed either by a water molecule in the active site, sometimes ligated to one or two amino acids, or directly by a specific amino acid, which makes these catalytic bases attractive targets for the engineering of class II TCs. Recent studies have shown that strategic modification of the catalytic bases can divert the original function of the enzyme to yield alternate reaction products, augmenting the potential utility of class II TCs. Correspondingly, these studies have led to further appreciation of the cyclization mechanisms.

5.1. TTCs engineering

SHC employs a restricted active site that comprises several aromatic residues to generate a cationic enclosure that directs the acyclic substrate and the resultant carbocation intermediates to a hopene skeleton. SHC mostly acts on acyclic terpene substrates, accepting not only squalene but also shorter acyclic substrates including geranylgeraniol and farnesol (Fig. 20A).147 The conversion of linear acyclic alcohols such as farnesol and geranylgeraniol by SHC facilitates the engineering of selectivity towards (−)-isopulegol from its linear precursor (R)-citronellal. Chemically, this reaction can be completed via an acid-catalyzed Prins reaction, a carbon-carbon bond-forming reaction that involves the condensation of olefins with aldehydes in the presence of a Lewis or Brønsted acid.148 However, this reaction is not highly stereoselective. SHC was found to accept non-natural substrates and convert them into important polycyclic molecules, such as the formation of isopulegol from citronellal.147,149-151 Native AaSHC showed low conversion (<2%) to isopulegol, but two variants (F486C and W555Y) were reported to increase isopulegol formation to over 50%, albeit with low diastereoselectivity.150 Active site residues within 15 Å of a docked model with citronellal were targeted to increase diastereoselectivity. Impressively, I261A showed >99% de for the formation of (−)-iso-isopulegol from (S)-citronellal with 11% conversion.152,153 To further improve the selectivity towards other isopulegol isomeric products from citronellal by AaSHC, single and multiple residues were examined. An active site mutant library containing 114 mutants was created for screening. Finally, the triple mutant A419G/Y420C/G600A increased product conversion by 10-fold and showed a high selectivity (9:1) of (−)-isopulegol to (+)-neo-isopulegol from (R)-citronellal. Another triple mutant of AaSHC, F365A/Y420W/G600F, further enhanced selectivity without a decrease in conversion, showing >99% de for cyclizing (R)-citronellal into (+)-neo-isopulegol (Fig. 20B).154 Additionally, AaSHC demonstrates the capability to convert homofarnesol into (−)-ambroxide, a valuable ingredient in various perfumes and fragrances, albeit with low catalytic performance (total turnover number <10³).155 A recent study revealed that the M132R/W169G/I432T/G600M variant of AaSHC displayed a turnover number in the 10⁵ range and high stereoselectivity of 99% ee and 99% de (Fig. 20C).156 Molecular dynamics simulations and tunnel analyses elucidated that W169G generates sufficient space for the G600M side-chain, while M132R appears to interact with F434, causing the loop encompassing F437 to flip. This, in turn, affords greater flexibility to M600, enabling homofarnesol to adopt a more favorable pre-folded conformation.156
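To put the selectivity figures quoted above on a common scale, the following minimal sketch converts between a product ratio and a percent excess (de or ee) using the standard two-component definition, excess = 100 × (major − minor)/(major + minor). The function names are illustrative only, and the calculation assumes that just two stereoisomers are formed, which may not hold for every variant discussed here.

```python
def excess_from_ratio(major: float, minor: float) -> float:
    """Percent excess (de or ee) of the major isomer in a two-component mixture."""
    total = major + minor
    if total <= 0:
        raise ValueError("ratio components must sum to a positive value")
    return 100.0 * (major - minor) / total


def ratio_from_excess(excess_percent: float) -> float:
    """Major:minor ratio (expressed as major/minor) for a percent excess below 100."""
    if not 0 <= excess_percent < 100:
        raise ValueError("excess must be in the range [0, 100)")
    return (100.0 + excess_percent) / (100.0 - excess_percent)


print(excess_from_ratio(9, 1))  # 80.0
print(ratio_from_excess(99))    # 199.0
```

On this scale, a 9:1 product ratio corresponds to 80% de, whereas 99% de corresponds to a ratio of 199:1, which illustrates how much more selective the >99% de variants are than the 9:1 variant.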

Fig. 20.

Engineering of SHC. (A) Native SHC exhibits substrate promiscuity and cyclizes isoprenoid alcohols. (B) AaSHC variants demonstrate their capacities to convert (R)-citronellal into either (−)-isopulegol or (+)-neo-isopulegol through a Prins reaction. (C) An AaSHC mutant exhibits great capacity to transform homofarnesol into (−)-ambroxide with high stereoselectivity. (D) TelSHC variants demonstrated their capacities to catalyze semipinacol rearrangements with high yield and enantioselectivity.

To augment the substrate range of SHC, TelSHC, a homolog of AaSHC from Thermosynechococcus vestitus, was identified from 450 SHC variants including several SHC homologues and AaSHC mutants.157 TelSHC facilitated the conversion of allylic alcohols into spiro-containing products through a semipinacol rearrangement mechanism (Fig. 20D). Site-saturation mutagenesis showed that the Y618L/G609F double mutant of TelSHC produced a significantly greater turnover number of 4534 when compared to wild-type AaSHC (Fig. 20D). Moreover, this variant displayed enhanced enantioselectivity by virtue of its efficient semipinacol rearrangement. The Y618L/G609F variant altered the active site dimensions by adding more steric bulk at position 609 and broadening the width of the pocket to 12.4 Å. Most notably, it caused the substrates to flip, resulting in a productive protonation distance of 4.0 Å that was not apparent in the native enzyme.157

β-Amyrins are 6-6-6-6-6 pentacyclic products of 2,3-oxidosqualene that are found in the root tips of oats (Avena genus) and provide protection against soil-borne pathogens.158,159 The enzyme responsible for the initial committed step, β-amyrin synthase (SAD1), was discovered and engineered to form the 6-6-6-5 tetracyclic steroid skeleton by substituting S728 with Phe.160 When the S728F variant was incorporated into the root tips of Avena strigosa, the new triterpenes dammarenediol-II (DM) and epoxydammara-3,25-diol (epDM) were discovered. When expressed in yeast, epDM was found as the major product, with DM and β-amyrin seen as minor products, suggesting that dioxidosqualene (DOS) is preferred over the native substrate 2,3-oxidosqualene (OS) in yeast (Fig. 21).160 Homology modelling of SAD1 suggests that a loop containing I554, T260, and Y264 is close to S728. This loop is thought to obstruct the entryway to the substrate access channel and is expected to undergo a conformational change to facilitate substrate ingress to the active site. The S728F substitution alters the hydrogen-bond network in this loop, thus regulating substrate access.160 For additional engineering of plant OSCs, readers can refer to the comprehensive review by Min et al.55

Fig. 21.

Engineering of SAD1, an OSC enzyme from Avena strigosa. SAD1 catalyzes β-amyrin formation from 2,3-oxidosqualene. When expressed in Avena, SAD1 S728F created the tetracyclic dammarenediol-II; when expressed in yeast, the pentacyclic epoxydammara-3,25-diol was formed.

5.2. DTCs engineering

Most recent engineering studies of class II DTCs have centered on altering the location and identity of the general base. Such changes modify when and how the enzymes quench the carbocation, ultimately modifying the product skeleton or resulting in cyclized alcohols.

5.2.1. Engineering CPS

In AgAS, H348 is appropriately positioned across the substrate binding cavity from the catalytic acid. This His, ligated with the adjacent Y287, is highly conserved in homologous DTCs and was initially regarded as the catalytic base.105,106 Surprisingly, the H348A/D621A mutant of AgAS (the D621A mutation abolishes the activity of the class I TC domain) afforded 8α-hydroxy-CPP. This hydroxylation product was also reported from the homologous bifunctional TC from Abies balsamea when Asp replaced the His at this position.161 Therefore, H348 in AgAS was similarly altered to Asp. This H348D/D621A double mutant produced a mixture of 8α-hydroxy-CPP, CPP, and labda-7Z,13E-dienyl diphosphate.105 Based on the product change, a water molecule was proposed to be positioned by H348, mutation of which to Asp repositioned the water for nucleophilic attack rather than deprotonation (Fig. 22).
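Variant labels such as H348D/D621A follow the common convention of wild-type residue, position, and replacement residue, with multiple substitutions joined by slashes. The sketch below shows one way such labels might be parsed and applied to a sequence when tabulating engineering results. The names `Substitution`, `parse_variant`, and `apply_variant` are hypothetical, and the four-residue test peptide is a toy string, not an enzyme sequence.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the original residue
    position: int   # 1-based residue number
    mutant: str     # one-letter code of the replacement residue


_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(label: str) -> list[Substitution]:
    """Split a label such as 'H348D/D621A' into its individual substitutions."""
    substitutions = []
    for token in label.split("/"):
        match = _PATTERN.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        wild_type, position, mutant = match.groups()
        substitutions.append(Substitution(wild_type, int(position), mutant))
    return substitutions


def apply_variant(sequence: str, label: str) -> str:
    """Apply the substitutions to a sequence, checking each wild-type residue."""
    residues = list(sequence)
    for sub in parse_variant(label):
        if not 1 <= sub.position <= len(residues):
            raise ValueError(f"position {sub.position} is outside the sequence")
        if residues[sub.position - 1] != sub.wild_type:
            raise ValueError(f"expected {sub.wild_type} at position {sub.position}")
        residues[sub.position - 1] = sub.mutant
    return "".join(residues)


print(parse_variant("H348D/D621A"))
toy_peptide = "MAHD"  # hypothetical toy sequence for illustration only
print(apply_variant(toy_peptide, "H3A/D4N"))  # MAAN
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors between homologs. Shorthand labels that list alternative replacements at one position, such as H263F/Y, are deliberately rejected by this parser and would need to be expanded first.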

Fig. 22.

Engineering of AgAS. Water shows three alternative routes to yield distinct products including deprotonation of C17 or C7 to yield CPP or labda-7Z,13E-dienyl diphosphate, respectively, or nucleophilic attack at C8 to give 8α-hydroxy-CPP.

This success in product alteration of a class II TC was further validated in AtCPS. Investigation of AtCPS was prompted by the detection of the well-conserved His at position 263 in known ent-CPSs.35 An Ala substitution at this position produced 8β-hydroxy-ent-CPP.93 However, an analogous Asp-to-His substitution in AgAS failed to divert product formation in the same way. The N322A variant of AtCPS showed the same product profile as that of H263A (Fig. 23).

Fig. 23.

Engineering of AtCPS. Native AtCPS yields ent-CPP. Ala substitutions at H263 and N322 allow the introduction of water to C8 from one side, thereby producing 8β-hydroxy-ent-CPP. Tyr substitution at H263 prevents typical deprotonation yielding (−)-KPP.

The observation of 8β-hydroxy-ent-CPP revealed that the labda-13E-en-8-yl+ intermediate can be attacked from one side, supporting a role for the fully ligated water molecule as the catalytic base. Intriguingly, the AtCPS H263F/Y variants produced the skeletally rearranged clerodane product (−)-kolavenyl diphosphate (KPP).95 When the assay was performed in D2O, thus forcing the catalytic acid to deuterate GGPP, KPP did not retain a deuterium atom, indicating that the final deprotonation removed the same proton that was initially added (Fig. 23).109

In gibberellin biosynthesis in land plants and bacteria, a His/Asn catalytic base dyad (as LHS and PNV motifs) is conserved in the related ent-CPSs.93,107 Alanine substitutions of N363 and N362 in the catalytic base pairs of the monocot exemplar OsCPS1 from Oryza sativa and the earlier-diverging bryophyte PpCPS/KS from Physcomitrella patens, respectively, yielded 8β-hydroxy-ent-CPP and a trace amount of ent-KPP.93 By engineering the corresponding pairs of amino acids (H173/N240, H206/T272, and H196/D502) in the catalytic base dyad of EtCPS from Erwinia tracheiphila, GfCPS from Gibberella fujikuroi, and SpCPS from Streptomyces platensis, a predominant production of 8β-hydroxy-ent-CPP was observed across all three enzymes.34,93

However, SmCPS, a CPS from the Lamiaceae plant Salvia miltiorrhiza involved in the biosynthesis of tanshinones, has a Phe/His catalytic dyad. Several mutants of the F256/H315 pair failed to give any interesting results, so the Thr adjacent to the His was investigated. The double mutant H315N/T316V produced terpentedienyl diphosphate (TPP) and a trace amount of syn-halima-5,13E-dienyl diphosphate (syn-HPP). A labelling assay in D2O showed the absence of any deuterium atoms in TPP, confirming again that the proton initially added is removed by the same Asp of the DxDD motif (Fig. 21C).93,162
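The catalytic base pairs discussed above can be collected in a small lookup table for side-by-side comparison. The following Python sketch is illustrative only: the names `Dyad`, `DYADS`, `dyad_type`, and `group_by_type` are hypothetical, the AtCPS pairing of H263 with N322 is inferred from the mutagenesis results described in the text, and the assignment of the three residue pairs to EtCPS, GfCPS, and SpCPS assumes that they are listed in the same order as the enzymes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dyad:
    enzyme: str
    first: str   # first residue of the pair, e.g. "H263"
    second: str  # second residue of the pair, e.g. "N322"


# Residue pairs as quoted in the text; the enzyme-to-pair order is an assumption.
DYADS = [
    Dyad("AtCPS", "H263", "N322"),  # pairing inferred from the H263A/N322A results
    Dyad("EtCPS", "H173", "N240"),
    Dyad("GfCPS", "H206", "T272"),
    Dyad("SpCPS", "H196", "D502"),
    Dyad("SmCPS", "F256", "H315"),
]

# One-letter to three-letter codes for the residues that occur above.
THREE_LETTER = {"H": "His", "N": "Asn", "T": "Thr", "D": "Asp", "F": "Phe"}


def dyad_type(dyad: Dyad) -> str:
    """Return the dyad class, e.g. 'His/Asn', from the residue labels."""
    return f"{THREE_LETTER[dyad.first[0]]}/{THREE_LETTER[dyad.second[0]]}"


def group_by_type(dyads):
    """Group enzyme names by dyad class, preserving input order."""
    groups = {}
    for dyad in dyads:
        groups.setdefault(dyad_type(dyad), []).append(dyad.enzyme)
    return groups


if __name__ == "__main__":
    for kind, enzymes in group_by_type(DYADS).items():
        print(kind, enzymes)
```

Grouping by residue type makes it easy to see that SmCPS is the only entry with a Phe/His pair, which matches its different behaviour on mutagenesis.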

5.2.2. Engineering CLPP synthase

As an ent-CPS could be switched to a KPP synthase by engineering the catalytic base group, it was natural to consider whether a clerodienyl diphosphate (CLPP) synthase could be converted into an ent-CPS. Indeed, the F255H variant of SdCPS2, a CLPP synthase from Salvia divinorum, yielded ent-CPP (Fig. 24A).94 The combined mutations of N313A/F255A and W360A of SdCPS2 produced 8-hydroxy epimers of ent-CPP (Fig. 24). To expand the chemical space of SdCPS2, C359, the residue adjacent to W360, was also subjected to mutation, and the double mutant W360S/C359F generated labda-7,13E-dienyl diphosphate (Fig. 24A).94

Fig. 24

Engineering of SdCPS2, PvCPS1, SmCPS, and Tpn2. (A) Engineering of SdCPS2 and TwTPS14. SdCPS2 F255H and TwTPS14 Y265H lead to the formation of ent-CPP; W360S/C359F leads to the formation of labda-7,13E-dienyl diphosphate. (B) Engineering of PvCPS1. The F251A/V/S/G variants of PvCPS1 convert GGPP into (5R,8S,9R,10S)-13Z-CLPP and NNPP into (5S,8S,9R,10R)-13Z-CLPP. (C) Engineering of SmCPS and Tpn2. Engineering of SmCPS switches its product from CPP to TPP, which is the native product of Tpn2.

The position and role of H263 in AtCPS also facilitated the identification of F251 in PvCPS1, a cis-trans-CLPP synthase from Panicum virgatum. The F251V, F251A, F251S, and F251G mutations resulted in the formation of a rare cis-clerodane diastereomer, (5R,8S,9R,10S)-13Z-CLPP (Fig. 24B).163 Intriguingly, when the four variants were tested with (Z,Z,Z)-nerylneryl diphosphate (NNPP), an alternative cis-clerodane diastereomer, (5S,8S,9R,10R)-13Z-CLPP, was produced; the F251A/V mutants produced 5–8-fold more product than the F251G/S mutants (Fig. 24B).163
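Variant labels in this section use the slash in two ways: a label such as W360S/C359F denotes one double mutant, whereas a label such as F251A/V is shorthand for two separate single mutants at the same position. The Python sketch below shows one way to disambiguate them when tabulating variants. The function name `parse_variant` and the parsing rule (full substitutions after the slash mean a multi-site variant, bare letters mean alternatives) are assumptions drawn from how the labels are used here, not an established standard.

```python
import re

# A full substitution such as "F251A": wild-type residue, position, new residue.
_FULL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def _as_sub(token):
    """Convert a full-substitution token into a (wild_type, position, new) tuple."""
    m = _FULL.match(token)
    if m is None:
        raise ValueError(f"not a full substitution: {token!r}")
    return (m.group(1), int(m.group(2)), m.group(3))


def parse_variant(label):
    """Return a list of variants; each variant is a list of substitutions.

    "W360S/C359F" -> one variant carrying two substitutions.
    "F251A/V"     -> two variants, each carrying one substitution at F251.
    """
    tokens = label.split("/")
    first = _as_sub(tokens[0])
    rest = tokens[1:]
    if all(_FULL.match(t) for t in rest):
        # Every token is a full substitution: a single multi-site variant.
        return [[first] + [_as_sub(t) for t in rest]]
    if all(re.fullmatch(r"[A-Z]", t) for t in rest):
        # Bare letters: alternative replacements at the first token's position.
        wild_type, position, _ = first
        return [[first]] + [[(wild_type, position, t)] for t in rest]
    raise ValueError(f"ambiguous variant label: {label!r}")


if __name__ == "__main__":
    print(parse_variant("W360S/C359F"))
    print(parse_variant("F251A/V"))
```

Labels that mix the two styles are rejected rather than guessed at, since the text gives no rule for them.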

Reciprocally, the TPP synthase Tpn2 from Kitasatospora sp. CB02891 was engineered into a syn-CPP synthase, adding to the class II DTC engineering map.44 The narrower active site created by W384 and L291 provides a template for syn-CPP+ formation, but no basic residues are positioned to deprotonate this intermediate. Therefore, a series of methyl and hydride migrations forces the intermediate to use D296, the initial general acid, for deprotonation to form TPP. Positioning a basic residue close to C17 of the syn-CPP+ intermediate is therefore key to diverting the reaction toward syn-CPP. In PtmT2, a DTC homologous to Tpn2, the water molecule ligated by D502 is proposed to deprotonate the final carbocation intermediate. Accordingly, G485 in Tpn2, which is situated 3.6 Å from C19 of GGPP, was changed to Asp, thus providing a base and allowing the Tpn2 G485D variant to form syn-CPP (Fig. 24C).34,44

5.2.3. Engineering PPPS

Similar product alteration was seen in the peregrinol diphosphate (PPP) synthase from the Lamiaceae plant Marrubium vulgare (MvPPPS). A water molecule must be properly positioned to attack the carbocation at C9 to form PPP.164,165 MvPPPS has a Tyr/Asn dyad, yet substitution of the dyad did not change function. Double mutation of W323 and F505, which are located at the bottom of the active site near both GGPP and the catalytic acid, to W323L/F505Y and W323F/F505Y allowed the production of halima-5(10),13-dienyl diphosphate (5/10-HPP). These mutations keep the water molecule distant from C9 of GGPP and allow a series of hydride shifts and methyl migrations to continue the cascade until final deprotonation at C5 affords 5/10-HPP (Fig. 25).

Fig. 25

Engineering of MvPPPS diverts product formation from PPP to 5/10-HPP.

5.2.4. Engineering HPP synthase

Recently, halimadienyl diphosphate synthase (HPPS) from M. tuberculosis, referred to as Rv3377c in Section 2.2, provided new insights into its mechanism and the ability to engineer its product profile.45,89

Based on docking experiments, Y479 in HPPS was proposed to be the catalytic base due to its expected proximity to C6 of the final intermediate, halima-13-en-5-yl+.89 The Y479F variant shunted the cationic cascade to yield labda-8,13-dienyl diphosphate. Initially, a secondary base, Y328, was proposed; however, Y328F mediated the efficient production of HPP, labda-8,13-dienyl diphosphate, and CPP. The double mutant Y328F/Y479F selectively produced labda-8,13-dienyl diphosphate. These results suggested that the actual secondary base might be the backbone carbonyl of I474, which is oriented 3.2 Å away from C8 in the labda-13-en-9-yl+ diphosphate intermediate.

The F171H variant shifted product formation to CPP, along with small amounts of HPP, suggesting a role in deprotonation of C17. HPPS F171A also produced small amounts of HPP and labda-8,13-dienyl diphosphate. The mutants constructed at the catalytic base of HPPS failed to generate fully rearranged KPP, which may be attributed to a higher energy barrier for the last 1,2-methyl migration than for the preceding 1,2-shifts in AtCPS. Taken together, the product outcomes of the mutants support the idea that the reactant may shift between different carbocation intermediates and is deprotonated when one of them is properly oriented (Fig. 26).89
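The HPPS variant results described above can be tabulated to make the product shifts easier to compare. The Python sketch below is a minimal illustration: the dictionary `HPPS_PRODUCTS` and the helper `variants_making` are hypothetical names, only products named in the text are listed, and no product ratios are implied because the text gives none.

```python
# Products reported in the text for each HPPS variant (no quantities implied).
HPPS_PRODUCTS = {
    "WT": ["HPP"],
    "Y479F": ["labda-8,13-dienyl PP"],
    "Y328F": ["HPP", "labda-8,13-dienyl PP", "CPP"],
    "Y328F/Y479F": ["labda-8,13-dienyl PP"],
    "F171H": ["CPP", "HPP"],
    "F171A": ["HPP", "labda-8,13-dienyl PP"],
}


def variants_making(product):
    """Return the sorted names of variants reported to form the given product."""
    return sorted(
        name for name, products in HPPS_PRODUCTS.items() if product in products
    )


if __name__ == "__main__":
    for product in ("HPP", "CPP", "labda-8,13-dienyl PP"):
        print(product, variants_making(product))
```

Querying by product rather than by variant highlights that CPP formation is reported only for the Y328F and F171H variants.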

Fig. 26

Engineered HPPS alters the product specificity from HPP to CPP, labda-7,13-dienyl PP, and labda-8,13-dienyl PP. Variants are colour-coded: F171A, orange; F171H, red; Y479F, purple; Y328F, green; the double mutant Y328F/Y479F, blue.

6. Conclusion and future perspectives

TCs are among the most powerful biocatalysts in nature, guiding hydrocarbon substrates through a series of C─C bond formations, hydride and alkyl shifts, rearrangements, and sometimes proton shifts to build structurally and stereochemically complex terpene skeletons.1 The class II TCs are unique members of the TC superfamily with mechanisms that mirror those of class I TCs in some respects (i.e., carbocation chemistry, cyclizations, and final cation quench) but are themselves distinct. Specifically, they initiate the reaction via protonation and bind the diphosphate moiety without abstracting it (notably for DTCs and STCs). Class II TCs are involved in the early-stage biosynthetic routes of many natural products, with examples seen and characterized in bacteria, fungi, and plants.13,20,39,42,59,166,167 In many pathways, the substrates of these enzymes are isoprenoid diphosphates. As class II TCs leave the diphosphate moiety intact, the products are primed for further structural diversification by class I TCs. However, the natural substrate scope of class II TCs reaches far beyond FPP and GGPP. These enzymes can recognize purely acyclic hydrocarbons such as squalene or diverse meroterpenes, significantly contributing to the creation of diverse terpenoid scaffolds.9,21,126 To add further complexity, noncanonical TCs that are distinct in their sequences, structures, mechanisms, and cellular locations (i.e., soluble vs membrane-bound) have been identified and characterized.46,81,168

Research on class II TCs has spanned almost three decades since the initial structural determination of SHC. However, the community is only in its infancy in understanding how to successfully and logically engineer these enzymes to alter their products. The revelation of the intricate catalytic mechanisms utilized by class II TCs has facilitated substantial engineering advancements and yielded examples with greater functionality, improved catalytic properties, or altered substrate selectivities. With only 12 structures of class II TCs reported, there are limited structural data to inform engineering efforts. And while researchers have benefitted and will continue to benefit from advances in artificial intelligence and machine learning approaches, such as the protein structure prediction program AlphaFold2,141 high-resolution structures in complex with natural and unnatural substrates are still highly desired.33,169 In addition, combining theoretical modeling studies, sophisticated docking algorithms, and progressive molecular dynamics simulations with classical biochemical and structural experiments will propel discoveries in enzyme mechanism and engineering.

Finally, genome mining has proven to be an effective strategy for discovering novel class II TCs, of both the canonical and noncanonical varieties.9,14,33,170 This is true for microbial systems as well as for sources that have not yet been tapped, such as MTCs from plants and animals.171,172 The extensive chemical space that remains unexplored in the realm of class II TCs represents a vast opportunity for new discoveries. The anticipation is high, and it remains to be seen what will unfold in the future.

Acknowledgements

This work is supported in part by the National Natural Science Foundation of China Grant 82073746 (L.-B.D.), the National Institutes of Health Grant R35 GM142574 (J.D.R.), the Thousand Youth Talents Program of China (L.-B.D.), the Jiangsu Specially Appointed Professor Program (L.-B.D.), the Double First-Class University Project (CPUQNJC22_04) (L.-B.D.), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province KYCX23_0873 (X.P.).

Footnotes

Conflicts of interest

There are no conflicts of interest to declare.

9. References
