Significance
More than 400 million tons of plastic waste is produced each year, the overwhelming majority of which ends up in landfills. Bioconversion strategies aimed at plastics have emerged as important components of enabling a circular economy for synthetic plastics, especially those that exhibit chemically similar linkages to those found in nature, such as polyesters. The enzyme system described in this work is essential for mineralization of the xenobiotic components of poly(ethylene terephthalate) (PET) in the biosphere. Our description of its structure and substrate preferences lays the groundwork for in vivo or ex vivo engineering of this system for PET upcycling.
Keywords: Poly(ethylene terephthalate), plastic, Rieske dioxygenase, terephthalate, oxygenase
Abstract
Several bacteria possess components of catabolic pathways for the synthetic polyester poly(ethylene terephthalate) (PET). These proceed by hydrolyzing the ester linkages of the polymer to its monomers, ethylene glycol and terephthalate (TPA), which are further converted into common metabolites. These pathways are crucial for genetically engineering microbes for PET upcycling, prompting interest in their fundamental biochemical and structural elucidation. Terephthalate dioxygenase (TPADO) and its cognate reductase make up a complex multimetalloenzyme system that dihydroxylates TPA, activating it for enzymatic decarboxylation to yield protocatechuic acid (PCA). Here, we report structural, biochemical, and bioinformatic analyses of TPADO. Together, these data illustrate the remarkable adaptation of TPADO to the TPA dianion as its preferred substrate, with small, protonatable ring 2-carbon substituents being among the few permitted substrate modifications. TPADO is a Rieske [2Fe2S] and mononuclear nonheme iron-dependent oxygenase (Rieske oxygenase) that shares low sequence similarity with most structurally characterized members of its family. Structural data show an α-helix–associated histidine side chain that rotates into an Fe(II)–coordinating position following binding of the substrate into an adjacent pocket. TPA interactions with side chains in this pocket were not conserved in homologs with different substrate preferences. The binding mode of the less symmetric 2-hydroxy-TPA substrate, the observation that PCA is its oxygenation product, and the close relationship of the TPADO α-subunit to that of anthranilate dioxygenase allowed us to propose a structure-based model for product formation. Future efforts to identify, evolve, or engineer TPADO variants with desirable properties will be enabled by the results described here.
The discovery of a bacterium that assimilates poly(ethylene terephthalate) (PET) (1, 2), the synthetic polymer used for clothing, single-use plastic bottles, and carpets, has generated great excitement over the prospect of using biological catalysis for recycling this abundant waste product (3–7). The pathway for PET assimilation by Ideonella sakaiensis begins with a pair of esterases that hydrolyze the polymer to its constituent monomers, ethylene glycol and terephthalate (TPA). Ethylene glycol is a natural product that is metabolized by multiple bacteria (8–10). TPA resembles plant-derived aromatic compounds, but it is not widely known as a substrate for bacterial growth. Because of its size and charge (−2 at pH 7), TPA must be actively transported into the cell, where it is cis-dihydroxylated and dearomatized to yield 1,2-dihydroxy-3,5-cyclohexadiene-1,4-dicarboxylate (DCD) (11–17). The initial dihydroxylation is catalyzed by an O2-dependent terephthalate dioxygenase (TPADO) working in conjunction with an NAD(P)H, flavin, and iron–sulfur-dependent reductase. A zinc-dependent dehydrogenase finally oxidatively decarboxylates DCD to produce protocatechuate (protocatechuic acid [PCA]) (Fig. 1) (1). Interest in using this pathway, either in vitro or engineered into microbes optimized for PET conversion, is motivated by the societal, ecological, and economic benefits of plastics reclamation, recycling, and upcycling (7, 18, 19). Obtaining a structural and functional understanding of each of the four enzymes is an essential step toward these future applications.
Native systems for TPA import and cis-dihydroxylation have thus far been identified in several bacteria, including I. sakaiensis and several strains that do not use PET as a carbon source. The latter include Comamonas sp. strain E6 (12, 17), Comamonas testosteroni T-2 (15), C. testosteroni YZW-D (13), Delftia tsuruhatensis T7 (20), Rhodococcus sp. DK17 (11), Rhodococcus jostii RHA1 (14), Pseudomonas umsongensis GO16 (8), and Acinetobacter baylyi ADP1 (TPA importer) (21). Several studies have also introduced TPA catabolism into microbes for metabolic engineering applications (22–26).
The TPADO enzyme is a member of a large family of Rieske oxygenases (ROs). Several members of this family permit bacteria to aerobically assimilate and thereby, remediate a wide range of environmental contaminants (27), such as naphthalene, pyrene, toluene, and chlorobenzoate. Several of these compounds resemble natural metabolites, including benzoate, cinnamate, picolinate, and salicylate, all of which are also RO substrates (28). Naphthalene dioxygenase (NDO), a paradigmatic RO (29–31), catalyzes stereo- and regioselective dihydroxylations on a range of aromatic substrates. Other family members likewise catalyze an array of mono- and dihydroxylations, O- and N-dealkylations (28), desaturations, and sulfoxidations, where several prior studies have suggested a structural basis for reaction type or substrate preference (27, 32). ROs possess a family-defining [2Fe-2S] Rieske cluster that delivers electrons sequentially to an adjacent mononuclear nonheme iron center, where O2 is reductively activated. The ultimate electron source is NAD(P)H, which transfers a hydride to a flavin adenine dinucleotide cofactor in a separate reductase enzyme. One to two additional [Fe-S] clusters serve as one-electron (1e−) shuttles to the active site (SI Appendix, Fig. S1). These clusters are found in diverse protein domain architectures that have traditionally been used for subtyping ROs in types I through V (33). Among the type II family members, which include TPADO, sequence conservation can be surprisingly low (34). Additionally, until recently, there were no structurally characterized examples of type II enzymes. Predicting TPA-directed activity among type II ROs based solely on the primary sequence of the catalytic (α)-subunit is consequently challenging, limiting efforts to prospect for TPADO homologs in sequence databases.
In this work, we describe multiple crystal structures of the TPADO from Comamonas sp. strain E6 (17) in complex with both TPA and an ortho-substituted analog that links TPADO to salicylate 1-hydroxylase (AhdA1c) (35) and anthranilate 1,2-dioxygenase (AndAc) (36) in mechanism. Further, we define catalytic parameters and describe the specificity of TPADO for para-dicarboxylate anions, which is well explained by the binding mode observed in the structures. Finally, we show that TPADO has functionally significant sequence relationships with aryl-carboxylate–directed ROs, which, in conjunction with the structural data, can be used to propose a TPA recognition sequence motif. The results establish this TPADO and its cognate reductase as a foundation for further protein and strain engineering efforts to achieve biological upcycling of PET plastic.
Results
A Two-Enzyme System Dioxygenates TPA In Vitro.
TPADO, a 196-kDa complex of TphA2 and TphA3 subunits, and its cognate reductase (TphA1, 36.4 kDa) from Comamonas sp. strain E6 were heterologously expressed in Escherichia coli (SI Appendix, Figs. S2 and S3 and Supplementary Methods), with conditions then optimized to maximize both protein and cofactor yields as guided by studies from Ballou and coworkers (37). The α- and β-subunits of TPADO were coexpressed from a single isopropyl β-D-1-thiogalactopyranoside (IPTG)-inducible pET-DUET vector, where the α-subunit contained a C-terminal His6 tag that was used for affinity purification at yields >20 mg/L culture. Native mass spectrometry identified the α3β3-oligomer as the major component, with the individual monomers as minor constituents (SI Appendix, Fig. S4). The monomeric reductase was purified via a C-terminal His6 tag, with yields >10 mg/L culture.
Ultraviolet (UV)/visible (vis), atomic absorption, and electron paramagnetic resonance (EPR) spectroscopies confirmed the presence and type of [2Fe-2S] cluster in each protein (Rieske and plant types in TPADO and the reductase, respectively) (SI Appendix, Figs. S5–S7) based on their characteristic g values and absorbance maxima. Combined metal and protein analyses predicted approximately stoichiometric occupancy of the expected cofactors in TPADO; however, this assumes that all of the protein (by mass) was present as an intact α3β3-oligomer and that all iron was cofactor associated. Consequently, the concentrations of TPADO used throughout this study, reported in terms of TPADO active sites (three per α3β3-unit), are undoubtedly overestimated since at least some catalytically inactive α- and β-monomers and unpopulated cofactor binding sites were likely present.
TPA was converted by the TPADO/reductase system to an oxidized product specifically in the presence of reduced nicotinamide adenine dinucleotide (NADH) rather than its phosphorylated analog, NADPH. The oxygenation product could be resolved from the NAD+ coproduct by reverse-phase high-performance liquid chromatography (HPLC) when subjected to a complex gradient (SI Appendix, Supplementary Methods). The distinctive UV/vis absorbance spectrum for the product (Fig. 2A) was consistent with absorbance maxima reported for DCD (15). However, this molecule is not commercially available, nor does extensive characterization exist in published literature. Here, we chromatographically resolved the product from NAD+ by HPLC and obtained its UV/vis absorbance spectrum and mass spectral (MS) analyses. Under the ionization conditions used, DCD exhibited fragmentation consistent with the loss of CO2 and/or H2O (Fig. 2B). The intact, negatively ionized DCD is expected to give a molecular ion peak at mass/charge (m/z) = 199, while the predominant ion observed at the retention time of the product had m/z = 137. TPA displayed similar behavior, losing CO2 during source ionization (SI Appendix, Fig. S8). Targeted fragmentation of the weak-intensity, intact DCD and TPA peaks by liquid chromatography coupled to tandem mass spectrometry (LC-MS-MS) yielded fragmentation profiles that were consistent with the observed source ionization fragmentation, supporting the definitive identification of the latter compound.
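As a quick arithmetic check on the m/z values quoted above, nominal masses can be tallied from the molecular formula. The sketch below assumes the free-acid formula C8H8O6 for DCD (TPA, C8H6O4, plus two hydroxyl groups) and integer nominal atomic masses. The function and variable names are illustrative and are not part of the original analysis.

```python
# Nominal (integer) atomic masses; adequate for checking unit-resolution m/z values.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}

def nominal_mass(formula):
    """Sum nominal atomic masses for a formula given as {element: count}."""
    return sum(NOMINAL_MASS[el] * n for el, n in formula.items())

# Assumed free-acid formula of DCD: C8H8O6.
dcd = nominal_mass({"C": 8, "H": 8, "O": 6})
co2 = nominal_mass({"C": 1, "O": 2})
h2o = nominal_mass({"H": 2, "O": 1})

deprotonated = dcd - NOMINAL_MASS["H"]  # singly charged ion after loss of one proton
fragment = deprotonated - co2 - h2o     # after loss of both CO2 and H2O

print(deprotonated, fragment)  # 199 137
```

The 62-unit gap between m/z 199 and m/z 137 matches the combined loss of CO2 (44) and H2O (18).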
Steady-state kinetic parameters were measured as a function of variable [TPA] in ambient air (SI Appendix, Fig. S9). The data readily fit the Michaelis–Menten model, yielding kcat (apparent) = 12 ± 0.3 min−1 and KM[TPA] (apparent) = 9.6 ± 1 μM. The parameters are apparent as saturating concentrations for NADH and O2 have not been determined, and kcat (maximal velocity × [enzyme]−1) is likely underestimated. The kcat and low-micromolar KM are nonetheless similar to values measured for the TPADO homolog salicylate 5-monooxygenase (NagGH) from Ralstonia sp. strain U2 (kcat = 7.81 min−1, KM = 22.4 μM) (38), suggesting that TPADO is well adapted to its substrate, although slow in absolute terms. A total turnover number (TTN) of 704 (per apparent TPADO active site) was measured for the system via determination of unreacted TPA by HPLC when all other reactants were in excess. This TTN is low compared with several enzymes used in applied work, potentially due to overcounting of intact metallo-active sites here or to intrinsic instability of the enzymes. Either explanation suggests the need for engineered improvements to enzyme stability for future applications (39).
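The apparent parameters above can be combined into a catalytic efficiency and used to estimate turnover at subsaturating [TPA]. The following sketch assumes simple Michaelis–Menten behavior and uses only the apparent kcat and KM reported here; the function names are illustrative.

```python
def michaelis_menten_rate(s_uM, kcat_per_min, km_uM):
    """Turnover rate per active site (min^-1) at substrate concentration s_uM."""
    return kcat_per_min * s_uM / (km_uM + s_uM)

def catalytic_efficiency(kcat_per_min, km_uM):
    """kcat/KM in M^-1 s^-1, converting min^-1 to s^-1 and uM to M."""
    return (kcat_per_min / 60.0) / (km_uM * 1e-6)

KCAT = 12.0  # min^-1, apparent value reported above
KM = 9.6     # uM, apparent value reported above

eff = catalytic_efficiency(KCAT, KM)
rate_at_km = michaelis_menten_rate(KM, KCAT, KM)

print(f"kcat/KM = {eff:.2e} M^-1 s^-1")               # kcat/KM = 2.08e+04 M^-1 s^-1
print(f"rate at [TPA] = KM: {rate_at_km:.1f} min^-1")  # rate at [TPA] = KM: 6.0 min^-1
```

Because the active-site concentration is likely overestimated, both kcat and kcat/KM computed this way should be read as lower bounds.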
TPADO Exhibits Remarkable Specificity for Para-Dicarboxylates.
Many ROs are known either to accept multiple substrates (32) or to uncouple reductive O2 activation from substrate oxygenation, expending NAD(P)H without oxygenating the organic substrate and releasing either H2O2 or water. The substrate or analog binds near to the mononuclear Fe(II), displacing water and opening a coordination position where O2 can be reductively activated. The Fe/O2 species can either productively oxygenate the substrate or when an uncoupler is bound, break down to yield water or H2O2 (29, 40, 41) (Scheme 1 in Discussion).
The specificity of TPADO was assessed by analyzing substrate consumption at a fixed time point (90 min) after incubating TPADO/reductase with stoichiometrically limiting NADH and either TPA (Fig. 2A) or a structural analog (Fig. 3 and SI Appendix, Figs. S12–S14). The stoichiometry of the aromatic substrate as a function of NADH usage was measured by integrating their respective HPLC peaks using a no substrate/no analog control sample as a baseline. Of a series of compounds screened, only TPA and closely related derivatives with hydroxyl and amino substituents at the ring C2 position were oxidized (Fig. 3). Percentage coupling was calculated as (moles substrate consumed) (moles NADH consumed above baseline)−1 × 100%. Both TPA and 2-hydroxy terephthalate (2-OH-TPA) exhibited 100% coupling, while 2-amino terephthalate (2-NH2-TPA) showed 22% coupling (Fig. 3).
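The coupling calculation defined above is simple enough to express directly. The sketch below is a minimal illustration of that formula; the function name and the input amounts are hypothetical and were chosen only to show the arithmetic, not taken from the measured data.

```python
def percent_coupling(substrate_consumed, nadh_consumed, nadh_baseline=0.0):
    """Percentage coupling = substrate consumed / NADH consumed above baseline x 100."""
    nadh_above_baseline = nadh_consumed - nadh_baseline
    if nadh_above_baseline <= 0:
        raise ValueError("no NADH consumed above baseline; coupling is undefined")
    return 100.0 * substrate_consumed / nadh_above_baseline

# Hypothetical amounts (nmol), for illustration only.
fully_coupled = percent_coupling(50.0, 55.0, nadh_baseline=5.0)
partly_coupled = percent_coupling(11.0, 55.0, nadh_baseline=5.0)

print(fully_coupled, partly_coupled)  # 100.0 22.0
```

An analog that stimulates NADH oxidation without being consumed gives 0% coupling, whereas one that does not stimulate NADH oxidation above baseline leaves the ratio undefined, which the function reports as an error rather than a number.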
Analogs with conservative substitutions to one of the carboxylates (4-nitrobenzoic acid [4-NBA], 4-carbamoylbenzoic acid [4-CBA], and 4-formylbenzoic acid [4-FBA]) yielded no detectable oxygenated products (Fig. 3C). 4-FBA and 4-CBA did not stimulate NADH oxidation above baseline (SI Appendix, Figs. S12 and S13), while 4-NBA acted as an efficient uncoupler, promoting full consumption of NADH without reduction in the substrate analog signal (SI Appendix, Fig. S14). Taken together, these results demonstrate an extraordinarily high level of fidelity for TPA and very closely related diacids as principal substrates of the enzyme.
TPADO Converts 2-OH-TPA and 2-NH2-TPA to PCA.
TPA analogs with -NH2 or -OH substituents at the ring 2 position were converted to a product having a retention time and UV/vis absorbance spectrum identical to those obtained for a PCA analytical standard (CAS Registry number 99-50-3) (Fig. 3 A and B and SI Appendix Fig. S10 and Table S1). 1,2-Bis-hydroxylation uniquely leads to an intermediate where decarboxylation and ring rearomatization can proceed in conjunction with protonation and spontaneous loss of the ring 2 substituent (Discussion). The observation of PCA as the product in both cases, therefore, indicates that hydroxylation must take place at the 1,2-ring (and not, for example, the 1,6-ring) carbons.
TPA Dioxygenase Has a Canonical α3β3-Structure.
To understand the structural arbiters of substrate recognition, we determined the crystal structures of TPADO in the ligand-free state to 2.28 Å resolution and ligand-bound structures with TPA to 2.08 Å resolution and 2-OH-TPA to 1.95 Å resolution. During model building and refinement, clear continuous protein density was observed outside the α3β3-domains and identified as a single lysozyme molecule (SI Appendix, Fig. S15). In a rather fortuitous crystal packing, the lysozyme protein effectively acts as a crystallization chaperone that “glues” the TPADO molecules together (SI Appendix, Fig. S16), a function that, to our knowledge, has not been previously reported for lysozyme. More specifically, one lysozyme molecule forms interactions with five TPADO molecules and binds to both α- and β-subunits. TPADO forms an α3β3-heterohexamer, in which three catalytic α-subunits form a trimeric head-to-tail assembly atop a triad of noncatalytic β-subunits (Fig. 4A). Like other ROs, a Rieske domain containing a [2Fe-2S] cluster and a catalytic domain that comprises the TPA substrate and ferrous ion binding site (Fig. 4 B and C) were located in each α-subunit (25). Residues H210, H215, and D356 coordinate the ferrous ion in the active site, which is 12.2 Å away from the [2Fe-2S] cluster of a neighboring α-subunit (Fig. 4A) and connected via the amino acid side chains H210, D207, and H105 (Fig. 4 B–D). The [2Fe-2S] cluster within the same α-subunit is, by contrast, 42 Å away, where it contributes to the neighboring reaction site.
The mononuclear Fe-coordinating residues H210 and H215 are located on an α-helix spanning residues 208 to 220. In the substrate-free structure, H215 does not coordinate the ferrous ion in any of the three α-subunits, but the helix conformation differs among them. From residue H215 onward, this helix is disordered in two of the three α-subunits, while the third is stabilized by crystal contacts with a symmetry-related molecule (SI Appendix, Fig. S17 A and B). After crystal soaking with substrates TPA or 2-OH-TPA, the formerly disordered helix residues are ordered but partially unfolded, thereby allowing H215 to coordinate the ferrous ion (Fig. 4E and SI Appendix, Fig. S17 C and D). This suggests altered dynamics of the system upon substrate binding. The helix stabilized by crystal contacts in the apo structure remained in its orientation upon substrate binding and did not unfold to coordinate the iron.
TPA binding is supported by multiple ionic, hydrogen-bonding, and hydrophobic interactions. The substrate was located next to the mononuclear iron in an orientation confirmed by clear unbiased Fo–Fc difference electron density (SI Appendix, Fig. S18). One of the carboxylate groups from the bound TPA forms a salt bridge with R309, positioning the adjacent ring carbons 3.9 to 4.1 Å away from the reactive iron in an orientation that could promote reaction with an activated Fe/O2 species. The side chain of I290 forms a hydrophobic π-interaction with the aromatic ring of TPA, while the second carboxylate group forms a hydrogen bond with S243. Additionally, a potential salt bridge between R390 and this carboxylate is observed, but the distance varies among the α-subunits between 2.9 and 4.6 Å. This apparent flexibility in the R390 side chain is reflected in its high B factors and could have functional significance. For example, residue R390, as well as the α-helix mentioned above and subsequent residues, forms the opening of a potential site for TPA entry or product release that may have a gate function (SI Appendix, Fig. S19). The TPA binding pocket is more open within the substrate-free structure because the α-helical residues from H215 onward are disordered and not part of the model; in total, residues 215 to 227 are missing.
TPA Exhibits a Unique Binding Mode.
A search of the current structural databases using the Dali protein comparison server (42) highlighted NagGH (36, 43) as the closest structural homolog to TPADO, with rmsd values of 1.5 and 1.3 Å and sequence identities of 42 and 26% for the α- and β-subunits, respectively. By contrast, the α-subunits of related toluene 2,3-dioxygenases, biphenyl dioxygenases, and NDOs are significantly more divergent, with rmsd values ranging between 2.4 and 2.8 Å. A superposition of TPADO with NagGH illustrates their similar quaternary structures (SI Appendix, Fig. S20). While no substrate is present in the NagGH structure for comparison, it appears unlikely that NagGH could bind TPA because the side chain of M257, which replaces the S243-TPA hydrogen-bonding interaction observed in TPADO, would interfere with the carboxylate group of TPA (Fig. 5A).
The salt bridge involving one of the TPA carboxylate groups is conserved in both TPADO (R309) and the NagGH (R323) from Ralstonia sp. (Fig. 5A) (38), although an equivalent H bond to R390 on the opposite end of the substrate is absent. Several additional Rieske dioxygenases possess a positively charged residue (R or K) at the same position as R309 in a pocket that is otherwise largely hydrophobic, but in those cases, the homologous side chain points away from the substrate, as observed in the NDO structure (Fig. 5B) (44). Moreover, the residues involved in electron transfer from the Rieske cluster to the mononuclear iron superimpose well between NagGH and NDO (SI Appendix, Fig. S20).
The structure of TPADO bound to 2-OH-TPA is close to identical to the complex with TPA, but surprisingly, the electron density for this ligand strongly suggests that the 2-hydroxy group is oriented toward the hydrophobic part of the active site and not toward the polar part, where hydrogen-bonding interactions with the carbonyl oxygen of V205 would appear possible (SI Appendix, Fig. S21). It is possible that this binding orientation of 2-OH-TPA promotes catalysis by substrate destabilization, that it favors product release, or that it represents a nonproductive binding mode. To investigate the latter possibility, we aligned the 2-OH-TPA–bound structure with a product-bound NDO structure (Protein Data Bank [PDB] ID code 1O7P) (30). From this alignment (Fig. 5 B and C), the cis-dioxygenated carbons of the (1R,2S)-cis-1,2-dihydroxy-1,2-dihydronaphthalene product appear to align with the 1- and 2-carbons of 2-OH-TPA. These carbons are consequently implicated as the sites of hydroxylation based on the observed formation of PCA as the ultimate product, suggesting that 2-OH-TPA occupies a productive binding mode. By analogy, the product of the TPA reaction is predicted to be cis-1S,2R-DCD.
Sequence Similarity Network Analyses Relate the TPADO α-Subunit to a Subgroup of ROs That Hydroxylate Hydrophilic Monoaryls.
Classic RO subtyping by Kweon et al. (33) focused on the different domain organizations that mediate electron flow between the reductase and RO active site. A subsequent structure-based sequence alignment of 121 then available catalytic α-subunits was carried out by Capyk and Eltis (34). This analysis resulted in a first-ever phylogenomic map and suggested how gene fusion events gave rise to the Rieske/mononuclear Fe α-subunit.
Just over 45,000 α-subunit homologs are currently known, permitting subtyping of a much broader scope. Their sequences, members of protein family (Pfam) PF00848, were submitted to an “all-by-all” comparison via the Enzyme Function Initiative’s network analysis algorithm (Fig. 6 and SI Appendix, Fig. S22) (45). Consistent with earlier findings from Capyk and Eltis (34), we observed that a single subgroup of the massive and diverse family has been experimentally oversampled in prior work (cluster 2) (SI Appendix, Fig. S22). This node contains several aromatic/polyaromatic hydrocarbon hydroxylases of interest for bioremediation purposes, including NDO and biphenyl dioxygenase (46).
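The idea behind a sequence similarity network can be illustrated in a few lines: sequences are nodes, an edge joins any pair whose pairwise score meets a chosen threshold, and clusters are the connected components of the resulting graph. The sketch below is a toy illustration under those assumptions and is not the Enzyme Function Initiative's implementation; the sequence labels, scores, and thresholds are all hypothetical.

```python
from collections import defaultdict

def cluster_by_threshold(nodes, pair_scores, threshold):
    """Return connected components of the graph whose edges have score >= threshold."""
    neighbors = defaultdict(set)
    for (a, b), score in pair_scores.items():
        if score >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)
    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters

# Hypothetical labels and pairwise scores, for illustration only.
nodes = ["seqA", "seqB", "seqC", "seqD", "seqE"]
scores = {("seqA", "seqB"): 80, ("seqB", "seqC"): 65,
          ("seqC", "seqD"): 20, ("seqD", "seqE"): 90}

clusters = cluster_by_threshold(nodes, scores, threshold=60)
strict = cluster_by_threshold(nodes, scores, threshold=85)
print(clusters)  # [['seqA', 'seqB', 'seqC'], ['seqD', 'seqE']]
print(strict)    # [['seqA'], ['seqB'], ['seqC'], ['seqD', 'seqE']]
```

Raising the threshold splits broad clusters into smaller ones, which is why the choice of cutoff shapes how finely a family is subdivided.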
Beyond this large subgroup, ROs with similar substrate and/or reaction types, where known, were grouped together in this analysis, suggestive of sequence-based family subdivision along functional lines (SI Appendix, Fig. S22). The cluster of sequences to which the α-subunit of TPADO (TphA2) belongs (Fig. 6) also contains NagG, the α-subunit of salicylate 5-hydroxylase, consistent with identification of the latter via the DALI search for TPADO structural relatives. Additionally, other functionally annotated members of the TPADO subfamily were all associated with aryl carboxylate substrates [anthranilate (36), picolinate (47), and salicylate] like TPA, although TPA is the only dicarboxylate of the group.
A sequence alignment of the α-subunits of these annotated members of the TphA2 sequence cluster showed that, of all the residues making direct contact with the substrate in the TPADO structure, only R309 is conserved. Key portions of this alignment are shown in Fig. 6B and SI Appendix, Fig. S23 for TPA-consuming organisms including I. sakaiensis, where the degree of conservation is especially high. As highlighted in Fig. 5A, the analogous arginine in NagG (R323) is proposed to form a salt bridge with the lone carboxylate of the substrate, positioning it for monooxygenation at the ring 5-carbon [yielding gentisate (48)]. This active site arginine is also conserved in the α-subunits of hydroxypicolinate 3-monooxygenase (47), AndAc (36), and AhdA1c (catechol product) (49, 50). A similar role in positioning the substrate via the carboxylate could be proposed for the conserved arginine in the other enzymes, although the sites of ring mono- or dioxygenation vary (Fig. 6A). In TPADO, N224, S243, and R390 interact with the carboxylate at the nonreactive end of TPA but are not conserved across the subfamily. This observation suggests that they may be important for productively positioning dicarboxylate substrates.
Discussion
Plastic bioconversion using microbial enzymes depends on understanding and ultimately, improving the properties of the responsible enzymes. Hydrolyzable ester linkages are ubiquitous in biology, and aromatic compounds bearing polar substituents are abundant in the metabolic pathways of diverse organisms. The presence of both ester linkages and an aromatic building block with a polar substituent in PET is consistent with this plastic constituting a readily available carbon source for bacterial consumption in the biosphere.
TPADO is an α3β3-nonheme RO, which catalyzes the NADH-dependent dioxygenation of TPA, the aromatic subunit of PET. Like the better-studied cytochrome P450s, ROs catalyze a wide range of oxidations and oxygenations of diverse substrates, increasing their water solubility and activating them for further metabolism (51). However, unlike cytochromes P450, certain members of the RO family are capable, perhaps uniquely, of catalyzing the cis-dihydroxylation of an aryl ring. The reaction depends on the proper positioning of the substrate relative to an activated O2 species that forms at a mononuclear iron center, where both O atoms are poised to react on the same side of the plane defined by the substrate’s aromatic ring (32). Dihydroxylation results in both the loss of aromaticity and the formation of two new chiral centers in the product, DCD. This product, in turn, is well situated for further catabolism via an exergonic step, in which CO2 is produced, NADH is regenerated, and the ring is rearomatized to yield a valuable and versatile metabolite, PCA (52). This elegant metabolic arrangement, in which reductant is intrinsically recycled within the pathway, suggests that, despite the complexity of the TPADO/reductase system and ROs in general, this system offers a compelling starting point for engineering an efficient route for TPA bioconversion to PCA.
PET is an abundant, xenobiotic source of environmental TPA, suggesting that TPA might not be the primary but perhaps, a secondary substrate of TPADO. We noted that the bacterial TPADO studied here was nonetheless highly efficient and extraordinarily specific for TPA (KM = 9.6 μM, kcat/KM[TPA] = 2.1 × 10^4 M−1 s−1), coupling NADH oxidation to TPA dihydroxylation with 100% fidelity and excluding a variety of structurally related compounds as potential substrates (Fig. 3). This observation is consistent with an early report on the Comamonas E6 TPADO indicating that neither the iso- nor the ortho-benzene dicarboxylate regioisomers of TPA (para-benzene dicarboxylate) were substrates for the enzyme (17). Two major exceptions identified here are 2-OH-TPA and to a lesser extent, 2-NH2-TPA, where TPADO catalyzed the conversion of each to PCA.
Each of the benzene dicarboxylate regioisomers is used in the production of plastics, whether as a repeating subunit in PET, as a synthetic precursor for monomers used in plastics, or as a noncovalently bound plasticizer (51). The iso- and ortho-benzene dicarboxylates (phthalates) serve one or both of the latter functions in several plastics. Ortho-phthalate has been of special concern as an endocrine disruptor with the potential to leach out of consumer products and into water (53). Bacteria that can degrade each of these compounds have been identified (54), indicating that enzymatic adaptation to these plastic-relevant compounds is not unique to one or a few strains. Recent work even suggests possible biogenic sources for and derivatives of phthalates (55), which could have helped drive RO diversification.
The Comamonas E6 strain has served as a paradigmatic plastic monomer degrader, capable of using each of the benzene dicarboxylate regioisomers as a sole source of carbon and energy (54), via separate operons encoding three distinct RO enzymes. While the TPADO has an α3β3-subunit structure, the ortho-phthalate dioxygenase (PDO) from Comamonas E6 displays an α3α3-hexameric structure. It is possible that iso-PDO and ortho-PDOs from related strains have similar architecture. The strain E6 iso-/ortho-PDO α-subunits share 32% identity with one another, forming a separate lineage from the α-subunits of TPADO and other α3β3-ROs with which they share ≤19% identity (56). Accordingly, the α-subunits of TPADO and a recently reported ortho-PDO structure superimpose poorly (C. testosteroni KF1 PDO) (SI Appendix, Fig. S24) (54). The active site of this PDO exhibited ionic/hydrogen-bonding interactions between the bound ortho-phthalate carboxylates and an Arg-Arg-Ser triad that is unconserved in the TPADO sequence or tertiary structure. Hydrophobic interactions were observed surrounding the benzylic ring. TPA was able to bind with some observed hydroxylation in the same PDO pocket, although with 80% uncoupling and a much lower kcat/KM than for ortho-phthalate. These results suggest that TPADO and PDO are each well adapted to their preferred benzene dicarboxylate regioisomer.
A large-scale network analysis of >45,000 sequences of RO catalytic α-subunits was generated, aimed at understanding the origins of TPA bioconversion. Consistent with the unique homohexameric structures identified by Capyk and Eltis (34), PDO α-subunits from Comamonas are sufficiently sequence distant that they are not grouped with the same Pfam family (PF00848) and are not found in this sequence similarity network (SSN). Analysis of the network identified a subcluster containing the closest sequence relatives to TPADO (Fig. 6 and SI Appendix, Fig. S24). While only a few members of the subcluster are characterized in the literature, they are all known to catalyze reactions with aryl-carboxylic acids, which structurally resemble TPA. Many of the nodes in the cluster shown in Fig. 6A are annotated as similar to AhdA1c, the catalytic subunit of salicylate-1-monooxygenase (IPR043264). A multiple sequence alignment revealed a cluster-wide conserved arginine, which forms a salt bridge to one of the carboxylate groups of TPA (R309) (Fig. 4) and which is proposed to engage in a similar interaction with salicylate based on structural characterization of NagGH (38). An additional set of residues (N224, S243, and R390) has side chains within hydrogen-bonding distance of the second carboxylate in the TPA and 2-OH-TPA costructures with TPADO (Figs. 4 and 5) described here. These are not conserved in the other functionally annotated sequences highlighted in Fig. 6. This motif, specific to the TPADO sequences, may therefore offer a means of identifying diverse TPADOs from sequence databases. Additional interesting findings resulting from the SSN analysis cannot be discussed here at length. We are supplying the network file for readers to explore the vast sequence space. Many of the clusters in the SSN are made up entirely of sequences with no known function, suggesting that a plethora of catalytic diversity remains to be discovered.
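One way such a motif might be applied is to check, for each sequence in a multiple sequence alignment, whether the columns corresponding to TPADO residues N224, S243, R309, and R390 carry the same residues. The sketch below assumes that an alignment already exists and that the column index of each position is known. The toy alignment, sequence names, and column indices are hypothetical placeholders rather than real sequence data.

```python
# Residues at the TPADO positions discussed above (Comamonas sp. strain E6 numbering).
MOTIF = {"N224": "N", "S243": "S", "R309": "R", "R390": "R"}

def matches_motif(aligned_seq, columns, motif=MOTIF):
    """True if the aligned sequence carries every motif residue in its column.

    `columns` maps a position label (e.g. "R309") to a 0-based alignment column.
    """
    return all(aligned_seq[columns[pos]] == residue for pos, residue in motif.items())

# Hypothetical toy alignment; real use would take columns from an actual alignment.
columns = {"N224": 1, "S243": 3, "R309": 5, "R390": 7}
alignment = {
    "candidate_1": "ANGSLRTR",  # all four motif residues present
    "candidate_2": "ADGMLRT-",  # only the arginine in the R309 column is present
}

hits = [name for name, seq in alignment.items() if matches_motif(seq, columns)]
print(hits)  # ['candidate_1']
```

A screen of this kind would only nominate candidates; activity toward TPA would still need to be confirmed experimentally.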
A structure-based mechanism can be proposed to explain the observed products of the TPADO-catalyzed reactions with TPA, 2-OH-TPA, and 2-NH2-TPA considering the data presented here and two additional observations. First, the network analysis revealed a close sequence relationship between the α-subunits of TPADO and anthranilate dioxygenase from Burkholderia cepacia DBO1 (36). This organism can grow with anthranilic acid as a sole carbon source, where it is proposed that its RO functions to dihydroxylate anthranilate, which then spontaneously deaminates and decarboxylates to yield catechol (36). Second, extensive prior work has shown that NDO, perhaps the best-studied RO, can catalyze a variety of oxidations depending on the position of the substrate in the wild type (WT) and variant enzymes relative to the site of O2 activation. Aryl carbons that are closest to the mononuclear iron (30, 57) are generally prioritized for hydroxylation.
A mechanism (Scheme 1A) that takes these observations and prior work with ROs into account would proceed as follows. Substrate binding near the mononuclear Fe in fully reduced TPADO is expected to displace an iron-bound water molecule (29, 40, 41). Here, we observed an unexpectedly large change in the conformation of a helix containing two Fe-ligating histidine residues in response to substrate binding (Fig. 4C), consistent with the role of the substrate in permitting binding and reductive activation of O2. The initially formed ferric-η1-superoxy intermediate (48) or the related Fe-oxo/Fe-hydroxo species could in principle be the oxygenating species. This species would attack the nearest available substrate carbon (Fe to ring C2 distance = 3.6 Å). Addition of a second active site electron from the Rieske cluster yields the ring 1,2-epoxide, adjacent to the mononuclear ferric-hydroxyl. Nucleophilic attack of the hydroxyl at ring C1 forms the hydrogenated, dearomatized, cis-diol product.
Dioxygenation of the ring 1,2-carbons of 2-OH-TPA by this route would yield an unstable gem-diol at the ring C2 position. This is expected to readily dehydrate and decarboxylate to yield PCA and CO2 in the presence of an aqueous proton source to react with the hydroxyl-leaving group. An analogous mechanism has been postulated for the close sequence relative of TPADO, anthranilate dioxygenase, in which the substrate, a 2-amino benzoate, is initially dihydroxylated and hydrogenated at the ring 1,2-carbons. Protonation of the C2-NH2 group would catalyze the breakdown of the product to yield catechol, CO2, and ammonia, which would rapidly acquire a proton to form ammonium cation under neutral, aqueous conditions. The dicarboxylate analog of anthranilate (2-NH2-TPA) serves as a substrate of TPADO with the corresponding deaminated product PCA, although the poorer amino-leaving group leads to a lower level of productive turnover compared with 2-OH-TPA. The close relationship between TPADO and anthranilate dioxygenase and the proximity of the reactive Fe(II) to the 1,2-carbons of TPA or 2-OH-TPA in the TPADO structures presented here (Fig. 5C) suggest an analogous route to DCD or PCA production in TPADO (Scheme 1B).
Together, these observations connect the experimentally determined binding interactions between TPADO and its substrates to an RO sequence subtype and a potential sequence motif specific for para-aryl-dicarboxylate substrates like TPA. These, in turn, provide strong support for a proposed pathway for the TPADO-catalyzed reaction that can now be optimized for future applied work.
Materials and Methods
TPADO and the reductase were heterologously expressed in E. coli and isolated in high yields by nickel affinity chromatography. Enzyme activity was monitored continuously by following NADH disappearance with UV/vis spectroscopy and discontinuously by separating reaction components with HPLC. Starting materials and products were identified with both mass spectrometric and diode array UV/vis detectors. Reaction components were quantified by integrating peak areas from chromatograms recorded at distinct absorption maxima. TPADO was crystallized in the substrate-free state and soaked with the substrates TPA and 2-OH-TPA overnight. The crystal structures were solved by molecular replacement using structural homologs for the α- and β-subunits. Coordinates for the resulting apo and substrate-bound structures have been deposited in PDB (ID codes 7Q04, 7Q05, and 7Q06). Structural superpositions of TPADO with NagG (7C8Z) and NDO (1O7P) were generated and visualized in PyMol. An SSN of the protein family to which the catalytic domain of TPADO belongs (PF00848) was generated with EFI-EST web tools and visualized in Cytoscape. Detailed methods are provided in SI Appendix.
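For readers who wish to reproduce the continuous assay analysis, the following minimal Python sketch converts an absorbance time course into an NADH consumption rate using the Beer-Lambert law. It is an illustration only: it assumes that NADH is followed at 340 nm with the commonly used extinction coefficient of 6,220 M⁻¹ cm⁻¹ and a 1-cm path length, neither of which is specified in this section, and the function name and example data are hypothetical.

```python
# Hypothetical sketch: initial rate of NADH consumption from A340 readings.
# Assumes detection at 340 nm, where NADH has an extinction coefficient of
# 6,220 M^-1 cm^-1; the detailed methods in SI Appendix take precedence.
EPSILON_NADH_340 = 6220.0  # M^-1 cm^-1


def nadh_rate_um_per_min(times_min, absorbances, path_cm=1.0):
    """Return the NADH consumption rate in micromolar per minute.

    Fits a least-squares line to absorbance versus time and converts the
    slope (absorbance units per minute) to concentration per minute.
    """
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_min, absorbances))
    slope = sxy / sxx  # negative while NADH is being consumed
    # Beer-Lambert: dC/dt = (dA/dt) / (epsilon * path); 1e6 converts M to uM.
    return -slope / (EPSILON_NADH_340 * path_cm) * 1e6


# Made-up example: absorbance falls by 0.0622 per minute.
example_times = [0.0, 1.0, 2.0, 3.0]
example_a340 = [0.622, 0.5598, 0.4976, 0.4354]
example_rate = nadh_rate_um_per_min(example_times, example_a340)
```

Only the linear, initial portion of a trace should be passed to such a function, and any background rate of NADH oxidation measured without substrate would need to be subtracted separately.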
Acknowledgments
We thank Dr. Eric M. Shepard for collection of electron paramagnetic resonance spectra, the Diamond Light Source (Didcot) for beam time (proposal no. MX-23269), and the staff at beamline I03 for support. NIH Grant R35GM136390 (to W.M.K., R.C., A.R., J.L.B., and J.L.D.), NSF Grant MCB1715176 (to W.M.K., R.C., A.R., J.L.B., and J.L.D.), and an NSF Graduate Research Fellowship Grant W9057 (to R.C.) supported work at Montana State University. M.Z. and J.E.M. acknowledge Research England for Expanding Excellence in England (E3) funding. Funding was provided by US Department of Energy, Office of Energy Efficiency and Renewable Energy, Advanced Manufacturing Office (AMO) and Bioenergy Technologies Office (BETO). This work was performed as part of the Bio-Optimized Technologies to Keep Thermoplastics out of Landfills and the Environment (BOTTLE) Consortium and was supported by AMO and BETO under National Renewable Energy Laboratory (NREL) Contract DE-AC36-08GO28308, operated by Alliance for Sustainable Energy, LLC. The BOTTLE Consortium includes members from Montana State University and the University of Portsmouth funded under NREL Contract DE-AC36-08GO28308. The views expressed in the article are those of the authors and do not necessarily represent the views of the NSF, the Department of Energy, or the US Government.
Footnotes
The authors declare no competing interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2121426119/-/DCSupplemental.
Data Availability
Crystallographic coordinates have been deposited in PDB (ID codes 7Q04, 7Q05, and 7Q06) (58). All additional raw sequence and spectroscopic data used in this paper are available in the SI Appendix.
References
- 1. Yoshida S., et al., A bacterium that degrades and assimilates poly(ethylene terephthalate). Science 351, 1196–1199 (2016).
- 2. Bornscheuer U. T., MICROBIOLOGY. Feeding on plastic. Science 351, 1154–1155 (2016).
- 3. Tournier V., et al., An engineered PET depolymerase to break down and recycle plastic bottles. Nature 580, 216–219 (2020).
- 4. Wei R., Zimmermann W., Biocatalysis as a green route for recycling the recalcitrant plastic polyethylene terephthalate. Microb. Biotech. 10, 1302–1307 (2017).
- 5. Ellis L. D., et al., Chemical and biological catalysis for plastics recycling and upcycling. Nat. Catal. 4, 539–556 (2021).
- 6. Martin A. J., Mondelli C., Jaydev S. D., Perez-Ramirez J., Catalytic processing of plastic waste on the rise. Chem 7, 1487–1533 (2021).
- 7. Nicholson S. R., Rorrer N. A., Carpenter A. C., Beckham G. T., Manufacturing energy and greenhouse gas emissions associated with plastics consumption. Joule 5, 673–686 (2021).
- 8. Narancic T., et al., Genome analysis of the metabolically versatile Pseudomonas umsongensis GO16: The genetic basis for PET monomer upcycling into polyhydroxyalkanoates. Microb. Biotechnol. 14, 2463–2480 (2021).
- 9. Li W. J., et al., Laboratory evolution reveals the metabolic and regulatory basis of ethylene glycol metabolism by Pseudomonas putida KT2440. Environ. Microbiol. 21, 3669–3682 (2019).
- 10. Franden M. A., et al., Engineering Pseudomonas putida KT2440 for efficient ethylene glycol utilization. Metab. Eng. 48, 197–207 (2018).
- 11. Choi K. Y., et al., Molecular and biochemical analysis of phthalate and terephthalate degradation by Rhodococcus sp. strain DK17. FEMS Microbiol. Lett. 252, 207–213 (2005).
- 12. Sasoh M., et al., Characterization of the terephthalate degradation genes of Comamonas sp. strain E6. Appl. Environ. Microbiol. 72, 1825–1832 (2006).
- 13. Wang Y. Z., Zhou Y., Zylstra G. J., Molecular analysis of isophthalate and terephthalate degradation by Comamonas testosteroni YZW-D. Environ. Health Perspect. 103 (suppl. 5), 9–12 (1995).
- 14. Hara H., Eltis L. D., Davies J. E., Mohn W. W., Transcriptomic analysis reveals a bifurcated terephthalate degradation pathway in Rhodococcus sp. strain RHA1. J. Bacteriol. 189, 1641–1647 (2007).
- 15. Schläfli H. R., Weiss M. A., Leisinger T., Cook A. M., Terephthalate 1,2-dioxygenase system from Comamonas testosteroni T-2: Purification and some properties of the oxygenase component. J. Bacteriol. 176, 6644–6652 (1994).
- 16. Kasai D., Kitajima M., Fukuda M., Masai E., Transcriptional regulation of the terephthalate catabolism operon in Comamonas sp. strain E6. Appl. Environ. Microbiol. 76, 6047–6055 (2010).
- 17. Fukuhara Y., Kasai D., Katayama Y., Fukuda M., Masai E., Enzymatic properties of terephthalate 1,2-dioxygenase of Comamonas sp. strain E6. Biosci. Biotechnol. Biochem. 72, 2335–2341 (2008).
- 18. Zheng J., Suh S., Strategies to reduce the global carbon footprint of plastics. Nat. Clim. Chang. 9, 374–378 (2019).
- 19. Singh A., et al., Techno-economic, life-cycle, and socioeconomic impact analysis of enzymatic recycling of poly(ethylene terephthalate). Joule 5, 2479–2503 (2021).
- 20. Shigematsu T., Yumihara K., Ueda Y., Morimura S., Kida K., Purification and gene cloning of the oxygenase component of the terephthalate 1,2-dioxygenase system from Delftia tsuruhatensis strain T7. FEMS Microbiol. Lett. 220, 255–260 (2003).
- 21. Pardo I., et al., Gene amplification, laboratory evolution, and biosensor screening reveal MucK as a terephthalic acid transporter in Acinetobacter baylyi ADP1. Metab. Eng. 62, 260–274 (2020).
- 22. Tiso T., et al., Towards bio-upcycling of polyethylene terephthalate. Metab. Eng. 66, 167–178 (2021).
- 23. Kim H. T., et al., Biological valorization of poly(ethylene terephthalate) monomers for upcycling waste PET. ACS Sustain. Chem. Eng. 7, 19396–19406 (2019).
- 24. Kim H. T., et al., Chemo-biological upcycling of poly(ethylene terephthalate) to multifunctional coating materials. ChemSusChem 14, 4251–4259 (2021).
- 25. Kim D. H., et al., One-pot chemo-bioprocess of PET depolymerization and recycling enabled by a biocompatible catalyst, betaine. ACS Catal. 11, 3996–4008 (2021).
- 26. Werner A. Z., et al., Tandem chemical deconstruction and biological upcycling of poly(ethylene terephthalate) to β-ketoadipic acid by Pseudomonas putida KT2440. Metab. Eng. 67, 250–261 (2021).
- 27. Ferraro D. J., Gakhar L., Ramaswamy S., Rieske business: Structure-function of Rieske non-heme oxygenases. Biochem. Biophys. Res. Commun. 338, 175–190 (2005).
- 28. Venturi V., Zennaro F., Degrassi G., Okeke B. C., Bruschi C. V., Genetics of ferulic acid bioconversion to protocatechuic acid in plant-growth-promoting Pseudomonas putida WCS358. Microbiology 144, 965–973 (1998).
- 29. Wolfe M. D., Parales J. V., Gibson D. T., Lipscomb J. D., Single turnover chemistry and regulation of O2 activation by the oxygenase component of naphthalene 1,2-dioxygenase. J. Biol. Chem. 276, 1945–1953 (2001).
- 30. Karlsson A., et al., Crystal structure of naphthalene dioxygenase: Side-on binding of dioxygen to iron. Science 299, 1039–1042 (2003).
- 31. Resnick S. M., Lee K., Gibson D. T., Diverse reactions catalyzed by naphthalene dioxygenase from Pseudomonas sp. strain NCIB 9816. J. Ind. Microbiol. 17, 438–457 (1996).
- 32. Ferraro D. J., Okerlund A., Brown E., Ramaswamy S., One enzyme, many reactions: Structural basis for the various reactions catalyzed by naphthalene 1,2-dioxygenase. IUCrJ 4, 648–656 (2017).
- 33. Kweon O., et al., A new classification system for bacterial Rieske non-heme iron aromatic ring-hydroxylating oxygenases. BMC Biochem. 9, 11 (2008).
- 34. Capyk J. K., Eltis L. D., Phylogenetic analysis reveals the surprising diversity of an oxygenase class. J. Biol. Inorg. Chem. 17, 425–436 (2012).
- 35. Ambrose K. V., et al., Functional characterization of salicylate hydroxylase from the fungal endophyte Epichloë festucae. Sci. Rep. 5, 10939 (2015).
- 36. Chang H. K., Mohseni P., Zylstra G. J., Characterization and regulation of the genes for a novel anthranilate 1,2-dioxygenase from Burkholderia cepacia DBO1. J. Bacteriol. 185, 5871–5881 (2003).
- 37. Jaganaman S., Pinto A., Tarasev M., Ballou D. P., High levels of expression of the iron-sulfur proteins phthalate dioxygenase and phthalate dioxygenase reductase in Escherichia coli. Protein Expr. Purif. 52, 273–279 (2007).
- 38. Hou Y.-J., Guo Y., Li D.-F., Zhou N.-Y., Structural and biochemical analysis reveals a distinct catalytic site of salicylate 5-monooxygenase NagGH from Rieske dioxygenases. Appl. Environ. Microbiol. 87, e01629-20 (2021).
- 39. Rogers T. A., Bommarius A. S., Utilizing simple biochemical measurements to predict lifetime output of biocatalysts in continuous isothermal processes. Chem. Eng. Sci. 65, 2118–2124 (2010).
- 40. Wolfe M. D., et al., Benzoate 1,2-dioxygenase from Pseudomonas putida: Single turnover kinetics and regulation of a two-component Rieske dioxygenase. Biochemistry 41, 9611–9626 (2002).
- 41. Ohta T., Chakrabarty S., Lipscomb J. D., Solomon E. I., Near-IR MCD of the nonheme ferrous active site in naphthalene 1,2-dioxygenase: Correlation to crystallography and structural insight into the mechanism of Rieske dioxygenases. J. Am. Chem. Soc. 130, 1601–1610 (2008).
- 42. Holm L., Using DALI for protein structure comparison. Methods Mol. Biol. 2112, 29–42 (2020).
- 43. Fang T., Zhou N. Y., Purification and characterization of salicylate 5-hydroxylase, a three-component monooxygenase from Ralstonia sp. strain U2. Appl. Microbiol. Biotechnol. 98, 671–679 (2014).
- 44. Ferraro D. J., Okerlund A. L., Mowers J. C., Ramaswamy S., Structural basis for regioselectivity and stereoselectivity of product formation by naphthalene 1,2-dioxygenase. J. Bacteriol. 188, 6986–6994 (2006).
- 45. Gerlt J. A., et al., Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST): A web tool for generating protein sequence similarity networks. Biochim. Biophys. Acta 1854, 1019–1037 (2015).
- 46. Colbert C. L., et al., Structural characterization of Pandoraea pnomenusa B-356 biphenyl dioxygenase reveals features of potent polychlorinated biphenyl-degrading enzymes. PLoS One 8, e52550 (2013).
- 47. Qiu J., et al., Identification and characterization of a novel pic gene cluster responsible for picolinic acid degradation in Alcaligenes faecalis JQ135. J. Bacteriol. 201, e00077-19 (2019).
- 48. Rogers M. S., Lipscomb J. D., Salicylate 5-hydroxylase: Intermediates in aromatic hydroxylation by a Rieske monooxygenase. Biochemistry 58, 5305–5319 (2019).
- 49. Pinyakong O., Habe H., Yoshida T., Nojiri H., Omori T., Identification of three novel salicylate 1-hydroxylases involved in the phenanthrene degradation of Sphingobium sp. strain P2. Biochem. Biophys. Res. Commun. 301, 350–357 (2003).
- 50. Jouanneau Y., Micoud J., Meyer C., Purification and characterization of a three-component salicylate 1-hydroxylase from Sphingomonas sp. strain CHY-1. Appl. Environ. Microbiol. 73, 7515–7521 (2007).
- 51. Guengerich F. P., Common and uncommon cytochrome P450 reactions related to metabolism and chemical toxicity. Chem. Res. Toxicol. 14, 611–650 (2001).
- 52. Harwood C. S., Parales R. E., The beta-ketoadipate pathway and the biology of self-identity. Annu. Rev. Microbiol. 50, 553–590 (1996).
- 53. Jamarani R., Erythropel H. C., Nicell J. A., Leask R. L., Marić M., How green is your plasticizer? Polymers (Basel) 10, 834 (2018).
- 54. Fukuhara Y., et al., Characterization of the isophthalate degradation genes of Comamonas sp. strain E6. Appl. Environ. Microbiol. 76, 519–527 (2010).
- 55. Roy R. N., Bioactive natural derivatives of phthalate ester. Crit. Rev. Biotechnol. 40, 913–929 (2020).
- 56. Mahto J. K., et al., Molecular insights into substrate recognition and catalysis by phthalate dioxygenase from Comamonas testosteroni. J. Biol. Chem. 297, 101416 (2021).
- 57. Parales R. E., et al., Substrate specificity of naphthalene dioxygenase: Effect of specific amino acids at the active site of the enzyme. J. Bacteriol. 182, 1641–1649 (2000).
- 58. Zahn M., McGeehan J. E., 7Q04, 7Q05, and 7Q06 PDB entries with coordinate files. Protein Data Bank. http://www.rcsb.org/pdb/explore/explore.do?structureId=7Q04, http://www.rcsb.org/pdb/explore/explore.do?structureId=7Q05, and http://www.rcsb.org/pdb/explore/explore.do?structureId=7Q06. Deposited 14 October 2021.