Author manuscript; available in PMC: 2021 Nov 20.
Published in final edited form as: ACS Chem Biol. 2020 Nov 5;15(11):3013–3020. doi: 10.1021/acschembio.0c00663

Genome Mining and Metabolomics Uncover a Rare d-Capreomycidine Containing Natural Product and Its Biosynthetic Gene Cluster

James H Tryon 1, Jennifer C Rote 2, Li Chen 3, Matthew T Robey 4, Marvin M Vega 5, Wan Cheng Phua 6, William W Metcalf 7, Kou-San Ju 8, Neil L Kelleher 9, Regan J Thomson 10
PMCID: PMC7830813  NIHMSID: NIHMS1662462  PMID: 33151679

Abstract

We report the metabolomics-driven genome mining of a new cyclic-guanidino incorporating non-ribosomal peptide synthetase (NRPS) gene cluster and full structure elucidation of its associated hexapeptide product, faulknamycin. Structural studies unveiled that this natural product contained the previously unknown (R,S)-stereoisomer of capreomycidine, d-capreomycidine. Furthermore, heterologous expression of the identified gene cluster successfully reproduces faulknamycin production without an observed homologue of VioD, the pyridoxal phosphate (PLP)-dependent enzyme found in all previous l-capreomycidine biosynthesis. An alternative NRPS-dependent pathway for d-capreomycidine biosynthesis is proposed.


INTRODUCTION

Human use of natural product mixtures for both medical and industrial purposes dates to the beginnings of recorded history.1,2 Technological advancements during the 20th century in microbiology, chemical purification, structural elucidation, and synthetic organic chemistry allowed for explosive growth in the utilization and understanding of natural products and their derivatives from a myriad of producing organisms.3 Within the microbial world, the Gram-positive actinomycete bacteria proved particularly fruitful, yielding upward of 75% of microbially derived pharmaceutical compounds discovered during the 20th century.4 Despite the enormous benefit reaped from actinomycetes, limitations facing canonical discovery methods began pushing industrial and academic drug discovery away from natural products. Rediscovery of known compounds from isolated actinomycetes emerged as the primary limitation facing bioactivity-driven fractionation efforts. This situation led to the realization that many of the most potent bioactive compounds isolated during the past century are observable at high frequency within actinomycete chemical space, frustrating the discovery of less frequent metabolites.5

Recent analyses of microbial genomes, however, have revealed that a vast number of natural products encoded within actinomycete genomes have evaded isolation.6–8 This realization sparked global efforts to leverage genomic information to access novel metabolites, which resulted in a range of interdisciplinary approaches collectively known as genome mining.9 These genome mining platforms, fueled by developments in molecular biology, have proven fruitful in recent years, affording access to many novel compounds, reinvigorating interest in natural products.10–12

Despite the many successes of genome mining approaches, most developed methods rely on a “one by one” approach to analyzing biosynthetic gene clusters and finding their associated metabolites, often through heterologous expression of the gene cluster of interest in a modified host organism. Without prior details regarding the metabolite’s structure or biological activity, detection and subsequent elucidation of the actual natural product from the heterologous host can still be challenging. We developed a platform termed “metabologenomics” as one option to accelerate natural product discovery by correlating gene cluster inheritance patterns to compound diversity across actinomycete chemical space.13 Metabologenomics only requires two types of general input data for each strain in the study: gene cluster families (GCFs) extracted from genome assemblies and untargeted LC/MS2 metabolomics files. In contrast with discovery pipelines based on bioassays or chemical derivatization, untargeted metabolomics grants our approach minimal structural biases and facilitates detection of compounds produced at low concentrations. Previously we have leveraged metabologenomics in the discovery of several natural product families alongside their corresponding GCFs.13–16

Recently, we became interested in using metabologenomics to mine for rare biosynthetic motifs that would otherwise be difficult to target. One such motif, represented by the cyclic guanidino-amino acids, l-capreomycidine (1) and l-epi-capreomycidine (2), has been previously identified within a limited range of actinomycete natural products (Figure 1).17–20 For example, l-capreomycidine (1) is incorporated into the tuberactin antibiotics capreomycin IA (3) and viomycin (4), while l-epi-capreomycidine (2) is found within the protease inhibitor chymostatin (5) and the nucleoside-containing antibiotic muraymycin C4 (6). Guanidino functionalized metabolites generally have increased membrane permeability, and cyclization of the arginine to generate cyclic urea structures such as 1 and 2 is hypothesized to confer resistance to degradation pathways, thus making molecules possessing such motifs desirable targets for discovery.21,22

Figure 1.


(a) Previously identified stereoisomers of the arginine-derived cyclic-guanidino amino acid, l-capreomycidine (1) and l-epi-capreomycidine (2). (b) Representative previously observed capreomycidine-incorporating natural products. (c) Previously unidentified stereoisomers of capreomycidine, d-capreomycidine (7) and d-epi-capreomycidine (8), possessing enantiomeric structures to 1 and 2.

Prior to the work reported herein, only l-capreomycidine (1), possessing the 2S,3R stereochemistry, and l-epi-capreomycidine (2), possessing the 2S,3S stereochemistry, had been observed in nature; neither of the corresponding enantiomeric capreomycidine stereoisomers (i.e., 7 and 8) had been found in any natural product (Figure 1c). Given the important activity of guanidino-containing natural products and the interesting aspects of their biosynthetic formation, we sought to mine our metabologenomics data set for cyclic-guanidino encoding GCFs. Here we report the discovery, complete structure, and proposed biosynthesis of a natural product incorporating the previously unobserved amino acid d-capreomycidine (7).

RESULTS AND DISCUSSION

Genome Mining and Correlations Data.

In studies of tuberactin biosynthesis (viomycin and capreomycin), l-capreomycidine (1) has been shown to be biosynthesized from arginine (9) in three steps: C3 hydroxylation, dehydration, and subsequent urea cyclization. Hydroxylation was demonstrated to be catalyzed by homologues of VioC, while dehydration and cyclization were catalyzed by VioD homologues via a pyridoxal phosphate (PLP)-mediated mechanism (Figure 2).22,23

Figure 2.


Established synthesis of l-capreomycidine (1) from studies of tuberactin biosynthesis.

Our metabologenomics data set is composed of metabolomics data and genomic sequences from diverse actinomycete strains. Informed by knowledge of tuberactin biosynthesis, we searched our genomic data for homologues of VioC, an α-ketoglutarate dependent oxygenase known to catalyze arginine β-hydroxylation. We identified a gene cluster family in our data set, NRPS_GCF.259, containing a VioC homologue (Figure S1, NRPS = non-ribosomal peptide synthetase). Curiously, no homologue of VioD, the enzyme involved in the second step of cyclization, was observed within the identified gene cluster (Figure 3). Furthermore, one AT didomain in NRPS_GCF.259 showed a high similarity to the didomain responsible for cyclic-arginine incorporation in mannopeptimycin biosynthesis (Table S1). Interestingly, mannopeptimycin incorporates a β-hydroxylated species instead of the 6-membered capreomycidine ring found in capreomycin.

Figure 3.


Annotation of ORFs within the NRPS_GCF.259 biosynthetic gene cluster and their proposed functions.

Our metabologenomics platform allowed us to prioritize putative metabolites synthesized by NRPS_GCF.259. NRPS_GCF.259 was observed in 4 of the 241 studied strains. A pair of ions, 14323_P113 and 13338_P113, were identified by our scoring method as the ions most strongly related to the target cluster. These ions were observed in 3 of the 4 strains, giving each a correlation p-value with NRPS_GCF.259 of P = 5.15 × 10⁻²⁴ (Figure S2). Pragmatically, 13338_P113 was observed at 31 times higher relative abundance than 14323_P113 and was thus selected for further analysis. Ion 13338_P113 was detected as a 2+ ion with an m/z of 375.7107. The corresponding 1+ charge state, an ion with an m/z of 750.4144, was also observed but at a lower relative intensity (Figure S3). The neutral molecular formula was calculated to be C34H55O10N9 with a mass error of 0.060 ppm. In further support of our hypothesis, a strong fragment ion for a capreomycidine ring was observed in the MS2 scan at m/z 98.072. At this point, the molecule associated with ion 13338_P113 was assigned the name faulknamycin.
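The mass arithmetic in this paragraph can be checked with a short calculation. The following is a minimal sketch and not part of the original analysis: it assumes standard monoisotopic masses for C, H, N, and O and a proton mass of 1.007276 Da, and the function names are illustrative. It computes the neutral monoisotopic mass of C34H55O10N9 and the expected m/z of the singly and doubly protonated ions for comparison with the observed values.

```python
# Monoisotopic masses (Da) of the most abundant isotopes (standard reference values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of a proton (Da)


def neutral_mass(formula):
    """Sum the monoisotopic masses for an element -> count mapping."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def protonated_mz(mass, charge):
    """m/z of the [M + zH]z+ ion for a neutral monoisotopic mass."""
    return (mass + charge * PROTON) / charge


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Molecular formula reported in the text for faulknamycin.
faulknamycin = {"C": 34, "H": 55, "O": 10, "N": 9}
mass = neutral_mass(faulknamycin)

# Observed m/z values reported in the text for the 1+ and 2+ ions.
for charge, observed in ((1, 750.4144), (2, 375.7107)):
    theoretical = protonated_mz(mass, charge)
    print(f"z={charge}: theoretical m/z {theoretical:.4f}, observed {observed:.4f}, "
          f"error {ppm_error(observed, theoretical):+.2f} ppm")
```

Under these assumptions both charge states agree with the observed ions to within 1 ppm. The exact ppm figure depends on the mass table and rounding used, so it need not reproduce the reported 0.060 ppm.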

Long Read Sequencing and NRPS_GCF.259 Cluster Ontology.

Within the metabologenomics data set, representative gene clusters for NRPS_GCF.259 were observed in 4 strains out of 241 included in the analysis at the time of target selection. Identical contig break patterns were observed within 3 of the 4 clusters, suggesting that a highly repetitive region within the NRPS gene was frustrating genome assembly. To obtain the complete gene cluster sequence, Streptomyces griseus NRRL B-2307 was sequenced using an Oxford Nanopore MinION kit.24 High-molecular-weight DNA was isolated using a Circulomics kit with a modified protocol.25 DNA was prepared for sequencing using the Oxford Nanopore Technologies Rapid Sequencing Kit according to the manufacturer’s instructions. Sequencing reads were acquired for 12 hours using a MinION device, with live base-calling using MinKNOW software. Long-read sequencing data were assembled along with Illumina reads (corresponding to GenBank accession NZ_JNZI00000000) using SPAdes 3.11.1 with default hybrid assembly parameters. The resulting assembly successfully closed the contig gap and afforded the complete cluster (Figure 3).26 Notably, the NRPS genes, FauE and FauG, were observed without an encoded offloading domain. Two freestanding open reading frames that could fulfill this role were observed, however. FauL, a type-two thioesterase, was encoded five ORFs downstream from the NRPS genes. FauD, a peptidase with homology to the offloading enzymes from surugamide and mannopeptimycin biosyntheses, was observed adjacent to FauE. This enzyme is involved in trans-offloading of NRPs from biosynthetic machinery.27 While the VioC homologue FauH was observed, no VioD equivalent or PLP-binding domain was observed within the vicinity of the gene cluster. This indicated that cyclization to produce capreomycidine either was occurring through a potentially novel mechanism or was catalyzed by enzymatic machinery located outside of the NRPS_GCF.259 biosynthetic cluster. FauJ was identified as a homologue of a tyrosine β-hydroxylase.

Heterologous Expression.

Heterologous expression of faulknamycin was performed to verify the MS/GCF correlation obtained by our metabologenomics methodology and to obtain additional insights into the function of the cluster. Because the cluster contained no VioD homologue (see above), we were particularly interested to establish whether capreomycidine formation was catalyzed by the standalone gene cluster or whether its formation required additional enzymes encoded by genes located elsewhere in the B-2307 genome. If the latter were true, we anticipated formation and observation of a faulknamycin-like structure possessing a β-hydroxyarginine in place of the cyclized capreomycidine unit.

A fosmid library was constructed from the genomic DNA of B-2307, and clones encoding the faulknamycin gene cluster were identified by PCR. After screening 2304 colonies, no fosmid bearing the complete cluster was observed. Partial clones (Figures S4–S6) were combined and trimmed by lambda red-mediated recombination before verification by MinION sequencing.28–30 The resulting 60 kb clone was recombined into the chromosome of Streptomyces lividans 66 via ΦC31 integrase. The resulting strain produced faulknamycin after growth on mannitol soy plates, as clearly evidenced by LC-MS2 (Figure 4), establishing that NRPS_GCF.259 was responsible for the production of faulknamycin. The S. lividans genome shows no genes bearing significant similarity to VioD when queried for highly similar sequences via Megablast using an expect threshold of 0.05. Therefore, it is likely that NRPS_GCF.259 is capable of cyclizing arginine in the absence of a VioD-like enzyme.

Figure 4.


Extracted ion chromatograms for the 750.415 m/z ion corresponding to faulknamycin production. Relative abundances normalized to 7 × 10⁷ ion count.

Faulknamycin Production Conditions.

Streptomyces griseus NRRL B-2307 was identified as the most robust producer from the previously analyzed pooled growth conditions and in comparison to S. lividans 66-pFau43, which produced the metabolite at significantly lower abundance. A glycerol stock of B-2307 mycelium was streaked out onto an ISP2 agar plate and grown at 30 °C for 5 days. A single colony was used to inoculate 5 mL liquid cultures of ISP2 media. This starter culture was incubated at 30 °C for 5 days before aliquots were withdrawn to inoculate a variety of liquid and agar media. These were grown for 10 days before screening spent media for faulknamycin by LC-MS. B-2307 produced detectable amounts of the target ion only when grown as a lawn on ISP2 and ISP4 plates. Production of the target ion was not observed on other substrates or liquid growths. Higher relative production was observed from ISP2 media, and all subsequent studies were facilitated by growth on ISP2 agar plates.

Isotope Incorporation.

A series of isotope incorporation studies was designed based upon the NRPS monomer predictions to probe the biosynthesis and obtain structural data. The NRPS gene sequences from NRPS_GCF.259 were analyzed by AntiSMASH/SANDPUMA, and the resulting A-domain loading predictions were used to choose a panel of heavy amino acids for isotope feeding experiments.31,32 Isotopically enriched agar plates were prepared by addition of 1 mg of labeled amino acid to each 5 mL of agar via sterile filtration. Excess moisture was removed by drying the plates in a biosafety cabinet overnight prior to inoculation. Spent media was screened after 10 days by LC-MS. Incorporation of valine, phenylalanine, threonine, and arginine was observed (Figure S7). To further probe our cyclic-arginine hypothesis, ornithine-d7 was fed to strain B-2307. Ornithine is readily converted to arginine by Streptomyces strains and allows for the observation of side chain modification events.33 In agreement with our cyclic-arginine hypothesis, ornithine-d7 incorporation was observed with the loss of 2 side chain deuterons (Figure 5). No incorporation of glutamine, aspartic acid, cysteine, tryptophan, or histidine was observed. Isotope incorporation patterns corroborated NRPS_GCF.259 as the cyclic guanidino-incorporating gene cluster.
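The expected mass shifts for the ornithine-d7 feeding experiment can be worked out directly. The sketch below is illustrative only and is not the authors' analysis: it assumes that each retained deuteron replaces exactly one hydrogen in the product, uses standard monoisotopic masses for 1H and 2H, and takes the [M+H]+ value of 750.4144 reported earlier in the text as the unlabeled reference. The function and constant names are hypothetical.

```python
# Mass difference (Da) when one hydrogen (1H) is replaced by deuterium (2H).
DEUTERIUM_SHIFT = 2.01410178 - 1.00782503


def labeled_mz(unlabeled_mz, deuterons_retained, charge=1):
    """Expected m/z after incorporating a precursor that retains the given number of deuterons."""
    return unlabeled_mz + deuterons_retained * DEUTERIUM_SHIFT / charge


UNLABELED_MZ = 750.4144  # [M+H]+ of faulknamycin reported in the text
LABELED_POSITIONS = 7    # ornithine-d7 carries seven deuterons

# Compare full retention with the loss of one or two deuterons.
for lost in range(3):
    retained = LABELED_POSITIONS - lost
    print(f"{lost} deuterons lost -> +{retained * DEUTERIUM_SHIFT:.4f} Da, "
          f"[M+H]+ at m/z {labeled_mz(UNLABELED_MZ, retained):.4f}")
```

Under these assumptions, loss of two deuterons corresponds to a shift of roughly +5.03 Da on the singly charged ion (half that on the 2+ ion), which distinguishes it from full d7 retention at roughly +7.04 Da.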

Figure 5.


Ornithine-d7 incorporation pattern suggested two enzymatic steps acting upon the arginine side chain during biosynthesis.

Growth and Purification of Faulknamycin.

Faulknamycin was purified from two separate large-scale cultures. Separate batches consisting of 8 and 12 L of ISP2 agar plates were prepared and inoculated with liquid cultures of B-2307. Mycelium grown in liquid culture was added to the plate surface and shaken with glass beads to disrupt mycelium and enable even lawn growth. Inoculated plates were incubated at 30 °C for 10 days. The agar was processed by freezing plates at −20 °C for 24 h and then allowing them to thaw at ambient temperature overnight. Plates were then manually squeezed to collect liquid, and the resulting liquid mixture was filtered to remove cellular material, spores, and other solids. The liquid was stirred overnight with Amberlite XAD16N beads. After LC-MS analysis of the supernatant showed complete adsorption, the beads were collected and rinsed with 4 L of deionized water. More polar metabolites were eluted with four successive 500 mL volumes of 20% methanol/water. Enriched faulknamycin was eluted with four 500 mL volumes of 50% methanol. This solution was concentrated, and the resulting powder was dissolved in water before binding to 1 g HLB cartridges. Cartridges were washed with 10 mL each of water, 5% acetonitrile, and 20% acetonitrile. The target ion was then collected with 50% acetonitrile. This material was further purified by reverse phase HPLC until fractions containing the target ion appeared pure by LC-MS, affording 0.9 mg of material.

Structure Elucidation by NMR Spectroscopy and Edman Sequencing.

Solubility limitations with purified material became immediately apparent. Attempts to solubilize the peptide in organic solvents, including DMSO, proved futile. Furthermore, solubility was poor in deuterium oxide: 0.5 mL only partially solubilized the purified material after extended sonication. Despite these challenges, enough material was dissolved and filtered to provide TOCSY, COSY, and HSQC data used to characterize individual spin systems (Figures S9–S12). Two nonproteinogenic amino acids observed by mass shifts in MS2 scans were verified to be a six-membered capreomycidine and a β-hydroxylated phenylalanine by COSY spectroscopy. While these individual amino acid fragments could be established using NMR spectroscopy, confirming their sequence within the natural product proved challenging due to solubility constraints. Due to N-hydrogen exchange in D2O, we were unable to observe the peptide through intraresidue HMBC spectroscopy. To verify the linear sequence, we turned to Edman sequencing and MS2 de novo peptide sequencing for the purified peptide and observed the sequence order of three proteinogenic amino acids and three nonproteinogenic amino acids, respectively (Figures S13–S20). Edman sequencing showed that the first two amino acids were nonproteinogenic and eluted near the standards for threonine and phenylalanine. The third and fourth amino acids were unambiguously confirmed to be leucine and valine, respectively. The fifth amino acid eluted close to the arginine standard. The final amino acid eluted as a threonine monomer. This sequence was further verified by de novo peptide sequencing. Interestingly, the sequence disagreed with the linear sequence predicted bioinformatically. Based upon the genetic sequence of NRPS_GCF.259, we would expect FauE and FauG to biosynthesize Phe-Leu-Val and Arg-Thr-Thr derived peptide segments, respectively. Conjugation of these peptide segments would then lead to either Phe-Leu-Val-Arg-Thr-Thr or Arg-Thr-Thr-Phe-Leu-Val derived hexapeptides. However, we observed a Thr-Phe-Leu-Val-Arg-Thr derived hexapeptide. In the absence of an observable cyclic precursor, this sequence rearrangement raises questions regarding the domain firing order during biosynthesis.
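The relationship between the predicted and observed sequences can be made explicit with a small comparison. The sketch below is illustrative and uses only the residue orders stated in the text; the variable and function names are hypothetical, and residues are compared by parent amino acid name only, ignoring stereochemistry and side-chain modification.

```python
def rotations(sequence):
    """Return every cyclic rotation of a residue list as a tuple."""
    return [tuple(sequence[i:] + sequence[:i]) for i in range(len(sequence))]


FAU_E = ["Phe", "Leu", "Val"]  # segment predicted for FauE
FAU_G = ["Arg", "Thr", "Thr"]  # segment predicted for FauG
OBSERVED = ("Thr", "Phe", "Leu", "Val", "Arg", "Thr")  # order from Edman and MS2 sequencing

# The two linear hexapeptides expected from joining the segments end to end.
predicted = {"FauE-FauG": FAU_E + FAU_G, "FauG-FauE": FAU_G + FAU_E}

for name, sequence in predicted.items():
    linear_match = tuple(sequence) == OBSERVED
    rotation_match = OBSERVED in rotations(sequence)
    print(f"{name}: {'-'.join(sequence)} | linear match: {linear_match} | "
          f"cyclic rotation match: {rotation_match}")
```

Neither concatenation matches the observed order directly, but the observed sequence is a cyclic rotation of both, equivalent to starting the chain at the final Thr position of the FauG segment.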

Proposed Linear Biosynthesis.

The unusual peptide sequence observed for faulknamycin prompted questions regarding the biosynthesis of the isolated compound. We initially hypothesized the existence of a cyclic precursor undergoing peptide bond cleavage to yield the isolated compound. However, no mass corresponding to a cyclic precursor was observed above the limit of detection during a 10-day time course monitored by LC-MS (Figure S21). Furthermore, we searched MS2 spectra from the time course data for possible precursor species bearing the diagnostic capreomycidine fragment ion but observed no larger putative precursor ions. The putative offloading enzyme, FauD, bears significant similarity to the trans-offloading enzyme SurE identified from the surugamide biosynthetic gene cluster.27 We propose an unusual initiation step for faulknamycin biosynthesis, namely, that the C-terminal module of FauG initiates faulknamycin biosynthesis, with its adenylation domain (A3) activating, epimerizing, and loading Thr to start the assembly line. In support of this, both FauE A1 and FauG A1 are preceded by condensation domains, indicating that neither initiates NRP biosynthesis (Figure 6).
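To make the cyclic-precursor search concrete, the m/z that such a species would show can be estimated. This is a hedged sketch rather than the authors' procedure: it assumes the hypothetical precursor is a head-to-tail cyclized form of the same hexapeptide, so that it differs from linear faulknamycin by exactly one water, and it starts from the [M+H]+ value of 750.4144 reported in the text. The function name is hypothetical.

```python
WATER = 2 * 1.00782503 + 15.99491462  # monoisotopic mass of H2O (Da)
PROTON = 1.00727647                   # mass of a proton (Da)


def cyclized_mz(linear_mz, linear_charge=1, target_charge=1):
    """m/z of a hypothetical cyclic form that differs from the linear peptide by one water."""
    neutral_linear = linear_mz * linear_charge - linear_charge * PROTON
    neutral_cyclic = neutral_linear - WATER
    return (neutral_cyclic + target_charge * PROTON) / target_charge


LINEAR_MZ = 750.4144  # [M+H]+ of faulknamycin reported in the text

for charge in (1, 2):
    print(f"hypothetical cyclic precursor, z={charge}: "
          f"m/z {cyclized_mz(LINEAR_MZ, 1, charge):.4f}")
```

Under this assumption the singly protonated cyclic species would appear about 18.011 m/z units below the linear [M+H]+ ion. Other precursor structures would give different values.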

Figure 6.


Proposed linear biosynthesis and peptidase-mediated offloading of faulknamycin (12).

Determination of Stereochemistry by Marfey’s Analysis.

Stereoisomers of threonine, leucine, and valine were obtained commercially. The four stereoisomers of β-hydroxyphenylalanine were synthesized via a diastereoselective aminohydroxylation method developed by Davies and co-workers (see Supporting Information for details).34 Equivalents to the four possible capreomycidine stereoisomers were obtained semisynthetically from hydrolyzed capreomycin and chymostatin. All stereocenters were determined by comparison of hydrolyzed faulknamycin to these standards using a standard Marfey’s assay (Figures S22–S25). In conjunction with the NMR spectroscopic analysis and Edman sequencing, these final Marfey’s assays enabled the complete structure of faulknamycin to be assigned as shown in Figure 6 (i.e., compound 12). The natural product contains l-valine, d-leucine, both l-threonine and d-allo-threonine, and the (2R,3S)-stereoisomer of β-hydroxyphenylalanine, presumably formed by the tyrosine β-hydroxylase homologue FauJ (see Figure 3). The observed d-capreomycidine stereochemistry (i.e., 2R,3S) has not previously been described or observed in any known natural product (see compound 7 in Figure 1 and Figure 5).

Proposed Capreomycidine Biosynthesis.

As described earlier, faulknamycin production was observed in the heterologous host in the absence of a VioD homologue, which was unusual since all previous reports of capreomycidine biosynthesis utilized this enzyme. Analysis of the Stachelhaus code from FauG-A1 shows high similarity to the binding pocket found in mannopeptimycin biosynthesis, indicating incorporation of a β-hydroxylated arginine species (i.e., 10) onto FauG (Figure 7).

Figure 7.


Previously established PLP-dependent biosynthesis of l-capreomycidine (1) and the proposed NRPS-dependent biosynthesis of d-capreomycidine (7) observed in the structure of faulknamycin (12).

This suggests that the epimerase domain on FauG catalyzes the dehydration of 10 to yield a FauG-bound 2,3-dehydroarginine species (i.e., 14), which then undergoes a stereocontrolled cyclization to deliver d-capreomycidine (7). Related dehydrations of hydroxylated amino acids to form unsaturated amino acids are observed in the biosynthesis of methoxyvinylglycine; however, in methoxyvinylglycine biosynthesis, there is a single divergent C domain.35 In this study, we propose that this reaction is catalyzed by an epimerase domain followed by a C domain. This divergent cyclization mechanism allows formation of the novel (2R,3S)-stereochemistry through a pathway distinct from the previously known PLP-catalyzed pathway seen in tuberactin biosynthesis.

Preliminary Bioactivity Assays.

Faulknamycin was tested in a dilution series for growth inhibition of Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter. Faulknamycin showed no activity up to its solubility limit (~1 mg·mL⁻¹). Faulknamycin also showed no detectable inhibition of human cancer cell lines or Aspergillus fumigatus.

CONCLUSION

Correlative approaches to natural product discovery, such as metabologenomics (our hybrid -omics discovery platform),13–16,36 provide early structural information from targeted metabolites and their associated gene clusters concurrently. This approach allows genome mining projects to target a wide diversity of compounds without the biases found in many other genome mining approaches. Previous reports from our laboratories focused on discovering novel metabolites and their gene clusters while remaining initially agnostic toward molecular substructures. The discovery of faulknamycin described here illustrates the power of our hybrid -omics approach when adapted for a more targeted genome mining strategy. Substructure-focused genome mining was conducted for a novel cyclic-guanidino incorporating natural product, resulting in the discovery of faulknamycin (12), a novel natural product incorporating the rare arginine-derived cyclic-guanidino amino acid capreomycidine. Full structural studies revealed that faulknamycin (12) incorporates a capreomycidine stereoisomer that had not been observed in previous natural products, namely, d-capreomycidine (7). Alongside this structural novelty, a biosynthetic divergence between faulknamycin and tuberactin was observed in the absence of the VioD homologues seen in all prior capreomycidine biosynthesis studies. Similarity between the Stachelhaus codes in faulknamycin (12) and mannopeptimycin domains suggests that faulknamycin (12) loads β-hydroxylated arginine (10) to FauG prior to C-domain mediated dehydration and cyclization, a mechanism distinct from the PLP-mediated VioD pathway. Efforts to fully elucidate the mechanism of the FauG-mediated synthesis of d-capreomycidine (7) are ongoing.

Additionally, the work herein illustrates the structural information that can be gained by combining MS2 analysis with the predicted structures generated via bioinformatics. As the tools available for MS2 analysis and structural predictions improve, the field can transition to an era in which natural product discovery and structural elucidation proceed without large-scale isolation efforts.

METHODS

Details of experimental procedures are provided in the Supporting Information.


ACKNOWLEDGMENTS

The research was supported by the National Institutes of Health under Award Numbers AT009143 (to N.L.K. and R.J.T.) and D012016 (Integrated Molecular Structure Education and Research Center at Northwestern University). J.C.R. gratefully acknowledges the National Science Foundation for the award of a Graduate Research Fellowship. W.C.P. gratefully acknowledges the award of an Undergraduate Research Grant from Northwestern University. K.S.J. acknowledges support from the Center for Applied Plant Sciences (Scientific Team Grant BPBFP) at The Ohio State University.

Footnotes

Supporting Information

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acschembio.0c00663.

Database construction, metabologenomics correlation scores, strains and plasmids for heterologous expression, LC-MS data for stable isotope incorporation, NMR spectra of 12, Edman degradation peptide sequencing data for 12, full experimental procedures for the synthesis of β-hydroxyphenylalanine standards, and LC chromatograms for Marfey’s analysis of 12 (PDF)

Complete contact information is available at: https://pubs.acs.org/10.1021/acschembio.0c00663

The authors declare the following competing financial interest(s): W.W.M., N.L.K., and R.J.T. declare competing financial interests in MicroMGx, Inc.

Contributor Information

James H. Tryon, Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States

Jennifer C. Rote, Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States

Li Chen, Department of Microbiology, The Ohio State University, Columbus, Ohio 43210, United States

Matthew T. Robey, Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States

Marvin M. Vega, Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States

Wan Cheng Phua, Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States

William W. Metcalf, Carl R. Woese Institute for Genomic Biology and The Department of Microbiology, University of Illinois at Urbana−Champaign, Urbana, Illinois 61801, United States

Kou-San Ju, Department of Microbiology and The Division of Medicinal Chemistry and Pharmacognosy, Center for Applied Plant Sciences, The Ohio State University, Columbus, Ohio 43210, United States

Neil L. Kelleher, Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States

Regan J. Thomson, Department of Chemistry, Northwestern University, Evanston, Illinois 60208, United States

REFERENCES

  • (1).Melo MJ (2009) History of Natural Dyes in the Ancient Mediterranean World. Handbook of Natural Colorants, 1–20. [Google Scholar]
  • (2).Brownstein MJ (1993) A brief history of opiates, opioid peptides, and opioid receptors. Proc. Natl. Acad. Sci. U. S. A 90 (12), 5391–5393. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (3).Katz L, and Baltz RH (2016) Natural product discovery: past, present, and future. J. Ind. Microbiol. Biotechnol 43 (2–3), 155–176. [DOI] [PubMed] [Google Scholar]
  • (4).Bérdy J (2005) Bioactive Microbial Metabolites. J. Antibiot 58 (1), 1–26. [DOI] [PubMed] [Google Scholar]
  • (5).Baltz RH (2006) Marcel Faber Roundtable: Is our antibiotic pipeline unproductive because of starvation, constipation or lack of inspiration? J. Ind. Microbiol. Biotechnol 33 (7), 507–513. [DOI] [PubMed] [Google Scholar]
  • (6).Bachmann BO, Van Lanen SG, and Baltz RH (2014) Microbial genome mining for accelerated natural products discovery: is a renaissance in the making? J. Ind. Microbiol. Biotechnol 41 (2), 175–184. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (7).Baltz RH (2019) Natural product drug discovery in the genomic era: realities, conjectures, misconceptions, and opportunities. J. Ind. Microbiol. Biotechnol 46 (3–4), 281–299. [DOI] [PubMed] [Google Scholar]
  • (8).Bentley SD, Chater KF, Cerdeño-Tárraga AM, Challis GL, Thomson NR, James KD, Harris DE, Quail MA, Kieser H, Harper D, Bateman A, Brown S, Chandra G, Chen CW, Collins M, Cronin A, Fraser A, Goble A, Hidalgo J, Hornsby T, Howarth S, Huang CH, Kieser T, Larke L, Murphy L, Oliver K, O’Neil S, Rabbinowitsch E, Rajandream MA, Rutherford K, Rutter S, Seeger K, Saunders D, Sharp S, Squares R, Squares S, Taylor K, Warren T, Wietzorrek A, Woodward J, Barrell BG, Parkhill J, and Hopwood DA (2002) Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2). Nature 417 (6885), 141–147.
  • (9).Ziemert N, Alanjary M, and Weber T (2016) The evolution of genome mining in microbes - a review. Nat. Prod. Rep 33 (8), 988–1005.
  • (10).Maxson T, Tietz JI, Hudson GA, Guo XR, Tai H-C, and Mitchell DA (2016) Targeting Reactive Carbonyls for Identifying Natural Products and Their Biosynthetic Origins. J. Am. Chem. Soc 138 (46), 15157–15166.
  • (11).Ju K-S, Gao J, Doroghazi JR, Wang K-KA, Thibodeaux CJ, Li S, Metzger E, Fudala J, Su J, Zhang JK, Lee J, Cioni JP, Evans BS, Hirota R, Labeda DP, Van Der Donk WA, and Metcalf WW (2015) Discovery of phosphonic acid natural products by mining the genomes of 10,000 actinomycetes. Proc. Natl. Acad. Sci. U. S. A 112 (39), 12175–12180.
  • (12).Castro-Falcón G, Millán-Aguiñaga N, Roullier C, Jensen PR, and Hughes CC (2018) Nitrosopyridine Probe To Detect Polyketide Natural Products with Conjugated Alkenes: Discovery of Novodaryamide and Nocarditriene. ACS Chem. Biol 13 (11), 3097–3106.
  • (13).Goering AW, McClure RA, Doroghazi JR, Albright JC, Haverland NA, Zhang Y, Ju K-S, Thomson RJ, Metcalf WW, and Kelleher NL (2016) Metabologenomics: Correlation of Microbial Gene Clusters with Metabolites Drives Discovery of a Nonribosomal Peptide with an Unusual Amino Acid Monomer. ACS Cent. Sci 2 (2), 99–108.
  • (14).McClure RA, Goering AW, Ju K-S, Baccile JA, Schroeder FC, Metcalf WW, Thomson RJ, and Kelleher NL (2016) Elucidating the rimosamide-detoxin natural product families and their biosynthesis using metabolite/gene cluster correlations. ACS Chem. Biol 11, 3452.
  • (15).Parkinson EI, Tryon JH, Goering AW, Ju K-S, McClure RA, Kemball JD, Zhukovsky S, Labeda DP, Thomson RJ, Kelleher NL, and Metcalf WW (2018) Discovery of the Tyrobetaine Natural Products and Their Biosynthetic Gene Cluster via Metabologenomics. ACS Chem. Biol 13 (4), 1029–1037.
  • (16).Navarro-Muñoz JC, Selem-Mojica N, Mullowney MW, Kautsar SA, Tryon JH, Parkinson EI, De Los Santos ELC, Yeong M, Cruz-Morales P, Abubucker S, Roeters A, Lokhorst W, Fernandez-Guerra A, Cappelini LTD, Goering AW, Thomson RJ, Metcalf WW, Kelleher NL, Barona-Gomez F, and Medema MH (2020) A computational framework to explore large-scale biosynthetic diversity. Nat. Chem. Biol 16 (1), 60–68.
  • (17).Bycroft BW, Cameron D, Croft LR, Hassanali-Walji A, Johnson AW, and Webb T (1971) Total Structure of Capreomycin IB, a Tuberculostatic Peptide Antibiotic. Nature 231 (5301), 301–302.
  • (18).Bycroft BW, Cameron D, Croft LR, Hassanali-Walji A, Johnson AW, and Webb T (1971) The total structure of viomycin, a tuberculostatic peptide antibiotic. Experientia 27 (5), 501–503.
  • (19).Tatsuta K, Mikami N, Fujimoto K, Umezawa S, Umezawa H, and Aoyagi T (1973) The Structure of Chymostatin, a Chymotrypsin Inhibitor. J. Antibiot 26 (11), 625–646.
  • (20).Katsuyama A, and Ichikawa S (2018) Synthesis and Medicinal Chemistry of Muraymycins, Nucleoside Antibiotics. Chem. Pharm. Bull 66 (2), 123–131.
  • (21).Fair RJ, Hensler ME, Thienphrapa W, Dam QN, Nizet V, and Tor Y (2012) Selectively Guanidinylated Aminoglycosides as Antibiotics. ChemMedChem 7 (7), 1237–1244.
  • (22).Yin X, McPhail KL, Kim K-J, and Zabriskie TM (2004) Formation of the Nonproteinogenic Amino Acid 2S,3R-Capreomycidine by VioD from the Viomycin Biosynthesis Pathway. ChemBioChem 5 (9), 1278–1281.
  • (23).Felnagle EA, Rondon MR, Berti AD, Crosby HA, and Thomas MG (2007) Identification of the Biosynthetic Gene Cluster and an Additional Gene for Resistance to the Antituberculosis Drug Capreomycin. Appl. Environ. Microbiol 73 (13), 4162–4170.
  • (24).Jain M, Olsen HE, Paten B, and Akeson M (2016) The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 17 (1), 239.
  • (25).Zhang Y, Zhang Y, Burke JM, Gleitsman K, Friedrich SM, Liu KJ, and Wang T-H (2016) A Simple Thermoplastic Substrate Containing Hierarchical Silica Lamellae for High-Molecular-Weight DNA Extraction. Adv. Mater 28 (48), 10630–10636.
  • (26).Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, and Pevzner PA (2012) SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. J. Comput. Biol 19 (5), 455–477.
  • (27).Thankachan D, Fazal A, Francis D, Song L, Webb ME, and Seipke RF (2019) A trans-Acting Cyclase Offloading Strategy for Nonribosomal Peptide Synthetases. ACS Chem. Biol 14 (5), 845–849.
  • (28).Costantino N, and Court DL (2003) Enhanced levels of Red-mediated recombinants in mismatch repair mutants. Proc. Natl. Acad. Sci. U. S. A 100 (26), 15748–15753.
  • (29).Murphy KC (2016) λ Recombination and Recombineering. EcoSal Plus, DOI: 10.1128/ecosalplus.ESP-0011-2015.
  • (30).Yu D, Ellis HM, Lee EC, Jenkins NA, Copeland NG, and Court DL (2000) An efficient recombination system for chromosome engineering in Escherichia coli. Proc. Natl. Acad. Sci. U. S. A 97 (11), 5978–5983.
  • (31).Blin K, Shaw S, Steinke K, Villebro R, Ziemert N, Lee SY, Medema MH, and Weber T (2019) antiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucleic Acids Res. 47 (W1), W81–W87.
  • (32).Chevrette MG, Aicheler F, Kohlbacher O, Currie CR, and Medema MH (2017) SANDPUMA: ensemble predictions of nonribosomal peptide chemistry reveal biosynthetic diversity across Actinobacteria. Bioinformatics 33 (20), 3202–3210.
  • (33).Borodina I (2005) Genome-scale analysis of Streptomyces coelicolor A3(2) metabolism. Genome Res. 15 (6), 820–829.
  • (34).Davies SG, Fletcher AM, Frost AB, Roberts PM, and Thomson JE (2014) Trading N and O. Part 2: Exploiting aziridinium intermediates for the synthesis of β-hydroxy-α-amino acids. Tetrahedron 70 (35), 5849–5862.
  • (35).Patteson JB, Dunn ZD, and Li B (2018) In Vitro Biosynthesis of the Nonproteinogenic Amino Acid Methoxyvinylglycine. Angew. Chem., Int. Ed 57 (23), 6780–6785.
  • (36).Doroghazi JR, Albright JC, Goering AW, Ju K-S, Haines RR, Tchalukov KA, Labeda DP, Kelleher NL, and Metcalf WW (2014) A roadmap for natural product discovery based on large-scale genomics and metabolomics. Nat. Chem. Biol 10 (11), 963–968.
