Author manuscript; available in PMC: 2021 Nov 20.
Published in final edited form as: ACS Chem Biol. 2020 Nov 10;15(11):2976–2985. doi: 10.1021/acschembio.0c00620

Bioinformatic and Reactivity-Based Discovery of Linaridins

Matthew A Georgiou, Shravan R Dommaraju, Xiaorui Guo, David H Mast, Douglas A Mitchell †,‡,§,*
PMCID: PMC7680433  NIHMSID: NIHMS1643134  PMID: 33170617

Abstract

Linaridins are members of the ribosomally synthesized and post-translationally modified peptide (RiPP) family of natural products. Five linaridins have been reported, which are defined by the presence of dehydrobutyrine, a dehydrated, alkene-containing amino acid derived from threonine. This work describes the development of a linaridin-specific scoring module for Rapid ORF Description and Evaluation Online (RODEO), a genome-mining tool tailored towards RiPP discovery. Upon mining publicly accessible genomes available in the NCBI database, RODEO identified 561 (382 non-redundant) linaridin biosynthetic gene clusters. Linaridin BGCs with unique gene architectures and precursor sequences markedly different from previous predictions were uncovered during these efforts. To aid in dataset validation, two new linaridins, pegvadin A and B, were detected through reactivity-based screening and isolated from Streptomyces noursei and Streptomyces auratus, respectively. Reactivity-based screening involves the use of a probe that chemoselectively modifies an organic functional group present in the natural product. The dehydrated amino acids present in linaridins as α/β-unsaturated carbonyls were appropriate electrophiles for nucleophilic 1,4-addition using a thiol-functionalized probe. The data presented within significantly expands the number of predicted linaridin biosynthetic gene clusters and serves as a road map for future work in the area. The combination of bioinformatics and reactivity-based screening is a powerful approach to accelerate natural product discovery.


Genome sequencing continues to grow exponentially and provides numerous opportunities for microbial natural product discovery through genome-mining.1,2 Automated bioinformatics tools have accelerated the identification of putative natural product biosynthetic gene clusters (BGCs). As our understanding of these pathways improves, the chemical structure of many natural products can be predicted to varying degrees of accuracy through computational analysis.3 Such gene-to-structure methods permit a logical approach to prioritize the discovery of novel natural products from uncharacterized BGCs.4–6 One natural product family to which a gene-to-molecule approach is increasingly applied is the ribosomally synthesized and post-translationally modified peptides (RiPPs).

RiPPs have attracted attention for several reasons. The biological activities associated with RiPPs are expansive, including roles in intercellular signaling, cellular redox metabolism, and therapeutic applications ranging from antimicrobials and analgesics to the treatment of cystic fibrosis.7 Mature RiPPs have been shown to contain a diverse set of enzymatically installed post-translational modifications including, but not limited to: dehydration, heterocycle formation [i.e., thiazol(in)e, oxazol(in)e, (dehydro)piperidine, pyridine], various thioether crosslinks, macrolactones/macrolactams, glycosylations, acetylations, epimerizations, and methylations.8 Enzymes involved in RiPP biosynthesis act on the precursor peptide, generally composed of N-terminal “leader” and C-terminal “core” regions. The leader region facilitates the binding of enzymes involved in modifying the core and is removed during maturation of the natural product.9 The bipartite nature of RiPP precursor peptides, coupled with often substrate-tolerant modifying enzymes, makes RiPPs attractive targets for analog generation through genetic mutation.10,11

Knowledge of enzymes involved in RiPP biosynthesis permits the genomic identification of new members using search algorithms, such as BLAST. Compared with other natural product classes, identification of RiPP BGCs presents a unique challenge: not only is there no universally conserved protein to define the natural product family, the precursor peptides are often short and hypervariable. Consequently, these precursor peptides are often not predicted as protein-coding genes by automated gene finders unless they are unusually long and/or nearly identical to a known or annotated sequence.12 Without reliable precursor peptide identification, the novelty of a hypothetical RiPP cannot be confidently predicted. Therefore, there has been a demand for a computational method to accurately identify RiPP precursor peptides.

The bioinformatics program Rapid ORF Description and Evaluation Online (RODEO) automates the RiPP precursor peptide identification process.12 Prior to submission to RODEO, potential members of a RiPP class are typically identified through iterative BLASTP queries using a protein involved in a known biosynthetic pathway of interest.13 The NCBI accession identifiers for these proteins are then provided as input for RODEO, which queries NCBI databases to retrieve the local genomic neighborhood. Functional prediction of the coding sequences is automatically provided by profile hidden Markov models (pHMMs) from the PFAM14 and TIGRFAM15 databases. The end-user can also supply a custom pHMM library to provide additional gene function annotation. RODEO subsequently performs a six-frame translation within the intergenic regions of the genomic neighborhood to identify all potential open-reading frames (ORFs) that may encode the precursor peptide(s). The hypothetical ORFs are then scored through a combination of heuristic scoring, MEME motif analysis,16 and support vector machine (SVM) classification, each tailored toward a specific RiPP class. A predetermined scoring threshold separates valid precursors from hypothetical/non-coding sequences. Recent publications have shown that RODEO, alongside other bioinformatic tools, facilitates the identification of new BGCs from a variety of RiPP classes.12,17–23 Assembling a comprehensive dataset of BGCs specific to a RiPP class yields new insights into the natural product group, as well as information that can be used to prioritize the discovery of novel members of the RiPP class.
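
The six-frame translation step described above can be illustrated with a minimal Python sketch. This is not RODEO's actual code: the function names, the length bounds (`min_len`, `max_len`), and the simplification of starting each ORF at the first Met of a stop-terminated segment are all assumptions made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate in frame 0; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def candidate_orfs(dna, min_len=20, max_len=150):
    """Yield (frame, peptide) for Met-initiated ORFs in all six frames.

    min_len and max_len are hypothetical bounds, not RODEO's settings.
    """
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            protein = translate(seq[offset:])
            # Keep only segments closed by a stop codon; the final piece
            # after the last '*' has no stop and is discarded.
            for segment in protein.split("*")[:-1]:
                start = segment.find("M")
                if start == -1:
                    continue
                peptide = segment[start:]
                if min_len <= len(peptide) <= max_len:
                    yield f"{strand}{offset + 1}", peptide


# Toy intergenic sequence (hypothetical), with a short ORF in frame +3.
for frame, peptide in candidate_orfs("CCATGGCTACCTAAGG", min_len=3):
    print(frame, peptide)
```

Each candidate peptide produced this way would then be passed to the class-specific scoring stage; a real implementation would also need to consider alternative start codons and ORFs that overlap annotated genes.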

Linaridins (linear arid peptides) are an understudied class of RiPPs, with only five characterized members: cypemycin, grisemycin, legonaridin, salinipeptin, and mononaridin (Figure 1).24 The linaridin RiPP class is defined by the presence of Thr-derived dehydrobutyrines (Dhb) in the final product, although additional tailoring modifications are known.25 The enzyme(s) responsible for linaridin Dhb installation remain unconfirmed but are known to be unrelated to those involved in lanthipeptide biosynthesis, where Dhb formation is well established.26 Previous work on cypemycin and legonaridin demonstrated that genetic deletion of cypL, cypH, or legG and legE (the two domains of LinH) prevented linaridin formation (Figure 1, Table S1).27,28 All characterized linaridins display Nα,Nα-dimethylation of the N-terminus through the action of a locally encoded methyltransferase (LinM). Some linaridins are also adorned with aminovinyl cysteine (AviCys) at the C-terminus. AviCys formation in the archetypal linaridin cypemycin involves decarboxylation by a flavin-dependent decarboxylase (CypD); however, the enzymes responsible for ring formation have yet to be determined,29,30 although AviCys biosynthesis in linaridins has been suggested to parallel that found in lanthipeptides.29 Subsequent deletions of genes related to these ancillary modifications (i.e. cypM and cypD within the cypemycin BGC) yielded Dhb-containing linaridins lacking methylation and AviCys, respectively.

Figure 1. Biosynthetic gene clusters and structures of linaridins.

(A) BGCs responsible for cypemycin and legonaridin. The “Lin” protein naming scheme unifies the nomenclature for all linaridins, with known functions indicated.24 (B) Abbreviated structures of cypemycin and legonaridin. Purple, N-terminal dimethylation; blue, Dhb; red, AviCys; a-Ile, allo-isoleucine.

Linaridins remain underrepresented within the literature. Potential reasons include a poorly understood biosynthetic pathway and the fact that only cypemycin and salinipeptin exhibit a reported bioactivity. Cypemycin shows growth suppression activity towards Micrococcus luteus and murine P388 leukemia cells.31 Salinipeptin shows modest antibacterial activity against Streptococcus pyogenes and is toxic towards U87 glioblastoma and HCT-116 colon carcinoma cells.32 Considering the less-studied status of linaridins, along with the class-defining Dhb modification being an ideal functional group for reactivity-based discovery, we sought to create a comprehensive dataset of all observable linaridins. This goal was achieved by the development of an automated, linaridin-specific RODEO scoring module. A combination of various bioinformatic methods was used to construct a comprehensive linaridin dataset (n = 561 BGCs). The most-probable precursor peptide(s) from each linaridin BGC were identified using common features gleaned from characterized and high-confidence, predicted linaridins. As in previous reports,12,17,18,23,24,33 the RODEO-enabled dataset was retrospectively analyzed to extract new insights into the linaridins, including the identification of previously unreported leader region motifs, assessment of the structural diversity of the core region, and broad-scale categorization of potential tailoring enzyme(s). Additionally, our analysis highlights linaridin BGCs of unusual composition, both in precursor peptide sequence and presence of biosynthetic enzymes, illuminating a subset that differs dramatically from characterized examples. Lastly, leveraging the linaridin dataset, a reactivity-guided discovery campaign was conducted. A thiol-functionalized probe that engages the Dhb moieties of linaridins via nucleophilic 1,4-addition aided in the isolation and characterization of two new linaridins. We termed these compounds pegvadin A and pegvadin B, which derive from Streptomyces noursei NRRL B-1714 and Streptomyces auratus NRRL B-8097, respectively.

Results and Discussion

Linaridin genome-mining and precursor peptide scoring

A previous publication suggested a standard nomenclature for linaridin biosynthetic proteins, which we have adopted for the current study (Figure 1).34 Gene-deletion studies support that the class-defining Dhb modifications are installed by either LinE, LinG, LinH (a LinE-LinG fusion), or LinL (Figure 1).28 Therefore, a dataset of all observable homologs of these proteins encoded in RiPP-like genomic contexts was constructed. A multi-round PSI-BLAST13 was performed on recognized linaridin biosynthetic proteins until the number of retrieved sequences converged (Supplemental Methods and Table S1). The local genome context was then assessed to determine the presence of these key biosynthetic genes, which are strongly indicative of linaridin production.

The class-defining linaridin biosynthetic protein(s) remain to be experimentally validated and the existing HMMs that define these protein families were found to be insufficient for a broad-scale survey of all linaridins. To expedite the identification of linaridin BGCs, custom pHMMs were generated using HMMER335 for the α/β hydrolase (LinE), transmembrane protein domain (LinG), the LinE-LinG fusion (LinH), and another protein of unknown function (LinL) (Supplemental Methods and Supplemental Dataset 1). Existing search models for ancillary proteins found within linaridin BGCs were deemed adequate, including the N-methyltransferase (LinM, PF13649), ABC transporter (LinT, TIGR02204), and flavin-dependent decarboxylase (LinD, TIGR00521). The custom pHMMs for the core linaridin biosynthetic proteins were used to identify the presence or absence of each gene, aiding in the rapid identification of complete linaridin BGCs. A final dataset containing 561 linaridin BGCs with 382 non-redundant members was compiled (Supplemental Datasets 2–3). This dataset expands upon a recent linaridin genome-mining study.34

A maximum-likelihood phylogenetic tree was produced for sequences encoding distinct LinE domains, theorized to be involved in Dhb formation (Figures S1–S2).27,36 Computational identification of the LinE-like domain boundaries employed a script that leverages a custom LinE pHMM (Supplemental Methods and Supplemental Note S1). LinE domains from type A linaridins formed a distinct clade as did LinE domains from type B linaridins. Such co-evolution of biosynthetic enzymes and precursor peptides is well-documented within RiPP BGCs.12 Therefore, the precursor peptides originating from these two established clades were sufficient for use as a training set to develop a linaridin-specific RODEO scoring module.

Analysis of precursor peptides from the type A and B linaridin clades identified key features (Figure S2), including a PxxxTP motif used to determine the potential leader peptide cleavage site, noted in previous publications.34,36 Other features of interest included gene directionality, distance from key biosynthetic genes, and core region hydrophobicity (>55% of the core residues are hydrophobic; this value includes Thr, given its complete conversion to Dhb in all characterized linaridins). A heuristic scoring scheme was devised using these characteristics and applied to all potential precursor peptides (Table S2). When combined with a custom LinA pHMM and support vector machine (SVM) classification (Supplemental Dataset 1 and Supplemental Methods), the linaridin scoring module identified probable precursor peptide(s) with maximum precision and recall statistics achieved at a threshold score of 12. Thus, sequences scoring 12 or higher are predicted linaridin precursor peptides (Supplemental Figures S3–S4).
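
To make the heuristic features concrete, the sketch below checks a candidate peptide for the PxxxTP motif and for a core hydrophobic fraction above 55% (counting Thr), then compares a toy score with the threshold of 12. The point weights, the hydrophobic residue set, the position of the core start relative to the motif (`core_offset`), and the example peptide are all hypothetical. The real weights are those in Table S2, and the actual module also uses a LinA pHMM and SVM classification, which are not modeled here.

```python
import re

# Assumed hydrophobic residue set; Thr is included because the text counts
# it toward core hydrophobicity (it is converted to Dhb).
HYDROPHOBIC = set("AVILMFWCGPT")
THRESHOLD = 12      # validity threshold stated in the text
MOTIF_POINTS = 8    # hypothetical weight, not taken from Table S2
HYDRO_POINTS = 5    # hypothetical weight, not taken from Table S2


def heuristic_score(peptide, core_offset=3):
    """Return (score, predicted_core) for a candidate precursor peptide.

    core_offset is an assumption: the core is taken to start this many
    residues after the first residue of the first PxxxTP match.
    """
    match = re.search(r"P...TP", peptide)
    if match is None:
        return 0, None
    score = MOTIF_POINTS
    core = peptide[match.start() + core_offset:]
    hydrophobic_fraction = sum(r in HYDROPHOBIC for r in core) / len(core)
    if hydrophobic_fraction > 0.55:
        score += HYDRO_POINTS
    return score, core


# Hypothetical toy peptide, not a sequence from the dataset.
score, core = heuristic_score("MSDLAEFANDLLPAMATPAVTIAGV")
print(score, core, score >= THRESHOLD)
```

Features such as gene directionality and distance from key biosynthetic genes depend on genomic context and would enter the score as additional terms in the same additive fashion.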

Comparison of RODEO to other bioinformatic tools

The final list of candidate linaridin precursor peptides found by RODEO (Supplemental Datasets 2–3) was cross-referenced against previous predictions reported in the literature.20,22,24,37 Comparison to the most recent linaridin-specific dataset published, which catalogued 204 BGCs,34 shows that the linaridin RODEO module identified 294 of the 303 (97%) previously predicted precursor peptides, returning an average score of 29.5 (Table S3 and Figure S5). All nine of the reported precursor peptides that were not identified by RODEO contain highly irregular sequences. Many display a non-canonical RAVSTP motif near the predicted leader-core region junction and/or a polybasic region near the C-terminus (Supplemental Dataset 4). Of the nine, RODEO identified a high-scoring alternative linaridin precursor peptide in four cases; the other five showed no clear precursor sequence. This suggests that the previously reported peptides were misidentified.

Another published report produced a generalized RiPP dataset utilizing a bioinformatic tool named DeepRiPP. This study predicted a total of 135 linaridin precursor peptides (86 unique sequences).20 When compared to the RODEO-derived dataset, only 76 (~56%) sequences scored 12 or higher; the remainder are thus not considered confident predictions (Table S3). Equally troubling is that of the 135 DeepRiPP-predicted linaridin precursor peptides, only 26 are encoded next to known linaridin biosynthetic proteins. The remaining 109 precursors are reported as distantly encoded (i.e. not within the BGC).20 Unfortunately, DeepRiPP concludes that the cypemycin and legonaridin precursor peptides are not encoded within the BGC, which is clearly erroneous (Figure 1). Furthermore, the 26 precursor peptides classified as within the BGC are actually not adjacent to linaridin biosynthetic genes; rather, they occur near genes associated with other RiPP classes, including thiopeptides. Contrary to the report, 93 of the 109 precursor peptides reported as distantly encoded are found within 3 kb of a linaridin biosynthetic gene, which was evident upon cross-referencing with the RODEO-derived dataset (Supplemental Dataset 4).

A final comparison was drawn against datasets from other RiPP discovery tools, namely NeuRiPP,19 MetaMiner,37 and RiPPMiner38 (Table S3). Nearly all (94%) of the precursor peptides identified by NeuRiPP, and the single linaridin identified by MetaMiner, were also present in the RODEO dataset. Of the 13 precursor peptides found by RiPPMiner, 9 lacked key linaridin BGC features: a LinE homolog was absent (not only from the local genomic region, but from the entire genome), the reported precursor could not be found in the local region, or the putative precursor peptide scored below the validity threshold. Only four members of the RiPPMiner dataset were judged to be valid by the RODEO module. Upon manual inspection, we believe the 9 RiPPMiner-identified sequences deemed invalid by RODEO are misidentified (Supplemental Dataset 4). These comparisons show that the linaridin module of RODEO enabled the compilation of the most comprehensive and accurate dataset of linaridin BGCs, with each entry containing at least the minimum genes required for linaridin biosynthesis. Additionally, the module demonstrates greater efficacy in identifying precursor peptide sequences compared to automated gene finders, as nearly 10% of the linaridin precursor peptides identified by RODEO were not annotated as protein-coding sequences.

Content analysis of linaridin precursor peptides and BGCs

Linaridin structural diversity is dependent on primary sequence and the location and extent of the post-translational modifications.27,32,41,42 A gene encoding an ABC transporter, LinT, co-occurs in 96% of linaridin BGCs and is presumed to be involved in compound export. The most widespread, co-occurring enzyme responsible for linaridin modification is the methyltransferase, LinM, found in 70% of BGCs (Table S4). LinM-dependent N-terminal dimethylation is vital for cypemycin bioactivity.27 In LinM-containing BGCs, the first amino acid of the precursor core region (+1 position) is most frequently Ala, Gly, Leu, or Phe; however, when considering the expected codon usage, there is a clear enrichment for Ala and, to a lesser extent, Phe. Other large aromatic residues (Trp, Tyr), charged residues (Asp, Glu, Lys, Arg), as well as Asn, Gln, and Pro, have no representation at core position +1; further, three additional residues (i.e. His, Thr, and Val) are significantly depleted relative to the expected residue frequency (Table S5). Analysis of the residue identity for core position +1 in BGCs lacking a LinM homolog reveals a different picture. Although Ala remains significantly enriched, Cys and Thr are also very common at position +1, comprising 86% of the total sequences.

Another co-occurring modification enzyme is the flavin-dependent decarboxylase, LinD, present in 9% of linaridin BGCs. LinD acts at the C-terminus of the core region and plays a role in AviCys formation.27,29,30 BGCs that include LinD co-occur exclusively with precursor peptides with a C-terminal CxxC motif. The final, prevalent co-occurring gene, present in 40% of BGCs, encodes LinC, a protein showing distant homology to a FAD-dependent desaturase (TIGR02734). LinC has no known biochemical role although it is reported as being essential for legonaridin expression.28

RODEO identified 568 high-confidence precursor peptides from 382 non-redundant linaridin BGCs. Several BGCs contain multiple precursor peptides (Figure S6), with a maximum of six unique sequences found in Leifsonia xyli subsp. xyli. To visualize the diversity of the linaridin precursor peptides, a sequence similarity network (SSN) was generated (Figure 2). Of the 568 precursor peptides identified by RODEO, 457 had unique primary sequences. Within the dataset, only groups 1 (type A) and 2 (type B) contained characterized linaridins, meaning 87% of precursor peptides predicted by RODEO are substantially different compared to isolated linaridins. To better visualize the dominant sequence trends in each of these groups, an alignment sequence logo43 was produced for the leader and core regions of each group of the SSN containing >4 members (Figures S7–S8).

Figure 2. Sequence similarity network of linaridin precursors.

Nodes within the SSN are colored based on the co-occurrence of a LinM (methyltransferase, purple), LinD (decarboxylase, red), or both (yellow). Groups containing >5 members are numbered. Nodes representing known linaridins are shown as triangles and labeled. This SSN was generated using EFI-EST39 and visualized with Cytoscape40. Protein sequences are conflated at 100% identity (i.e. identical sequences are only represented once), resulting in 457 nodes. Edges indicate an alignment score of 7 (expectation value of <10⁻⁷). Phylogenetic tree information for LinE homologs is available in Supplemental Dataset 5.

Research on RiPP biosynthesis has demonstrated that conserved sequences within the leader region correspond to key recognition motifs for biosynthetic proteins.44,45 Within linaridins, the foremost conserved sequence is found at the predicted leader peptide cleavage site (Figure S9).36 Due to this high level of conservation, this motif may play a role in substrate recognition by the leader peptidase. Further, various other amino acid motifs are conserved within the leader region (Figure S7). Most notably, groups 1 and 2 display a conserved FANxxL motif, which was evident in published alignments but never specifically commented upon.36 As LinM and LinD enzymes are predicted to be leader peptide-independent,28,29 the FANxxL motif may facilitate the recognition or binding of an enzyme(s) involved in the conversion of Thr to Dhb. Further experimental analysis will be required to test this hypothesis.

The core regions of linaridin precursor peptides show varying levels of conservation, but there is broad sequence diversity between the groups (Figure S8). The length of the core region was assessed and found to range from 14 to 80 amino acids (Figure S10). This is considerably wider than characterized linaridins, which span 19 to 37 core residues.27,32,41,42 Within the predicted linaridin core, the position and abundance of Thr residues are highly variable, yet all known examples convert all Thr to Dhb. While unmodified Ser has been observed in all characterized linaridins, mutational studies on cypemycin show that replacement of Thr with Ser led to dehydroalanine (Dha) formation,36 suggesting that dehydration is site-specific. To determine the maximum number of potential Dhb moieties in yet uncharacterized linaridins, Thr content was assessed and found to range from 1 to 13 (mean = 3.1 ± 1.9 Thr per core sequence, Figure S11). Factoring in the potential for Dha moieties to be found in future examples, Thr+Ser content was calculated, which ranged from 1 to 22 (mean = 4.2 ± 2.9). Taking these variables into account, the linaridin class of RiPPs contains greater sequence diversity than what is represented by known members.
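
The Thr and Thr+Ser tallies described above amount to simple residue counting over the predicted core sequences. A minimal sketch follows; the function names and the three toy cores are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def dehydratable_counts(core):
    """Return (Thr count, Thr+Ser count) for one core sequence."""
    thr = core.count("T")
    return thr, thr + core.count("S")


def summarize(cores):
    """Summarize the range and mean ± SD of Thr and Thr+Ser content."""
    thr, thr_ser = zip(*(dehydratable_counts(c) for c in cores))
    return {
        "thr_range": (min(thr), max(thr)),
        "thr_mean_sd": (mean(thr), stdev(thr)),
        "thr_ser_range": (min(thr_ser), max(thr_ser)),
        "thr_ser_mean_sd": (mean(thr_ser), stdev(thr_ser)),
    }


# Hypothetical toy cores, not sequences from the dataset.
print(summarize(["ATPATPTV", "GSTSAT", "AAST"]))
```

The Thr count gives an upper bound on Dhb residues per core, and the Thr+Ser count an upper bound on all dehydrated residues if Ser were also modified.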

Identification of BGCs with unusual architectures

The bioinformatic survey of linaridins revealed various BGCs containing unusual gene architectures. One striking example was replicated in three strains: Streptomyces NRRL F-2890, Streptomyces xiamensis, and Streptomyces atratus. These BGCs contain the three key linaridin biosynthetic genes (LinE, LinG, and LinL) in an apparent hybrid BGC with a LanB dehydratase (PF04738) and LanC cyclase (PF05147), known from class I lanthipeptide biosynthesis (Figure S12). The precursor peptides more closely resemble class I lanthipeptides than any known linaridin, with a polyanionic leader region and Cys residues distributed throughout the core region.23 To determine the most likely leader peptide cleavage site, these precursor peptides were provided as input for RODEO with the module for lanthipeptide detection selected and manually adjusted.23 In combination with the lanthipeptide dehydratase, the only plausible role for the LinE, LinG, and LinL homologs is Dha/Dhb formation.

All but two linaridin BGCs are encoded by Actinobacteria, despite this phylum accounting for less than 10% of genomes within NCBI (Figures S1, S13–S14). The two exceptions are from Kroppenstedtia sanguinis (Firmicutes)46 and Burkholderiales bacterium PBB3 (Proteobacteria). The K. sanguinis BGC contains LinE, LinG, LinL, and LinT homologs, alongside another ABC transporter gene, and a gene of unknown function annotated as a metallopeptidase protein (PF01551). The most probable linaridin precursor peptide from K. sanguinis lacks several traits associated with known linaridins and thus was judged “invalid” by RODEO. Notably, the sequence contains 11 Ser residues. This abundance of Ser in the precursor may indicate a Dha-rich linaridin, which would be unprecedented. The BGC found in Burkholderiales bacterium PBB3 contains a LinH homolog, a potential decarboxylase, but lacks a LinL homolog (Figure S13). Future work is warranted on these non-actinobacterial linaridin BGCs.

Other precursors with unusual core sequences were also identified. A notable trend was the presence of core Cys residues without a co-occurring LinD homolog (Supplemental Datasets 2–3). Unmodified thiol groups on Cys are rare in mature RiPPs; therefore, it is plausible that such positions undergo additional post-translational modification. Cys content was calculated for all linaridin core regions from non-redundant precursor peptides and ranged from 0–3, with a mean of 0.4 ± 0.7 (Figure S15). A particularly unusual subset of Cys-containing precursors was found in Saccharomonospora sp., which display core regions predicted to begin with Cys followed by a sequence rich in dehydratable Thr/Ser residues (Table S6). These unusual BGCs and precursors further suggest that linaridins characterized in the literature represent only a small subset of those naturally encoded.

New linaridin discovery through reactivity-based screening

Reactivity-based screening (RBS) is a natural product discovery strategy that involves metabolite detection through chemoselectively targeting a specific organic functional group. An RBS-based approach offers several advantages: (i) prioritization of novelty through bioinformatic analysis (for instances when a genome is known and functional group presence within the natural product is predictable), (ii) screening can be performed directly on cellular extracts, obviating the need for laborious purification, and (iii) compound detection is via mass spectrometry (MS) and thus agnostic to biological activity. Covalent modification observed by comparative MS enables the rapid and sensitive detection of natural products bearing the targeted functional group.47 The class-defining Dhb modifications in linaridins are Michael acceptors susceptible to 1,4-nucleophilic addition.48 This reactivity of dehydrated amino acids has been previously harnessed to map the location of amino acids within a target compound,49 identify lanthipeptides within microbial cultures,50 and in semi-synthetic modification of thiostrepton.17,47,51 Dithiothreitol (DTT) was selected as the Michael donor owing to its commercial availability, reactivity, and literature precedent (Supplemental Methods).47 Each DTT-labeling event yields an increase in mass of 154 Da. Detection of labeling, along with high-resolution and tandem MS (HRMS/MS), allows identification of the corresponding parental mass, which can be readily cross-referenced with existing literature and natural product databases to assess novelty.
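
Because each DTT adduct adds 154 Da, a labeled species should appear as a ladder of peaks offset from the parent mass by multiples of 154. The sketch below computes that ladder and checks whether an observed shift is consistent with a whole number of labels. The function names, the tolerance (`tol`), and the example masses are hypothetical; only the nominal 154 Da shift comes from the text.

```python
DTT_SHIFT_DA = 154  # nominal mass added per DTT label, as stated in the text


def labeled_masses(parent_mz, n_sites, charge=1):
    """Expected m/z values for 0..n_sites DTT labels on a parent ion."""
    return [parent_mz + k * DTT_SHIFT_DA / charge for k in range(n_sites + 1)]


def count_labels(parent_mz, observed_mz, charge=1, tol=1.0):
    """Return the number of DTT labels that explains the shift, else None.

    tol (in Da) is an assumed tolerance suited to low-resolution screening.
    """
    shift = (observed_mz - parent_mz) * charge
    k = round(shift / DTT_SHIFT_DA)
    if k >= 0 and abs(shift - k * DTT_SHIFT_DA) <= tol:
        return k
    return None


# Hypothetical singly charged parent ion at m/z 3000 with three Dhb sites.
print(labeled_masses(3000.0, 3))
print(count_labels(3000.0, 3308.4))
```

In a comparative screen, peaks present only in the reacted extract and matching this ladder relative to a peak in the unreacted extract would flag a candidate Dhb-containing compound.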

Cross-referencing the compendium of linaridins with our in-house bacterial strain collection identified 34 organisms harboring the genetic capacity to produce a novel linaridin (Table S7). These target organisms were cultivated, the bacterial cells were harvested, and a methanolic cell-surface extraction was performed on all candidates. Cell lysis is generally avoided during RBS based on the assumption that the culture medium would be less complex than the bacterial cytosolic fraction and that the mature natural product would be exported from the producing cell. The cell-surface extracts of suspected novel linaridin producers were then reacted with DTT supplemented with a non-nucleophilic, hindered base (Supplemental Methods). Both unreacted and reacted extracts were analyzed by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) MS to identify compounds in the extract and subsequent labeling events. Upon completion of the screen, extracts from S. noursei NRRL B-1714 and S. auratus NRRL B-8097 contained DTT-labeled masses that corresponded to predicted linaridins. These natural products labeled upwards of three times with DTT (Figure 3). With Dhb representing a sterically hindered Michael acceptor, incomplete labeling was observed under conditions that would show quantitative labeling for Dha residues.47

Figure 3. Reactivity-based screening of novel linaridins.

MALDI-TOF mass spectra of unreacted (top) and DTT-labeled (bottom) extracts from (A) S. noursei and (B) S. auratus.

The target compound from each strain was purified using reversed-phase high-performance liquid chromatography (HPLC) over several different gradients (Supplemental Methods). Each putative linaridin was then analyzed by HRMS, which confirmed the molecular formula for both compounds. The product from S. noursei had an observed mass ([M + 2H]²⁺ = 1629.3778 Da) corresponding to a molecular formula of C₁₅₀H₂₃₉N₃₉O₄₂ ([M + 2H]²⁺ = 1629.3804; error = 1.6 ppm). The compound isolated from S. auratus had an observed mass ([M + 2H]²⁺ = 1623.3868 Da) corresponding to a molecular formula of C₁₄₉H₂₃₉N₃₉O₄₂ ([M + 2H]²⁺ = 1623.3882; error = 0.9 ppm, Tables S8–S9). These formulas were in agreement with the bioinformatically predicted linaridin core regions, where all Thr had been converted to Dhb and Nα,Nα-dimethylation of the N-terminus had taken place (Figure 4). Further, MS/MS analysis via collision-induced dissociation revealed a series of daughter ions matching the expected linaridin core sequence and location of the abovementioned modifications (Figure S16). The newly discovered linaridins from S. noursei and S. auratus were termed pegvadin A and B, respectively.
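
The reported mass errors follow from the usual parts-per-million formula, error = |observed − theoretical| / theoretical × 10⁶. The short sketch below reproduces both values from the m/z figures quoted above; the function names are illustrative, and the neutral-mass helper assumes the standard proton mass of 1.007276 Da.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton, used for [M + zH]z+ ions


def ppm_error(observed_mz, theoretical_mz):
    """Absolute mass error in parts per million."""
    return abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass M from the m/z of an [M + zH]z+ ion."""
    return charge * (mz - PROTON_MASS_DA)


# Observed and theoretical [M + 2H]2+ values quoted in the text.
print(round(ppm_error(1629.3778, 1629.3804), 1))  # pegvadin A: 1.6
print(round(ppm_error(1623.3868, 1623.3882), 1))  # pegvadin B: 0.9
```

Errors of this size are consistent with the text's statement that HRMS confirmed the molecular formula for both compounds.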

Figure 4. BGC and structure of pegvadin A and B.

(A) BGCs for pegvadin A and B. Specific gene names are noted, derived from Pegvadin A (Pva) and Pegvadin B (Pvb), respectively. (B) Structures of pegvadin A and B, with the C-alpha stereochemical configuration for pegvadin A determined by Marfey’s analysis. Post-translational modifications are color-coded: purple, dimethylation (n=1); blue, Dhb (n=6); orange, epimerization (n=22).

Among known linaridins, only salinipeptin is reported to contain d-amino acids, with 9 epimerized positions (d-Ala, d-Gln, d-Ile, and d-Pro).32 The epimerase responsible remains to be identified. Owing to a sufficient isolated yield, the C-alpha stereochemical configuration of pegvadin A was assessed using Marfey’s method (Figure S17, Table S11).52 The configuration at the N-terminal Nα,Nα-dimethylleucine was not determined and all Thr were previously determined to be converted to Dhb, which lacks a chiral center. All other positions, except Asp29 and Arg32, were determined to be in the d-configuration. The chiral analysis unfortunately did not permit a definitive assignment of l-Asp29; however, the C-terminal residue was conclusively assigned as l-Lys. To further substantiate the finding that most of the residues were d-amino acids, pegvadin A was treated with several commercially available proteases. The proteolytic susceptibility assays employed endoproteinase Glu-C (cleaves C-terminal to Glu), endoproteinase Asp-N (cleaves N-terminal to Asp), trypsin, and carboxypeptidase Y (C-terminal proteolysis). No notable digestion of pegvadin A was observed, even after extended reaction times, supporting the presence of multiple d-amino acids (Figure S18). Treatment of partially purified pegvadin B with the same proteases gave comparable results (Figure S19). Given the sequence identity (31/32 residues) in the core region of the pegvadin precursor peptides and enzymes encoded in the BGC (Table S10), we suspect that pegvadin B will contain several amino acids in the d-configuration.

Sequence-function comparison of pegvadin A and B to characterized linaridins

Pegvadin A and B are type B linaridins (Figure S2 and Figure 4).36 The overall architecture of the pegvadin BGCs, as well as the sequence similarity of the coding sequences (Table S10), resembles that of legonaridin and mononaridin. The leader regions of the known type B linaridins share the previously mentioned, conserved FANxxL motif, with pegvadin A and B sharing the highest overall sequence similarity with legonaridin. The core regions also share some commonalities: the N-terminal regions of the pegvadin A and legonaridin core sequences differ only at position 1 (Leu versus Ile). However, the remainder of the pegvadin core regions differs substantially from those of other linaridins (Figure 4). Compared to legonaridin and mononaridin, the pegvadins are shorter (32 residues) and harbor fewer Dhb residues. Finally, the pegvadins display Nα,Nα-dimethylation of a Leu residue; this modification has previously been observed only on Ile (legonaridin), Ala (cypemycin), and Val (mononaridin). Leu at core position +1 is predicted to occur in <5% of linaridin BGCs that include a LinM homolog (Table S5).
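
The two comparisons described above (listing the positions at which two aligned core sequences differ, and tallying how often each residue occupies core position +1 across a set of predicted precursors) can be sketched as follows. This is a minimal illustration only: the function names and the short sequences are hypothetical placeholders and are not taken from the RODEO output or from any real linaridin.

```python
from collections import Counter


def differing_positions(core_a, core_b):
    """Return 1-based positions where two equal-length core sequences differ."""
    if len(core_a) != len(core_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(core_a, core_b), start=1) if a != b]


def first_residue_fractions(cores):
    """Return the fraction of cores that start with each residue."""
    counts = Counter(core[0] for core in cores if core)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


# Hypothetical placeholder sequences (not real linaridin cores).
print(differing_positions("LTPAT", "ITPAT"))
print(first_residue_fractions(["LAA", "IAA", "IGG", "ACC"]))
```

In practice the input would be the predicted core sequences from the genome-mining dataset, filtered to BGCs that encode the methyltransferase of interest before tallying.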

Cypemycin and salinipeptin A exhibit modest activity against cancer cell lines, with cypemycin also demonstrating antimicrobial activity against Micrococcus luteus.27,32 To assess whether pegvadin A displays growth-suppressive activity towards bacteria, the minimum inhibitory concentration (MIC) against several species was determined (Table S12). Pegvadin A was inactive (up to a concentration of 32 μg/mL) against Staphylococcus aureus (Firmicutes), Pseudomonas aeruginosa (Proteobacteria), M. luteus (Actinobacteria), and Streptomyces puniceus (Actinobacteria).

Conclusion

Despite a number of reports over the past decade, the linaridins remain an undercharacterized RiPP class. This dearth of knowledge is exemplified by three facts: only a few members have been characterized, the putative protein(s) responsible for the class-defining modification (i.e., Dhb installation) remain to be biochemically validated, and the biological role of any mature linaridin remains speculative. To facilitate future work on the linaridins, we developed an automated genome-mining method specific for this RiPP class. During the construction of the RODEO scoring module, pHMMs were created for each of the genes known to play a role in linaridin biosynthesis, allowing rapid identification of potential linaridin BGCs. Bioinformatic analysis of the expanded set of linaridin BGCs led to new insights, including rare linaridin BGCs encoded beyond the Actinobacteria, hybrid BGCs containing both lanthipeptide and linaridin biosynthetic enzymes, and trends relating methyltransferase co-occurrence to the identity of the first core position. In addition, the linaridin sequence-function space was significantly expanded, indicating that isolated members are representative of only two precursor peptide subclasses, which together contain fewer than a quarter of all currently identifiable linaridins.

Leveraging the dataset provided by the linaridin detection module of RODEO, a targeted set of potential linaridin-producing organisms was cultivated. Using the inherent electrophilicity of the class-defining Dhb residues, a thiol-functionalized probe enabled the discovery of two new linaridins, pegvadin A and B. The structures of these linaridins were confirmed through a combination of HRMS/MS, Marfey’s analysis, and corroborating protease resistance assays. Pegvadin A contains 22 confirmed amino acids in the d-configuration. The data presented illustrate that many linaridins remain to be discovered, providing numerous organisms and enzymes for future biosynthetic, natural product isolation, and biological function studies.

Supplementary Material

SI document
SI datasets

Supplemental Dataset 1: custom profile Hidden Markov Models (ZIP)

ACKNOWLEDGEMENT

We would like to acknowledge Mitchell group members G. Hudson and A. Kretsch for their editorial input on the manuscript and compilation of data related to Streptomyces sp. codon frequency, respectively.

Funding Sources

This work was supported by the National Institute of General Medical Sciences (GM123998 to D.A.M.). Funds to purchase the Bruker UltrafleXtreme MALDI TOF/TOF mass spectrometer were from the National Institutes of Health (S10 RR027109 A).

Footnotes

Notes

The authors declare no competing financial interest.

The Supporting Information is available free of charge on the ACS Publications website at DOI: TBA. Additional experimental data, full methods description, algorithm description and stand-alone scripts, Figures S1–S16, and Tables S1–S10 (PDF).

Supplemental Dataset 1: custom profile Hidden Markov Models (ZIP)

Supplemental Dataset 2: tabular RODEO output (XLSX)

Supplemental Dataset 3: graphical RODEO output (PDF)

Supplemental Dataset 4: comparison of RODEO to other bioinformatic tools (XLSX)

Supplemental Dataset 5: PhyloXML data for the phylogenetic tree generation (TXT)

References

  • (1) Walsh CT; Fischbach MA Natural Products Version 2.0: Connecting Genes to Molecules. J. Am. Chem. Soc. 2010, 132 (8), 2469–2493.
  • (2) Hadjithomas M; Chen I-MA; Chu K; Huang J; Ratner A; Palaniappan K; Andersen E; Markowitz V; Kyrpides NC; Ivanova NN New Features for Bacterial Secondary Metabolism Analysis and Targeted Biosynthetic Gene Cluster Discovery in Thousands of Microbial Genomes. Nucleic Acids Res. 2017, 45 (D1), D560–D565.
  • (3) Khater S; Anand S; Mohanty D In Silico Methods for Linking Genes and Secondary Metabolites: The Way Forward. Synth. Syst. Biotechnol. 2016, 1 (2), 80–88.
  • (4) Medema MH; Fischbach MA Computational Approaches to Natural Product Discovery. Nat. Chem. Biol. 2015, 11 (9), 639–648.
  • (5) Ziemert N; Alanjary M; Weber T The Evolution of Genome Mining in Microbes – a Review. Nat. Prod. Rep. 2016, 33 (8), 988–1005.
  • (6) Tietz JI; Mitchell DA Using Genomics for Natural Product Structure Elucidation. Curr. Top. Med. Chem. 2016, 16 (15), 1645–1694.
  • (7) Arnison PG; Bibb MJ; Bierbaum G; Bowers AA; Bugni TS; Bulaj G; Camarero JA; Campopiano DJ; Challis GL; Clardy J; Cotter PD; Craik DJ; Dawson M; Dittmann E; Donadio S; Dorrestein PC; Entian K-D; Fischbach MA; Garavelli JS; Göransson U; Gruber CW; Haft DH; Hemscheidt TK; Hertweck C; Hill C; Horswill AR; Jaspars M; Kelly WL; Klinman JP; Kuipers OP; Link AJ; Liu W; Marahiel MA; Mitchell DA; Moll GN; Moore BS; Müller R; Nair SK; Nes IF; Norris GE; Olivera BM; Onaka H; Patchett ML; Piel J; Reaney MJT; Rebuffat S; Ross RP; Sahl H-G; Schmidt EW; Selsted ME; Severinov K; Shen B; Sivonen K; Smith L; Stein T; Süssmuth RD; Tagg JR; Tang G-L; Truman AW; Vederas JC; Walsh CT; Walton JD; Wenzel SC; Willey JM; van der Donk WA Ribosomally Synthesized and Post-Translationally Modified Peptide Natural Products: Overview and Recommendations for a Universal Nomenclature. Nat. Prod. Rep. 2013, 30 (1), 108–160.
  • (8) Montalbán-López M; Scott TA; Ramesh S; Rahman IR; van Heel AJ; Viel JH; Bandarian V; Dittmann E; Genilloud O; Goto Y; Grande Burgos MJ; Hill C; Kim S; Koehnke J; Latham JA; Link AJ; Martínez B; Nair SK; Nicolet Y; Rebuffat S; Sahl H-G; Sareen D; Schmidt EW; Schmitt L; Severinov K; Süssmuth RD; Truman AW; Wang H; Weng J-K; van Wezel GP; Zhang Q; Zhong J; Piel J; Mitchell DA; Kuipers OP; van der Donk WA New Developments in RiPP Discovery, Enzymology and Engineering. Nat. Prod. Rep. 2020, 10.1039.D0NP00027B.
  • (9) Oman TJ; van der Donk WA Follow the Leader: The Use of Leader Peptides to Guide Natural Product Biosynthesis. Nat. Chem. Biol. 2010, 6 (1), 9–18. 10.1038/nchembio.286.
  • (10) Melby JO; Nard NJ; Mitchell DA Thiazole/Oxazole-Modified Microcins: Complex Natural Products from Ribosomal Templates. Curr. Opin. Chem. Biol. 2011, 15 (3), 369–378.
  • (11) Goto Y; Suga H Artificial In Vitro Biosynthesis Systems for the Development of Pseudo-Natural Products. Bull. Chem. Soc. Jpn. 2018, 91 (3), 410–419. 10.1246/bcsj.20170379.
  • (12) Tietz JI; Schwalen CJ; Patel PS; Maxson T; Blair PM; Tai H-C; Zakai UI; Mitchell DA A New Genome-Mining Tool Redefines the Lasso Peptide Biosynthetic Landscape. Nat. Chem. Biol. 2017, 13 (5), 470–478.
  • (13) Altschul S Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search Programs. Nucleic Acids Res. 1997, 25 (17), 3389–3402.
  • (14) El-Gebali S; Mistry J; Bateman A; Eddy SR; Luciani A; Potter SC; Qureshi M; Richardson LJ; Salazar GA; Smart A; Sonnhammer ELL; Hirsh L; Paladin L; Piovesan D; Tosatto SCE; Finn RD The Pfam Protein Families Database in 2019. Nucleic Acids Res. 2019, 47 (D1), D427–D432.
  • (15) Haft DH The TIGRFAMs Database of Protein Families. Nucleic Acids Res. 2003, 31 (1), 371–373.
  • (16) Bailey TL; Boden M; Buske FA; Frith M; Grant CE; Clementi L; Ren J; Li WW; Noble WS MEME SUITE: Tools for Motif Discovery and Searching. Nucleic Acids Res. 2009, 37 (Web Server), W202–W208.
  • (17) Schwalen CJ; Hudson GA; Kille B; Mitchell DA Bioinformatic Expansion and Discovery of Thiopeptide Antibiotics. J. Am. Chem. Soc. 2018, 140 (30), 9494–9501.
  • (18) Hudson GA; Burkhart BJ; DiCaprio AJ; Schwalen CJ; Kille B; Pogorelov TV; Mitchell DA Bioinformatic Mapping of Radical S-Adenosylmethionine-Dependent Ribosomally Synthesized and Post-Translationally Modified Peptides Identifies New Cα, Cβ, and Cγ-Linked Thioether-Containing Peptides. J. Am. Chem. Soc. 2019, jacs.9b01519.
  • (19) de los Santos ELC NeuRiPP: Neural Network Identification of RiPP Precursor Peptides. Sci. Rep. 2019, 9 (1), 13406.
  • (20) Merwin NJ; Mousa WK; Dejong CA; Skinnider MA; Cannon MJ; Li H; Dial K; Gunabalasingam M; Johnston C; Magarvey NA DeepRiPP Integrates Multiomics Data to Automate Discovery of Novel Ribosomally Synthesized Natural Products. Proc. Natl. Acad. Sci. 2020, 117 (1), 371–380.
  • (21) Blin K; Shaw S; Steinke K; Villebro R; Ziemert N; Lee SY; Medema MH; Weber T AntiSMASH 5.0: Updates to the Secondary Metabolite Genome Mining Pipeline. Nucleic Acids Res. 2019, 47 (W1), W81–W87.
  • (22) Santos-Aberturas J; Chandra G; Frattaruolo L; Lacret R; Pham TH; Vior NM; Eyles TH; Truman AW Uncovering the Unexplored Diversity of Thioamidated Ribosomal Peptides in Actinobacteria Using the RiPPER Genome Mining Tool. Nucleic Acids Res. 2019, 47 (9), 4624–4637.
  • (23) Walker MC; Eslami SM; Hetrick KJ; Ackenhusen SE; Mitchell DA; van der Donk WA Precursor Peptide-Targeted Mining of More than One Hundred Thousand Genomes Expands the Lanthipeptide Natural Product Family. BMC Genomics 2020, 21 (1), 387.
  • (24) Ma S; Zhang Q Linaridin Natural Products. Nat. Prod. Rep. 2020, 10.1039.C9NP00074G.
  • (25) Sit CS; Yoganathan S; Vederas JC Biosynthesis of Aminovinyl-Cysteine-Containing Peptides and Its Application in the Production of Potential Drug Candidates. Acc. Chem. Res. 2011, 44 (4), 261–268.
  • (26) Repka LM; Chekan JR; Nair SK; van der Donk WA Mechanistic Understanding of Lanthipeptide Biosynthetic Enzymes. Chem. Rev. 2017, 117 (8), 5457–5520.
  • (27) Claesen J; Bibb M Genome Mining and Genetic Analysis of Cypemycin Biosynthesis Reveal an Unusual Class of Posttranslationally Modified Peptides. Proc. Natl. Acad. Sci. 2010, 107 (37), 16297–16302.
  • (28) Wang F; Wei W; Zhao J; Mo T; Wang X; Huang X; Ma S; Wang S; Deng Z; Ding W; Liang Y; Zhang Q Genome Mining and Biosynthesis Study of a Type B Linaridin Reveals a Highly Versatile α-N-Methyltransferase. CCS Chem. 2020, 1–17.
  • (29) Ding W; Yuan N; Mandalapu D; Mo T; Dong S; Zhang Q Cypemycin Decarboxylase CypD Is Not Responsible for Aminovinyl–Cysteine (AviCys) Ring Formation. Org. Lett. 2018, 20 (23), 7670–7673.
  • (30) Liu L; Chan S; Mo T; Ding W; Yu S; Zhang Q; Yuan S Movements of the Substrate-Binding Clamp of Cypemycin Decarboxylase CypD. J. Chem. Inf. Model. 2019, 59 (6), 2924–2929.
  • (31) Komiyama K; Otoguro K; Segawa T; Shiomi K; Yang H; Takahashi Y; Hayashi M; Otani T; Omura S A New Antibiotic, Cypemycin. Taxonomy, Fermentation, Isolation and Biological Characteristics. J. Antibiot. (Tokyo) 1993, 46 (11), 1666–1671.
  • (32) Shang Z; Winter JM; Kauffman CA; Yang I; Fenical W Salinipeptins: Integrated Genomic and Chemical Approaches Reveal Unusual d-Amino Acid-Containing Ribosomally Synthesized and Post-Translationally Modified Peptides (RiPPs) from a Great Salt Lake Streptomyces Sp. ACS Chem. Biol. 2019, 14 (3), 415–425.
  • (33) DiCaprio AJ; Firouzbakht A; Hudson GA; Mitchell DA Enzymatic Reconstitution and Biosynthetic Investigation of the Lasso Peptide Fusilassin. J. Am. Chem. Soc. 2019, 141 (1), 290–297.
  • (34) Ma S; Zhang Q Linaridin Natural Products. Nat. Prod. Rep. 2020, 10.1039.C9NP00074G.
  • (35) Mistry J; Finn RD; Eddy SR; Bateman A; Punta M Challenges in Homology Search: HMMER3 and Convergent Evolution of Coiled-Coil Regions. Nucleic Acids Res. 2013, 41 (12), e121–e121.
  • (36) Mo T; Liu W-Q; Ji W; Zhao J; Chen T; Ding W; Yu S; Zhang Q Biosynthetic Insights into Linaridin Natural Products from Genome Mining and Precursor Peptide Mutagenesis. ACS Chem. Biol. 2017, 12 (6), 1484–1488.
  • (37) Cao L; Gurevich A; Alexander KL; Naman CB; Leão T; Glukhov E; Luzzatto-Knaan T; Vargas F; Quinn R; Bouslimani A; Nothias LF; Singh NK; Sanders JG; Benitez RAS; Thompson LR; Hamid M-N; Morton JT; Mikheenko A; Shlemov A; Korobeynikov A; Friedberg I; Knight R; Venkateswaran K; Gerwick WH; Gerwick L; Dorrestein PC; Pevzner PA; Mohimani H MetaMiner: A Scalable Peptidogenomics Approach for Discovery of Ribosomal Peptide Natural Products with Blind Modifications from Microbial Communities. Cell Syst. 2019, 9 (6), 600–608.e4.
  • (38) Agrawal P; Khater S; Gupta M; Sain N; Mohanty D RiPPMiner: A Bioinformatics Resource for Deciphering Chemical Structures of RiPPs Based on Prediction of Cleavage and Cross-Links. Nucleic Acids Res. 2017, 45 (W1), W80–W88.
  • (39) Zallot R; Oberg N; Gerlt JA The EFI Web Resource for Genomic Enzymology Tools: Leveraging Protein, Genome, and Metagenome Databases to Discover Novel Enzymes and Metabolic Pathways. Biochemistry 2019, 58 (41), 4169–4182.
  • (40) Shannon P Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res. 2003, 13 (11), 2498–2504.
  • (41) Claesen J; Bibb MJ Biosynthesis and Regulation of Grisemycin, a New Member of the Linaridin Family of Ribosomally Synthesized Peptides Produced by Streptomyces Griseus IFO 13350. J. Bacteriol. 2011, 193 (10), 2510–2516.
  • (42) Rateb ME; Zhai Y; Ehrner E; Rath CM; Wang X; Tabudravu J; Ebel R; Bibb M; Kyeremeh K; Dorrestein PC; Hong K; Jaspars M; Deng H Legonaridin, a New Member of Linaridin RiPP from a Ghanaian Streptomyces Isolate. Org. Biomol. Chem. 2015, 13 (37), 9585–9592.
  • (43) Crooks GE WebLogo: A Sequence Logo Generator. Genome Res. 2004, 14 (6), 1188–1190.
  • (44) Zhang Z; Hudson GA; Mahanta N; Tietz JI; van der Donk WA; Mitchell DA Biosynthetic Timing and Substrate Specificity for the Thiopeptide Thiomuracin. J. Am. Chem. Soc. 2016, 138 (48), 15511–15514.
  • (45) Repka LM; Hetrick KJ; Chee SH; van der Donk WA Characterization of Leader Peptide Binding During Catalysis by the Nisin Dehydratase NisB. J. Am. Chem. Soc. 2018, 140 (12), 4200–4203.
  • (46) Arthur RA; Nicholson AC; Humrighouse BW; McQuiston JR; Lasker BA Draft Genome Sequence of Kroppenstedtia Sanguinis X0209T, a Clinical Isolate Recovered from Human Blood. Microbiol. Resour. Announc. 2019, 8 (24), MRA.00354–19, e00354–19.
  • (47) Cox CL; Tietz JI; Sokolowski K; Melby JO; Doroghazi JR; Mitchell DA Nucleophilic 1,4-Additions for Natural Product Discovery. ACS Chem. Biol. 2014, 9 (9), 2014–2022.
  • (48) Bonauer C; Walenzyk T; König B α,β-Dehydroamino Acids. Synthesis 2006, 2006 (01), 1–20. 10.1055/s-2005-921759.
  • (49) Wells L; Vosseller K; Cole RN; Cronshaw JM; Matunis MJ; Hart GW Mapping Sites of O-GlcNAc Modification Using Affinity Tags for Serine and Threonine Post-Translational Modifications. Mol. Cell. Proteomics 2002, 1 (10), 791–804.
  • (50) Li J; Girard G; Florea BI; Geurink PP; Li N; van der Marel GA; Overhand M; Overkleeft HS; van Wezel GP Identification and Isolation of Lantibiotics from Culture: A Bioorthogonal Chemistry Approach. Org. Biomol. Chem. 2012, 10 (43), 8677.
  • (51) Schoof S; Baumann S; Ellinger B; Arndt H-D A Fluorescent Probe for the 70 S-Ribosomal GTPase-Associated Center. ChemBioChem 2009, 10 (2), 242–245.
  • (52) Livnat I; Tai H-C; Jansson ET; Bai L; Romanova EV; Chen T; Yu K; Chen S; Zhang Y; Wang Z; Liu D; Weiss KR; Jing J; Sweedler JV A d-Amino Acid-Containing Neuropeptide Discovery Funnel. Anal. Chem. 2016, 88 (23), 11868–11876.
