Abstract
Many peptide-based natural products require a leader peptide to reach their final modified form, but identifying general rules for leader peptide interactions has been stymied by the diversity of these molecules. Two papers reporting crystallographic and bioinformatic analyses of these systems now reveal a structurally conserved domain that mediates leader peptide binding.
Ribosomally synthesized and posttranslationally modified peptides (RiPPs) are a diverse family of natural products, many of which exhibit therapeutically relevant activities [1]. RiPPs produced by bacteria run the gamut from small molecules such as microcin C7 [2] and pyrroloquinoline quinone [3] to the proteusins [4], molecules made up of >40 amino acids with dozens of posttranslational modifications. One feature that connects these otherwise disparate molecules is the presence of a peptide sequence, the leader peptide, which is proteolytically removed either during or after the maturation of the natural product [5]. These leader peptides, however, vary tremendously in both length and sequence composition (Figure 1A). Thus, understanding the role of these leader peptides in determining the final product has been difficult and has mostly been tackled on a system-by-system basis. In this issue, two papers reverse that trend, using structural and bioinformatic approaches to provide general insights into the role of leader peptides in RiPP biosynthesis [6,7].
All RiPPs start out as linear chains of amino acids, referred to as the precursor peptide. This is in contrast to natural products made by polyketide synthases (PKS) or nonribosomal peptide synthetases (NRPS), which are assembled one monomer at a time. The RiPP precursor peptide can be further broken down into the leader peptide, which is eventually cleaved off, and the core peptide, which becomes the natural product via the action of one or more maturation enzymes. While it has been assumed that the leader peptide is a substrate recognition element that directs the maturation of RiPPs in some way, only recently have crystal structures emerged that show the interaction of a RiPP precursor peptide with a maturation enzyme. The structure of the microcin C7 biosynthetic enzyme MccB bound to its precursor peptide substrate MccA has been determined [2]. More recently, a structure of the nisin precursor NisA leader peptide in complex with the lanthipeptide biosynthetic enzyme NisB was reported [8]. These structures provided the first glimpses of leader peptide interactions with maturation enzymes.
Koehnke et al. have added to this list by solving the structure of LynD, an ATP-dependent heterocyclase involved in cyanobactin biosynthesis, with bound leader peptide from an artificial cyanobactin precursor PatE’ [6]. The LynD structure was solved with bound peptide substrate as well as bound ATP and analogs. The structures of these complexes revealed that substrate recognition (via leader peptide binding) and the heterocyclase activity are located in two distinct domains of the protein. This modularity is reminiscent of the MccB and NisB structures. The functional form of LynD is a dimer in which the leader peptide binding domain of one chain is in close proximity to the catalytic domain of the other chain. Using these structural insights, Koehnke et al. also engineered a version of the heterocyclase in which the leader peptide is provided in cis via its fusion to the N-terminus of LynD [6]. This enzyme has the advantage of being able to process “leader peptide-free” cyanobactin precursors, reducing the size of the required substrate from >50 aa to as little as 12 aa.
Burkhart et al. realized that the leader peptide binding domains of LynD, MccB, and NisB were structurally similar (Figure 1B), even though sequence identity was low, prompting the authors to develop a bioinformatic method to identify leader peptide binding domains in other RiPP gene clusters [7]. These authors show that such domains are widespread among many different classes of RiPP gene clusters, suggesting that the leader peptide binding domain is a defining feature of RiPP biosynthesis. With the observation of similar leader peptide binding domains across three different RiPP families, Burkhart et al. proposed the name RiPP precursor peptide recognition element, or RRE, for such domains. Using the homology detection tool HHpred, the authors found that the NisB and LynD leader peptide binding domains exhibited structural similarity to the protein PqqD. This is yet another connection to RiPP biosynthesis, as PqqD has recently been shown to bind the putative peptide precursor to pyrroloquinoline quinone, PqqA [3]. The RRE was found in a total of 11 different classes of RiPP gene clusters, over half of the currently known RiPP classes. The interaction between leader peptide and RRE was demonstrated using fluorescence polarization binding assays and mutagenesis for three additional classes of RiPPs: linear azoline-containing peptides, thiopeptides, and lasso peptides.
Collectively, these papers show that leader peptide recognition by PqqD-like domains is well conserved across RiPP biosynthetic pathways. In many ways, this is an extraordinary observation: a single ~80 aa protein domain is involved in coordinating the biosynthesis of more than ten different classes of natural products, each with its own idiosyncrasies in biosynthetic logic. The only commonality among these natural products is their origin as linear chains of amino acids produced on the ribosome. RRE domains engage RiPP precursor peptides long and short, polar and hydrophobic. The RRE binds these peptides in different poses (Figure 1C), perhaps explaining how this simple domain can interact with so many different sequences. RREs from different RiPP classes have low sequence similarity, so it is interesting to think about how such a domain evolved. Phylogenetic analyses of the RRE may reveal as-yet-unappreciated connections between different RiPP classes. Given the robustness of these domains in binding peptide substrates, it is also of interest to search for PqqD homologs in contexts beyond RiPP biosynthesis. Such searches may reveal further functions for what seems to be a highly capable protein domain.
References
- 1. Arnison PG, et al. Natural Product Reports. 2013;30:108–160. doi: 10.1039/c2np20085f.
- 2. Regni CA, et al. EMBO Journal. 2009;28:1953–1964. doi: 10.1038/emboj.2009.146.
- 3. Latham JA, Iavarone AT, Barr I, Juthani PV, Klinman JP. Journal of Biological Chemistry. 2015. doi: 10.1074/jbc.M115.646521.
- 4. Freeman MF, et al. Science. 2012;338:387–390. doi: 10.1126/science.1226121.
- 5. Oman TJ, van der Donk WA. Nature Chemical Biology. 2010;6:9–18. doi: 10.1038/nchembio.286.
- 6. Koehnke J, et al. Nature Chemical Biology. 2015.
- 7. Burkhart BJ, Hudson GA, Dunbar KL, Mitchell DA. Nature Chemical Biology. 2015. doi: 10.1038/nchembio.1856.
- 8. Ortega MA, et al. Nature. 2015;517:509–512. doi: 10.1038/nature13888.