Abstract
Antimicrobial peptides (AMPs) are promising compounds for the treatment of antibiotic-resistant bacteria and are found across all organisms, including plants. Unlike most antibiotics, AMPs tend to act on more generalized and multiple targets, making development of resistance more difficult. Conventional approaches towards AMP identification include bioactivity-guided fractionation and genome mining. Complementary methods leveraging bioactivity-guided fractionation, cysteine motif-guided in silico AMP prediction, and mass spectrometric approaches can be combined to expand botanical AMP discovery. Herein, we present an integrated workflow which serves to streamline implementation towards a robust botanical AMP discovery pipeline.
Keywords: Antimicrobial peptides, Mass spectrometry, PepSAVI-MS, SignalP, Cysmotif Searcher, Mass shift
1. Introduction
Antimicrobial resistance is a public health issue that results in at least 2.8 million infections and more than 35,000 deaths each year in the United States (Centers for Disease Control and Prevention, 2020). Antimicrobial peptides (AMPs) are being explored as a way to combat the ever-increasing prevalence of multidrug-resistant bacteria (Browne et al., 2020; Magana et al., 2020; Mahlapuu et al., 2020; Sierra et al., 2017). AMPs are ubiquitously expressed across all life and are generally amphipathic, highly basic, and cysteine-rich, containing disulfide bonds that influence tertiary structure folding (Tam et al., 2015). Generally, their basic nature facilitates interaction with negatively charged bacterial membranes and promotes specificity towards these membranes rather than zwitterionic mammalian membranes (da Costa et al., 2015). AMPs commonly act via a membrane-lytic mechanism, in which they disrupt the cytoplasmic barrier following interaction with the negatively charged surface and insertion into the hydrophobic membrane (da Costa et al., 2015; Moretta et al., 2021). As AMPs tend to act on these more generalized targets, resistance towards them is not as prevalent as resistance to conventional antibiotics (Assoni et al., 2020). Among sources of AMPs, plant AMPs remain relatively underexplored compared to their mammalian counterparts (Wang et al., 2015) and are a potential wealth of novel bioactives. Plant AMPs are ribosomally derived and often expressed as precursor proteins containing N-terminal signal peptides, which are cleaved in the processing of the mature AMP (Tam et al., 2015). They are classified into AMP families based on characteristics such as higher order structure, cysteine motifs, and disulfide bonding patterns (Tam et al., 2015).
AMP discovery can be challenging. Most conventionally, bioactivity-guided fractionation is employed, in which bioactive fractions are subjected to iterative chromatographic separations and bioassays that lead to the compound of interest (Sharma & Gupta, 2015). This process can be time-consuming and is often biased towards abundant and highly active compounds, potentially leading to re-identification of previously known compounds (Yang et al., 2013). Furthermore, bioactivity-guided fractionation may fail to identify AMPs if they are present at too low a concentration to exert any activity or if they are specific towards pathogens that are not included in the screen. Isolation of compounds of interest and subsequent confirmation of the bioactive molecule can often be difficult given the complexities of the sample matrices, especially in plant extracts (Agarwal et al., 2014). To address these challenges, contributors to bioactivity within more complex samples/fractions can be determined with statistical modelling, in which peptidyl abundance is correlated with observed bioactivity, as is done in PepSAVI-MS (Kirkpatrick et al., 2017).
Alternatively, AMP discovery can be achieved via genome mining (Russell & Truman, 2020), where biosynthetic gene clusters that produce putative AMPs are detected within genomic data. This can be challenging in planta, as there are few conserved features across ribosomally synthesized and post-translationally modified peptide (RiPP) biosynthetic gene clusters (Russell & Truman, 2020). Furthermore, this method gives no information about antimicrobial activity.
Herein, we detail a multi-faceted approach to provide efficient AMP identification from complex botanical extracts. In silico cysteine-motif guided AMP prediction is combined with mass spectrometric approaches such as bottom-up proteomics / mass shift analysis and bioactivity-guided fractionation (Figure 1) to increase overall success for novel AMP discovery.
Figure 1.

Overall AMP identification workflow. Bioactivity-guided assays and in silico AMP prediction are combined with mass spectrometric approaches including bottom-up proteomics and mass shift analysis.
1.1. In silico AMP prediction
Cysteine motif-guided AMP prediction can be applied to thoroughly probe increasingly available plant genomic and/or transcriptomic data for putative AMPs. As mature AMPs are generally cysteine rich, carrying highly conserved cysteine motifs (Tam et al., 2015), these features can be leveraged to search for AMPs within protein databases. Furthermore, as AMP precursors often carry N-terminal signal peptides (Tam et al., 2015), these too can be leveraged. SignalP 5.0 (Almagro Armenteros et al., 2019), a deep neural network that predicts and cleaves N-terminal signal peptides in silico, is used to suggest mature sequences and generate a focused set of peptides for further analysis (Figure 2A). Cysmotif Searcher (Shelenkov et al., 2018) reveals cysteine motifs within the SignalP output and sorts results into AMP families (Figure 2B). These methods lead to a concise list of predicted AMPs, with sequence and classification information.
Figure 2.

Identification of a putative defensin. (A) Full sequence of the precursor protein with the signal peptide predicted by SignalP denoted in blue. (B) Defensin sequence predicted by Cysmotif Searcher, with the identified cysteine motif denoted above the sequence, and the motif within the sequence denoted by lowercase letters. The cysteine motif is represented with C corresponding to cysteine residues, X corresponding to any other amino acid residue, and bracketed numbers indicating the range of residues between cysteines (e.g. CX{4,25}C means that anywhere from 4 to 25 residues can exist between these cysteine residues). (C) Database search results, with the identified tryptic peptides denoted in red within the full protein sequence. (D) Mass shift analysis of the peptide, with the masses and mass spectra of both the intact and reduced/iodoacetamide-alkylated peptide.
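The motif notation described for Figure 2B maps closely onto a regular expression, which can be convenient when checking a predicted hit by hand. The following Python sketch is an illustration only and is not how Cysmotif Searcher is implemented; the motif and sequence are hypothetical, and X is assumed to mean any residue other than cysteine.

```python
import re

def motif_to_regex(motif):
    """Translate motif notation such as 'CX{4,25}C' into a regex string.

    Assumption: X stands for any residue other than cysteine.
    """
    return motif.replace("X", "[^C]")

def mark_motif(sequence, motif):
    """Return the sequence with the first motif match in lowercase,
    mirroring the display convention of Figure 2B, or None if absent."""
    match = re.search(motif_to_regex(motif), sequence)
    if match is None:
        return None
    start, end = match.span()
    return sequence[:start] + sequence[start:end].lower() + sequence[end:]

# Hypothetical motif and sequence, for illustration only
example_motif = "CX{3,5}CX{2,4}C"
example_sequence = "AKCGGGGCAAACK"
print(mark_motif(example_sequence, example_motif))
```

Because the bracketed ranges are already valid regex quantifiers, only the X placeholder needs to be rewritten.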
A key limitation of this approach is its inability to account for additional protein cleavages, such as removal of a C-terminal pro-domain. For example, consider a predicted defensin (Figure 2B), for which the suggested mature sequence contains an additional C-terminal domain after the cysteine motif. Closer inspection of this sequence shows that this domain is likely an acidic C-terminal pro-domain, a feature known to exist in some defensin precursors (Lay et al., 2014). Additional challenges in accurate sequence prediction arise when a mature AMP is post-translationally modified, especially given how extensive post-translational modifications (PTMs) are within AMPs (Wang, 2012), as Cysmotif Searcher cannot predict these. To elucidate the mature AMP sequences, mass spectrometric approaches that build upon these in silico predictions are necessary.
1.2. Mass spectrometric characterization
While predictive approaches are useful in prioritizing botanical species for AMP screening as well as providing a targeted list for identification, mass spectrometric approaches are a necessary complement in addressing the limitations of mature sequence prediction. Furthermore, in vivo translation is not guaranteed, making direct detection necessary. Mass spectrometry-based bottom-up proteomics can be used to detect these predicted AMPs within an extract, in which the plant extract is subjected to reduction and alkylation of disulfide bonds, followed by digestion with a protease such as trypsin. These tryptic peptides are then subjected to tandem mass spectrometry (MS/MS), coupled with a separation method such as liquid chromatography (LC), and the results are searched against a protein database digested in silico to identify proteins within the sample (Dupree et al., 2020; Y. Zhang et al., 2013; Z. Zhang et al., 2014).
Bottom-up proteomics has several strengths: the proteolytically generated peptides are easier to separate and show better ionization and fragmentation than larger peptides and full proteins (Dupree et al., 2020). However, key disadvantages include limited protein coverage (i.e., the proportion of the full protein sequence that is detected), especially when tryptic peptides are too small to be identified by MS (Dupree et al., 2020). For example, the defensin that was predicted by Cysmotif Searcher (Figure 2B) had only 46% coverage with a trypsin digestion, and the mature sequence is still unclear (Figure 2C). In addition, highly similar protein sequences within a proteome have the potential to share tryptic peptides, and when only non-unique tryptic peptides are detected, it is difficult to guarantee detection of a specific protein of interest.
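To make the notion of coverage concrete, the following Python sketch performs a simple in silico tryptic digest and computes the percentage of a sequence covered by a set of detected peptides. It is a minimal illustration and not the search engine's algorithm: the sequence and peptide lists are hypothetical, missed cleavages are ignored, and the common rule of cleaving after K or R except before P is assumed.

```python
import re

def tryptic_digest(sequence, min_length=1):
    """Cleave after K or R, except when the next residue is P.

    Peptides shorter than min_length are dropped to mimic fragments
    that are too small to be identified.
    """
    peptides = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in peptides if len(p) >= max(min_length, 1)]

def coverage(sequence, detected_peptides):
    """Percentage of residues covered by at least one detected peptide."""
    covered = [False] * len(sequence)
    for peptide in detected_peptides:
        start = sequence.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = sequence.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(sequence)

# Hypothetical sequence, for illustration only
protein = "ACDKGGRPLKCCEFR"
print(tryptic_digest(protein))                # all theoretical peptides
print(tryptic_digest(protein, min_length=5))  # short peptides removed
print(coverage(protein, ["GGRPLK"]))          # one detected peptide
```

In this toy case a single detected peptide covers 6 of 15 residues, illustrating how undetected short peptides reduce coverage.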
Mass shift analysis is another mass spectrometric approach that can be utilized in the characterization of cysteine-rich AMPs (Narayani et al., 2017). It uses the characteristic mass shifts observed upon reduction and alkylation of disulfide-bound peptides to enumerate the disulfides in the mature structure. Reduction with dithiothreitol and subsequent alkylation with iodoacetamide results in a 58 Da mass shift per disulfide-bound cysteine residue (Figure 3). Comparison of the mass of the mature peptide to that of the reduced and alkylated peptide therefore indicates the total number of disulfide bonds in the AMP structure. These data can be highly complementary to the data obtained from bottom-up proteomics. For example, a mass of 5728.7 Da is observed in a sample containing intact peptides, with no reduction or alkylation. Upon reduction and alkylation, a mass of 6193.0 Da is observed, an increase of 464.3 Da, which corresponds to eight cysteine residues (8 × 58 Da) involved in four disulfide bonds (Figure 2D). Coupled with in silico predictions and bottom-up proteomics, it can be determined that this likely corresponds to the defensin that was identified (Figure 2B), in which there is an additional cleavage after the final cysteine residue in the cysteine motif, yielding a mature AMP structure with a mass of 5728.7 Da and four disulfide bonds.
Figure 3.

Mass shift analysis workflow for determining the number of disulfide bonds in a peptide. The intact, disulfide-bound peptide is reduced with dithiothreitol, followed by alkylation with iodoacetamide. A carbamidomethyl group is added to each reduced cysteine, shown as purple circles, resulting in a shift of +58 Da for each.
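The arithmetic behind the mass shift can be sketched in a few lines of Python. The +58 Da shift per cysteine is taken here as the sum of the carbamidomethyl group (about 57.02 Da) and the hydrogen gained on reduction (about 1.01 Da). The function name is hypothetical and this is not the authors' script; the masses are those of the worked example in the text.

```python
CARBAMIDOMETHYL = 57.02146  # Da, group added by iodoacetamide
HYDROGEN = 1.00783          # Da, gained by each cysteine on reduction
SHIFT_PER_CYS = CARBAMIDOMETHYL + HYDROGEN  # about 58.03 Da

def count_disulfides(intact_mass, alkylated_mass):
    """Estimate (cysteines in disulfides, disulfide bonds) from two masses."""
    n_cys = round((alkylated_mass - intact_mass) / SHIFT_PER_CYS)
    return n_cys, n_cys // 2

n_cys, n_bonds = count_disulfides(5728.7, 6193.0)
print(n_cys, n_bonds)
```

Rounding to the nearest integer absorbs small measurement error; a result far from an integer before rounding would suggest that the two masses are not a genuine intact/alkylated pair.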
1.3. Incorporation of bioactivity screening
While it is possible to identify AMPs solely through predictive and mass spectrometric approaches, incorporating bioactivity-guided fractionation in tandem with these methods can help to prioritize characterization as well as reveal novel AMPs that may not be predicted through Cysmotif Searcher.
In addition to screening for in silico predicted AMPs via mass spectrometric methods, peptide fractions can also be directly subjected to bioassays against pathogens of interest. In the case of observed bioactivity, those fractions may be prioritized for characterization. This was seen with the recent discovery of two botanical AMPs from ghost pepper, CC-AMP1 and CC-AMP2, in which peptides were fractionated via reversed-phase LC and several of these fractions displayed antibacterial activity and were further examined (Culver et al., 2021). While CC-AMP1 was a variation of a predicted AMP sequence, CC-AMP2 contained an entirely novel cysteine motif and was not predicted by Cysmotif Searcher, only being identified via these antibacterial assays (Culver et al., 2021). This example further illustrates the complementary nature of using these different approaches in tandem.
Combining these complementary methods (in silico prediction; bottom-up proteomics; mass shift analysis; MS/MS analysis of both the intact and reduced/alkylated peptides for sequence confirmation; and bioactivity-guided fractionation with PepSAVI-MS for analysis of complex peptide mixtures) yields a comprehensive protocol for botanical AMP identification.
2. Materials
2.1. Databases
Protein database in FASTA format, such as one downloaded from UniProt (https://www.uniprot.org/). Alternatively, transcriptomic data can be downloaded from sources such as 1000 plants (https://sites.google.com/a/ualberta.ca/onekp/), or genomic data can be downloaded from sources such as Phytozome (https://phytozome-next.jgi.doe.gov/).
Database of common laboratory contaminants (https://www.thegpm.org/crap/)
EMBOSS (http://emboss.sourceforge.net/download/), if translating transcriptomic or genomic assemblies
2.2. Reduction and alkylation
100 mM ammonium bicarbonate, pH 7.8
Reduction buffer: 500 mM dithiothreitol in 100 mM ammonium bicarbonate, pH 7.8
Alkylation buffer: 100 mM iodoacetamide in 100 mM ammonium bicarbonate, pH 7.8
2.3. Trypsin digestion
0.5 µg/µL Promega (Madison, WI, USA) trypsin gold, mass spectrometry grade, in 50 mM acetic acid
5% formic acid (FA, LC-MS grade) in water (LC-MS grade)
2.4. Desalting
Millipore (Burlington, MA, USA) C18 ZipTips
0.1% trifluoroacetic acid (TFA, LC-MS grade)
5% TFA (LC-MS grade)
1% TFA (LC-MS grade), 2% acetonitrile (ACN, LC-MS grade)
ACN (LC-MS grade)
70% ACN (LC-MS grade), 0.1% TFA (LC-MS grade)
2.5. LC-MS/MS analysis
LC-MS total recovery vials (Waters)
5% ACN (LC-MS grade), 0.1% TFA (LC-MS grade)
Symmetry C18 trap column (100 Å, 5 μm, 180 μm x 20 mm; Waters)
HSS T3 C18 column (100 Å, 1.8 μm, 75 μm x 250 mm; Waters)
Mobile phase A: 0.1% FA (LC-MS grade) in water (LC-MS grade)
Mobile phase B: 0.1% FA (LC-MS grade) in ACN (LC-MS grade)
Acquity UPLC M-Class system (Waters)
Q Exactive HF-X Hybrid Quadrupole Orbitrap mass spectrometer (ThermoFisher, Waltham, MA, USA)
2.6. Data analysis
SignalP 5.0b for Linux or Darwin. Download link available at DTU Health Tech (https://services.healthtech.dtu.dk/software.php)
Perl 5.8 or later (https://www.perl.org/get.html)
Cysmotif Searcher for cysteine motif mining (https://github.com/fallandar/cysmotifsearcher)
MSConvert (ProteoWizard, http://proteowizard.sourceforge.net/download.html)
Mascot Daemon v3.5.1 (Matrix Science, Boston, MA, USA)
Progenesis QI for Proteomics v2.0 (Nonlinear Dynamics, Durham, NC, USA)
Python script for mass shift analysis (https://github.com/hickslab/MassShiftAnalysis)
2.7. Disulfide Bond Connectivity Determination via Partial Acid Hydrolysis
3M hydrochloric acid (HCl, ACS Plus grade)
Glass vial
3. Protocol
3.1. Cysteine motif-guided AMP prediction
If translating transcriptomic or genomic assemblies, run getorf using the EMBOSS package with the input as the FASTA file containing the nucleotide sequences. This program will find and extract open reading frames, as well as translate them into protein sequences. Use the default options.
Upload the FASTA file of the protein database of interest to the system that will run SignalP.
Run SignalP on the protein database FASTA file using the following options:
- Set the output format to short
- Set organism to eukaryote
- Generate a mature sequence FASTA file
With the resultant mature FASTA file from SignalP, run Cysmotif Searcher with the option of skipping translation of input sequences.
Unzip the Cysmotif Searcher results file and obtain the *_motifs_pre.FASTA file, which will contain the full set of predicted AMPs.
Tip: When parsing through Cysmotif Searcher results, it is important to note that CYSRICH classifications arise when a sequence is found to have a known cysteine motif, but to also contain additional cysteine residues not belonging to that motif. While this can potentially result in the discovery of novel cysteine motifs, CYSRICH classifications will often be observed in larger precursor proteins that have additional cleavages outside of just the signal peptide cleavage. If these pro-domains contain any cysteine residues, the sequence is classified as CYSRICH. This is also observed in precursors that produce multiple mature AMP sequences, as in some α-Hairpinins (Slavokhotova & Rogozhin, 2020), in which the cysteine motif is observed several times throughout the sequence. In analyzing these results, it can often be useful to search CYSRICH sequences on UniProt BLAST (Pundir et al., 2016), which may reveal regions of homology with known AMP families and provide more insight into the precursor protein and possible mature AMP regions.
3.2. Peptide extraction and fractionation
For a detailed protocol on plant peptide extraction and subsequent chromatographic fractionation, refer to Brechbill et al. (Brechbill et al., 2021). Briefly, the steps are as follows:
Grind the plant material into a powder under liquid nitrogen and extract with the extraction buffer.
Filter out solid material, then filter out large proteins from the extract.
Dialyze the extract, then subject it to strong cation exchange (SCX) chromatography to remove the majority of small molecules.
Using C18 solid phase extraction (SPE), desalt the SCX elution.
Fractionate the desalted sample using reversed-phase chromatography.
Resuspend the fractions in LC-MS water for downstream analyses.
3.3. Bacterial bioassay
For a detailed bioassay protocol, refer to Brechbill et al. (Brechbill et al., 2021). Briefly, the steps are as follows:
Streak the bacterial pathogen on an appropriate agar plate and incubate at 37 °C overnight.
Add an isolated bacterial colony to Mueller-Hinton broth (MHB) and incubate overnight.
Dilute the culture and incubate for an additional hour to reach mid-log phase.
Combine bacterial culture with peptide samples in a 96-well plate, using water as a negative control and an appropriate antibiotic as a positive control.
Incubate for 4 hours, then measure optical density at 600 nm (OD600).
Add resazurin to the wells, incubate for an additional hour, then measure fluorescence with an excitation wavelength of 544 nm and emission wavelength of 590 nm.
Calculate percent activity against the positive and negative controls.
Tip: Peptide fractions that display bioactivity can be prioritized for further analyses. If an active fraction is an isolated peptide, then this is likely the active component. If a complex mixture of peptides is observed, then a statistical modelling approach, such as PepSAVI-MS (Kirkpatrick et al., 2017) (https://cran.r-project.org/web/packages/PepSAVIms/index.html) can be utilized to determine the active component(s).
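The percent activity calculation can be sketched as a normalization between the two controls. The formula below is an assumption rather than the authors' stated method (see Brechbill et al., 2021 for the detailed protocol): the negative (water) control is taken as 0% activity and the positive (antibiotic) control as 100% activity. The function name and readings are hypothetical.

```python
def percent_activity(sample, negative_control, positive_control):
    """Scale a reading so that the negative control is 0% and the
    positive control is 100% activity (assumed normalization)."""
    span = negative_control - positive_control
    if span == 0:
        raise ValueError("controls are identical; cannot normalize")
    return 100.0 * (negative_control - sample) / span

# Hypothetical readings (arbitrary units), for illustration only
print(percent_activity(sample=0.25, negative_control=1.0, positive_control=0.0))
```

The same function can be applied to either the OD600 or the resazurin fluorescence readings, provided the controls are taken from the same plate and readout.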
3.4. Reduction and alkylation
From each peptide fraction, aliquot three samples at 10 µL each (approximately 1 µg/µL): one for analysis of mature peptides, one for reduction and alkylation, and one for reduction, alkylation, and trypsin digestion. The mature peptides do not need to be desalted and are ready for LC-MS/MS analysis (section 3.7).
To the 10 µL of both the reduction/alkylation and the reduction/alkylation/trypsin digestion samples, add 40 µL of 100 mM ammonium bicarbonate, pH 7.8. Reduce by adding 1 µL of 500 mM dithiothreitol (~10 mM final concentration) and incubating in a thermomixer at 45°C and 850 rpm for 30 minutes in the dark.
Alkylate by adding 10 µL of 100 mM iodoacetamide (~16 mM final concentration) and incubating at 25°C at 850 rpm for 15 minutes in the dark.
For samples that will not be trypsin digested, quench with 10 µL of 5% FA (LC-MS grade) and dry in a vacuum centrifuge.
Tip: The iodoacetamide and dithiothreitol stocks should be made fresh and light exposure should be avoided.
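The approximate final concentrations quoted in this section follow from simple dilution arithmetic, which can be checked with the short Python sketch below. The function name is hypothetical, and the volumes are those given in the steps above (10 µL sample plus 40 µL buffer before dithiothreitol is added).

```python
def final_concentration(stock_mM, added_uL, starting_uL):
    """Concentration (mM) after adding a stock to an existing volume."""
    return stock_mM * added_uL / (starting_uL + added_uL)

# 1 uL of 500 mM dithiothreitol added to 50 uL (10 uL sample + 40 uL buffer)
dtt_mM = final_concentration(500, 1, 50)
# 10 uL of 100 mM iodoacetamide added to the resulting 51 uL
iaa_mM = final_concentration(100, 10, 51)
print(round(dtt_mM, 1), round(iaa_mM, 1))
```

If sample or buffer volumes are scaled, the same function can be used to keep the reagent concentrations close to these values.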
3.5. Trypsin digestion
To the reduced and alkylated samples that will be digested, add 0.5 µL of 0.5 µg/µL Promega (Madison, WI, USA) trypsin gold, mass spectrometry grade, in 50 mM acetic acid for an enzyme:protein ratio of approximately 1:50 (w/w).
Incubate overnight, approximately 16–17 hours, at 37°C and 850 rpm.
Add 10 µL of 5% FA (LC-MS grade) and dry in a vacuum centrifuge.
3.6. Desalting
Resuspend samples in 20 µL of 1% TFA (LC-MS grade), 2% ACN (LC-MS grade).
Confirm with a pH test strip that the pH of the sample is <3. If pH is not <3, spike in additional small aliquots of 5% TFA (LC-MS grade) until the pH is <3.
In advance, prepare an Eppendorf tube for elution containing 10 µL of 70% ACN (LC-MS grade), 0.1% TFA (LC-MS grade).
Using a Millipore (Burlington, MA, USA) C18 ZipTip attached to a 10 µL pipette, set pipette to 10 µL and draw up 100% ACN (LC-MS grade) 3 times, expelling the waste liquid each time down to the resin, without letting the resin go dry.
Draw up 0.1% TFA (LC-MS grade) 3 times.
Draw up sample 10 times, expelling back into the sample tube each time.
Draw up 0.1% TFA (LC-MS grade) 6 times.
From the prepared elution tube, draw up 70% ACN (LC-MS grade), 0.1% TFA (LC-MS grade) 10 times, expelling back into the same tube each time.
Dry elutions using a vacuum centrifuge.
3.7. LC-MS/MS analysis
Resuspend mature, reduced/alkylated, and reduced/alkylated/trypsin digested peptide samples in 5% ACN (LC-MS grade), 0.1% TFA (LC-MS grade) and transfer to LC-MS total recovery vials (Waters).
Inject samples and perform LC-MS/MS analysis using an Acquity UPLC M-Class system (Waters) coupled to a Q Exactive HF-X Hybrid Quadrupole Orbitrap mass spectrometer (ThermoFisher) via a Nanospray Flex Ion Source (ThermoFisher). Inject the peptide mixture onto a Symmetry C18 trap column (100 Å, 5 μm, 180 μm x 20 mm; Waters) with a flow rate of 5 μL/min for 3 min using 99% A and 1% B, then separate on an HSS T3 C18 column (100 Å, 1.8 μm, 75 μm x 250 mm; Waters) using a gradient of increasing mobile phase B at a flow rate of 300 nL/min. Hold mobile phase B at 5% for 1 min before increasing from 5% to 50% in 30 min and ramping to 85% in 2 min. Hold at 85% for 3 min before returning to 5% in 1 min and re-equilibrating for 23 min.
Use the following MS acquisition parameters: Use a tune file set with positive polarity, 2.1 kV spray voltage, 325 °C capillary temperature, and 40 S-lens RF level. In the instrument method, include lock masses (best) of 371.10124 and 445.12003 m/z, corresponding to background polysiloxane ions. Select full MS/DD-MS2 scan type and set method duration to 60 min and default charge state to 2. Perform MS survey scan in profile mode across 350–2000 m/z at 120,000 resolution until 50 ms maximum IT or 3 × 10⁶ AGC target is reached. Select the top 20 features above 5000 counts, excluding ions with unassigned, +1, or >+8 charge state. Collect MS2 scans at 30,000 resolution with NCE at 28 until 100 ms maximum IT or 1 × 10⁵ AGC target is reached. Set the dynamic exclusion window for precursor m/z to 10 s and the isolation window to 1.5 m/z.
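As a quick consistency check on the LC gradient described in this section, the segments can be written out as data and evaluated. The Python sketch below is a hypothetical helper, not instrument software; the segment values are transcribed from the gradient description, and linear ramps between set points are assumed.

```python
# (duration in min, %B at segment start, %B at segment end)
GRADIENT = [
    (1, 5, 5),     # initial hold
    (30, 5, 50),   # separation ramp
    (2, 50, 85),   # ramp to wash
    (3, 85, 85),   # wash hold
    (1, 85, 5),    # return to starting conditions
    (23, 5, 5),    # re-equilibration
]

def total_runtime(segments):
    """Total gradient length in minutes."""
    return sum(duration for duration, _, _ in segments)

def percent_b(time_min, segments):
    """Percent mobile phase B at a given time, assuming linear ramps."""
    if time_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start_b, end_b in segments:
        if time_min <= elapsed + duration:
            fraction = (time_min - elapsed) / duration
            return start_b + fraction * (end_b - start_b)
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient")

print(total_runtime(GRADIENT))
print(percent_b(16, GRADIENT))
```

The segments sum to 60 min, which agrees with the method duration set in the MS acquisition parameters.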
3.8. Data analysis
3.8.1. Bottom-up proteomics via Mascot database search
Upload LC-MS/MS results files (.raw) into MSConvert and convert to Mascot Generic Format (.mgf) files, selecting “Write index,” “Use zlib compression,” and “TPP compatibility.” Use the following filter options:
- peakPicking: vendor msLevel=1-2
- zeroSamples: removeExtra 1-
- activation: HCD
- threshold: count 5000 most intense
Using Mascot Daemon, search against the protein database of interest, as well as a database of common laboratory contaminants (https://www.thegpm.org/crap/). Select an MS/MS ion search with an MS/MS tolerance of 0.1 Da and enable the decoy database search. Set peptide charge to +1, +2, and +3 and set the peptide tolerance to 15 ppm. Select trypsin with a maximum of 3 missed cleavages. Select a variable modification of methionine oxidation and a fixed modification of cysteine carbamidomethylation. Run the search and, after it is complete, adjust the false discovery rate of the significant peptide identifications to less than 1% using the embedded Percolator algorithm. Export results as a .csv file.
Parse through Mascot results to identify any Cysmotif Searcher predicted AMPs detected in the peptide samples. One way to do this is by finding matching protein accessions between the two lists and then discarding Mascot identifications of other proteins.
Tip: Other database search engines can be used to analyze the data. Examples include SEQUEST (Eng et al., 1994), ProteinPilot (Shilov et al., 2007), and Andromeda (Cox et al., 2011).
Tip: After percolation, peptide scores of 13 represent a p-value of 0.05 and peptide matches with scores <13 should generally be discarded. Coverage on accessions of interest can be verified by manually checking the quality of the matched spectra.
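The accession-matching step can be sketched in Python as follows. This is a minimal illustration under several assumptions: the accession is taken to be the first token of each FASTA header in the Cysmotif Searcher output, the Mascot export is assumed to have been trimmed to a plain table with columns named prot_acc, pep_seq, and pep_score, and the data shown are hypothetical. The score cutoff of 13 follows the tip above.

```python
import csv
import io

def fasta_accessions(fasta_text):
    """Collect the first whitespace-delimited token of each FASTA header."""
    accessions = set()
    for line in fasta_text.splitlines():
        if line.startswith(">") and line[1:].strip():
            accessions.add(line[1:].split()[0])
    return accessions

def matching_hits(csv_text, accessions, min_score=13.0,
                  acc_column="prot_acc", score_column="pep_score"):
    """Keep rows whose accession was predicted and whose score passes the cutoff."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader
            if row[acc_column] in accessions
            and float(row[score_column]) >= min_score]

# Hypothetical inputs, for illustration only
predicted_fasta = """>P001 hypothetical defensin-like
ACDKGGRPLK
>P002 hypothetical thionin-like
CCEFRGGK
"""
mascot_csv = """prot_acc,pep_seq,pep_score
P001,GGRPLK,45.2
P999,LLLSK,60.1
P002,CCEFR,12.0
"""
hits = matching_hits(mascot_csv, fasta_accessions(predicted_fasta))
print([(h["prot_acc"], h["pep_seq"]) for h in hits])
```

Column names and header layout should be adjusted to match the actual export before use; spectra for retained hits should still be inspected manually.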
3.8.2. Mass Shift Analysis
Upload the spectral result files (.raw) from LC-MS/MS analyses of both the mature peptide samples and the reduced/alkylated samples individually into Progenesis QI for Proteomics (Nonlinear Dynamics).
Use default peak picking parameters, except for applying a minimum peak width of 0.05 minutes.
From the “Identify Peptides” section, export peptide ion data with m/z, retention time, mass, charge, and raw abundance selected.
With the peptide ion data of both the mature and reduced/alkylated samples, sort each by mass. Copy and paste the mass columns of each into a new, single .csv file, labeling column A as the intact peptides and column B as the reduced/alkylated peptides.
Using the .csv file, run the mass shift analysis Python script (https://github.com/hickslab/MassShiftAnalysis), identifying mass pairs that correspond to the specified number of disulfide bonds.
Tip: Mass pairs identified by the mass shift analysis python script should be manually confirmed by checking that the reduced and alkylated mass is not present in the mature sample and that the mature mass is not present in the reduced and alkylated sample. Furthermore, Progenesis may misidentify monoisotopic peaks, which should also be checked by manually inspecting the MS spectra. This is more likely to occur in the case of low abundance peptides.
Tip: When the mass of a putative AMP is determined and the number of disulfides is known, the reduced and alkylated peptide can be subjected to MS/MS experiments to fully determine and/or confirm the sequence, in addition to coverage obtained from a trypsin digestion. This can be used to reveal any amino acid substitutions or PTMs in the sequence. The theoretical mass of the determined sequence should match that of the observed mass in the intact peptide sample. When a peptide’s sequence is confirmed and the peptide is relatively isolated, the specific disulfide bond connectivity may be determined via section 3.9.
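The idea behind pairing intact and reduced/alkylated masses can be illustrated with a short Python sketch. This is not the script from the repository above; it is a minimal, independent illustration in which the function name, the 0.1 Da tolerance, and the mass lists are assumptions (the 5728.7/6193.0 Da pair is the example from section 1.2).

```python
SHIFT_PER_CYS = 58.02929  # Da per cysteine after reduction and carbamidomethylation

def find_mass_pairs(intact, alkylated, n_disulfides, tolerance=0.1):
    """Return (intact, alkylated) pairs whose difference matches the shift
    expected for the given number of disulfide bonds, within tolerance (Da)."""
    expected = 2 * n_disulfides * SHIFT_PER_CYS
    pairs = []
    for intact_mass in intact:
        for alkylated_mass in alkylated:
            if abs((alkylated_mass - intact_mass) - expected) <= tolerance:
                pairs.append((intact_mass, alkylated_mass))
    return pairs

# Hypothetical mass lists, for illustration only
intact_masses = [3012.4, 5728.7, 7100.2]
alkylated_masses = [3360.6, 6193.0, 7100.2]
print(find_mass_pairs(intact_masses, alkylated_masses, n_disulfides=4))
print(find_mass_pairs(intact_masses, alkylated_masses, n_disulfides=3))
```

A wider tolerance returns more candidate pairs at the cost of more false matches, which is one reason the manual checks described in the tips above remain important.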
3.9. Disulfide Bond Connectivity Determination via Partial Acid Hydrolysis
Add 20 µL of peptide sample (approximately 1 µg/µL) and 40 µL of 3M HCl to a glass vial.
Heat at 85 °C for a total of 80 min, removing 15 µL aliquots every 20 min.
Dry down samples under nitrogen gas and resuspend in 5% ACN (LC-MS grade), 0.1% TFA (LC-MS grade).
Analyze each sample via LC-MS/MS following section 3.7.
The full peptide will be cleaved into a complex mixture of smaller, disulfide-linked peptides via the partial acid hydrolysis. Parse through precursor masses that could correspond to disulfide-linked acid hydrolysis dipeptides and confirm the residues and disulfide linkages via the corresponding MS/MS spectra.
Tip: Acid hydrolysis will convert asparagine and glutamine residues to aspartic acid and glutamic acid residues, respectively. This should be accounted for when analyzing the data.
Tip: Depending on the distribution of lysine and arginine residues within the peptide sequence, an alternative approach to partial acid hydrolysis is to perform a trypsin digestion without any prior reduction or alkylation. This will cleave the peptide after arginine and lysine residues, while leaving disulfide bonds intact.
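The effect of acid-induced conversion of asparagine and glutamine on expected masses can be sketched as follows. Each N to D or Q to E conversion adds approximately 0.984 Da (monoisotopic). The helper names and the example fragment are hypothetical, and complete conversion is assumed, so the computed shift should be read as an upper bound if conversion is only partial.

```python
DEAMIDATION_SHIFT = 0.98402  # Da, monoisotopic, per N->D or Q->E conversion

def hydrolyzed_sequence(sequence):
    """Sequence after converting every N to D and every Q to E."""
    return sequence.replace("N", "D").replace("Q", "E")

def expected_mass_shift(sequence):
    """Total mass increase (Da) if every N and Q is converted."""
    return DEAMIDATION_SHIFT * (sequence.count("N") + sequence.count("Q"))

# Hypothetical fragment, for illustration only
fragment = "CNQK"
print(hydrolyzed_sequence(fragment))
print(round(expected_mass_shift(fragment), 3))
```

When matching precursor masses, it may be useful to consider each possible number of converted residues between zero and this maximum.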
4. Summary
The workflow detailed herein outlines a robust approach towards botanical AMP identification. Increasingly available genomic and/or transcriptomic data from sources such as the 1000 plants project (Leebens-Mack et al., 2019), Phytozome (Goodstein et al., 2012), and UniProt (Pundir et al., 2016) provide a foundation for discovery via the use of AMP prediction. Protein sequences can be mined for signal peptides using SignalP (Almagro Armenteros et al., 2019), obtaining sequences with these signal peptides cleaved off. This output can then be mined for conserved cysteine motifs using Cysmotif Searcher (Shelenkov et al., 2018), providing a focused list of putative AMPs.
Bioactivity-guided fractionation approaches can be used in tandem with mass spectrometric approaches such as bottom-up proteomics and mass shift analysis. Bioassays can provide direct assessments of AMP bioactivity, as well as help prioritize characterization. Bottom-up proteomics allows detection of predicted AMPs within the botanical extract, regardless of observed bioactivity. Finally, as cysteine-rich AMPs are expected to have multiple disulfide bonds, mass shift analysis provides a means to identify mass species that contain disulfide bonds, as well as enumerate these disulfide bonds. In addition, MS/MS analysis of the reduced and alkylated peptide can confirm the full sequence, revealing any amino acid substitutions or unexpected PTMs. Taken together, these approaches encompass a robust AMP discovery workflow, which helps to address the disadvantages of any individual approach.
Acknowledgments
This work was supported by NIH-NIGMS under award R01 GM125814 to L.M.H.
References
- Agarwal A, D’Souza P, Johnson TS, Dethe SM, & Chandrasekaran CV (2014). Use of in vitro bioassays for assessing botanicals. Current Opinion in Biotechnology, 25, 39–44. 10.1016/j.copbio.2013.08.010 [DOI] [PubMed] [Google Scholar]
- Almagro Armenteros JJ, Tsirigos KD, Sønderby CK, Petersen TN, Winther O, Brunak S, von Heijne G, & Nielsen H (2019). SignalP 5.0 improves signal peptide predictions using deep neural networks. Nature Biotechnology, 37(4), 420–423. 10.1038/s41587-019-0036-z [DOI] [PubMed] [Google Scholar]
- Assoni L, Milani B, Carvalho MR, Nepomuceno LN, Waz NT, Guerra MES, Converso TR, & Darrieux M (2020). Resistance Mechanisms to Antimicrobial Peptides in Gram-Positive Bacteria. Frontiers in Microbiology, 11, 593215. 10.3389/fmicb.2020.593215 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Brechbill AM, Moyer TB, Parsley NC, & Hicks LM (2021). Creating optimized peptide libraries for AMP discovery via PepSAVI-MS. Methods in Enzymology [DOI] [PMC free article] [PubMed] [Google Scholar]
- Browne K, Chakraborty S, Chen R, Willcox MD, Black DS, Walsh WR, & Kumar N (2020). A New Era of Antibiotics: The Clinical Potential of Antimicrobial Peptides. International Journal of Molecular Sciences, 21(19), 7047. 10.3390/ijms21197047 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Centers for Disease Control and Prevention. (2020). Antibiotic/Antimicrobial Resistance (AR/AMR) https://www.cdc.gov/drugresistance/index.html
- Cox J, Neuhauser N, Michalski A, Scheltema RA, Olsen JV, & Mann M (2011). Andromeda: A Peptide Search Engine Integrated into the MaxQuant Environment. Journal of Proteome Research, 10(4), 1794–1805. 10.1021/pr101065j [DOI] [PubMed] [Google Scholar]
- Culver KD, Allen JL, Shaw LN, & Hicks LM (2021). Too Hot to Handle: Antibacterial Peptides Identified in Ghost Pepper. Journal of Natural Products 10.1021/acs.jnatprod.1c00281 [DOI] [PMC free article] [PubMed] [Google Scholar]
- da Costa JP, Cova M, Ferreira R, & Vitorino R (2015). Antimicrobial peptides: an alternative for innovative medicines? Applied Microbiology and Biotechnology, 99(5), 2023–2040. 10.1007/s00253-015-6375-x [DOI] [PubMed] [Google Scholar]
- Dupree EJ, Jayathirtha M, Yorkey H, Mihasan M, Petre BA, & Darie CC (2020). A Critical Review of Bottom-Up Proteomics: The Good, the Bad, and the Future of this Field. Proteomes, 8(3), 14. 10.3390/proteomes8030014 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Eng JK, McCormack AL, & Yates JR (1994). An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. Journal of the American Society for Mass Spectrometry, 5(11), 976–989. https://doi.org/10.1016/1044-0305(94)80016-2
- Goodstein DM, Shu S, Howson R, Neupane R, Hayes RD, Fazo J, Mitros T, Dirks W, Hellsten U, Putnam N, & Rokhsar DS (2012). Phytozome: a comparative platform for green plant genomics. Nucleic Acids Research, 40(Database issue), D1178–D1186. https://doi.org/10.1093/nar/gkr944
- Kirkpatrick CL, Broberg CA, McCool EN, Lee WJ, Chao A, McConnell EW, Pritchard DA, Hebert M, Fleeman R, Adams J, Jamil A, Madera L, Strömstedt AA, Göransson U, Liu Y, Hoskin DW, Shaw LN, & Hicks LM (2017). The “PepSAVI-MS” Pipeline for Natural Product Bioactive Peptide Discovery. Analytical Chemistry, 89(2), 1194–1201. https://doi.org/10.1021/acs.analchem.6b03625
- Lay FT, Poon S, McKenna JA, Connelly AA, Barbeta BL, McGinness BS, Fox JL, Daly NL, Craik DJ, Heath RL, & Anderson MA (2014). The C-terminal propeptide of a plant defensin confers cytoprotective and subcellular targeting functions. BMC Plant Biology, 14(1), 41. https://doi.org/10.1186/1471-2229-14-41
- Leebens-Mack JH, Barker MS, Carpenter EJ, Deyholos MK, Gitzendanner MA, Graham SW, Grosse I, Li Z, Melkonian M, Mirarab S, Porsch M, Quint M, Rensing SA, Soltis DE, Soltis PS, Stevenson DW, Ullrich KK, Wickett NJ, DeGironimo L, … One Thousand Plant Transcriptomes Initiative (2019). One thousand plant transcriptomes and the phylogenomics of green plants. Nature, 574(7780), 679–685. https://doi.org/10.1038/s41586-019-1693-2
- Magana M, Pushpanathan M, Santos AL, Leanse L, Fernandez M, Ioannidis A, Giulianotti MA, Apidianakis Y, Bradfute S, Ferguson AL, Cherkasov A, Seleem MN, Pinilla C, de la Fuente-Nunez C, Lazaridis T, Dai T, Houghten RA, Hancock REW, & Tegos GP (2020). The value of antimicrobial peptides in the age of resistance. The Lancet Infectious Diseases, 20(9), e216–e230. https://doi.org/10.1016/S1473-3099(20)30327-3
- Mahlapuu M, Björn C, & Ekblom J (2020). Antimicrobial peptides as therapeutic agents: opportunities and challenges. Critical Reviews in Biotechnology, 40(7), 978–992. https://doi.org/10.1080/07388551.2020.1796576
- Moretta A, Scieuzo C, Petrone AM, Salvia R, Manniello MD, Franco A, Lucchetti D, Vassallo A, Vogel H, Sgambato A, & Falabella P (2021). Antimicrobial Peptides: A New Hope in Biomedical and Pharmaceutical Fields. Frontiers in Cellular and Infection Microbiology, 11, 668632. https://doi.org/10.3389/fcimb.2021.668632
- Narayani M, Chadha A, & Srivastava S (2017). Cyclotides from the Indian Medicinal Plant Viola odorata (Banafsha): Identification and Characterization. Journal of Natural Products, 80(7), 1972–1980. https://doi.org/10.1021/acs.jnatprod.6b01004
- Pundir S, Martin MJ, O’Donovan C, & UniProt Consortium (2016). UniProt Tools. Current Protocols in Bioinformatics, 53, 1.29.1–1.29.15. https://doi.org/10.1002/0471250953.bi0129s53
- Russell AH, & Truman AW (2020). Genome mining strategies for ribosomally synthesised and post-translationally modified peptides. Computational and Structural Biotechnology Journal, 18, 1838–1851. https://doi.org/10.1016/j.csbj.2020.06.032
- Sharma SB, & Gupta R (2015). Drug development from natural resource: a systematic approach. Mini Reviews in Medicinal Chemistry, 15(1), 52–57. https://doi.org/10.2174/138955751501150224160518
- Shelenkov AA, Slavokhotova AA, & Odintsova TI (2018). Cysmotif Searcher Pipeline for Antimicrobial Peptide Identification in Plant Transcriptomes. Biochemistry (Moscow), 83(11), 1424–1432. https://doi.org/10.1134/S0006297918110135
- Shilov IV, Seymour SL, Patel AA, Loboda A, Tang WH, Keating SP, Hunter CL, Nuwaysir LM, & Schaeffer DA (2007). The Paragon Algorithm, a Next Generation Search Engine That Uses Sequence Temperature Values and Feature Probabilities to Identify Peptides from Tandem Mass Spectra. Molecular & Cellular Proteomics, 6(9), 1638–1655. https://doi.org/10.1074/mcp.T600050-MCP200
- Sierra JM, Fusté E, Rabanal F, Vinuesa T, & Viñas M (2017). An overview of antimicrobial peptides and the latest advances in their development. Expert Opinion on Biological Therapy, 17(6), 663–676. https://doi.org/10.1080/14712598.2017.1315402
- Slavokhotova AA, & Rogozhin EA (2020). Defense Peptides From the α-Hairpinin Family Are Components of Plant Innate Immunity. Frontiers in Plant Science, 11, 465. https://doi.org/10.3389/fpls.2020.00465
- Tam J, Wang S, Wong K, & Tan W (2015). Antimicrobial Peptides from Plants. Pharmaceuticals, 8(4), 711–757. https://doi.org/10.3390/ph8040711
- Wang G (2012). Post-translational Modifications of Natural Antimicrobial Peptides and Strategies for Peptide Engineering. Current Biotechnology, 1(1), 72–79. https://doi.org/10.2174/2211550111201010072
- Wang G, Li X, & Wang Z (2015). APD3: the antimicrobial peptide database as a tool for research and education. Nucleic Acids Research, 44(D1), D1087–D1093. https://doi.org/10.1093/nar/gkv1278
- Yang JY, Sanchez LM, Rath CM, Liu X, Boudreau PD, Bruns N, Glukhov E, Wodtke A, de Felicio R, Fenner A, Wong WR, Linington RG, Zhang L, Debonsi HM, Gerwick WH, & Dorrestein PC (2013). Molecular networking as a dereplication strategy. Journal of Natural Products, 76(9), 1686–1699. https://doi.org/10.1021/np400413s
- Zhang Y, Fonslow BR, Shan B, Baek M-C, & Yates JR 3rd (2013). Protein analysis by shotgun/bottom-up proteomics. Chemical Reviews, 113(4), 2343–2394. https://doi.org/10.1021/cr3003533
- Zhang Z, Wu S, Stenoien DL, & Paša-Tolić L (2014). High-Throughput Proteomics. Annual Review of Analytical Chemistry, 7(1), 427–454. https://doi.org/10.1146/annurev-anchem-071213-020216
