Author manuscript; available in PMC: 2021 Apr 3.
Published in final edited form as: J Proteome Res. 2020 Mar 20;19(4):1459–1469. doi: 10.1021/acs.jproteome.9b00713

Tryp-N: A thermostable protease for the production of N-terminal argininyl and lysinyl peptides

John P Wilson 1,2,§, Jonathan J Ipsaro 1,§, Samantha N Del Giudice 1, Nikita Saha Turna 1,3, Carla M Gauss 1, Katharine H Dusenbury 1,4, Krisann Marquart 1, Keith D Rivera 1, Darryl J Pappin 1,*
PMCID: PMC7842235  NIHMSID: NIHMS1662885  PMID: 32141294

Abstract

Bottom-up proteomics is a mainstay in protein identification and analysis. These studies typically employ proteolytic treatment of biological samples to generate suitably-sized peptides for MS/MS. In MS, fragmentation of peptides is largely driven by charge localization. Consequently, peptides with basic centers exclusively on their N-termini produce mainly b-ions. Thus, it was long ago realized that proteases that yield such peptides would be valuable proteomic tools for achieving simplified peptide fragmentation patterns and peptide assignment. Work by several groups has identified such proteases; however, structural analysis of these enzymes suggested that enzymatic optimization was possible. We therefore endeavored to find enzymes which could provide enhanced activity and versatility while maintaining specificity. Using these previously-described proteases as informatic search templates, we discovered and then characterized a thermophilic metalloprotease with N-terminal specificity for arginine and lysine. This enzyme, dubbed Tryp-N, affords many advantages including improved thermostability, solvent and detergent tolerance, and rapid digestion time.

Keywords: proteomics, Tryp-N, thermostable, metalloprotease, lysine, arginine, LysargiNase, ulilysin, Lys-N, trypsin

Introduction

The results of mass spectrometric (MS) proteomic analyses are dictated by charge: only charged molecules are experimentally observed and moreover, the points of fragmentation in MS/MS that elucidate molecular structure depend on the location of these charges (reviewed in Glish and Vachet1). Consequently, MS/MS analysis fails when a molecule of interest ionizes poorly or does not generate informative fragments; such instances occur especially frequently in studies of PTMs such as phosphorylation2. Due to these analytical constraints, tools and reagents that extend the ability to control the localization of charge in proteomic analysis are especially useful, helping to improve sequence coverage and fragment ion observation in MS/MS.

In positive-mode MS analysis, basic functional groups are of primary importance since those sites impart positive charge. Peptides generally have at least one intrinsic basic center at their amino terminus. Fragments generated during MS/MS which retain charge N-terminally to break points comprise the b ion series (reviewed in Steen and Mann3). Proteolytic processing of proteins with commonly used proteases, including trypsin (which cleaves after arginine and lysine4) or Arg-C (which cleaves after arginine5), produces peptides with an additional basic center at the C-terminus. Owing to their strong basicity, these functional groups efficiently retain protons and upon fragmentation produce a strong series of charged C-terminal fragments, termed y-ions. Peptides with both N- and C-terminal basic sites thus produce two overlapping ion series, complicating the final MS/MS spectra6.
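The b- and y-ion bookkeeping described above can be sketched in a few lines of Python. This is an illustrative calculation rather than part of the published workflow: the function names are hypothetical, the residue masses are standard monoisotopic values, only singly charged fragments are considered, and the two example peptides are one stretch of sequence written as it would end after C-terminal (trypsin-like) or N-terminal cleavage at lysine.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def neutral_mass(peptide):
    """Monoisotopic mass of the intact, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def b_ions(peptide):
    """Singly charged b1..b(n-1): fragments that keep the N-terminus."""
    values, running = [], 0.0
    for aa in peptide[:-1]:
        running += RESIDUE_MASS[aa]
        values.append(running + PROTON)
    return values


def y_ions(peptide):
    """Singly charged y1..y(n-1): fragments that keep the C-terminus."""
    values, running = [], 0.0
    for aa in reversed(peptide[1:]):
        running += RESIDUE_MASS[aa]
        values.append(running + WATER + PROTON)
    return values


# Same composition, basic residue at opposite ends (illustrative peptides).
tryptic_like = "AVTTIDAYFHVVAK"     # lysine at the C-terminus
n_terminal_like = "KAVTTIDAYFHVVA"  # lysine at the N-terminus

if __name__ == "__main__":
    for name, pep in (("C-terminal K", tryptic_like), ("N-terminal K", n_terminal_like)):
        print(name, pep, round(neutral_mass(pep), 4))
        print("  b:", [round(m, 3) for m in b_ions(pep)])
        print("  y:", [round(m, 3) for m in y_ions(pep)])
```

Both peptides have the same precursor mass, and both ion ladders can be computed for either one. Which ladder dominates a real spectrum depends on where the charge is retained, which this sketch does not model.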

Few enzymatic tools are available to specifically produce peptides with N-terminal basic amino acids. One, Lys-N (peptidyl-Lys metalloendopeptidase (MEP) from Grifola frondosa), specifically cuts prior to lysine. The resultant peptides, while concentrating positive charge closer to the N-terminus as compared to trypsin, frequently bear internal arginines that then result in additional y-ions, similarly complicating fragmentation spectra7-10. In this case, the high specificity Lys-N exhibits for lysine is likely imparted through a substrate binding pocket that is not deep enough to accommodate arginine in a manner that positions the peptide backbone appropriately for hydrolysis11. A more recently described protease, ulilysin (also known as LysargiNase), displays broader N-terminal specificity than Lys-N and cleaves prior to arginine and lysine12,13. A valuable addition to proteomic workflows, ulilysin produces peptides with the desired N-terminally localized basic centers. Moreover, this protein has been readily implemented in several proteomics workflows ranging from use as a simple substitute for trypsin in conventional workflows (reviewed by Giansanti et al.14 and Trevisiol et al.15) to improved characterization of protein termini16,17, enhanced protein identification18, and improved de novo peptide sequencing19,20. Given the utility of enzymes with this dual N-terminal specificity, we sought to find related enzymes which could be similarly useful while presenting enhanced activity, specificity, and versatility. To that end, our approach was to informatically screen for putative proteases with favorable molecular features (described below), characterize their specificity, produce a top candidate via recombinant protein expression, and profile that candidate's activity under numerous biochemical conditions.
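The cleavage rules contrasted above (after K/R for trypsin, before K for Lys-N, before K/R for ulilysin-type enzymes) can be compared with a small in silico digest. The sketch below is a simplification under stated assumptions: the `digest` helper is hypothetical, cleavage is treated as complete with no missed or nonspecific cuts, no sequence-context exceptions are modeled, and the test sequence is a short stretch taken from the Tryp-N sequence listed in Table 1.

```python
def digest(protein, residues="KR", side="C"):
    """Split a sequence at every occurrence of the given residues.

    side="C": cut after the residue (trypsin-like rule).
    side="N": cut before the residue (Lys-N or ulilysin-like rule).
    """
    cut_points = []
    for i, aa in enumerate(protein):
        if aa in residues:
            cut = i + 1 if side == "C" else i
            if 0 < cut < len(protein):  # ignore cuts at the very ends
                cut_points.append(cut)
    bounds = [0] + cut_points + [len(protein)]
    return [protein[a:b] for a, b in zip(bounds, bounds[1:])]


segment = "GYEMTMKRSLRKGTYRTLNVYY"

trypsin_like = digest(segment, "KR", side="C")  # basic residue at C-terminus
lys_n_like = digest(segment, "K", side="N")     # K at N-terminus, R may be internal
dual_n_like = digest(segment, "KR", side="N")   # K or R at N-terminus

if __name__ == "__main__":
    print("after K/R :", trypsin_like)
    print("before K  :", lys_n_like)
    print("before K/R:", dual_n_like)
```

With the lysine-only rule, the peptide KGTYRTLNVYY retains an internal arginine, which is the situation the text identifies as a source of extra y-ions.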

Materials and methods

General

Digestion efficiency was measured and conditions were optimized for temperature, incubation time, tolerance to detergents and salts, pH, and preferred digestion buffers at 2% w/w enzyme/substrate. For the purpose of comparison, all buffers were brought to pH 7.4 (unless otherwise specified) even when this value was outside their normal buffering range. In time course experiments, digestion was quenched by addition of EDTA. Digestion efficiency was assayed by SDS-PAGE, FITC casein assays21, and Orbitrap XL or Velos Pro MS/MS analysis using CID or HCD fragmentation, respectively. Digestion specificity, including incorrect and missed cleavage events, was determined by Mascot22 and other MS search engines with no imposed enzyme specificity and results filtered to a 1% peptide false discovery rate (FDR). The number of total and unique peptides, cleavage residue specificity, and protein IDs were generated from these searches.

Construct screening by in vitro transcription/translation

Selected constructs (Supplemental Table 1) were commercially synthesized and included a 5′ T7 promoter sequence, 5′ ribosome binding site, and 3′ T7 terminator. This allowed all constructs to be amplified by polymerase chain reaction (PCR) with a limited set of primers and then be easily taken as templates for in vitro transcription/translation (IVT) using the PURExpress In Vitro Protein Synthesis Kit (NEB). IVT reactions followed the manufacturer's recommendations for template concentrations and incubation times. Many of the constructs were expected to have disulfide bonds based on the presence of such bonds in the related but better-studied Lys-N and ulilysin proteases. As such, reactions were then subjected to denaturation and refolding. For these steps, each completed IVT reaction was first mixed with three volumes of denaturation buffer (50 mM Tris, pH 8.0, 6 M guanidinium hydrochloride, 2 mM EDTA, and 10 mM dithiothreitol) and incubated for 10 minutes at room temperature. Subsequently, 40 volumes (compared to the original IVT reaction) of refolding buffer (50 mM Tris, pH 8.0, 100 mM NaCl, 20% NDSB-201, 5 mM cysteamine, 2.5 mM cystamine, and 2.5 mM EDTA) were added and the reactions incubated overnight at 4 °C. Following renaturation, each reaction was concentrated using centrifugal filters and the buffer exchanged into 20 mM Tris, pH 8.0, 1 mM EDTA by two rounds of 20-fold dilution and reconcentration. Proteolysis was then activated by the addition of 5 mM CaCl2 and 100 μM ZnCl2 in the presence of ProteaseMAX surfactant (Promega) at 0.05%. Digests were performed at 37 °C overnight, after which samples were taken for analysis by SDS-PAGE (Supplemental Figure 2). Samples of selected successful digests were then assessed for protease specificity by mass spectrometry.
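The volume ratios in this procedure imply a stepwise dilution of the denaturant, which can be worked through as below. This is a back-of-the-envelope sketch with assumptions of our own: volumes are additive, the IVT reaction itself contributes no guanidinium, each reconcentration returns the sample to its pre-dilution volume, and the helper name `mix` is hypothetical.

```python
def mix(conc, volume, added_volume):
    """Concentration after adding a solute-free volume to a solution."""
    return conc * volume / (volume + added_volume)


IVT_VOLUME = 1.0         # arbitrary unit; all other volumes are relative to it
DENATURANT_VOLUMES = 3.0
REFOLDING_VOLUMES = 40.0
GDN_STOCK_M = 6.0        # guanidinium hydrochloride in the denaturation buffer

# Three volumes of 6 M denaturant mixed with one volume of IVT reaction.
gdn_denaturation = mix(GDN_STOCK_M, DENATURANT_VOLUMES, IVT_VOLUME)

# Forty further volumes of refolding buffer added to the four-volume mixture.
gdn_refolding = mix(gdn_denaturation, IVT_VOLUME + DENATURANT_VOLUMES, REFOLDING_VOLUMES)

# Two rounds of 20-fold dilution and reconcentration.
exchange_factor = 20 ** 2
gdn_final_mM = 1000.0 * gdn_refolding / exchange_factor

if __name__ == "__main__":
    print(f"during denaturation: {gdn_denaturation:.2f} M")
    print(f"during refolding:    {gdn_refolding:.3f} M")
    print(f"after exchange:      {gdn_final_mM:.2f} mM ({exchange_factor}-fold)")
```

Under these assumptions the denaturant falls from 4.5 M to roughly 0.4 M during refolding and to about 1 mM after buffer exchange.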

Protein expression

Synthetic genes from above were cloned, typically into the vector pRSF for expression in E. coli. Cloning was performed using standard sequence and ligation-independent cloning (SLIC) methods23 and the sequence fidelity of all constructs was verified by the Cold Spring Harbor Laboratory DNA Sequencing Facility. Recombinant plasmids were transformed into E. coli BL21(DE3) RIPL by heat shock and grown on lysogeny broth (LB) agar plates containing appropriate antibiotics. After overnight incubation at 37 °C, single colonies were inoculated into a starter culture of LB media with antibiotic and grown at 37 °C with shaking at 220 rpm. After ~16 h, this starter culture was used to inoculate terrific broth (TB) media with antibiotic (50 mL of starter culture added per liter of TB) to an initial OD600 of ~0.1. These cultures were then incubated at 37 °C with shaking at 300 rpm. The cell density was monitored and after ~4 hours (OD600 of 1.0), the cultures were briefly cooled on ice, and protein expression was induced with 1 mM isopropyl-β-d-1-thiogalactopyranoside (IPTG). Cultures were then incubated at 16 °C with shaking at 300 rpm. After overnight induction, the cells were harvested at 4000g for 20 – 30 minutes at 4 °C. The resulting cell pellet was frozen and stored at −80 °C until needed. Recombinant protein was purified by IMAC chromatography, purity assayed by SDS-PAGE and activity quantified by proteomic mass spectrometric analysis. The accession number and sequences of the Tryp-N constructs are provided below in Table 1.

Table 1.

Database, protein sequence, and protein parameters for the full-length MEP1-like protein from Chaetomium thermophilum and the final construct expressed as Tryp-N.

Metalloprotease MEP1-like protein [Chaetomium thermophilum var. thermophilum DSM 1495]
NCBI Reference Sequence: XP_006696712.1
292 amino acids (31.7 kDa)
MRFAPVFLAAVAAQGAMAAPRRPGDAVATKGK
VFSCGAPEPSPEHIKISQAFATQELQALASGN
YSIKAVTTIDAYFHVVAKNTSLSGGYLTDAML
NNQLNVLNAAYAPHGFQFNLKGITRTVNANWA
DDTKGYEMTMKRSLRKGTYRTLNVYYLYEMGS
NLGYCYFPQSVTSGSTAFYRDGCTVLYSTVPG
GSLTNYNLGHTTTHEVGHWMGLYHTFQGGCTG
SGDYVSDTPAQASASSGCPIGRDSCPSQPGLD
PIHNYMDYSYDSCYEEFTAGQQARMVSYWNNY
RAGK

Tryp-N expression construct
242 amino acids (26.8 kDa)
MHHHHHHGGENLYFQGGKAVTTIDAYFHVVAK
NTSLSGGYLTDAMLNNQLNVLNAAYAPHGFQF
NLKGITRTVNANWADDTKGYEMTMKRSLRKGT
YRTLNVYYLYEMGSNLGYCYFPQSVTSGSTAF
YRDGCTVLYSTVPGGSLTNYNLGHTTTHEVGH
WMGLYHTFQGGCTGSGDYVSDTPAQASASSGC
PIGRDSCPSQPGLDPIHNYMDYSYDSCYEEFT
AGQQARMVSYWNNYRAGK

Color coding key:
Initiator methionine
His6-tag
Glycine linkers
TEV protease site
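The elements named in the key can be located in the expression construct programmatically, which also provides a check on the stated length. The sketch below copies the construct sequence from Table 1. The function names and the regular expression are illustrative assumptions, and the cleavage position (between Q and G of ENLYFQG) is the standard TEV protease recognition rule rather than something stated in the table.

```python
import re

# Tryp-N expression construct, copied from Table 1.
CONSTRUCT = (
    "MHHHHHHGGENLYFQGGKAVTTIDAYFHVVAK"
    "NTSLSGGYLTDAMLNNQLNVLNAAYAPHGFQF"
    "NLKGITRTVNANWADDTKGYEMTMKRSLRKGT"
    "YRTLNVYYLYEMGSNLGYCYFPQSVTSGSTAF"
    "YRDGCTVLYSTVPGGSLTNYNLGHTTTHEVGH"
    "WMGLYHTFQGGCTGSGDYVSDTPAQASASSGC"
    "PIGRDSCPSQPGLDPIHNYMDYSYDSCYEEFT"
    "AGQQARMVSYWNNYRAGK"
)


def annotate(construct):
    """Split the construct into the elements listed in the Table 1 key."""
    match = re.fullmatch(r"(M)(H{6})(G+)(ENLYFQG)(G*)([A-Z]+)", construct)
    if match is None:
        raise ValueError("construct does not follow the expected tag layout")
    names = ("initiator_met", "his_tag", "linker_1", "tev_site", "linker_2", "protease")
    return dict(zip(names, match.groups()))


def after_tev(construct):
    """Sequence remaining after cleavage between Q and G of ENLYFQG."""
    site = construct.index("ENLYFQG")
    return construct[site + 6:]


parts = annotate(CONSTRUCT)
cleaved = after_tev(CONSTRUCT)

if __name__ == "__main__":
    print("construct length:", len(CONSTRUCT))
    for name, value in parts.items():
        print(f"{name:14s} {len(value):4d}  {value[:20]}")
    print("length after TEV cleavage:", len(cleaved))
```

Parsed this way, the protease portion begins at KAVTTIDAY, and TEV cleavage would leave two glycines ahead of it.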

Protein purification

After thawing the cell pellets, cells were resuspended in 20 mL of 50 mM sodium phosphate, pH 8.0, 50 mM NaCl, and 10 mM imidazole. The resuspended cells were then lysed by sonication (2 seconds on, 2 seconds off for a total sonication time of 2 minutes). Cell debris was then removed by centrifugation at 35000g for 45 minutes. The resulting supernatant was purified by affinity chromatography on Ni-NTA resin (Qiagen). The target protein was eluted from the column with 50 mM sodium phosphate, pH 8.0, 0.2 M NaCl, 200 mM imidazole. EDTA was immediately added to the elution at a final concentration of 10 mM to prevent proteolysis.

Optional removal of the affinity tag

For some preparations, the N-terminal affinity tag was removed by cleavage with TEV protease. To accomplish this, TEV protease (1:20 m/m ratio target/protease) was added to the elution. The reaction was incubated at 4 °C for ~16 hours. As the removal of the tag did not seem to affect the activity of Tryp-N, most preparations omitted this step.

Refolding and buffer exchange

The purified protein was unfolded with 6 M guanidinium hydrochloride and reduced completely with 10 mM dithiothreitol. The protein was then slowly refolded by dialysis into 20 mM HEPES, pH 7.4, 100 mM NaCl, 5 mM cysteamine, 2.5 mM cystamine, 2 mM EDTA, at 4 °C. The dialysis procedure took place over ~16 hours with the buffer being exchanged once. After this dialysis, the protein was further dialyzed into 20 mM HEPES, pH 7.4, 100 mM NaCl, 2 mM EDTA at 4 °C to remove the reductants.

Final preparation and assessment of protein purity

After dialysis, the protein solution was filtered using a 0.2 μm syringe filter and concentrated using Amicon Ultra centrifugal filter devices. The purity of each preparation was assessed by SDS-PAGE and the yield determined from absorbance at 280 nm and the calculated extinction coefficient. The purified protein was kept either at 4 °C for short-term use or flash frozen in liquid nitrogen and maintained at −80 °C for long-term storage.

Analysis of protease activity

Protease activity was assayed with fluorescein isothiocyanate (FITC) labeled casein essentially as described previously21. Typically, 100 μg of FITC-labeled casein was added to a solution containing 2.5 μg of protease in buffer (usually 50 mM trimethylammonium acetate (TMA) or HEPES), pH 7.4, 2 mM CaCl2, 0.1 mM ZnCl2, and 0.1 mM MgCl2, in a reaction volume of 40 μL. The reaction was incubated for 3 hours at 55-65 °C, then stopped by the addition of 1 μL of 0.5 M EDTA and 10 μL of 6.1 M TCA. The reaction was further incubated at 37 °C for 30 minutes in the dark with shaking at 1400 rpm to ensure complete quenching. The supernatant containing TCA-soluble peptides was separated from undigested protein by centrifugation at 14,000g for 10 minutes. Five microliters of the resulting supernatant were diluted with 195 μL of 0.5 M Tris assay buffer, pH 7.5. Fluorescence was then measured in 96-well format with excitation and emission wavelengths of 485 and 538 nm, respectively. This assay was adjusted as needed to assess the protease activity under a series of conditions including various buffers, metals, chelators, detergents, chaotropes, organic solvents, etc. To account for fluorescence differences attributed to additives, after reaction quenching, all samples were brought to the same final volumes and additive concentrations. TCA precipitation and fluorometric analysis then followed.

Analysis of protease specificity

Tests of activity and specificity over time in volatile buffers (Figures 3c, 3d) were performed with a reduced and alkylated sixteen-protein standard mixture. Fifty micrograms of this substrate, originally in volatile TEAB, were dried down. Protein samples were subsequently resuspended by sonication in various buffers (25 mM final buffer concentration, all at pH 7.4 with pH measured at 60 °C) containing 2 mM CaCl2 and 0.1 mM ZnCl2. Protease was added to the substrate at 1:50 (wt:wt) and incubated at the indicated temperature for the indicated time in a water bath. The final volume of all digestions was 50 μL. Reactions were then quenched by addition of 1 μL of 500 mM EDTA, pH 8.0. A five microgram aliquot was reserved for mass spectrometric analysis and the remaining digested material was typically analyzed by SDS-PAGE. One microgram of this digest was subjected to LC/MS/MS on an Orbitrap-XL mass spectrometer as described in Obad et al.24. The resulting data were searched with Mascot22 against the Uniprot database25 using no enzyme specificity, a 20 ppm precursor mass tolerance, and 0.5 Da fragment mass tolerance with a target FDR of 1%. Specificity was determined by examining the amino acid before proteolytic events at both ends of detected peptides in the context of their larger protein sequence. Unique peptides were counted as those with a given primary sequence and observed modifications. Ion current of the y- and b-series was calculated from the ion intensities of those ions reported in an mzIdentML export.
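The specificity calculation described above amounts to mapping each identified peptide back onto its parent protein and inspecting the residues at its two ends. The sketch below is one possible reading, not the authors' script: it assumes that for an enzyme cutting N-terminal to its target the informative residue is the one immediately C-terminal to each cut, it maps each peptide to its first occurrence only, it skips the natural protein termini, and the peptide list is invented for illustration. The protein stretch is taken from Table 1.

```python
def cleavage_specificity(protein, peptides, residues="KR"):
    """Fraction of observed cuts that fall immediately before a target residue."""
    total = specific = 0
    for peptide in peptides:
        start = protein.find(peptide)  # first occurrence only (assumption)
        if start < 0:
            continue                   # peptide not found in this protein
        end = start + len(peptide)
        for cut in (start, end):
            if cut == 0 or cut == len(protein):
                continue               # natural terminus, not a proteolytic event
            total += 1
            if protein[cut] in residues:
                specific += 1
    return specific / total if total else float("nan")


protein = "GYEMTMKRSLRKGTYRTLNVYY"
observed = ["KGTY", "RTLNVYY", "EMTM"]  # hypothetical identifications

specificity = cleavage_specificity(protein, observed)

if __name__ == "__main__":
    print(f"cuts before K/R: {100 * specificity:.0f}%")
```

In this toy case four of the five counted cuts precede K or R. The off-target cut is the N-terminus of EMTM, which follows tyrosine and precedes glutamate.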

Figure 3 – Tryp-N shows high specificity, generating peptides with N-terminal arginines and lysines as determined by proteomic analysis.


a, Under typical digestion conditions at pH 7.5, Tryp-N showed an overall proteolytic specificity ≥95%, with pronounced specificity for arginine and lysine. b, Although overall activity levels were higher close to neutral pH, cleavage specificity was stable from pH 5 to 11. With increasing pH, the relative preference for arginine over lysine increased. c, Cleavage specificity was not significantly influenced by the reaction buffer or time of digestion from 0.5 – 16 hours incubation. d, At 60 °C, approximately 75% of detectable peptides were generated after an incubation of 30 minutes. Thirty minute to 3 hour incubation times were found to be optimal. TEAA (triethylammonium acetate); AA (ammonium acetate).

Analysis of effect of pH

To control for buffer identity across a wide pH range, six buffers with pKa values from 2.15 to 13.08 were combined in equimolar ratios, similar to the approach described by Ellis et al.26 (Table 2). The pH of this composite buffer was then adjusted as needed and brought to a final total buffer concentration of 0.5 M. The average difference in pKa between successive buffers (ranked by pKa below) is < 1, thereby affording roughly equal buffering capacity over the full pH range. Digests in the pH activity series for overall monitoring of activity (Figure 2a) were performed using the FITC-labeled casein assay for 3 hours at 60 °C. Those for determination of enzymatic specificity (Figures 3a, 3b) were performed on reduced and alkylated E. coli lysate with the reactions being prepared essentially as described above. The resulting data were searched with Mascot22 against the Uniprot database25 of E. coli proteins using no enzyme specificity, a 20 ppm precursor mass tolerance, and 0.5 Da fragment mass tolerance with a target FDR of 1%.

Table 2 –

Buffers and pKa parameters for components of the universal buffer used to assess the effects of pH on Tryp-N activity.

Buffer               pKa     pKa difference
sodium phosphate      2.15
citric acid           3.15   1.00
formic acid           3.75   0.60
citric acid           4.77   1.02
citric acid           5.19   0.42
sodium carbonate      6.37   1.18
sodium phosphate      7.20   0.83
HEPPS                 8.10   0.90
boric acid            9.14   1.04
sodium bicarbonate   10.25   1.11
sodium phosphate     12.28   2.03
boric acid           12.74   0.46
boric acid           13.08   1.06
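The statement that successive pKa values are spaced by less than one unit on average can be checked directly from the pKa column. The short sketch below uses only the pKa values as listed in Table 2 and recomputes the spacing rather than copying the difference column. The variable names are illustrative.

```python
# pKa values as listed in Table 2, in ascending order.
PKA = [2.15, 3.15, 3.75, 4.77, 5.19, 6.37, 7.20,
       8.10, 9.14, 10.25, 12.28, 12.74, 13.08]

# Spacing between each pair of neighbouring pKa values.
gaps = [round(upper - lower, 2) for lower, upper in zip(PKA, PKA[1:])]

# The mean spacing depends only on the two end points and the number of steps.
mean_gap = (PKA[-1] - PKA[0]) / (len(PKA) - 1)
largest_gap = max(gaps)

if __name__ == "__main__":
    print("number of intervals:", len(gaps))
    print(f"mean spacing:    {mean_gap:.2f}")
    print(f"largest spacing: {largest_gap:.2f}")
```

The largest interval, between 10.25 and 12.28, is the region where buffering would be expected to be weakest.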

Figure 2 – Tryp-N demonstrates enzymatic activity over a wide pH range at thermophilic temperatures.


a, In FITC-labeled casein digestions Tryp-N demonstrated high activity over a broad pH range with optimal levels between pH 6 and 8. Substantial activity persisted as high as pH 9. Error bars represent the standard deviation from the mean (n=3). b, For the temperatures tested, Tryp-N exhibited highest activity around 60-65 °C. Error bars represent the standard deviation from the mean for the relative activity determined at each temperature at four concentrations of enzyme (n=4).

Stability to freeze-thaw cycles

Stability of the protease to freeze-thaw cycles was determined either on enzyme pre-loaded with metals (holoenzyme) or without additional added metals ("apoenzyme"). Samples were frozen on dry ice and then thawed on ice for the specified number of cycles. For holoenzyme samples, 2 μg of protease and 100 μg of FITC-labeled casein (1:50 wt:wt) were added in a total volume of 25 μL in 25 mM TMA acetate containing 2 mM CaCl2 and 0.1 mM ZnCl2. For apoenzyme samples, 2 μg of protease in ~220 mM TMA acetate was frozen and thawed for the specified number of cycles. FITC casein, CaCl2 and ZnCl2 were then added to these solutions to the same concentrations and final volumes as compared to the holoenzyme samples. All samples were then incubated for 2 hours at 50 °C after which the reaction was stopped by the addition of 0.5 μL 500 mM EDTA and 75 μL 0.6 M TCA. Samples were then incubated at 37 °C in the dark for 30 minutes then centrifuged at 14,000g for 20 minutes. The supernatant was then assayed for fluorescence. Two microliters of the supernatant were diluted into 200 μL of fluorescence assay buffer (500 mM Tris, pH 8.5) and fluorescence was measured as described above.

Comparison of peptide and protein detection rates

To compare detection rates between trypsin and Tryp-N, 50 μg of HEK293 cell lysate that had been reduced and alkylated with methyl methanethiosulfonate (MMTS) as described in Londhe et al.27 was digested with 1 μg of enzyme (2% w/w). One microgram of this digest was subjected to LC/MS/MS on an Orbitrap Lumos mass spectrometer, essentially as described in Engle et al.28. The resulting data were searched with Mascot22 against the Uniprot human reference proteome database25 using the relevant cleavage specificity, one missed cleavage, a 30 ppm precursor mass tolerance, and a 0.2 Da fragment mass tolerance with a target FDR of 1%. Cysteine modifications were set as fixed, with methionine oxidation and asparagine and glutamine deamidation as variable. Unique peptides and proteins for each digest were compiled for each protease, then compared.

Analysis of reproducibility in proteomics workflows

Reproducibility was assessed by comparing peptides identified from three biological replicates of Tryp-N-digested HEK293 cell lysate as described above. One microgram of this digest was subjected to 8-fraction 2D LC/MS/MS on an Orbitrap Lumos mass spectrometer, as described in Engle et al.28 and Londhe et al.27, and searched as above.
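Comparing identifications across replicates reduces to set operations on the lists of unique peptides. The sketch below shows one way such a comparison could be tallied. It assumes that reproducibility is expressed as the share of all peptides seen in any replicate that a given replicate detected, which is our reading rather than a stated definition, and the peptide labels are placeholders.

```python
def detection_fractions(replicates):
    """For each replicate, the fraction of the pooled unique peptides it contains."""
    sets = [set(rep) for rep in replicates]
    pooled = set().union(*sets)
    return [len(s) / len(pooled) for s in sets]


def shared_by_all(replicates):
    """Peptides identified in every replicate."""
    sets = [set(rep) for rep in replicates]
    return set.intersection(*sets)


# Placeholder identifiers standing in for unique peptide sequences.
rep1 = ["pepA", "pepB", "pepC", "pepD"]
rep2 = ["pepA", "pepB"]
rep3 = ["pepA", "pepC", "pepE"]

fractions = detection_fractions([rep1, rep2, rep3])
core = shared_by_all([rep1, rep2, rep3])

if __name__ == "__main__":
    print("fraction of pooled peptides per replicate:", fractions)
    print("found in all replicates:", sorted(core))
```

A replicate with fewer identifications necessarily covers a smaller share of the pooled list, so the spread of these fractions reflects sample-to-sample depth as well as digestion reproducibility.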

MRM limit of detection (LOD) assay

For MRM LOD measurements, catalase was digested with trypsin or Tryp-N under conditions favorable to each enzyme. Three transitions were independently optimized for each of the peptides of each sample, including optimization of collision energies. A fifteen-point calibration curve from 3.7 fg/injection to 1 μg/injection (distributed roughly over half-logs) was quantified on a Thermo Vantage triple quadrupole mass spectrometer (Thermo Scientific) equipped with an ESI spray source and coupled to an EASY-nLC system (Thermo Scientific). The nano-flow LC system was configured with a 180 μm inner diameter fused silica capillary trap column containing 3 cm of Aqua 5 μm C18 material (Phenomenex), and a self-pack PicoFrit™ 100 μm analytical column with an 8 μm emitter (New Objective, Woburn, MA) packed to 15 cm with Aqua 3 μm C18 material (Phenomenex). Eluted peptides were directly electrosprayed into the Vantage triple quadrupole mass spectrometer with the application of a distal 2.3 kV spray voltage and a capillary temperature of 200 °C. Mobile phase A consisted of 2% acetonitrile, 0.1% formic acid and mobile phase B consisted of 90% acetonitrile, 0.1% formic acid. Five microliters of each sample, dissolved in mobile phase A, were injected through the autosampler onto the trap column. Peptides were then separated using the following linear gradient steps at a flow rate of 300 nL/min: 3% B for 2 min, 3% B to 8% B over 3 min, 8% B to 40% B over 30 min, 40% B to 80% B over 1 min, held at 80% B for 5 min, 80% B to 5% B over 1 min, and finally held at 5% B for 5 min. Peaks were integrated and the total peak areas for each peptide were calculated using Skyline analysis software. LOD was calculated for 17 peptides from two linear curves fit to background and the linear range of response on a log count scale using least squares fitting.
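The gradient described above is easier to audit when written as a table of segments. The sketch below encodes the seven steps exactly as listed and interpolates the mobile phase composition at any time point. The data structure and the function name `percent_b` are illustrative choices, not instrument method syntax.

```python
# (duration in min, %B at start, %B at end) for each step, in order.
GRADIENT = [
    (2, 3, 3),     # hold at 3% B
    (3, 3, 8),     # 3% to 8% B
    (30, 8, 40),   # 8% to 40% B
    (1, 40, 80),   # 40% to 80% B
    (5, 80, 80),   # hold at 80% B
    (1, 80, 5),    # 80% to 5% B
    (5, 5, 5),     # hold at 5% B
]

TOTAL_MINUTES = sum(duration for duration, _, _ in GRADIENT)


def percent_b(t):
    """Mobile phase B percentage at time t (min) by linear interpolation."""
    if t < 0 or t > TOTAL_MINUTES:
        raise ValueError("time is outside the programmed gradient")
    elapsed = 0.0
    for duration, start, end in GRADIENT:
        if t <= elapsed + duration:
            return start + (end - start) * (t - elapsed) / duration
        elapsed += duration
    return float(GRADIENT[-1][2])


if __name__ == "__main__":
    print("total programmed time (min):", TOTAL_MINUTES)
    for t in (0, 5, 20, 35, 36, 47):
        print(f"t = {t:2d} min -> {percent_b(t):.1f}% B")
```

Summing the step durations gives a 47-minute method, of which 30 minutes are the main 8% to 40% B separation ramp.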

Molecular Modeling and Figures

Modeling of Tryp-N was performed using the Phyre2 protein fold recognition server29 and SWISS-MODEL30. In both cases, the structure of ulilysin31 was the top hit (PDB ID: 2J83). Molecular graphics were generated using PyMOL v2.3.232. Venn diagrams were generated using the eulerr package33 for R34. Cleavage frequency logos were generated using the ggseqlogo package35 for R. Cladograms were generated from multiple sequence alignment performed by the T-Coffee web server36 and displayed using the ggtree package37 for R. Boxplots were generated using the ggplot2 package38 for R. All other plots were made with GraphPad Prism 8 for macOS.

Results

Computational screening of putative Lys/Arg-specific proteases reveals many potential enzymes

Structural analysis of Lys-N11 and ulilysin13 revealed extended structural elements that do not appear to contribute to the folding of the enzymatic core (Supplemental Figure 1). Since such features can promote protein aggregation or instability, we reasoned that alternative enzymes with improved behavior may already be found in nature.

To begin, we performed a comprehensive bioinformatics analysis to identify proteases and putative proteases which might possess the desired N-terminal cleavage specificity. We initially focused on MEROPS39 superfamily M35, which includes Lys-N8. Manual sequence inspection of related proteases, homology alignment, as well as previous literature suggested that proteases in this family often possess key motifs: an HEXXH zinc-binding site40 and a GTXDXXYG41 motif responsible for catalytic metal coordination.

An in silico screen using these motifs against all available sequence libraries and databases, including environmental metagenomic sequences, revealed thousands of potential proteases with representatives from MEROPS superfamilies M10, M12, M35, M43, M54, M57, M66, M72, M80, M84 and M97, as well as candidates whose classification was ambiguous39.
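A motif search of this kind can be illustrated with regular expressions, treating each X as a wildcard position. The sketch below is a minimal stand-in for the screen, not the pipeline actually used: the helper names are hypothetical, the first test string is a stretch of the Tryp-N sequence from Table 1 that spans its HEXXH site, and the second is a made-up sequence included only to exercise the longer pattern.

```python
import re


def motif_to_regex(motif):
    """Convert a motif such as 'HEXXH' to a regex, with X matching any residue."""
    return re.compile(motif.replace("X", "."))


def find_motifs(sequence, motifs=("HEXXH", "GTXDXXYG")):
    """Return {motif: [(start_index, matched_text), ...]} for one sequence."""
    hits = {}
    for motif in motifs:
        pattern = motif_to_regex(motif)
        hits[motif] = [(m.start(), m.group()) for m in pattern.finditer(sequence)]
    return hits


# Stretch of the Tryp-N sequence (Table 1) around the zinc-binding histidines.
tryp_n_fragment = "NYNLGHTTTHEVGHWMGLYHTFQGG"

# Invented sequence, used only to show a match to the second pattern.
toy_sequence = "AAGTADKLYGAA"

fragment_hits = find_motifs(tryp_n_fragment)
toy_hits = find_motifs(toy_sequence)

if __name__ == "__main__":
    print(fragment_hits)
    print(toy_hits)
```

In a real screen the same patterns would be applied to every entry of a sequence database. Exact-match patterns like these are strict, so tolerant variants or profile-based searches may be needed to recover divergent family members.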

Informatic and biochemical assessment of putative lysine/arginine-specific enzymes

Filters were applied to this exhaustive list based on desirable qualities including relatively short sequence length, sequence homology to Lys-N or ulilysin (but not to proteases with unwanted specificities such as trypsin), and the absence of transmembrane domains. Enzymes in this list were then clustered by sequence homology and representative members from each cluster were taken for production and biochemical assessment (Supplemental Figure 2).

The candidate genes were commercially synthesized then taken for in vitro transcription and translation (IVT) in a reconstituted, protease-free system. Each IVT reaction was then subjected to protein refolding after which protease activity was assessed based on degradation of IVT reaction components (Supplemental Figure 2). Selected constructs that displayed significant digestion were then characterized for protease specificity by MS characterization of the resultant peptides.

The most promising hits (constructs from Chaetomium thermophilum and Myceliophthora thermophila) were then tested for their ability to be produced in recombinant expression systems. Proteolytic activity was assayed using a FITC-labeled casein assay21 and proteolytic specificity monitored by proteomic characterization of the resultant peptides. Reproducible expression, high activity, and high specificity were found for one protease which we describe here and name Tryp-N, a designation evocative of trypsin-like specificity but representing N-terminal (rather than C-terminal) selective activity.

Proteolysis by Tryp-N yields simplified peptide fragmentation spectra

When compared to tryptic peptides that harbor basic sites at both N- and C-termini (Figure 1a), peptides like those produced by Tryp-N (Figure 1b) yield simplified fragmentation patterns as has been observed for related enzymes such as ulilysin12. For such peptides, most basic centers are co-located on the N-terminal side of the molecule thus producing predominantly b-ions during fragmentation8.

Figure 1 – Molecular Characteristics of Tryp-N.


a, Peptides with basic centers at their C-termini, such as those generated by trypsin, lead to a complex fragmentation pattern with overlapping b and y ion series. b, In contrast, peptides produced by Tryp-N digestion display basic centers near the N-terminus. This results in simplified fragmentation spectra that are dominated by b ions. c, The domain architecture of the M43 family of metzincin metalloendopeptidases (such as Tryp-N) typically consists of an N-terminal signal peptide, a proenzyme cleavage site, and a protease domain near the C-terminus. d, Homology modeling of Tryp-N with SWISS-MODEL displays conserved, Zn2+-coordinating residues that are required for proteolysis. Zinc is shown in light blue, calcium in green. e, Recombinantly-expressed Tryp-N was highly purified as shown by SDS-PAGE.

Molecular characteristics of Tryp-N

Tryp-N is a 32 kDa protein belonging to the M43 family of metzincin metalloendopeptidases39,42,43 (Figure 1c; Table 1). Analysis by 3DLigandSite44 revealed an expected zinc binding site (coordinated by three histidines and one glutamic acid) as well as a single predicted calcium-binding site. These features were also observed in structural modeling of the protein, performed using both Phyre229 and SWISS-MODEL30 (Figure 1d).

In comparison to the highly-related proteins ulilysin13 (~25% identical) and mirolysin45,46 (~14% identical), some key similarities and differences are apparent. Most obviously, these three proteins belong to the same M43 family of metzincin proteases and display comparable domain arrangements (Supplemental Figure 3). As is typical of proteins from thermophilic organisms, this protein from C. thermophilum is shorter owing to truncations of several exposed surface loops. Additionally, Tryp-N lacks a C-terminal pro-enzyme tail found in other members of the M43 family. As mentioned, the predicted structure of Tryp-N does coordinate both calcium and zinc ions. Curiously, however, we did not find evidence in Tryp-N (in either the protein sequence or the structure prediction) for coordination of a second zinc ion as is observed in ulilysin.

Tryp-N is proteolytically active in the presence of many divalent cations

Tryp-N was purified to >95% purity as determined by SDS-PAGE (Figure 1e) and found to be proteolytically active in the presence of CaCl2 and ZnCl2 (Supplemental Figure 4).

To assess the metal cofactor preferences of the metalloprotease, activity was screened under a wide variety of added metals at multiple concentrations (Supplemental Figure 5). Owing to the prediction of calcium binding, metals were screened both in the presence and absence of calcium. Tryp-N activity was highest in the presence of a combination of calcium at 2 mM and either zinc or manganese at 100 μM. Based on this observation, and the predicted binding of Zn2+ and Ca2+ to Tryp-N, all activity was subsequently compared to this “standard condition” of 2 mM CaCl2 and 100 μM ZnCl2.
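Expressing each condition relative to the standard condition is a simple normalization of the fluorescence readings. The sketch below shows the arithmetic under assumptions of our own: a no-enzyme blank is subtracted from every reading before normalizing, and all readings are invented numbers chosen only to illustrate the calculation.

```python
def relative_activity(signal, standard, blank=0.0):
    """Activity as a percentage of the standard condition after blank subtraction."""
    reference = standard - blank
    if reference <= 0:
        raise ValueError("standard reading must exceed the blank")
    return 100.0 * (signal - blank) / reference


# Invented fluorescence readings in arbitrary units.
BLANK = 100.0       # no-enzyme control
STANDARD = 1100.0   # 2 mM CaCl2 + 100 uM ZnCl2
readings = {
    "test condition A": 870.0,
    "test condition B": 600.0,
}

normalized = {
    name: relative_activity(value, STANDARD, BLANK)
    for name, value in readings.items()
}

if __name__ == "__main__":
    for name, percent in normalized.items():
        print(f"{name}: {percent:.0f}% of standard")
```

Values above 100% would indicate a condition that outperforms the standard metal mixture.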

In the presence of 2 mM calcium, several other metals afforded significant activity (Supplemental Figure 6) including 100 μM cadmium (77%), 100 μM or 1 mM lead (85% and 81% respectively), and 100 μM or 1 mM strontium (77% and 98% respectively). In combination at 100 μM each, CaCl2, MnCl2 and ZnCl2 afforded 98% activity while CaCl2, CoCl2 and ZnCl2 together provided 85% of activity.

Tryp-N is active and specific over a broad range of pH values, temperatures, and buffer conditions

We assayed the pH preference of Tryp-N from pH 2 – 12 at constant 5 mM calcium and 100 μM zinc using a universal buffer (similar to the system implemented in Ellis et al.26) which provides constant buffering capacity from pH 1 – 14. Tryp-N demonstrated high activity over a broad pH range with highest activity between pH 6 – 8, though substantial activity persisted as high as pH 9 under otherwise typical reaction conditions (Figure 2a). In addition to broad pH tolerance, Tryp-N also demonstrated tolerance to a wide variety of common buffer additives. The protease exhibited highest activity between 55 and 65 °C (Figure 2b).

Overall proteolytic specificity was typically ≥95% with pronounced specificity for arginine and lysine (Figure 3a), a specificity equivalent to trypsin. This specificity was not significantly influenced by the reaction buffer (see Methods) or by the time of digestion from 0.5 – 16 hours incubation (Figure 3c) at 60 °C, although some specificity loss was observed at lower temperature over extended times. Importantly, this cleavage preference before arginine and lysine was stable from pH 5 – 11 (Figure 3b). With increasing pH, the relative preference for arginine over lysine increased, likely reflecting the relative pKa values of the amino acid side chains (10.5 for lysine and 12.5 for arginine) and the preference of the enzyme for positively charged residues. The protein showed fairly reproducible behavior in peptide identification experiments, with approximately 40-75% of unique peptides and proteins being detected in each of three biological replicate digestions (Supplemental Figure 7), though it should be noted that this range was significantly influenced by the variable number of peptides in each sample.
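The pKa argument can be made concrete with the Henderson-Hasselbalch relation, which gives the fraction of a basic side chain that carries a proton at a given pH. The sketch below uses the side-chain pKa values quoted in the text and treats each side chain as an isolated group. Effective pKa values within a folded substrate or at an enzyme active site may differ, so the numbers are indicative only.

```python
def fraction_protonated(ph, pka):
    """Fraction of a basic group in its protonated (charged) form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PKA_LYS = 10.5  # side-chain pKa quoted in the text
PKA_ARG = 12.5  # side-chain pKa quoted in the text

lys_at_11 = fraction_protonated(11.0, PKA_LYS)
arg_at_11 = fraction_protonated(11.0, PKA_ARG)

if __name__ == "__main__":
    for ph in (7.0, 9.0, 10.0, 11.0):
        lys = fraction_protonated(ph, PKA_LYS)
        arg = fraction_protonated(ph, PKA_ARG)
        print(f"pH {ph:4.1f}: Lys {lys:.3f} charged, Arg {arg:.3f} charged")
```

By this estimate, at pH 11 roughly a quarter of lysine side chains remain charged while arginine is still almost fully protonated, consistent with a shift in preference toward arginine at high pH.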

With the exception of buffers known to bind metals, Tryp-N showed comparable activity under many common buffer conditions. Highest activity was observed in Tris, though similar activity was demonstrated in buffers commonly employed in MS proteomics including ammonium and TMA acetate (Supplemental Figure 8). The enzyme showed tolerance for high salt concentrations, retaining ~70% activity in 0.75 M NaCl (Supplemental Figure 9), as well as for numerous detergents. Tryp-N maintained nearly full activity at concentrations of 0.1% SDS, 0.25% sodium deoxycholate, 5% Tween 20, and 5% Triton X-100 (Supplemental Figure 10). The chaotrope urea was tolerated at up to 8 M (62% activity) and actually enhanced activity at concentrations from 1 – 4 M (Figure 4a). In contrast, Tryp-N had little tolerance for guanidinium hydrochloride, falling to 41% activity at 250 mM (Figure 4b). Enhanced activity was observed in the presence of organic solvents such as acetonitrile or methanol (Figure 4c) and >40% of activity was maintained in 50% hexafluoroisopropanol (Figure 4d).

Figure 4 – Tryp-N tolerates a number of solvent and additive conditions, including those used in mass spectrometry workflows.

a, Tryp-N tolerated urea up to 8 M (62% activity) and demonstrated enhanced activity at concentrations from 1 – 4 M. b, In contrast, the protease had little tolerance for guanidinium hydrochloride, falling to 41% activity at 250 mM. c, Tryp-N showed enhanced activity in the presence of organic solvents such as acetonitrile or methanol. d, Moreover, the protease maintained >40% activity under harsh treatment with 50% hexafluoroisopropanol. All panels depict results from the FITC-labeled casein assay. Error bars represent the standard deviation from the mean (n=6 for each experiment).

As expected, Tryp-N was fully inhibited in the presence of EDTA or 1,10-phenanthroline (Supplemental Figure 4) and was inhibited by benzamidine with an IC50 of approximately 0.5 mM (Supplemental Figure 11). Because the protease both contains disulfides and relies on the oxidation state of its catalytic metal ion, the presence of 5 mM TCEP was found to drastically reduce activity (Supplemental Figure 12).

Stability to freeze/thaw and autolysis

More than 90% of activity remained after 4 hours of autodigestion (not shown) or through 7 freeze/thaw cycles for either the holo- or apoenzyme (Supplemental Figure 13).

Digestion time and temperature

At 60 °C, 70–80% of detectable peptides were generated after an incubation of 30 min, and incubation times of 0.5 – 3 hours were found to be optimal (Figure 3d). Tryp-N exhibits ~90% of its maximal activity at this temperature. At 50 °C, slightly longer digestion times were needed (4 hours) as the enzyme exhibits ~60% of maximal activity. While specificity was not significantly decreased at higher temperatures, longer incubation times at 50 °C resulted in fewer detected peptides, likely the result of precipitation and/or degradation of peptides or proteins from extended exposure at this temperature.

Given that the proteins and peptides of mesophilic organisms are prone to precipitate at high temperatures, we found that Tryp-N digestions at 55 °C were optimal as this temperature limits substrate precipitation while promoting enzymatic activity. Conveniently, this is also a common temperature for chemical reduction steps in many MS protocols, allowing for straightforward incorporation into existing workflows.

Proteomic discovery and sensitivity comparison of Tryp-N and trypsin

In proteomics discovery experiments, Tryp-N yielded ~70% as many proteins and 55% as many peptides when compared to trypsin (Figures 5a and 5b, respectively). It should be noted that while there were overlapping peptides and proteins between treatment with trypsin or Tryp-N, a substantial number were unique for each enzyme, demonstrating their complementary utility.

Figure 5 – Tryp-N allows for protein identification while producing simplified (b ion-dominated) fragmentation spectra.

a, In proteomics discovery experiments, Tryp-N yielded ~70% as many proteins when compared to trypsin. b, Additionally, Tryp-N and trypsin gave rise to some overlapping and some unique peptides. It should be noted that the small discrepancy between the total number of peptides observed and the summed numbers indicated in the Venn diagram arises from peptides that may arise from distinct proteins but cannot be distinguished by both proteases. As an example, two proteins containing …KLEELELDEQQK… and …RLEELELDEQQK… would produce peptides distinguishable by Tryp-N (KLEELELDEQQ versus RLEELELDEQQ) but not trypsin (both yielding LEELELDEQQK). A similar issue can arise for Tryp-N with N-terminally indistinguishable peptides. These peptides are rare overall (25 peptides out of the entire dataset) and do not significantly affect the data interpretation. c, Upon fragmentation, Tryp-N digested samples produced predominantly b ions (in contrast to the typical y ion-dominated fragments observed from tryptic peptides). In fact, the proportions of b and y ions from tryptic versus Tryp-N peptides precisely mirror each other, reflective of the N- versus C-terminal nature of the basic centers in these peptides. Error bars represent the standard deviation from the mean over all peptides (n=1655 for Tryp-N and n=12225 for trypsin). d, A decrease in the limit of detection (i.e. increase in sensitivity) by optimized MRM assays was generally observed for Tryp-N peptides as compared to tryptic peptides with 4.5-fold and 35.4-fold increases in median and average sensitivity, respectively. The distribution as well as individual fold-increases in sensitivity are plotted with the corresponding pair of peptides labeling each point.
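The peptide-ambiguity example in the caption can be reproduced with a minimal in silico digestion. This is a sketch under simplifying assumptions, not the authors' processing code: cleavage is treated as complete at every lysine and arginine, missed cleavages and sequence-context effects are ignored, and the flanking residues (AA... and ...GG) are hypothetical padding added around the sequences given in the caption. The function name `digest` is likewise illustrative.

```python
def digest(sequence, enzyme):
    """Split a protein sequence at every K or R.

    'trypsin' cuts C-terminal to K/R (after the residue);
    'tryp-n'  cuts N-terminal to K/R (before the residue).
    Missed cleavages and sequence-context effects are ignored.
    """
    if enzyme not in ("trypsin", "tryp-n"):
        raise ValueError(f"unknown enzyme: {enzyme}")
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            cut = i + 1 if enzyme == "trypsin" else i
            if cut > start:  # skip empty fragments
                peptides.append(sequence[start:cut])
                start = cut
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


# Hypothetical flanking residues around the sequences from the caption.
PROTEIN_1 = "AAKLEELELDEQQKGG"
PROTEIN_2 = "AARLEELELDEQQKGG"

if __name__ == "__main__":
    for enzyme in ("trypsin", "tryp-n"):
        p1, p2 = digest(PROTEIN_1, enzyme), digest(PROTEIN_2, enzyme)
        shared = sorted(set(p1) & set(p2))
        print(enzyme, p1, p2, "shared:", shared)
```

With these inputs the tryptic digests of both proteins contain the same central peptide (LEELELDEQQK), while the Tryp-N digests yield KLEELELDEQQ and RLEELELDEQQ, which differ by their N-terminal residue and so remain attributable to their parent proteins.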

Upon fragmentation in MS/MS, the majority of the total ion current for multiply charged, Tryp-N generated peptides was from b-ions (Figure 5c). This reflects the position of the basic amino acids lysine and arginine on the N-, rather than C-, termini and mirrors the ion preference of trypsin almost exactly (Figure 5c). Because proton ion current is mostly retained on N-terminal fragments, ion dilution between multiple ion series is avoided. As a result, in terms of sensitivity we found Tryp-N generated peptides often exhibit lower limits of detection (LOD) during MRM analyses as compared to their tryptic peptide counterparts. For 17 peptides of catalase, a decrease in the limit of detection by optimized MRM assays was generally observed for Tryp-N peptides with 4.5-fold and 35.4-fold increases in median and average sensitivity, respectively (Figure 5d).
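To illustrate why the position of the basic residue matters, the following sketch enumerates the singly charged b- and y-ion series of a peptide and counts how many fragments in each series contain a lysine or arginine. It uses standard monoisotopic residue masses; the function names are hypothetical, and the idea that a fragment "keeps" the charge whenever it contains the basic residue is a deliberate simplification of the fragmentation behavior described above, not a model of ion intensities.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_ions(peptide):
    """Return (b_ions, y_ions): singly charged m/z values b1..b(n-1), y1..y(n-1)."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    running = 0.0
    for m in masses[:-1]:            # N-terminal prefixes
        running += m
        b_ions.append(running + PROTON)
    running = 0.0
    for m in reversed(masses[1:]):   # C-terminal suffixes
        running += m
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


def basic_fragment_counts(peptide, basic="KR"):
    """Count b- and y-type fragments that contain a basic residue."""
    n = len(peptide)
    b = sum(any(aa in basic for aa in peptide[:i]) for i in range(1, n))
    y = sum(any(aa in basic for aa in peptide[i:]) for i in range(1, n))
    return b, y


if __name__ == "__main__":
    for name, pep in (("Tryp-N", "KLEELELDEQQ"), ("trypsin", "LEELELDEQQK")):
        b_ions, y_ions = fragment_ions(pep)
        print(name, pep, "b1 = %.4f" % b_ions[0], "y1 = %.4f" % y_ions[0],
              "basic b/y fragments:", basic_fragment_counts(pep))
```

For the Tryp-N-type peptide every b fragment and no y fragment contains the basic residue, and the tryptic counterpart shows the exact reverse, which is the mirror relationship seen in Figure 5c.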

Discussion

The widespread use of mass spectrometry in proteomic studies relies on robust sample processing reagents, tools, and protocols. Owing to the variety of biological samples and broad range of questions asked, resources which enable multiple routes of interrogation are critical. On account of its high activity, specificity, and versatility, trypsin is currently a mainstay in sample treatment prior to MS/MS analysis47.

Despite trypsin’s utility and prevalence, it has long been recognized that spectral interpretation would be greatly eased by promoting either a y- or b-ion series, rather than analyzing their overlapping registers as occurs with tryptic peptides. While many chemical and instrumental methods have been developed to this end, the simplicity of exchanging one protease for another is clearly a straightforward and desirable approach. By cleaving before lysine and arginine, Tryp-N produces peptides that possess a preponderance of their basic centers at the N-termini (histidines and incompletely cleaved peptides being the exceptions). Fragmentation by MS/MS thus produces simplified spectra dominated by b-ions, which facilitates peptide identification and de novo sequencing.

Tryp-N exhibits stability as well as high specificity and activity across a range of temperatures, pH values, detergent concentrations, and solvent compositions. Most importantly, proteolysis is highly effective in conditions that are well-suited for downstream MS/MS analysis. Compared to LysargiNase, which similarly generates peptides with N-terminal lysine and arginine, Tryp-N affords many advantages, the most practical of which is speed. Under recommended digestion conditions for LysargiNase, Tryp-N performs roughly as well over 1 – 3 hours with moderately improved digestion overnight. At higher temperatures of ~60 °C, Tryp-N is able to surpass the overnight digestion performance of LysargiNase under optimal conditions within only 1 hour (Supplemental Figure 14). The main disadvantage of working at these higher temperatures can be precipitation of the target proteins to be digested, though this was seldom a problem in our workflow. Nonetheless, this limitation can be overcome by the addition of small amounts of detergent, which Tryp-N tolerates well (Supplemental Figure 10). For proteomics workflows it is recommended that RapiGest surfactant be used rather than ProteaseMAX surfactant owing to the short half-life of the latter at high temperatures. Following digestion, RapiGest can be conveniently cleaved by simply lowering the pH of the reaction.

Taken together, this work demonstrates that Tryp-N allows for rapid sample processing at high temperature. This enzyme can be easily adapted to existing proteomic workflows as a substitute for (or complement to) trypsin, generating predominantly N-terminal b-ions and rapidly completing digestion in 1–3 hours at ~60 °C. Tryp-N is tolerant to a wide array of experimental conditions, making it a valuable addition to the molecular proteomics tool kit.

Supplementary Material

Supplemental Figure 1 – Structural and sequence analysis of ulilysin

Supplemental Figure 2 – Initial screening of protease activity for several putative proteases

Supplemental Figure 3 – Structural, domain, and sequence comparison of M43 proteases

Supplemental Figure 4 – Metal dependence of Tryp-N

Supplemental Figure 5 – Initial metal ion screening

Supplemental Figure 6 – Extended metal ion screening

Supplemental Figure 7 – Proteomic replicability of Tryp-N

Supplemental Figure 8 – Characterization of buffer composition on Tryp-N activity

Supplemental Figure 9 – Salt (NaCl) tolerance of Tryp-N

Supplemental Figure 10 – Detergent tolerance of Tryp-N

Supplemental Figure 11 – Benzamidine tolerance of Tryp-N

Supplemental Figure 12 – Characterization of TCEP on Tryp-N activity

Supplemental Figure 13 – Characterization of Tryp-N activity after repeated freeze-thawing

Supplemental Figure 14 – Comparison of Tryp-N activity to LysargiNase (ulilysin)

Supplemental Table 1 – Database identification, protein sequence, and full-length protein information for proteins screened

Supplemental Table 2 – List of peptides and sensitivity gains in MRM experiments

Acknowledgments

Funding

JPW, SNDG, NST, KHD, KDR, and DJP were supported by NCI Cancer Center Support Grant 5P30CA045508. JJI was supported by NIH grant F32GM097888 and the Harvey L. Karp Discovery Award.

Footnotes

Supporting Information

The following supporting information is available free of charge at the ACS website http://pubs.acs.org

Data Access

Mass spectrometry data have been deposited to the ProteomeXchange Consortium via the PRIDE48 partner repository with the following accession numbers:
  • PXD017021 (comparison of Tryp-N specificity at various values of pH; Figures 3a, 3b),
  • PXD017023 (comparison of trypsin and Tryp-N digests on HEK293 cell lysate; Figures 5a, 5b),
  • PXD017024 (comparison of buffers and temperatures on Tryp-N digestion specificity and efficacy; Figures 3c, 3d),
  • PXD017030 (replicability of three biological replicate digests of HEK293 cell lysate with Tryp-N; Supplemental Figure 7), and
  • PXD017048 (screening of orthologous protease specificity; Supplemental Figure 2).

Python scripts used for processing MASCOT search results for cleavage specificity, unique peptide counts, and comparisons between datasets are available at https://github.com/jonipsaro/trypn.

Conflict of Interests Disclosure

John P. Wilson and Darryl J. Pappin are co-founders of Protifi, a company that provides mass spectrometry workflow products, including Tryp-N. Tryp-N is protected by US patent US9719078 to Pappin, Wilson, and Ipsaro.

References

  • 1. Glish GL & Vachet RW. The basics of mass spectrometry in the twenty-first century. Nat Rev Drug Discov 2, 140–150, doi: 10.1038/nrd1011 (2003).
  • 2. DeGnore JP & Qin J. Fragmentation of phosphopeptides in an ion trap mass spectrometer. J Am Soc Mass Spectrom 9, 1175–1188, doi: 10.1016/S1044-0305(98)00088-9 (1998).
  • 3. Steen H & Mann M. The ABC's (and XYZ's) of peptide sequencing. Nat Rev Mol Cell Biol 5, 699–711, doi: 10.1038/nrm1468 (2004).
  • 4. Olsen JV, Ong SE & Mann M. Trypsin cleaves exclusively C-terminal to arginine and lysine residues. Mol Cell Proteomics 3, 608–614, doi: 10.1074/mcp.T400003-MCP200 (2004).
  • 5. Mitchell WM. Cleavage at arginine residues by clostripain. Methods Enzymol 47, 165–170 (1977).
  • 6. Tabb DL, Huang Y, Wysocki VH & Yates JR 3rd. Influence of basic residue content on fragment ion peak intensities in low-energy collision-induced dissociation spectra of peptides. Anal Chem 76, 1243–1248, doi: 10.1021/ac0351163 (2004).
  • 7. Gauci S et al. Lys-N and trypsin cover complementary parts of the phosphoproteome in a refined SCX-based approach. Anal Chem 81, 4493–4501, doi: 10.1021/ac9004309 (2009).
  • 8. Hohmann L et al. Proteomic analyses using Grifola frondosa metalloendoprotease Lys-N. J Proteome Res 8, 1415–1422, doi: 10.1021/pr800774h (2009).
  • 9. Taouatas N et al. Strong cation exchange-based fractionation of Lys-N-generated peptides facilitates the targeted analysis of post-translational modifications. Mol Cell Proteomics 8, 190–200, doi: 10.1074/mcp.M800285-MCP200 (2009).
  • 10. Taouatas N, Heck AJ & Mohammed S. Evaluation of metalloendopeptidase Lys-N protease performance under different sample handling conditions. J Proteome Res 9, 4282–4288, doi: 10.1021/pr100341e (2010).
  • 11. Hori T et al. Structure of a new 'aspzincin' metalloendopeptidase from Grifola frondosa: implications for the catalytic mechanism and substrate specificity based on several different crystal forms. Acta Crystallogr D Biol Crystallogr 57, 361–368, doi: 10.1107/s0907444900019740 (2001).
  • 12. Huesgen PF et al. LysargiNase mirrors trypsin for protein C-terminal and methylation-site identification. Nat Methods 12, 55–58, doi: 10.1038/nmeth.3177 (2015).
  • 13. Tallant C, Garcia-Castellanos R, Seco J, Baumann U & Gomis-Ruth FX. Molecular analysis of ulilysin, the structural prototype of a new family of metzincin metalloproteases. J Biol Chem 281, 17920–17928, doi: 10.1074/jbc.M600907200 (2006).
  • 14. Giansanti P, Tsiatsiani L, Low TY & Heck AJ. Six alternative proteases for mass spectrometry-based proteomics beyond trypsin. Nat Protoc 11, 993–1006, doi: 10.1038/nprot.2016.057 (2016).
  • 15. Trevisiol S et al. The use of proteases complementary to trypsin to probe isoforms and modifications. Proteomics 16, 715–728, doi: 10.1002/pmic.201500379 (2016).
  • 16. Marino G, Eckhard U & Overall CM. Protein Termini and Their Modifications Revealed by Positional Proteomics. ACS Chem Biol 10, 1754–1764, doi: 10.1021/acschembio.5b00189 (2015).
  • 17. Demir F, Niedermaier S, Kizhakkedathu JN & Huesgen PF. Profiling of Protein N-Termini and Their Modifications in Complex Samples. Methods Mol Biol 1574, 35–50, doi: 10.1007/978-1-4939-6850-3_4 (2017).
  • 18. Liu S et al. LysargiNase enhances protein identification on the basis of trypsin on formalin-fixed paraffin-embedded samples. Rapid Commun Mass Spectrom 33, 1381–1389, doi: 10.1002/rcm.8479 (2019).
  • 19. Tsiatsiani L et al. Opposite Electron-Transfer Dissociation and Higher-Energy Collisional Dissociation Fragmentation Characteristics of Proteolytic K/R(X)n and (X)nK/R Peptides Provide Benefits for Peptide Sequencing in Proteomics and Phosphoproteomics. J Proteome Res 16, 852–861, doi: 10.1021/acs.jproteome.6b00825 (2017).
  • 20. Yang H et al. Precision De Novo Peptide Sequencing Using Mirror Proteases of Ac-LysargiNase and Trypsin for Large-scale Proteomics. Mol Cell Proteomics 18, 773–785, doi: 10.1074/mcp.TIR118.000918 (2019).
  • 21. Twining SS. Fluorescein isothiocyanate-labeled casein assay for proteolytic enzymes. Anal Biochem 143, 30–34 (1984).
  • 22. Perkins DN, Pappin DJ, Creasy DM & Cottrell JS. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567 (1999).
  • 23. Li MZ & Elledge SJ. Harnessing homologous recombination in vitro to generate recombinant DNA via SLIC. Nat Methods 4, 251–256, doi: 10.1038/nmeth1010 (2007).
  • 24. Obad S et al. Silencing of microRNA families by seed-targeting tiny LNAs. Nat Genet 43, 371–378, doi: 10.1038/ng.786 (2011).
  • 25. UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res 43, D204–212, doi: 10.1093/nar/gku989 (2015).
  • 26. Ellis DA. A new universal buffer system. Nature 191, 1099–1100 (1961).
  • 27. Londhe AD et al. Regulation of PTP1B activation through disruption of redox-complex formation. Nat Chem Biol, doi: 10.1038/s41589-019-0433-0 (2019).
  • 28. Engle DD et al. The glycan CA19-9 promotes pancreatitis and pancreatic cancer in mice. Science 364, 1156–1162, doi: 10.1126/science.aaw3145 (2019).
  • 29. Kelley LA, Mezulis S, Yates CM, Wass MN & Sternberg MJ. The Phyre2 web portal for protein modeling, prediction and analysis. Nat Protoc 10, 845–858, doi: 10.1038/nprot.2015.053 (2015).
  • 30. Waterhouse A et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res 46, W296–W303, doi: 10.1093/nar/gky427 (2018).
  • 31. Garcia-Castellanos R et al. Substrate specificity of a metalloprotease of the pappalysin family revealed by an inhibitor and a product complex. Arch Biochem Biophys 457, 57–72, doi: 10.1016/j.abb.2006.10.004 (2007).
  • 32. Schrodinger LLC. The PyMOL Molecular Graphics System, Version 2.0. (2019).
  • 33. eulerr: Area-Proportional Euler and Venn Diagrams with Ellipses v. 6.0.0 (2019).
  • 34. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, Austria, 2019).
  • 35. Wagih O. ggseqlogo: a versatile R package for drawing sequence logos. Bioinformatics 33, 3645–3647, doi: 10.1093/bioinformatics/btx469 (2017).
  • 36. Madeira F et al. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res 47, W636–W641, doi: 10.1093/nar/gkz268 (2019).
  • 37. Yu G, Lam TT, Zhu H & Guan Y. Two Methods for Mapping and Visualizing Associated Data on Phylogeny Using Ggtree. Mol Biol Evol 35, 3041–3043, doi: 10.1093/molbev/msy194 (2018).
  • 38. Wickham H. ggplot2: Elegant Graphics for Data Analysis. (Springer-Verlag, 2016).
  • 39. Rawlings ND et al. The MEROPS database of proteolytic enzymes, their substrates and inhibitors in 2017 and a comparison with peptidases in the PANTHER database. Nucleic Acids Res 46, D624–D632, doi: 10.1093/nar/gkx1134 (2018).
  • 40. Rawlings ND & Barrett AJ. Evolutionary families of metallopeptidases. Methods Enzymol 248, 183–228 (1995).
  • 41. Fushimi N, Ee CE, Nakajima T & Ichishima E. Aspzincin, a family of metalloendopeptidases with a new zinc-binding motif. Identification of new zinc-binding sites (His(128), His(132), and Asp(164)) and three catalytically crucial residues (Glu(129), Asp(143), and Tyr(106)) of deuterolysin from Aspergillus oryzae by site-directed mutagenesis. J Biol Chem 274, 24195–24201 (1999).
  • 42. Andreeva A, Howorth D, Chothia C, Kulesha E & Murzin AG. SCOP2 prototype: a new approach to protein structure mining. Nucleic Acids Res 42, D310–314, doi: 10.1093/nar/gkt1242 (2014).
  • 43. Stocker W et al. The metzincins--topological and sequential relations between the astacins, adamalysins, serralysins, and matrixins (collagenases) define a superfamily of zinc-peptidases. Protein Sci 4, 823–840, doi: 10.1002/pro.5560040502 (1995).
  • 44. Wass MN, Kelley LA & Sternberg MJ. 3DLigandSite: predicting ligand-binding sites using similar structures. Nucleic Acids Res 38, W469–473, doi: 10.1093/nar/gkq406 (2010).
  • 45. Guevara T, Rodriguez-Banqueri A, Ksiazek M, Potempa J & Gomis-Ruth FX. Structure-based mechanism of cysteine-switch latency and of catalysis by pappalysin-family metallopeptidases. IUCrJ 7, 18–29, doi: 10.1107/S2052252519013848 (2020).
  • 46. Koneru L et al. Mirolysin, a LysargiNase from Tannerella forsythia, proteolytically inactivates the human cathelicidin, LL-37. Biol Chem 398, 395–409, doi: 10.1515/hsz-2016-0267 (2017).
  • 47. Rodriguez J, Gupta N, Smith RD & Pevzner PA. Does trypsin cut before proline? J Proteome Res 7, 300–305, doi: 10.1021/pr0705035 (2008).
  • 48. Perez-Riverol Y et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res 47, D442–D450, doi: 10.1093/nar/gky1106 (2019).
