Abstract
The data described provide a comprehensive resource for the family-wide active site specificity portrayal of the human matrix metalloproteinase family. We used the high-throughput proteomic technique PICS (Proteomic Identification of protease Cleavage Sites) to comprehensively assay 9 different MMPs. We identified more than 4300 peptide cleavage sites, spanning both the prime and non-prime sides of the scissile peptide bond, allowing detailed subsite cooperativity analysis. The proteomic cleavage data were expanded by kinetic analysis using a set of 6 quenched-fluorescent peptide substrates designed using these results. These datasets represent one of the largest specificity profiling efforts with subsequent structural follow-up for any protease family and put the spotlight on the specificity similarities and differences of the MMP family. A detailed analysis of these data may be found in Eckhard et al. (2015) [1]. The raw mass spectrometry data and the corresponding metadata have been deposited in PRIDE/ProteomeXchange with the accession number PXD002265.
Keywords: Matrix metalloproteinases, MMPs, PICS, Proteomics, Quenched fluorescence, Specificity profiling, Cleavage sites
Specifications Table
Subject area | Biology |
More specific subject area | Proteolytic enzymes, metalloproteinases, substrate specificity profiling, inhibitor design, drug discovery, matrix biology, extracellular matrix (ECM). |
Type of data | Mass spectrometry raw-files; search engine output files; metadata; quenched fluorescent peptide cleavage data; specificity profiling analysis (.xlsx). |
How data was acquired | Liquid chromatography tandem mass spectrometry (LC-MS/MS): either QSTAR XL or QSTAR Pulsar I (Applied Biosystems) mass spectrometer coupled on-line to LC Packings capillary LC system (Dionex). |
Data format | RAW files: .wiff and .mzXML-files; .tandem and .pepxml post-database search output files from X! Tandem [2]. |
Experimental factors | (A) Human MMPs 1, 2, 3, 8, 9 and 13 were expressed and purified from CHO or Timp2−/− MEF-conditioned medium [3], [4], [5], [6], [7]. Soluble MMP14 (A21-R513) was purified from Pichia pastoris [3]. MMP7 was purchased from Enzo Life Sciences and MMP12 was a kind gift from Novartis. ProMMP3 was activated using chymotrypsin; all other proMMPs were activated using APMA. (B) Proteome-derived peptide libraries were prepared from K562 cells in the presence of protease inhibitors [8], [9]. Proteins were denatured and cysteine side chains alkylated. After reaction clean-up, proteomes were digested with trypsin or GluC, giving orthogonal peptide libraries. Primary amines were dimethylated, and peptide libraries were purified and stored as 200 µg aliquots at −80 °C. |
Experimental features | PICS cleavage assays [8], [9] were performed by incubating peptide libraries with MMP. Cleaved peptides with neo-N-termini were biotinylated and affinity purified. Eluates were desalted and analyzed by LC–MS/MS. Spectra were matched to peptides using X! Tandem [2] and statistically evaluated with PeptideProphet [10], [11]. Identified peptides represent prime-side cleavage products and complete cleavage sites were reconstructed using WebPICS [12]. |
Data source location | Overall Laboratory, Centre for Blood Research, Department of Oral Biological and Medical Sciences, Faculty of Dentistry, University of British Columbia, Vancouver, BC, Canada. 49 °15′44.5″N 123 °14′41.8″W. |
Data accessibility | The mass spectrometry raw data have been deposited in PRIDE/ProteomeXchange with the accession number PXD002265. |
Value of the data
- Comprehensive specificity profiling data of nine matrix metalloproteinases using PICS proteomics.
- Largest compendium of matrix metalloproteinase P6–P6′ cleavage sites, reporting more than 4300 cleaved peptides.
- Identified cleavage sites allow in-depth cooperativity analysis of specificity subsites.
- Quenched fluorescent peptide cleavage data corroborating the specificity profiles.
- These data can guide small molecule drug development and help in discrimination between direct and indirect MMP targets.
1. Data
Matrix metalloproteinases (MMPs) regulate the structural matrix environment and extracellular signaling by precise proteolytic cleavage. Unraveling complex in vivo proteolytic networks is challenging. Thus comprehensive specificity profiles of all proteases involved are needed to guide interpretation.
The data in the ProteomeXchange archive (PXD002265) and accompanying data of the present article provide a comprehensive resource for the individual assessment of the active site specificity of nine representative members of the human matrix metalloproteinase (MMP) family. The in-depth specificity comparison based on these proteomic data corroborated with kinetic analysis using a set of 6 quenched fluorescent peptides and in silico peptide docking was presented recently [1].
MMPs 1, 2, 3, 7, 8, 9, 12, 13 and 14 were all assayed by the high-throughput Proteomic Identification of protease Cleavage Sites (PICS; or PICS proteomics) method [8], [9] using two orthogonal human whole-proteome peptide libraries (generated with trypsin or GluC). These data were analyzed using X! Tandem [2] for peptide-spectrum matching, and PeptideProphet [11] for statistical evaluation. However, other search engines such as Mascot [13], MS-GF+ [14], Comet [15], or MS Amanda [16] can be used to extend the number of matched spectra by combining the results using, e.g., iProphet [17] within the Trans-Proteomic Pipeline [10] or PeptideShaker [18]. Here we provide additional data to enable other researchers to (i) reinvestigate our analysis to identify additional subsite cooperativity effects and so far unexplored specificity preferences, and (ii) reanalyze our raw data with entirely new concepts or ideas in mind, such as specificity-oriented protease evolution [19], [20], functional phylogeny [21], or substrate-driven mapping of the degradome [22].
A representative PICS workflow is depicted in Fig. 1, and a graphical representation of the various MMP domain architectures is shown in Supplementary Fig. S1. Specificity profiling results are summarized as heat maps in Figs. 2 and 3 (trypsin-generated libraries) and Figs. 4 and 5 (GluC-generated libraries). Supplementary Table S1 provides a general resource for the MMPs analyzed, giving their domain boundaries and links to the major databases in protein and protease research: UniProt [23], Pfam [24], MEROPS [25], and TopFIND [26], [27], [28]. Supplementary Table S2 depicts key residues and all secondary structure elements characterizing the different MMP catalytic domains, and shows a structural overlay of all analyzed MMPs. Supplementary Tables S3 and S4 contain the individual MMP specificity profiling results obtained from trypsin- and GluC-generated peptide libraries, respectively. These tables show the identified thioacylated prime-side cleavage products (column A), the WebPICS [12] results (columns C–I) including heat maps (www.gnuplot.info) and iceLogos [29], and the individual subsite analysis (columns K–W), together with the amino acid distribution in the human proteome (UniProt release 2013_10) used for calculating normalized abundances (columns Y–AA). Additionally, a high-resolution PICS workflow is shown on both index pages, and all original WebPICS results are provided in the supplement as a combined zip-file. Supplementary Table S5 provides the data matrices underlying our subsite cooperativity analysis, in which we fixed certain subsites (e.g. P3-Pro or P1′-Leu) and analyzed the changes of the individual amino acid occurrences in the other subsites (P6–P6′). Supplementary Table S6 provides the raw data of our quenched fluorescent cleavage assays, including a graphical representation of two of the peptides (PLG↓L and PAN↓L) and a high-resolution table summarizing the results normalized to the standard MMP substrate PLG↓L (originally referred to as QF-24) [30].
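The subsite cooperativity analysis described above (fixing one subsite and tabulating residues at the others) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to build Supplementary Table S5: the 12-character P6–P6′ string layout, the function names, and the fold-over-background normalization are assumptions made for the example, and the input sequences in any usage would have to come from the WebPICS output.

```python
from collections import Counter

# Assumed layout: each cleavage site is a 12-letter string, P6..P1 then P1'..P6'.
POSITIONS = ["P6", "P5", "P4", "P3", "P2", "P1",
             "P1'", "P2'", "P3'", "P4'", "P5'", "P6'"]


def subsite_counts(sites):
    """Count amino acid occurrences at every subsite; ambiguous 'X' is skipped."""
    counts = {pos: Counter() for pos in POSITIONS}
    for site in sites:
        if len(site) != len(POSITIONS):
            raise ValueError(f"expected 12 residues, got {site!r}")
        for pos, aa in zip(POSITIONS, site):
            if aa != "X":
                counts[pos][aa] += 1
    return counts


def fix_subsite(sites, position, residue):
    """Keep only cleavage sites that carry `residue` at `position` (e.g. P3 = 'P')."""
    idx = POSITIONS.index(position)
    return [s for s in sites if s[idx] == residue]


def fold_over_background(counts, background):
    """Per-subsite frequency divided by a background frequency (assumed normalization)."""
    result = {}
    for pos, counter in counts.items():
        total = sum(counter.values())
        result[pos] = {
            aa: (n / total) / background[aa]
            for aa, n in counter.items()
            if background.get(aa)  # skip residues without a background value
        }
    return result
```

Comparing `subsite_counts(fix_subsite(sites, "P3", "P"))` with `subsite_counts(sites)` then shows how the residue distribution at, say, P1′ shifts when P3 is restricted to proline.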
Detailed database search settings for PICS data analysis are given in Supplementary Table S7. All mass spectrometry raw data and corresponding metadata have been deposited in the ProteomeXchange Consortium database (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository [51] with the PXD identifier PXD002265.
We previously used PICS to characterize a wide selection of different proteases, such as clostridial collagenases [31], plant metalloproteinases [32], human type II transmembrane serine proteases (TTSPs) [33], the archaeal protease LysargiNase [34], snake venom serine proteinases [35], human coagulation factor Xa [12], and caspases 3 and 7 [8], proving both its versatility and robustness. Importantly, PICS is designed for active site specificity profiling and not for the identification of native substrates. For the latter task, the Overall lab developed TAILS (Terminal Amine Isotopic Labeling of Substrates) [36], [37], which we have successfully used e.g. for the identification of natural substrates of dipeptidyl peptidases 8 and 9 [38], the human gelatinases MMP2 and 9 [39], membrane-type 6 matrix metalloprotease MMP25 [40] and the meprins [41]. Over the last few years we have adapted TAILS for the study of complex in vivo biological systems [42] and identification of N-terminal modifications such as proteolytic processing that alter protein stability or function [43], [44]. We have assessed the N-terminome of various tissues and cells, such as skin [42], erythrocytes [43], platelets [45], and dental pulp [46]. In combination, TAILS N-terminomics and PICS proteomics allow an in depth characterization of any biological system and protease in an unbiased, proteomics-centered manner.
The following materials and methods section will enable other investigators and laboratories to design similar experimental procedures to study matrix metalloproteinases or any other protease by PICS proteomics. Please refer to our recent dental pulp proteomics and N-terminomics Data in Brief article for more information on TAILS [47].
2. Experimental design, materials and methods
2.1. Expression and purification of human MMPs
2.1.1. Summary
MMPs 1, 2, 3, 8, 9 and 13 were expressed as zymogens using the pGW1HG vector (kindly provided by British Biotech Pharmaceuticals, Oxford, UK), and purified from serum-free conditioned medium from (i) Chinese hamster ovary (CHO) cells or (ii) murine embryonic Timp2−/− fibroblasts (MEFs) [3], [4], [5], [6], [7]. Soluble MMP14 lacking the C-terminal transmembrane and cytoplasmic domain (A21-R513) was purified from Pichia pastoris [3]. Purified proMMPs were divided into single-use aliquots of 10 μg, flash-frozen in liquid nitrogen, and stored at −70 °C until use. For more expression and purification details, see below. Active MMP7 was purchased from Enzo Life Sciences, and MMP12 was a kind gift from Novartis Pharma AG (Basel, CH). ProMMP3 was activated using chymotrypsin. All other proMMPs were activated using APMA.
2.1.2. ProMMP1
Chinese hamster ovary (CHO-K1) cells (American Type Culture Collection) were maintained in Dulbecco's modified Eagle's medium (Invitrogen) supplemented with 10% cosmic calf serum (HyClone Laboratories, Inc.) and non-essential amino acids (Invitrogen). Cells were transfected with pGW1HG-MMP1 and selected with 25 μg/ml mycophenolic acid (Invitrogen). MMP1-expressing clones were expanded to confluence in roller bottles (850 cm², BD Biosciences), washed with phosphate-buffered saline (PBS; 138 mM NaCl, 2.7 mM KCl, 20 mM Na2HPO4, 1.5 mM KH2PO4, pH 7.4), and incubated in 100 ml of serum-free CHO-S-SFM II medium (Invitrogen). Conditioned serum-free medium was collected every 1–2 days for up to 8 days. ProMMP1 was purified from collected culture supernatants using (i) an Orange-Sepharose column equilibrated in MES buffer (50 mM MES, pH 6.0, 5 mM CaCl2, 0.1 M NaCl, 0.025% sodium azide), and eluted with 1 M NaCl (in MES buffer). Elution fractions were subsequently loaded on (ii) Zn2+-chelating Sepharose Fast Flow resin (Amersham Biosciences), and chromatographed with a linear imidazole gradient (0 to 0.5 M). Fractions containing proMMP1 were pooled and dialyzed into HEPES buffer (50 mM HEPES pH 7.2, 5 mM CaCl2, 0.1 M NaCl).
2.1.3. ProMMP2
TIMP-2-free human proMMP2 was expressed in ras/myc-transformed Timp2−/− fibroblasts. Cells were grown in Dulbecco's modified Eagle's medium with 10% cosmic calf serum (HyClone Laboratories, Inc.), transfected with MMP2-pGW1HG, and selected by using 25 μg/ml mycophenolic acid. Serum-free conditioned medium was harvested from roller bottles and proMMP2 was purified at 4 °C in MES buffer by (i) gelatin-Sepharose (Amersham Biosciences) chromatography. After elution with 10% dimethyl sulfoxide in HEPES buffer, samples were dialyzed into MES buffer, and loaded in tandem onto (ii) lentil lectin-Sepharose (Sigma) to remove MMP9 and fibronectin, and (iii) a 1 ml gelatin-Sepharose column to capture proMMP2. After elution using 10% dimethyl sulfoxide (gelatin-Sepharose column only), fractions containing proMMP2 were pooled and dialyzed into MES buffer.
2.1.4. ProMMP3
Recombinant C-terminally FLAG-tagged human proMMP3 was expressed from pGW1HG in CHO-K1 cells and purified from supernatants in MES buffer using a (i) Green-agarose (Sigma) column. After elution with 1 M NaCl, eluates were loaded on a (ii) Zn2+-chelating column (Amersham Biosciences) and chromatographed with an imidazole gradient. Fractions containing proMMP3 were pooled, dialyzed into Tris-buffered saline (TBS; 50 mM Tris, 150 mM NaCl, pH 7.4), and subsequently loaded onto an (iii) anti-FLAG-agarose column (Sigma). After elution with 100 mM glycine, pH 3.5, fractions were immediately adjusted to pH 7–8 using 1 M Tris pH 8.0, and fractions containing proMMP3 were pooled and dialyzed into HEPES buffer.
2.1.5. ProMMP8
CHO-K1 cells were transfected with pGW1HG-MMP8 and selected using 25 μg/ml mycophenolic acid (Invitrogen); conditioned medium was collected from roller bottles. To remove gelatinases (MMP2 and MMP9), culture supernatants were chromatographed over (i) gelatin-Sepharose 4B resin (Amersham Biosciences) connected in tandem with (ii) a Red-Sepharose CL-6B column (Amersham Biosciences). MMP8 was eluted from Red-Sepharose with 1 M NaCl in TBS, and (iii) loaded on a column of Zn2+-chelating Sepharose resin (Amersham Biosciences). Fractions containing MMP8 were pooled and chromatographed over (iv) lentil lectin-agarose-Sepharose 4B (Sigma-Aldrich) and eluted with 100 mM α-D-methylmannopyranoside (Sigma) in TBS. Purified proMMP8 was buffer exchanged into collagenase assay buffer (50 mM Tris, 200 mM NaCl, 5 mM CaCl2, 0.05% Brij-35, pH 7.4) using a PD-10 Sephadex G-25 column (Amersham Biosciences).
2.1.6. ProMMP9
Human MMP9 was expressed from pGW1HG in CHO-K1 cells with 25 μg/ml mycophenolic acid for selection. ProMMP9 was captured from conditioned medium on a gelatin-Sepharose column (Amersham Biosciences) in MES buffer and, after extensive washing, eluted in MES buffer supplemented with 10% dimethyl sulfoxide. ProMMP9 was dialyzed into HEPES buffer.
2.1.7. ProMMP13
Recombinant C-terminally FLAG-tagged human MMP13 was expressed in CHO-K1 cells from pGW1HG and purified from culture supernatants using (i) a Green-agarose column (Sigma). After extensive washing with MES buffer, bound protein was eluted using 1 M NaCl, and fractions containing MMP13 were dialyzed into TBS. MMP13 was then purified to homogeneity using (ii) an anti-FLAG-agarose column (Sigma). After elution with 100 mM glycine (pH 3.5), fractions were immediately adjusted to pH 7–8 using 1 M Tris pH 8.0. ProMMP13 was dialyzed into HEPES buffer.
2.1.8. ProMMP14
Soluble MT1-MMP with a FLAG tag in place of the transmembrane and cytoplasmic domains was cloned into pPIC9 (Invitrogen) and expressed in Pichia GS115 cells (Invitrogen). Cells were grown in 500 ml baffled flasks. After 24 h, 0.5% methanol was added to induce recombinant protein expression. Culture medium was diluted in MES buffer and MT1-MMP was purified using a red agarose column (Sigma). After extensive washing with MES buffer, protein was eluted with 1 M NaCl, and fractions containing proMMP14 were pooled and dialyzed into HEPES buffer.
3. PICS peptide library preparation
Human whole proteome-derived peptide libraries for MMP specificity profiling were prepared as described in detail in Nature Protocols [9]. In brief, cell pellets were collected from human lymphoblast cell K562 cultures and lysed in 20 mM HEPES (pH 7.5) supplemented with 0.1% (w/v) SDS and protease inhibitors to prevent unwanted proteolysis (1× Roche cOmplete plus 1 mM PMSF and 10 mM EDTA). Cell debris was removed by centrifugation (26,000g, 1 h, 4 °C); soluble proteins were denatured using guanidine hydrochloride (4 M), and cysteine side chains were reduced with 20 mM DTT (1 h, 37 °C). Free sulfhydryl groups were protected with 40 mM iodoacetamide (3 h, 20 °C) to avoid peptide crosslinking, and reactions were stopped by adding more DTT (5 mM, 15 min, 20 °C). Reaction clean-up was performed using chloroform/methanol precipitation as described elsewhere [48]; pellets were air-dried, re-suspended in 100 mM HEPES, 5 mM CaCl2, pH 7.5, and digested with TPCK-treated trypsin or GluC (Staphylococcus aureus protease V8, Worthington) at a protease to proteome ratio of 1:100 (w/w) overnight at 37 °C. Note that chymotrypsin is another protease often used for PICS library preparation. After inactivation of trypsin/GluC with 1 mM PMSF (30 min, 20 °C), undigested protein aggregates were removed by centrifugation (20,000g, 10 min, 4 °C). Primary amines of peptide N-termini (α-amines) and lysine side chains (ε-amines) were blocked by reductive dimethylation with 30 mM formaldehyde (CH2O) and 15 mM sodium cyanoborohydride (NaCNBH3, Sterogene) at 20 °C overnight (16 h, pH 6–7). To ensure complete amine blocking, another 15 mM formaldehyde and 15 mM sodium cyanoborohydride were added and the reaction was incubated for an additional 2 h.
Samples were desalted by size exclusion chromatography using Sephadex G-10 columns (10 mM potassium phosphate buffer, pH 2.7, 10% (v/v) methanol), and after methanol removal by vacuum concentration (SpeedVac, Thermo), peptides were purified by reversed-phase chromatography on an ÄKTA™ high-performance liquid chromatography system (Äkta Explorer, GE Healthcare) using a RESOURCE RPC column (GE Healthcare); wash buffer contained 0.3% (v/v) formic acid, and samples were eluted in 80% (v/v) acetonitrile, both in HPLC-grade H2O. These PICS peptide libraries were concentrated by rotary evaporation under vacuum, re-suspended in water, and stored in 200–400 µg aliquots of 5–15 mg/ml at −80 °C until use. Peptide concentration was estimated using the bicinchoninic assay (BCA, Pierce). All reagents were purchased from Sigma-Aldrich unless otherwise specified.
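To illustrate why trypsin and GluC give orthogonal peptide libraries from the same proteome, the digestion step can be mimicked in silico. The sketch below is a simplification under stated assumptions: cleavage is modelled as strictly C-terminal to Lys/Arg (trypsin) or Asp/Glu (GluC), with no proline rule and no efficiency model, and the function and constant names are hypothetical.

```python
# Simplified cleavage rules (assumption): cut C-terminal to these residues.
TRYPSIN = "KR"
GLUC = "DE"


def digest(sequence, cleave_after, missed_cleavages=0):
    """Return peptides from cutting `sequence` after every residue in `cleave_after`.

    Peptides spanning up to `missed_cleavages` uncut sites are included as well.
    """
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in cleave_after:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder

    peptides = []
    for n in range(missed_cleavages + 1):
        for j in range(len(fragments) - n):
            peptides.append("".join(fragments[j:j + n + 1]))
    return peptides
```

Running `digest(protein, TRYPSIN)` and `digest(protein, GLUC)` on the same sequence yields peptides with different termini, so residues that border a tryptic cut (and are therefore under-represented near peptide ends in one library) fall inside peptides of the other.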
4. PICS cleavage site specificity assay
MMP cleavage assays were performed by incubation of 200–400 µg human whole-proteome peptide library with active recombinant MMP at a protease to peptide library ratio of 1:100 (w/w) in 50 mM HEPES, 150 mM NaCl, 5 mM CaCl2 at pH 7.4, overnight, and stopped by heat inactivation at 70 °C for 30 min. Prime-side cleavage products generated by MMP cleavage were subsequently isolated by positive enrichment using the biotin handle. In short, cleaved peptides with a free primary amine at the N-terminus generated by MMP activity were biotinylated by incubation with 0.5 mM Sulfo-NHS-SS-Biotin, an amine-reactive biotin with a redox-sensitive and thus cleavable disulfide linker (Thermo Scientific) for 2 h at 20 °C. Biotinylated prime-side cleavage products were separated from uncleaved peptides by affinity purification, incubating with 300 μl Streptavidin Sepharose slurry (GE Healthcare) for 2 h with mild agitation. After extensive washing (50 mM HEPES, pH 7.2), biotinylated peptides were eluted with 20 mM DTT (2 h, 20 °C), desalted using reversed-phase solid phase extraction (Sep-Pak C18, Waters) with binding and washing in 0.1% (v/v) formic acid and elution in 80% (v/v) acetonitrile, both in HPLC-grade H2O. Eluates were vacuum dried to near dryness using a SpeedVac concentrator (Thermo), brought to 10 μl with 0.1% formic acid, and stored at −80 °C until LC–MS/MS analysis.
5. LC–MS/MS, peptide spectrum matching, and data analysis
LC–MS/MS analysis was performed using an LC Packings capillary LC system (Dionex) coupled online to a quadrupole time-of-flight mass spectrometer operated either by the UBC Centre for Blood Research Mass Spectrometry Suite (QSTAR XL; Applied Biosystems), or by the UBC Proteomics Core Facility (QSTAR Pulsar I, Applied Biosystems). Samples were diluted in 0.3% (v/v) formic acid and loaded onto a column packed with Magic C18 resin (Michrom Bioresources). Peptides were eluted using a 2–80% (v/v) acetonitrile gradient in 0.1% (v/v) formic acid over 95 min. MS/MS data were acquired automatically, using Analyst QS software, v1.1 (Applied Biosystems) for data-dependent acquisition based on a 1 s MS survey scan from 350 m/z to 1500 m/z, followed by up to 3 MS/MS scans of 2 s each. Singly charged ions were excluded because peptides typically carry multiple charges in ESI mode. Centroids were calculated for the acquired data, which were converted to mzXML format using msConvert [49]. Peptides were identified from the human UniProtKB/SwissProt database containing canonical and isoform protein sequences (downloaded October 2013) using the search engine X! Tandem [2] in conjunction with PeptideProphet [11], both implemented in the Trans-Proteomic Pipeline v4.3 [10], at an estimated false discovery rate (FDR) of 1%. Search parameters included a mass tolerance of 200 ppm for parent ions and 0.2 Da for fragment ions, allowing up to two missed cleavages. The following fixed peptide modifications were set: carbamidomethylation of cysteine side chains (+57.02 Da) and dimethylation of lysine ε-amines (+28.03 Da). Methionine oxidation (+15.99 Da), N-terminal dimethylation (+28.03 Da) and thioacylation (+88.00 Da) were set as variable modifications. Note that N-terminally thioacylated peptides identified by LC–MS/MS represent prime-side cleavage products of the proteases of interest.
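For readers re-searching the raw data with another engine, the precursor tolerance and modification masses quoted above translate into a few simple calculations. The following Python sketch is illustrative only: the dictionary keys and function names are made up for this example and do not correspond to X! Tandem parameter names; the mass shifts are the rounded values given in the text.

```python
PROTON = 1.007276  # mass of a proton in Da

# Mass shifts in Da as quoted in the search settings (keys are hypothetical labels).
FIXED_MODS = {"carbamidomethyl_C": 57.02, "dimethyl_K": 28.03}
VARIABLE_MODS = {"oxidation_M": 15.99, "dimethyl_Nterm": 28.03, "thioacyl_Nterm": 88.00}


def mz(neutral_mass, charge):
    """m/z of a peptide of given neutral mass carrying `charge` protons."""
    return (neutral_mass + charge * PROTON) / charge


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=200.0):
    """True if the precursor lies inside the +/- tol_ppm window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm
```

At m/z 1000, a 200 ppm window corresponds to roughly ±0.2 m/z units, which gives a sense of how permissive the precursor filter is on these instruments.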
The complete cleavage sites were reconstructed bioinformatically using the open web-based program WebPICS [12], available at http://clipserve.clip.ubc.ca/pics/, which generates a non-redundant list of identified cleavage sites by matching each prime-side peptide sequence to the human IPI database (v3.69, 174784 entries; EMBL-EBI, UK) and extracting the non-prime-side sequence up to the next cleavage site of the enzyme used for library generation, i.e. to the next N-terminal Asp or Glu for GluC-generated libraries, or the next N-terminal Arg or Lys in the case of trypsin-generated libraries. Subsite positions with ambiguous information, arising e.g. from different protein isoforms, are omitted and replaced by X for further analysis. Identified cleavage sites can be summarized as heat maps, using e.g. Gnuplot (www.gnuplot.info), or as iceLogos (http://iomics.ugent.be/icelogoserver/index.html) [29].
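The reconstruction logic described above can be approximated in a short Python sketch. It is not the WebPICS implementation: the function names are hypothetical, the protein database is reduced to a plain list of sequences, and padding short flanks with X is an assumption made here so that every site has a fixed width.

```python
def find_all(haystack, needle):
    """Yield every start index of `needle` in `haystack` (overlaps included)."""
    pos = haystack.find(needle)
    while pos != -1:
        yield pos
        pos = haystack.find(needle, pos + 1)


def non_prime_side(protein, start, library_residues, width=6):
    """Residues N-terminal of `start`, back to the preceding library-enzyme site."""
    n = start
    while n > 0 and protein[n - 1] not in library_residues:
        n -= 1
    # Keep at most `width` residues; left-pad with X if the flank is shorter.
    return protein[n:start][-width:].rjust(width, "X")


def reconstruct(prime_peptide, proteins, library_residues, width=6):
    """Return (non_prime, prime) for an identified prime-side peptide, or None.

    Positions that differ between database matches are replaced by X.
    """
    candidates = {
        non_prime_side(p, s, library_residues, width)
        for p in proteins
        for s in find_all(p, prime_peptide)
    }
    if not candidates:
        return None
    consensus = "".join(
        col[0] if len(set(col)) == 1 else "X" for col in zip(*candidates)
    )
    return consensus, prime_peptide[:width].ljust(width, "X")
```

Here `library_residues` would be `"KR"` for trypsin-generated libraries and `"DE"` for GluC-generated ones, matching the rule stated in the text.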
6. Quenched fluorescence protease activity assay
Synthetic quenched fluorescent (QF) peptides were purchased from ChinaPeptides Co. Ltd. (Shanghai, China), dissolved in DMSO and protected from light. Working stocks (100 μM) were prepared in DMSO using the molar extinction coefficient of the conjugated quencher (DNP; 2,4-dinitrophenyl) of 6.985 mM−1 cm−1 at 400 nm [50]. MMP zymogens (MMP1, MMP2, MMP8, MMP9, MMP14) were activated in 100 mM Tris, pH 7.5, 100 mM NaCl, 10 mM CaCl2, and 0.05% Brij-35, using 1 mM APMA (para-aminophenylmercuric acetate) at 37 °C for 30 min. Chymotrypsin was used to activate MMP3 at a ratio of 1:100 (w/w) for 30 min at 37 °C, and was subsequently inactivated using 1 mM PMSF. MMP7, MMP12, and MMP13 were typically auto-activated during purification. Quenched fluorescent peptide assays were performed immediately after MMP activation in the presence of protease inhibitor cocktail (HALT™, Life Technologies, no EDTA added) using a multi-wavelength fluorescence scanner (POLARstar OPTIMA, BMG Labtech). Each MMP (1–10 nM) was incubated with 1 µM QF-peptide in 100 µL of 100 mM Tris, pH 7.5, 100 mM NaCl, 10 mM CaCl2, and 0.05% Brij-35, and the increase in fluorescence was measured at 45 s intervals for 1 h at 37 °C. The excitation and emission wavelengths were set to 320 and 405 nm, respectively, and all measurements were performed in duplicate. Experiments were repeated three times with independent substrate and MMP preparations on consecutive days.
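Two calculations are implicit in the assay description: converting a DNP absorbance reading into a peptide concentration (Beer-Lambert law) and turning a fluorescence time course into a rate that can be normalized to a reference substrate. The Python sketch below shows one way to do both. It is an assumption-laden illustration: the text does not state how rates were derived, so the least-squares slope over the supplied points, the function names, and the ASCII substrate labels are choices made for this example.

```python
EPSILON_DNP = 6.985  # mM^-1 cm^-1 at 400 nm, value quoted in the text


def concentration_mM(absorbance, path_cm=1.0, epsilon=EPSILON_DNP):
    """Beer-Lambert law: c = A / (epsilon * path length), result in mM."""
    return absorbance / (epsilon * path_cm)


def initial_rate(times_s, fluorescence):
    """Least-squares slope of fluorescence versus time (units per second).

    The caller is expected to pass only the early, approximately linear
    part of the progress curve (assumption of this sketch).
    """
    n = len(times_s)
    if n < 2 or n != len(fluorescence):
        raise ValueError("need at least two paired data points")
    mean_t = sum(times_s) / n
    mean_f = sum(fluorescence) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, fluorescence))
    return sxy / sxx


def normalize_to_reference(rates, reference):
    """Express every rate relative to the reference substrate (reference = 1.0)."""
    ref = rates[reference]
    return {name: rate / ref for name, rate in rates.items()}
```

For example, rates keyed by substrate could be passed as `normalize_to_reference(rates, "PLG-L")`, where `"PLG-L"` is an ASCII stand-in for the PLG↓L peptide.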
Acknowledgments
C.M.O. holds a Canada Research Chair in Metalloproteinase Proteomics and Systems Biology. J.H.C and A.E.S were supported by graduate fellowships from Natural Sciences and Engineering Research Council of Canada (NSERC), Canadian Institutes of Health Research (CIHR), and Michael Smith Foundation for Health Research (MSFHR). A.P. and G.M. were co-funded by post-doctoral fellowships from the UBC Center for Blood Research. P.F.L. was supported by a Feodor Lynen Research Fellowship of the Alexander von Humboldt Foundation, and O.S. and U.a.d.K were supported by fellowships from the German Research Foundation (DFG). The German Academic Exchange Service (DAAD) and the MSFHR supported P.F.H. U.E. and A.D. were supported by a post-doctoral fellowship from MSFHR, and C.L.B. was supported by postdoctoral fellowships of the Swiss National Science Foundation and the Novartis Jubilee Foundation. This work was supported by project grants from CIHR (MOP-11433, MOP-37937, and MOP-111055), and infrastructure grants from both the MSFHR and the Canada Foundations for Innovation (CFI). We thank the Pride Team (http://www.ebi.ac.uk/services/teams/pride), especially Tobias Ternent and Attila Csordas, for excellent assistance with MS data deposition, Jason Rogalski, Wei Chen, and Suzanne Perry from the University of British Columbia Proteomics Core Facility (PCF) and the UBC Centre for Blood Research Mass Spectrometry Suite for LC–MS/MS measurements, and all current and former members of the Overall Lab for their continuous support and inspiring discussions.
Footnotes
Supplementary data associated with this article can be found in the online version at doi:10.1016/j.dib.2016.02.036.
Appendix A. Supplementary material
References
- 1.Eckhard U., Huesgen P.F., Schilling O., Bellac C.L., Butler G.S., Cox J.H. Active site specificity of the matrix metalloproteinase family: proteomic identification of 4300 cleavage sites by nine MMPs explored with structural and synthetic peptide cleavage analyses. Matrix Biol.: J. Int. Soc. Matrix Biol. 2015 doi: 10.1016/j.matbio.2015.09.003. [DOI] [PubMed] [Google Scholar]
- 2.Craig R., Beavis R.C. TANDEM: matching proteins with tandem mass spectra. Bioinformatics. 2004;20:1466–1467. doi: 10.1093/bioinformatics/bth092. [DOI] [PubMed] [Google Scholar]
- 3.Morrison C.J., Overall C.M. TIMP independence of matrix metalloproteinase (MMP)-2 activation by membrane type 2 (MT2)-MMP is determined by contributions of both the MT2-MMP catalytic and hemopexin C domains. J. Biol. Chem. 2006;281:26528–26539. doi: 10.1074/jbc.M603331200. [DOI] [PubMed] [Google Scholar]
- 4.Butler G.S., Tam E.M., Overall C.M. The canonical methionine 392 of matrix metalloproteinase 2 (gelatinase A) is not required for catalytic efficiency or structural integrity: probing the role of the methionine-turn in the metzincin metalloprotease superfamily. J. Biol. Chem. 2004;279:15615–15620. doi: 10.1074/jbc.M312727200. [DOI] [PubMed] [Google Scholar]
- 5.Pelman G.R., Morrison C.J., Overall C.M. Pivotal molecular determinants of peptidic and collagen triple helicase activities reside in the S3′ subsite of matrix metalloproteinase 8 (MMP-8): the role of hydrogen bonding potential of ASN188 and TYR189 and the connecting cis bond. J. Biol. Chem. 2005;280:2370–2377. doi: 10.1074/jbc.M409603200. [DOI] [PubMed] [Google Scholar]
- 6.Morrison C., Mancini S., Cipollone J., Kappelhoff R., Roskelley C., Overall C. Microarray and proteomic analysis of breast cancer cell and osteoblast co-cultures: role of osteoblast matrix metalloproteinase (MMP)-13 in bone metastasis. J. Biol. Chem. 2011;286:34271–34285. doi: 10.1074/jbc.M111.222513. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Wang Z., Juttermann R., Soloway P.D. TIMP-2 is required for efficient activation of proMMP-2 in vivo. J. Biol. Chem. 2000;275:26411–26415. doi: 10.1074/jbc.M001270200. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Schilling O., Overall C.M. Proteome-derived, database-searchable peptide libraries for identifying protease cleavage sites. Nat. Biotechnol. 2008;26:685–694. doi: 10.1038/nbt1408. [DOI] [PubMed] [Google Scholar]
- 9.Schilling O., Huesgen P.F., Barré O., Keller U. Auf dem, Overall C.M. Characterization of the prime and non-prime active site specificities of proteases by proteome-derived peptide libraries and tandem mass spectrometry. Nat. Protoc. 2011;6:111–120. doi: 10.1038/nprot.2010.178. [DOI] [PubMed] [Google Scholar]
- 10.Deutsch E.W., Mendoza L., Shteynberg D., Slagel J., Sun Z., Moritz R.L. Trans-Proteomic Pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics. Proteom. Clin. Appl. 2015 doi: 10.1002/prca.201400164. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Keller A., Nesvizhskii A.I., Kolker E., Aebersold R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 2002;74:5383–5392. doi: 10.1021/ac025747h. [DOI] [PubMed] [Google Scholar]
- 12.Schilling O., auf dem Keller U., Overall C.M. Factor Xa subsite mapping by proteome-derived peptide libraries improved using WebPICS, a resource for proteomic identification of cleavage sites. Biol. Chem. 2011;392:1031–1037. doi: 10.1515/BC.2011.158. [DOI] [PubMed] [Google Scholar]
- 13.Perkins D.N., Pappin D.J., Creasy D.M., Cottrell J.S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis. 1999;20:3551–3567. doi: 10.1002/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2. AID-ELPS3551>3.0.CO;2-2. [DOI] [PubMed] [Google Scholar]
- 14.Kim S., Pevzner P.A. MS-GF+ makes progress towards a universal database search tool for proteomics. Nat. Commun. 2014;5:5277. doi: 10.1038/ncomms6277. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Eng J.K., Jahan T.A., Hoopmann M.R. Comet: an open-source MS/MS sequence database search tool. Proteomics. 2013;13:22–24. doi: 10.1002/pmic.201200439. [DOI] [PubMed] [Google Scholar]
- 16.Dorfer V., Pichler P., Stranzl T., Stadlmann J., Taus T., Winkler S. MS Amanda, a universal identification algorithm optimized for high accuracy tandem mass spectra. J. Proteome Res. 2014;13:3679–3684. doi: 10.1021/pr500202e. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Shteynberg D., Deutsch E.W., Lam H., Eng J.K., Sun Z., Tasman N. iProphet: multi-level integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates. Mol. Cell. Proteom. 2011;10 doi: 10.1074/mcp.M111.007690. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18. Vaudel M., Burkhart J.M., Zahedi R.P., Oveland E., Berven F.S., Sickmann A. PeptideShaker enables reanalysis of MS-derived proteomics data sets. Nat. Biotechnol. 2015;33:22–24. doi: 10.1038/nbt.3109.
- 19. Fuchs J.E., von Grafenstein S., Huber R.G., Margreiter M.A., Spitzer G.M., Wallnoefer H.G. Cleavage entropy as quantitative measure of protease specificity. PLoS Comput. Biol. 2013;9:e1003007. doi: 10.1371/journal.pcbi.1003007.
- 20. Perona J.J., Craik C.S. Evolutionary divergence of substrate specificity within the chymotrypsin-like serine protease fold. J. Biol. Chem. 1997;272:29987–29990. doi: 10.1074/jbc.272.48.29987.
- 21. Ratnikov B.I., Cieplak P., Gramatikoff K., Pierce J., Eroshkin A., Igarashi Y. Basis for substrate recognition and distinction by matrix metalloproteinases. Proc. Natl. Acad. Sci. USA. 2014;111:E4148–E4155. doi: 10.1073/pnas.1406134111.
- 22. Fuchs J.E., von Grafenstein S., Huber R.G., Kramer C., Liedl K.R. Substrate-driven mapping of the degradome by comparison of sequence logos. PLoS Comput. Biol. 2013;9:e1003353. doi: 10.1371/journal.pcbi.1003353.
- 23. UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res. 2015;43:D204–D212. doi: 10.1093/nar/gku989.
- 24. Finn R.D., Bateman A., Clements J., Coggill P., Eberhardt R.Y., Eddy S.R. Pfam: the protein families database. Nucleic Acids Res. 2014;42:D222–D230. doi: 10.1093/nar/gkt1223.
- 25. Rawlings N.D., Waller M., Barrett A.J., Bateman A. MEROPS: the database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 2014;42:D503–D509. doi: 10.1093/nar/gkt953.
- 26. Lange P.F., Overall C.M. TopFIND, a knowledgebase linking protein termini with function. Nat. Methods. 2011;8:703–704. doi: 10.1038/nmeth.1669.
- 27. Lange P.F., Huesgen P.F., Overall C.M. TopFIND 2.0--linking protein termini with proteolytic processing and modifications altering protein function. Nucleic Acids Res. 2012;40:D351–D361. doi: 10.1093/nar/gkr1025.
- 28. Fortelny N., Yang S., Pavlidis P., Lange P.F., Overall C.M. Proteome TopFIND 3.0 with TopFINDer and PathFINDer: database and analysis tools for the association of protein termini to pre- and post-translational events. Nucleic Acids Res. 2015;43:D290–D297. doi: 10.1093/nar/gku1012.
- 29. Colaert N., Helsens K., Martens L., Vandekerckhove J., Gevaert K. Improved visualization of protein consensus sequences by iceLogo. Nat. Methods. 2009;6:786–787. doi: 10.1038/nmeth1109-786.
- 30. Knight C.G., Willenbrock F., Murphy G. A novel coumarin-labelled peptide for sensitive continuous assays of the matrix metalloproteinases. FEBS Lett. 1992;296:263–266. doi: 10.1016/0014-5793(92)80300-6.
- 31. Eckhard U., Huesgen P.F., Brandstetter H., Overall C.M. Proteomic protease specificity profiling of clostridial collagenases reveals their intrinsic nature as dedicated degraders of collagen. J. Proteom. 2014;100:102–114. doi: 10.1016/j.jprot.2013.10.004.
- 32. Marino G., Huesgen P.F., Eckhard U., Overall C.M., Schröder W.P., Funk C. Family-wide characterization of matrix metalloproteinases from Arabidopsis thaliana reveals their distinct proteolytic activity and cleavage site specificity. Biochem. J. 2014;457:335–346. doi: 10.1042/BJ20130196.
- 33. Barré O., Dufour A., Eckhard U., Kappelhoff R., Béliveau F., Leduc R. Cleavage specificity analysis of six type II transmembrane serine proteases (TTSPs) using PICS with proteome-derived peptide libraries. PLoS One. 2014;9:e105984. doi: 10.1371/journal.pone.0105984.
- 34. Huesgen P.F., Lange P.F., Rogers L.D., Solis N., Eckhard U., Kleifeld O. LysargiNase mirrors trypsin for protein C-terminal and methylation-site identification. Nat. Methods. 2015;12:55–58. doi: 10.1038/nmeth.3177.
- 35. Zelanis A., Huesgen P.F., Oliveira A.K., Tashima A.K., Serrano S.M.T., Overall C.M. Snake venom serine proteinases specificity mapping by proteomic identification of cleavage sites. J. Proteom. 2015;113:260–267. doi: 10.1016/j.jprot.2014.10.002.
- 36. Kleifeld O., Doucet A., auf dem Keller U., Prudova A., Schilling O., Kainthan R.K. Isotopic labeling of terminal amines in complex samples identifies protein N-termini and protease cleavage products. Nat. Biotechnol. 2010;28:281–288. doi: 10.1038/nbt.1611.
- 37. Kleifeld O., Doucet A., Prudova A., auf dem Keller U., Gioia M., Kizhakkedathu J.N. Identifying and quantifying proteolytic events and the natural N terminome by terminal amine isotopic labeling of substrates. Nat. Protoc. 2011;6:1578–1611. doi: 10.1038/nprot.2011.382.
- 38. Wilson C.H., Indarto D., Doucet A., Pogson L.D., Pitman M.R., McNicholas K. Identifying natural substrates for dipeptidyl peptidases 8 and 9 using terminal amine isotopic labeling of substrates (TAILS) reveals in vivo roles in cellular homeostasis and energy metabolism. J. Biol. Chem. 2013;288:13936–13949. doi: 10.1074/jbc.M112.445841.
- 39. Prudova A., auf dem Keller U., Butler G.S., Overall C.M. Multiplex N-terminome analysis of MMP-2 and MMP-9 substrate degradomes by iTRAQ-TAILS quantitative proteomics. Mol. Cell. Proteom. 2010;9:894–911. doi: 10.1074/mcp.M000050-MCP201.
- 40. Starr A.E., Bellac C.L., Dufour A., Goebeler V., Overall C.M. Biochemical characterization and N-terminomics analysis of leukolysin, the membrane-type 6 matrix metalloprotease (MMP25): chemokine and vimentin cleavages enhance cell migration and macrophage phagocytic activities. J. Biol. Chem. 2012;287:13382–13395. doi: 10.1074/jbc.M111.314179.
- 41. Becker-Pauly C., Barré O., Schilling O., auf dem Keller U., Ohler A., Broder C. Proteomic analyses reveal an acidic prime side specificity for the astacin metalloprotease family reflected by physiological substrates. Mol. Cell. Proteom. 2011;10. doi: 10.1074/mcp.M111.009233.
- 42. auf dem Keller U., Prudova A., Eckhard U., Fingleton B., Overall C.M. Systems-level analysis of proteolytic events in increased vascular permeability and complement activation in skin inflammation. Sci. Signal. 2013;6:rs2. doi: 10.1126/scisignal.2003512.
- 43. Lange P.F., Huesgen P.F., Nguyen K., Overall C.M. Annotating N termini for the human proteome project: N termini and Nα-acetylation status differentiate stable cleaved protein species from degradation remnants in the human erythrocyte proteome. J. Proteome Res. 2014;13:2028–2044. doi: 10.1021/pr401191w.
- 44. Marino G., Eckhard U., Overall C.M. Protein termini and their modifications revealed by positional proteomics. ACS Chem. Biol. 2015;10:1754–1764. doi: 10.1021/acschembio.5b00189.
- 45. Prudova A., Serrano K., Eckhard U., Fortelny N., Devine D.V., Overall C.M. TAILS N-terminomics of human platelets reveals pervasive metalloproteinase-dependent proteolytic processing in storage. Blood. 2014;124:e49–e60. doi: 10.1182/blood-2014-04-569640.
- 46. Eckhard U., Marino G., Abbey S.R., Tharmarajah G., Matthew I., Overall C.M. The human dental pulp proteome and N-terminome: levering the unexplored potential of semitryptic peptides enriched by TAILS to identify missing proteins in the human proteome project in underexplored tissues. J. Proteome Res. 2015;14:3568–3582. doi: 10.1021/acs.jproteome.5b00579.
- 47. Eckhard U., Marino G., Abbey S.R., Matthew I., Overall C.M. TAILS N-terminomic and proteomic datasets of healthy human dental pulp. Data Brief. 2015;5:542–548. doi: 10.1016/j.dib.2015.10.003.
- 48. Wessel D., Flügge U.I. A method for the quantitative recovery of protein in dilute solution in the presence of detergents and lipids. Anal. Biochem. 1984;138:141–143. doi: 10.1016/0003-2697(84)90782-6.
- 49. Chambers M.C., Maclean B., Burke R., Amodei D., Ruderman D.L., Neumann S. A cross-platform toolkit for mass spectrometry and proteomics. Nat. Biotechnol. 2012;30:918–920. doi: 10.1038/nbt.2377.
- 50. Abel M., Iversen K., Planas A., Christensen U. Pre-steady-state kinetics of Bacillus licheniformis 1,3-1,4-beta-glucanase: evidence for a regulatory binding site. Biochem. J. 2003;371:997–1003. doi: 10.1042/BJ20021504.
- 51. Vizcaíno J.A., Côté R.G., Csordas A., Dianes J.A., Fabregat A., Foster J.M. The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res. 2013;41:D1063–D1069. doi: 10.1093/nar/gks1262.