Molecular & Cellular Proteomics : MCP. 2014 Nov 30;14(2):405–417. doi: 10.1074/mcp.O114.041376

Preprocessing Significantly Improves the Peptide/Protein Identification Sensitivity of High-resolution Isobarically Labeled Tandem Mass Spectrometry Data*

Quanhu Sheng ‡,§,**, Rongxia Li ‡,**, Jie Dai ¶,**, Qingrun Li, Zhiduan Su, Yan Guo §, Chen Li, Yu Shyr §, Rong Zeng ‡
PMCID: PMC4350035  PMID: 25435543

Abstract

Isobaric labeling techniques coupled with high-resolution mass spectrometry have been widely employed in proteomic workflows requiring relative quantification. For each high-resolution tandem mass spectrum (MS/MS), isobaric labeling techniques can be used not only to quantify the peptide from different samples by reporter ions, but also to identify the peptide it is derived from. Because the ions related to isobaric labeling may act as noise in database searching, the MS/MS spectrum should be preprocessed before peptide or protein identification. In this article, we demonstrate that the MS/MS spectrum contains many high-frequency, high-abundance isobaric related ions, and that removing these ions, combined with deisotoping and deconvolution during MS/MS preprocessing, significantly improves the peptide/protein identification sensitivity. The user-friendly software package TurboRaw2MGF (v2.0) has been implemented for converting raw TIC data files to mascot generic format files and can be downloaded for free from https://github.com/shengqh/RCPA.Tools/releases as part of the software suite ProteomicsTools. The data have been deposited to the ProteomeXchange with identifier PXD000994.


Mass spectrometry-based proteomics has been widely applied to investigate protein mixtures derived from tissue, cell lysates, or body fluids (1, 2). Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) is the most popular strategy for the analysis of protein/peptide mixtures in shotgun proteomics (3). Large-scale protein/peptide mixtures are separated by liquid chromatography followed by online detection by tandem mass spectrometry. The capabilities of proteomics rely greatly on the performance of the mass spectrometer. With the improvement of MS technology, proteomics has benefited significantly from high resolution and excellent mass accuracy (4). In recent years, based on the higher efficiency of higher energy collision dissociation (HCD), a new “high–high” strategy (high-resolution MS as well as high-resolution MS/MS) has been applied instead of the “high–low” strategy (high-resolution MS, e.g. in the Orbitrap, and low-resolution MS/MS, e.g. in the ion trap) to obtain high quality MS/MS data as well as full MS data in shotgun proteomics. Both full MS scans and MS/MS scans can be performed, and the whole cycle time of MS detection is compatible with the chromatographic time scale (5).

High-resolution measurement is one of the most important features in mass spectrometric applications. In the high–high strategy, high-resolution and accurate spectra are acquired in MS/MS scans as well as full MS scans, which makes isotopic peaks distinguishable from one another and thus enables the precise calculation of charge states and monoisotopic masses. During an LC-MS/MS experiment, a multiply charged precursor ion (peptide) is usually isolated and fragmented, and fragment ions with multiple charge states are then generated and collected. After extraction of peak lists from the original tandem mass spectra, the commonly used search engines (e.g. Mascot (6), Sequest (7)) cannot distinguish isotopic peaks or recognize charge states, so every product ion is considered under all charge state hypotheses during the database search for protein identification. The multiply charged fragment ions and their isotopic cluster peaks can therefore be incorrectly assigned by the search engine, which can cause false peptide identifications. To overcome this issue, the high-resolution MS/MS spectra need to be preprocessed before they are submitted for identification. Two major preprocessing steps are usually applied to high-resolution MS/MS data: deisotoping and deconvolution (8, 9). Deisotoping removes all isotopic peaks except the monoisotopic peak from each isotopic cluster. Deconvolution translates multiply charged ions to singly charged ions and accumulates the intensity of each fragment ion by summing the intensities of all its charge states. After these two preprocessing steps, the resulting spectra are simpler and cleaner and allow more precise database searching and more accurate bioinformatics analysis.
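The deisotoping step described above can be sketched as follows. This is a minimal illustration and not the algorithm used by TurboRaw2MGF: it assumes that each peak already carries a charge assignment, that isotopic peaks are spaced by about 1.00335/z, and that a fixed m/z tolerance is adequate. The function name, the tolerance value, and the peak layout are all hypothetical, and intensity patterns are ignored.

```python
# Approximate mass difference between 13C and 12C in Daltons (assumed spacing).
ISOTOPE_SPACING = 1.00335


def deisotope(peaks, tol=0.01):
    """Keep only the first (monoisotopic) peak of each isotopic cluster.

    peaks: list of (mz, intensity, charge) tuples with charge >= 1.
    tol:   m/z tolerance in Daltons (hypothetical default).
    """
    seen = []  # every visited peak, kept or not, so isotope chains can continue
    kept = []
    for mz, intensity, charge in sorted(peaks):
        spacing = ISOTOPE_SPACING / charge
        # A peak is treated as an isotope if an earlier peak with the same
        # charge sits exactly one isotope spacing below it.
        is_isotope = any(
            prev_charge == charge and abs(mz - prev_mz - spacing) <= tol
            for prev_mz, prev_charge in seen
        )
        if not is_isotope:
            kept.append((mz, intensity, charge))
        seen.append((mz, charge))
    return kept


example_peaks = [
    (500.0, 100.0, 2),
    (500.5017, 60.0, 2),   # first isotope of the 2+ ion at 500.0
    (501.0034, 20.0, 2),   # second isotope of the same ion
    (700.0, 50.0, 1),
]
monoisotopic = deisotope(example_peaks)
```

Here `monoisotopic` keeps the 2+ peak at 500.0 and the 1+ peak at 700.0. A production implementation would typically also check that the isotope intensities follow a plausible pattern before discarding a peak.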

With the capacity to analyze multiple samples simultaneously, stable isotope labeling approaches have been widely used in quantitative proteomics. Stable isotope labeling approaches are categorized as metabolic labeling (SILAC, stable isotope labeling by amino acids in cell culture) and chemical labeling (10, 11). The peptides labeled by the SILAC approach are quantified by precursor ions in full MS spectra, whereas peptides that have been isobarically labeled by chemical means are quantified by reporter ions in MS/MS spectra. There are two similar isobaric chemical labeling methods: (1) isobaric tag for relative and absolute quantification (iTRAQ), and (2) tandem mass tag (TMT) (12, 13). These reagents contain an amino-reactive group that specifically reacts with N-terminal amino groups and epsilon-amino groups of lysine residues to label digested peptides in a typical shotgun proteomics experiment. Four isobaric tag formats are available: TMT two-plex, iTRAQ four-plex, TMT six-plex, and iTRAQ eight-plex (12–16). The number before “plex” denotes the number of samples that can be analyzed simultaneously. Peptides labeled with different isotopic variants of the tag show identical or similar mass and appear as a single peak in full scans. This single peak may be selected for subsequent MS/MS analysis. In an MS/MS scan, the masses of the reporter ions (114 to 117 for iTRAQ four-plex, 113 to 121 for iTRAQ eight-plex, and 126 to 131 for TMT six-plex upon CID or HCD activation) are associated with the corresponding samples, and their intensities represent the relative abundances of the labeled peptides. Meanwhile, the other ions in the MS/MS spectra can be used for peptide identification. Because of the multiplexing capability, isobaric labeling methods combined with bottom-up proteomics have been widely applied for accurate quantification of proteins on a global scale (14, 17–19). Although mostly associated with peptide labeling, these isobaric labeling methods have also been applied at the protein level (20–23).

For the proteomic analysis of isobarically labeled peptides/proteins in the “high–high” MS strategy, the common consensus is that accurate reporter ions contribute to more accurate quantification. However, there is no evidence showing how the ions related to isobaric labeling affect peptide/protein identification or what preprocessing steps should be taken for high-resolution isobarically labeled MS/MS spectra. To demonstrate the effectiveness and importance of preprocessing, we examined how combinations of preprocessing steps improved peptide/protein identification sensitivity in database searching. Several combinations of data preprocessing steps were applied, including deisotoping to keep only monoisotopic peaks, deconvolution of ions with multiple charge states, and preservation of the top 10 peaks in every 100 Dalton mass range. After systematic analysis of high-resolution isobarically labeled spectra, we further processed the spectra and removed interfering ions that were not related to the peptide. Our results suggest that preprocessing of isobarically labeled high-resolution tandem mass spectra significantly improves peptide/protein identification sensitivity.

EXPERIMENTAL PROCEDURES

Sample Preparation

Goto-Kakizaki (GK) rat liver tissue was mixed with SDT-lysis buffer (2% SDS, 0.1 M DTT, and 0.1 M Tris-HCl, pH 7.6) and heated for 5 min at 100 °C. The samples were then cooled to room temperature, sonicated for 60 s at 100 W, and centrifuged at 16,000 × g for 30 min at 20 °C to remove cell debris. The protein concentration was determined by tryptophan fluorescence as described (24). Briefly, 1 μl of sample or tryptophan standard (100 ng/μl) was added to 3 ml of 8 M urea buffer (8 M urea and 20 mM Tris-HCl, pH 7.6). Fluorescence was excited at 295 nm and measured at 350 nm. The slits were set at 10 nm.

Six hundred micrograms of liver tissue from GK rat was digested by the FASP procedure as described (25) with small modifications. Each sample was transferred to a 10k filter (Pall Corporation, Port Washington, NY) and centrifuged at 10,000 × g for 20 min at 20 °C. Then 200 μl of UA buffer (8 M urea and 0.1 M Tris-HCl, pH 8.5) was added and the sample was centrifuged at 10,000 × g for 20 min again. This step was repeated once. The concentrate was then mixed with 100 μl of 50 mM IAA in UA buffer and incubated for an additional 40 min at room temperature in darkness. After that, IAA was removed by centrifugation at 10,000 × g for 20 min. Following two rounds of dilution with 200 μl of UA buffer and centrifugation, 200 μl of 200 mM triethylammonium bicarbonate (TEAB) buffer (pH 8.5) was added and the sample was centrifuged at 10,000 × g for 20 min. This step was repeated four times. Finally, 100 μl of 50 mM TEAB buffer (pH 8.5) and trypsin (1:50, enzyme to protein) were added to the filter, and after 4 h, another 50 μg of trypsin was added. The samples were digested for 20 h at 37 °C and the peptides were collected by centrifugation at 16,000 × g. To increase the yield of peptides, the filter was washed twice with 500 μl of 0.5 M TEAB buffer (pH 8.5). The peptide solutions were dried in a vacuum concentrator.

The trypsin digestion of 100 μg protein from each sample was processed as described elsewhere. iTRAQ labeling was done following the manufacturer's instructions (AB SCIEX, Foster City, CA). Briefly, for each four- or eight-plex experiment, 100 μg of dried peptide mixture powder from each digested sample was reconstituted with 30 μl of 0.5 mM TEAB buffer (pH 8.5). Each peptide solution was labeled at room temperature for 2 h with one iTRAQ reagent vial (four-plex mass tag 114, 115, 116, 117 or eight-plex mass tag 113, 114, 115, 116, 117, 118, 119, 121) previously reconstituted with 70 μl of anhydrous acetonitrile (ACN). After 2 h, 100 μl of ddH2O was added to each tube to quench the iTRAQ reaction, and the tubes were incubated at room temperature for 30 min. The contents of all iTRAQ reagent-labeled sample tubes were combined into one tube for the four-plex and eight-plex experiments, respectively. The labeled samples were then dried down by evaporation in a SpeedVac to obtain a brown pellet. Next, 100 μl of water was added to the tube and the sample was dried completely. Prior to MS analysis, samples were desalted on an Empore C18 47 mm Disk (3M). Just prior to nano-LC, the fractions were resuspended in 20 μl of H2O with 0.1% (v/v) TFA.

LC-MS/MS Analysis

The reverse phase-high performance liquid chromatography (RP-HPLC) separation was achieved on an UltiMate 3000 RSLC nanoLC System (Dionex, now Thermo Fisher Scientific) equipped with a self-packed tip column (75 μm × 240 mm; C18, 1.9 μm) using a 180 min gradient at a flow rate of 150 nl/min. An LTQ-Orbitrap Velos instrument (Thermo Fisher Scientific) was operated in data-dependent mode. MS full scans were acquired in the range m/z 300–2000. The mass spectrometer was set so that each full MS scan was followed by MS/MS of the ten most intense ions with charge ≥ +2, with the following Dynamic Exclusion™ settings: repeat count, 1; repeat duration, 30 s; exclusion duration, 180 s. The normalized collision energy for MS2 was 45.0%. Full MS scans were acquired at a resolution of 30,000 in profile mode and MS/MS scans at a resolution of 7500 in centroid mode, with the lock mass option enabled for the 445.120025 ion. Data were acquired using Xcalibur software.

b/y Free Windows

b/y free windows are two mass windows of a given mass spectrum in which no B ion or Y ion can occur. Assume that the mass of the isobaric tag is M, that trypsin is used as the protease, and that the isobaric tag is attached to both the N terminus of the peptide and lysine (K). For a spectrum with singly charged precursor mass MH+, the b/y free windows can then be calculated as follows. Because only fully tryptic peptides are considered in data analysis, the last amino acid of the peptide is either arginine (R) with mass 156 or lysine with mass 128. Given that glycine (G) is the smallest amino acid with mass 57, the minimum and maximum masses of B and Y ions can be calculated by formulas (1–4):

$$\mathrm{minimum}(B) = M + G + H \tag{1}$$
$$\mathrm{minimum}(Y) = R + H_2O + H \tag{2}$$
$$\mathrm{maximum}(B) = MH^{+} - R - H_2O \tag{3}$$
$$\mathrm{maximum}(Y) = MH^{+} - M - G \tag{4}$$

where H2O is the mass of water and H is the mass of hydrogen. The b/y free window in the low mass range then extends from 0 to minimum (minimum (B), minimum (Y)), and the b/y free window in the high mass range extends from maximum (maximum (B), maximum (Y)) to infinity.
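As a rough illustration, the window calculation can be sketched in a few lines of Python. The expressions for the minimum and maximum B and Y ions below are an assumption: they are written so that they reproduce the nominal values listed in Table I (for example, a minimum B ion of 202 and a maximum Y ion of MH+-201 for iTRAQ4) and may differ in form from the published formulas. The function and constant names are hypothetical.

```python
# Nominal (integer) masses as quoted in the text.
GLYCINE = 57     # smallest residue, assumed at the tagged N terminus
ARGININE = 156   # C-terminal residue that gives the lightest y1 ion
WATER = 18
HYDROGEN = 1


def by_free_windows(tag_mass, precursor_mh):
    """Return the (low, high) b/y free windows as (start, end) tuples."""
    min_b = tag_mass + GLYCINE + HYDROGEN        # tagged N-terminal glycine
    min_y = ARGININE + WATER + HYDROGEN          # untagged C-terminal arginine
    max_b = precursor_mh - ARGININE - WATER      # largest b ion: peptide without the last R
    max_y = precursor_mh - tag_mass - GLYCINE    # largest y ion: peptide without the tagged G
    low = (0, min(min_b, min_y))
    high = (max(max_b, max_y), float("inf"))
    return low, high


# Example with a hypothetical precursor MH+ of 1000.
low4, high4 = by_free_windows(144, 1000)   # iTRAQ4 tag mass from Table I
low8, high8 = by_free_windows(304, 1000)   # iTRAQ8 tag mass from Table I
```

With these inputs both labeling methods give a low window of 0 to 175 and a high window starting at MH+ minus 174, in agreement with Table I.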

Ion Frequency and Abundance Analysis

Only the spectra with precursor charges 2, 3, and 4 were used to detect high frequency ions. The ion frequency and ion abundance distribution in each sample were generated by the software “Raw Ion Frequency Statistic Builder,” which is also a part of ProteomicsTools. The charge, mass to charge ratio (m/z), and abundance of each ion were extracted from each MS/MS spectrum through Thermo's MS File Reader interface. The abundances of the ions in each MS/MS spectrum were normalized to the range [0, 1]. Ions with relative abundance less than 0.01 were discarded. All remaining ions were deconvoluted to the corresponding singly charged ions by formula (5). Ions without charge information were treated as singly charged.

$$m_{\mathrm{single}} = (m/z) \times z - (z - 1) \times H \tag{5}$$

where H is the mass of hydrogen.
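The conversion to singly charged ions can be sketched as below. The sketch assumes the standard relation in which an ion observed at m/z with charge z corresponds to a singly charged mass of (m/z) × z − (z − 1) × H. The numeric value used for H is an assumption, since the text only says "the mass of hydrogen", and the function name is hypothetical.

```python
# Assumed value: monoisotopic mass of a hydrogen atom in Daltons.
HYDROGEN_MASS = 1.00783


def to_singly_charged(mz, charge):
    """Convert an observed m/z to the equivalent singly charged mass.

    Ions without charge information (charge 0) are treated as singly
    charged, as described in the text.
    """
    if charge <= 1:
        return mz
    return mz * charge - (charge - 1) * HYDROGEN_MASS


doubly = to_singly_charged(500.0, 2)      # 2 * 500.0 minus one hydrogen
unknown = to_singly_charged(291.2155, 0)  # no charge information: unchanged
```

A full deconvolution step would additionally merge ions that map to the same singly charged mass and sum their intensities.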

Ions in different deconvoluted spectra whose masses differed by less than 20 parts per million (ppm) were considered identical. The ion frequency and the average relative abundance of each ion were calculated from all the MS/MS spectra in the sample. Ions with frequency larger than 0.3 and average relative abundance larger than 0.05 were defined as high frequency ions and classified into five categories: “Rep+,” “Label+,” “Y1,” “b/y free,” and “Unknown.” “Rep+” denotes that an ion is a reporter ion. “Label+” denotes that an ion is an isobaric tag ion with both reporter group and balance group. “Y1” denotes that an ion is a first Y series ion. Because trypsin was used in the sample preparation, a Y1 ion was produced from either lysine (K) or arginine (R). “b/y free” denotes that the mass of the ion is located in the b/y free windows of that spectrum. All other ions belonged to the “Unknown” category. An ion within one of the first four categories (“Rep+,” “Label+,” “Y1,” and “b/y free”) was considered annotated. For each deconvoluted tandem mass spectrum (forward spectrum), a backward spectrum was generated by subtracting the mass of each forward ion from the mass of the precursor. The backward ions were filtered and annotated in the same fashion as the forward ions, except that ions with mass equal to “Label+” were marked as “Precursor-Label+.” “Precursor-Label+” denotes a precursor ion without the isobaric tag. The ions annotated as Rep+, Label+, and Precursor-Label+ are not related to the peptide and can therefore be confidently removed during data preprocessing. The ions annotated as b/y free in the low mass range are very likely not related to the peptide either. However, it is still possible that those ions are actually multiply charged ions that lack charge information in the spectrum.
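The frequency and abundance thresholds described above can be illustrated with a short sketch. It is not the implementation in "Raw Ion Frequency Statistic Builder": the greedy grouping of sorted masses against the first mass of each group is an assumed simplification of the 20 ppm matching rule, and the function name and data layout are hypothetical.

```python
def high_frequency_ions(spectra, ppm=20.0, min_freq=0.3, min_mean=0.05, min_rel=0.01):
    """Find ions that recur across spectra.

    spectra: list of spectra, each a list of (singly_charged_mass,
             relative_abundance) pairs with abundances already in [0, 1].
    Returns a list of (mean_mass, frequency, mean_relative_abundance).
    """
    # Discard low-abundance ions and pool the rest, remembering the source spectrum.
    ions = sorted(
        (mass, rel, index)
        for index, spectrum in enumerate(spectra)
        for mass, rel in spectrum
        if rel >= min_rel
    )
    # Greedy grouping: an ion joins the current group if it lies within
    # `ppm` of the first (lowest) mass in that group.
    groups = []
    for mass, rel, index in ions:
        if groups and (mass - groups[-1][0][0]) / groups[-1][0][0] * 1e6 <= ppm:
            groups[-1].append((mass, rel, index))
        else:
            groups.append([(mass, rel, index)])
    result = []
    for members in groups:
        frequency = len({index for _, _, index in members}) / len(spectra)
        mean_rel = sum(rel for _, rel, _ in members) / len(members)
        if frequency > min_freq and mean_rel > min_mean:
            mean_mass = sum(mass for mass, _, _ in members) / len(members)
            result.append((mean_mass, frequency, mean_rel))
    return result


# Four toy spectra; the values are invented for illustration only.
toy_spectra = [
    [(116.1111, 1.0), (291.2155, 0.2), (500.0, 0.005)],
    [(116.1110, 0.8), (291.2156, 0.3)],
    [(116.1112, 0.9), (650.3, 0.5)],
    [(116.1111, 0.7), (175.119, 0.04)],
]
frequent = high_frequency_ions(toy_spectra)
```

In this toy input only the ions near 116.111 (present in all four spectra) and 291.2155 (present in two of four) pass both thresholds. The same routine could be applied to backward spectra by first replacing each mass with the precursor mass minus that mass.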

Data Preprocessing

The tandem mass spectra were extracted by TurboRaw2MGF (v1.3.4) for database searching. Four fixed criteria were used to filter out low quality spectra: (1) the required precursor mass range was 400 to 5000 Daltons, (2) the minimum ion absolute abundance was 1.0, (3) the minimum ion count of a spectrum was 15, and (4) the minimum total ion absolute abundance of a spectrum was 100. Four processing options were also provided in TurboRaw2MGF: deisotoping to keep monoisotopic mass peaks, deconvolution of ions with multiple charge states, preservation of the top 10 peaks in every 100 Dalton mass range, and removal of the ions that may not be related to the peptide. The spectra that passed the fixed criteria and were processed with a combination of the four options were saved in mascot generic format for further database searching.
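The fixed criteria and the top 10 option can be sketched as follows. This is an illustration only. Two details are assumptions rather than facts from the text: that the ion count and total abundance are evaluated after low-abundance ions are dropped, and that the 100 Dalton ranges are fixed bins starting at multiples of 100. The function names are hypothetical.

```python
from collections import defaultdict


def passes_fixed_criteria(precursor_mass, peaks, min_mass=400.0, max_mass=5000.0,
                          min_abundance=1.0, min_ion_count=15, min_total=100.0):
    """Apply the four fixed spectrum quality criteria to (mz, abundance) peaks."""
    kept = [(mz, ab) for mz, ab in peaks if ab >= min_abundance]
    return (
        min_mass <= precursor_mass <= max_mass
        and len(kept) >= min_ion_count
        and sum(ab for _, ab in kept) >= min_total
    )


def keep_top_peaks(peaks, top_n=10, window=100.0):
    """Keep the `top_n` most abundant peaks in every `window`-wide m/z bin."""
    bins = defaultdict(list)
    for mz, ab in peaks:
        bins[int(mz // window)].append((mz, ab))
    kept = []
    for members in bins.values():
        kept.extend(sorted(members, key=lambda peak: peak[1], reverse=True)[:top_n])
    return sorted(kept)


# Twenty invented peaks between m/z 100 and 119 with abundances 1 to 20.
toy_peaks = [(100.0 + i, float(i + 1)) for i in range(20)]
good_spectrum = passes_fixed_criteria(800.0, toy_peaks)
light_precursor = passes_fixed_criteria(300.0, toy_peaks)
top_peaks = keep_top_peaks(toy_peaks)
```

For the toy spectrum, all twenty peaks fall in a single bin, so `top_peaks` retains the ten most abundant ones (m/z 110 to 119).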

Database Searching

Five engines were used for database searching: Mascot (v2.2.2) (6), Comet (2014.01 rev. 1) (26), MyriMatch (v2.2.140) (27), OMSSA (v2.1.9) (28), and X! Tandem (2013.09.01.1) (29). All MS/MS spectra were searched against a composite target-decoy rat Uniprot database (Version 20120222), in which each protein sequence was followed by its reversed amino acid sequence. Trypsin was set as the protease. Carbamidomethylation on cysteine (+57.021464) and iTRAQ labeling on the N terminus and lysine were set as fixed modifications. Oxidation on methionine (+15.994915) was set as a variable modification. One missed cleavage site was allowed. The tolerances for peptides and fragment ions were set at 10 ppm and 0.02 Daltons, respectively. SearchGUI (30) was used for MyriMatch and OMSSA searching. BuildSummary (31) was used to generate confident peptide and protein lists with a false discovery rate ≤ 0.01 at both the peptide and protein levels.
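A composite target-decoy database of the kind described, with every protein followed by its reversed sequence, can be built with a few lines of Python. This is a generic sketch, not the tool used by the authors; the function name, the decoy prefix, and the example entry are hypothetical.

```python
def add_reversed_decoys(proteins, decoy_prefix="REVERSED_"):
    """Interleave each target protein with its reversed decoy.

    proteins: list of (name, sequence) pairs.
    Returns a list in which every target is followed by its decoy.
    """
    composite = []
    for name, sequence in proteins:
        composite.append((name, sequence))
        composite.append((decoy_prefix + name, sequence[::-1]))
    return composite


# Invented example entry for illustration.
composite_db = add_reversed_decoys([("PROT1", "MKTAYIAK")])
```

Searching against such a database lets the number of decoy matches above a score threshold serve as an estimate of the number of false target matches, which is the basis for false discovery rate filtering.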

Software Development

We implemented our preprocessing steps in a user-friendly software package named TurboRaw2MGF (v2.0). The previous version of TurboRaw2MGF was developed for low-resolution tandem mass spectra and was integrated into the package ProtQuantSuite (32). TurboRaw2MGF (v2.0) was developed using the C# programming language and was compiled in the Microsoft Visual Studio 2012 Professional Edition. The software is fully compatible with Windows-based operating systems with dotNET framework v4.5. TurboRaw2MGF (v2.0) and its source code can be downloaded freely from https://github.com/shengqh/RCPA.Tools/releases/. The manual of TurboRaw2MGF (v2.0) can be viewed at https://github.com/shengqh/RCPA.Tools/wiki/.

Data Availability

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (33) via the PRIDE partner repository with the data set identifier PXD000994 and DOI 10.6019/PXD000994.

To access the data please visit: http://tinyurl.com/pdbkesj

Username: reviewer06796@ebi.ac.uk

Password: jWjYoiuT

RESULTS

Isobaric Related Mass Range

Table I illustrates some important ion properties in isobaric labeling methods. For iTRAQ4 spectra, the mass of a Label+ ion is within the low mass b/y free window, and the mass of a Precursor-Label+ ion is also within the high mass b/y free window. The isobaric related mass ranges include both low and high b/y free windows. For iTRAQ8 spectra, the mass of a Label+ ion is not within the low mass b/y free window and the mass of a Precursor-Label+ ion is also not within the high mass b/y free window. The isobaric related mass ranges not only include both low and high b/y free windows but also include the mass range around Label+ ion and Precursor-Label+ ion within a specific tolerance, which was 20 ppm in our study.

Table I. Ion characteristics of isobaric labeling methods.

| Property | iTRAQ4 | iTRAQ8 |
|---|---|---|
| Rep+ (a) | 114–117 | 113–119, 121 |
| Label+ (b) | 145 | 305 |
| Isobaric ion mass | 144 | 304 |
| Minimum B ion | 202 | 362 |
| Minimum Y ion | 175 | 175 |
| Low mass b/y free window | 0∼175 | 0∼175 |
| Maximum B ion | MH+-174 | MH+-174 |
| Maximum Y ion | MH+-201 | MH+-361 |
| High mass b/y free window | MH+-174∼INF | MH+-174∼INF |
| Precursor-Label+ (c) | MH+-144 | MH+-304 |

a Rep+: reporter ions.

b Label+: isobaric tag ion.

c Precursor-Label+: the precursor ion without the isobaric tag.
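A minimal sketch of how the isobaric related mass ranges could be used to flag ions in a deconvoluted spectrum is given below. The window limits come from Table I, the Label+ mass of 305.2095 from Table III, and the tag mass of 304.1997 from the backward ion analysis. Whether the window boundaries are inclusive is an assumption, and the function name and defaults are hypothetical.

```python
def is_isobaric_related(mass, precursor_mh, label_ion, tag_mass,
                        low_end=175.0, high_offset=174.0, ppm=20.0):
    """Return True if a singly charged ion mass falls in an isobaric related range.

    The ranges are the low and high b/y free windows plus a `ppm` tolerance
    around the Label+ ion and the Precursor-Label+ ion.
    """
    def near(target):
        return abs(mass - target) / target * 1e6 <= ppm

    return (
        mass < low_end                          # low mass b/y free window
        or mass > precursor_mh - high_offset    # high mass b/y free window
        or near(label_ion)                      # Label+
        or near(precursor_mh - tag_mass)        # Precursor-Label+
    )


# iTRAQ8 example with a hypothetical precursor MH+ of 1500.0.
masses = [114.1107, 305.2095, 600.3, 1195.8003, 1400.0]
flags = [is_isobaric_related(m, 1500.0, 305.2095, 304.1997) for m in masses]
cleaned = [m for m, flagged in zip(masses, flags) if not flagged]
```

In this example only the ion at 600.3 is retained. For iTRAQ4 the two tolerance checks are redundant, because Label+ and Precursor-Label+ already lie inside the b/y free windows.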

Ion Frequency and Abundance

Tables II and III show the high frequency forward ions in iTRAQ4 and iTRAQ8 tandem mass spectra respectively. Almost all high frequency forward ions in iTRAQ4 tandem mass spectra were annotated, except 429.0888. Even with the majority of high frequency ions annotated, there were still more ions left unannotated in the iTRAQ8 tandem mass spectra than in iTRAQ4 tandem mass spectra.

Table II. High frequency ions in iTRAQ4 tandem mass spectra.

Rep+: singly charged reporter ion, Label+: singly charged isobaric tag ion, Y1(R): y1 ion from peptide with C-terminal amino acid R, Y1(K): y1 ion from peptide with C-terminal amino acid K, b/y free: ion in the b/y free windows.

Charge Ion Count Frequency Mean(a) S.D.(b) Median(c) Annotation
2 116.1111 37123 0.997 0.837 0.274 1 Rep+
115.1078 37101 0.9964 0.744 0.243 0.844 Rep+
117.1144 37090 0.9961 0.799 0.269 0.934 Rep+
114.1107 37076 0.9957 0.751 0.253 0.858 Rep+
145.108 36251 0.9735 0.206 0.114 0.194 Label+
291.2155 35287 0.9477 0.238 0.183 0.201 Y1(K)
110.0712 34004 0.9132 0.15 0.196 0.078 b/y free
175.119 33280 0.8938 0.2 0.202 0.111 Y1(R)
120.0807 25310 0.6797 0.149 0.194 0.077 b/y free
158.0924 23172 0.6223 0.076 0.061 0.061 b/y free
429.0889 15170 0.4074 0.304 0.378 0.087 Unknown
3 116.111 23680 0.9917 0.433 0.256 0.383 Rep+
115.1077 23666 0.9911 0.399 0.243 0.347 Rep+
117.1144 23644 0.9902 0.41 0.244 0.363 Rep+
114.1107 23639 0.99 0.387 0.234 0.339 Rep+
291.2155 22582 0.9457 0.294 0.247 0.225 Y1(K)
145.1079 22367 0.9367 0.127 0.092 0.103 Label+
110.0712 21007 0.8798 0.285 0.309 0.149 b/y free
175.119 19229 0.8053 0.133 0.119 0.098 Y1(R)
120.0807 17309 0.7249 0.22 0.266 0.105 b/y free
429.0888 13624 0.5706 0.261 0.329 0.098 Unknown
136.0756 13377 0.5602 0.153 0.225 0.06 b/y free
258.1936 9096 0.3809 0.08 0.082 0.054 Unknown
404.2996 7790 0.3262 0.167 0.225 0.063 Unknown
101.0709 7675 0.3214 0.078 0.075 0.05 b/y free
4 115.1077 3570 0.938 0.123 0.089 0.101 Rep+
116.111 3566 0.9369 0.131 0.089 0.112 Rep+
117.1144 3552 0.9333 0.125 0.085 0.107 Rep+
114.1107 3533 0.9283 0.118 0.083 0.099 Rep+
291.2155 3287 0.8636 0.178 0.197 0.103 Y1(K)
110.0712 2770 0.7278 0.216 0.257 0.103 b/y free
120.0808 2314 0.608 0.181 0.238 0.082 b/y free
429.0888 1976 0.5192 0.255 0.333 0.087 Unknown
163.1188 1659 0.4359 0.249 0.333 0.062 b/y free
136.0756 1594 0.4188 0.15 0.221 0.056 b/y free

a mean of relative abundance.

b standard deviation of relative abundance.

c median of relative abundance.

Table III. High frequency ions in iTRAQ8 tandem mass spectra.

Rep+: singly charged reporter ion, Label+: singly charged isobaric tag ion, Y1(R): y1 ion from peptide with C-terminal amino acid R, Y1(K): y1 ion from peptide with C-terminal amino acid K, b/y free: ion in the b/y free window.

Charge Ion Count Frequency Mean(a) S.D.(b) Median(c) Annotation
2 115.1078 28359 0.9886 0.608 0.303 0.725 Rep+
119.1148 28350 0.9883 0.642 0.322 0.761 Rep+
114.1107 28346 0.9881 0.642 0.32 0.765 Rep+
118.1115 28342 0.988 0.646 0.316 0.776 Rep+
117.1144 28327 0.9875 0.661 0.337 0.782 Rep+
116.1111 28325 0.9874 0.589 0.312 0.653 Rep+
113.1074 28271 0.9855 0.599 0.3 0.713 Rep+
121.1215 28268 0.9854 0.626 0.321 0.742 Rep+
201.1842 27382 0.9545 0.27 0.154 0.273 Unknown
203.1837 26311 0.9172 0.16 0.087 0.163 Unknown
219.1944 26252 0.9152 0.145 0.083 0.136 Unknown
160.088 26010 0.9067 0.145 0.165 0.087 b/y free
143.1008 25618 0.893 0.084 0.042 0.085 b/y free
141.0939 25438 0.8868 0.091 0.042 0.093 b/y free
161.1106 25354 0.8838 0.082 0.062 0.076 b/y free
305.2095 25064 0.8737 0.103 0.06 0.094 Label+
110.0713 24882 0.8674 0.186 0.215 0.106 b/y free
147.1092 24818 0.8652 0.08 0.036 0.08 b/y free
159.1038 24575 0.8567 0.1 0.1 0.083 b/y free
221.1944 24406 0.8508 0.085 0.047 0.08 Unknown
162.0947 24395 0.8504 0.095 0.108 0.06 b/y free
163.0982 23646 0.8243 0.147 0.155 0.094 b/y free
175.119 23383 0.8151 0.26 0.277 0.13 Y1(R)
205.1905 23329 0.8133 0.084 0.041 0.086 Unknown
163.1109 21118 0.7362 0.097 0.082 0.084 b/y free
451.3162 19723 0.6875 0.114 0.099 0.09 Y1(K)
145.1021 19367 0.6751 0.093 0.048 0.094 b/y free
120.0807 18968 0.6612 0.21 0.251 0.111 b/y free
136.0756 15912 0.5547 0.112 0.159 0.054 b/y free
418.2941 14626 0.5099 0.189 0.235 0.082 Unknown
429.0889 12637 0.4405 0.288 0.346 0.11 Unknown
158.0923 12555 0.4377 0.11 0.078 0.096 b/y free
112.0869 12275 0.4279 0.062 0.039 0.058 b/y free
3 114.1107 22905 0.9831 0.262 0.166 0.225 Rep+
117.1144 22904 0.983 0.268 0.169 0.23 Rep+
115.1077 22897 0.9827 0.248 0.161 0.211 Rep+
118.1115 22886 0.9823 0.264 0.169 0.225 Rep+
119.1147 22865 0.9814 0.263 0.167 0.225 Rep+
113.1073 22864 0.9813 0.248 0.161 0.211 Rep+
121.1215 22846 0.9806 0.254 0.164 0.216 Rep+
116.1111 22831 0.9799 0.237 0.158 0.201 Rep+
219.1944 22189 0.9524 0.165 0.103 0.151 Unknown
201.1841 21927 0.9411 0.142 0.089 0.128 Unknown
305.2095 21599 0.927 0.12 0.075 0.107 Label+
451.3159 21094 0.9054 0.344 0.291 0.263 Y1(K)
221.1943 20777 0.8918 0.095 0.059 0.086 Unknown
203.1837 20343 0.8731 0.084 0.051 0.075 Unknown
143.1011 19850 0.852 0.06 0.036 0.052 b/y free
147.1093 19684 0.8448 0.065 0.037 0.057 b/y free
160.0879 19452 0.8349 0.111 0.126 0.069 b/y free
110.0712 19252 0.8263 0.257 0.3 0.123 b/y free
163.098 18737 0.8042 0.103 0.115 0.065 b/y free
141.094 17633 0.7568 0.064 0.039 0.056 b/y free
120.0807 17103 0.7341 0.253 0.292 0.12 b/y free
175.119 16820 0.7219 0.118 0.098 0.09 Y1(R)
323.2203 15042 0.6456 0.095 0.137 0.051 Unknown
191.1294 14861 0.6378 0.087 0.111 0.052 Unknown
418.294 14369 0.6167 0.203 0.249 0.091 Unknown
188.1196 14347 0.6158 0.088 0.113 0.051 Unknown
153.1084 12205 0.5238 0.183 0.248 0.078 b/y free
136.0756 11924 0.5118 0.174 0.25 0.065 b/y free
145.1078 11454 0.4916 0.069 0.039 0.061 b/y free
429.0889 10728 0.4604 0.219 0.293 0.082 Unknown
102.0549 10138 0.4351 0.089 0.088 0.058 b/y free
145.1023 10108 0.4338 0.071 0.045 0.061 b/y free
101.0709 7528 0.3231 0.078 0.074 0.053 b/y free
404.2781 7393 0.3173 0.168 0.229 0.065 Unknown
4 114.1107 3901 0.8978 0.084 0.051 0.074 Rep+
117.1144 3881 0.8932 0.087 0.057 0.076 Rep+
219.1944 3871 0.8909 0.095 0.06 0.086 Unknown
118.1115 3868 0.8902 0.085 0.058 0.073 Rep+
113.1073 3867 0.89 0.081 0.051 0.07 Rep+
119.1148 3853 0.8868 0.085 0.055 0.074 Rep+
451.3159 3851 0.8863 0.333 0.299 0.234 Y1(K)
121.1215 3833 0.8822 0.083 0.052 0.073 Rep+
115.1077 3831 0.8817 0.08 0.053 0.068 Rep+
116.1111 3776 0.869 0.076 0.051 0.066 Rep+
305.2096 3643 0.8384 0.082 0.059 0.068 Label+
221.1943 3282 0.7554 0.057 0.033 0.051 Unknown
201.1842 3256 0.7494 0.063 0.045 0.054 Unknown
120.0808 2995 0.6893 0.238 0.27 0.117 b/y free
110.0713 2600 0.5984 0.154 0.198 0.076 b/y free
429.0889 2444 0.5625 0.227 0.289 0.097 Unknown
418.2936 2086 0.4801 0.142 0.204 0.06 Unknown
452.3188 2013 0.4633 0.069 0.046 0.058 Unknown
153.1084 2000 0.4603 0.214 0.285 0.087 b/y free
136.0755 1783 0.4104 0.154 0.21 0.062 b/y free
102.0549 1611 0.3708 0.08 0.078 0.056 b/y free

a mean of relative abundance.

b standard deviation of relative abundance.

c median of relative abundance.

For backward ions, only 144.1 (frequency = 0.3316, median of abundance = 0.207) from iTRAQ4 tandem mass spectra with double precursor charge and 304.1997 (frequency = 0.380, median of abundance = 0.199) from iTRAQ8 tandem mass spectra with double precursor charge passed the criteria. Both ions were annotated as Precursor-Label+.

In addition, the frequency and abundance of reporter ions in both the iTRAQ4 and iTRAQ8 data sets decreased as the precursor charge increased.

Identification Sensitivity Improvement

We evaluated how combinations of preprocessing steps affected the peptide/protein identification sensitivity at the same peptide/protein false discovery rate of 0.01. Table IV lists the 16 methods, each a different combination of the five processing options used in data preprocessing.

Table IV. 16 preprocessing methods with different combinations of three preprocessing steps.
Method Deisotoping
Top 10 Remove Isobaric Ions
Deconvolution Lowc Labelb Higha
1
2 +
3 +
4 + +
5 +
6 +
7 +
8 + + +
9 + +
10 + +
11 + +
12 + + + +
13 + + +
14 + + +
15 + + +
16 + + + + +

a b/y free window in high mass range.

b reporter and isobaric tag ions.

c b/y free window in low mass range.

Fig. 1 illustrates the identification results from the iTRAQ4 and iTRAQ8 data sets using five search engines. The bigger the point of a method in the graph, the more identifications that method achieved with the same engine and the same isobaric labeling method. The red circle indicates the preprocessing method that achieved the most identifications among all 16 methods. In the iTRAQ4 data set, Mascot, MyriMatch, OMSSA, and X! Tandem achieved the most spectrum, peptide, and two-hit protein identifications when isobaric related ions were preprocessed, although the top-performing method differed between engines. In the iTRAQ8 data set, only Mascot, OMSSA, and X! Tandem achieved the most two-hit protein identifications when isobaric related ions were preprocessed. Preprocessing did not significantly improve the Comet identification sensitivity in either the iTRAQ4 or the iTRAQ8 data set.

Fig. 1.


Identification improvement rank of 16 preprocessing methods in five searching engines and two isobaric labeling approaches. The size of spot indicates the rank of method based on identification performance. The bigger the spot, the better the identification performance. The red circle indicates the best performance method in the same identification level, same engine, and same isobaric labeling approach. Mascot, MyriMatch, OMSSA, and X! Tandem achieved the best two-hit protein identification with preprocessing isobaric related ions in iTRAQ4 data set. Mascot, OMSSA, and X! Tandem achieved the best two-hit protein identification with preprocessing isobaric related ions in iTRAQ8 data set. The preprocessing considering isobaric related ions did not significantly improve the Comet identification sensitivity in both iTRAQ4 and iTRAQ8 data sets.

Fig. 2 illustrates the identification improvement of 15 preprocessing methods compared with non-preprocessing methods in iTRAQ4 and iTRAQ8 data sets. Among all five search engines, Mascot identification sensitivity was significantly improved by most of the preprocessing methods. The identification sensitivity of MyriMatch, OMSSA, and X! Tandem was moderately improved by some of the preprocessing methods. The identification sensitivity of Comet was not improved by most of the preprocessing methods. The detailed identification summary was also provided as supplemental Table S1–S10.

Fig. 2.


The spectrum/peptide/two-hit protein identification improvement percentage of 16 preprocessing methods in five searching engines and two isobaric labeling approaches. Mascot achieved most identification improvement among five engines while Comet achieved smallest identification improvement.

Comparing method 2 to method 1 in Tables V and VI indicates that deisotoping and deconvolution significantly improved the Mascot spectrum identification for iTRAQ4 and iTRAQ8, from 16,442 to 18,286 (increased 11.2%) and from 8817 to 10,219 (increased 15.9%), respectively. Comparing method 3 to method 1 shows that keeping the top 10 ions in each 100 Dalton window decreased the Mascot identification sensitivity for the iTRAQ4 data set but increased the identification sensitivity for the iTRAQ8 data set. Identified spectrum counts were moderately increased for iTRAQ4 (from 16,442 to 17,912, increased 8.9%) and significantly increased for iTRAQ8 (from 8817 to 12,012, increased 36.2%) by removing isobaric tag ions and the ions in the low mass range b/y free window (comparing method 5 to method 1). Comparing methods 5, 6, and 7 to method 1 indicates that removing any one of the three isobaric related ion types improved Mascot identification sensitivity in both the iTRAQ4 and iTRAQ8 data sets, except for the ions in the high mass range b/y free window in the iTRAQ4 data set. Finally, comparing method 10 to method 1 in Table V indicates that deisotoping, deconvolution, and removing isobaric ions improved the Mascot spectrum identification from 16,442 to 19,118 (increased 16.3%), the peptide identification from 6275 to 7148 (increased 13.9%), and the two-hit protein identification from 950 to 1013 (increased 6.6%) in the iTRAQ4 data set. Comparing method 16 to method 1 in Table VI indicates that deisotoping, deconvolution, and removing all possible isobaric related ions improved the Mascot spectrum identification from 8817 to 13,240 (increased 50.2%), the peptide identification from 3349 to 4671 (increased 39.5%), and the two-hit protein identification from 612 to 766 (increased 25.2%) in the iTRAQ8 data set.
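The percentages quoted above follow directly from the Mascot counts for method 1 and the best method in each data set, as the short calculation below shows. The dictionary keys and the helper name are illustrative; the counts are the ones stated in the text.

```python
def percent_increase(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# (method 1 count, best-method count) for Mascot, as quoted in the text.
mascot_counts = {
    "iTRAQ4 spectra": (16442, 19118),
    "iTRAQ4 peptides": (6275, 7148),
    "iTRAQ4 two-hit proteins": (950, 1013),
    "iTRAQ8 spectra": (8817, 13240),
    "iTRAQ8 peptides": (3349, 4671),
    "iTRAQ8 two-hit proteins": (612, 766),
}

improvements = {
    name: round(percent_increase(before, after), 1)
    for name, (before, after) in mascot_counts.items()
}

for name, value in improvements.items():
    print(f"{name}: +{value}%")
```

The computed values (16.3%, 13.9%, and 6.6% for iTRAQ4; 50.2%, 39.5%, and 25.2% for iTRAQ8) match the figures reported in the paragraph.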

Table IV. Identification result from iTRAQ4 dataset using Mascot.
Method; Deisotoping/Deconvolution; Top 10; Remove Isobaric Ions: Low (c), Label (b), High (a); Spectrum; Peptide; Two-hits Protein
1 16442 6275 950
2 + 18286 6876 992
3 + 15856 6059 931
4 + + 18040 6757 989
5 + 17912 6752 989
6 + 17299 6614 973
7 + 16055 6110 931
8 + + + 16611 6268 959
9 + + 18794 6964 1004
10d + + 19118 7148 1013
11 + + 18169 6787 982
12 + + + + 18775 6918 1000
13 + + + 19143 7099 1011
14 + + + 19056 7100 1013
15 + + + 17735 6617 969
16 + + + + + 19313 7114 1012

a b/y free window in high mass range.

b reporter and isobaric tag ions.

c b/y free window in low mass range.

d best method identified most two-hits proteins and then most spectra.

Table V. Identification result from iTRAQ8 dataset using Mascot.
Method; Deisotoping/Deconvolution; Top 10; Remove Isobaric Ions: Low (c), Label (b), High (a); Spectrum; Peptide; Two-hits Protein
1 8817 3349 612
2 + 10219 3793 657
3 + 9280 3508 634
4 + + 10596 3934 674
5 + 12012 4356 732
6 + 10677 3978 687
7 + 9393 3557 634
8 + + + 12403 4464 736
9 + + 12962 4594 756
10 + + 11951 4361 721
11 + + 10831 3994 671
12 + + + + 13178 4671 763
13 + + + 13092 4624 759
14 + + + 12339 4496 733
15 + + + 11242 4141 690
16d + + + + + 13240 4671 766

a b/y free window in high mass range.

b reporter and isobaric tag ions.

c b/y free window in low mass range.

d best method identified most two-hits proteins and then most spectra.
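
The percentage improvements quoted for Mascot can be recomputed directly from the iTRAQ4 and iTRAQ8 tables above. The following is a minimal sketch of that arithmetic; the names `ITRAQ4`, `ITRAQ8`, `improvement_pct`, and `method_gain` are illustrative and not part of the published software, and only a few table rows are copied in.

```python
# (spectrum, peptide, two-hit protein) counts copied from the Mascot tables.
# Only the rows discussed in the text are included here.
ITRAQ4 = {1: (16442, 6275, 950), 2: (18286, 6876, 992), 10: (19118, 7148, 1013)}
ITRAQ8 = {1: (8817, 3349, 612), 2: (10219, 3793, 657), 16: (13240, 4671, 766)}


def improvement_pct(baseline, processed):
    """Relative gain of `processed` over `baseline`, in percent."""
    return 100.0 * (processed - baseline) / baseline


def method_gain(table, method, baseline_method=1):
    """Gain of one method over the baseline for all three identification levels."""
    return tuple(
        round(improvement_pct(b, p), 1)
        for b, p in zip(table[baseline_method], table[method])
    )


print(method_gain(ITRAQ4, 10))  # (16.3, 13.9, 6.6)
print(method_gain(ITRAQ8, 16))  # (50.2, 39.5, 25.2)
```

The gain is always expressed relative to method 1 (no preprocessing), which is why the same absolute increase in identified spectra corresponds to a larger percentage in the smaller iTRAQ8 data set.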

Mascot Score Improvement by Data Preprocessing

We evaluated how the Mascot peptide identification scores were improved by preprocessing tandem mass spectra before database searching. The scores of the peptide-spectrum matches identified by methods 1 and 10 in the iTRAQ4 data set and by methods 1 and 16 in the iTRAQ8 data set were extracted (see supplemental Table S11). Fig. 3 indicates that data preprocessing before database searching improved the identification scores of the majority of spectra in both the iTRAQ4 and iTRAQ8 data sets. A p value of 2.2e-16 from the Wilcoxon rank sum test indicates that the score improvement in the iTRAQ8 data set was significantly higher than in the iTRAQ4 data set.

Fig. 3.

Mascot score improvement after preprocessing tandem mass spectra. Both the top two density plots and the bottom two violin plots indicate that the majority of the spectra gained score improvement with data preprocessing in both the iTRAQ4 and iTRAQ8 data sets. A p value of 2.2e-16 from the Wilcoxon rank sum test indicates that the score improvement in the iTRAQ8 data set was significantly higher than in the iTRAQ4 data set.
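
To illustrate the kind of comparison described above, the following is a minimal sketch of a two-sided Wilcoxon rank sum (Mann-Whitney U) test using the normal approximation with a tie correction and no continuity correction. It is not the code used in the study: the function name `rank_sum_test` and the score-improvement values in the usage lines are hypothetical, and statistical packages may report slightly different p values because they apply exact or continuity-corrected variants.

```python
import math


def rank_sum_test(x, y):
    """Two-sided rank sum test (normal approximation, tie-corrected).

    Returns (U statistic for x, z score, p value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of 1-based ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all values identical: no evidence of a shift
        return u, 0.0, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, z, p


# Hypothetical per-spectrum score improvements (preprocessed minus raw).
itraq4_delta = [1.0, 2.5, 0.5, 3.0, 1.5, 2.0]
itraq8_delta = [6.0, 7.5, 5.0, 9.0, 8.0, 4.5]
u, z, p = rank_sum_test(itraq8_delta, itraq4_delta)
print(u, round(z, 2), p < 0.05)
```

A rank-based test is a natural choice here because search engine score differences are not expected to be normally distributed.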

C-terminal Peptide Identification

Because the tryptic peptide generated from the protein carboxyl terminus (C-terminal peptide) usually does not follow the assumption that the Y1 ion is either Y1(K) or Y1(R), which we use for calculating the b/y free window, we checked how those peptides were identified before and after data preprocessing. The scores of the C-terminal peptides identified by methods 1 and 10 in the iTRAQ4 data set and by methods 1 and 16 in the iTRAQ8 data set were extracted (see supplemental Table S12). In Fig. 4, the top two Venn diagrams indicate that preprocessing also increases C-terminal peptide identification sensitivity in both the iTRAQ4 and iTRAQ8 data sets, and the bottom two scatter plots indicate that the Mascot scores of the majority of commonly identified C-terminal peptides also increased after preprocessing.

Fig. 4.

C-terminal peptide identification improvement in iTRAQ4 and iTRAQ8 data sets after preprocessing tandem mass spectra. The top two Venn diagrams indicate that preprocessing also increased C-terminal peptide identification sensitivity in both the iTRAQ4 and iTRAQ8 data sets. The bottom two scatter plots indicate that the Mascot scores of the majority of commonly identified C-terminal peptides also increased after preprocessing.

DISCUSSION

We annotated the high frequency ions in isobarically labeled tandem mass spectra. The majority of high frequency ions in the iTRAQ4 and iTRAQ8 data sets could be annotated as reporter ions (Rep+), isobaric tag ions (Label+), Y1 ions, or ions in the b/y free window. More unannotated ions were observed in the iTRAQ8 data set than in the iTRAQ4 data set. This may be caused by the more complex iTRAQ8 isobaric labeling tag compared with iTRAQ4, which could introduce more byproduct ions by isolation of mass spectrometry. Reporter ions and isobaric tag ions are isobaric ions and can be confidently removed from the MS/MS spectrum for database searching. The other high frequency ions in the b/y free windows are very likely introduced not by the peptide itself but by either the isobaric labeling procedure or the mass spectrometry system. Those ions might be removed to de-noise the tandem mass spectra and improve identification sensitivity. However, it remains possible that some ions in the low mass range b/y free window are actually multiply charged b/y ions whose charges cannot be estimated from the mass spectrum; removing such ions may decrease the identification sensitivity. The benefit of removing the ions in the b/y free window may vary between different isobaric labeling methods and different search engines. Because there are fewer ions in the low mass b/y free window in the iTRAQ4 than in the iTRAQ8 data set (supplemental Fig. S1), removing only isobaric ions may be more suitable for iTRAQ4 data, and also removing ions in the low mass b/y free window may be more suitable for iTRAQ8 data. We also observed a few high frequency ions outside of the b/y free windows, including 429.0888. Without conclusive evidence about their origin, we did not remove them in this study.

We also examined the factors that might affect the sensitivity of peptide identification. Our results showed that the combination of deisotoping/deconvolution and removing isobaric related ions significantly improved the Mascot identification sensitivity and moderately improved the MyriMatch, X! Tandem, and OMSSA identification sensitivity for both the iTRAQ4 and iTRAQ8 data sets. Comet was only slightly affected by the preprocessing procedure. We further validated our results with Mascot on an independent TMT6 data set. The analysis of this TMT6 data set showed a similar improvement in peptide/protein identification sensitivity (see supplemental Table S13). Based on our results, we conclude that removing isobaric related ions combined with deisotoping/deconvolution is highly recommended for preprocessing isobarically labeled MS/MS spectra before database search, especially for the Mascot search engine.

The complexity of the isobaric labeling tag significantly affects the identification sensitivity improvement after preprocessing tandem mass spectra. Keeping the top 10 ions in each 100 Dalton window slightly decreased the Mascot peptide identification sensitivity in the iTRAQ4 data set, regardless of whether it was combined with deisotoping and deconvolution. This may indicate that the high-resolution mass spectra in our iTRAQ4 data set were so clean that keeping the top 10 ions in each 100 Dalton window was not necessary during data preprocessing. This finding may require additional validation in other independent iTRAQ4 data sets. On the other hand, keeping the top 10 ions in each 100 Dalton window slightly increased the Mascot peptide identification sensitivity in the iTRAQ8 data set. Compared with method 1, a combination of deisotoping/deconvolution, keeping the top 10 ions in each 100 Dalton window, and removing isobaric related ions (method 16) improved the identified spectra, peptides, and two-hit proteins by 32.7, 26.1, and 18.6 percentage points more for iTRAQ8 than for iTRAQ4, respectively (for example, 50.2% versus 17.5% for spectra). This suggests that preprocessing is more crucial for iTRAQ8 than for iTRAQ4 data.
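
The "top 10 ions in each 100 Dalton window" filter discussed above can be sketched as follows. This is an illustration only, not the TurboRawToMGF implementation: the function name `keep_top_n_per_window` is hypothetical, and the sketch assumes that windows are aligned at multiples of 100 starting from m/z 0, which the text does not specify.

```python
from collections import defaultdict


def keep_top_n_per_window(peaks, n=10, window=100.0):
    """Keep the n most intense peaks in each m/z window.

    peaks: iterable of (mz, intensity) tuples.
    Assumption: windows are [0, 100), [100, 200), ... (alignment is a guess).
    Returns the surviving peaks sorted by m/z.
    """
    bins = defaultdict(list)
    for mz, intensity in peaks:
        bins[int(mz // window)].append((mz, intensity))
    kept = []
    for members in bins.values():
        # Most intense first, then truncate to the top n of this window.
        members.sort(key=lambda peak: peak[1], reverse=True)
        kept.extend(members[:n])
    return sorted(kept)


# Twelve peaks between 100 and 200 Da plus one peak at 250 Da.
spectrum = [(100.0 + i, float(i)) for i in range(1, 13)] + [(250.0, 5.0)]
filtered = keep_top_n_per_window(spectrum)
print(len(filtered))  # 11
```

In this toy spectrum the two weakest peaks of the crowded window are dropped, while the lone peak at 250 Da survives regardless of its low intensity, because the filter is applied per window rather than globally.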

We validated the identification improvement of the C-terminal peptides. C-terminal peptides might not end with “K” or “R,” which voids our assumption for the b/y free window calculation that Y1 ions are either from K or R. The result indicated that data preprocessing not only improved the Mascot scores of the majority of C-terminal peptides but also increased the identification sensitivity of C-terminal peptides, even when the ions in the low mass b/y free window were removed.

We implemented TurboRawToMGF (v2.0) with a user-friendly GUI. The GUI allows users to conveniently convert the data generated by high-resolution mass spectrometers (such as the Thermo LTQ-Orbitrap Velos) to mascot generic format files. TurboRawToMGF also supports filtering spectra based on user-defined mass ranges. For example, the user may define 428.75–429.25 to remove the 429.0888 ion. TurboRawToMGF (v2.0) offers other conveniences to users as well: conversion from mzData and mzXML format files to mascot generic format files is supported, and multiple files can be converted in batch mode. TurboRawToMGF is free, and it will be continuously supported in the coming years.
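
The user-defined mass range filter mentioned above can be illustrated with a short sketch. It does not reproduce the TurboRawToMGF code or its options; the function name `remove_mass_ranges`, the inclusive range boundaries, and the example peak list are assumptions made for illustration.

```python
def remove_mass_ranges(peaks, ranges):
    """Drop every peak whose m/z falls inside any of the given ranges.

    peaks:  list of (mz, intensity) tuples.
    ranges: list of (low, high) m/z bounds, treated as inclusive (an assumption).
    """
    return [
        (mz, intensity)
        for mz, intensity in peaks
        if not any(low <= mz <= high for low, high in ranges)
    ]


# Hypothetical peak list containing the high frequency 429.0888 ion.
spectrum = [(175.119, 1200.0), (429.0888, 9000.0), (512.3, 800.0)]
cleaned = remove_mass_ranges(spectrum, [(428.75, 429.25)])
print(cleaned)  # [(175.119, 1200.0), (512.3, 800.0)]
```

Several ranges can be passed at once, so the same helper would also cover removing a set of known contaminant or tag-related ions in a single pass.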

Supplementary Material

Supplemental Data

Acknowledgments

We thank the GSA program by Thermo. We are grateful to Margot Bjoring for her editorial support. The data deposition to the ProteomeXchange Consortium was supported by the PRIDE Team, EBI.

Footnotes

Author contributions: Q.S., J.D., Y.S., and R.Z. designed research; R.L., Q.L., Z.S., and C.L. performed research; Q.S. analyzed data; Q.S., R.L., J.D., and Y.G. wrote the paper.

* This work was supported by grants from Ministry of Science and Technology (2011CB910200, 2014CB910500, 2011CB910600), and a grant from the National Natural Science Foundation of China (31130034).

1 The abbreviations used are: MS/MS, tandem mass spectrometry; LC, liquid chromatography; m/z, mass-to-charge ratio; SILAC, stable isotope labeling by amino acids in cell culture; iTRAQ, isobaric tag for relative and absolute quantification; TMT, tandem mass tag.

REFERENCES

  • 1. Yates J. R., 3rd, Gilchrist A., Howell K. E., Bergeron J. J. (2005) Proteomics of organelles and large cellular structures. Nat. Rev. Mol. Cell Biol. 6, 702–714
  • 2. Walther T. C., Mann M. (2010) Mass spectrometry-based proteomics in cell biology. J. Cell Biol. 190, 491–500
  • 3. Wolters D. A., Washburn M. P., Yates J. R., 3rd (2001) An automated multidimensional protein identification technology for shotgun proteomics. Anal. Chem. 73, 5683–5690
  • 4. Mann M., Kelleher N. L. (2008) Precision proteomics: the case for high-resolution and high mass accuracy. Proc. Natl. Acad. Sci. U.S.A. 105, 18132–18138
  • 5. Olsen J. V., Schwartz J. C., Griep-Raming J., Nielsen M. L., Damoc E., Denisov E., Lange O., Remes P., Taylor D., Splendore M., Wouters E. R., Senko M., Makarov A., Mann M., Horning S. (2009) A dual pressure linear ion trap Orbitrap instrument with very high sequencing speed. Mol. Cell Proteomics 8, 2759–2769
  • 6. Perkins D. N., Pappin D. J., Creasy D. M., Cottrell J. S. (1999) Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567
  • 7. Eng J. K., McCormack A. L., Yates J. R. (1994) An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectr. 5, 976–989
  • 8. Carvalho P. C., Xu T., Han X., Cociorva D., Barbosa V. C., Yates J. R., 3rd (2009) YADA: a tool for taking the most out of high-resolution spectra. Bioinformatics 25, 2734–2736
  • 9. Liu X., Inbar Y., Dorrestein P. C., Wynne C., Edwards N., Souda P., Whitelegge J. P., Bafna V., Pevzner P. A. (2010) Deconvolution and database search of complex tandem mass spectra of intact proteins: a combinatorial approach. Mol. Cell Proteomics 9, 2772–2782
  • 10. Ong S. E., Blagoev B., Kratchmarova I., Kristensen D. B., Steen H., Pandey A., Mann M. (2002) Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell Proteomics 1, 376–386
  • 11. Bantscheff M., Schirle M., Sweetman G., Rick J., Kuster B. (2007) Quantitative mass spectrometry in proteomics: a critical review. Anal. Bioanal. Chem. 389, 1017–1031
  • 12. Thompson A., Schafer J., Kuhn K., Kienle S., Schwarz J., Schmidt G., Neumann T., Johnstone R., Mohammed A. K., Hamon C. (2003) Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal. Chem. 75, 1895–1904
  • 13. Ross P. L., Huang Y. N., Marchese J. N., Williamson B., Parker K., Hattan S., Khainovski N., Pillai S., Dey S., Daniels S., Purkayastha S., Juhasz P., Martin S., Bartlet-Jones M., He F., Jacobson A., Pappin D. J. (2004) Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell Proteomics 3, 1154–1169
  • 14. Aggarwal K., Choe L. H., Lee K. H. (2006) Shotgun proteomics using the iTRAQ isobaric tags. Brief. Funct. Genomics Proteomics 5, 112–120
  • 15. Choe L., D'Ascenzo M., Relkin N. R., Pappin D., Ross P., Williamson B., Guertin S., Pribil P., Lee K. H. (2007) 8-plex quantitation of changes in cerebrospinal fluid protein expression in subjects undergoing intravenous immunoglobulin treatment for Alzheimer's disease. Proteomics 7, 3651–3660
  • 16. Dayon L., Hainard A., Licker V., Turck N., Kuhn K., Hochstrasser D. F., Burkhard P. R., Sanchez J. C. (2008) Relative quantification of proteins in human cerebrospinal fluids by MS/MS using 6-plex isobaric tags. Anal. Chem. 80, 2921–2931
  • 17. Leitner A., Lindner W. (2009) Chemical tagging strategies for mass spectrometry-based phospho-proteomics. Methods Mol. Biol. 527, 229–243
  • 18. Treumann A., Thiede B. (2010) Isobaric protein and peptide quantification: perspectives and issues. Expert Rev. Proteomics 7, 647–653
  • 19. Coombs K. M. (2011) Quantitative proteomics of complex mixtures. Expert Rev. Proteomics 8, 659–677
  • 20. Wiese S., Reidegeld K. A., Meyer H. E., Warscheid B. (2007) Protein labeling by iTRAQ: a new tool for quantitative mass spectrometry in proteome research. Proteomics 7, 340–350
  • 21. Prudova A., auf dem Keller U., Butler G. S., Overall C. M. (2010) Multiplex N-terminome analysis of MMP-2 and MMP-9 substrate degradomes by iTRAQ-TAILS quantitative proteomics. Mol. Cell Proteomics 9, 894–911
  • 22. Sinclair J., Timms J. F. (2011) Quantitative profiling of serum samples using TMT protein labelling, fractionation and LC-MS/MS. Methods 54, 361–369
  • 23. Hung C. W., Tholey A. (2012) Tandem mass tag protein labeling for top-down identification and quantification. Anal. Chem. 84, 161–170
  • 24. Nielsen P. A., Olsen J. V., Podtelejnikov A. V., Andersen J. R., Mann M., Wisniewski J. R. (2005) Proteomic mapping of brain plasma membrane proteins. Mol. Cell Proteomics 4, 402–408
  • 25. Cox J., Mann M. (2008) MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367–1372
  • 26. Eng J. K., Jahan T. A., Hoopmann M. R. (2013) Comet: an open-source MS/MS sequence database search tool. Proteomics 13, 22–24
  • 27. Tabb D. L., Fernando C. G., Chambers M. C. (2007) MyriMatch: highly accurate tandem mass spectral peptide identification by multivariate hypergeometric analysis. J. Proteome Res. 6, 654–661
  • 28. Geer L. Y., Markey S. P., Kowalak J. A., Wagner L., Xu M., Maynard D. M., Yang X., Shi W., Bryant S. H. (2004) Open mass spectrometry search algorithm. J. Proteome Res. 3, 958–964
  • 29. Craig R., Beavis R. C. (2004) TANDEM: matching proteins with tandem mass spectra. Bioinformatics 20, 1466–1467
  • 30. Vaudel M., Barsnes H., Berven F. S., Sickmann A., Martens L. (2011) SearchGUI: An open-source graphical user interface for simultaneous OMSSA and X!Tandem searches. Proteomics 11, 996–999
  • 31. Sheng Q., Dai J., Wu Y., Tang H., Zeng R. (2012) BuildSummary: using a group-based approach to improve the sensitivity of peptide/protein identification in shotgun proteomics. J. Proteome Res. 11, 1494–1502
  • 32. Mann B., Madera M., Sheng Q., Tang H., Mechref Y., Novotny M. V. (2008) ProteinQuant Suite: a bundle of automated software tools for label-free quantitative proteomics. Rapid Commun. Mass Spectrom. 22, 3823–3834
  • 33. Vizcaino J. A., Deutsch E. W., Wang R., Csordas A., Reisinger F., Rios D., Dianes J. A., Sun Z., Farrah T., Bandeira N., Binz P. A., Xenarios I., Eisenacher M., Mayer G., Gatto L., Campos A., Chalkley R. J., Kraus H. J., Albar J. P., Martinez-Bartolome S., Apweiler R., Omenn G. S., Martens L., Jones A. R., Hermjakob H. (2014) ProteomeXchange provides globally coordinated proteomics data submission and dissemination. Nat. Biotechnol. 32, 223–226

Data Availability Statement

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (33) via the PRIDE partner repository with the data set identifier PXD000994 and DOI 10.6019/PXD000994.

To access the data please visit: http://tinyurl.com/pdbkesj

Username: reviewer06796@ebi.ac.uk

Password: jWjYoiuT


Articles from Molecular & Cellular Proteomics : MCP are provided here courtesy of American Society for Biochemistry and Molecular Biology
