“Plug-and-play” investigation of the human phosphoproteome by targeted high-resolution mass spectrometry

Robert T Lawrence; Brian C Searle; Ariadna Llovet; Judit Villén

doi:10.1038/nmeth.3811

Author manuscript; available in PMC: 2018 Apr 24.

Published in final edited form as: Nat Methods. 2016 Mar 28;13(5):431–434. doi: 10.1038/nmeth.3811

PMCID: PMC5915315 NIHMSID: NIHMS765948 PMID: 27018578

Abstract

Systematic approaches to study cellular signaling require new phosphoproteomic techniques that reproducibly measure the same phosphopeptides across multiple replicates, conditions, and time points. Here we present a method to mine information from large-scale, heterogeneous phosphoproteomics datasets to rapidly generate robust targeted assays. We demonstrate the performance of our method by interrogating the IGF-1/AKT signaling pathway and show that even rarely observed phosphorylation events can be consistently detected and precisely quantified.

Each human cell harbors a signaling landscape likely spanning hundreds of thousands of phosphorylated residues¹. How are these residues dynamically engaged to control cell behavior in the context of time, environment, cellular identity, and genetic variation? Answering this will require systematic phosphoproteome analysis using high-throughput assays that are accurate, sensitive and reproducible. Measuring phosphorylation events in a targeted manner presents many hurdles. Not surprisingly, it has been only achievable after tedious assay optimization and reliance on synthetic peptide standards^2–4, which impedes assay versatility and limits widespread adoption of the technique by researchers outside the proteomics community. Our goal in this study was to develop the capability to easily generate 1-hour “plug-and-play” targeted phosphoprotein assays equivalent in sensitivity to prolonged deep fractionation experiments (>12-hr analysis time) with more reproducible sampling and quantification.

Much of the work on cellular signaling using mass spectrometry based proteomics has focused on phosphorylation site discovery, generating vast catalogues of novel phosphorylation events and their regulation^5–8. This workflow generally employs a data-dependent acquisition (DDA) strategy, in which the “top N” features in each full MS scan are selected for MS/MS fragmentation and identification. However, one of the major problems that has plagued quantitative proteomics is stochastic sampling, which leads to extensive but sparse datasets that have many missing values across different experimental conditions⁹. Analysis of phosphopeptide-enriched samples is further complicated by having a higher dynamic range than whole-cell digests, limiting the sensitivity and reproducibility of DDA. Recently, more systematic and sensitive data acquisition strategies have emerged to meet these analytical challenges, including data independent acquisition (DIA) and parallel reaction monitoring (PRM). In DIA, MS/MS scans are acquired across the full mass range each duty cycle^10,11. In PRM, MS/MS scans are targeted towards narrow prespecified mass and time windows corresponding to analytes of interest¹². Compared to selected-reaction monitoring (SRM), which has been the workhorse of targeted proteomics, PRM simplifies the targeted mass spectrometry workflow. All one needs to specify to configure an assay is the precursor mass-to-charge ratio (m/z) and the expected retention time, but no optimization is required a priori. Potential interferences can be identified and fragment ions quantified post-hoc.

The promise of targeted quantitative phosphoproteomic analysis has been demonstrated in several recent studies^3,4,13. However, there are still challenges to overcome in order to make targeted phosphoproteome analysis more versatile and routine. Selecting the best peptide sequence and charge state to monitor for phosphorylation sites still represents a major obstacle. Because protein phosphorylation is site-specific, selection of MS-compatible peptide sequences is limited by the local sequence composition and protease enzyme used for digestion. Notably, phosphorylation alters the local charge distribution, which interferes with routinely used enzymes like Lys-C and trypsin, further hampering peptide selection^14,15. Thus, the preferred peptide cleavage and charge state are difficult to predict a priori.

In this method, we instead relied upon a large-scale database of previously observed human phosphopeptide sequences. We assembled this database by searching nearly 1,000 LC-MS/MS runs from human label-free trypsin-digested phospho-enriched samples. The samples were derived from a variety of human cell lines exposed to many different stimuli and processed using different phosphopeptide enrichment methods and single-shot as well as deep offline fractionation techniques. More than two thirds of the data (727 runs) were collected in-house and all were unpublished at the time of writing. To assess the inter-laboratory reproducibility of the preferred peptide sequences representing each phosphosite, we additionally searched 262 LC-MS/MS runs from three other groups^8,16,17. Overall, more than 7.5 million phosphopeptide spectral matches were identified (PSM-level FDR < 1%) corresponding to 109,611 phosphorylation sites (90,103 localized p < 0.05) on 11,428 proteins (phosphosite-adjusted FDR < 5%), commensurate with the human phosphoproteome coverage provided in resources such as PhosphoSitePlus™^18 (Fig. 1a, Supplementary Fig. 1 and Supplementary Data).

Figure 1. A database for targeted human phosphoproteome analysis. (a) Data analysis pipeline and summary statistics. Phosphopeptide spectral matches (PSMs) were filtered to FDR < 1%. Phosphoisoforms refer to unique protein phosphorylation states (i.e. phosphopeptides representing multisite phosphorylation reported independently from singly phosphorylated species). Phosphosites refer to total unique protein phosphorylation sites (90,103 were confidently localized, Ascore ≥ 13). To account for data aggregation, unique phosphopeptides, phosphoisoforms, and phosphosites were each additionally filtered to reach an aggregate FDR < 5%. (b) Comparison of phosphopeptides identified in single-shot experiments (n=388) versus deep fractionation experiments (n=50). (c) Reproducibility of phosphopeptide sampling across experiments. (d) Cleavage form specificity (counts of most frequently observed cleavage state/total counts) and distribution. For specificity calculation, only phosphoisoforms observed at least 100 times were analyzed (n=14,480). The pie chart represents the distribution of the most frequently observed number of miscleavages for each phosphoisoform in the database. (e) Charge state specificity (counts of most frequently observed charge state/total counts) and distribution. For specificity calculation, only phosphopeptide sequences observed at least 100 times were analyzed (n=15,985). The pie chart represents the distribution of the most frequently observed charge state versus predicted (positively charged amino acids + 1) for each phosphopeptide in the database.

We used the database to quantify several key parameters of data-dependent phosphoproteome analysis that we hypothesize can be addressed with targeted analysis. Without fractionation, DDA is limited in sensitivity. In line with our expectations, 64% of phosphopeptides we identified were only observable in experiments that used extensive fractionation prior to phosphopeptide enrichment (Fig. 1b). Phosphopeptide sampling stochasticity is detrimental in both single-shot and deep fractionation routines. Sample fractionation increases the depth of phosphoproteome coverage (on average, 5,834 unique phosphopeptides were identified per single-shot experiment versus 32,311 per fractionation experiment), but the overlap between samples was still lower than expected. For single shot experiments only 2,544/82,944 (3%) phosphopeptides identified were observable in at least 50% of experiments compared with 16,677/210,107 (8%) for fractionation experiments (Fig. 1c).

Next, we examined phosphopeptides sequenced with deep coverage (identified a minimum of 100 times) to ascertain the distribution of charge and cleavage state specificity (preferred state/total observations). We found that most phosphosites were predominantly observed in only one cleavage state (Fig. 1d), and that charge state was moderately specific (Fig. 1e). The preferred phosphopeptide forms were both fully cleaved and of the expected charge state only 35% of the time. As expected, we observed a significant number of missed cleavage sites (Supplementary Fig. 2a,b), a small fraction (10%) of which rescued peptides that would otherwise be too short for analysis. In addition to miscleaved peptides, we identified ~9,000 phosphoisoforms mapping to protein N-terminal clipping and/or acetylation. Prediction of the most frequent charge was wrong 50% of the time (Fig. 1e), with many of the ions predicted by heuristics falling outside the optimal mass range of the mass spectrometer (Supplementary Fig. 2c). Lastly, we found that the preferred peptide sequence was the same between at least 3 of the 4 studies 87% of the time (Supplementary Fig. 2d), suggesting that the preferred phosphopeptide sequences provided in our database should be compatible with most laboratory trypsin digestion and phosphopeptide enrichment protocols.
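The specificity statistic used here (observations of the preferred state divided by total observations) can be illustrated with a minimal sketch. The function name and the example counts are hypothetical and serve only to show the calculation; they are not taken from the database.

```python
from collections import Counter

def state_specificity(observations, min_count=100):
    """Return (preferred_state, specificity) for a list of observed states.

    Specificity = count of the most frequent state / total count.
    Returns None when fewer than `min_count` observations are available,
    mirroring the 100-observation cutoff described in the text.
    """
    total = len(observations)
    if total < min_count:
        return None
    preferred, n = Counter(observations).most_common(1)[0]
    return preferred, n / total

# Hypothetical charge states observed for one phosphopeptide sequence
charges = [2] * 90 + [3] * 30
print(state_specificity(charges))  # (2, 0.75)
```

The same function applies to cleavage states if each observation is recorded as a number of missed cleavages rather than a charge.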

We hypothesized that leveraging a large-scale database of previously observed phosphopeptides would enable rapid assay deployment with higher success rates than traditional approaches, and benchmarked the capabilities of this data-driven approach with respect to retention time scheduling, peptide selection, detectability, and quantification. We evaluated the precision of the retention time scheduling by using DDA analysis of BSA or phospho-enriched tryptic digest to predict the retention times of independent phosphopeptides in a subsequent DDA run, and established that when using a complex phosphopeptide mixture for assay calibration, 5-minute retention time windows are sufficient to capture 95% of the targets (Fig. 2a and Supplementary Dataset 1). Next, we targeted pairs of phosphopeptides in which the sequence and charge state selected using heuristics differed from the database selection. Data-driven selection outperformed heuristics (Fig. 2b, Supplementary Fig. 3 and Supplementary Dataset 2) by magnitudes similar to what was predicted by our previous analysis of sequence and charge specificity (Fig. 1d,e).
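The relationship between window width, concurrency, and assay size can be sketched as follows. This is an illustrative calculation, not the scheduling code used in the study: the function names are hypothetical, and the capacity estimate assumes targets spread perfectly evenly over the run, which real assays will not achieve.

```python
def max_concurrent(retention_times, window):
    """Largest number of targets whose acquisition windows overlap at any instant."""
    events = []
    for rt in retention_times:
        events.append((rt - window / 2, 1))   # window opens
        events.append((rt + window / 2, -1))  # window closes
    # At equal times a close (-1) sorts before an open (+1),
    # so windows that merely touch are not counted as overlapping.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak

def ideal_capacity(concurrent_limit, run_minutes, window):
    """Upper bound on assay size if targets were evenly spread over the run."""
    return int(concurrent_limit * run_minutes / window)

# Hypothetical predicted retention times (min) with 5-min windows
print(max_concurrent([10, 11, 12, 30], window=5))  # 3
# 80 concurrent targets over 60 min with 5-min windows (idealized)
print(ideal_capacity(80, 60, 5))  # 960
```

Narrower windows raise the ceiling on assay size but require more accurate retention time prediction, which is why the calibration mixture matters.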

Figure 2. Plug-and-play assay performance. (a) Retention time prediction performance. “Prediction Success Rate” refers to the percentage of phosphopeptides in the validation set that were identified within the predicted retention time windows. “Max. assay size” is based on 80 concurrent targets scheduled optimally over 60 minutes. “Expected identifications” represents the success rate using the “phospho-mixture” multiplied by the max. assay size. (b) Phosphopeptide sequence (n=100) and charge state (n=50) selection performance. The reported intensity represents the summed peak areas of all identified y-ions and unidentified targets are imputed at noise level (1e4). Significance was assessed using a Wilcoxon signed rank (paired non-parametric) test. (c) PRM, DIA, and DDA analysis of 101 targets in phosphopeptides purified from IGF-1/EGF/pervanadate stimulated MCF7 cells. Peptides are ordered from most to least frequently observed in the database and PSMs are indicated in the y-axis. Peptide centric analysis refers to targeted analysis using Skyline. Spectrum centric analysis refers to a database search using a Comet-Percolator-Ascore pipeline. Right panel shows PRM analysis of a phosphopeptide (R.QESTVSFNPY*EPELAPWAADKGPQR.E) rarely observed by DDA. The top right panel shows the MS chromatogram of the 3 precursor isotopes and the bottom right panel shows the MS/MS chromatograms of all identified fragment ions. (Y*: phosphotyrosine) (d) Targeted site and isoform specific quantification of AKT1/2 phosphorylation at T308/309 and S473/474 (n=6). *p < 1×10⁻⁶, unpaired t-test. See additional targets measured in Supplementary Dataset 4. For boxplots in (b) and (d), the lower and upper edges of the box correspond to the boundaries of the first and third quartiles. The “whiskers” extend to the most extreme value within 1.5 × interquartile range.

We further benchmarked the method by selecting 101 phosphopeptides spanning a wide range of detectability and configured a 1-h parallel reaction monitoring assay to detect those peptides in a phospho-enriched tryptic digest of MCF7 breast cancer cells treated with a cocktail of insulin-like growth factor-1 (IGF-1), epidermal growth factor, and pervanadate. We analyzed the same sample using PRM, DIA, and DDA strategies in technical quadruplicate. The PRM and DIA results were analyzed in a targeted ‘peptide-centric’ manner querying specifically for the target phosphopeptides as well as in a ‘spectrum-centric’ manner using a database search pipeline (Supplementary Dataset 3). Using the PRM method, we readily detected several species that were only sparingly detected in our phosphopeptide database, such as the peptide corresponding to the activation site of tyrosine protein kinase SYK (Fig. 2c). Measured retention times correlated well with the retention times in the database (R²>0.99) enabling efficient assay scheduling and interpretation (Supplementary Fig. 4a). Out of the 101 targets, PRM was superior to DIA and DDA 1-h assays in terms of the number of peptides detected and sampling reproducibility (Fig. 2c, Supplementary Fig. 4b and Supplementary Dataset 3). It has been suggested that ‘peptide-centric’ targeted analysis might offer advantages over the traditional ‘spectrum-centric’ approach since it more directly evaluates the evidence for a given peptide^19,20. In our analysis, peptide-centric analysis was more sensitive than database searching for both PRM and DIA, and the signal (if any was measurable) was consistently detectable across all 4 runs (Fig. 2c and Supplementary Fig. 4b).

Lastly, we designed and implemented a targeted assay to quantify phosphorylation sites on proteins within the IGF-1/AKT signaling pathway in MCF7 cells before and after stimulation with IGF-1 (Supplementary Dataset 4). Our assay enabled reproducible isoform-specific quantification of protein kinase AKT1/2 activation via phosphorylation at T308/309 and S473/474 (Fig. 2d). The specific isoforms of AKT are thought to have distinct roles in cellular signaling, but the respective kinase activation sites T308/T309 are not distinguishable using specific antibodies due to nearly identical local sequence composition. They are sparingly detectable by DDA even after deep fractionation but were reproducibly detected using PRM in the single stage enrichment protocol used here.

In addition to advantages in sensitivity and reproducibility, systematic MS/MS acquisition also introduces the capability to monitor isobaric peptide species, which are ubiquitous in phosphorylated proteins (Supplementary Fig. 5a). These positional isomers can often be resolved by retention time, allowing for more accurate quantification (Supplementary Fig. 5b).

Overall, this study demonstrates the potential of label-free PRM assays for robust, high-throughput, targeted phosphoproteome analysis. In order to facilitate “plug-and-play” assay development, we created a web-based application that queries our database for optimal peptide selection and retention time scheduling (https://phosphopedia.gs.washington.edu). This application provides several tools for assay development, including pre-curated lists of phosphosites, information for sequence and charge state selection, an MS/MS spectra viewer, retention time calibration, automated variable window assignment for positional isomers, and dynamic schedule visualization and optimization. Using this tool, targeted phosphoproteomic assays are convenient to configure, sensitive enough to detect low-abundance analytes without sample fractionation, and more reproducible than data-dependent acquisition. The use of label-free targeted quantification in conjunction with data-driven peptide selection enables rapid deployment of assays to measure virtually any known phosphorylation event in human specimens. These qualities make the method suitable for interrogating the diverse dimensions of the cellular signaling landscape with high throughput and versatility.

ONLINE METHODS

Cell culture

MCF7 breast cancer cells were obtained from ATCC, and tested biannually for the presence of mycoplasma. MCF7 cells were cultured at 37°C in 5% CO₂ in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 4.5 g/L glucose, L-glutamine, and 10% fetal bovine serum. To generate bulk phosphopeptides for method comparisons, cells were incubated in serum-free medium for 4 hours prior to treatment with IGF-1 (100 ng/ml), EGF (100 ng/ml), and pervanadate (1 mM) for 15 minutes. For IGF-1 experiments, cells were incubated in serum-free medium for 4 hours and stimulated with or without IGF-1 (100 ng/ml) for 15 minutes (n=6). At the time of harvest, cells were rinsed 3 times quickly with ice-cold phosphate-buffered saline and flash frozen on liquid nitrogen.

Sample preparation

Cell lysis was performed in 9 M urea, 50 mM Tris pH 8.2, 75 mM NaCl with protease inhibitors (Roche) and phosphatase inhibitors (50 mM beta-glycerophosphate, 50 mM sodium fluoride, 10 mM sodium pyrophosphate, 1 mM sodium orthovanadate). Cells were scraped off of plates directly into ice cold lysis buffer and subjected to 20 seconds of probe sonication, incubated on ice for 20 minutes to solubilize proteins and spun at 12,000 g for 10 min, and protein content was assayed using the bicinchoninic acid method (Pierce). Proteins were reduced with 5 mM dithiothreitol for 30 min at 55°C, alkylated with 10 mM iodoacetamide for 15 min at room temperature, and quenched with an additional 10 mM dithiothreitol. Protein extracts were diluted 5-fold with 50 mM Tris pH 8.2 and digested overnight at 37°C with sequencing grade trypsin (Promega) in a 1:200 enzyme/substrate ratio. Following digestion, the reactions were quenched with 10% TFA to pH ~2, desalted on tC18 SepPak cartridges (Waters), and dried by vacuum centrifugation. For bulk phosphopeptide preparation, 5 mg of tryptic peptides were resuspended in immobilized metal affinity chromatography (IMAC) loading solution (80% MeCN, 0.1% TFA) and divided into 12 × 150 μl aliquots. To prepare IMAC slurry, Ni-NTA magnetic agarose (Qiagen) was stripped with 40 mM EDTA for 30 min, reloaded with 10 mM FeCl₃ for 30 min, washed 3 times and resuspended in IMAC loading solution. Phosphopeptide enrichment was performed using a KingFisher Flex robot (Thermo Scientific) programmed to incubate peptides with 150 μl 5% bead slurry for 30 minutes, wash 3 times with 150 μl 80% MeCN, 0.1% TFA, and elute with 60 μl 1:1 MeCN:1% NH₄OH. The eluates were acidified with 10% formic acid, pooled, and dried by vacuum centrifugation. For IGF-1 experiments, 350 μg tryptic peptides were enriched for each sample. 
To control variability in phosphopeptide enrichment and mass spectrometry, we used a spike-in standard of bovine serum albumin tryptic peptides previously subjected to in vitro phosphorylation by serum-stimulated HeLa cell lysate in the presence of 2.5 mM ATP for 60 min at 30°C. The resulting peptides were purified by solid phase extraction on a tC18 SepPak cartridge and spiked in prior to IMAC at a mass ratio of 1:50.

Mass Spectrometry

Phosphopeptide-enriched samples were resuspended in 4% formic acid, 3% MeCN and subjected to liquid chromatography on an EASY-nLC 1000 system equipped with a 100 μm inner diameter x 25 cm column packed in-house with Reprosil C18 1.9 μm particles (Dr. Maisch GmbH) and column oven set to 50°C. All separations were performed using a gradient 9% to 32% MeCN in 0.15% formic acid over 44 min (60 min total method length) at a flow rate of 500 nl/min. The HPLC was coupled directly with a Q-Exactive mass spectrometer. The DDA method consisted of a full MS scan (70k resolution, 3e6 automatic gain control (AGC) target, 240 ms maximum injection time, 400 to 1200 m/z, centroid mode) followed by up to 20 data-dependent MS/MS acquisitions on the top 20 most intense precursor ions (35k resolution, 5e4 AGC target, 120 ms maximum injection time, 2 m/z isolation window, 27% normalized collision energy, centroid mode). The DIA method consisted of a full MS scan configured as above followed by 33 data-independent MS/MS acquisitions configured using an inclusion list with 25 m/z overlapping windows (12.5 m/z with deconvolution) covering the 400 to 1200 m/z mass range (35k resolution, 5e5 AGC target, 120 ms maximum injection time, 25 m/z isolation window, 27% normalized collision energy, centroid mode). The PRM method consisted of a full MS scan configured as above followed by up to 20 targeted MS/MS scans as defined by a time-scheduled inclusion list (35k resolution, 5e5 AGC target, 120 ms maximum injection time, 2 m/z isolation window, 27% normalized collision energy, centroid mode). To prevent systematic bias, the order of acquisition for “Control” and “IGF-1” samples was randomized. 
Benchmarking experiments for retention time, sequence and charge selection were performed on a nanoACQUITY liquid chromatography system coupled to a Q-Exactive Plus mass spectrometer with the following modifications to the above parameters: flow rate was set to 400 nl/min and for the PRM assays 25 unscheduled targeted MS/MS scans using 50 ms maximum injection time and 17.5k resolution were collected after each full MS scan. For DDA and DIA, AGC targets were optimized for speed with a goal of reaching the target before reaching maximum injection time. For PRM, the AGC target was selected for enhanced sensitivity and dynamic range with a goal of reaching the maximum injection time before reaching the target. PRM assay scheduling was performed within Skyline²¹ (version 3.1.0.7382). To calibrate the schedule, an initial pilot run was conducted with 10-min-wide acquisition windows, and aligned to the normalized retention database provided with this manuscript (HumanPhosphoproteomeRT.irtdb) to build a retention time predictor. Subsequently, a scheduled isolation list with refined 6-min windows was exported from Skyline as a .csv file and imported directly into the instrument PRM method as an inclusion list. Any group of peptides from the database may be used as retention time calibrators, including any of the phosphopeptides, the Peptide Retention Time Calibration (PRTC) mixture (Pierce), or tryptic peptides from BSA. An equivalent retention time scheduling tool with other capabilities such as variable windows for positional isomeric phosphopeptides is available with the web portal that accompanies this manuscript.
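The calibration step can be pictured as a regression from normalized library retention times to the retention times observed in the pilot run, followed by placing a fixed-width window around each prediction. The sketch below assumes a simple linear fit and uses hypothetical peptide names and values; it is not Skyline's implementation, and the .irtdb file format is not reproduced here.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def schedule(library_rt, slope, intercept, window=6.0):
    """Map each target to a (start, end) acquisition window in minutes."""
    windows = {}
    for peptide, normalized_rt in library_rt.items():
        center = slope * normalized_rt + intercept
        windows[peptide] = (center - window / 2, center + window / 2)
    return windows

# Hypothetical calibrators: normalized library RT vs. RT observed in the pilot run
library = [0.0, 50.0, 100.0]
observed = [10.0, 30.0, 50.0]
slope, intercept = fit_line(library, observed)

# Hypothetical targets with normalized library retention times
targets = {"target_A": 25.0, "target_B": 80.0}
for peptide, (start, end) in schedule(targets, slope, intercept).items():
    print(f"{peptide}: {start:.1f} to {end:.1f} min")
```

Any peptides shared between the pilot run and the library can serve as calibrators, which is what allows phosphopeptides, PRTC peptides, or BSA peptides to be used interchangeably.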

Data processing and analysis

DDA: Raw DDA data files were converted to mzXML and searched using Comet²² (version 2015.01) against the human Swissprot database including reviewed isoforms (April 2015; 42,121 entries) allowing for variable oxidation of methionine, protein N-terminal acetylation, and phosphorylation of serine, threonine, and tyrosine residues. Carbamidomethylation of cysteines was set as a fixed modification. Trypsin (KR|P) fully digested was selected allowing for up to 2 missed cleavages. Precursor mass tolerance was set to 50 ppm, and fragment ion tolerance to 0.02 Daltons. Search results were filtered using Percolator²³ to reach a 1% false discovery rate at the PSM level. Phosphosite assignment was performed using an in-house implementation of Ascore²⁴, and sites with Ascore ≥ 13 were considered localized (p=0.05). To construct the large-scale phosphopeptide database we imposed additional filters to prevent accumulation of false hits associated with data aggregation. First, phosphopeptides in the database with multiple non-localized instances spanning the same sequence were only considered to correspond to the minimum number of phosphosites that explain the data. Second, we carried forward the best posterior error probability for each phosphopeptide spectral match, phosphoisoform, and phosphosite in order to compute an adjusted FDR at each level. A phosphoisoform represents multiple peptide sequences containing the same combination of phosphorylation sites. Multiple peptides with different degrees of phosphorylation or cleavage may represent the same phosphosite. Without imposing additional filtering beyond peptide spectral matches 196,744 phosphosites were identified, but the phosphosite-level false discovery rate after data aggregation was 29.2% and the adjusted posterior error probability for phosphosites identified by only a single MS/MS scan was 44% (Supplementary Fig. 1b). 
Accordingly, we suspect that phosphorylation site databases that aggregate large volumes of spectral data without imposing additional filters are also likely to aggregate false discoveries. Lastly, spectral libraries were constructed from aggregate phosphopeptide search results and assembled into a normalized retention time database (HumanPhosphoproteomeRT.irtdb) using Skyline.
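The effect of carrying forward the best posterior error probability (PEP) and re-estimating FDR at the aggregate level can be sketched as below. This uses one common estimator, in which the FDR of an accepted set is the mean PEP of its members; the exact procedure used to build the database may differ, and the identifiers and values are hypothetical.

```python
def filter_by_aggregate_fdr(matches, max_fdr=0.05):
    """Keep the largest set of items whose estimated FDR is <= max_fdr.

    matches: iterable of (item_id, pep), one entry per spectral match.
    Each item (e.g. a phosphosite) is represented by its best (lowest) PEP.
    """
    best = {}
    for item, pep in matches:
        if item not in best or pep < best[item]:
            best[item] = pep

    ranked = sorted(best.items(), key=lambda kv: kv[1])
    running_sum = 0.0
    n_keep = 0
    for i, (_, pep) in enumerate(ranked, start=1):
        running_sum += pep
        # Estimated FDR of the top-i items is their mean PEP
        if running_sum / i <= max_fdr:
            n_keep = i
    return [item for item, _ in ranked[:n_keep]]

# Hypothetical matches: site A was observed twice, site D only once with a poor score
matches = [("A", 0.001), ("A", 0.2), ("B", 0.01), ("C", 0.04), ("D", 0.5)]
print(filter_by_aggregate_fdr(matches))  # ['A', 'B', 'C']
```

Sites supported only by weak single matches, such as D here, are the ones removed, which is consistent with the high error probability noted above for phosphosites identified by a single MS/MS scan.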

DIA and PRM: For spectrum-centric analysis of DIA and PRM mass spectrometry results, we used DIA-Umpire²⁵ version 1.4 with default parameters to assemble pseudo-MS/MS spectra for the database search pipeline described above. Peptide-centric analysis was performed using Skyline. Signal extraction was performed on +2, +3, +4 precursors and +1, +2 b and y fragment ions. Full MS resolving power was set to 70,000, and MS/MS resolving power set to 17,500. After importing an initial run, extracted ion chromatograms were aligned to the retention time library to generate a predictor and all results were reimported using retention time filtering to within 5 minutes of predicted RT. Peptide identifications were further refined by manual interpretation using several criteria including product ion mass accuracy, correlation of precursor and fragment ion peak shapes, and signal-to-noise ratios. Specifically, we required at least three highly resolved fragment ions without interference to consider a peptide identified. To consider a peptide localized, we required at least 1 site-diagnostic ion. For IGF-1 experiments, integrated peak areas were measured for each peptide in Skyline and exported for analysis. Values were normalized to the average peak areas of 3 spiked-in phosphorylated bovine serum albumin peptides and log2 transformed. Statistical significance was assessed using a two-sample unpaired t-test.
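The normalization and comparison described for the IGF-1 experiments can be sketched as follows. The peak areas are hypothetical, the helper names are ours, and a pooled-variance form of the unpaired t statistic is assumed because the text does not state which variant was used; the p-value would be obtained from a t distribution with n₁ + n₂ − 2 degrees of freedom using a statistics package.

```python
import math
from statistics import mean, variance

def normalize_log2(target_area, standard_areas):
    """Divide a peak area by the mean area of the spiked-in standards, then log2."""
    return math.log2(target_area / mean(standard_areas))

def pooled_t_statistic(a, b):
    """Two-sample unpaired t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))

# Hypothetical peak areas: one target and three BSA phosphopeptide standards
print(normalize_log2(8e6, [1e6, 2e6, 3e6]))  # 2.0

# Hypothetical normalized log2 values for stimulated vs. control replicates
igf1 = [2.1, 2.3, 1.9, 2.2, 2.0, 2.4]
control = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3]
print(pooled_t_statistic(igf1, control))
```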

Benchmarking experiments: Retention times from bovine serum albumin digest (110 peptides) or phosphopeptide-enriched MCF7 digest (phospho-mixture) were used to predict the retention times of a subsequent analysis of the same MCF7 phospho-mixture. For the phospho-mixture prediction, half of the identifications (2,357 phosphopeptides) were randomly selected as “training” data and used to predict the retention times of an independent set of phosphopeptides in the subsequent run. Unscheduled PRM experiments were used to evaluate disagreements in data-driven versus heuristic peptide sequence and charge state selection. For sequence selection, the heuristic was the fully cleaved phosphopeptide (i.e. cleavage after all lysines and arginines except when residue at +1 is proline), and 100 phosphosites were analyzed (200 targets). For charge selection, the heuristic was positively charged amino acids +1, and 50 phosphopeptides were analyzed (100 targets).
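The two heuristics can be written down compactly. In this sketch the function names are ours, and histidine is assumed to count as a positively charged residue alongside lysine and arginine, which the text does not specify. The input is the unmodified sequence of the phosphopeptide shown in Fig. 2c, chosen only as a convenient example.

```python
def fully_cleaved_peptides(sequence):
    """Cleave after every K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def heuristic_charge(peptide, basic_residues="KRH"):
    """Predicted charge = number of positively charged residues + 1."""
    return sum(peptide.count(r) for r in basic_residues) + 1

observed = "QESTVSFNPYEPELAPWAADKGPQR"
print(fully_cleaved_peptides(observed))  # ['QESTVSFNPYEPELAPWAADK', 'GPQR']
print(heuristic_charge(observed))        # 3
```

The observed form in Fig. 2c retains one missed cleavage (K followed by G), so the fully cleaved heuristic would have proposed a different target sequence for this site.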

Statistical analysis

Statistics regarding peptide and phosphorylation site identification are discussed in the above section entitled “Data processing and analysis”. Sample sizes necessary for non-parametric comparisons are not easily predictable since the expected shape of the distribution is unknown. For the comparison of peptide sequence and charge state selection we used n=100 and n=50 respectively, which we predicted would be sufficient to detect a roughly 10-fold difference in the median peak area intensity. Peptide sequence selection implicitly also requires charge state selection for each sequence, hence the sample size was increased for that experiment to account for additional variability. A Wilcoxon signed rank test (paired non-parametric) was used because the intensities of different peptides representing the same phosphorylation site in the same sample are related, but the intensities of peptides arising from different phosphorylation sites in the sample are unrelated and not normally distributed. Similarly, the sample size necessary for high-throughput measurements is also difficult to predict, since different analytes in the assay each have different expected effect sizes and precision. For the quantitative analysis of IGF-1 stimulation versus control, we assumed that replicate measures of a typical phosphopeptide would follow a normal distribution after log transformation with coefficient of variation of approximately 20%. Under these assumptions we used a sample size of 6, which we predicted would be sufficient to detect a 2-fold change in most targets. Unpaired t-tests were used to assess significant differences between control and IGF-1 treated samples. Replicates were from independent treatments of the same source of MCF7 cells.
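The sample size reasoning for the IGF-1 comparison can be checked with a rough calculation. The sketch below is our own illustration under stated assumptions: peak areas are taken to be log-normal, the 20% coefficient of variation is converted to a standard deviation on the log2 scale, and power is approximated with a normal distribution, which overstates power somewhat at n=6 compared with an exact t-based calculation.

```python
import math
from statistics import NormalDist

def log2_sd_from_cv(cv):
    """Standard deviation on the log2 scale for a log-normal variable with the given CV."""
    return math.sqrt(math.log(1 + cv ** 2)) / math.log(2)

def approx_power(fold_change, cv, n_per_group, alpha=0.05):
    """Normal-approximation power of a two-sided, two-sample comparison of log2 values."""
    effect = abs(math.log2(fold_change))
    standard_error = log2_sd_from_cv(cv) * math.sqrt(2 / n_per_group)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(effect / standard_error - z_crit)

sd = log2_sd_from_cv(0.20)
print(f"log2 SD for 20% CV: {sd:.3f}")
print(f"approximate power for a 2-fold change, n=6: {approx_power(2, 0.20, 6):.4f}")
```

Under these assumptions a 2-fold change is several standard errors away from zero, so six replicates per condition leave a wide margin for targets that are noisier than the assumed 20% CV.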

Supplementary Material

NIHMS765948-supplement-1.doc (3.8 MB)

Acknowledgments

We would like to thank S.A. Gerber for providing data, the MacCoss lab for advice designing and analyzing DIA and PRM assays with Skyline, and the Villén lab for critical discussions. This work was supported by a Samuel and Althea Stroum Endowed Graduate Fellowship to R.L., Interdisciplinary Training in Genome Sciences grant from NIH/NHGRI (T32 HG00035) to B.S., and an Ellison Medical Foundation New Scholar Award (AG-NS-0953-12) to J.V.

Footnotes

ACCESSION CODES

Raw MS data for the experiments performed in this study are available at MassIVE (MSV000079423) and ProteomeXchange (PXD003344).

AUTHOR CONTRIBUTIONS

R.L. and J.V. conceived the study. R.L., B.S., and J.V. designed the experiments. R.L. and B.S. performed the experiments and analyzed data. A.L. created the web resource. J.V. supervised the work. R.L., B.S., and J.V. wrote the paper.

COMPETING FINANCIAL INTERESTS

The authors declare no competing financial interests.

References

1. Ubersax JA, Ferrell JE. Mechanisms of specificity in protein phosphorylation. Nat Rev Mol Cell Biol. 2007;8:530–541. doi: 10.1038/nrm2203.
2. Gerber SA, Rush J, Stemman O, Kirschner MW, Gygi SP. Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc Natl Acad Sci USA. 2003;100:6940–6945. doi: 10.1073/pnas.0832254100.
3. Soste M, et al. A sentinel protein assay for simultaneously quantifying cellular processes. Nat Methods. 2014;11:1045–1048. doi: 10.1038/nmeth.3101.
4. de Graaf EL, et al. Signal Transduction Reaction Monitoring Deciphers Site-Specific PI3K-mTOR/MAPK Pathway Dynamics in Oncogene-Induced Senescence. J Proteome Res. 2015;14:2906–2914. doi: 10.1021/acs.jproteome.5b00236.
5. Rikova K, et al. Global survey of phosphotyrosine signaling identifies oncogenic kinases in lung cancer. Cell. 2007;131:1190–1203. doi: 10.1016/j.cell.2007.11.025.
6. Huttlin EL, et al. A tissue-specific atlas of mouse protein phosphorylation and expression. Cell. 2010;143:1174–1189. doi: 10.1016/j.cell.2010.12.001.
7. Lundby A, et al. Quantitative maps of protein phosphorylation sites across 14 different rat organs and tissues. Nat Commun. 2012;3:876. doi: 10.1038/ncomms1871.
8. Sharma K, et al. Ultradeep human phosphoproteome reveals a distinct regulatory nature of Tyr and Ser/Thr-based signaling. Cell Rep. 2014;8:1583–1594. doi: 10.1016/j.celrep.2014.07.036.
9. Bantscheff M, Schirle M, Sweetman G, Rick J, Kuster B. Quantitative mass spectrometry in proteomics: a critical review. Anal Bioanal Chem. 2007;389:1017–1031. doi: 10.1007/s00216-007-1486-6.
10. Venable JD, Dong MQ, Wohlschlegel J, Dillin A, Yates JR. Automated approach for quantitative analysis of complex peptide mixtures from tandem mass spectra. Nat Methods. 2004;1:39–45. doi: 10.1038/nmeth705.
11. Gillet LC, et al. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol Cell Proteomics. 2012;11:O111.016717. doi: 10.1074/mcp.O111.016717.
12. Peterson AC, Russell JD, Bailey DJ, Westphall MS, Coon JJ. Parallel reaction monitoring for high resolution and high mass accuracy quantitative, targeted proteomics. Mol Cell Proteomics. 2012;11:1475–1488. doi: 10.1074/mcp.O112.020131.
13. Parker BL, et al. Targeted phosphoproteomics of insulin signaling using data-independent acquisition mass spectrometry. Sci Signal. 2015;8:rs6. doi: 10.1126/scisignal.aaa3139.
14. Dickhut C, Feldmann I, Lambert J, Zahedi RP. Impact of digestion conditions on phosphoproteomics. J Proteome Res. 2014;13:2761–2770. doi: 10.1021/pr401181y.
15. Giansanti P, et al. An Augmented Multiple-Protease-Based Human Phosphopeptide Atlas. Cell Rep. 2015;11:1834–1843. doi: 10.1016/j.celrep.2015.05.029.
16. de Graaf EL, Giansanti P, Altelaar AFM, Heck AJR. Single-step enrichment by Ti4+-IMAC and label-free quantitation enables in-depth monitoring of phosphorylation dynamics with high reproducibility and temporal resolution. Mol Cell Proteomics. 2014;13:2426–2434. doi: 10.1074/mcp.O113.036608.
17. Gerber SA. unpublished.
18. Hornbeck PV, et al. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 2015;43:D512–20. doi: 10.1093/nar/gku1267.
19. Röst HL, et al. OpenSWATH enables automated, targeted analysis of data-independent acquisition MS data. Nat Biotechnol. 2014;32:219–223. doi: 10.1038/nbt.2841.
20. Ting YS, et al. Peptide-Centric Proteome Analysis: An Alternative Strategy for the Analysis of Tandem Mass Spectrometry Data. Mol Cell Proteomics. 2015;14:2301–2307. doi: 10.1074/mcp.O114.047035.
21. MacLean B, et al. Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics. 2010;26:966–968. doi: 10.1093/bioinformatics/btq054.
22. Eng JK, Jahan TA, Hoopmann MR. Comet: an open-source MS/MS sequence database search tool. Proteomics. 2013;13:22–24. doi: 10.1002/pmic.201200439.
23. Käll L, Canterbury JD, Weston J, Noble WS, MacCoss MJ. Semi-supervised learning for peptide identification from shotgun proteomics datasets. Nat Methods. 2007;4:923–925. doi: 10.1038/nmeth1113.
24. Beausoleil SA, Villen J, Gerber SA, Rush J, Gygi SP. A probability-based approach for high-throughput protein phosphorylation analysis and site localization. Nat Biotechnol. 2006;24:1285–1292. doi: 10.1038/nbt1240.
25. Tsou CC, et al. DIA-Umpire: comprehensive computational framework for data-independent acquisition proteomics. Nat Methods. 2015;12:258–264. doi: 10.1038/nmeth.3255.


