Abstract
Phosphorylation is a post-translational modification (PTM) fundamental for processes such as signal transduction and enzyme activity. We propose to apply data-independent acquisition (DIA) using mass spectrometry (MS) to determine unexplored phosphorylation events on isobarically modified peptides. Such peptides are commonly not quantitatively discriminated in phosphoproteomics due to their identical mass.
Phosphorylation is a reversible modification that regulates many biological events in a cell, including cell cycle, growth, apoptosis and signal transduction pathways1. It is currently the most studied and characterized post-translational modification (PTM) in biological systems, and likely one of the most abundant in nature. Its misregulation can be detrimental, giving rise to abnormal phenotypes and many diseases such as cancer2, 3. Epidermal growth factor (EGF) binds to the EGF receptor on the external cell membrane and stimulates cell signalling, modulating the activity of enzymes involved in specific pathways, including transcriptional regulators, small GTPases and kinases, by stimulating phosphorylation events4–6. Since EGF stimulation influences complex cell signalling pathways, numerous phosphorylation and de-phosphorylation events occur in response to EGF4. Antibody-based approaches for characterizing phosphorylation are labor intensive and time consuming, which makes them impractical for studying a large number of protein phosphorylation events.
Mass spectrometry (MS), especially when coupled with nano-liquid chromatography (nLC), has become the method of choice to characterize PTMs on a large scale and in an unbiased manner7. In MS, numerous acquisition methods have been developed to address different needs in protein and peptide characterization. For instance, selected reaction monitoring (SRM) gives the highest sensitivity and selectivity due to its ability to select predetermined precursor and product ions from complex mixtures. On the other hand, methods like data-dependent acquisition (DDA) are preferred for discovery proteomics, since the instrument automatically determines which signals should undergo fragmentation, generating a list of MS/MS spectra suitable for database searching and peptide identification8. More recently, data-independent acquisition (DIA) methods have emerged as a compromise between SRM and DDA9, 10. These methods use a first mass analyzer to generate wide isolation windows (10–25 m/z) stepping across a mass range, collecting sequential MS/MS spectra at every instrument duty cycle. This leads to the production of an ion map of fragments from all detectable precursor masses11, 12. This type of raw file is suitable for the so-called “virtual SRM”, where the abundance of selected peptide candidates is integrated using MS and MS/MS extracted ion chromatography (XIC)12. However, such complex spectra are generally not ideal for peptide identification. Thus, the most common workflow currently is to perform a DDA run for peptide identification followed by DIA runs for accurate quantification. For this purpose, bioinformatics tools such as Skyline13, OpenSWATH™14 and commercial software packages have been developed to integrate such datasets and facilitate the analysis.
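The window-stepping scheme described above can be sketched in a few lines. This is only an illustration: the 400–1000 m/z range, the 20 m/z width (chosen within the 10–25 m/z range mentioned in the text) and the function names are assumptions, and real methods often use overlapping or variable-width windows.

```python
def dia_windows(mz_start, mz_end, width):
    """Return contiguous (low, high) isolation windows tiling [mz_start, mz_end)."""
    windows = []
    low = mz_start
    while low < mz_end:
        high = min(low + width, mz_end)
        windows.append((low, high))
        low = high
    return windows


def window_index(precursor_mz, windows):
    """Return the index of the window that isolates a precursor, or None."""
    for i, (low, high) in enumerate(windows):
        if low <= precursor_mz < high:
            return i
    return None


# Hypothetical acquisition scheme: 400-1000 m/z covered by 20 m/z windows.
windows = dia_windows(400.0, 1000.0, 20.0)
print(len(windows), "windows per duty cycle")

# Isobaric peptides share one precursor m/z, so they are always co-isolated
# in the same window and fragmented together (hypothetical m/z value).
print("window for m/z 523.7:", windows[window_index(523.7, windows)])
```

Because every precursor inside a window is fragmented at each cycle, the fragments of co-eluting isobaric species are recorded in the same MS/MS spectra, which is what makes fragment-level discrimination possible.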
Due to its nature, DIA has the potential to differentially quantify isobaric peptides, i.e. peptides with the same sequence and PTMs, but with the PTMs on different amino acid residues. We recently demonstrated its efficiency in the analysis of histone samples, as histone peptides are highly enriched in isobaric modifications15, 16. However, histone protein samples are relatively simple mixtures. Thus far, no attempt has been made on proteome-wide whole-cell lysates. In this study, we performed DDA followed by DIA on an Orbitrap Fusion™ (Thermo Scientific) for the analysis of the entire phosphoproteome of EGF-stimulated HeLa cells. After serum starvation, we treated HeLa cells with EGF for 0, 5, and 20 min. We then extracted total proteins, performed traditional protein digestion using trypsin and enriched phosphopeptides using titanium dioxide (TiO2) chromatography.
First, we analyzed the phosphoproteome using DDA, where we identified approximately 12,000 phosphopeptides (Figure 1 and Table S1). This number includes phosphopeptides with ambiguous localization sites, as shown by the localization confidence score in Table S1. After using DDA to generate a spectral library, we performed DIA runs. From the elution profiles, most peak widths were at least 1 min (calculated by manual estimation of the peak baseline); given that the instrument duty cycle was about 2 seconds, this indicated that each chromatographic profile was accurately defined by about 30 data points. Peptide identifications from DDA and DIA raw files were imported into Skyline13. The software combined the identification list with peak area integration; this list was filtered using the mProphet algorithm available in Skyline itself. This led to the quantification of more than 7,000 phosphopeptides (Figure 1 and Table S2). Of these, about 1,500 had at least one isobaric form also quantified (Figure 1 and Table S3).
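The estimate of about 30 data points per chromatographic peak follows directly from the peak width and the duty cycle given above. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
def points_per_peak(peak_width_s, cycle_time_s):
    """Number of MS/MS sampling points across one chromatographic peak."""
    return peak_width_s / cycle_time_s


# Values from the text: peak width of at least 1 min, duty cycle of about 2 s.
print(points_per_peak(60.0, 2.0))  # 30.0
```

Wider peaks or shorter duty cycles increase the number of points and therefore the fidelity of the integrated peak area.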
We selected this last table and attempted differential quantification of isobaric phosphopeptide species using unique fragment ions, i.e. the same fragment ion type (e.g. y5) with a different mass in the two species due to the different position of the phosphoryl group. We focused only on phosphopeptides present in exactly two isobaric forms, to avoid differential quantification from spectra with three or more mixed species. This manual approach led to a dramatic decrease in usable peptides; in fact, only 27 positional isomer pairs were obtained (Figure 1). The reasons for this relatively small number, as compared to the total list of identified phosphopeptides, are multiple: (i) only some isobaric peptides have sufficiently intense unique fragment ions for differential quantification; (ii) both species must be identified with comparable scores during database searching, which might not happen because one of the two species could be of low abundance and thus not considered a confident identification; (iii) there is presently no automated data analysis software, so all relative ratios had to be calculated manually. In addition, 28 of the isobaric phosphopeptides we considered (14 pairs) had different retention times and showed distinct chromatographic profiles (Figure 2A). These do not require MS/MS-based quantification, as their abundance can be easily discriminated using traditional MS-based XIC.
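The notion of a unique fragment ion can be made concrete with a short sketch. It is not the procedure used in this study, which was manual; it simply lists the b and y ions that carry a different number of phosphoryl groups in two positional isomers. The function name and the example sites are hypothetical, and positions are 1-based within the peptide.

```python
def discriminating_ions(length, sites_a, sites_b):
    """List b/y ions whose phosphate count differs between two isomers.

    length  : number of residues in the peptide
    sites_a : 1-based phosphosite positions of isomer A
    sites_b : 1-based phosphosite positions of isomer B
    """
    ions = []
    for n in range(1, length):
        # b_n covers residues 1..n
        if sum(s <= n for s in sites_a) != sum(s <= n for s in sites_b):
            ions.append(f"b{n}")
        # y_n covers residues (length - n + 1)..length
        y_start = length - n + 1
        if sum(s >= y_start for s in sites_a) != sum(s >= y_start for s in sites_b):
            ions.append(f"y{n}")
    return ions


# Hypothetical 10-residue peptide, phosphorylated at positions 3+5 or 3+8.
print(discriminating_ions(10, {3, 5}, {3, 8}))
```

Only ions that span one of the differing sites but not the other are informative; in practice they must also be intense enough in both species to be integrated, which is limitation (i) above.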
One example of co-eluting and discriminated isobaric phosphopeptides is shown in Figure 2B. The peptide AASPSPQSVR (residues 747–756) was identified as belonging to the translated cDNA FLJ61739 with the phosphorylation pattern Ser 749 and 751, but also Ser 749 and 754. At the precursor mass level these two species completely overlap and thus cannot be quantified separately by MS XIC. By using the y5 fragment, which has a unique mass for each of the two species, we could determine the relative abundance of these peptides in all three analyzed conditions. It is worthwhile to note that phosphorylation on serine 751 decreases 5 min after EGF stimulation, while phosphorylation on serine 754 does not decrease until 20 min after EGF treatment. More examples are illustrated in the Skyline file included as supporting information (for review, available at https://www.dropbox.com/sh/1tcsvanalqg6cd6/AADhClPRYcCtaTBoKH4T-jUaa?dl=0).
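To illustrate why y5 discriminates the two species, the following sketch computes the singly charged y5 m/z for both phosphorylation patterns. It assumes standard monoisotopic residue masses and a phosphorylation mass shift of 79.96633 Da (HPO3), and it ignores neutral losses; the function name and the mapping of protein residue numbers to peptide positions (Ser 749, 751 and 754 become positions 3, 5 and 8) are introduced here for illustration.

```python
# Monoisotopic residue masses (Da) for the amino acids in AASPSPQSVR.
MONO = {"A": 71.03711, "S": 87.03203, "P": 97.05276,
        "Q": 128.05858, "V": 99.06841, "R": 156.10111}
WATER = 18.010565
PROTON = 1.007276
PHOSPHO = 79.96633  # HPO3 added to Ser/Thr/Tyr


def y_ion_mz(sequence, n, phospho_positions, charge=1):
    """m/z of the y_n ion; phospho_positions are 1-based peptide positions."""
    start = len(sequence) - n + 1          # first residue covered by y_n
    fragment = sequence[start - 1:]
    mass = sum(MONO[aa] for aa in fragment) + WATER
    mass += PHOSPHO * sum(1 for p in phospho_positions if p >= start)
    return (mass + charge * PROTON) / charge


peptide = "AASPSPQSVR"        # residues 747-756
species_a = {3, 5}            # Ser 749 + Ser 751
species_b = {3, 8}            # Ser 749 + Ser 754

y5_a = y_ion_mz(peptide, 5, species_a)   # PQSVR, no phosphate
y5_b = y_ion_mz(peptide, 5, species_b)   # PQS(ph)VR, one phosphate
print(f"y5 (S749+S751): {y5_a:.4f}")
print(f"y5 (S749+S754): {y5_b:.4f}")
print(f"difference:     {y5_b - y5_a:.4f}")
```

The y5 fragment (PQSVR) contains Ser 754 but neither Ser 749 nor Ser 751, so the two species differ at this fragment by exactly one phosphoryl group, whereas their precursor masses are identical.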
Finally, to evaluate the accuracy of MS/MS-based quantification, we considered the variance between technical replicates for the analyzed isobaric species (Figure 2D). The precursor ion signal provided the most reproducible observation (average CV 9.66%), as expected, since precursors are the most intense detectable ions. We also estimated the variance of the ions used to calculate the relative ratio between isobaric phosphopeptides by performing three different comparisons. First, we estimated the variance of the fragment ion intensity between replicates (product, CV 17.70%); then, we calculated the ratio between the fragment ion of isobaric species A and that of isobaric species B, and verified the reproducibility of this measurement between replicates (ratio product, CV 21.14%); finally, we calculated, when possible, the ratio of two isobaric peptides A and B using different fragment ions, and verified the reproducibility of the resulting ratio (multiple ratio products, CV 72.46%). The results indicated that precursor and product ion signals were reproducible between replicates (CV below 20%). However, the calculated ratio between two isobaric peptides varied considerably depending on which fragment ion was used. This was expected, as very low abundance fragment ions, even if detectable, might have a compressed dynamic range and therefore bias the ratio calculation. In this study, if more than one unique fragment was detected, we considered the average ratio between the two isobaric isoforms.
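The comparisons above rest on the coefficient of variation (CV) between technical replicates. A minimal sketch of the "product" and "ratio product" calculations is given below; the intensities are invented for illustration, the function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify the estimator.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)


def ratios(intensities_a, intensities_b):
    """Per-replicate ratio of species A to species B fragment intensities."""
    return [a / b for a, b in zip(intensities_a, intensities_b)]


# Hypothetical unique-fragment intensities in three technical replicates.
y5_a = [1.0e6, 1.2e6, 0.9e6]   # species A
y5_b = [2.1e6, 2.3e6, 2.0e6]   # species B

print(f"product CV, species A: {cv_percent(y5_a):.1f}%")
print(f"product CV, species B: {cv_percent(y5_b):.1f}%")
print(f"ratio product CV:      {cv_percent(ratios(y5_a, y5_b)):.1f}%")
```

The "multiple ratio products" comparison applies the same CV calculation to ratios obtained from different unique fragment ions of the same peptide pair rather than from replicates of a single fragment.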
Conclusions
We performed proof-of-principle experiments to introduce DIA as a valuable tool for differential quantification of isobaric phosphopeptides in proteomics analyses. The application of this method is still limited by (i) the challenge of identifying multiple species from the same MS/MS spectrum, (ii) the sensitivity required to detect unique fragment ions of the same type for the two isobaric isoforms (e.g. y5 and y5′) and (iii) the mathematical limitations in calculating the relative ratio between more than two isoforms. However, we speculate that the rapidly increasing interest in DIA methods will lead to the development of suitable software for data processing, identification and quantification of positionally isobaric modified peptides, as much is currently missed due to our inability to discriminate the abundance of co-eluting peptides with the same precursor mass.
Acknowledgments
We gratefully acknowledge funding from NIH grants R01GM110174 and R01AI118891, and a Leukemia & Lymphoma Society Dr. Robert Arceci Scholar Award.
Footnotes
Electronic Supplementary Information (ESI) available: Materials and Methods (document); Supplementary tables 1–3 (Excel tables).
References
- 1. Cohen P. Trends in biochemical sciences. 2000;25:596–601. doi: 10.1016/s0968-0004(00)01712-6.
- 2. Chi P, Allis CD, Wang GG. Nature Reviews Cancer. 2010;10:457–469. doi: 10.1038/nrc2876.
- 3. Portela A, Esteller M. Nat Biotechnol. 2010;28:1057–1068. doi: 10.1038/nbt.1685.
- 4. Olsen JV, Blagoev B, Gnad F, Macek B, Kumar C, Mortensen P, Mann M. Cell. 2006;127:635–648. doi: 10.1016/j.cell.2006.09.026.
- 5. Tong JF, Li LJ, Ballermann B, Wang ZX. Plos One. 2016;11. doi: 10.1371/journal.pone.0147103.
- 6. Zhao YT, Feresin RG, Falcon-Perez JM, Salazar G. Traffic. 2016;17:267–288. doi: 10.1111/tra.12371.
- 7. Karch KR, Denizio JE, Black BE, Garcia BA. Frontiers in genetics. 2013;4:264. doi: 10.3389/fgene.2013.00264.
- 8. Zhang Y, Fonslow BR, Shan B, Baek MC, Yates JR 3rd. Chemical reviews. 2013;113:2343–2394. doi: 10.1021/cr3003533.
- 9. Ma D, Chan MK, Lockstone HE, Pietsch SR, Jones DN, Cilia J, Hill MD, Robbins MJ, Benzel IM, Umrania Y, Guest PC, Levin Y, Maycox PR, Bahn S. Journal of proteome research. 2009;8:3284–3297. doi: 10.1021/pr800983p.
- 10. Gillet LC, Navarro P, Tate S, Rost H, Selevsek N, Reiter L, Bonner R, Aebersold R. Molecular & cellular proteomics : MCP. 2012;11:O111.016717. doi: 10.1074/mcp.O111.016717.
- 11. Hopfgartner G, Tonoli D, Varesio E. Analytical and bioanalytical chemistry. 2012;402:2587–2596. doi: 10.1007/s00216-011-5641-8.
- 12. Reiter L, Rinner O, Picotti P, Huttenhain R, Beck M, Brusniak MY, Hengartner MO, Aebersold R. Nature methods. 2011;8:430–435. doi: 10.1038/nmeth.1584.
- 13. MacLean B, Tomazela DM, Shulman N, Chambers M, Finney GL, Frewen B, Kern R, Tabb DL, Liebler DC, MacCoss MJ. Bioinformatics. 2010;26:966–968. doi: 10.1093/bioinformatics/btq054.
- 14. Rost HL, Rosenberger G, Navarro P, Gillet L, Miladinovic SM, Schubert OT, Wolski W, Collins BC, Malmstrom J, Malmstrom L, Aebersold R. Nat Biotechnol. 2014;32:219–223. doi: 10.1038/nbt.2841.
- 15. Sidoli S, Lin S, Xiong L, Bhanu NV, Karch KR, Johansen E, Hunter C, Mollah S, Garcia BA. Molecular & cellular proteomics : MCP. 2015;14:2420–2428. doi: 10.1074/mcp.O114.046102.
- 16. Sidoli S, Simithy J, Karch KR, Kulej K, Garcia BA. Analytical chemistry. 2015;87:11448–11454. doi: 10.1021/acs.analchem.5b03009.