Abstract
We have explored the use of electrostatic repulsion hydrophilic interaction chromatography (ERLIC) as an alternative to the gold standard in shotgun proteomics: reversed-phase (RP) LC for online ESI-MS/MS. Conditions for sample solubilization and initial gradient conditions were optimized to strike a balance between peptide solubility and maximum peptide retention when using mobile phase with high organic solvent concentration. Online ERLIC-MS demonstrated a 57% increase in total peptide identifications compared to RP-MS. We examined the mechanism of this improved performance and found that it stems from ERLIC’s propensity to retain longer peptides, which can be identified with greater confidence. Online nanoscale ERLIC-MS provides a powerful new tool for enhancing MS-based shotgun proteomics in a broad range of applications.
Keywords: online ERLIC, reversed phase liquid chromatography, mass spectrometry, shotgun proteomics
Introduction
RPLC online with ESI-MS/MS is the foundational platform for shotgun proteomics. RP is well-understood, retaining and separating peptides based on hydrophobic character using volatile, MS-friendly mobile phase solvents. Few alternatives to RPLC are in use, and none show improved performance for general shotgun proteomic studies in complex mixtures. Hydrophilic interaction liquid chromatography (HILIC), which works via a pseudo-normal phase separation, has been described as an online alternative to RP. However, its limitations in retaining, separating and eluting peptides across the physicochemical extremes inherent to complex tryptic peptide mixtures1 have constrained its use mainly to the analysis of peptides with hydrophilic PTMs, especially glycosylation and phosphorylation.2–4
Online electrostatic repulsion hydrophilic interaction chromatography (ERLIC) offers a promising alternative to RP. ERLIC uses a weak anion exchange stationary phase and a mobile phase gradient of decreasing organic solvent and pH, separating peptides in decreasing order of pI and polarity.5 Two modes of retention are superimposed during such a separation. Early in the gradient, at high organic solvent, peptides are retained by a hydrophilic attraction to the stagnant aqueous layer surrounding the stationary phase. Later in the gradient, as the organic content is decreased and the hydrophilic attraction becomes less significant, electrostatic forces begin to dominate. Basic peptides are strongly hydrophilic but are repelled electrostatically from the anion exchange column while acidic peptides are eluted once the mobile phase becomes sufficiently acidic to neutralize their carboxyl groups.
Although effective for offline fractionation, coupling of ERLIC online with ESI has been limited.6 Here, we show optimized conditions for online ERLIC-MS and show that online ERLIC-MS outperforms RPLC-MS in the identification of proteins in complex mixtures, thereby offering a powerful new tool in shotgun proteomics.
Methods
Cell culture
Yeast cells (strain BY4742) were grown in YPD broth overnight at 30°C. HeLa cells (NIH ATCC CCL-2.2) were grown in DMEM supplemented with 10% FBS, L-glutamine and penicillin/streptomycin.
Sample preparation
Cells were washed three times in cold PBS containing a protease inhibitor cocktail (Roche). The washed cells were lysed in 50 mM Tris pH 8.0 with 2% SDS at 95°C for 10 min with intermittent vortexing. Cellular debris was removed by centrifuging at 16,100×g and recovering the supernatant into a clean microfuge tube. Protein recovery was measured using the micro BCA assay (Thermo Scientific). Proteins were reduced in DTT for 1h at 55°C and digested with trypsin following the FASP protocol,7 with iodoacetamide as the cysteine alkylating reagent. The resulting peptides were desalted using SPE cartridges (tC18 Sep-pak, Waters).
Peptide solubility studies
For peptide solubility studies, yeast peptides were aliquoted into separate tubes, each containing 5 μg total peptides, and the solvent was removed by vacuum drying. Solvents were prepared by mixing acetonitrile (ACN) containing 0.1% ammonium acetate with 97.9% water, 2% ACN and 0.1% formic acid to generate mixtures with final ACN concentrations of 75, 80, 83.3, 86.7 and 90%. Five μL of solvent was added to a yeast sample followed by vortexing for ~1 min. The solvent with dissolved peptides, designated as the “soluble fraction”, was then transferred to a new tube and both tubes were dried by vacuum centrifugation. Peptides remaining in the original tube were designated as the “insoluble fraction”. Both fractions were dissolved in 6 μL of RP load solvent (2% ACN and 0.1% formic acid in water) and analyzed by RP-LC-MS on an LTQ-Orbitrap Velos using a 60 min gradient from 2–40% ACN with constant 0.1% formic acid. The gradient was delivered by an Eksigent 1DLC LC system. Columns were packed to 13 cm with 5μm, 200 Å C18AQ particles in a 75 μm I.D. electrospray tip (New Objective). Electrospray was performed at 2.0kV. The LTQ-Orbitrap Velos was operated in a top-ten data-dependent mode using survey scans at 30,000 resolution from 300–1800 m/z. Tandem MS scans were acquired with an isolation width of 2 m/z and fragmentation mode was HCD with 40% normalized collision energy for 0.1 ms. The automatic gain control settings were 3×10⁵ ions in the ion trap, and 1×10⁶ in the Orbitrap. Dynamic exclusion was used with a duration of 15 s and a repeat count of 1.
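The solvent series described above amounts to a two-component mixing calculation. The following is a minimal sketch, assuming that the organic stock can be treated as 100% ACN, that the aqueous stock contains 2% ACN (v/v), and that volumes are additive; the function name `mix_volumes` and the 100 μL batch size are illustrative and not taken from the original protocol.

```python
# Sketch: volumes of the two stock solvents needed to reach a target final %ACN.
# Assumptions (not from the original protocol): the organic stock is treated as
# 100% ACN, the aqueous stock as 2% ACN (v/v), and volumes are additive.

def mix_volumes(target_acn_pct, total_ul, organic_acn_pct=100.0, aqueous_acn_pct=2.0):
    """Return (organic_ul, aqueous_ul) that give target_acn_pct in total_ul."""
    if not aqueous_acn_pct <= target_acn_pct <= organic_acn_pct:
        raise ValueError("target must lie between the two stock concentrations")
    # Solve f*organic + (1 - f)*aqueous = target for the organic fraction f.
    frac_organic = (target_acn_pct - aqueous_acn_pct) / (organic_acn_pct - aqueous_acn_pct)
    organic_ul = frac_organic * total_ul
    return organic_ul, total_ul - organic_ul


if __name__ == "__main__":
    for target in (75, 80, 90):
        organic, aqueous = mix_volumes(target, total_ul=100.0)
        print(f"{target}% ACN: {organic:.2f} uL organic + {aqueous:.2f} uL aqueous")
```

Because the aqueous stock already contains 2% ACN, slightly less than 75 parts of organic stock are needed to reach 75% ACN.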
Online ERLIC-MS and RPLC-MS
ERLIC columns were prepared by packing a 75μm I.D. electrospray tip (New Objective) to 11cm with polyWAX bulk material (5μm, 300 Å; PolyLC) in a slurry of ACN. The column was connected to a Paradigm MS4 system (Michrom Bioresources). Solvent A was 2% ACN and 0.1% formic acid in water. Solvent B was ACN with 0.1% ammonium acetate. A linear gradient was run from 75–30% B over 60 min followed by isocratic elution at 30% B for 4 min with a flow rate of 0.25 μL/min. Yeast lysate samples (5μg) were dissolved in 5μL of a 1:3 mixture of solvents A and B and loaded directly onto the column by means of a pressure vessel.
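The ERLIC gradient program can be expressed as a simple piecewise function of time. The sketch below assumes an ideal linear ramp with no dwell volume and treats solvent B as 100% ACN; the function names `percent_b` and `percent_acn` are illustrative only.

```python
# Sketch of the ERLIC gradient program: 75 -> 30 %B over 60 min, then 4 min at 30 %B.
# Assumptions: ideal linear ramp, no system dwell volume, solvent B treated as
# 100% ACN and solvent A as 2% ACN.

def percent_b(t_min, start_b=75.0, end_b=30.0, ramp_min=60.0, hold_min=4.0):
    """Programmed %B at time t_min (minutes after the start of the gradient)."""
    if not 0.0 <= t_min <= ramp_min + hold_min:
        raise ValueError("time is outside the gradient program")
    if t_min >= ramp_min:
        return end_b  # isocratic hold
    return start_b + (end_b - start_b) * t_min / ramp_min


def percent_acn(b_pct, acn_in_a=2.0, acn_in_b=100.0):
    """Total %ACN delivered for a given %B, given the ACN content of A and B."""
    return (b_pct * acn_in_b + (100.0 - b_pct) * acn_in_a) / 100.0


if __name__ == "__main__":
    for t in (0, 15, 30, 45, 60, 64):
        b = percent_b(t)
        print(f"t = {t:2d} min: {b:5.2f} %B, {percent_acn(b):5.2f} % ACN")
```

Under these assumptions the 1:3 loading mixture of solvents A and B corresponds to the 75% B starting point of the gradient.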
RP columns were prepared by packing a 75μm electrospray tip (New Objective) to 11cm with C18AQ particles (5μm, 200Å; Michrom Bioresources). Solvent A was 2% ACN and 0.1% formic acid in water. Solvent B was ACN with 0.1% formic acid. A linear gradient was run from 0–40% B over 60 min followed by isocratic elution at 80% B for 4 min with a flow rate of 0.25 μL/min. Yeast lysate samples (5μg) were dissolved in 5μL of solvent A and loaded directly onto the column by means of a pressure vessel.
Mass spectrometry and database searching
The sample was introduced into an LTQ mass spectrometer (Thermo Fisher) by performing online electrospray ionization (ESI) at 2.0 kV. The LTQ was operated in a top-five data-dependent mode using survey scans from 400–1800 m/z. Tandem MS scans were acquired with an isolation width of 2 m/z and fragmentation mode was CID with 35% normalized collision energy for 30ms at a Q value of 0.25. The automatic gain control was set to 3×10⁵ charges in the ion trap. Dynamic exclusion was used with a duration of 30 s and a repeat count of 1.
Raw files were converted to mzXML using msconvert (distributed as part of ProteoWizard 1.6.1260). Tandem mass spectra were searched against a yeast database containing proteins expressed from 5889 well-characterized open-reading frames in the yeast genome, including reversed sequences and common contaminant proteins (12316 entries) using Sequest v27.0. Search parameters included a 2.0 amu precursor and 1.0 amu fragment mass tolerance, 2 missed cleavages, partial trypsin specificity, fixed modification of cysteine carbamidomethylation and variable modification of methionine oxidation. Search results were filtered to 99% protein probability and 95% peptide probability in Scaffold (v3.3.1, Proteome Software), producing false discovery rates of 0.8–3.6%.
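Because the search database contained reversed sequences, a false discovery rate can be estimated from the number of reversed (decoy) hits that pass the filters. The sketch below shows one common convention (decoy hits divided by target hits); this convention is an assumption, as the exact formula used to produce the rates reported here is not stated, and the function name `decoy_fdr_percent` and the example flags are hypothetical.

```python
# Sketch of a target-decoy FDR estimate for a filtered list of identifications.
# Assumption: FDR (%) = 100 * decoy hits / target hits. Other conventions exist
# (for example 2 * decoys / total hits), and the original analysis may differ.

def decoy_fdr_percent(is_decoy_flags):
    """Estimate FDR (%) from an iterable of booleans (True = reversed/decoy hit)."""
    flags = list(is_decoy_flags)
    n_decoy = sum(1 for flag in flags if flag)
    n_target = len(flags) - n_decoy
    if n_target == 0:
        raise ValueError("no target hits passed the filter")
    return 100.0 * n_decoy / n_target


if __name__ == "__main__":
    # Hypothetical filtered list: 198 forward hits and 2 reversed hits.
    example = [False] * 198 + [True] * 2
    print(f"estimated FDR: {decoy_fdr_percent(example):.2f}%")
```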
Isobaric tagging experiment
Peptides from yeast or HeLa cells were dissolved in the manufacturer-supplied buffer and labeled with the 114 and 117 iTRAQ® (Applied Biosystems) labels, respectively, at room temperature for 1h and desalted with Sep-Pak cartridges. Equal amounts of labeled yeast and HeLa peptides were combined to create a two-species sample. This mixed sample was loaded directly onto an ERLIC or RP column and analyzed as above except that pulsed Q dissociation8 was used with 35% normalized collision energy for 0.1ms at a Q value of 0.70. The data were searched against a combined database of human and yeast forward and reversed sequences and common contaminants. The values of the iTRAQ® reporter ions were extracted from all MS2 scans with in-house software.
Data processing
Peptide isoelectric points were calculated using an in-house Perl script adapted from the Trans Proteomic Pipeline’s piCalculator (Seattle Proteome Center). Pseudo-3D plots were created in Xcalibur (v2.2, Thermo Scientific) using map view. Graphs were created in Prism (v5.04, GraphPad Software) and Microsoft Excel (2010, Microsoft Corporation).
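For readers unfamiliar with pI calculation, the general approach is to find the pH at which the summed Henderson-Hasselbalch charges of all ionizable groups equal zero. The following is a minimal sketch of that idea, not a reproduction of the in-house script or of piCalculator; the pKa values are illustrative assumptions, and different pKa tables will shift the result.

```python
# Sketch of a peptide pI estimate by bisection on the net charge.
# Assumption: the pKa values below are illustrative, not those used by piCalculator.

POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 8.6, 3.6


def net_charge(seq, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    positive = 1.0 / (1.0 + 10.0 ** (ph - N_TERM_PKA))
    positive += sum(1.0 / (1.0 + 10.0 ** (ph - POSITIVE_PKA[aa]))
                    for aa in seq if aa in POSITIVE_PKA)
    negative = 1.0 / (1.0 + 10.0 ** (C_TERM_PKA - ph))
    negative += sum(1.0 / (1.0 + 10.0 ** (NEGATIVE_PKA[aa] - ph))
                    for aa in seq if aa in NEGATIVE_PKA)
    return positive - negative


def isoelectric_point(seq, tol=1e-4):
    """pH at which the net charge is zero; the charge decreases monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(seq, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


if __name__ == "__main__":
    for peptide in ("AEDDEELK", "AKRKGR"):  # hypothetical acidic and basic peptides
        print(peptide, round(isoelectric_point(peptide), 2))
```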
Results and Discussion
Published offline ERLIC2, 5, 9 and HILIC10, 11 experiments frequently use a mobile phase containing 80–90% acetonitrile (ACN) for peptide loading and initial gradient conditions. However, not all peptides may be soluble in highly organic solvents,12 raising a concern about sample loss prior to LC loading. To assess this, we dissolved yeast lysate peptides in 75, 80, 83.3, 86.7 and 90% ACN, and analyzed both the insoluble peptide pellet and the dissolved peptides by RP-LC-MS. The number of insoluble peptides was 4.4 times higher in 90% ACN relative to 75%, indicating that a lower starting amount of ACN minimizes peptide losses. Similarly, 2.3 times as many peptides were identified in the soluble fraction of peptides dissolved in 75% ACN compared to 90% ACN (Figure 1). The situation is moderately exacerbated at 4°C (data not shown), as might occur in a chilled autosampler tray. While this trend of increased peptide solubility might continue below 75% ACN, such solvent conditions would risk lack of retention for more basic and hydrophobic peptides.
We tested the efficacy of ERLIC separations beginning at 75 and 82.5% ACN. We found marginally more peptide identifications starting at 75% ACN (data not shown), suggesting that the gain from conditions optimized for peptide solubility outweighs any potential loss of peptide retention by ERLIC at lower starting %ACN. Thus, we concluded that 75% ACN for sample solubilization and loading offered an optimal balance between peptide solubility and retention. This is consistent with others’ offline ERLIC3, 13, 14 and HILIC experiments.15
With optimized conditions for sample loading in hand, we compared three technical replicates of optimized analyses of column-loaded whole cell yeast lysates using ERLIC or RP. Figure 2 and Table 1 show that ERLIC greatly outperformed RP. Supplementary Figure 1 shows representative chromatograms for each separation. Overall, ERLIC identified with high confidence 39% more nonredundant peptides, leading to 40% more protein identifications. While this report focused on the results from Sequest, similar results were found when searching using Mascot, X!Tandem and OMSSA, resulting in 39–59% more proteins identified by online ERLIC (Supplementary Figure 2). Supplementary Table 1 shows all peptide sequence matches and proteins identified using Sequest. The mean GRAVY score16 of ERLIC-identified peptides was significantly lower than in RP (p < 0.0001), demonstrating that ERLIC better retains polar peptides (Supplementary Figure 3a). The mean peptide pI for ERLIC was also significantly lower than RP (p < 0.0001), albeit with a narrower range (Figure 2c). Consistent with these observations, ERLIC identified more acidic residues and fewer aliphatic residues than RP (Supplementary Figures 3b, c). Of particular note, the ERLIC separations consistently produced a higher peptide identification rate (1.7-fold greater overall); that is, the fraction of MS2 spectra which produced a confident peptide spectral match (PSM) relative to the total MS2 acquired was always higher with ERLIC.
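The GRAVY score referred to above is the mean Kyte-Doolittle hydropathy of a peptide's residues, so more negative values indicate more polar peptides. A minimal sketch of the calculation follows; the example peptides are hypothetical and were not taken from the data set.

```python
# Sketch of the GRAVY (grand average of hydropathy) score using the
# Kyte-Doolittle scale. The example peptides are hypothetical.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Mean hydropathy of a peptide; negative values indicate a polar peptide."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    for peptide in ("DEEDSGK", "LLVIAGK"):
        print(peptide, round(gravy(peptide), 2))
```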
Table 1.
Metric | ERLIC | RP |
---|---|---|
Protein IDs, 2-peptide minimum (FDR, %) | 527 (0.9) | 377 (0.8) |
Protein IDs, 1-peptide minimum (FDR, %) | 563 (1.8) | 414 (3.6) |
all PSMs (FDR, %) | 11322 (0.15) | 7203 (0.25) |
nonredundant PSMs (FDR, %) | 2936 (0.55) | 2119 (0.90) |
average Xcorr (s.d.) | 3.96 (0.90) | 3.81 (0.86) |
average delta Cn (s.d.) | 0.38 (0.11) | 0.36 (0.12) |
average peptide charge (s.d.) | 2.06 (0.26) | 2.10 (0.34) |
average identification rate, % (s.d.) | 17.8 (0.53) | 10.7 (0.21) |
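The relative improvements quoted in the text follow directly from the counts in Table 1. The short sketch below reproduces that arithmetic; the helper name `percent_increase` is illustrative.

```python
# Sketch: relative improvements of ERLIC over RP computed from Table 1 values.

def percent_increase(erlic, rp):
    """Percentage by which the ERLIC value exceeds the RP value."""
    return 100.0 * (erlic - rp) / rp


TABLE_1 = {
    "all PSMs": (11322, 7203),
    "nonredundant PSMs": (2936, 2119),
    "Protein IDs, 2-peptide minimum": (527, 377),
}

if __name__ == "__main__":
    for metric, (erlic, rp) in TABLE_1.items():
        print(f"{metric}: {percent_increase(erlic, rp):.0f}% more with ERLIC")
    # Ratio of the average identification rates (17.8% vs 10.7%).
    print(f"identification rate ratio: {17.8 / 10.7:.1f}-fold")
```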
Four potential reasons for ERLIC’s improvements in MS2 spectral identification rate relative to RP were explored. First, the potential for improved precursor ion purity in ERLIC separations was examined. Pseudo-3D plots of representative chromatograms, sorted on MS1 or MS2 scan intensity (Figure 3b–e) or by peptide identifications (Figure 3a) all show that in ERLIC, peptides are almost ideally spread across the m/z vs retention time space. In RP, however, detected peptides occupy a much smaller portion of the retention time space and fall largely on a diagonal, indicative of the expected correlation between m/z (and peptide mass by extrapolation) and retention time in RP. Given this observation, an isobaric tagging experiment using lysates from yeast and human cells was performed to test whether ERLIC’s improved separation of peptides would produce a greater proportion of “pure” MS2 spectra originating from a single peptide precursor ion, thereby leading to more high confidence PSMs. Because the impure spectra made up the vast majority of the spectra recorded using either RP or ERLIC (Supplementary Figure 4), the identification rate from these spectra mirrored the overall identification rates when considering all spectra (data not shown). Thus, the difference in identification rate cannot be attributed to increased precursor purity.
Second, the signal intensity of peptides could differ as a result of the higher concentration of organic solvent used in ERLIC’s mobile phase. The pseudo-3D plots (Figure 3b–e) also show the MS1 and MS2 signal intensities are generally higher in RP. Therefore, the improvements of ERLIC-MS do not seem to be caused by improved ionization efficiency and higher signal intensity due to the highly organic mobile phase relative to RP. This observation is consistent with those of Gilar et al.,12 but contrary to those of others.17, 18
Third, the charge states of the peptides identified in this study could also be influenced by the difference in organic solvent concentration of the respective mobile phases. Triply charged precursors are known to be assigned inflated scores compared to doubly charged precursors in database search programs such as Sequest,19 so a significantly different charge state distribution might explain the differences between ERLIC and RP. Doubly and triply charged peptide ions from the ERLIC runs made up 93% and 6% of the PSMs, while those from RP made up 87% and 11%, respectively (Supplementary Figure 5). The proportion of singly charged peptides was very small in both methods, indicating that singly charged peptides are not a significant factor in either ERLIC or RP. Thus, differences in charge state distribution seem unlikely to cause the observed improvement in identification rate.
Finally, we examined the physicochemical properties of the peptides identified. ERLIC retained and identified a significantly higher proportion of acidic peptides than RP (Figure 2c). Because there is no direct evidence in the literature suggesting that acidic peptides are more efficiently identified compared to basic peptides, we explored further the physical differences between acidic and basic peptides. We performed an in silico tryptic digest of the yeast proteome, focusing on peptides within the size range normally identified via LC-MS/MS, and sorted the peptides by isoelectric point. Acidic peptides tend to be longer (Figure 4a), and longer peptides score better in correlation-based programs such as Sequest and Mascot20 regardless of their pI (Supplementary Figure 6). Analysis of our data using other database search programs tested (X!Tandem and OMSSA) also showed a bias in these programs towards improved scores on longer peptides (data not shown). When concentrating on identified peptides, the distribution of peptide lengths identified by ERLIC and RP was not significantly different, with an average of 15.7 and 16.1 residues respectively. However, when considering the difference in identification rate, it is also necessary to consider the MS/MS spectra which were not matched to a peptide sequence. Thus we compared the m/z values selected for MS2 scans with the m/z values of identified peptides (Figure 4b,c). In the ERLIC runs, the two distributions match very closely. However, in RP runs a larger proportion of MS2 spectra were being taken at low m/z (400 – 600) indicating a higher proportion of smaller peptides which are less likely to produce a PSM score sufficient to meet filtering criteria. Thus, we believe that ERLIC’s propensity to retain and select for MS2 a higher proportion of longer peptides compared to RP is a main cause of the observed difference in spectral identification rate.
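The in silico digest described above can be approximated with a simple cleavage rule. The sketch below assumes the conventional trypsin rule (cleavage C-terminal to K or R, except before P), no missed cleavages, and an illustrative length window of 6–30 residues; these parameters and the example protein sequence are assumptions and were not taken from the original analysis.

```python
# Sketch of an in silico tryptic digest with a peptide length filter.
# Assumptions: cleave after K or R unless the next residue is P, no missed
# cleavages, and a hypothetical 6-30 residue window for detectable peptides.
# Splitting on a zero-width pattern requires Python 3.7 or later.
import re


def tryptic_digest(protein, min_len=6, max_len=30):
    """Return fully tryptic peptides whose length lies within [min_len, max_len]."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in pieces if min_len <= len(p) <= max_len]


def count_acidic(peptide):
    """Number of aspartate and glutamate residues in a peptide."""
    return sum(peptide.count(aa) for aa in "DE")


if __name__ == "__main__":
    protein = "MSAAAKPGGGRDDDDEELLEEVSDDEAIKGGR"  # hypothetical sequence
    for peptide in tryptic_digest(protein):
        print(peptide, len(peptide), count_acidic(peptide))
```

Sorting the resulting peptides by a calculated pI and comparing length distributions would then give the kind of comparison summarized in Figure 4a.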
Conclusions
Overall, our results show that online ERLIC-MS can be easily implemented in a standard shotgun proteomics workflow, offering a significant improvement in protein identification compared to RPLC-MS. Although, under the conditions described here, ERLIC does not retain peptides across all isoelectric points as well as RP does, its advantages are clear for the multitude of applications where sensitive and high confidence identification of proteins within complex mixtures is critical, such as analysis of protein-limited samples. Online ERLIC-MS should also be amenable to LC-based multidimensional fractionation workflows, for increased sensitivity in very complex mixtures. Finally, as with offline ERLIC,2, 14 online ERLIC-MS might offer a unique ability to analyze post-translational modifications – particularly glycosylation and phosphorylation, which increase the hydrophilicity and acidity of peptides, respectively. Our findings lay the foundation for these future studies.
Acknowledgments
This work was supported by NIH grant R01DE017734. We thank Peter Jauert for providing yeast cells and Dr. Jeongsik Yong for HeLa cells, the University of Minnesota’s Center for Mass Spectrometry and Proteomics for use of the mass spectrometers and the Minnesota Supercomputing Institute, Dr. Getiria Onsongo, and Susan van Riper for computational support.
Footnotes
Supporting Information Available: This material is available free of charge via the Internet at http://pubs.acs.org.
References
- 1. Alpert AJ. Electrostatic repulsion hydrophilic interaction chromatography for isocratic separation of charged solutes and selective isolation of phosphopeptides. Analytical Chemistry. 2008;80(1):62–76. doi: 10.1021/ac070997p.
- 2. Hao PL, Guo TN, Sze SK. Simultaneous Analysis of Proteome, Phospho- and Glycoproteome of Rat Kidney Tissue with Electrostatic Repulsion Hydrophilic Interaction Chromatography. PLoS One. 2011;6(2). doi: 10.1371/journal.pone.0016884.
- 3. Chien KY, Liu HC, Goshe MB. Development and Application of a Phosphoproteomic Method Using Electrostatic Repulsion-Hydrophilic Interaction Chromatography (ERLIC), IMAC, and LC-MS/MS Analysis to Study Marek’s Disease Virus Infection. Journal of Proteome Research. 2011;10(9):4041–4053. doi: 10.1021/pr2002403.
- 4. Zarei M, Sprenger A, Metzger F, Gretzmeier C, Dengjel J. Comparison of ERLIC-TiO2, HILIC-TiO2, and SCX-TiO2 for Global Phosphoproteomics Approaches. Journal of Proteome Research. 2011;10(8):3474–3483. doi: 10.1021/pr200092z.
- 5. Hao PL, Guo TN, Li X, Adav SS, Yang J, Wei M, Sze SK. Novel Application of Electrostatic Repulsion-Hydrophilic Interaction Chromatography (ERLIC) in Shotgun Proteomics: Comprehensive Profiling of Rat Kidney Proteome. Journal of Proteome Research. 2010;9(7):3520–3526. doi: 10.1021/pr100037h.
- 6. Hao P, Qian J, Dutta B, Cheow ES, Sim KH, Meng W, Adav SS, Alpert A, Sze SK. Enhanced Separation and Characterization of Deamidated Peptides with RP-ERLIC-Based Multidimensional Chromatography Coupled with Tandem Mass Spectrometry. Journal of Proteome Research. 2012;11(3):1804–1811. doi: 10.1021/pr201048c.
- 7. Wisniewski JR, Zougman A, Nagaraj N, Mann M. Universal sample preparation method for proteome analysis. Nature Methods. 2009;6(5):359–362. doi: 10.1038/nmeth.1322.
- 8. Griffin TJ, Xie HW, Bandhakavi S, Popko J, Mohan A, Carlis JV, Higgins L. iTRAQ reagent-based quantitative proteomic analysis on a linear ion trap mass spectrometer. Journal of Proteome Research. 2007;6(11):4200–4209. doi: 10.1021/pr070291b.
- 9. Hao PL, Ren Y, Alpert AJ, Sze SK. Detection, Evaluation and Minimization of Nonenzymatic Deamidation in Proteomic Sample Preparation. Molecular & Cellular Proteomics. 2011;10(10):1–11. doi: 10.1074/mcp.O111.009381.
- 10. Di Palma S, Boersema PJ, Heck AJR, Mohammed S. Zwitterionic Hydrophilic Interaction Liquid Chromatography (ZIC-HILIC and ZIC-cHILIC) Provide High Resolution Separation and Increase Sensitivity in Proteome Analysis. Analytical Chemistry. 2011;83(9):3440–3447. doi: 10.1021/ac103312e.
- 11. Di Palma S, Stange D, van de Wetering M, Clevers H, Heck AJR, Mohammed S. Highly Sensitive Proteome Analysis of FACS-Sorted Adult Colon Stem Cells. Journal of Proteome Research. 2011;10(8):3814–3819. doi: 10.1021/pr200367p.
- 12. Gilar M, Olivova P, Daly AE, Gebler JC. Orthogonality of separation in two-dimensional liquid chromatography. Analytical Chemistry. 2005;77(19):6426–6434. doi: 10.1021/ac050923i.
- 13. Zhang HM, Guo TN, Li X, Datta A, Park JE, Yang J, Lim SK, Tam JP, Sze SK. Simultaneous Characterization of Glyco- and Phosphoproteomes of Mouse Brain Membrane Proteome with Electrostatic Repulsion Hydrophilic Interaction Chromatography. Molecular & Cellular Proteomics. 2010;9(4):635–647. doi: 10.1074/mcp.M900314-MCP200.
- 14. Zarei M, Sprenger A, Metzger F, Gretzmeier C, Dengjel J. Comparison of ERLIC-TiO2, HILIC-TiO2, and SCX-TiO2 for Global Phosphoproteomics Approaches. Journal of Proteome Research. 2011;10(8):3474–3483. doi: 10.1021/pr200092z.
- 15. Boersema PJ, Divecha N, Heck AJR, Mohammed S. Evaluation and optimization of ZIC-HILIC-RP as an alternative MudPIT strategy. Journal of Proteome Research. 2007;6(3):937–946. doi: 10.1021/pr060589m.
- 16. Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. Journal of Molecular Biology. 1982;157(1):105–132. doi: 10.1016/0022-2836(82)90515-0.
- 17. Weng ND, Eerkes A. Development and validation of a hydrophilic interaction liquid chromatography-tandem mass spectrometric method for the analysis of paroxetine in human plasma. Biomedical Chromatography. 2004;18(1):28–36. doi: 10.1002/bmc.288.
- 18. Nguyen HP, Schug KA. The advantages of ESI-MS detection in conjunction with HILIC mode separations: Fundamentals and applications. Journal of Separation Science. 2008;31(9):1465–1480. doi: 10.1002/jssc.200700630.
- 19. Keller A, Nesvizhskii AI, Kolker E, Aebersold R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Analytical Chemistry. 2002;74(20):5383–5392. doi: 10.1021/ac025747h.
- 20. Nesvizhskii AI. Protein Identification by Tandem Mass Spectrometry and Sequence Database Searching. In: Matthiesen R, editor. Mass Spectrometry Data Analysis in Proteomics. Vol. 367. Humana Press Inc; Totowa, NJ: 2007. pp. 87–119.