Abstract
Reversed-phase liquid chromatography (RPLC) is widely used to reduce sample complexity prior to mass spectrometry (MS) analysis in bottom-up proteomics. Improving peptide separation in complex samples enables lower-abundance proteins to be identified. Multidimensional separations that combine orthogonal separation modes improve protein and peptide identifications over RPLC alone. Here we report a preparative capillary electrophoresis (CE) fractionation method that combines CE and RPLC separations. Using this method, we demonstrate improved protein and peptide identification in a tryptic digest of E. coli cell lysate, with 132 ± 33% more protein identifications and 185 ± 65% more peptide identifications over non-fractionated samples. Fractionation enables detection of lower-abundance proteins in this complex sample. We demonstrate improved coverage of ovarian cancer biomarker MUC16 isolated from conditioned cell media, with 6.73% sequence coverage using CE fractionation compared to 2.74% coverage without preparative fractionation. This new method will allow researchers performing bottom-up proteomics to harness the advantages of CE separations while using widely available LC-MS/MS instrumentation.
Bottom-up proteomics, which involves enzymatic digestion of complex protein samples and mass spectrometry (MS) analysis of the resulting peptides, typically subjects analytes to one or more separation steps prior to MS analysis.1,2 Fractionation of protein digests and other complex mixtures prior to mass spectrometry has been reported using various forms of LC, including reversed-phase chromatography (RPLC)3 and ion-exchange chromatography,4 and various modes of CE, including isotachophoresis,5,6 isoelectric focusing,7 and capillary zone electrophoresis.8,9 These separation steps provide additional analyte characterization while reducing sample complexity. Decreasing the number of peptides introduced to a mass analyzer at a given time improves the likelihood of obtaining peptide identifications via database searching, increasing the number of proteins identified in the sample. This benefit is relevant in many fields, including proteomics-based biomarker discovery, where the identification of specific peptides is required against a background of many other peptides. The biofluids of interest in biomarker discovery are highly complex and can lead to ion suppression,10 resulting in bias towards highly abundant peptides.
The advantages of a single stage of separation are enhanced by combining non-redundant separation modes that resolve analytes based on complementary molecular properties. For this reason, many proteomic techniques employ two orthogonal separation steps prior to MS analysis. At the protein level, two-dimensional (2D) separation has traditionally employed 2D gel electrophoresis, in which proteins are separated by isoelectric point and size.11 Resolved proteins are digested in-gel and analyzed with MS.12–14 A more recent example of 2D protein separation for proteomics used size-exclusion chromatography followed by CE prior to native (top-down) proteomic analysis by tandem MS.15 At the peptide level, 2D separation prior to MS has most frequently taken the form of combined chromatographic techniques. One widely adopted system is MudPIT, in which ion-exchange chromatography and RPLC are performed in a single column prior to MS.16 These complementary methods allow for additional separation between peptides, increasing peptide and protein identifications. The advantages gained from 2D separations like MudPIT have led to the widespread popularity of this technique17,18 and to other 2D LC approaches.19 Other hybrid platforms to perform 2D separations have been described. 
The Yates lab reported a solid-phase microextraction, multi-step elution, transient isotachophoresis CE-MS/MS platform connected to a porous electrospray ionization (ESI) interface that displayed improved comprehensiveness and sensitivity in a moderately complex protein mixture.11 The Ramsey group developed a hybrid multidimensional separation system in which capillary UPLC was coupled to a CE-ESI microchip; this system was applied to glycopeptide analysis.20 A detachable strong cation exchange monolith, integrated with CE and coupled with pH gradient elution, was reported by the Dovichi lab.20 The Sun lab coupled microscale RPLC and dynamic pH junction CE-MS/MS to perform deep bottom-up proteomics of a breast-cancer cell line.21
CE (in capillary zone electrophoresis mode) has been used to separate analytes prior to MS in bottom-up proteomics,3,8,22,23 and in native proteomics.24 Preparative CE25,26 has also been reported for various analytes including peptides,27,28 and used in proteomics.29 CE separates analytes based on mass-to-charge ratio and is therefore largely orthogonal to RPLC, which separates based on hydrophobicity. CE has the advantage over LC of improved resolution:30–32 over 1 000 000 theoretical plates have been reported with CE separations.33 The combination of CE and MS has been achieved using a variety of ESI interfaces,34,35 and CE-ESI-MS has been used for proteomic analysis,8,23,35,36 enabling increased peptide and protein identifications over RPLC-ESI-MS.8 However, CE-ESI-MS/MS instruments are not as widely available in industrial and academic laboratories as are LC-ESI-MS/MS instruments. Over the last decade LC-MS/MS has become routine in clinical diagnostics,37 and although CE-MS/MS shows considerable promise,38 it is currently not as common in clinical labs.
Here we present a novel configuration of techniques to accomplish 2D separation of peptides in which preparative-scale CE is followed by RPLC. Tandem mass spectrometry (MS/MS) is accomplished online through a nanoLC-ESI interface. This configuration harnesses the complementary strengths of CE and LC. Using preparative CE fractionation and RPLC-MS/MS, we identified more peptides and proteins in E. coli cell lysate and human serum than RPLC-MS/MS alone, and we increased sequence coverage of MUC16 (an important ovarian cancer biomarker) enriched from conditioned cell media. We anticipate widespread use of this approach by researchers in the proteomics field to harness the complementary advantages of CE and LC separations.
Materials and methods
Materials and reagents
Male human serum (HS), acetic acid (AA), sodium dodecyl sulfate (SDS), iodoacetamide (IAA), triethylammonium bicarbonate (TEAB), α-cyano-4-hydroxycinnamic acid, and molecular weight cutoff filters (3 kDa and 100 kDa) were purchased from Millipore Sigma (St. Louis, MO). Tris(2-carboxyethyl)phosphine (TCEP), deoxycholic acid (DCA), phosphoric acid, methanol, 0.1% formic acid (FA) in water (Burdick and Jackson), and sodium hydroxide (NaOH) were obtained through VWR. Pure FA (99% purity), acetonitrile (ACN), and C18 ZipTips™ were purchased from Fisher Scientific (Hanover Park, IL). Trypsin gold was purchased from Promega (Madison, WI), and S-Traps™ were purchased from Protifi (Huntington, NY). RPMI 1640 used for cell culture was purchased from Gibco (Dublin, Ireland). Fetal bovine serum (FBS) and 0.25% pig trypsin were purchased from Cytiva (Marlborough, MA). L-Glutamine was purchased from Lonza (Switzerland). The Pierce BCA Protein Assay Kit was purchased from Thermo Fisher (Waltham, MA). K-12 Escherichia coli (E. coli) cell lysate was provided by Matthew Champion at the University of Notre Dame. Four synthetic peptides (LTLLRPK, HLLSPLFQR, ELGPYTLDR, and VLQGLLSPIFK) were purchased from Genscript (Piscataway, NJ) and reconstituted to a final concentration of 1 mg mL−1 in 18.2 MΩ cm water.
Sample preparation
Synthetic peptide mix.
Four synthetic peptides were combined and diluted in 10 mM AA to a total of 0.4 mg mL−1 (0.1 mg mL−1 for each peptide). This mixture was fractionated on CE using the methods described below.
Digestion of human serum and E. coli cell lysate.
Fifty micrograms total protein was digested following previously described methods.39 Briefly, proteins were denatured and reduced with 10% SDS and 10 mM TCEP at 95 °C for 10 minutes. 0.2% DCA was included as a passivating agent, and 100 mM TEAB (pH 8) was included for buffering. Proteins were alkylated using 10 mM IAA for 30 min at room temperature in the dark. The alkylation reaction was quenched with phosphoric acid at a final concentration of 1.2%. 100 mM TEAB in 90% methanol was added to form a protein suspension that was spun onto an S-Trap device and rinsed following manufacturer instructions. Proteins retained on the S-Trap were digested using 1 μg trypsin gold in TEAB for every 25 μg total protein. Following digestion, peptides were eluted using 100 mM TEAB followed by 0.1% FA in water. The reaction was quenched with 10% FA. A third elution was performed using 50% acetonitrile (ACN) and 0.1% FA. All eluates were combined and dried using a SpeedVac set at 53 °C. The peptides were resuspended in 0.1% FA and desalted using ZipTips. Peptides were reconstituted in 0.5% FA, 4% ACN for immediate mass spectrometry analysis or 10 mM AA for CE fractionation.
Cell culture and preparation of conditioned cell media (CCM).
MUC16 was isolated from conditioned media of OVCAR-3 cells using methods described by Patankar and co-workers.40 OVCAR-3 cells were provided by Professor Sharon Stack (University of Notre Dame). Cells were cultured in 175 cm2 tissue culture flasks containing RPMI 1640 media supplemented with 10% FBS and 2 mM L-glutamine until they reached 100% confluency. Spent media was removed and discarded, and cells were washed two times with FBS-free RPMI 1640 without phenol red. Fresh FBS-free media was added to the flask, and cells were allowed to grow for 5 additional days. Conditioned media was removed from the flask and centrifuged for 5 min at 1000 g to pellet cell debris. The supernatant was filtered using a 3 kDa molecular weight cutoff (MWCO) filter. Following initial filtration, supernatant was filtered a second time using a 100 kDa MWCO filter. This combination of molecular weight cut-off filters enables the enrichment of MUC16 (a 3–5 MDa glycoprotein) by removing smaller proteins. Total protein concentration was determined using a BCA assay. Fifty micrograms of protein were digested using the same methods as the human serum digest except that 1 μg of trypsin was used for every 13 μg protein.
CE fractionation of peptides
Peptides were dried and reconstituted in 10 mM AA to a concentration of 0.5 mg mL−1. CE analysis and fractionation were performed on a SCIEX P/ACE MDQplus instrument equipped with a 30 cm uncoated fused silica capillary (20 cm to detection window, ID: 100 μm, OD: 375 μm) and UV absorbance detection at 214 nm. Smaller ID capillaries were not suitable because of the limited peak volume and resulting lower collected amounts of peptide. Capillary length is constrained to predetermined lengths in this commercial CE instrument and was not optimized. 1 M AA was used as background electrolyte (BGE). The capillary was rinsed with 1 M NaOH, water, and BGE prior to each injection. Fraction collection windows were determined by calculating the migration velocity of each analyte to the detector window and extrapolating to the full length of the capillary. Fractions were collected by moving the capillary outlet into separate collection vials throughout the run. The volume in the inlet and outlet vials during fraction collection was 125 μL. Samples were injected by the application of 2.0 psi to the inlet vial for 5 seconds. This injection is estimated by CE Expert Lite (SCIEX; Framingham, MA) to introduce ~0.5 μL, or 250 ng, of analyte (200 ng for synthetic peptide mix) onto the capillary. Electrophoresis was performed under a constant applied voltage of 7.0 kV for 48 minutes, followed by a 5 minute pressure step at 0.2 psi with no applied voltage to push out any remaining material. In HS digest, E. coli lysate digest, and CCM experiments, material collected during this push made up the final collected fraction. Voltage was optimized to maintain reasonable run times while generating a sufficiently wide spread of peptides in the collection windows. The capillary temperature was kept at 25 °C, and samples were stored at 15 °C. Eight CE injections were fractionated for each replicate of HS digest, E. coli lysate, and CCM. 
Three injections were fractionated for the synthetic peptide mixture. ‘Plug’ sample was collected by running the same instrument method but without iterating the capillary outlet through different collection vials, so that the sample experienced electrophoresis but not fractionation. ‘Raw’ sample refers to digested material that was never introduced to the CE system. For direct comparisons among plug, raw, and fractionated samples, the same number of injections into the mass spectrometer and total material by mass were compared. This is experimentally supported by equivalent total ion intensity between the raw and plug controls (ESI Fig. S1†). During fraction collection, a total of ~2 μg peptide was collected over eight injections for an average of 400 ng peptide per fraction. To prepare the plug samples, four injections were collected, for a total of 1 μg. Raw samples were diluted to 25 ng μL−1.
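The extrapolation used to set fraction collection windows can be sketched as follows. This is a minimal illustration, not the authors' procedure in code: it assumes each analyte moves at constant velocity along the whole capillary, takes the 30 cm total length and 20 cm length to the detection window from the text, and uses hypothetical function names and a hypothetical example peak.

```python
# Capillary geometry taken from the text.
TOTAL_LENGTH_CM = 30.0
LENGTH_TO_DETECTOR_CM = 20.0


def outlet_arrival_time(detector_time_min: float) -> float:
    """Extrapolate a migration time at the detector to the capillary outlet.

    Assumes constant velocity, so time scales with distance travelled.
    """
    if detector_time_min <= 0:
        raise ValueError("migration time must be positive")
    return detector_time_min * TOTAL_LENGTH_CM / LENGTH_TO_DETECTOR_CM


def collection_window(start_at_detector_min: float, end_at_detector_min: float):
    """Convert a window observed at the detector into a window at the outlet."""
    return (
        outlet_arrival_time(start_at_detector_min),
        outlet_arrival_time(end_at_detector_min),
    )


# Hypothetical example: a band passing the detector between 10 and 12 min.
window = collection_window(10.0, 12.0)
print(window)
```

With this geometry every detector time is multiplied by 1.5, so a band seen at the detector is collected later and over a proportionally wider time window at the outlet.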
Mass spectrometry
MALDI (synthetic peptide mix).
CE fractions were dried using a SpeedVac and reconstituted in 20 μL 4% ACN and 0.5% FA. MALDI spectra were acquired using an ultrafleXtreme MALDI-TOF-TOF mass spectrometer (Bruker Daltonics, Bremen, Germany). Mass spectra were acquired in reflectron positive ion mode with 50 laser shots per spot. The laser spot size was 2000 μm. The laser was set at 60% power with a sampling frequency of 2 kHz. Mass spectra were acquired from 500–5000 m/z. α-Cyano-4-hydroxycinnamic acid was used as the matrix.
Orbitrap (human serum digest, E. coli lysate digest, and conditioned cell media).
Peptides were analyzed using a Waters NanoAcquity liquid chromatography (LC) system coupled to a Q-Exactive mass spectrometer (Thermo Scientific). Each fraction, ‘plug,’ or ‘raw’ digest sample was dried down and reconstituted in 4% ACN and 0.5% FA to a concentration of 25 ng μL−1. Reconstitution volumes were 16 μL for each collected fraction and 40 μL for the plug. Evaporating collected samples to dryness and reconstituting into a minimal volume prior to introduction to the LC was possible because of the off-line format of this 2D separation and partially offsets the effects of dilution in the first dimension of separation. 2 μL of sample was injected onto the column, for 50 ng material injected. The concentration of peptides in the collected fractions was not known, but each fraction was treated as if it contained 20% (1/5) of the total material injected onto the CE. The LC system was equipped with a peptide BEH C18 100 μm × 100 mm column that contained 1.7 μm particles (Waters; Milford, MA). The mobile phase flow rate was 0.9 μL min−1. Peptides were separated over a 48 min gradient using a binary solvent system. Solvent A consisted of water with 0.1% FA while solvent B consisted of ACN with 0.1% FA. The following linear gradient was used for all samples: 4% B for 0–8 min, 4–7% B from 8–10 min, 7–33% B from 10–30 min, 33–90% B from 30–33 min, 90% B until 36 min, 90–4% B for 1 min, and reequilibration at 4% B from 37–48 min. The mass spectrometer was operated in top 12 data-dependent acquisition mode with automated switching between MS and MS/MS. The ion source was operated in positive ion mode at 1.8 kV, and the ion transfer tube was maintained at 280 °C. Full MS scans were acquired from 415 to 2000 m/z at a resolution of 70 000, with an AGC target of 3 × 10⁶ ions and a fill time of 80 ms. MS/MS scans were performed from 200 to 2000 m/z at a resolution of 17 500 and a maximum fill time of 120 ms. The AGC target was set at 1 × 10⁵ ions. 
An isolation window of 4.0 m/z was used for fragmentation with a normalized collision energy of 27. Dynamic exclusion was set at 40 s. Ions with a charge of +1 or greater than +6 were excluded from fragmentation.
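The sample amounts quoted in the CE fractionation and Orbitrap sections follow from simple mass accounting, which can be checked with a short sketch. The numbers are taken from the text; the variable and function names are illustrative only, and the equal split of material across five fractions is the same working assumption stated in the text, not a measured value.

```python
SAMPLE_CONC_NG_PER_UL = 500.0   # 0.5 mg/mL expressed in ng/uL
INJECTION_VOLUME_UL = 0.5       # estimated injection volume from the text
N_FRACTIONS = 5                 # fractions collected per CE run
TARGET_CONC_NG_PER_UL = 25.0    # concentration after reconstitution
LC_INJECTION_UL = 2.0           # volume injected onto the LC column


def reconstitution_volume_ul(mass_ng: float) -> float:
    """Volume needed to bring a dried sample to the target concentration."""
    return mass_ng / TARGET_CONC_NG_PER_UL


mass_per_ce_injection = SAMPLE_CONC_NG_PER_UL * INJECTION_VOLUME_UL  # ng

# Fractionated sample: eight CE injections, assumed to split evenly.
total_fractionated = 8 * mass_per_ce_injection
mass_per_fraction = total_fractionated / N_FRACTIONS

# Plug control: four CE injections collected into one vial.
plug_mass = 4 * mass_per_ce_injection

print("per CE injection (ng):", mass_per_ce_injection)
print("per fraction (ng):", mass_per_fraction)
print("fraction reconstitution (uL):", reconstitution_volume_ul(mass_per_fraction))
print("plug reconstitution (uL):", reconstitution_volume_ul(plug_mass))
print("on-column load (ng):", LC_INJECTION_UL * TARGET_CONC_NG_PER_UL)
```

Under these assumptions the sketch gives the 16 μL and 40 μL reconstitution volumes and the 50 ng on-column load stated above.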
Database searching and data analysis
Raw data files were searched using PEAKS Online X build 1.4.2020-10-21 (Bioinformatics Solutions, Waterloo, ON, Canada)41 using the current Homo sapiens UniProt database (downloaded April 1, 2021)42 with the MUC16 entry (Accession ID Q8WXI7, 14 507 amino acids) replaced with the version from the 2016 SwissProt database (22 152 amino acids) for human serum and conditioned cell media samples. For E. coli lysate samples, the database used was the K-12 E. coli UniProt database (downloaded March 1, 2021). The digestion enzyme was set to trypsin with a maximum of 2 missed cleavages. Peptide mass tolerance was 10 ppm and fragment mass tolerance was 0.02 Da. Carbamidomethylation of C was set as a global modification and variable modifications were oxidation of M, deamidation of N and Q, pyro-glu conversion from E and Q, and sodium adduction. The peptide FDR was set to 1%, and protein −10log P was set to ≥20 (equivalent to a p-value of 0.01).43 Peptides of length 6–45 amino acids were considered, and common contaminants (containing keratin) were filtered out. Two or more unique peptides were required for protein identification.
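Two of the search thresholds above are simple conversions, sketched here for illustration. The function names are hypothetical and this is not PEAKS code; it only assumes the score is −10 times the base-10 logarithm of the p-value, as stated in the text, and that a ppm tolerance is applied symmetrically around the precursor m/z.

```python
import math


def score_to_p_value(score: float) -> float:
    """Convert a -10*log10(P) score back to a p-value."""
    return 10 ** (-score / 10)


def p_value_to_score(p_value: float) -> float:
    """Convert a p-value to a -10*log10(P) score."""
    return -10 * math.log10(p_value)


def ppm_window(mz: float, tolerance_ppm: float):
    """Return the (low, high) m/z window for a symmetric ppm tolerance."""
    delta = mz * tolerance_ppm / 1e6
    return (mz - delta, mz + delta)


print(score_to_p_value(20))     # score threshold used for proteins
print(ppm_window(1000.0, 10))   # 10 ppm precursor tolerance at m/z 1000
```

At m/z 1000, a 10 ppm tolerance corresponds to a window of about ±0.01 m/z, which is comparable in scale to the 0.02 Da fragment tolerance.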
Data analysis and statistical evaluation were performed using R (version 4.0.3) in RStudio.44,45 All plots were made with ggplot2,46 and the following packages were used for analysis: seqinr,47 stringr,48 readr,49 eulerr,50 and dplyr.51 Peptide charges were estimated based on amino acid sequence.52 Percent coverage was calculated by comparing amino acids identified in peptides to total amino acids in the protein sequence. All reported values are mean ± standard deviation.
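The percent coverage calculation can be illustrated with a minimal sketch. The authors performed their analysis in R; the Python below is a stand-in, and it makes one explicit assumption: a peptide that occurs at several positions in the protein (as happens in repeat-containing proteins) is counted at every position where it matches. Whether the original analysis handled repeats this way is not stated. The toy sequence and peptides are invented for the example.

```python
def percent_coverage(protein: str, peptides) -> float:
    """Percent of residues in `protein` covered by at least one peptide.

    Every exact occurrence of each peptide is marked (an assumption),
    and overlapping peptides are counted only once per residue.
    """
    if not protein:
        raise ValueError("protein sequence must not be empty")
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue  # skip empty strings, which would match everywhere
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Toy 18-residue sequence containing the same peptide twice.
toy_protein = "MKTAYIAKQRMKTAYIAK"
print(percent_coverage(toy_protein, ["TAYIAK"]))
print(percent_coverage(toy_protein, ["TAYIAK", "AKQR"]))
```

Tracking coverage per residue rather than summing peptide lengths avoids double counting when identified peptides overlap.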
Results and discussion
Preparative CE of synthetic peptides followed by mass spectrometry demonstrates the accuracy of CE fractionation
For CE fractionation of peptides to be a viable dimension of separation prior to bottom-up proteomics, collected fractions must be unique, and the collected material must be compatible with mass spectrometry. To demonstrate these attributes, we created a mixture of 4 synthetic peptides (Table 1) and separated the mixture with CE, collecting 5 fractions (Fig. 1A). The overlaid electropherograms show excellent qualitative repeatability in migration time between injections, with 4 distinct features appearing in all electropherograms.
Table 1.
Synthetic peptides in mixture
Label | Sequence | Theoretical [M + H]+ | Approximate pI (ref. 53) |
---|---|---|---|
A | LTLLRPK | 840.567 | 11.5 |
B | HLLSPLFQR | 1110.642 | 11.1 |
C | ELGPYTLDR | 1063.542 | 4.1 |
D | VLQGLLSPIFK | 1214.751 | 10.1 |
Fig. 1.
CE fractionation of synthetic peptide mixture. (A) Overlaid electropherograms of 3 injections (blue, green, and orange traces) of a synthetic peptide mixture, with collected fractions labelled as F1–F5. (B) MALDI spectra of collected fractions with labelled peaks corresponding to the synthetic peptides in Table 1. C* is modified peptide C, with N-terminal glutamate converted to pyroglutamate.
We hypothesized that the two peaks in fraction 1 were peptides A and B, that peptides C and D co-migrated in fraction 2, and the peak in fraction 4 was a modified version of peptide C where the N-terminal glutamate was converted to pyroglutamate (a spontaneous conversion in acidic conditions54). Spiking experiments confirmed that the first peak is peptide A and the second peak is peptide B (data not shown). Each fraction was analyzed with MALDI-TOF mass spectrometry, resulting in the spectra shown in Fig. 1B, which confirmed the hypothesized peak assignments and demonstrated that the collected material was compatible with mass spectrometry. Furthermore, fractions 3 and 5 contained no significant amount of peptide, indicating that the CE separation is efficient and accurate, and that collected fractions are unique. Note that the pyroglutamate conversion involves a loss of 18.015 Da, which results in a predicted m/z of 1045.527 for modified peptide C, observed in the MALDI spectrum for fraction 4. The modified version of peptide C has altered mobility relative to peptide C because of the loss of charge upon the cyclization of glutamate to pyroglutamate. We note that both fractions 2 and 4 contain modified peptide C. We attribute this observation to continued conversion of glutamate to pyroglutamate between fraction collection and MALDI-MS analysis.
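The theoretical [M + H]+ values in Table 1 can be reproduced from monoisotopic residue masses, as in the sketch below. The residue masses, water mass, and proton mass are standard monoisotopic values entered by hand here and rounded to five or six decimals; the function name is illustrative. The text subtracts 18.015 Da for the pyroglutamate conversion, which is the average mass of water; using the monoisotopic water mass instead gives a value roughly 0.004 higher, so both are printed.

```python
# Monoisotopic residue masses (Da) for the amino acids in Table 1.
MONO_RESIDUE = {
    "G": 57.02146, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "L": 113.08406, "I": 113.08406, "D": 115.02694,
    "Q": 128.05858, "K": 128.09496, "E": 129.04259, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333,
}
WATER_MONO = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276        # mass of a proton


def mh_plus(sequence: str) -> float:
    """Monoisotopic m/z of the singly protonated peptide, [M + H]+."""
    return sum(MONO_RESIDUE[aa] for aa in sequence) + WATER_MONO + PROTON


TABLE_1 = {"A": "LTLLRPK", "B": "HLLSPLFQR", "C": "ELGPYTLDR", "D": "VLQGLLSPIFK"}

for label, seq in TABLE_1.items():
    print(label, seq, round(mh_plus(seq), 3))

# N-terminal Glu -> pyroGlu is a loss of one water molecule.
peptide_c = mh_plus(TABLE_1["C"])
print("C* (18.015 Da loss, as in the text):", round(peptide_c - 18.015, 3))
print("C* (monoisotopic water loss):", round(peptide_c - WATER_MONO, 3))
```

The small difference between the two C* values is well within what would be resolved only with accurate calibration, so it does not change the peak assignment.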
CE fractionation of complex protein digests prior to tandem mass spectrometry increases protein and peptide identifications
E. coli bacterial cell lysate was selected as a complex biological sample. Lysate samples were digested with trypsin and fractionated with CE (ESI Fig. S2A†). Individual fractions were analyzed with tandem mass spectrometry prior to database searching with PEAKS proteomics software. Analysis of each fraction as an individual sample showed unique peptides in each collected fraction, indicating that CE fractionation simplified the initial protein digest and is an effective separation technique for complex peptide mixtures. 80 ± 5% of peptides were identified in only one fraction, and 98 ± 9% of peptides were found in one or two (Fig. 2A). Each consecutive fraction showed an increase in average peptide mass-to-charge ratio (m/z) except for the final fraction, which was collected with pressure rather than electrophoresis (Fig. 2B and ESI Fig. S3†). This trend aligns with the separation mechanism in capillary zone electrophoresis.32
Fig. 2.
CE fractionation of E. coli lysate digest, peptide data analysis. (A) Number of fractions where each peptide was found, showing that most peptides were only present in one fraction. Bars are averages (±standard deviation) of biological triplicates, with each biological replicate combining triplicate injections. (B) Identified peptides by fraction plotted against theoretical m/z (charge estimated based on amino acid sequence at pH 2.38). Red point is the mean value for each fraction. ****signifies a p-value of ≤0.0001 by the Wilcoxon rank-sum test. ns = not significant. Data shown are from one biological replicate (replicate A) with triplicate injections. Other biological replicates show the same trend and can be found in ESI (Fig. S3†).
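A sequence-based charge estimate of the kind used for Fig. 2B can be sketched with the Henderson–Hasselbalch equation. This is an illustration under stated assumptions, not the authors' method (which follows ref. 52): the pKa values below are approximate textbook-style values chosen for the example, cysteine is omitted because it is carbamidomethylated in this workflow, and a fractional charge is used directly when computing m/z. The pH of 2.38 is consistent with the weak-acid approximation for 1 M acetic acid (pKa 4.76), which is also shown.

```python
import math

# Approximate pKa values (assumed for illustration; scales differ by source).
PKA_BASIC = {"N_TERM": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_ACIDIC = {"C_TERM": 3.6, "D": 3.9, "E": 4.1, "Y": 10.1}
PROTON = 1.007276


def weak_acid_ph(concentration_molar: float, pka: float = 4.76) -> float:
    """pH of a weak acid using the approximation [H+] = sqrt(Ka * C)."""
    return 0.5 * (pka - math.log10(concentration_molar))


def net_charge(sequence: str, ph: float) -> float:
    """Estimated net charge of a peptide with free termini at a given pH."""
    basic = ["N_TERM"] + [aa for aa in sequence if aa in ("K", "R", "H")]
    acidic = ["C_TERM"] + [aa for aa in sequence if aa in ("D", "E", "Y")]
    charge = 0.0
    for group in basic:    # protonated fraction carries +1
        charge += 1.0 / (1.0 + 10 ** (ph - PKA_BASIC[group]))
    for group in acidic:   # deprotonated fraction carries -1
        charge -= 1.0 / (1.0 + 10 ** (PKA_ACIDIC[group] - ph))
    return charge


def theoretical_mz(neutral_mass: float, charge: float) -> float:
    """m/z for a neutral mass carrying `charge` protons (charge may be fractional)."""
    return neutral_mass / charge + PROTON


ph = weak_acid_ph(1.0)
for seq, mh in (("LTLLRPK", 840.567), ("ELGPYTLDR", 1063.542)):
    z = net_charge(seq, ph)
    print(seq, round(z, 2), round(theoretical_mz(mh - PROTON, z), 1))
```

For the Table 1 peptides, the peptide with two basic side chains carries close to three positive charges at this pH and therefore has the lower estimated m/z, in line with its earlier migration.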
When all 5 fractions were searched as one sample, the number of protein and peptide identifications was increased over both a single plug collected from CE (‘plug’) and sample not subjected to electrophoresis (‘raw’). There were 132 ± 33% more protein identifications with fractionation over a collected plug, and 65 ± 24% more protein identifications with fractionation over raw digest (Fig. 3A). Peptide identification showed a similar improvement with 185 ± 65% more identifications in fractions over plug, and 80 ± 41% more identifications in fraction over raw digest (Fig. 3B). Importantly, the proteins and peptides identified with the fractionation are complementary to those identified in the collected plug and raw digest. 57 ± 6% and 40 ± 7% of the proteins identified in the CE fractions were not identified in the plug and raw digest respectively, and 66 ± 7% and 59 ± 6% of peptides in the fractionated sample were not identified in the plug and raw digest respectively (n = 3 technical replicates of biological triplicates). These values indicate that CE fractionation effectively adds a dimension of peptide separation prior to LC-MS. This conclusion agrees with other reports8,55 demonstrating that CE-MS/MS is a complementary technique to LC-MS/MS and can increase peptide and protein identifications.
Fig. 3.
Protein and peptide identifications in CE fractionated E. coli lysate sample compared to CE plug and raw digest. (A) Overlap in protein identifications between fraction (red) and plug or raw (blue) in three biological replicates. (B) Overlap in peptide identifications between fraction (red) and plug or raw (blue) in three biological replicates. Files from the five fractions (15 injections total) were combined and searched as one sample in PEAKS, to enable direct comparison with the raw and plug controls, for which equivalent material by mass was analyzed.
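The two comparisons reported for Fig. 3, percent increase in identifications and the share of identifications not seen in a control, reduce to simple set arithmetic. The sketch below uses invented identifiers and hypothetical function names; it assumes identifications are compared as exact identifier matches.

```python
def percent_increase(n_fractionated: int, n_control: int) -> float:
    """Percent more identifications in the fractionated sample than in the control."""
    return 100.0 * (n_fractionated - n_control) / n_control


def percent_not_in_control(fractionated: set, control: set) -> float:
    """Percent of fractionated-sample identifications absent from the control."""
    return 100.0 * len(fractionated - control) / len(fractionated)


# Invented identifiers, for illustration only.
fractionated_ids = {"P1", "P2", "P3", "P4", "P5"}
raw_ids = {"P1", "P2", "P6"}

print(percent_increase(len(fractionated_ids), len(raw_ids)))
print(percent_not_in_control(fractionated_ids, raw_ids))
print(sorted(raw_ids - fractionated_ids))  # identifications seen only in the control
```

The last line corresponds to the identifications found only in the raw or plug samples, which are examined separately in the text.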
Fig. 3B illustrates that some peptides found in the raw or plug sample are not found in the fractionated samples. To better understand this observation, we analyzed the peptides uniquely identified in the raw sample, examining m/z, estimated pI, hydrophobicity score, and mass. We find that the raw sample contains larger and more hydrophobic peptides, whereas there is no difference based on m/z or estimated pI (ESI Fig. S4†). This trend has been reported previously for CE-MS/MS experiments.8 We examined the effect of physiochemical properties on fraction collection by plotting fraction collection vs. hydrophobicity score, mass, and predicted pI. This analysis revealed no trend based on hydrophobicity or mass, but pI is observed to decrease with increasing fraction number, consistent with the observed trend in m/z (ESI Fig. S5†).
To evaluate whether CE fractionation enables the identification of lower-abundance proteins, we used an approach described by Kelleher and co-workers,56 comparing identified proteins to RNA-seq gene counts in an E. coli K-12 strain.57 We found that average gene counts for proteins identified in fractionated samples were lower than those in the raw control (Student's t-test, p ≤ 0.00005). The average transcript counts of proteins identified uniquely in the fractionated sample were lower still (ESI Fig. S6†). These data suggest that preparative CE fractionation improves proteomic coverage by identifying additional low-abundance proteins. This increase in dynamic range is relevant for biomarker discovery, where low-abundance proteins must be identified against a complex background of other species, some of which are highly abundant.
Similar experiments were performed with human serum, which is a matrix of interest because of its relevance in biomarker discovery but challenging in proteomics because it is dominated by a small number of highly abundant proteins. Using CE fractionation (ESI Fig. S2B†), we saw an increase in total peptides and proteins identified over control samples. Specifically, 44 ± 16% and 20 ± 13% more proteins were identified with fractionation over plug and raw respectively, and 39 ± 20% and 19 ± 17% more peptides were identified with fractionation over plug and raw respectively (ESI Fig. S7†). Peptide analysis for the human serum samples showed the same separation trends observed in Fig. 2B (ESI Fig. S8†). These data show the general utility of CE fractionation to increase protein and peptide identifications in multiple sample types, including those relevant for biomarker discovery. We note that use of the CE fractionation increases the total analysis time: as the number of CE fractions increases, MS instrument time increases linearly. Although longer MS run time is a disadvantage of the approach reported here, that shortcoming is offset by the increase in identified proteins and peptides, and the longer analysis time may be justified depending on the goal of the analysis.
Peptide coverage of MUC16 from conditioned cell media (CCM) is increased by CE fractionation
CCM from OVCAR-3 cells was collected and enriched for proteins >100 kDa using molecular weight cutoff filters. The resulting proteins were digested using trypsin. CE fractionation of this material was performed to evaluate the ability of the technique to increase peptide coverage of MUC16, an FDA-approved biomarker for ovarian cancer used to track patient response to treatment and monitor cancer recurrence.58,59 MUC16 contains ~60 similar but non-identical repeat regions.60,61 In our database searches, we use the 2016 SwissProt version of MUC16, which contains all repeats, resulting in a sequence with 22 152 amino acids, whereas the current version (as of December, 2021) in UniProt contains a truncated version with only 14 507 amino acids.62 In our previous work we identified peptides found in the full-length protein and not the truncated version, providing a rationale for using the 2016 build.39,63 Using CE fractionation (ESI Fig. S2C†) followed by bottom-up proteomics, we obtain 6.73% coverage of MUC16, with 25 unique peptide identifications, an improvement over the 2.60% coverage (5 unique peptides) and 2.74% coverage (7 unique peptides) observed when the raw and plug digest controls were evaluated, respectively (Fig. 4). Note that owing to the repeating domain structure of MUC16, many peptides are present in multiple copies throughout the full amino acid sequence. This experiment was duplicated and produced nearly identical percent-coverage results (ESI Fig. S9†).
Fig. 4.
Peptide coverage of MUC16 following CE fractionation of CCM. Conditioned cell media from OVCAR-3 cells showed the presence of MUC16, an ovarian cancer biomarker. The N-terminus (left) and C-terminus (right) are designated by red dashed lines. The repeat region starts at amino acid 12 071. Identified peptides are shown with vertical colored lines. Percent coverage for each sample is shown to the right. Data are shown from one biological replicate; an additional replicate shows the same trend (ESI Fig. S9†).
All but one identified MUC16 peptide (seen in the fractionated sample) originated in the repeat region or C-terminus, consistent with our previous analysis of MUC16 isolated from ascites of individual ovarian cancer patients.63 There is evidence that the antibodies used in the clinical immunoassay for MUC16 display inconsistent binding to the repeat regions,64 making the development of improved MUC16 assays a crucial analytical goal. Increasing peptide coverage is important for our molecular understanding of the protein, which will enable improved diagnostics. Importantly, the method reported here is not limited to MUC16, or to the biomarkers of any particular human disease. Any biomarker discovery process can benefit from de-densifying peptides using CE fractionation prior to LC-MS/MS. Although this study was not designed to detect posttranslational modifications (PTMs), we note that this approach may benefit the characterization of PTMs, which are important for protein function. Among the MUC16 peptides, we detect four peptides that have oxidized methionine, two peptides that display deamidation from asparagine, and one peptide displaying deamidation from glutamine. Our ongoing work focuses explicitly on using this CE fractionation approach to improve detection of low-abundance glycopeptides derived from MUC16.
Conclusions
We have demonstrated the utility and broad applicability of CE fractionation as a preparative step in bottom-up proteomics. Using this novel 2D separation combination, we identified more peptides and proteins in E. coli lysate and human serum samples than with LC-MS/MS alone, as well as different peptides and proteins than with LC-MS/MS alone. These data demonstrate that we can harness the power of CE in proteomics without the need for a CE-MS/MS instrument. In addition to the proteome-wide analysis of a bacterial cell lysate, we looked specifically at ovarian cancer biomarker MUC16 in conditioned cell media from a human ovarian cancer cell line. Using CE fractionation, we increased our percent coverage of this important biomarker protein. This accomplishment will complement our work in developing assays for MUC16 to improve ovarian cancer detection. We plan to expand on this work by increasing the inner diameter of our capillaries to increase sample throughput, fractionating additional sample types, such as human biofluids, and applying this approach to quantitation of biomarkers. With the broad availability of LC-MS/MS instruments in academic and clinical settings, we anticipate that other laboratories will be able to include CE fractionation in their proteomics workflows and utilize the benefits demonstrated here.
Acknowledgements
This work was supported by the University of Notre Dame Advancing Our Vision Fund in Analytical Science and Engineering (to RJW). SDW and NSL are fellows of the Chemistry–Biochemistry–Biology Interface (CBBI) Program at the University of Notre Dame, supported by training grant T32GM075762 from the National Institute of General Medical Sciences. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences or the National Institutes of Health. The authors thank Bill Boggess and the University of Notre Dame Mass Spectrometry and Proteomics Facility for expert technical assistance. The authors thank Matthew Champion and Daniel Hu (University of Notre Dame) for providing E. coli cell lysate and Sharon Stack (University of Notre Dame) for providing OVCAR-3 cells.
Footnotes
Electronic supplementary information (ESI) available: figures (Word document), database search results (.csv files), and a brief description of each database search file. See DOI: 10.1039/d1ay02145a
Conflicts of interest
The authors declare that they have no conflicts of interest.
References
- 1. Dupree EJ, Jayathirtha M, Yorkey H, Mihasan M, Petre BA and Darie CC, Proteomes, 2020, 8, 14.
- 2. Zhang Y, Fonslow BR, Shan B, Baek M and Yates JR, Chem. Rev., 2013, 113, 2343–2394.
- 3. Klein J, Papadopoulos T, Mischak H and Mullen W, Electrophoresis, 2014, 35, 1060–1064.
- 4. Boichenko AP, Govorukhina N, van der Zee AG and Bischoff R, J. Sep. Sci., 2013, 36, 3463–3470.
- 5. Marak J, Stanova A, Gajdostinova S, Skultety L and Kaniansky D, Electrophoresis, 2011, 32, 1273–1281.
- 6. Marak J and Stanova A, Electrophoresis, 2014, 35, 1268–1274.
- 7. Hubner NC, Ren S and Mann M, Proteomics, 2008, 8, 4862–4872.
- 8. Li Y, Champion MM, Sun L, Champion PA, Wojcik R and Dovichi NJ, Anal. Chem., 2012, 84, 1617–1622.
- 9. Zhao Y, Sun L, Zhu G and Dovichi NJ, J. Proteome Res., 2016, 15, 3679–3685.
- 10. Annesley TM, Clin. Chem., 2003, 49, 1041–1044.
- 11. Wang Y, Fonslow BR, Wong CC, Nakorchevsky A and Yates JR 3rd, Anal. Chem., 2012, 84, 8505–8513.
- 12. O’Farrell PH, J. Biol. Chem., 1975, 250, 4007–4021.
- 13. Klose J and Kobalz U, Electrophoresis, 1995, 16, 1034–1059.
- 14. Jungblut PR, Schaible UE, Mollenkopf HJ, Zimny-Arndt U, Raupach B, Mattow J, Halada P, Lamer S, Hagens K and Kaufmann SH, Mol. Microbiol., 1999, 33, 1103–1117.
- 15. Shen X, Kou Q, Guo R, Yang Z, Chen D, Liu X, Hong H and Sun L, Anal. Chem., 2018, 90, 10095–10099.
- 16. Washburn MP, Wolters D and Yates JR, Nat. Biotechnol., 2001, 19, 242–247.
- 17. Schirmer EC, Yates JR and Gerace L, Discov. Med., 2003, 3, 38–39.
- 18. Mosley AL, Florens L, Wen Z and Washburn MP, J. Proteomics, 2009, 72, 110–120.
- 19. Roca LS, Gargano AFG and Schoenmakers PJ, Anal. Chim. Acta, 2021, 1156, 338349.
- 20. Mellors JS, Black WA, Chambers AG, Starkey JA, Lacher NA and Ramsey JM, Anal. Chem., 2013, 85, 4100–4106.
- 21. Yang Z, Shen X, Chen D and Sun L, Anal. Chem., 2018, 90, 10479–10486.
- 22. Loo JA, Udseth HR and Smith RD, Anal. Biochem., 1989, 179, 404–412.
- 23. Sun L, Zhu G, Zhang Z, Mou S and Dovichi NJ, J. Proteome Res., 2015, 14, 2312–2321.
- 24. Jooss K, McGee JP, Melani RD and Kelleher NL, Electrophoresis, 2021, 42, 1050–1059.
- 25. Yin H, Keely-Templin C and McManigill D, J. Chromatogr. A, 1996, 744, 45–54.
- 26. Kasicka V, Electrophoresis, 2009, 30(1), 40.
- 27. Scriba GKE, in Capillary Electrophoresis: Methods and Protocols, ed. Schmitt-Kopplin P, Springer New York, New York, NY, 2016, pp. 365–391.
- 28. Gennaro LA and Salas-Solano O, J. Chromatogr. A, 2009, 1216, 4499–4503.
- 29. Latosinska A, Siwy J, Mischak H and Frantzi M, Electrophoresis, 2019, 40, 2294–2308.
- 30. Jorgenson JW and Lukacs KD, Clin. Chem., 1981, 27, 1551–1553.
- 31. Jorgenson JW and Lukacs KD, Science, 1983, 222, 266–272.
- 32. Voeten RLC, Ventouri IK, Haselberg R and Somsen GW, Anal. Chem., 2018, 90, 1464–1481.
- 33. Hutterer KM and Jorgenson JW, Anal. Chem., 1999, 71, 1293–1297.
- 34. Rosnack KJ, Stroh JG, Singleton DH, Guarino BC and Andrews GC, J. Chromatogr. A, 1994, 675, 219–225.
- 35. Sun L, Zhu G, Yan X, Champion MM and Dovichi NJ, Proteomics, 2014, 14, 622–628.
- 36. Zhang Z, Sun L, Zhu G, Yan X and Dovichi NJ, Talanta, 2015, 138, 117–122.
- 37. Seger C and Salzmann L, Clin. Biochem., 2020, 82, 2–11.
- 38. Gahoual R, Leize-Wagner E, Houze P and Francois YN, Rapid Commun. Mass Spectrom., 2019, 33(1), 11–19.
- 39. Schuster-Little N, Madera S and Whelan R, Anal. Bioanal. Chem., 2020, 412, 6361–6370.
- 40. Kui Wong N, Easton RL, Panico M, Sutton-Smith M, Morrison JC, Lattanzio FA, Morris HR, Clark GF, Dell A and Patankar MS, J. Biol. Chem., 2003, 278, 28619–28634.
- 41. Zhang J, Xin L, Shan B, Chen W, Xie M, Yuen D, Zhang W, Zhang Z, Lajoie GA and Ma B, Mol. Cell. Proteomics, 2012, 11, M111.010587.
- 42. The Uniprot Consortium, Nucleic Acids Res., 2021, 49, D480–D489.
- 43. https://www.bioinfor.com/dbscoring-tutorial/, accessed 08–2021.
- 44. RStudio Team, RStudio: Integrated Development Environment for R, RStudio, PBC, Boston, MA, 2020.
- 45. R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2020.
- 46. Wickham H, ggplot2: Elegant Graphics for Data Analysis, Springer-Verlag, New York, 2016.
- 47. Charif D and Lobry JR, in Structural Approaches to Sequence Evolution: Molecules, Networks, Populations, ed. Bastolla U, Porto M, Roman HE and Vendruscolo M, Springer Berlin Heidelberg, Berlin, Heidelberg, 2007, pp. 207–232.
- 48. Wickham H, 2019, https://CRAN.R-project.org/package=stringr.
- 49. Wickham H and Hester J, 2020, https://CRAN.R-project.org/package=readr.
- 50. Larsson J, 2020, https://cran.r-project.org/package=eulerr.
- 51. Wickham H, Francois R, Henry L and Muller K, 2021, https://CRAN.R-project.org/package=dplyr.
- 52. Sims PA, J. Chem. Educ., 2010, 87, 803–808.
- 53. https://www.thermofisher.com/us/en/home/life-science/protein-biology/peptides-proteins/custom-peptide-synthesis-services/peptide-analyzing-tool.html, accessed 08–2021.
- 54. Yu L, Vizel A, Huff MB, Young M, Remmele RL and He B, J. Pharm. Biomed. Anal., 2006, 42, 455–463.
- 55. Ibrahim M, Gahoual R, Enkler L, Becker HD, Chicher J, Hammann P, François Y, Kuhn L and Leize-Wagner E, J. Chromatogr. Sci., 2016, 54, 653–663.
- 56. Gerbasi VR, Melani RD, Abbatiello SE, Belford MW, Huguet R, McGee JP, Dayhoff D, Thomas PM and Kelleher NL, Anal. Chem., 2021, 93, 6323–6328.
- 57. Kavita K, Zhang A, Chin-Hsien T, Majdalani N, Storz G and Gottesman S, Nucleic Acids Res., 2022, DOI: 10.1093/nar/gkac017.
- 58. van der Burg ME, Lammes FB and Verweij J, Ann. Oncol., 1990, 1, 301–302.
- 59. van der Burg ME, Lammes FB and Verweij J, Neth. J. Med., 1992, 40, 36–51.
- 60. O’Brien TJ, Beard JB, Underwood LJ and Shigemasa K, Tumor Biol., 2002, 23, 154–169.
- 61. O’Brien TJ, Beard JB, Underwood LJ, Dennis RA, Santin AD and York L, Tumor Biol., 2001, 22, 348–366.
- 62. Grimwood J, Gordon LA, Olsen A, Terry A, Schmutz J, Lamerdin J, Hellsten U, Goodstein D, Couronne O, Tran-Gyamfi M, Aerts A, Altherr M, Ashworth L, Bajorek E, Black S, Branscomb E, Caenepeel S, Carrano A, Caoile C, Chan YM, Christensen M, Cleland CA, Copeland A, Dalin E, Dehal P, Denys M, Detter JC, Escobar J, Flowers D, Fotopulos D, Garcia C, Georgescu AM, Glavina T, Gomez M, Gonzales E, Groza M, Hammon N, Hawkins T, Haydu L, Ho I, Huang W, Israni S, Jett J, Kadner K, Kimball H, Kobayashi A, Larionov V, Leem S, Lopez F, Lou Y, Lowry S, Malfatti S, Martinez D, McCready P, Medina C, Morgan J, Nelson K, Nolan M, Ovcharenko I, Pitluck S, Pollard M, Popkie AP, Predki P, Quan G, Ramirez L, Rash S, Retterer J, Rodriguez A, Rogers S, Salamov A, Salazar A, She X, Smith D, Slezak T, Solovyev V, Thayer N, Tice H, Tsai M, Ustaszewska A, Vo N, Wagner M, Wheeler J, Wu K, Xie G, Yang J, Dubchak I, Furey TS, DeJong P, Dickson M, Gordon D, Eichler EE, Pennacchio LA, Richardson P, Stubbs L, Rokhsar DS, Myers RM, Rubin EM and Lucas SM, Nature, 2004, 428, 529–535.
- 63. Schuster-Little N, Fritz-Klaus R, Etzel M, Patankar N, Javeri S, Patankar MS and Whelan RJ, Analyst, 2021, 146, 85–94.
- 64. Bressan A, Bozzo F, Maggi CA and Binaschi M, Dis. Markers, 2013, 34, 257–267.