Abstract
A new approach for the identification of intact proteins has been developed that relies on the generation of relatively few abundant products from specific cleavage sites. This strategy is intended to complement standard approaches that seek to generate many fragments relatively non-selectively. Specifically, this strategy seeks to maximize selective cleavage at aspartic acid and proline residues via collisional activation of precursor ions formed via electrospray ionization (ESI) under denaturing conditions. A statistical analysis of the SWISS-PROT database was used to predict the number of arginine residues for a given intact protein mass and to predict an m/z range in which the protein carries a charge similar to its number of arginine residues, thereby enhancing cleavage at aspartic acid residues by limiting proton mobility. Cleavage at aspartic acid residues is predicted to be most favorable in the m/z range of 1500–2500, a range higher than that normally generated by ESI at low pH. Gas-phase proton transfer ion/ion reactions are therefore used for precursor ion concentration from relatively high charge states followed by ion isolation and subsequent generation of precursor ions within the optimal m/z range via a second proton transfer reaction step. It is shown that the majority of product ion abundance is concentrated into cleavages C-terminal to aspartic acid residues and N-terminal to proline residues for ions generated by this process. Implementation of a scoring system that weights both ion fragment type and ion fragment area demonstrated identification of standard proteins, ranging in mass from 8.6 kDa to 29.0 kDa.
Keywords: Ion/ion reactions, intact protein identification, top-down, selective fragmentation
Introduction
The notion of using collision induced dissociation (CID) to elucidate the primary structure of intact proteins was first presented in 1990[1], and formed the basis for the development of ‘top-down’ tandem mass spectrometry of proteins. Today, collisional activation remains one of the most common techniques for the identification of proteins in top-down workflows[2–6]. Other dissociation approaches, such as electron capture dissociation[7], electron transfer dissociation[8], and UV photodissociation (UVPD)[9] have subsequently been applied to whole protein ions with a common objective of maximizing the number of sequence informative products. With this objective in mind, it is desirable that fragmentation be non-selective. However, as the number of different product ions increases, the overall product ion signal is dispersed among the many fragmentation channels[10]. There is merit, therefore, in having the option to maximize selective cleavages that tend to concentrate product ion signal into fewer channels. Scenarios that might benefit from this capability include, for example, the identification of low level proteins in cases when the dispersal of product ion signal among many channels limits dynamic range and in single (or multiple) reaction monitoring workflows. We note here a distinction between ‘identification’ of a protein or ‘proteoform’[11] and complete ‘characterization’. Maximizing the number of cleavages is a priority for the latter objective even though this might compromise detection limits and dynamic range.
The generation of almost exclusively b- and y-type fragment ions upon CID of multiply-protonated proteins is commonly observed and is consistent with the mobile proton model[12–14]. Collisional activation of a peptide or protein leads to mobilization of the excess protons along the polypeptide backbone that can initiate a charge-directed fragmentation process to produce b- and y-fragment ions[15]. While a number of non-specific fragmentation pathways are observed when using collisional activation, particularly at intermediate charge states, several site-specific fragmentation pathways have been reported[16–20]. Specifically, under favorable conditions, cleavages at aspartic acid residues and proline residues have been shown to be the dominant pathways[16, 19, 21–24]. Although both fragmentation pathways result in highly selective cleavages, their underlying mechanisms differ. The higher basicity of the proline amide bond, relative to those of other amino acids, leads to preferential localization of a mobilized proton at the proline amide bond, initiating the charge directed fragmentation N-terminal to the proline residue. The favorable cleavage at aspartic acid residues C-terminal to the acidic side-chain does not require a mobile proton and can become prominent when the ionizing protons are sequestered by arginine residues[25].
Numerous studies have been performed to characterize the gas-phase fragmentation behavior of intact proteins[26–32], revealing that the information obtained in a CID experiment is highly dependent on the charge state subjected to activation as well as the number of basic residues present in the protein sequence[33, 34]. Fragmentation patterns can be rationalized on the basis of three major precursor ion types: high charge states, intermediate charge states, and low charge states. Activation of high charge states, where the large intramolecular Coulomb field tends to minimize proton mobilization, leads to fragmentation patterns that are difficult to predict a priori. Activation of intermediate charge states typically results in a high degree of sequence coverage and extensive non-specific fragmentation due to maximum proton mobility. Collisional activation of low charge state proteins, where proton mobility is hindered by solvation of a proton by multiple basic residues or where protons are sequestered onto mostly arginine residues, generates prominent small molecule loss and cleavage C-terminal to aspartic acid residues.
Recently, in recognition of the differences observed between the product ion spectra of protein ions generated under denaturing conditions versus those generated under native conditions[26–29, 35], the Kelleher group examined the gas-phase fragmentation propensities of native intact proteins[36]. In this study, a statistical analysis of 5311 matched fragments from 165 different experiments revealed striking differences in fragmentation propensities for native conditions when compared to denatured conditions. Most notably, under native conditions, fragmentation C-terminal to aspartic acid residues and N-terminal to proline residues showed significant enhancement in fragmentation tendency when compared to their denatured counterparts. This observation suggests that the CID of protein ions generated under native conditions could constitute a strategy for maximizing selective fragmentation at aspartic acid and proline residues. However, protein ion signals tend to be less intense using native conditions relative to denaturing conditions (e.g., low pH and, perhaps, some fraction of organic solvent) and the extent of sodium and potassium ion adduction tends to be greater due to differences in ionization mechanism[37].
Here, we present a methodology for the identification of intact proteins that maximizes the selective cleavages at aspartic acid and proline residues while retaining the advantages in signal levels and lower metal ion adduction afforded by generating ions under denaturing conditions [38–41]. A drawback associated with protein ionization under denaturing conditions is that the precursor ion signal is divided amongst many relatively high charge states. We therefore use a sequence of proton transfer ion/ion reactions within the mass spectrometer to, first, concentrate the precursor ion charge state distribution largely into one charge state prior to ion isolation and, second, to move the precursor ion charges into a range (i.e., m/z 1500–2500) that is most likely to lead to selective cleavages. This methodology is demonstrated here with four standard proteins often used by the National Resource for Translational and Developmental Proteomics in optimizing top-down workflows: ubiquitin (8.6 kDa), myoglobin (16.9 kDa), trypsinogen (24.0 kDa), and carbonic anhydrase (29.0 kDa)[42].
Materials and Methods
Sample Preparation
Ubiquitin from bovine erythrocytes, myoglobin from equine skeletal muscle, trypsinogen from bovine pancreas, carbonic anhydrase from bovine erythrocytes, 2,2,3,3,4,4,5,5,6,6,7,7,8,8,8-pentadecafluoro-1-octanol (PFO), ammonium bicarbonate, dithiothreitol (DTT), iodoacetamide, and formic acid were purchased from Sigma-Aldrich (St. Louis, MO, USA). Optima-grade acetonitrile, HPLC-grade methanol, and Optima LC/MS-grade water were purchased from Fisher Scientific (Fair Lawn, NJ, USA). Chloroform was purchased from Mallinckrodt (Phillipsburg, NJ, USA) and urea was purchased from Pierce Chemical (Rockford, IL, USA). PFO was made at a concentration of 200 μM in a methanol solution containing 2% ammonium hydroxide.
Stock solutions of ubiquitin, myoglobin, and carbonic anhydrase were prepared at ~2 mg/mL in water. 50 μL of the ubiquitin stock, 80 μL of the myoglobin stock, and 80 μL of the carbonic anhydrase stock were combined and precipitated using 1:1:4:3 protein:chloroform:methanol:water[43]. The dried precipitated protein mixture was reconstituted in 100 μL of mobile phase A. The proteins were then separated via reverse-phase HPLC using an Agilent 1200 series with manual injector (Palo Alto, CA, USA) and an Agilent Zorbax 300SB-C3 4.6 × 150 mm, 5 μm (Palo Alto, CA, USA) analytical HPLC column. The HPLC method consisted of a linear gradient from 5% mobile phase B to 50% mobile phase B over 60 minutes (mobile phase A: 95% water, 5% acetonitrile, and 0.2% formic acid; mobile phase B: 5% water, 95% acetonitrile, and 0.2% formic acid) at a flow rate of 1 mL/min and a detection wavelength of 215 nm.
A total of four 20 μL injections were made. For each injection, each protein was collected into a separate Eppendorf tube for a total of four fractions for each protein. The first three fractions were dried under vacuum and reconstituted in the fourth fraction. The concentrations of the working samples are approximately 9 μM, 8 μM, and 4 μM for ubiquitin, myoglobin, and carbonic anhydrase, respectively.
Reduction and Alkylation of Trypsinogen
Approximately 2 mg of trypsinogen was dissolved in 1 mL of reduction buffer (100 mM ammonium bicarbonate, 7 M urea). DTT was added from a 500 mM stock to a final concentration of 5 mM and the solution was incubated for 45 minutes at 55 °C. The solution was cooled to room temperature and centrifuged briefly. Iodoacetamide was added to a final concentration of 14 mM from a freshly prepared 500 mM stock solution. The solution was incubated at room temperature in the dark for 30 minutes. Unreacted iodoacetamide was quenched by addition of another 5 mM of DTT followed by incubation at room temperature for 15 minutes in the dark. 100 μL of the reduced and alkylated trypsinogen was subjected to the precipitation and HPLC procedures as described above. The final trypsinogen concentration is estimated to be approximately 6 μM.
Mass Spectrometry
All data were collected using a TripleTOF 5600 quadrupole/time-of-flight mass spectrometer (SCIEX, Concord, ON, Canada) modified to perform ion/ion reactions[44]. Alternately pulsed nano-electrospray ionization (nESI) allows for sequential injection of multiply protonated proteins and singly deprotonated pentadecafluoro-1-octanol (PFO) dimer[45]. Analyte cations were injected into the mass spectrometer and trapped in q0 followed immediately by injection of the reagent anions. Instrument parameters were tuned such that PFO dimer was injected and transmitted through q0 while still trapping protein cations, resulting in consecutive proton transfer ion/ion reactions[46, 47]. During the reaction (i.e., the period during which anions pass through q0), an excitation voltage was applied at a slightly higher frequency than the desired ion’s fundamental frequency[48]. This process is referred to as ion parking. Application of the excitation voltage slows the ion/ion reaction rate of ions of a specific mass-to-charge ratio, allowing concentration of signal into a single, lower charge state. Ion parking was performed at the charge state immediately prior to m/z 1500. Reaction times varied from 100 ms to 300 ms.
Next, the charge reduced protein was isolated in Q1 and transferred to q2. PFO dimer was once again generated via negative nESI and transferred to q2. The ions were mutually stored for a reaction time optimized to generate a distribution of charge states between m/z 1500 and m/z 2500 (10–30 ms)[49]. Two high-amplitude frequency sweeps were used to eject any low m/z ions (< m/z 1500) and any high m/z ions (> m/z 2500). All charge states between m/z 1500 and m/z 2500 were isolated and subjected to dipolar direct current (DDC) collisional activation[50]. Mass analysis was performed using time of flight (TOF).
Database Search
Database searches were performed using a program written in MATLAB (MathWorks®, Natick, MA, USA). All entries of the SWISS-PROT database (556,197 entries) were processed to create separate entries for the cleavage of the N-terminal initiator methionine residue and cleavage into individual protein chains[51]. The processing of the SWISS-PROT entries resulted in a database of 624,084 entries across all species.
Uninterpreted MS/MS spectra were processed using a zero-charge THRASH deconvolution algorithm written in MATLAB, using a S/N threshold of three[52]. The output contained a list of monoisotopic zero-charge masses as well as the total area under each fragment’s isotopic distribution. Theoretical zero-charge b- and y- fragment masses for each protein, within a ± 100 Da window of the experimentally derived protein mass, were compared to the masses of the deconvoluted peak list with a fragmentation tolerance of ± 15 ppm. A McLuckey Score, with modified coefficients for native-like protein charge states, was calculated for each protein within the 200 Da mass window as shown below,
where n and Σ are, respectively, the number of matched fragments and summation of fragment area (where the smallest area is normalized to an area of 1) for either D (aspartic acid), P (proline), or X (non-specific) cleavages[53]. These coefficients are based qualitatively off our understanding of the well-known aspartic acid effect and the fragmentation propensities of native intact proteins as determined by Kelleher and coworkers[36]. This scoring system accounts for the preferential cleavage at aspartic acid and proline observed with native-like protein charge states.
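As a rough illustration of how a cleavage-weighted score of this kind could be computed, the following Python sketch classifies each matched b- or y-fragment as a D, P, or X cleavage and combines the per-class counts (n) and normalized areas (Σ). The function names, the weights, and the way n and Σ are combined (a weighted sum of n × Σ over the three classes) are assumptions made for illustration only; they are not the coefficients or the exact expression used in this work. A bond that is both C-terminal to aspartic acid and N-terminal to proline is arbitrarily counted as D.

```python
# Ubiquitin sequence as listed in Table 1 (76 residues).
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)

# Hypothetical weights; the coefficients used in the paper are not reproduced here.
EXAMPLE_WEIGHTS = {"D": 3.0, "P": 2.0, "X": 1.0}


def cleavage_class(sequence, ion_type, index):
    """Return 'D', 'P', or 'X' for the backbone bond that yields b<index> or y<index>."""
    n = len(sequence)
    # Number of residues on the N-terminal side of the cleaved bond.
    left = index if ion_type == "b" else n - index
    if not 0 < left < n:
        raise ValueError("fragment index out of range")
    if sequence[left - 1] == "D":   # bond is C-terminal to aspartic acid
        return "D"
    if sequence[left] == "P":       # bond is N-terminal to proline
        return "P"
    return "X"


def weighted_score(sequence, matches, weights=EXAMPLE_WEIGHTS):
    """matches: list of (ion_type, index, area) tuples for matched fragments."""
    smallest = min(area for _, _, area in matches)
    counts = {"D": 0, "P": 0, "X": 0}
    areas = {"D": 0.0, "P": 0.0, "X": 0.0}
    for ion_type, index, area in matches:
        cls = cleavage_class(sequence, ion_type, index)
        counts[cls] += 1
        areas[cls] += area / smallest   # smallest matched area is normalized to 1
    # Assumed functional form: sum over classes of weight * n * (summed area).
    return sum(weights[c] * counts[c] * areas[c] for c in counts)


# Made-up areas for three fragments: b52 (D), b18 (P), b7 (X).
example = [("b", 52, 10.0), ("b", 18, 5.0), ("b", 7, 1.0)]
print(weighted_score(UBIQUITIN, example))  # 3*1*10 + 2*1*5 + 1*1*1 = 41.0
```

Because aspartic acid and proline channels carry both a larger weight and, typically, larger areas, a candidate sequence that places D or P at the observed cleavage sites separates from one that explains the same masses through non-specific cleavages.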
Results and Discussion
Experimental Design and Rationale
The approach to intact protein analysis as described here exploits the highly selective cleavages C-terminal to aspartic acid residues and N-terminal to proline residues upon collisional activation, resulting in fewer, more abundant fragment ions, when compared to conventional top-down protocols where complete sequence coverage is desirable. Using a 27-residue polypeptide represented schematically in Figure 1 as an example, cleavage at every residue would result in a minimum of 26 fragment ions, inherently dispersing the precursor ion intensity across 26 different fragmentation channels. Realistically, the resulting tandem mass spectrum will likely contain more than 26 fragment ions as both complementary fragments from a single cleavage often appear in the spectrum, as do products from small molecule losses, amino acid side chain losses, and from multiple backbone cleavages (i.e., internal fragment ions). As an alternative to complete sequence coverage, one could manipulate the protein charge state distribution to produce abundant aspartic acid cleavages. As shown in Figure 1, cleavage at aspartic acid residues, as represented by the red amino acids, generates six fragment ions. Consequently, the precursor ion intensity is divided into only six fragments of higher abundance. In principle, this approach can lead to an extension of the lower limit of the dynamic range for top-down protein analysis.
The influence of precursor ion charge state on the gas-phase fragmentation behavior of multiply charged proteins is illustrated by ubiquitin (Figure 2). The ion-trap CID product ion spectra of [M + 8H]8+, [M + 7H]7+, [M + 6H]6+, and [M + 5H]5+ precursor ions show drastic differences in both fragment ion identities and abundances. At intermediate charge states, as shown in Figure 2a and 2b, extensive non-specific cleavage was observed. For lower charge states, fragmentation at aspartic acid residues becomes more prominent. Activation of the [M + 6H]6+ ion, the spectrum of which is shown in Figure 2c, shows a base peak corresponding to the doubly charged y18 fragment (aspartic acid channel), and activation of the [M + 5H]5+ ion (Figure 2d) produces almost exclusively cleavages at aspartic acid residues. These results are consistent with fragmentation characterization studies of ubiquitin and with the mobile proton model for peptide and protein dissociation[26, 32]. For ubiquitin, a small protein containing four arginine residues, the number of ionizing protons for the [M + 6H]6+ and [M + 5H]5+ ions begins to approach the number of arginine residues. The arginine residues, which have the highest gas-phase basicity of any amino acid[54, 55], sequester the protons allowing for processes that do not require proton mobilization (e.g., cleavage at aspartic acid residues and small molecule loss) to dominate the spectrum. These processes are also observed for the [M + 4H]4+ precursor ions, where the number of ionizing protons is equal to the number of arginine residues. The ion trap CID spectrum, as shown in supplemental Figure S1, shows a dominant water loss as well as cleavages C-terminal to aspartic acid residues.
Based on the assumption that proton mobility in protein ions is inhibited when the number of excess protons is roughly equal to the number of arginine residues in the protein, we sought to determine the m/z range within which most protein ions with limited proton mobility are likely to fall. A program written in MATLAB was used to count the number of arginine residues and calculate the mass of each protein ≤ 250,000 Da in the SWISS-PROT database. Linear regression of mass versus arginine count was used to predict the number of arginine residues for each entry (Figure 3a). Next, the true count, the actual number of arginine residues present in the protein sequence, was compared to the predicted count, the number of arginine residues predicted by the linear model. For any sequence whose predicted count fell within ± 3 arginine residues of the true count, seven mass-to-charge ratios were generated from [M + n(PR-3)H]n+ to [M + n(PR+3)H]n+, where PR is the number of predicted arginine residues. Inspection of the histogram plot of these mass-to-charge ratios provided in Figure 3b reveals that greater than 75 percent of the charge states fall between m/z 1500 and m/z 2500.
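The regression and m/z generation steps can be sketched in a few lines of Python. This is an illustrative re-expression, not the MATLAB program used in this work: the function names are hypothetical, the ion notation above is read as charge states running from PR − 3 to PR + 3, and the four data points in the example are simply the masses and arginine counts from Table 1, which are far too few to reproduce the database-wide fit in Figure 3a.

```python
PROTON_MASS = 1.007276  # Da


def fit_line(masses, arg_counts):
    """Ordinary least-squares fit of arginine count against protein mass."""
    n = len(masses)
    mean_x = sum(masses) / n
    mean_y = sum(arg_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in masses)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(masses, arg_counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predicted_arginines(mass, slope, intercept):
    """Predicted arginine count (PR) for a protein of the given mass."""
    return round(slope * mass + intercept)


def candidate_mz(mass, predicted_arg, spread=3):
    """(charge, m/z) pairs for charge states PR - spread ... PR + spread."""
    pairs = []
    for z in range(predicted_arg - spread, predicted_arg + spread + 1):
        if z >= 1:  # skip non-physical charge states for very small PR
            pairs.append((z, (mass + z * PROTON_MASS) / z))
    return pairs


# Toy input: (theoretical mass, arginine count) for the four proteins in Table 1.
masses = [8559.62, 16940.96, 24661.82, 29024.63]
arginines = [4, 2, 2, 9]
slope, intercept = fit_line(masses, arginines)
for mass in masses:
    pr = predicted_arginines(mass, slope, intercept)
    print(mass, pr, [round(mz, 1) for _, mz in candidate_mz(mass, pr)])
```

Applied to every database entry, the pooled m/z values from `candidate_mz` would be the quantities binned in a histogram such as Figure 3b.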
Protocol Demonstration using Reduced and Alkylated Trypsinogen
nESI of the reduced and alkylated trypsinogen sample under acidic denaturing conditions generates a broad and bi-modal distribution of charge states (Figure 4a) with the highest and lowest observable charge states being the [M + 33H]33+ and [M + 9H]9+ species, respectively. Much of the protein signal at the highest charge states was concentrated into a single lower charge state, [M + 17H]17+, via a series of q0 transmission mode proton transfer ion/ion reactions and ion parking (Figure 4c). Signal enhancement is evidenced by the ion abundance of the [M + 17H]17+ species (compare Figure 4c to Figures 4a and 4b). Figure 4b demonstrates that there is no signal enhancement in the absence of ion parking, as no auxiliary excitation voltage is applied to inhibit the ion/ion reaction of the [M + 17H]17+ species in that experiment.
The concentrated [M + 17H]17+ ion was isolated in Q1 (Figure 4d) and transferred to q2 where it was subjected to a second set of proton transfer ion/ion reactions. Reaction conditions were tuned such that the reduced charge states fell largely into the m/z 1500–2500 window. This region is highlighted in green in Figure 4e. Two high amplitude frequency sweeps were applied to q2, one using a frequency to eject ions lower in m/z than 1500, and one using a frequency to eject ions higher in m/z than 2500. The resulting spectrum is shown in Figure 4e.
In the case of reduced and alkylated trypsinogen, m/z 1500 to m/z 2500 contains 7 charge states, ranging from the [M + 16H]16+ ion to the [M + 10H]10+ ion. These charge states were subjected to broadband DDC collisional activation, maximizing the chances of observing the preferential cleavage at aspartic acid and proline residues. The MS/MS spectrum is provided in Figure 5. Only b- and y-type fragment ions, matched within a ± 15 ppm tolerance, are labeled. As summarized in Table 1, a total of 12 fragments were identified, 6 coinciding with selective fragmentation, 3 at aspartic acid and 3 at proline. Visually, it is evident that proline fragment ions are among the most abundant peaks in the spectrum.
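The charge states targeted for a given protein follow directly from its mass. The short Python sketch below (helper names are hypothetical) lists the charge states whose m/z falls within the m/z 1500–2500 window and the charge state just below m/z 1500 at which ion parking would be applied, using the observed masses from Table 1.

```python
PROTON_MASS = 1.007276  # Da


def mz(mass, z):
    """m/z of the [M + zH]z+ ion for a neutral mass in Da."""
    return (mass + z * PROTON_MASS) / z


def charge_states_in_window(mass, low=1500.0, high=2500.0):
    """Charge states (ascending) whose m/z lies within [low, high]."""
    hits = []
    z = 1
    while mz(mass, z) >= low:       # m/z decreases monotonically with charge
        if mz(mass, z) <= high:
            hits.append(z)
        z += 1
    return hits


def parking_charge(mass, low=1500.0):
    """Lowest charge state whose m/z falls just below the window."""
    z = 1
    while mz(mass, z) >= low:
        z += 1
    return z


trypsinogen = 24662.18  # observed mass of reduced/alkylated trypsinogen (Table 1)
print(charge_states_in_window(trypsinogen))  # [10, 11, 12, 13, 14, 15, 16]
print(parking_charge(trypsinogen))           # 17
```

For trypsinogen this reproduces the seven charge states quoted above and the [M + 17H]17+ ion used for ion parking.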
Table 1.
Observed Mass (Monoisotopic) | 8559.72 |
Theoretical Mass (Monoisotopic) | 8559.62 |
Mass Difference (ppm) | 12.08 |
Identity | Ubiquitin |
Species | Various |
Sequence | MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG |
Number of Basic Residues | Arg: 4, His: 1, Lys: 7, Total: 12 |
Matches | 22 of 100 (22 of 100)a |
McLuckey Score | 100719.97 (76750.99)b |
P-Score | 3.8e-29 (3.8e-29)c |
Matched Fragments | b7, b8, b9, b11, b13, b16, b18, b21, b32, b36, b39, b52, y9, y12, y13, y16, y18, y24, y25, y37, y55 |
| |
Observed Mass (Monoisotopic) | 16941.02 |
Theoretical Mass (Monoisotopic) | 16940.96 |
Mass Difference (ppm) | 3.27 |
Identity | Myoglobin |
Species | Equus caballus (Horse) |
Sequence | GLSDGEWQQVLNVWGKVEADIAGHGQEVLIRLFTGHPETLEKFDKFKHLKTEAEMKASEDLKKHGTVVLTALGGILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISDAIIHVLHSKHPGDFGADAQGAMTKALELFRNDIAAKYKELGFQG |
Number of Basic Residues | Arg: 2, His: 11, Lys: 19, Total: 32 |
Matches | 24 of 45 (11 of 45)a |
McLuckey Score | 36787.9 (5729.11)b |
P-Score | 6.1e-34 (9.9e-13)c |
Matched Fragments | b20, b41, b44, b47, b63, b78, b79, b96, b118, b122, y7, y8, y9, y10, y11, y12, y13, y27, y31, y34, y44, y74, y90 |
| |
Observed Mass (Monoisotopic) | 24662.18 |
Theoretical Mass (Monoisotopic) | 24661.82 |
Mass Difference (ppm) | 14.43 |
Identity | Trypsinogen |
Species | Bos taurus (Bovine) |
Sequence | VDDDDKIVGGYTCGANTVPYQVSLNSGYHFCGGSLINSQWVVSAAHCYKSGIQVRLGEDNINVVEGNEQFISASKSIVHPSYNSNTLNNDIMLIKLKSAASLNSRVASISLPTSCASAGTQCLISGWGNTKSSGTSYPDVLKCLKAPILSDSSCKSAYPGQITSNMFCAGYLEGGKDSCQGDSGGPVVCSGKLQGIVSWGSGCAQKNKPGVYTKVCNYVSWIKQTIASN |
Number of Basic Residues | Arg: 2, His: 3, Lys: 15, Total: 20 |
Matches | 12 of 143 (2 of 143)a |
McLuckey Score | 129384.11 (12828.84)b |
P-Score | 3.7e-10 (0.24)c |
Matched Fragments | b4, b5, b6, b10, b11, b12, b18, b59, b79, b111, y10, y23 |
| |
Observed Mass (Average) | 29024.55 |
Theoretical Mass (Average) | 29024.63 |
Mass Difference (ppm) | −2.83 |
Identity | Carbonic anhydrase 2 |
Species | Bos taurus (Bovine) |
Sequence | Ac-SHHWGYGKHNGPEHWHKDFPIANGERQSPVDIDTKAVVQDPALKPLALVYGEATSRRMVNNGHSFNVEYDDSQDKAVLKDGPLTGTYRLVQFHFHWGSSDDQGSEHTVDRKKYAAELHLVHWNTKYGDFGTAAQQPDGLAVVGVFLKVGDANPALQKVLDALDSIKTKGKSTDFPNFDPGSLLPNVLDYWTYPGSLTTPPLLESVTWIVLKEPISVSSQQMLKFRTLNFNAEGEPELLMLANWRPAQPLKNRQVRGFPK |
Number of Basic Residues | Arg: 9, His: 11, Lys: 18, Total: 38 |
Matches | 21 of 69 (2 of 69)a |
McLuckey Score | 183239.01 (12439.74)b |
P-Score | 1.6e-28 (0.047)c |
Matched Fragments | b3, b6, b8, b17, b18, b31, b33, b38, b40, b40, b70, y17, y27, y47, y48, y61, y62, y63, y64, y67, y68 |
a The number of matched fragments of the second ranked protein is provided in parentheses.
b The McLuckey Score of the second ranked protein is provided in parentheses.
c The P-Score of the second ranked protein is provided in parentheses.
Identification of Model Proteins
For each protein, database searching was initiated by retrieval of all proteins within the user-defined precursor tolerance. In all cases, this tolerance was set to ± 100 Da of the experimentally derived precursor mass, obtained from deconvolution based on Bayesian statistics within the PeakView Software (SCIEX, Concord, ON, Canada). Theoretical b- and y- fragment zero-charge masses were calculated for each protein in the retrieved list and compared to the experimental zero-charge masses. Proteins were ranked according to their McLuckey Score. The peak lists used in the database searches are provided in the supplemental information.
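The retrieval and matching steps can be illustrated with a minimal Python sketch. It is not the MATLAB search program used in this work: the function names are hypothetical, only the 20 unmodified amino acids are handled (no carbamidomethylation or N-terminal acetylation), and the zero-charge convention assumed here takes a b fragment as the sum of its residue masses and a y fragment as the sum of its residue masses plus water.

```python
WATER = 18.010565  # Da, monoisotopic

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def within_window(candidate_mass, precursor_mass, window_da=100.0):
    """True if a database entry falls inside the precursor retrieval window."""
    return abs(candidate_mass - precursor_mass) <= window_da


def fragment_masses(sequence):
    """Zero-charge b and y masses keyed by label (assumed convention, see text)."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    fragments = {}
    running = 0.0
    for i, m in enumerate(masses[:-1], start=1):          # b1 ... b(N-1)
        running += m
        fragments[f"b{i}"] = running
    running = WATER
    for j, m in enumerate(reversed(masses[1:]), start=1):  # y1 ... y(N-1)
        running += m
        fragments[f"y{j}"] = running
    return fragments


def match_fragments(observed, theoretical, tol_ppm=15.0):
    """Return (label, observed mass) for each theoretical fragment with a hit."""
    hits = []
    for label, theo in theoretical.items():
        for obs in observed:
            if abs(obs - theo) / theo * 1e6 <= tol_ppm:
                hits.append((label, obs))
                break
    return hits


# Toy example with a three-residue sequence and two made-up observed masses.
theoretical = fragment_masses("GAD")
print(match_fragments([128.0586, 300.0], theoretical))  # [('b2', 128.0586)]
```

In a full search, each retrieved sequence would be passed through `fragment_masses`, its hits collected with `match_fragments`, and the resulting matches scored and ranked.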
In the case of ubiquitin, 2350 proteins were retrieved and scored in the database search. The top 153 scores were identical and, upon inspection, all 153 were correctly identified as ubiquitin. This is an artifact of the database creation as there are several parent proteins, across all species, in the SWISS-PROT database containing ubiquitin chains. Of the 100 fragment masses obtained by the THRASH deconvolution algorithm, 22 corresponded to b- and y- fragments of ubiquitin. The results of the top scoring protein are summarized in Table 1. By concentrating the precursor signal via proton transfer ion/ion reaction in conjunction with ion parking, and moving the charge state distribution to a region which maximizes selective cleavages, we observed greater than 90% of the matched fragment ion signal corresponding to cleavage at aspartic acid and proline residues. Specifically, 86.4% of the total matched fragment area was contained in 7 unique aspartic acid fragments and 5.5% was contained in two proline fragment ions (Figure 6).
ProSight Lite was used to calculate a P-Score for each of the top five ranked proteins[56]. While the output of the database search ranked according to the McLuckey Score does not necessarily coincide with ranking based on P-Score, in the case of ubiquitin, the top five unique protein sequences based on P-Score were identical to the top five unique protein sequences ranked according to their McLuckey Score. As shown in Table 1 and Table 2, the top two ranked proteins each matched 22 fragment masses of the 100 experimental fragment masses, resulting in equivalent P-Scores[57]. In our approach, three specific aspartic acid fragments, b52, y37, and y55, provided sufficient information to distinguish between ubiquitin and ubiquitin related proteins (supplementary Table S1). These results demonstrate the utility of using a scoring system that is weighted both by abundance and cleavage site as the conventional approach was unable to distinguish between ubiquitin and the ubiquitin related protein.
Table 2.
Rank | Identity | Species | P-Score | Matches |
---|---|---|---|---|
1 | Ubiquitin | Various | 3.8e-29 | 22 of 100 |
2 | Ubiquitin-related | Various | 3.8e-29 | 22 of 100 |
3 | Ubiquitin-related | Equus caballus (Horse) | 3.3e-24 | 19 of 100 |
4 | Ubiquitin-related 1 | Cricetulus griseus (Chinese hamster) | 1.3e-22 | 18 of 100 |
5 | Ubiquitin-related 1 | Dictyostelium discoideum (Slime mold) | 5.2e-21 | 17 of 100 |
For apomyoglobin, the database search returned 2446 results. The highest scoring protein was correctly identified as myoglobin from Equus caballus, while the second and third scoring proteins, as shown in supplementary Table S2, were found to be myoglobin from different species. Here, greater than 50% of the matched fragment area arises from 7 aspartic acid fragments (Figure 6). While equine myoglobin contains only two arginine residues, there are 19 lysine residues and 11 histidine residues in addition to the N-terminus, for a total of 33 nominal basic sites. When the charge state distribution is moved to the m/z 1500 to m/z 2500 region, [M + 11H]11+ to [M + 7H]7+, the ionizing protons may well be solvated by multiple basic residues, thereby reducing proton mobility and allowing for prominent cleavage at the aspartic acid residues.
Like equine myoglobin, trypsinogen contains only two arginine residues. The database search, which included carbamidomethylation (+ 57.0198 Da) of cysteine residues, successfully identified trypsinogen, matching only 12 fragment masses. Despite the lowest number of matched fragments, there remains a factor of 10 difference in the score between trypsinogen and the second ranking protein (Table 1). Unlike apomyoglobin, where the majority of the matched fragment area arose from aspartic acid fragment ions, activation of trypsinogen resulted in cleavage at three proline residues, accounting for 86.7% of the matched fragment area (Figure 6). The differences between the two systems arise from the total number of basic residues. Trypsinogen contains only 20 basic residues compared to myoglobin’s 32. As previously stated, the m/z region subjected to DDC excitation for trypsinogen contains the [M + 16H]16+ to [M + 10H]10+ charge states. While some protons may be solvated by two basic residues, a sufficient number of mobile protons remain to initiate cleavage at proline residues.
Lastly, two database searches for carbonic anhydrase were performed. The first search was performed with no N-terminal modifications while the second accounted for N-terminal acetylation (+ 42.0095 Da). The results of the former are summarized in supplementary Table S4. As shown in Table S4, the top two proteins exhibit the same score and the same number of matched fragments. The two sequences were identified as bovine carbonic anhydrase II, where the sequences differ only by the cleavage of the initiator methionine residue. In both cases, the 10 matched fragments were identical (y-type fragments). It was observed that the experimentally derived average mass of the precursor was 89 Da lower in mass than carbonic anhydrase with the N-terminal methionine, whereas the average mass of the precursor was 42 Da greater in mass than the sequence lacking the initiator methionine.
While it is possible to deduce the correct sequence based on mass, a second search was performed with N-terminal acetylation. Here, the top scoring protein was, again, correctly identified as carbonic anhydrase II lacking the initiator methionine with N-terminal acetylation. Note that the results of the top five proteins, as shown in Table S5, do not include carbonic anhydrase II with the initiator methionine since addition of the N-terminal modification puts this sequence outside of the ± 100 Da retrieval window. When comparing the two searches, N-terminal acetylation matched 11 additional fragments (b-type fragments), for a total of 21 matched fragments (Table 1). Despite being the largest protein examined in this study, selective cleavages are still observed, with 56.9% of the total matched fragment area held in aspartic acid and proline cleavages (Figure 6).
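The mass bookkeeping behind these two searches can be checked with a few lines of Python. The average masses used for the methionine residue (131.19 Da) and the acetyl group (42.04 Da) are standard rounded values supplied here as assumptions, and the variable names are hypothetical; the observed and theoretical masses are those listed for carbonic anhydrase in Table 1.

```python
MET_RESIDUE_AVG = 131.19  # Da, average mass of a methionine residue (assumed value)
ACETYL_AVG = 42.04        # Da, average mass added by acetylation (assumed value)
WINDOW_DA = 100.0         # precursor retrieval window used in the searches

observed = 29024.55           # observed average mass (Table 1)
acetylated_no_met = 29024.63  # theoretical average mass (Table 1)

# Average masses of the four candidate forms of the same sequence.
candidates = {
    "no Met, unmodified": acetylated_no_met - ACETYL_AVG,
    "with Met, unmodified": acetylated_no_met - ACETYL_AVG + MET_RESIDUE_AVG,
    "no Met, acetylated": acetylated_no_met,
    "with Met, acetylated": acetylated_no_met + MET_RESIDUE_AVG,
}


def in_window(theoretical, observed_mass, window=WINDOW_DA):
    """True if a candidate would be retrieved for the observed precursor mass."""
    return abs(theoretical - observed_mass) <= window


for name, mass in candidates.items():
    delta = mass - observed
    print(f"{name:22s} {mass:10.2f}  delta = {delta:+8.2f} Da  "
          f"retrieved: {in_window(mass, observed)}")
```

The unmodified forms differ from the observed mass by roughly −42 Da and +89 Da and are both retrieved, whereas the acetylated form that retains the initiator methionine lies about 131 Da above the observed mass and therefore falls outside the ± 100 Da window.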
Conclusions
Proof-of-concept is presented here for a protein top-down tandem mass spectrometry methodology that exploits the advantages of ionization under denaturing conditions while also maximizing the likelihood for selective cleavages at aspartic acid and proline residues. The generation of precursor ions under denaturing conditions and the relatively short reaction times are amenable to an online LC-MS workflow, though demonstration of an online LC-MS workflow was outside the scope of the present work. Ion/ion reactions are used both to concentrate ions of interest into a single charge state for ion isolation and to generate subsequently precursor ions within a mass-to-charge range most likely to yield selective cleavages under collisional activation conditions. The objective is to generate sufficient fragmentation for identification while concentrating product ion signals into a relatively limited number of channels. This approach is not intended to compete with strategies that maximize the number of backbone cleavages for characterization purposes. Rather, it is a complementary approach that is effective in protein identification that offers advantages in dynamic range and quantitation via single- or multiple-reaction monitoring due to the fact that product ion signal is not distributed amongst a large number of product channels. Broad-band collisional activation using DDC of precursor ions within the targeted m/z window for all proteins studied here resulted in greater than 50% of the total matched fragment area arising from aspartic acid and proline cleavages (up to 91.9% for ubiquitin). While the conventional scoring system based on matching predicted and observed product masses can be used with this approach yielding respectable p-scores, a scoring system that weights most heavily C-terminal aspartic acid and N-terminal proline cleavages can add discriminatory power to an approach like this that is intended to maximize such cleavages. 
The combination of the ion processing workflow described here with an abundance-weighted scoring system led to the successful identification of four model proteins, ranging in mass from 8.5 kDa to 29.0 kDa. In the case of trypsinogen, successful identification was achieved with only 12 matched fragments (3 aspartic acid, 3 proline, and 6 non-specific cleavages), demonstrating the utility of maximizing selective fragmentation pathways. Interestingly, two of the selected model proteins contain an unusually low number of arginine residues. Nevertheless, selective cleavage at either proline or aspartic acid residues was noted. In addition to the application of intact protein identification, it stands to reason that this workflow may be useful in multiple reaction monitoring, particularly of low-level proteins.
Supplementary Material
Acknowledgments
This work was supported by the National Institutes of Health (NIH) under Grant GM R37-45372. Graduate student support for D.J.F. was provided by the W. Brooks Fortune Fellowship in Analytical Chemistry. D.J.F. would like to acknowledge Catherine Rawlins and Daniel Donnelly of the Agar Lab at Northeastern University for helpful discussions regarding top-down sample preparation.