Abstract
A 7 kDa toxin isolated from the venom of the Texas coral snake (Micrurus tener tener) was subjected to collision-induced dissociation (CID) and electron-transfer dissociation (ETD) analyses both before and after reduction at low pH. Manual and automated approaches to de novo sequencing are compared in detail. Manual de novo sequencing utilizing the combination of high accuracy CID and ETD data and an acid-related cleavage yielded the N-terminal half of the sequence from the reduced species. The intact polypeptide, containing 3 disulfide bridges, produced a series of unusual fragments in ion trap CID experiments: abundant internal amino acid losses were detected, and one of the disulfide-linkage positions could be determined from fragments formed by the cleavage of two bonds. Internal and c-type fragments were also observed.
Keywords: CID, ETD, de novo sequencing, fragmentation, disulfide-bridge, high mass accuracy, peak-picking
Introduction
Mass spectrometry became a significant player in peptide sequence determination approximately a quarter of a century ago [1, 2]. While the function of this technique has dramatically shifted towards high-throughput peptide identification utilizing the ever-expanding protein and genomic databases with automated search engines, mass spectrometry’s role in de novo sequencing remains important. Quite a few automated search engines can compensate for a series of “problems” in identifying protein fragments such as non-specific cleavages, misidentification of the monoisotopic mass, amino acid substitutions, and unexpected covalent modifications, but a combination of these may still prevent correct identification of a peptide. In addition, even fully-sequenced genomes yield incomplete protein databases because translations are usually predicted based on similarities with other species; species-specific sequences may not overlap sufficiently with known proteins for accurate prediction, and thus may be overlooked. Last but not least, we have very limited genomic (and therefore proteomic) information for a vast number of species.
Neuroscience frequently utilizes toxins as biochemical and pharmacological tools that can be used to manipulate key physiological receptors [3]. These toxins can be isolated from a wide variety of species, including sea snails, insects, fishes, centipedes, spiders, scorpions, lizards, and snakes. They represent a bewildering variety of structures, and in most instances genomic information is lacking. Furthermore, unusual covalent modifications frequently occur. Thus, de novo sequencing is typically required for the characterization of these interesting and valuable polypeptides.
Most de novo sequencing programs were developed for (and work most reliably for) tryptic peptides [4-6]. Presently, the PEAKS de novo sequencing software is used most frequently. Most available software cannot interpret/identify “atypical” fragmentation patterns very successfully. A new approach, composition-based sequencing (CBS), was developed to utilize the high mass accuracy of the precursor ions and some fragment masses for de novo sequencing. However, since the paradigm involves the determination of the amino acid composition of the entire peptide from a few per se unambiguous fragment ions, the method can only be used for relatively short sequences [7].
Collisional activation of non-tryptic sequences may represent a significant challenge because of the unpredictable fragmentation pattern (most tryptic peptides feature a single basic residue at the C-terminus, which leads to preferential charge retention and abundant y ions). Thus, the CBS method could be more successful for such peptides. Another potential solution would be to apply two complementary activation methods: electron-capture or electron-transfer dissociation (ECD and ETD) and collision-induced dissociation (CID) for the analysis of each molecule. While the primary cleavage site in CID is the peptide bond, yielding b and y fragments, the bond between the amino group and the α carbon will fragment in ECD/ETD experiments, producing mostly z• and c ions [8]. The combination of CID and ECD data has been reported for improved peptide identification [9] as well as for de novo sequencing [10]. Identification of complementary fragment ions in corresponding CID and ECD/ETD spectra provides a strong framework for sequence determination, and high-accuracy mass measurements make these identifications more reliable. The mass accuracy afforded may be sufficient to apply the CBS paradigm for a single ion series and obtain unambiguous sequence assignments. Indeed, that is what we demonstrate in this paper.
In this study we describe a polypeptide, MitTxα, purified from the venom of the Texas coral snake (Micrurus tener tener), that proved to be part of a heteromeric complex functioning as a selective agonist for acid-sensing ion channels [11]. MALDI mass measurement indicated a single, 7 kDa component in the fraction of interest. Thus, we set out to use different MS/MS techniques with high accuracy mass measurement to obtain sufficient sequence information either to identify the polypeptide (if it has been included in the public protein databases) or to provide suitable information for successful cloning experiments.
We will present how the combination of high accuracy ion trap CID and ETD data permitted sequencing of the N-terminal half of the molecule. We will show that de novo sequencing is still an area where human intervention is important for success. We will also show some unusual CID fragmentation for the intact polypeptide.
Experimental
Reduction/alkylation attempts
To aliquots of toxin solution in 25 mM ammonium bicarbonate buffer, pH~8, 15 nmoles of DTT was added, and the mixture was incubated at 56°C for 30 min. Then 32 nmoles of iodoacetamide or iodoacetic acid was added, and the mixture was incubated at room temperature for 30 min.
Successful reduction of the polypeptide
Toxin solution was incubated with 125 nmoles of TCEP in 0.1% formic acid, at 37°C, for 24h.
An aliquot of the mixture was analyzed on a NanoACQUITY (Waters)-LTQ-Orbitrap Velos LC/MS system in LC/MS mode in order to assess the success of reduction. For all LC/MS experiments solvents A and B were 0.1% formic acid in water and acetonitrile, respectively. Gradient elution from 5% to 40% B in 35 min was used to fractionate the components.
Peptide fragmentation analyses
MS/MS and MS3 analyses of the intact polypeptide were performed with an infused solution (~400 nL/min) of the polypeptide, using an LTQ-Orbitrap XL with ETD or CID activation in the linear trap. Fragments were measured in the Orbitrap. The isolation window was 10 Th for both MS2 and MS3 experiments, and 2 microscans were acquired. ETD activation time was 30 msec. The CID activation energy was set at 35% and 40% for the ETD→CID and the CID experiments, respectively. AGC targets were set at 10⁴ for the linear trap and 2×10⁵ for the Orbitrap, which was operated at a resolution of 30000.
MS/MS analysis of the short peptide was performed using an LTQ-Orbitrap Velos mass spectrometer – an aliquot of the TCEP-containing mixture was diluted with an equal amount of acetonitrile, and infused at a flow rate of ~400 nL/min. CID experiments on both the doubly and singly charged precursor ions were performed in the linear trap at a 35% normalized collision energy, while the fragments were measured in the Orbitrap. Higher-energy collision-dissociation (HCD) collision energy was also set to 35%. The precursor ion window was 2 Th, resolution was set to 30000, and the AGC setting was as in the LC/MS/MS experiment below.
MS/MS analysis of the long peptide was performed using an LTQ-Orbitrap Velos mass spectrometer – the analysis was performed in LC/MS mode. Precursor as well as MS/MS fragment masses were measured in the Orbitrap, at a resolution of 30000 and 15000, respectively. The isolation width was set to 7 Th, and the minimum peak intensity to trigger CID or ETD analysis was set to 2000. The AGC target for MS/MS experiments was set to 10⁴ and 8×10⁴ for the linear trap and the Orbitrap, respectively. Single microscans were acquired for CID at 35% normalized collision energy and for ETD at 37.5 msec activation time. The fluoranthene AGC target was 10⁶.
Data processing
Scans representing the same MS/MS experiments were merged using the Xcalibur (v2.1.0 build 1139) software.
Peak lists were generated using Xtract as well as manually. Xtract is a feature of Xcalibur, which converts the ions observed into a list of singly charged masses. The resolution afforded and the signal-to-noise ratio of the ions to be deconvoluted can be specified. These parameters are listed with the peak lists presented.
De novo sequencing was performed manually as presented below.
MS-Product and MS-Comp of Protein Prospector v5.8.0 (www.prospector.ucsf.edu) were used to display instrument-specific fragmentation and potential amino acid compositions, respectively. PEAKS Studio 5.3 build 20110719 was also tested for de novo sequencing [6].
Spectra were also evaluated using the in-house software FAVA, which performs peak-picking using the sequence determined [12].
Results and discussion
A toxin-containing fraction, named MitTxα, from the venom of the Texas coral snake (Micrurus tener tener) [11] was subjected to mass spectrometry analysis via infusion in an LTQ-Orbitrap. The polypeptide’s monoisotopic mass was determined from ions at m/z 888.3915(8+), 1015.1630(7+), 1184.1883(6+), and 1420.8202(5+) as 7100.0830 (MH+). The toxin molecule seemed to be both sufficiently small and sufficiently charged for ETD analysis. Unfortunately, the ETD analysis of m/z 1015(7+) (the 8+ ion was not abundant enough) yielded no information besides the charge-reduced ions (data not shown).
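The conversion from the observed charge states to the singly protonated mass can be sketched as follows. This is a minimal illustration, assuming a proton mass of 1.00728 Da and a plain arithmetic mean over the four charge states; the text does not state how the individual values were combined, so the mean is close to, but not identical with, the reported 7100.0830.

```python
# Sketch: convert each observed charge state to the singly protonated mass (MH+).
PROTON = 1.00728  # Da, proton mass (assumed constant, not stated in the text)

# (monoisotopic m/z, charge) pairs reported for the intact toxin
observed = [(888.3915, 8), (1015.1630, 7), (1184.1883, 6), (1420.8202, 5)]


def mh_plus(mz: float, z: int) -> float:
    """Singly charged mass from a monoisotopic m/z measured at charge z."""
    return mz * z - (z - 1) * PROTON


estimates = [mh_plus(mz, z) for mz, z in observed]
mean_mh = sum(estimates) / len(estimates)

for (mz, z), mh in zip(observed, estimates):
    print(f"{mz:.4f} ({z}+) -> MH+ {mh:.4f}")
print(f"mean MH+ = {mean_mh:.4f}")
```

The spread between the individual estimates gives a quick impression of how consistent the charge-state assignments are.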
Considering that sometimes fragmentation occurs, but the newly formed ions are kept together by electrostatic interactions or hydrogen-bridges, an MS3 experiment was performed: first m/z 1015(7+) was subjected to ETD activation, and then the most abundant charge-reduced ion m/z 1422(5+) was subjected to CID analysis (Figure 1). Only a few abundant fragments were detected. The mass differences observed in the first ion cluster were 31.9719 Da (1204-1172) & (1238-1206), and 33.9881 Da (1206-1172) & (1238-1204). The theoretical mass for S is 31.9720 and for H2S is 33.9877, and so the observed mass differences are the signature of an intermolecular disulfide linkage and correspond to cleavages across the disulfide bridge [13]. Since the other abundant fragment at m/z 1474.89(4+) corresponds to the ‘missing’ part of the molecule, these data suggested the presence of a disulfide-bound heterodimer.
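The comparison of the observed mass differences with the theoretical values for S and H2S can be reproduced with a few lines. The element masses below are standard monoisotopic values and are an assumption of this sketch; the observed differences are those quoted above.

```python
# Standard monoisotopic element masses (Da); assumed, not taken from the text.
S = 31.97207
H = 1.007825

theoretical = {"S": S, "H2S": S + 2 * H}
observed = {"S": 31.9719, "H2S": 33.9881}  # mass differences quoted in the text

errors_mda = {}
for label, obs in observed.items():
    theo = theoretical[label]
    errors_mda[label] = 1000 * (obs - theo)
    print(f"{label}: theoretical {theo:.4f}, observed {obs:.4f}, "
          f"error {errors_mda[label]:+.2f} mDa")
```

Both differences fall well within 1 mDa of the theoretical values, which is what allows them to be read as a disulfide signature rather than as an amino acid loss.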
Figure 1.
MS3 ETD/CID 1015(7+) → 1422(5+) spectrum of the intact molecule. Sample solution was infused, activation was performed in the linear trap, and fragments were measured in the Orbitrap. The fragments detected suggested the presence of a disulfide-bound heterodimer. The shorter component displayed the characteristic ion-triplet, as indicated. Fragments formed via cleavage across the disulfide bridge are labeled with asterisks.
Disulfide-bridges in toxins are rather common. Thus, we proceeded with reduction/alkylation of the molecule following a general protocol with DTT and iodoacetamide. This reaction, attempted multiple times, led to complete sample loss. First we suspected solubility problems as a result of carbamidomethylation, and also experimented with iodoacetic acid with the same negative results. Oxidation of the disulfides was an option that we discarded, since the introduction of negatively charged groups (i.e. cysteic acids) would be unlikely to improve the charge density and ETD efficiency.
We decided to keep the peptide in an acidic solution (0.1% formic acid in water) resembling that which was used for its purification and performed the reduction at low pH with TCEP. Extended reaction time, elevated temperature and larger than usual reagent excess were used in order to achieve complete reduction. The results were somewhat surprising. The major component observed by mass spectrometry demonstrated a molecular weight equal to that of the full-length polypeptide with a 6 Da mass increase, indicating 3 reduced disulfide bridges. Its ions were detected at m/z 889.1488(8+), 1016.0285(7+), 1185.1955(6+) and 1422.0243(5+), yielding a monoisotopic mass of 7106.1281 (MH+). The (7+) ion of the full length toxin before and after the reduction is shown in Figure 2. In addition, the reduction yielded two individual peptides, one with MH+ = 5900.6184 (m/z 738.4600(8+), 843.8096(7+) and 984.2748(6+); Figure 2, lower panel), and the other with MH+ = 1224.5352 (also at m/z 612.7715(2+)), which is 18 Da larger than expected based on the MS3 fragmentation.
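The arithmetic behind the disulfide count can be sketched as below, together with a consistency check on the two individual peptides. The constants are standard monoisotopic values, and reading the excess mass as one added water molecule is an interpretation of this sketch rather than a statement from the text.

```python
H = 1.007825      # Da, monoisotopic hydrogen atom (assumed constant)
H2O = 18.010565   # Da, monoisotopic water (assumed constant)
PROTON = 1.00728  # Da, proton mass (assumed constant)

mh_intact = 7100.0830   # MH+ before reduction
mh_reduced = 7106.1281  # MH+ after reduction
mh_long = 5900.6184     # MH+ of the larger individual peptide
mh_short = 1224.5352    # MH+ of the smaller individual peptide

# Each reduced disulfide bridge adds two hydrogen atoms.
shift = mh_reduced - mh_intact
n_disulfides = round(shift / (2 * H))

# Two separate MH+ species carry one more proton than a single one,
# so one proton is subtracted before comparing with the reduced toxin.
excess = mh_long + mh_short - PROTON - mh_reduced

print(f"mass shift {shift:.4f} Da -> {n_disulfides} disulfide bridges")
print(f"two peptides exceed the reduced toxin by {excess:.4f} Da "
      f"(water = {H2O:.4f} Da)")
```

An excess close to the mass of water would be consistent with the two peptides arising from hydrolysis of a single bond in the reduced polypeptide.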
Figure 2.
Results of the ESIMS analyses. The upper and middle panels show the (7+) ion for the intact toxin before and after reduction, respectively. The lower panel shows the (8+) ion of the bigger individual peptide formed during the low pH reduction experiment.
The smaller peptide (MH+ =1224.5352) was analyzed first from the infused reaction mixture. Sequencing this peptide was relatively straightforward; we had good quality CID and HCD data for both the doubly and singly charged precursor ions, with the latter ones being more informative for de novo sequencing (Figure 3). We started with the CID data, since it was expected that both N-terminal and C-terminal ions would be present, while b-type fragments frequently do not survive the multiple collisions occurring during HCD activation (Nomenclature: [14]). The very abundant ion triplet at m/z 1063, 1091 and 1109 was readily identified as an-1, bn-1 and bn-1+H2O (where n= the number of residues in the peptide) revealing that the C-terminal residue must be Asp (1224-1109=115) and also suggesting that there must be a basic residue somewhere in the sequence, because such rearrangement has been reported for peptides with preferential charge retention at the N-terminus [15, 16]. The other a-b fragment pairs could be easily identified downstream at m/z 934 & 962, 771 & 799, 668 & 696, 521 & 549, identifying Glu, Tyr, Cys, and Phe, respectively. Interestingly and unexpectedly, internal ions were also detected in the CID spectrum, m/z 582 and 554, that were separated by 28 Da as well, but luckily ions belonging to the b ion series also featured an ammonia loss and were therefore distinguishable. Thus, the sequence could be tentatively determined at this point to be …FCYED. However, the clues fizzled out here, and we had to turn to the more complex HCD data to proceed (Figure 3, lower panel). From there the next a-b pair was identified at m/z 353 & 381 (the latter was also detected in the CID data). The 168 Da gap between this b fragment and the previously identified b at m/z 549 could only correspond to a Pro-Ala combination. 
Since Pro residues usually produce an abundant y fragment, it was easy to determine that these amino acids are indeed present and that they appear in this order. Hence, m/z 478 (and the highly abundant m/z 461 ion that represents an ammonia loss) could be unambiguously identified as the “missing” b-fragment, while the y fragment formed via cleavage at the N-terminal side of Pro is at m/z 844, and our working sequence is now …PAFCYED. Considering internal fragments from this sequence helps to assign numerous ions in the low mass region, such as m/z 169 & 141 (PA & PA-28), 316 & 288 (PAF & PAF-28), 267 & 239 (CY & CY-28) etc. (for complete list see Table 1). At the same time we believe that there must be a basic amino acid in the sequence, and from low mass ions at m/z 112 and 115, one suspects an Arg [17]. Indeed, there is an a-b pair at m/z 353 and 381, indicating that the next residue towards the N-terminus is an Arg, and a further a-b pair at m/z 197 and 225. However, according to the ‘MS-Comp’ feature of Protein Prospector, there is no amino acid combination that would yield a b fragment at this mass, even if we permit a 50 ppm mass measurement error, which is much higher than our instrument actually affords. If, however, a common N-terminal modification, the cyclization of a Gln residue, is considered, then the fragment is within 2 ppm for b2 of <Gln[Ile/Leu] (N-terminal acetylation was also considered and checked).
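The composition reasoning used for the 168 Da gap and for the b2 ion can be illustrated with a small brute-force search. This is a hedged sketch in the spirit of a composition calculator, not the MS-Comp algorithm itself; the function name `compositions`, the residue table and the tolerances in Da are assumptions of the example.

```python
from itertools import combinations_with_replacement

PROTON = 1.00728  # Da
NH3 = 17.02655    # Da

# Monoisotopic residue masses (Da). Leu and Ile are isobaric and share one entry.
RESIDUES = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L/I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259,
    "M": 131.04049, "H": 137.05891, "F": 147.06841, "R": 156.10111,
    "Y": 163.06333, "W": 186.07931,
}
PYRO_GLU = RESIDUES["Q"] - NH3  # N-terminal Gln cyclized to pyroglutamate


def compositions(mass, tol, max_len):
    """Residue combinations whose summed mass lies within tol (Da) of mass."""
    hits = []
    for n in range(1, max_len + 1):
        for combo in combinations_with_replacement(RESIDUES, n):
            if abs(sum(RESIDUES[r] for r in combo) - mass) <= tol:
                hits.append("+".join(combo))
    return hits


# 1) Gap between the measured b3 and b5 ions (Table 1)
gap = 549.3123 - 381.2234
print(f"gap {gap:.4f}:", compositions(gap, tol=0.005, max_len=2))

# 2) Measured b2 ion, unmodified residues only, 50 ppm window
b2 = 225.1229
print("unmodified:", compositions(b2 - PROTON, tol=50e-6 * b2, max_len=4))

# 3) Same ion, assuming an N-terminal pyroglutamate, 5 ppm window
print("pyroGlu + :", compositions(b2 - PROTON - PYRO_GLU, tol=5e-6 * b2, max_len=3))
```

The search only reports compositions; the order of the residues (Pro before Ala) still has to come from additional fragments, as described above.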
Figure 3.
CID (upper panel) and HCD (lower panel) of m/z 1224.53. Fragments in both analyses were measured in the Orbitrap. The peptide sequence was determined from these data as <QL/IRPAFCYED. In the CID spectrum the unexpected internal fragments are labeled, while some of the abundant fragments were assigned in the HCD spectrum. Annotation ○ indicates ammonia loss. Otherwise the Biemann nomenclature was applied [14]. For complete fragment assignment see Table 1. The base peak in the CID spectrum featured an intensity ~3000, while in HCD ~6000.
Table 1.
HCD fragment list “predicted” by MS-Product of Protein Prospector for the <GlnLeu/IleArgProAlaPheCysTyrGluAsp sequence, and the fragments detected (Figure 3).
| <Gln[Ile/Leu]ArgProAlaPheCysTyrGluAsp = <QL/IRPAFCYED | I cannot differentiate between isomeric amino acids Ile and Leu. | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| calc mass | measured | ppm | ID | calc mass | measured | ppm | ID | calc mass | measured | ppm | ID | calc mass | measured | ppm | ID | calc mass | measured | ppm | ID |
| 70.0651 | R | 251.0849 | 251.0842 | −3 | FC | 414.1482 | 414.1468 | −3 | FCY | 585.3507 | LRPAF | 849.3712 | RPAFCYE-H2O | ||||||
| 70.0651 | P | 253.1659 | LR-NH3 | 419.1748 | 419.1746 | 0 | PAFC | 586.2330 | AFCYE-28 | 850.3552 | RPAFCYE-NH3 | ||||||||
| 84.0808 | Q | 254.1612 | RP | 421.2558 | LRPA-NH3 | 596.2173 | AFCYE-H2O | 851.4233 | LRPAFCY | ||||||||||
| 86.0964 | L | 263.0874 | 263.0865 | −3 | y2 | 426.1507 | 426.1497 | −2 | y3 | 614.2279 | AFCYE | 867.3818 | RPAFCYE | ||||||
| 87.0917 | R | 265.1183 | 265.1174 | −3 | YE-28 | 433.2558 | 433.2545 | −3 | a4-NH3 | 651.3614 | 651.3593 | −3 | a6-NH3 | 917.4339 | a8-NH3 | ||||
| 88.0393 | D | 267.0798 | 267.0790 | −3 | CY | 438.2823 | LRPA | 658.2177 | y5-H2O | 934.4604 | 934.4568 | −4 | a8 | ||||||
| 100.0869 | R | 270.1925 | LR | 444.2718 | RPAF-28 | 660.3650 | LRPAFC-28 | 945.4288 | 945.4246 | −4 | b8-NH3 | ||||||||
| 101.0709 | Q | 275.1026 | YE-H2O | 450.2824 | a4 | 668.3879 | 668.3857 | −3 | a6 | 952.4709 | LRPAFCYE-28 | ||||||||
| 102.0550 | E | 288.1707 | 288.1697 | −3 | PAF-28 | 455.2401 | RPAF-NH3 | 671.3334 | LRPAFC-NH3 | 962.4553 | 962.4516 | −4 | LRPAFCYE-H2O | ||||||
| 112.0869 | R | 293.1132 | 293.1124 | −3 | YE | 457.1904 | AFCY-28 | 676.2283 | 676.2257 | −4 | y5 | 962.4553 | 962.4516 | −4 | b8 | ||||
| 116.0342 | y1-H2O | 294.1271 | AFC-28 | 461.2507 | 461.2492 | −3 | b4-NH3 | 679.3563 | 679.3540 | −3 | b6-NH3 | 963.4393 | LRPAFCYE-NH3 | ||||||
| 120.0808 | 120.0804 | −3 | F | 297.2034 | RPA-28 | 472.2667 | RPAF | 683.2858 | 683.2827 | −5 | PAFCYE-28 | 980.4658 | LRPAFCYE | ||||||
| 126.0550 | P | 308.1717 | 308.1708 | −3 | RPA-NH3 | 478.2773 | b4 | 688.3599 | LRPAFC | 980.4659 | 980.4615 | −4 | b8+H2O | ||||||
| 129.0659 | Q | 316.1656 | 316.1645 | −3 | PAF | 485.1853 | AFCY | 693.2701 | PAFCYE-H2O | 982.4087 | y8-H2O | ||||||||
| 134.0448 | 134.0445 | −2 | y1 | 322.1220 | AFC | 504.2929 | 504.2912 | −3 | a5-NH3 | 696.3828 | 696.3797 | −4 | b6 | 983.3927 | 983.3885 | y8-NH3 | |||
| 136.0757 | 136.0755 | −1 | Y | 325.1983 | RPA | 511.1493 | y4-H2O | 710.3443 | RPAFCY-28 | 1000.4193 | 1000.4151 | −4 | y8 | ||||||
| 141.1022 | PA-28 | 336.2031 | a3-NH3 | 515.1959 | FCYE-28 | 711.2807 | 711.2783 | −3 | PAFCYE | 1045.4924 | a9-H2O | ||||||||
| 169.0972 | 169.0968 | −2 | PA | 339.2503 | LRP-28 | 521.3195 | 521.3177 | −3 | a5 | 721.3126 | RPAFCY-NH3 | 1046.4765 | a9-NH3 | ||||||
| 180.1020 | a2-NH3 | 350.2187 | LRP-NH3 | 525.1802 | FCYE-H2O | 729.2549 | y6-H2O | 1063.5030 | 1063.4987 | −4 | a9 | ||||||||
| 191.1179 | AF-28 | 353.2296 | 353.2285 | −3 | a3 | 529.1599 | y4 | 738.3392 | RPAFCY | 1073.4874 | b9-H2O | ||||||||
| 197.1285 | 197.1280 | −3 | a2 | 364.1980 | 364.1969 | −3 | b3-NH3 | 532.2879 | 532.2860 | −4 | b5-NH3 | 747.2654 | y6 | 1074.4714 | b9-NH3 | ||||
| 208.0969 | b2-NH3 | 367.2452 | LRP | 543.1908 | 543.1891 | −3 | FCYE | 754.3705 | 754.3675 | −4 | a7-NH3 | 1091.4979 | 1091.4937 | −4 | b9 | ||||
| 219.1128 | AF | 368.1275 | CYE-28 | 547.2809 | RPAFC-28 | 771.3971 | 771.3942 | −4 | a7 | 1095.4928 | y9-H2O | ||||||||
| 223.0900 | FC-28 | 378.1118 | CYE-H2O | 549.3144 | 549.3123 | −4 | b5 | 782.3655 | 782.3622 | −4 | b7-NH3 | 1096.4768 | y9-NH3 | ||||||
| 225.1234 | 225.1229 | −2 | b2 | 381.2245 | 381.2234 | −3 | b3 | 554.2432 | 554.2414 | −3 | PAFCY-28 | 799.3920 | 799.3890 | −4 | b7 | 1109.5085 | 1109.5039 | −4 | b9+H2O |
| 226.1662 | RP-28 | 386.1533 | FCY-28 | 557.3558 | LRPAF-28 | 823.4283 | LRPAFCY-28 | 1113.5034 | y9 | ||||||||||
| 237.1346 | 237.1339 | −3 | RP-NH3 | 391.1798 | PAFC-28 | 558.2493 | 558.2476 | −3 | RPAFC-NH3 | 826.3076 | 826.3048 | −3 | y7-H2O | 1206.5249 | 1206.5191 | −5 | MH-H2O | ||
| 239.0849 | 239.0842 | −3 | CY-28 | 396.1224 | CYE | 568.3242 | LRPAF-NH3 | 834.3967 | LRPAFCY-NH3 | 1207.5089 | MH-NH3 | ||||||||
| 242.1975 | LR-28 | 408.1401 | y3-H2O | 575.2759 | RPAFC | 839.3869 | RPAFCYE-28 | 1224.5354 | 1224.5352 | 0 | MH | ||||||||
| 245.0768 | 245.0761 | −3 | y2-H2O | 410.2874 | LRPA-28 | 582.2381 | 582.2363 | −3 | PAFCY | 844.3182 | 844.3147 | −4 | y7 | ||||||
Thus, the final sequence of the short peptide is <Gln-[Ile/Leu]-Arg-Pro-Ala-Phe-Cys-Tyr-Glu-Asp. Fragments calculated by ‘MS-Product’ of Protein Prospector for ESI-Q-TOF instrument selection (which also corresponds to HCD fragmentation) and fragments observed are listed in Table 1, along with the errors in mass measurement.
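A few of the calculated values in Table 1 can be reproduced from the final sequence with the sketch below. It is not MS-Product; it only applies the usual definitions of a, b and y ions, with standard monoisotopic constants that are assumptions of the example. "L" stands for Leu or Ile.

```python
PROTON, H2O, NH3, CO = 1.00728, 18.01056, 17.02655, 27.99491  # Da

RESIDUES = {"L": 113.08406, "R": 156.10111, "P": 97.05276, "A": 71.03711,
            "F": 147.06841, "C": 103.00919, "Y": 163.06333, "E": 129.04259,
            "D": 115.02694, "Q": 128.05858}

# Pyroglutamate (Gln minus NH3) followed by LRPAFCYED.
masses = [RESIDUES["Q"] - NH3] + [RESIDUES[r] for r in "LRPAFCYED"]
n = len(masses)

b = {i: sum(masses[:i]) + PROTON for i in range(1, n)}           # N-terminal ions
a = {i: b[i] - CO for i in b}                                    # b minus CO
y = {i: sum(masses[n - i:]) + H2O + PROTON for i in range(1, n)}  # C-terminal ions
mh = sum(masses) + H2O + PROTON

# A few measured values from Table 1 for comparison
measured = {"a2": 197.1280, "b2": 225.1229, "b5": 549.3123,
            "b9": 1091.4937, "y7": 844.3147}
calculated = {"a2": a[2], "b2": b[2], "b5": b[5], "b9": b[9], "y7": y[7]}

for ion, obs in measured.items():
    calc = calculated[ion]
    print(f"{ion}: calc {calc:.4f}, measured {obs:.4f}, "
          f"{(obs - calc) / calc * 1e6:+.1f} ppm")
print(f"MH+: calc {mh:.4f}")
```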
We tested the most popular de novo sequencing program, PEAKS, with this spectrum. The raw data were loaded, the precursor ion m/z value was manually corrected, and the spectrum was processed by the program. For de novo sequencing, no enzyme was specified, and the mass accuracies considered were 5 ppm and 0.02 Da for the precursor and fragments, respectively (relative mass accuracy cannot be specified for fragments). After a series of failed attempts we realized that instrument selection was critical, as the software had to recognize the data as ‘high energy CID’ fragmentation in order to determine sequences (even though this definition is not correct). From this experience we arrived at the same conclusion as the 2011 ABRF iPRG study: being highly familiar with the software produces much better results [18]. When ion trap CID was considered, the abundant internal fragments proved sufficient to trip the algorithm. Q-TOF, Q-FT, and FTMS instrument selection yielded the correct sequence (obviously only if the software was informed about the possibility of a blocked terminus, i.e. N-acetylation or pyroglutamate formation were permitted). However, less than 10% confidence was attached when the first 2 instruments were specified. With the FTMS instrument selection, the following sequence was determined with 90% confidence: [224.1]RPA[413.12]E[115]. The software considers y fragments twice as significant as b ions, and considers immonium ions and internal fragments only during the 3rd step while reevaluating the 10000 best candidates [6]. The first premise is not necessarily true for non-tryptic sequences, as illustrated with the abundant N-terminal ions of this peptide, and we believe this fact definitely contributed to the low confidence of the sequence determination. 
At the same time we cannot tell which internal fragments were identified and used by the algorithm: The fragment labeling option included the internal fragments as default, however these ions were not assigned by the software. Strangely only m/z 112 and 169 were indicated, both incorrectly as PAFCY-28. The confidence difference between the assignments is also puzzling, since data processing with each instrument-type yielded 93 identical masses (judging from the mass accuracy charts accompanying each assignment).
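Because the fragment tolerance could only be given in absolute units, it is worth seeing what 0.02 Da means in relative terms across the mass range of this spectrum. The helper names below are hypothetical, and the example masses are calculated values from Table 1.

```python
def da_to_ppm(delta_da: float, mz: float) -> float:
    """Express an absolute mass window (Da) as a relative one (ppm) at a given m/z."""
    return delta_da / mz * 1e6


def ppm_to_da(ppm: float, mz: float) -> float:
    """Express a relative mass window (ppm) as an absolute one (Da) at a given m/z."""
    return ppm * mz / 1e6


# A fixed 0.02 Da fragment window at a low, a medium and a high fragment mass
for mz in (112.0869, 549.3144, 1091.4979):
    print(f"0.02 Da at m/z {mz:.4f} = {da_to_ppm(0.02, mz):.0f} ppm")

# The 5 ppm precursor window at the precursor mass
print(f"5 ppm at m/z 1224.5354 = {ppm_to_da(5, 1224.5354):.4f} Da")
```

A fixed window in Da is therefore much more permissive, in relative terms, for the low-mass immonium and internal ions than for the large b and y fragments.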
Next, we turned to analyzing the larger (MH+ = 5900.6184) peptide that was observed after reduction of the toxin. While the short peptide gave good quality data upon direct infusion of the reaction mixture, the longer sequence had to be analyzed by LC/MS/MS analysis. The precursor ion selection was restricted, alternating CID and ETD data were acquired from m/z 738(8+). Both activation steps were performed in the linear trap, and the fragments were measured in the Orbitrap.
CID spectra and ETD spectra between 38.5 and 39 min (representing the apex of the eluting peak) were merged starting around 50% XIC intensity and were used for partial de novo sequencing (spectra are shown with Supplementary Tables 1 and 2). Monoisotopic masses and charge states were determined manually and inserted into an Excel workbook, using separate sheets for CID and ETD data (Supplementary Tables 1 and 2; Table 2 represents a combined, concise version). We compared the manual peak-picking with that of the deconvolution program, Xtract, supplied by the manufacturer. Because of the overlapping multiply-charged peaks and weak or absent monoisotopic ions, neither solution was entirely satisfactory (Supplementary Tables 3-5). We found that the Xtract program tends to overlook or misassign some singly-charged ions (see Supplementary Table 3), but otherwise the greatest difficulties are caused by overlapping fragment ions. In general, for large multiply-charged ions both “eyeballing” and modeling based on the “averagine” peptide composition work equally well, but not completely reliably. The manually-generated peak list may not be as complete as the software-generated one, but it contained all of the important singly-charged ions and was definitely more reliable for de novo sequencing. This was especially true for the information-rich CID data. In the Excel workbook the singly-charged accurate masses were calculated from the monoisotopic masses and charges using the formula MH+ = z × (m/z) − (z − 1) × 1.00728, where z is the charge and 1.00728 Da is the mass of a proton. Then, the fragment masses were sorted. Since members of the ion series usually cannot be identified per se, a series of other masses were calculated using the observed masses as reference points. In the CID Table (Table 2, Supplementary Table 1) the nominal masses of the corresponding complementary ions (y fragment for b and vice versa) were calculated using the formula . 
In the ETD Table (Table 2, Supplementary Table 2) two columns were created: with masses 17.0265 Da (NH3) lower and 16.0187 Da (NH2) higher than the fragments detected, representing the corresponding potential b and y fragments, respectively. With all this information in the Excel worksheets, the manual interpretation began. Determining the termini was relatively easy. ETD-fragment m/z 197 was identified as z2•, consisting of a His and a Gly, while m/z 212 is c2, corresponding to ProPro. These assignments are unambiguous because these were the only potential structures within 5 ppm (according to the MS-Comp feature of Protein Prospector). Based on the ETD data, a series of b fragments could be identified in the CID spectrum, and so the sequence was built from the N-terminus. First, 2 Phe residues were added to the sequence, but these were followed by a 256.155 gap that could be filled by either an {Ala, Gly, Lys} or a {Gln, Lys} combination. Obviously an Ala, Gly combination would yield the very same ‘c’ fragment as the Gln. We discarded this option for the lack of any supporting ions in the CID. Based on this information, the c fragment at m/z 634 was considered, which, due to the afforded mass accuracy, identifies a Gln in position-5 and, consequently, a Lys in position-6.
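The two helper columns of the ETD worksheet, and the terminal assignments derived from them, can be sketched as follows. The offsets are the ones quoted above; the residue masses, the proton and water masses, and the function names are assumptions of this illustration.

```python
NH3_OFFSET = 17.0265  # Da, c ion minus this offset gives the candidate b ion
NH2_OFFSET = 16.0187  # Da, z ion plus this offset gives the candidate y ion
PROTON, H2O = 1.00728, 18.01056  # Da (assumed constants)

RESIDUES = {"G": 57.02146, "P": 97.05276, "H": 137.05891}


def candidate_b(etd_mass: float) -> float:
    """b-ion mass to look for in CID if the ETD fragment is a c ion."""
    return etd_mass - NH3_OFFSET


def candidate_y(etd_mass: float) -> float:
    """y-ion mass to look for in CID if the ETD fragment is a z ion."""
    return etd_mass + NH2_OFFSET


def c_ion(residues: str) -> float:
    return sum(RESIDUES[r] for r in residues) + NH3_OFFSET + PROTON


def z_ion(residues: str) -> float:
    return sum(RESIDUES[r] for r in residues) + H2O + PROTON - NH2_OFFSET


def ppm(observed: float, calculated: float) -> float:
    return (observed - calculated) / calculated * 1e6


print(f"634.3337 as c -> b at {candidate_b(634.3337):.4f}")
print(f"197.0797 as z -> y at {candidate_y(197.0797):.4f}")
print(f"c2 (ProPro): {ppm(212.1393, c_ion('PP')):+.1f} ppm")
print(f"z2 (Gly, His): {ppm(197.0797, z_ion('GH')):+.1f} ppm")
```

Each ETD fragment is tested under both hypotheses, and the hypothesis that finds a matching partner in the CID list decides whether the ion belongs to the N-terminal or the C-terminal series.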
Table 2.
Simplified peak list from the MS/MS analysis of the longer peptide. Comments indicate how the fragments were assigned. Residues linking up within the fragment series are indicated. Complementary y/b ions are listed for both CID and ETD data. Detailed explanation is in the text, and for complete lists, see Suppl. Tables 1 and 2.
| CID | corr. CID fragments | ETD | ||||||
|---|---|---|---|---|---|---|---|---|
| deconv. | Comments | Residues | comp.f. | b | y | deconv. | Comment | Residues |
| ProPro | 180.0532 | 213.0984 | 197.0797 | z2 for GH | {HisGly} | |||
| 342.1808 | b, based on ETD | Phe↑ | 5559.6 | 195.1128 | 228.1580 | 212.1393 | c2 for PP | ProPro |
| 349.1504 | 5552.6 | 298.1672 | 331.2124 | 315.1937 | ||||
| 461.2555 | a, correct mass | 5440.5 | 309.0639 | 342.1091 | 326.0904 | |||
| 489.2495 | b, based on ETD | Phe↑ | 5412.6 | 327.1700 | 360.2152 | 344.1965 | ||
| 506.2758 | 5395.5 | 342.1809 | 375.2261 | 359.2074 | c3, CID & MS comp | Phe↑ | ||
| 634.3351 | c, based on ETD | Gln↑ | 5267.5 | 398.1492 | 431.1944 | 415.1757 | ||
| 728.3748 | 5173.4 | 474.2378 | 507.2830 | 491.2643 | ||||
| 745.4025 | b, based on ETD | Lys↑ | 5156.4 | 489.2485 | 522.2937 | 506.2750 | c4, based on CID | Phe↑ |
| 788.3290 | 5113.5 | 539.2383 | 572.2835 | 556.2648 | ||||
| 848.4122 | 5053.4 | 617.3072 | 650.3524 | 634.3337 | c5, MS-comp | Gln↑ | ||
| 905.4288 | b, based on ETD | 4996.4 | 668.2931 | 701.3383 | 685.3196 | |||
| 976.4706 | b, based on ETD | Ala↑ | 4925.3 | 745.4035 | 778.4487 | 762.4300 | c6, based on CID | Lys↑ |
| 1004.4040 | 4897.4 | 848.4116 | 881.4568 | 865.4381 | c7, based on CID | Cys↑ | ||
| 1095.5432 | a, correct mass | 4806.3 | 890.4171 | 923.4623 | 907.4436 | |||
| 1105.5262 | b-water | 4796.3 | 905.4335 | 938.4787 | 922.4600 | c8, based on CID | Gly↑ | |
| 1123.5374 | b, based on ETD | 4778.3 | 976.4692 | 1009.5145 | 993.4958 | c9, based on CID | Ala↑ | |
| 1135.4384 | 4766.4 | 1123.5376 | 1156.5829 | 1140.5642 | c10, based on CID | Phe↑ | ||
| 1337.6290 | b, based on ETD | 4564.2 | 1218.4608 | 1251.5061 | 1235.4874 | |||
| 1621.6580 | y, based on ETD | 4280.1 | 1222.6068 | 1255.6521 | 1239.6334 | c11, from mass diff; acc | Val↑ | |
| 1735.7065 | y, based on ETD | 4166.1 | 1337.6322 | 1370.6775 | 1354.6588 | c12, based on CID | Asp↑ | |
| 1813.8824 | 4087.9 | 1424.6648 | 1457.7101 | 1441.6914 | c13, from mass diff; acc | Ser↑ | ||
| 1841.8738 | 4059.9 | 1425.6704 | 1458.7156 | 1442.6969 | ||||
| 1977.8071 | y, from mass diff; acc. | 3924.0 | 1466.5924 | 1499.6377 | 1483.6190 | |||
| 2059.8556 | y-ammonia | 3841.9 | 1555.6426 | 1588.6879 | 1572.6692 | |||
| 2076.8722 | y, based on ETD | Val↑ | 3824.9 | 1587.7274 | 1620.7727 | 1604.7540 | c14, from mass diff; acc | Tyr↑ |
| 2117.0048 | 3784.8 | 1604.6362 | 1637.6815 | 1621.6628 | must be y ion | |||
| 2191.8922 | Asp↑ | 3709.9 | 1750.7906 | 1783.8359 | 1767.8172 | c15, from mass diff; acc | Tyr↑ | |
| 2277.8844 | y-ammonia | 3623.9 | 1797.7370 | 1830.7823 | 1814.7636 | |||
| 2294.9131 | y, based on ‘b’ | Cys↑ | 3606.9 | 1846.7344 | 1879.7797 | 1863.7610 | ||
| 2310.9109 | 3590.9 | 1897.8578 | 1930.9031 | 1914.8844 | c16, from mass diff; acc | Phe↑ | ||
| 2387.9323 | 3513.9 | 1911.7830 | 1944.8283 | 1928.8096 | ||||
| 2404.9582 | y-water | 3496.8 | 1944.7648 | 1977.8101 | 1961.7914 | |||
| 2422.9661 | y, based on ‘b’ | 3478.8 | 2011.9030 | 2044.9483 | 2028.9296 | c17, from mass diff; acc | Asn↑ | |
| 2430.9634 | 3470.8 | 2151.9926 | 2185.0379 | 2169.0192 | ||||
| 2479.9888 | y, based on ‘b’ | 3421.8 | 2168.0048 | 2201.0501 | 2185.0314 | c18, from mass diff; acc | Arg↑ | |
| 2625.0301 | y-water | 3276.8 | 2168.0031 | 2201.0484 | 2185.0297 | |||
| 2626.0329 | y-ammonia | 3275.8 | 2174.8794 | 2207.9247 | 2191.9060 | |||
| 2643.0533 | y, based on ‘b’ | 3258.7 | 2228.8854 | 2261.9307 | 2245.9120 | |||
| 2728.2748 | 3173.5 | 2255.0190 | 2288.0643 | 2272.0456 | c19, from mass diff; acc | Ser↑ | ||
| 2790.1297 | y, based on ‘b’ | 3111.7 | 2390.9206 | 2423.9659 | 2407.9472 | |||
| 2799.3484 | a, b-a pair | 3102.5 | 2525.2424 | 2558.2877 | 2542.2690 | |||
| 2827.3435 | b, b-a pair | 3074.5 | 3345.3153 | 3378.3606 | 3362.3419 | |||
| 2936.4127 | a, b-a pair | 2965.4 | 3701.5464 | 3734.5917 | 3718.5730 | |||
| 2964.4032 | b, b-a pair | His↑ | 2937.4 | 3971.6732 | 4004.7184 | 3988.6997 | ||
| 3014.4105 | 2887.4 | 3972.6891 | 4005.7344 | 3989.7157 | ||||
| 3083.4793 | a, b-a pair | 2818.3 | 4117.7216 | 4150.7668 | 4134.7481 | |||
| 3111.4722 | b, b-a pair | Phe↑ | 2790.3 | 4281.8544 | 4314.8996 | 4298.8809 | ||
| 3161.4809 | 2740.3 | 4444.8224 | 4477.8676 | 4461.8489 | ||||
| 3230.5462 | a, b-a pair | 2671.3 | ||||||
| 3258.5382 | b, b-a pair | Phe↑ | 2643.3 | |||||
| 3393.6097 | a, b-a pair | 2508.2 | ||||||
| 3421.6042 | b, b-a pair | Tyr↑ | 2480.2 | |||||
| 3478.6262 | b, from mass diff; acc. | Gly↑ | 2423.2 | |||||
| 3495.6442 | 2406.2 | |||||||
| 3578.6767 | a, b-a pair | 2323.1 | ||||||
| 3606.6845 | b, b-a pair | Gln↑ | 2295.1 | |||||
Interestingly, this ‘c’ fragment was also detected in the CID spectrum, as will be discussed later. Assuming that m/z 865 in the ETD peak list was a c fragment, the next residue was identified as Cys, and the following c-fragments, originally assigned based on the corresponding b-ions in CID, determined that Gly, Ala, and Phe are the next three residues. Thus, our working sequence at this point is ProProPhePheGlnLysCysGlyAlaPhe.
The CID data did not help for the next two residues, but c ions detected in the ETD spectrum determined them to be Val and Asp. The next 3 residues were assigned as SerTyrTyr in a similar manner. This step also identified m/z 1621 as a potential y fragment because it was present in both datasets and does not belong to the N-terminal series. By the same logic, the CID fragment at m/z 1735 must also be a y fragment. Following the clues provided by the ETD data, considering amino acid mass differences, and taking advantage of the mass accuracy, we could confidently determine the N-terminal sequence as PPFFQKCGAFVDSYYFNRS.
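The ladder-reading step described above, in which consecutive c-ion masses are converted into residues through their mass differences, can be sketched as follows. This is only an illustration, not the procedure used in the study: the function names and the 0.005 Da tolerance are assumptions, and the c12 to c18 masses are copied from the rows carrying those labels in the table above (the three adjacent mass columns differ by constant offsets, so any of them gives the same differences).

```python
# Monoisotopic residue masses (Da); Leu and Ile are isobaric and share one entry.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L/I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_for_gap(gap, tol_da=0.005):
    """Return every residue whose mass lies within tol_da of the gap."""
    return [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - gap) <= tol_da]


def read_ladder(masses, tol_da=0.005):
    """Translate consecutive fragment masses into residue candidates."""
    return [residues_for_gap(hi - lo, tol_da) for lo, hi in zip(masses, masses[1:])]


# c12 to c18, as listed in the table above (hypothetical variable name).
c_ions = [1354.6588, 1441.6914, 1604.7540, 1767.8172,
          1914.8844, 2028.9296, 2185.0314]

ladder = read_ladder(c_ions)
print(ladder)
```

With a tolerance this narrow, each gap maps to a single residue, and Gln (128.05858 Da) and Lys (128.09496 Da) no longer overlap. Leu and Ile remain indistinguishable by mass, which is why they share one entry in the lookup table.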
The CID data provided an additional non-adjacent sequence stretch, which was determined as HFFYGQCDV. Just as the c ions in the ETD spectrum provided information for the N-terminal amino acids, the first 6 residues were identified from consecutive b fragments. The last b fragment (m/z 3606) could then be tied to a series of y ions. The complementary y fragment at m/z 2295 was the first in this series, and m/z 1977, identified by ETD data, was the last (Table 2, Supplementary Tables 1 and 2). The y fragment identifying the Val residue (m/z 2077) was detected in 2 different charge states, and the ETD data excluded it from the b-series. The 218 Da gap between this mass and m/z 2295 may correspond to 3 different amino acid combinations: AlaPhe, SerMet and CysAsp, and the last of these was supported by a y fragment detected.

Since the ETD spectrum provided a string of c-fragments, c2-c18, each measured within 5 ppm, one would expect that a computer program could have easily determined the N-terminal sequence. Thus, the raw data were loaded into the PEAKS software with the FTMS(etd) instrument selection. We experimented with different mass accuracy windows from 0.005 to 0.1 Da, and in one trial also permitted methionine oxidation. All attempts yielded the correct N-terminal sequence, but unfortunately very little confidence was attached to it. The most confident assignment was: PPFFQKCGAFVDSYYFNRSCTCGWLMVTGPCHGRNFYYSDVFAGCKGSMTRV, where the amino acids in bold were assigned with a confidence higher than 60%, and those in both bold and italic indicate higher than 90% confidence level. The mass error permitted for this assignment was 0.1 Da. Notably, the underlined sequences are identical to one another, but in reverse order. As it happens, while m/z 359 could still be identified unambiguously as a c fragment (within 5 ppm), all the other N-terminal fragments could ‘double’ as z• ions.
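The three residue pairs quoted for the 218 Da gap between the y fragments at m/z 2077 and m/z 2295 can be reproduced by a brute-force search over residue pairs. The sketch below is illustrative only: the function name and the 0.1 Da window are assumptions, and the two masses are taken from the table rows labelled Val↑ and Cys↑.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses (Da); Leu and Ile are isobaric and share one entry.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L/I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residue_pairs_for_gap(gap, tol_da=0.1):
    """List unordered residue pairs whose summed mass is within tol_da of gap.

    Each hit is (pair label, calculated minus observed in Da); hits are
    sorted by the absolute size of that deviation.
    """
    hits = []
    for (a, mass_a), (b, mass_b) in combinations_with_replacement(
            RESIDUE_MASS.items(), 2):
        delta = mass_a + mass_b - gap
        if abs(delta) <= tol_da:
            hits.append((f"{a}+{b}", round(delta, 4)))
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Gap between the two y fragments listed in the table above.
gap = 2294.9131 - 2076.8722
pairs = residue_pairs_for_gap(gap)
print(pairs)
```

The search returns the same three compositions named in the text. The order of the residues within each pair is not determined by the gap alone, and the ranking by deviation is not offered here as evidence; the assignment in the text rests on the additional y fragment that was detected.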
Thus, it is clear that the supporting information from the CID data was essential to decide the position and correct order of the amino acids. In addition, when PEAKS software was used to search the CID data with ‘ion trap CID’ selected, it did not yield even a sequence tag. However, when the Orbitrap/Orbitrap combination was selected, then the HFFYG internal sequence was confidently identified. Interestingly, although this spectrum represented ion trap CID data, the number of peaks and masses seemed to be the same after peak-picking for both instrument selections (the same parameters were used for both searches).
Since the short peptide featured a C-terminal Asp, while the longer peptide had a Pro at its N-terminus, it was now easy to explain the 18 Da mass discrepancy between the expected and observed mass of the shorter peptide after reduction. Instead of a disulfide-bound heterodimer, the toxin consists of a single chain that experienced a double cleavage event in the MS3 experiment. It is possible that first the disulfide-bridge was broken by ETD [19], and then CID activation induced the Asp-Pro peptide-bond cleavage. However, there is another fragmentation pathway one has to consider: a smaller peptide population that survived the ETD activation intact underwent a double cleavage upon collisional activation. The fragmentation-prone Asp-Pro bond could have been broken first, and then fragmentation could have occurred along the disulfide bridge. Only this second sequence of events explains the characteristic ion-triplets that were observed. The Asp-Pro bond is acid-sensitive and must have hydrolyzed during the extended reduction at low pH, producing two individual peptides.
Finding sequence similarity among known proteins frequently serves as confirmation of sequences determined de novo. A BLAST search was performed using the N-terminal sequence stretch (QLRPAFCYEDPPFFQKCGAFVDSYYFNRS), which did not yield any significant hits. Thus, the full sequence and the protein family to which our toxin belongs were determined by cloning the toxin-encoding cDNA using degenerate oligonucleotide probes representing the 10 N-terminal residues of the longer peptide, as described earlier [11].
QIRPAFCYEDPPFFQKCGAFVDSYYFNRSRITCVHFFYGQCDVNQNHFTTMSECNRVCG, the final sequence, matched the experimentally determined molecular mass perfectly, after adjustment for the N-terminal pyroglutamic acid and three disulfide bridges. A BLAST search with the complete sequence categorized this polypeptide as a Kunitz-type protein, showing 40% sequence identity to its closest relative to date, Vestiginin-3, of Demansia vestigiata (Black whip snake), and complete conservation of the six Cys residues [11].
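The mass adjustment mentioned above can be written out explicitly. The following sketch computes a theoretical monoisotopic mass for the final sequence, assuming that the N-terminal pyroglutamic acid arises from Gln by loss of ammonia and that each disulfide bridge removes two hydrogen atoms. The function and parameter names are hypothetical, and the result is a calculated value only; it is not the measured mass reported in the study.

```python
# Monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
AMMONIA = 17.026549
HYDROGEN = 1.007825


def monoisotopic_mass(sequence, n_disulfides=0, n_term_pyroglu=False):
    """Neutral monoisotopic mass of a peptide, in Da.

    n_term_pyroglu handles only cyclization of an N-terminal Gln
    (loss of NH3); each disulfide bridge removes two hydrogens.
    """
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    if n_term_pyroglu:
        if sequence[0] != "Q":
            raise ValueError("this sketch only handles pyroGlu formed from Gln")
        mass -= AMMONIA
    mass -= 2 * HYDROGEN * n_disulfides
    return mass


TOXIN = "QIRPAFCYEDPPFFQKCGAFVDSYYFNRSRITCVHFFYGQCDVNQNHFTTMSECNRVCG"
mass = monoisotopic_mass(TOXIN, n_disulfides=3, n_term_pyroglu=True)
print(f"{len(TOXIN)} residues, {TOXIN.count('C')} Cys, {mass:.4f} Da")
```

Calling the same function with `n_disulfides=0` gives the fully reduced form, which is heavier by six hydrogen masses.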
Once the full sequence was determined, we reevaluated the CID and ETD data of the longer sequence. Manual inspection of the ETD data provided some additional information in the form of Cys-specific ‘w’-type fragment ions, which are formed from z• fragments via side-chain losses, a phenomenon reported for alkylated Cys-residues in ECD [20]. While the observation of these fragments was not unexpected, the detection of a series of potential b+2H ions is most unusual and awaits explanation (Supplementary Table 2). We also used a sequence-based peak-picking software to test the information content of both the CID and ETD spectra [12]. Since the peak-picking of this software is based on the expected theoretical ion clusters calculated not from averagine but from real elemental composition, it can identify the fragment ions even from overlapping isotope clusters. While this approach did not make any difference for this particular ETD dataset (Supplementary Table 4), the CID spectrum contained substantially more information than either the manually prepared or Xtract-based peak lists revealed (Supplementary Table 3). Thus, as a final confirmation, a program utilizing the calculated masses and isotope distributions based on the sequence determined, as FAVA does, can reveal a wealth of supporting information still “hidden” in the spectrum, or may help to decide between different isobaric options.
While ETD analysis of the non-reduced peptide yielded very limited information even after CID activation of a charge-reduced ion (Figure 1) as discussed above, the polypeptide underwent significant fragmentation upon collisional activation that can be deciphered using the known sequence (Figure 4A and B, Supplementary Table 5). Because of the disulfide-bridges and the positions of the Cys-residues (residue 7 is the first Cys residue, and the last is in the 58th position) only 6 N-terminal and 2 C-terminal residues are outside of the cross-linked structure. Thus, very few fragments can be formed via single bond cleavages. Indeed, ‘regular’ fragmentation was only detected from the N-terminal part. Interestingly, the position of one of the disulfide-bridges could be determined because of the favored fragmentation at the Asp-Pro linkage. Fragments y3-y5 were detected with a 1203 Da shift, which indicates linkage between Cys-7 and Cys-58, since the mass increment corresponds to the N-terminal 10 amino acids linked to the C-terminal fragments. We assume that the Asp-Pro bond, the one most prone to fragmentation, was cleaved first, but this cleavage product has the very same molecular mass as the original precursor ion and thus was further activated (just as described above for the MS3 experiment). This could explain all the internal fragments as well as the internal amino acid losses from the intact molecular ion.
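The 1203 Da increment on y3-y5 can be checked with simple arithmetic. The sketch below assumes that the N-terminal 10 residues are attached without an added water (that is, as the residue sum), that the N-terminal Gln is present as pyroglutamic acid (loss of NH3), and that forming the disulfide bridge removes two hydrogens. These assumptions and the variable names are illustrative; they are not taken from the supplementary assignments.

```python
# Monoisotopic residue masses (Da) for the residues in the N-terminal 10-mer.
RESIDUE_MASS = {
    "Q": 128.05858, "I": 113.08406, "R": 156.10111, "P": 97.05276,
    "A": 71.03711, "F": 147.06841, "C": 103.00919, "Y": 163.06333,
    "E": 129.04259, "D": 115.02694,
}
AMMONIA = 17.026549
HYDROGEN = 1.007825

# First 10 residues of the toxin, ending at the Asp of the Asp-Pro bond.
N_TERMINAL_PIECE = "QIRPAFCYED"

residue_sum = sum(RESIDUE_MASS[aa] for aa in N_TERMINAL_PIECE)
shift = residue_sum - AMMONIA - 2 * HYDROGEN  # pyroGlu, then one S-S bridge

print(f"calculated increment: {shift:.2f} Da")
```

Under these assumptions the calculated increment agrees at the nominal level with the 1203 Da shift quoted for the y3-y5 fragments.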
Figure 4.
Figure 4A. CID spectrum of the intact polypeptide. Precursor ion was 1015(7+) (Figure 2, upper panel). Sample solution was infused, CID activation was performed in the ion trap, and fragments were measured in the Orbitrap. ○ indicates ammonia loss. * indicates that the fragment contains a dehydroAla instead of the Cys. ** indicates that the Asp-Pro linkage was cleaved and the N-terminal peptide is linked to the C-terminal fragment with a disulfide-bridge, i.e. Cys-7 and Cys-58 are linked. ❖ indicates ‘c’ type internal fragment formation in front of Gln and Lys. For complete peak list and assignments see Supplementary Table 5.
Figure 4B. CID spectrum of the intact polypeptide. Precursor ion was 1015(7+) (Figure 2, upper panel). Sample solution was infused, CID activation was performed in the ion trap, and fragments were measured in the Orbitrap. Fragments ‘S’ and ‘L’ correspond to the short and long sequence when both the Asp-Pro linkage and the disulfide-bridge were broken. The subscript indicates that both sulfurs were retained on the fragment. Double bond cleavages and amino acid losses were observed as indicated from both the intact molecule and the larger fragment. For complete peak list and assignments see Supplementary Table 5.
This CID spectrum also features two c-type internal fragments, formed in front of Gln and Lys residues. It has been published that abundant c1 ions can be observed in sequences in which the second residue is a Gln and for which the activation was performed in a collision cell [21]. The authors suggested that the c ions were formed from the b fragment corresponding to the next amino acid via the loss of a 6-membered ring. Such ring formation is also possible for Ser, Arg, His and Lys residues, based on thermodynamic calculations [22]. Thus, the observed c ion in front of the Gln and Lys residues can be produced via the same mechanism. The very same c fragments as well as c29 were also detected in the CID of the bigger reduced peptide (Supplementary Table 1). This is the first time that such ion formation is reported for Lys residues, for higher sequence positions, and in ion trap CID experiments.
Conclusions
Even the most popular de novo sequencing program has difficulties assigning sequences from unusual or incomplete fragmentation patterns. Sequence-specific fragments, such as in our case the C-terminal a, b, b+H2O triplet, that significantly aid manual sequence determination frequently prove to be the undoing of the automated approach. In addition, reliable peak-picking/deconvolution from a complex spectrum is still a challenging task, as illustrated with our high quality data. When peptide identification is the goal, these obstacles are easier to overcome than when a novel sequence has to be determined. Recently, numerous groups have started to advocate de novo sequencing instead of comparative database searching [23]. While the software available seems to function well for ion trap data, and for tryptic peptides, researchers aiming to develop such programs should also consider the wide variety of sequence- or residue-specific fragmentation, which may provide important clues but remains not only underutilized, but rather completely ignored as insignificant. We would not recommend incorporating such fragment ions into the search algorithm, but they could be (and should be) used to confirm the determined sequence and/or aid in the selection between similarly scoring alternatives. Furthermore, the recent surge in commercially available instruments that can measure precursor as well as fragment ions with a few ppm mass accuracy should force search engines to take advantage of this, since the results may not be reliable enough when only absolute mass accuracy can be specified.
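The difference between a relative (ppm) and an absolute (Da) mass tolerance can be illustrated with a short conversion. The helper names below are hypothetical; the 5 ppm and 0.1 Da settings and the two example masses (approximately 359 and 2185) are taken from values mentioned earlier in the text.

```python
def ppm_to_da(mz, ppm):
    """Width in Da of a relative tolerance (ppm) at a given m/z."""
    return mz * ppm * 1e-6


def da_to_ppm(mz, da):
    """Relative size in ppm of an absolute tolerance (Da) at a given m/z."""
    return da / mz * 1e6


for mz in (359.0, 2185.0):
    print(f"m/z {mz:7.1f}: 5 ppm = {ppm_to_da(mz, 5.0):.4f} Da, "
          f"0.1 Da = {da_to_ppm(mz, 0.1):.0f} ppm")
```

A fixed window in Da is proportionally much wider for small fragments than for large ones, which is the limitation of absolute-only tolerance settings referred to above.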
Supplementary Material
Acknowledgments
We thank David Maltby for his technical assistance, and Shenheng Guan for his help with FAVA. KFM was supported by NIH grant NCRR P41RR001614 and the Howard Hughes Medical Institute (both support the National Bio-Organic Biomedical Mass Spectrometry Resource Center at UCSF, director: A.L. Burlingame). CJB was supported by a Ruth Kirschstein predoctoral fellowship (F31NS065597).
References
- 1. Hunt DF, Yates JR, III, Shabanowitz J, Winston S, Hauer CR. Protein sequencing by tandem mass spectrometry. Proc Natl Acad Sci USA. 1986;83:6233–6237. doi: 10.1073/pnas.83.17.6233.
- 2. Johnson RS, Biemann K. The primary structure of thioredoxin from Chromatium vinosum determined by high-performance tandem mass spectrometry. Biochemistry. 1987;26:1209–1214. doi: 10.1021/bi00379a001.
- 3. Tipton KF, Dajas F, editors. Neurotoxins in neurobiology: their actions and applications. Ellis Horwood Limited; Chichester: 1994.
- 4. Hines WM, Falick AM, Burlingame AL, Gibson BW. Pattern-based algorithm for peptide sequencing from tandem high energy collision-induced dissociation mass spectra. J Am Soc Mass Spectrom. 1992;3:326–336. doi: 10.1016/1044-0305(92)87060-C.
- 5. Taylor JA, Johnson RS. Implementation and uses of automated de novo peptide sequencing by tandem mass spectrometry. Anal Chem. 2001;73:2594–2604. doi: 10.1021/ac001196o.
- 6. Ma B, Zhang K, Hendrie C, Liang C, Li M, Doherty-Kirby A, Lajoie G. PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry. Rapid Commun Mass Spectrom. 2003;17:2337–2342. doi: 10.1002/rcm.1196.
- 7. Spengler B. De novo sequencing, peptide composition analysis, and composition-based sequencing: a new strategy employing accurate mass determination by fourier transform ion cyclotron resonance mass spectrometry. J Am Soc Mass Spectrom. 2004;15:703–714. doi: 10.1016/j.jasms.2004.01.007.
- 8. Medzihradszky KF. Peptide sequence analysis. (Review) Meth Enzymol. 2005;402:209–244. doi: 10.1016/S0076-6879(05)02007-0.
- 9. Nielsen ML, Savitski MM, Zubarev RA. Improving protein identification using complementary fragmentation techniques in fourier transform mass spectrometry. Mol Cell Proteomics. 2005;4:835–845. doi: 10.1074/mcp.T400022-MCP200.
- 10. Samgina TY, Artemenko KA, Gorshkov VA, Ogourtsov SV, Zubarev RA, Lebedev AT. De novo sequencing of peptides secreted by the skin glands of the Caucasian Green Frog Rana ridibunda. Rapid Commun Mass Spectrom. 2008;22:3517–3525. doi: 10.1002/rcm.3759.
- 11. Bohlen CJ, Chesler AT, Sharif-Naein N, Medzihradszky KF, Zhou S, King D, Sánchez EE, Burlingame AL, Basbaum AI, Julius D. Heteromeric toxin from coral snake targets acid sensing ion channels to produce pain. Nature. 2011;479:410–414. doi: 10.1038/nature10607.
- 12. Guan S, Burlingame AL. Data processing algorithms for analysis of high resolution MSMS spectra of peptides with complex patterns of posttranslational modifications. Mol Cell Proteomics. 2010;9:804–810. doi: 10.1074/mcp.M900431-MCP200.
- 13. Bean MF, Carr SA. Characterization of disulfide bond position in proteins and sequence analysis of cystine-bridged peptides by tandem mass spectrometry. Anal Biochem. 1992;201:216–226. doi: 10.1016/0003-2697(92)90331-z.
- 14. Biemann K. Nomenclature for peptide fragment ions (positive-ions). Meth Enzymol. 1990;193:886–887. doi: 10.1016/0076-6879(90)93460-3.
- 15. Thorne GC, Gaskell SJ. Elucidation of some fragmentations of small peptides using sequential mass spectrometry on a hybrid instrument. Rapid Commun Mass Spectrom. 1989;3:217–221. doi: 10.1002/rcm.1290030704.
- 16. Thorne GC, Ballard KD, Gaskell SJ. Metastable decomposition of peptide [M + H]+ ions via rearrangement involving loss of the C-terminal amino acid residue. J Am Soc Mass Spectrom. 1990;1:249–257.
- 17. Gehrig PM, Hunziker PE, Zahariev S, Pongor S. Fragmentation pathways of NG-methylated and unmodified arginine residues in peptides studied by ESI-MS/MS and MALDI-MS. J Am Soc Mass Spectrom. 2004;15:142–149. doi: 10.1016/j.jasms.2003.10.002.
- 18. http://www.abrf.org/ResearchGroups/ProteomicsInformaticsResearch-Group/Studies/iPRG2011_poster_ABRF_final-2.pdf
- 19. Zubarev RA, Kruger NA, Fridrikson EK, Lewis MA, Horn DM, Carpenter BK, McLafferty FW. Electron Capture Dissociation of Gaseous Multiply-Charged Proteins is Favored at Disulfide Bonds and Other Sites of High Hydrogen Atom Affinity. J Am Chem Soc. 1999;121:2857–2862.
- 20. Chalkley RJ, Brinkworth CS, Burlingame AL. Side-chain fragmentation of alkylated cysteine residues in electron capture dissociation mass spectrometry. J Am Soc Mass Spectrom. 2006;17:1271–1274. doi: 10.1016/j.jasms.2006.05.017.
- 21. Lee YJ, Lee YM. Formation of c1 fragment ions in collision-induced dissociation of glutamine-containing peptide ions: a tip for de novo sequencing. Rapid Commun Mass Spectrom. 2004;18:2069–2076. doi: 10.1002/rcm.1593.
- 22. Farrugia JM, O’Hair RAJ, Reid GE. Do all b2 ions have oxazolone structures? Multistage mass spectrometry and ab initio studies on protonated N-acyl amino acid methyl ester model systems. Int J Mass Spectrom. 2001;210:71–87.
- 23. Kim S, Gupta N, Bandeira N, Pevzner PA. Spectral dictionaries: Integrating de novo peptide sequencing with database search of tandem mass spectra. Mol Cell Proteomics. 2009;8:53–69. doi: 10.1074/mcp.M800103-MCP200.