Abstract
This work evaluated whether Top-Down protein sequencing (TDS) results obtained by MALDI-TOF can be matched to protein sequences by classical protein database searching. These analyses yielded the protein identity, the simultaneous assignment of the N- and C-termini, and protein sequences of up to 70 residues from either terminus. In combination with de novo sequencing using the MALDI-TDS data, even fusion proteins were assigned and the detailed sequence around the fusion site was elucidated. MALDI-TDS made it possible to match protein sequences quickly and to validate recombinant protein structures, in particular protein termini, at the level of undigested proteins.
Keywords: Top-down protein sequencing, MALDI-TOF, protein QC, Edman sequencing
After Edman sequencers disappeared from the commercial instrument market in 2008, there was great interest in finding new methods for intact protein sequencing. MALDI Top-Down Sequencing (MALDI-TDS) is one such method (Figure 1) and was applied to the 2009 research study of the Association of Biomolecular Resource Facilities-Edman Sequencing Research Group (ABRF-ESRG). MALDI-TDS provided up to ∼70 N-terminal and C-terminal sequence calls from a single spectrum of an intact protein. It also worked for N-terminally blocked proteins.
FIGURE 1.
Scheme of MALDI-TDS. Intense c-ions in MALDI-TDS spectra provide for direct N-terminal sequence assignment. Additional y (and z+2) ions call the C-terminal sequence at the same time. Spectrum acquisition takes seconds, and analysis may take only minutes.
MATERIALS AND METHODS
The ABRF-ESRG 2009 study samples, two proteins in the 35-kDa molecular weight range, were analyzed. Each protein (50 pmol) was prepared with super 2,5-dihydroxybenzoic acid matrix on the MALDI target without any digestion step. An Ultraflex III MALDI-TOF/TOF (Bruker Daltonics, Bremen, Germany) with a smartbeam laser was used to acquire MALDI-TDS spectra by in-source decay analysis in reflector mode.
Proteins were identified from MALDI-TDS spectra with Mascot using “virtual precursor ions”. A new “MALDI-TDS” instrument was defined on the Mascot 2.2 server (Matrix Science, Boston, MA, USA), comprising 1+, a, c, y, and z + 2 fragment ions for the most specific scoring.
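As a rough illustration of what such an instrument definition scores against, the singly charged a, c, y, and z + 2 fragment masses of a sequence can be computed from monoisotopic residue masses. The sketch below is not the Mascot implementation; the function name `fragment_ladders`, the example sequence, and the convention that z + 2 equals y minus NH3 plus two hydrogens are assumptions made for illustration.

```python
# Monoisotopic residue masses in Da (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276     # mass of H+
HYDROGEN = 1.007825   # mass of a hydrogen atom
WATER = 18.010565
AMMONIA = 17.026549
CO = 27.994915


def fragment_ladders(sequence):
    """Return singly charged a, c, y and z+2 m/z ladders for a sequence.

    Assumed conventions: b = residues + H+, a = b - CO, c = b + NH3,
    y = residues + H2O + H+, z+2 = y - NH3 + 2 H.
    """
    masses = [MONO[residue] for residue in sequence]
    ions = {"a": [], "c": [], "y": [], "z+2": []}

    # N-terminal fragments: running sum from the first residue.
    prefix = 0.0
    for mass in masses[:-1]:
        prefix += mass
        b_ion = prefix + PROTON
        ions["a"].append(b_ion - CO)
        ions["c"].append(b_ion + AMMONIA)

    # C-terminal fragments: running sum from the last residue.
    suffix = 0.0
    for mass in reversed(masses[1:]):
        suffix += mass
        y_ion = suffix + WATER + PROTON
        ions["y"].append(y_ion)
        ions["z+2"].append(y_ion - AMMONIA + 2 * HYDROGEN)

    return ions


if __name__ == "__main__":
    # Hypothetical short sequence, used only to show the output shape.
    for series, values in fragment_ladders("GASH").items():
        print(series, [round(v, 4) for v in values])
```

Each list is ordered by fragment length, so `ions["c"][0]` is c1 and `ions["y"][0]` is y1.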
Retrieval of the identified protein from Sample 2 into the BioTools 3.2 software (Bruker Daltonics) provided full assignment of G3P_RABIT (Figure 2). Sample 1, a fusion protein, required additional de novo sequencing using the dominant c-ions to read through the fusion site into the native ADH1_YEAST sequence.
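De novo reading of a c-ion series relies on the fact that consecutive c-ions differ by exactly one residue mass. A minimal sketch of that idea follows; it is not the procedure of any vendor software, and the function name `read_ladder`, the 0.02 Da tolerance, and the synthetic peak list are hypothetical choices for illustration.

```python
# Monoisotopic residue masses in Da; L and I are isobaric.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(peaks, tol=0.02):
    """Translate gaps between consecutive ladder peaks into residue calls.

    peaks: m/z values of one singly charged ion series, sorted ascending.
    tol:   assumed mass tolerance in Da (hypothetical value).
    Ambiguous gaps are reported as "L/I"; unmatched gaps as "?".
    """
    calls = []
    for lower, upper in zip(peaks, peaks[1:]):
        delta = upper - lower
        matches = [res for res, mass in MONO.items() if abs(delta - mass) <= tol]
        calls.append("/".join(matches) if matches else "?")
    return calls


if __name__ == "__main__":
    # Synthetic ladder: an arbitrary start mass followed by H, Y and L/I steps.
    synthetic_peaks = [1000.0, 1137.05891, 1300.12224, 1413.2063]
    print(read_ladder(synthetic_peaks))
```

Whether pairs such as K and Q can be told apart depends on the mass accuracy actually achieved; with a wider tolerance both would be reported for the same gap.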
FIGURE 2.
MALDI-TDS spectrum of the intact protein in ABRF-ESRG 2009 Sample 2. Assignment to the G3P_RABIT sequence gave 68 N-terminal and 46 C-terminal calls. A database search using m/z 6086 as the virtual parent ion provided a peptide score of 169.
RESULTS
Sample 2 was characterized as being pure (Figure 3). The G3P_RABIT sequence in SPROT stood out with regard to Ala286, which is uncommon in other mammalian sequences. Careful evaluation of the spectrum in Figure 4 provided evidence for an Ala286Asp error in the G3P_RABIT sequence.
FIGURE 3.
The purity of the protein in Sample 2 was assessed from the “pure” MALDI-TDS spectrum. No signals other than those of a, c, y, and z + 2 ions matching G3P_RABIT were present.
FIGURE 4.
MALDI-TDS spectrum from Sample 2 assigned to the C-terminal G3P_BOVIN sequence. The match indicates a sequence error in the G3P_RABIT sequence (A286D), which does not allow assignment of y47 and higher (see Figure 7).
ADH1_YEAST was identified as the native C-terminal part of Sample 1, and a His6-tag fusion vector sequence as its N-terminal part (Figure 5). Manual interpretation of the spectrum established the sequence of the fusion site and also detected a point mutation in ADH1_YEAST (His21Tyr) at sequence call 55 (Figure 6).
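A single residue substitution shifts every fragment ion that contains the affected position by the difference between the two residue masses, which is why a ladder that stops matching can point to a mutation or a database error. The sketch below computes that shift for the two substitutions reported in this study (Ala286Asp and His21Tyr); the helper name `substitution_shift` is hypothetical, and monoisotopic masses are assumed.

```python
# Monoisotopic residue masses in Da for the residues involved.
MONO = {"A": 71.03711, "D": 115.02694, "H": 137.05891, "Y": 163.06333}


def substitution_shift(expected, observed):
    """Mass shift in Da when `expected` is replaced by `observed`."""
    return MONO[observed] - MONO[expected]


if __name__ == "__main__":
    for label, expected, observed in [
        ("Ala286Asp", "A", "D"),
        ("His21Tyr", "H", "Y"),
    ]:
        shift = substitution_shift(expected, observed)
        print(f"{label}: {shift:+.5f} Da")
```

The shifts come out at roughly +43.99 Da for Ala to Asp and +26.00 Da for His to Tyr, both far larger than typical mass errors, so fragments spanning the substituted position cannot be assigned against the unmodified database sequence.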
FIGURE 5.
MALDI-TDS analysis of Sample 1. The N-terminal His6-β-Gal fusion vector and the C-terminal ADH1_YEAST were identified by two Mascot searches in the NCBI database. De novo sequencing in FlexAnalysis (Bruker Daltonics) allowed reading through the fusion site.
FIGURE 6.
MALDI-TDS analysis of Sample 1. The fusion protein contains a point mutation: assuming ADH1_YEAST (His21Tyr) permitted the call of 70 amino acid residues from the N-terminus.
CONCLUSIONS
Standard Mascot searches were used to call sequences from MALDI-TDS spectra of undigested proteins (Figure 7).
- 50–70 N-terminal sequence calls from undigested, 40 kDa proteins
- 30–50 C-terminal sequence calls from the same datasets, as an additional benefit of this technology
For routine protein quality control work, MALDI-TDS is largely a push-button analysis that takes minutes.
FIGURE 7.
MALDI-TDS analysis summary. Sequences and covered calls (red) by MALDI-TDS from ABRF-ESRG 2009 Samples 1 (upper) and 2 (lower); mutations/DB errors are in circles. Approximately 70 sequence calls from the N-terminus and 50 from the C-terminus were obtained from either sample.







