Plant Physiology. 2005 Jun;138(2):591–599. doi: 10.1104/pp.105.060285

Update on Proteomics in Arabidopsis. Where Do We Go From Here?

Scott C Peck 1,*
PMCID: PMC1150380  PMID: 15955923

The “omics revolution” in science has been swift and, in many cases, borders on overwhelming. A number of powerful tools, including variations on the theme of proteomics, emerged over the past years. However, the near constant evolution of technology can make it difficult for outsiders to remain familiar with the options, let alone to make informed decisions about the relative merits of each approach. The purpose of this Update is to discuss the strengths and weaknesses of proteomics technologies. The emphasis on strategic considerations is meant to stimulate discussion among Arabidopsis (Arabidopsis thaliana) researchers not yet applying proteomics approaches to their biological questions.

WHY SHOULD YOU CARE ABOUT PROTEOMICS?

So, the Arabidopsis genome is sequenced. We have microarrays to look at changes in transcript levels and knockout lines for most of the genes. We're pretty much at the “mopping up” stage of science, right? An exceedingly important concept in biology is that one gene ≠ one transcript ≠ one protein (Fig. 1). Alternative transcription initiation and splicing of mRNAs can produce multiple transcripts from a single gene. Alternative translation initiation sites may produce different proteins from each of these transcripts, and these protein variants can be targeted to different compartments in the cell and/or have different functions. Protein maturation doesn't stop with translation. Posttranslational modifications (PTMs), such as phosphorylation, acylation, ubiquitylation, or proteolytic processing, can alter protein activity, location, and stability. Proteins move in and out of protein complexes depending on PTMs. Once a protein is produced, it can undergo a staggering array of highly regulated changes with enormous implications for biological processes. With this level of complexity in mind, researchers must question which of the 20 potential forms of a protein is responsible for the phenotype in their knockout mutant. Perhaps even more importantly, what processes involve the other 19 forms?

Figure 1.


A single gene can produce many proteins with different functions and different locations. Alternative transcription initiation sites (black arrows) can produce multiple populations of mRNAs. Alternative splicing of mRNAs can produce unique combinations of exons. After translation, proteins containing a targeting sequence (gray rectangle) will be directed to an organelle. Proteins without the targeting sequence will remain in the cytosol, where they can undergo further posttranslational modifications such as phosphorylation (designated by the “P” within a circle).

Essentially, proteomics attempts to address these questions on a large(r) scale. Which proteins change in abundance, form, location, or activity during a biological response? Experimentally, answering these questions requires different approaches, making an exact definition of proteomics rather difficult. In general, the experimental differences pertain to how the proteins are prefractionated prior to analysis. Protein analysis is typically performed in one of two ways. Proteins can be separated by SDS-PAGE or two-dimensional (2-D) gel electrophoresis followed by identification by mass spectrometry (MS). Alternatively, proteins can be identified directly by liquid chromatography (LC)-MS/MS. Both approaches have advantages and disadvantages, and these will be discussed as they pertain to experimental considerations. Numerous excellent reviews provide overviews of the basic technologies of proteomics and highlight the pros and cons of different mass spectrometers, the workhorses of proteomics (e.g. Aebersold and Goodlett, 2001; Aebersold and Mann, 2003). This review will (attempt to) avoid the issues of machines and equipment to focus instead on experimental and strategic considerations.

In terms of proteomics in plants, Arabidopsis is currently a unique system. For the identification of just a few proteins, MS-based identification can be successful using samples from many species with limited sequence information (i.e. about 100,000 expressed sequence tags). For experiments involving complex samples, such as any large LC-MS/MS run, only experiments using Arabidopsis and rice (Oryza sativa) can fully exploit current proteomics technology. A typical proteomics experiment can easily generate 50,000 to 100,000 spectra. This amount of data makes MS-based identification of proteins totally reliant upon algorithms, and these algorithms are completely reliant upon a fully sequenced and annotated genome for accurate identifications. Searching spectra from other species against Arabidopsis sequence while allowing conserved substitutions of amino acids produces unacceptable rates of false-positive hits when used on a large scale, and computationally, it is simply not a viable option to attempt de novo sequencing with this number of spectra.
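Why identification depends so heavily on a sequenced genome can be illustrated with a minimal sketch of the first step a database search performs: every predicted protein is digested in silico, and observed peptides are looked up among the results. The cleavage rule used here (cut after K or R unless followed by P) is the commonly applied trypsin rule; the function names and the two toy sequences are hypothetical and do not represent any particular search engine.

```python
def tryptic_peptides(sequence, min_length=6):
    """Digest a protein in silico: cut after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide lacking K/R
    # Very short peptides are rarely informative, so drop them.
    return [p for p in peptides if len(p) >= min_length]


def build_peptide_index(proteome):
    """Map each predicted peptide to the set of proteins that could produce it."""
    index = {}
    for protein_id, sequence in proteome.items():
        for peptide in tryptic_peptides(sequence):
            index.setdefault(peptide, set()).add(protein_id)
    return index


# Hypothetical two-protein "proteome"; a real search uses the annotated genome.
proteome = {
    "protA": "MASTK" + "LLRPEGK" + "AAAAAAR",
    "protB": "MSSGK" + "LLRPEGK" + "VVVVVVK",
}
index = build_peptide_index(proteome)
print(sorted(index["LLRPEGK"]))  # shared peptide: cannot tell protA from protB
print(sorted(index["AAAAAAR"]))  # unique peptide: points to protA only
```

A peptide that is absent from the index, for example because the gene model is wrong or the species is unsequenced, simply cannot be assigned by this kind of lookup.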

WHY IS PROTEOMICS TAKING SO LONG TO DELIVER THE GOODS?

Proteomics promised a revolution. If genomes and microarrays gave us a glimpse into the blueprints of life, proteomics was going to unravel the working end of the cell: the protein machinery. But, as rapidly became apparent, proteins weren't going to give up their secrets without a fight. Proteins are much more diverse in their properties than nucleic acids, so a single protocol for sample preparation or analysis is unlikely. There is no PCR for proteins, so the amount of starting tissue and detection sensitivity are critical limitations. Protein concentrations extend over a far greater dynamic range than nucleic acids. Proteomics must deal with differences in abundance of 6 to 8 orders of magnitude, meaning that the few most abundant proteins often interfere with detection of low level proteins. None of these problems are insurmountable, but they have slowed the appearance of the expected biological results as each problem needed to be solved individually.

PROTEIN ABUNDANCE: IS PROTEOMICS ALWAYS THE ANSWER?

One of the most significant advancements in proteomics over the past years is the development of options for performing quantitative comparisons between samples. Experiments with static systems—sequencing proteins from an isolated organelle or with a particular PTM—can provide valuable information, not the least of which is the improvement/refinement of bioinformatics prediction programs. Ultimately, however, the goal is comparative experiments: How does the protein composition of an organelle or the population of proteins with a PTM change during a biological response? For these questions, we need reliable methods for quantitative comparisons. This section will introduce the primary tools available and discuss limitations. I need to emphasize that criticisms below generally refer to whole-cell analyses (i.e. grinding up an entire Arabidopsis seedling and looking at total protein) and not to comparisons of subproteomes. The tools are the same, but the differences in protein complexity greatly affect the theoretical success. The latter topic will be discussed in the section on the study of subproteomes.

2-D Gels and Difference Gel Electrophoresis

For many years, the only option for comparative studies was staining 2-D gels and examining patterns of spots. These types of comparisons are complicated by gel-to-gel differences that can affect spot positions, leading to problems with false positives and false negatives. Even though analysis software has improved greatly, these experiments still require a significant amount of manual intervention as well as numerous repeats to ensure trustworthy results. Recently, difference gel electrophoresis (DIGE; see Table I for a summary of abbreviations) has provided the means to compare two samples in the same gel, circumventing the problems of gel-to-gel comparison. In this method, proteins from two different treatments are each labeled with one of two fluorescent dyes and then mixed together with a third, separately labeled pool of the two samples that serves as internal calibration (Tonge et al., 2001; Alban et al., 2003). After separation by 2-D gels, scanning with different lasers detects the proteins to create overlapping images that very rapidly reveal changes in protein abundance. Although limited to pairwise comparisons, DIGE allows very easy comparisons and, because the samples are run together, eliminates the problem of gel-to-gel variation. This approach, however, is still limited by all the problems associated with 2-D gels. The sheer complexity of the proteome prevents single-gel analyses, and the dynamic range of proteins generally prevents detection of low level proteins without some prefractionation. Together with the generally poor resolution of basic (pI > 9), large (molecular mass > 100 kD), or hydrophobic proteins, the “visible” proteome from 2-D gel analysis is likely to represent far less than one-half of the predicted Arabidopsis proteins in even the best cases.
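The role of the internal calibration sample in DIGE can be sketched as a small calculation. This is only an illustration under stated assumptions: the function name and the spot intensities are hypothetical, and real analysis software applies additional background and dye-bias corrections that are not shown.

```python
import math


def dige_spot_summary(intensity_a, intensity_b, intensity_standard):
    """Summarize one spot: standard-normalized abundances and log2(B / A)."""
    if min(intensity_a, intensity_b, intensity_standard) <= 0:
        raise ValueError("spot intensities must be positive")
    # Dividing by the pooled standard puts each sample on a scale that can be
    # compared with the same spot on replicate gels.
    norm_a = intensity_a / intensity_standard
    norm_b = intensity_b / intensity_standard
    # Within a single gel the standard cancels out of the pairwise ratio.
    log2_ratio = math.log2(intensity_b / intensity_a)
    return norm_a, norm_b, log2_ratio


# Hypothetical intensities for one spot (arbitrary fluorescence units).
norm_a, norm_b, log2_ratio = dige_spot_summary(1000.0, 2000.0, 1500.0)
print(f"A = {norm_a:.2f}, B = {norm_b:.2f}, log2(B/A) = {log2_ratio:.1f}")
```

Because both samples migrate in the same gel, the pairwise ratio needs no spot matching between gels; the standard matters mainly when replicate gels are combined.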

Table I.

Summary of proteomic methods described in this Update

Abbreviation | Definition | Description | Advantages | Disadvantages
DIGE | Difference gel electrophoresis | Fluorescence-based comparison for quantitative 2-D gel analysis | Easy and accurate pairwise comparisons | Suffers from limitations of 2-D gels (poor for hydrophobic, basic, and large molecular mass proteins)
ICAT | Isotope-coded affinity tags | Quantitative method for LC-MS/MS based on isotopic tagging of Cys residues | Good for higher sample complexity because only a few peptides per protein will be captured and analyzed | Estimated one in seven proteins does not contain Cys residues; of limited value for PTM analysis
IMAC | Immobilized metal affinity chromatography | Method for enriching phosphopeptides from complex peptide mixtures | Decreases sample complexity and decreases/eliminates suppression effects on phosphopeptides in LC-MS/MS | Possible difficulties with contamination from acidic peptides (methyl esterification of acidic residues may eliminate nonspecific binding)
iTRAQ | - | Quantitative method for LC-MS/MS based on isobaric tagging of all primary amines | Good for quantitative comparisons of PTMs and subproteomes; allows comparison of four samples simultaneously | May have difficulty with highly complex samples or with detection of low level proteins
TSAA | Translational state array analysis | Method using microarrays to compare translational status of mRNAs as an indirect measure of protein abundance | Sensitive method for estimating protein content; requires far less tissue than necessary for standard proteomics experiments | Will not detect protein turnover or PTMs

Isotope-Coded Affinity Tags and iTRAQ

An alternative to 2-D gel-based approaches is direct multidimensional LC-MS/MS analysis of total peptide digests. Because this approach is based on peptides, it overcomes many of the problems in detecting proteins troublesome for 2-D gels. In addition, avoiding gel-based separation is thought to offer increased sensitivity. The problem is that mass spectrometry relies on ionization of peptides for detection. Because ionization efficiency is affected by a number of factors, peak intensities of the same peptide from separate LC-MS/MS experiments are difficult to compare. One solution to this problem is the use of isotope-coded affinity tags (ICAT; for review, see Adam et al., 2002). This method relies on covalent modification of Cys residues with chemically identical biotinylated tags that differ only in mass because of inclusion of heavy and light isotopes. After tagging, the peptides are mixed and analyzed in the same experiment using the isotopic difference to determine which peptide originated from which sample. Because the only difference between the peptides is the mass, they will behave identically during LC separation and, more importantly, will be ionized in a comparable manner. Therefore, a comparison of peak intensities directly correlates with peptide abundance. The use of biotin allows rapid enrichment of the tagged peptides and greatly decreases the complexity of the sample. The fact that this method is based on covalent modification of Cys is both a blessing and a curse. Most proteins only contain a few Cys residues, contributing to lower sample complexity after affinity enrichment. On the other hand, about one in seven proteins does not contain Cys residues, guaranteeing limitations in the analysis. In addition, the probability of a peptide containing both a PTM and a Cys residue is relatively low, so ICAT is not a good option for PTM analysis.
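The ICAT logic described above can be sketched in a few lines: a protein is only visible if it contains Cys, and its relative abundance is estimated from the heavy-to-light intensity ratios of its tagged peptides. Summarizing peptide ratios by their median is an assumption made here for illustration, as are the function names and the intensity values.

```python
from statistics import median


def has_icat_target(protein_sequence):
    """ICAT tags Cys residues, so a protein without Cys yields no tagged peptides."""
    return "C" in protein_sequence


def icat_protein_ratio(peptide_pairs):
    """Estimate heavy/light abundance from (light, heavy) peak intensity pairs."""
    ratios = [heavy / light for light, heavy in peptide_pairs if light > 0]
    if not ratios:
        return None  # nothing quantifiable, e.g. a Cys-free protein
    return median(ratios)


# Hypothetical intensities for three Cys-containing peptides of one protein.
pairs = [(100.0, 200.0), (50.0, 110.0), (80.0, 150.0)]
print(icat_protein_ratio(pairs))       # about twofold more in the heavy sample
print(has_icat_target("MASTKLLRPEGK"))  # False: this toy sequence has no Cys
```

Because the light and heavy forms of a peptide co-elute and ionize comparably, the ratio is taken within one run rather than between runs.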

A recent alternative is conceptually similar to ICAT but based on chemical modification of primary amines. The iTRAQ reagent (Applied Biosystems, Foster City, CA) contains an isobaric tag that, upon fragmentation of the peptide, releases a characteristic mass reporter (Ross et al., 2004). The experimental design is similar to that used for ICAT, but iTRAQ allows comparison of up to four samples in the same experiment. Because of the same logic explained for ICAT experiments, separation and ionization of iTRAQ-labeled peptides are identical, so comparison of the intensity of the reporter peak reflects quantitative differences in the peptide. As with ICAT, the modification chemistry of iTRAQ is also its blessing and curse. The modification of all amines means that all peptides will be labeled, which is a particular advantage for experiments involving PTM analysis (discussed below). However, comparisons of total protein digests will be far more complex than those from ICAT experiments, which may obscure lower abundant peptides from detection.
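Quantification from reporter peaks can be sketched as follows. The channel labels 114 to 117 are the nominal reporter masses of the four-plex reagent, a detail not given in the text above; the function name, the choice of reference channel, and the intensities are hypothetical.

```python
def itraq_relative_abundance(reporter_intensities, reference="114"):
    """Express each reporter channel relative to a chosen reference channel."""
    reference_intensity = reporter_intensities[reference]
    if reference_intensity <= 0:
        raise ValueError("reference channel has no signal")
    return {
        channel: intensity / reference_intensity
        for channel, intensity in reporter_intensities.items()
    }


# Hypothetical reporter intensities from one fragmented peptide, four samples.
spectrum = {"114": 1000.0, "115": 2000.0, "116": 500.0, "117": 1000.0}
for channel, ratio in itraq_relative_abundance(spectrum).items():
    print(f"channel {channel}: {ratio:.2f} relative to channel 114")
```

Each identified peptide carries its own set of reporter ratios, which is why a phosphopeptide can be quantified directly without relying on other peptides from the same protein.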

Is One Method Better Than the Other?

After reading the above, the most important impression you should come away with is that there is no single solution for all questions. For studies of protein abundance, ICAT and iTRAQ experiments have a significant advantage over 2-D gels because they are not restricted by the poor resolution of difficult proteins on 2-D gels. On the other hand, LC-MS/MS experiments will rarely identify more than a few peptides per protein, making it exceedingly unlikely to detect PTMs or truncated forms of proteins. In these regards, 2-D gels, even with their limitations, are better suited to detect these types of changes. Moreover, LC-MS/MS may have uncharacterized biases of its own. An in-depth study of rice proteins demonstrated that although LC-MS/MS identified 2,363 proteins from different tissues, it failed to identify 165 of the 556 proteins identified from the same samples run on 2-D gels (Koller et al., 2002). Because the 2-D gels separated only 75 μg of protein, the proteins identified with this method must be relatively abundant, yet they escaped detection by LC-MS/MS for unknown reasons. Therefore, any attempt to use proteomics to characterize whole-cell changes in protein accumulation, regardless of the technology employed, is likely to detect only a limited subset of the proteome. The prevailing opinion over the past few years is that gel-based and LC-MS/MS approaches are complementary methods to provide the most complete coverage of the proteome.
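The rice numbers quoted above can be worked through to show how complementary the two methods were. The only assumption is that the LC-MS/MS and 2-D gel counts refer to the same pooled set of samples, so that they can be combined as simple sets; the variable names are illustrative.

```python
lcms_total = 2363  # proteins identified by LC-MS/MS
gel_total = 556    # proteins identified from 2-D gels
gel_only = 165     # gel-identified proteins that LC-MS/MS missed

shared = gel_total - gel_only        # found by both methods
lcms_only = lcms_total - shared      # found by LC-MS/MS alone
combined = lcms_total + gel_only     # union of the two methods
missed_percent = round(100 * gel_only / gel_total, 1)

print(f"shared: {shared}, LC-MS/MS only: {lcms_only}, combined: {combined}")
print(f"{missed_percent}% of gel-identified proteins were missed by LC-MS/MS")
```

Under that assumption, nearly 30% of the proteins seen on gels were invisible to LC-MS/MS, which is the quantitative basis for calling the methods complementary.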

Should Proteomics Be Used for Whole-Cell Protein Profiling?

But even with these drawbacks, can proteomics provide unique information on whole-cell protein accumulation? One of the standard arguments for proteomic analyses is that the level of mRNA correlates very poorly with the level of protein (Gygi et al., 1999; Tian et al., 2004). This observation has been demonstrated in numerous systems and isn't very surprising if you think about it. Because of variation in translational efficiency and protein turnover, 10 transcripts may produce 1,000 molecules of Protein A or may produce 1 molecule of Protein B. Therefore, without further information, one cannot predict the amount of protein from the amount of corresponding mRNA. An important point, however, is that for more abundant proteins, i.e. the ones most people observe without going to very high resolution proteomic experiments, there was actually a relatively good correlation between transcript and protein levels (Gygi et al., 1999; Tian et al., 2004). In two proteomic studies in Arabidopsis, one examining programmed cell death in cell cultures (Swidzinski et al., 2004) and one comparing basal and R gene-mediated defense in leaves (Jones et al., 2004), differentially accumulating proteins were represented by only a small number of relatively abundant proteins, many of which were also transcriptionally regulated during the responses. An approximation, from these as well as yeast and mammalian studies, is that changes in mRNA alone generally account for 60% to 70% of the changes in protein abundance.

The estimates above do not take into account translational regulation. Numerous examples exist for protein levels changing during a response while transcript levels remain constant (e.g. Pradet-Balade et al., 2001; MacKay et al., 2004). Although a proteomics approach could address this issue, microarray experiments comparing free versus polysome-bound mRNAs (translation state array analysis [TSAA]) are faster, easier, and more sensitive than in-depth proteomics experiments. Recent studies on yeast responses to the mating pheromone α-factor concluded that TSAA accurately reflects qualitative changes in protein levels, although TSAA tends to overestimate the actual change in protein level because it does not factor in protein degradation (MacKay et al., 2004). Granted, the primary exception is that microarrays have no way to address targeted protein degradation. However, as discussed below, new directed methods hold promise for specifically examining changes in ubiquitylated proteins, which would be one of the main targets of plant biologists interested in regulated protein turnover.
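The idea behind TSAA can be sketched as a comparison of the polysome-bound fraction of each mRNA between two conditions. This is a simplified illustration, not the published analysis: the function names and array signals are hypothetical, and real experiments require normalization between the free and polysomal hybridizations.

```python
def polysome_fraction(free_signal, polysome_signal):
    """Fraction of an mRNA's total signal that is found in the polysome-bound pool."""
    total = free_signal + polysome_signal
    if total <= 0:
        raise ValueError("mRNA has no detectable signal")
    return polysome_signal / total


def translational_shift(control, treated):
    """Change in polysome-bound fraction; inputs are (free, polysome) signal pairs."""
    return polysome_fraction(*treated) - polysome_fraction(*control)


# Hypothetical mRNA whose total level is constant (400 units) in both conditions
# but which moves from the free pool onto polysomes after treatment.
control = (300.0, 100.0)
treated = (100.0, 300.0)
print(translational_shift(control, treated))  # positive: more actively translated
```

A transcript like this one would look unchanged on a standard microarray, yet the shift onto polysomes predicts an increase in protein synthesis. As noted above, the calculation says nothing about protein degradation.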

In an ideal world, one would like to detect changes in abundance, processing, and PTMs of all proteins in a single experiment. The reality is that this goal is not possible, at least at the moment. None of the above precludes the use of 2-D gels, ICAT, or any other approach for whole-cell studies. However, one should embark on experiments with the knowledge that unique insights will be limited and that rare signaling proteins will not leap off a single 2-D gel. A thorough study absolutely requires the use of single-pI 2-D gels and/or protein prefractionation prior to analysis. Currently, TSAA is a far less painful method for simultaneously profiling changes in levels of transcripts and proteins. These microarray analyses may yield only 80% to 90% of the possible information, but the method has no bias against difficult (e.g. integral membrane) proteins. In addition, because the use of microarrays is far more sensitive than any (current) proteomics approach, TSAA could be used to profile the proteome of limited source material, such as meristems or guard cells. Although PTM analysis is important and essential, developing technology targeting specific PTMs is far more likely to provide meaningful, comprehensive, and quantitative insights than whole-cell analyses. All that being said, the constant evolution of proteomics technology in the past should warn against betting on the future. One can envision labeling very large amounts of protein using an iTRAQ-type reagent and then performing multidimensional off-line chromatography on the peptides to deconvolute the sample prior to LC-MS/MS of the individual fractions. In addition, recent experiments with samples of mid to low complexity indicate that using multiple proteases, including ones that cleave nonspecifically, to prepare the peptides might increase the percentage coverage of the protein to the extent that individual PTMs could be inferred from the data (MacCoss et al., 2002). Much of this process could be automated, and the greatest limitation would become computational analysis of this exceedingly large dataset.

SO WHAT ELSE CAN PROTEOMICS DO?

Initially, I defined proteomics as a large-scale approach to address four questions about the protein content of a cell or tissue. As argued above, proteomics technology currently has deficiencies for studying protein abundance for whole-cell studies. However, proteomics approaches are essential to address the three other main questions: forms or PTMs, location, and activity of proteins. These areas of study are developed to the point of yielding real biological answers right now, and a summary of Web-based tools and databases for interrogating existing proteomics data can be found in a recent review on plant proteomics (Rose et al., 2004). Moreover, the issues of increased sensitivity and quantitative comparisons are achievable because of the reduced complexity of these samples. It is important to state up front that these three issues are likely to be intimately intertwined. Phosphorylation of a transcription factor (PTM) may trigger migration to the nucleus (location), where it will activate transcription (activity). Thus, even though these three topics will be discussed independently, it will be the union of these approaches that will eventually have the greatest impact on our understanding of biological processes.

Identification of Different Protein Forms

The fact that proteins are modified by phosphorylation, glycosylation, and ubiquitylation is familiar to most plant biologists. The reality, however, is that about 300 potential protein modifications have been reported (Aebersold and Goodlett, 2001), and a protein may contain multiple different PTMs at any given time. The potential for diversity, therefore, is staggering. Currently, bioinformatics is not able to make accurate predictions for most of these modifications. Moreover, PTMs are often reversible, meaning that even if the protein may be modified at a particular site, it does not have to be modified all the time. When it comes to PTMs, we must rely almost entirely on empirical results. The growing trend is development of affinity capture/enrichment techniques to examine subproteomes of proteins containing specific PTMs. The advantages of this strategy include increased sensitivity resulting from the selective enrichment process and a simplification of data analysis. At present, only a handful of PTMs have been investigated in Arabidopsis. The relative ease of performing these types of tightly defined proteomic experiments together with the possibility of performing quantitative comparisons using iTRAQ-type reagents indicate that this area of study will undergo a substantial growth phase in the coming years.

Phosphorylation

Dynamic protein phosphorylation is a nearly ubiquitous regulatory process during biological responses and, therefore, is one of the most studied of the PTMs. Phosphorylation can turn proteins on or off, target proteins for degradation, or lead to the formation of new protein complexes. Despite the importance of phosphorylation, detection and analysis of phosphoproteins remains difficult for numerous reasons. The most obvious is that putative signaling proteins generally are not abundant. Detection is further complicated by stoichiometry—the phosphorylated form of the protein is usually only a relatively small fraction of the total population. So, what are the options for enrichment and visualization of phosphoproteins?

For many years, the primary options relied on 2-D gels. The simplest method was to look for the appearance of new protein isoforms. When a protein becomes phosphorylated, its pI becomes more acidic, and the phosphoprotein shifts to a new position on the 2-D gel. The problem, of course, is that many, or most, phosphoproteins are likely to escape detection. A recent alternative for this method is a fluorescent stain, Pro-Q Diamond (Molecular Probes, Eugene, OR). This stain has a strong preference for the phosphorylated form of proteins, simplifying analysis, and gives linear and sensitive detection. An even more sensitive approach is radioactive labeling of cells with orthophosphate to “tag” phosphoproteins, followed by separation by 2-D gels. Arabidopsis suspension-cultured cells are very amenable to radioactive labeling and have proven an excellent system for studying rapid phosphorylation changes in response to microbial elicitors, such as the flagellin peptide (Peck et al., 2001; Nühse et al., 2003a). Whether because of the response kinetics or the nature of suspension-cultured cells, however, this system has not been successful with other abiotic or hormonal responses (S.C. Peck, unpublished data). All of these approaches would benefit from preenrichment of phosphoproteins to assist identification. Recently, commercial phosphoprotein-enrichment columns have become available from a number of companies. Although these columns do not bind all phosphoproteins, they do appear to reproducibly enrich for at least subphosphoproteomes.

A limitation in all of the above approaches is that they will fail to conclusively “prove” that the candidate is a phosphoprotein. The target protein will be excised from the gel and identified by mass spectrometry. However, ionization of phosphopeptides is usually suppressed in the presence of nonphosphopeptides. Because a peptide needs to be ionized to be detected by mass spectrometers, the suppression effect generally makes phosphopeptides “invisible” in complex mixtures. A standard method for circumventing this problem is using immobilized metal affinity chromatography (IMAC) to enrich for phosphopeptide(s). The strong positive charge of the transition metal, usually Fe3+ or Ga3+, binds the negatively charged phosphate group and selects it from the mixture. It should be emphasized that manipulation of microcolumn IMAC and performing phosphorylation site analysis by mass spectrometry are still not common practice in many mass spectrometry facilities and often require special training.

If IMAC can be used for phosphopeptide analysis of individual proteins, can it be used for complex mixtures? A potential downfall of IMAC is that it also may bind peptides containing many acidic residues. Methylation of acidic residues was found to improve the specific binding of phosphopeptides from yeast (Ficarro et al., 2002), and this approach was employed to identify eight phosphopeptides from Arabidopsis thylakoid membranes, including three new phosphopeptides (Hansson and Vener, 2003). Work in our laboratory has produced an alternative IMAC procedure that does not require secondary modification chemistry but still yields highly pure (75%–90%) phosphopeptides from Arabidopsis plasma membranes (Nühse et al., 2003b). In this case, including strong anion exchange prior to IMAC was found to increase representation of monophosphorylated peptides. Together, these methods identified more than 300 phosphorylation sites from about 200 proteins using 100 μg of plasma membrane protein (Nühse et al., 2003b, 2004). Another recent option is the use of cation-exchange columns under very specific pH conditions that greatly enrich for phosphopeptides, resulting in the identification of 2,000 phosphorylation sites from more than 900 proteins from HeLa cell nuclei (Beausoleil et al., 2004). This method has the advantage that it is relatively easy and very robust. It should be noted, however, that the experiments were performed with 8 mg of protein, so at present it is unknown how well this procedure will “scale down.” In the end, we may find that each of these approaches is better suited for particular applications. The primary goal now is to combine these IMAC methods with iTRAQ-type reagents to perform quantitative comparisons between genetic mutants or after treatments. These experiments should provide unique insights into how different signaling pathways interact.
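The concern about how well the cation-exchange procedure will "scale down" can be made concrete by expressing the two datasets above as yield per unit of starting protein. This is a crude comparison: the studies used different organisms and fractions, and the reported counts are approximate ("more than 300", "about 200", "more than 900"), so the figures below are rough at best. The function name is hypothetical.

```python
def yield_metrics(sites, proteins, protein_ug):
    """Return (phosphorylation sites per microgram of input, sites per protein)."""
    return sites / protein_ug, sites / proteins


# Counts and input amounts as quoted in the text (8 mg = 8000 micrograms).
datasets = {
    "Arabidopsis plasma membrane, IMAC": (300, 200, 100),
    "HeLa nuclei, cation exchange": (2000, 900, 8000),
}
for name, (sites, proteins, protein_ug) in datasets.items():
    per_ug, per_protein = yield_metrics(sites, proteins, protein_ug)
    print(f"{name}: {per_ug:.2f} sites/ug, {per_protein:.2f} sites/protein")
```

On these numbers the IMAC experiment recovered roughly an order of magnitude more sites per microgram of input, which is why the behavior of the cation-exchange method with small samples remains an open question.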

Glycosylphosphatidylinositol

Glycosylphosphatidylinositol (GPI) is a PTM on the C terminus of some proteins, tethering them to the extracellular face of the plasma membrane. These GPI-anchored proteins (GPI-APs) have a number of unique properties, including greater flexibility than true integral membrane proteins because of their peripheral attachment and the possibility of dynamic protein release to the extracellular space by specific phospholipase cleavage of the GPI moiety. Perhaps because of these properties, GPI-APs have been implicated in signaling and differentiation. Although the C-terminal attachment motif allows a reasonable level of prediction in Arabidopsis (e.g. Borner et al., 2002), prediction programs are not always in agreement. Thus, empirical evidence was needed both for confirmation as well as for refinement of the prediction programs. Using the phospholipase Pi-PLC to “shave” GPI-APs from plasma membrane preparations of Arabidopsis suspension cells, two studies identified 30 (Borner et al., 2003) and 44 (Elortza et al., 2003) GPI-APs. In these approaches, the GPI moiety itself was not detected, meaning that absolute proof of the modification was lacking. The specificity of the Pi-PLC, however, is generally accepted as solid evidence that the proteins released are indeed GPI-APs. Interestingly, only 19 proteins overlapped between the two sets of data, perhaps reflecting the large differences in age of the cells used in the two studies. Borner et al. (2003) went on to use their empirical data to refine their prediction program, resulting in the prediction of 64 new GPI-APs from the Arabidopsis database. Because of the potential role of GPI-APs in growth and development, quantitative comparisons of GPI-AP populations from different stages of growth or hormone treatment could be very informative.
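The extent of agreement between the two GPI-AP studies can be worked out from the three numbers given above. The variable names are illustrative, and the Jaccard index is used here only as one convenient way to summarize overlap.

```python
borner_total = 30    # GPI-APs identified by Borner et al. (2003)
elortza_total = 44   # GPI-APs identified by Elortza et al. (2003)
shared = 19          # proteins found in both studies

combined = borner_total + elortza_total - shared  # distinct GPI-APs overall
borner_confirmed = round(100 * shared / borner_total, 1)
elortza_confirmed = round(100 * shared / elortza_total, 1)
jaccard = round(shared / combined, 3)

print(f"distinct GPI-APs across both studies: {combined}")
print(f"{borner_confirmed}% of the Borner set and {elortza_confirmed}% "
      f"of the Elortza set were found by the other study")
print(f"Jaccard index: {jaccard}")
```

Only about a third of the combined set was seen in both experiments, so neither study alone approaches a complete inventory.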

Ubiquitylation

Targeted degradation of ubiquitylated proteins is a rapidly expanding field of study in Arabidopsis, encompassing a wide range of topics, including light and hormone responses as well as plant defense. As various components of the ubiquitylation machinery are genetically defined in a response, it remains a major challenge to identify the target proteins. As discussed above, 2-D gels or ICAT/iTRAQ are options, but these approaches are severely limited by sensitivity of detection. As reviewed by Denison et al. (2005), proteomics technology is rapidly progressing to enable the study of populations of proteins modified by ubiquitin (Ub) and Ub-like (Ubl) proteins. The strategy is relatively straightforward. An epitope-tagged copy of Ub or Ubl is used to affinity purify the population of modified proteins, either in vitro or in vivo. One of the limiting factors in this approach could be competition from high levels of free Ub/Ubl, but dialysis or size separation columns to remove the free forms prior to the affinity step are likely to address this concern. Any of the previously described methods for quantitative comparisons should be amenable to these studies.

Determination of Protein Location in the Cell

Where a protein resides in the cell has tremendous implications for its function, but our ability to predict subcellular localization based on primary sequence remains incomplete. Moreover, sequence predictions do not account for variation that can arise from alternative transcription or splicing. A single gene may produce a protein with or without a targeting sequence, allowing the “same” protein to end up in two different compartments. As with PTM analysis, subcellular proteomics is essential both for a more complete understanding of the organelle's function and regulation as well as to detect dynamic changes that may occur during various responses. This area has probably attracted the most research in Arabidopsis proteomics, and a number of excellent reviews cover the literature in greater detail (e.g. Baginsky and Gruissem, 2004; Millar, 2004). Here, I will mainly highlight a few main points as well as some of the recent publications.

Chloroplasts and the Problems of Abundant Proteins

Perhaps unsurprisingly, the chloroplast has been the most thoroughly studied of the plant organelle proteomes. These studies have resulted in both large amounts of empirical confirmation of protein localization as well as significant refinement of targeting prediction programs (for review, see Baginsky and Gruissem, 2004). Moreover, these studies corrected database annotations of false N- and C-termini as well as exon/intron boundary predictions, demonstrating the value of thorough proteomics experiments when they provide sufficient sequence coverage.

This section is a good place to mention a major problem for chloroplast proteomics specifically and more generally for plant proteomics: Rubisco. Anyone who has extracted protein from a plant leaf recognizes that tremendously abundant band of Rubisco on their gel. To study proteins at the lower end of the dynamic range, it will be essential to remove Rubisco and possibly a few other highly abundant chloroplast proteins. In mammalian serum profiling, the problem of serum albumin has been addressed using antibody columns to affinity deplete this highly abundant protein. Perhaps a similarly useful community resource can be produced for plant proteomics. In the meantime, a method describing FPLC anion-exchange chromatography to deplete Rubisco from crude leaf extracts provides a working interim solution (Wienkoop et al., 2004).

Vacuoles and the Questions of Contamination

In 2004, three proteomic studies of the Arabidopsis vacuole were published, two using suspension-cultured cells (Shimaoka et al., 2004; Szponarski et al., 2004) and one using mature plants (Carter et al., 2004). All three studies found advantages to separating the integral membranes of the tonoplast by SDS-PAGE prior to LC-MS/MS analysis. In an interesting case of what can be inferred from data interrogation, Carter et al. (2004) found all four components—and only one of each—that would make up a SNARE complex, indicating (but not proving) that their proteomic analysis may have uncovered an entire SNARE-pin complex regulating vesicle trafficking to the vacuole.

The common finding of putative contaminant proteins in all three vacuolar analyses raises the important question of how potential contamination from other organelles affects interpretation of subproteomic analyses. Being a lytic compartment involved in turnover of proteins from other organelles (or perhaps even whole organelles), the vacuole is one of the more complicated cases, but the arguments should be considered in all studies. Even in highly enriched isolations of a particular organelle, some degree of contamination, particularly by extremely abundant proteins, is not unexpected. But once the first contaminant is found, one must question the putative assignment of an unknown protein to its “new” location. The converse, of course, also applies. Just because a sequence annotation predicts targeting to the mitochondria does not exclude that protein from being present in the vacuole. The prediction may be wrong, or a protein may move between compartments. Or, via alternative transcription/splicing/translation, the same gene may target proteins to two different compartments. The important point is that proteomic localization must be interpreted as evidence, not proof.

Endomembranes

The issue of contamination is particularly complicated when studying the endomembrane system (endoplasmic reticulum, Golgi, and plasma membranes) because these membranes have overlapping profiles in centrifugation gradients. A recent paper demonstrates a clever approach toward solving this problem, called Localization of Organelle Proteins by Isotope Tagging (LOPIT; Dunkley et al., 2004). In this method, proteins in different fractions are labeled with ICAT reagents after separation of the endomembrane organelles using a density gradient. After a series of pairwise comparisons by LC-MS/MS, proteins are clustered based on their distribution profiles. In this way, organelle assignments are made based on a complete profile rather than presence in a particular fraction. We have much to learn about how a protein matures through the endomembrane system, and this technical development provides an excellent foundation to pursue these studies.
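The idea of assigning a protein by its whole gradient profile, rather than by its presence in one fraction, can be illustrated with a minimal sketch. This is not the analysis used by Dunkley et al. (2004): a simple nearest-centroid comparison stands in for their clustering, and every abundance value, the number of fractions, and all function names below are invented for illustration.

```python
from math import dist
from statistics import mean

def normalize(profile):
    """Scale a profile so its values sum to 1; only the shape is compared."""
    total = sum(profile)
    return [v / total for v in profile]

def centroid(profiles):
    """Average several marker profiles fraction by fraction."""
    return [mean(column) for column in zip(*profiles)]

def assign(profile, centroids):
    """Return the organelle whose marker centroid is closest (Euclidean)."""
    p = normalize(profile)
    return min(centroids, key=lambda organelle: dist(p, centroids[organelle]))

# Hypothetical relative abundances of known marker proteins
# across four gradient fractions (all numbers are made up).
markers = {
    "ER": [[8, 4, 2, 1], [7, 5, 2, 1]],
    "Golgi": [[2, 8, 4, 1], [1, 7, 5, 2]],
    "plasma membrane": [[1, 2, 5, 8], [1, 1, 6, 7]],
}
centroids = {
    organelle: centroid([normalize(p) for p in profiles])
    for organelle, profiles in markers.items()
}

# An uncharacterized protein detected in every fraction is still
# assigned by the overall shape of its distribution.
unknown = [1, 6, 4, 1]
print(assign(unknown, centroids))
```

Note that the unknown protein is present in all four fractions, so presence alone would not discriminate between compartments; only the shape of the profile does.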

Direct Measurement of Protein Activity

In the end, biology is about function. Identifying proteins in an organelle or with a particular PTM begins to give us clues that indicate a protein is active in a particular place or at a particular time. In this way, proteomics (hopefully) provides new candidates involved in biological responses for further study—candidates that may not have been obvious using other methods. A more direct approach is to interrogate enzyme activity itself. Recently, a robotics platform has been described that allows direct measurement of 23 enzymes involved in carbon and nitrogen metabolism (Gibon et al., 2004). These studies demonstrated that for numerous enzymes, transcript levels underwent significant diurnal fluctuations with little change in enzyme activity.
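The contrast between a fluctuating transcript and a stable enzyme activity can be made concrete with a small calculation. The sketch below is only an illustration: the time series are invented rather than taken from Gibon et al. (2004), and the peak-to-trough swing relative to the mean is one assumed way of summarizing a diurnal fluctuation, not necessarily the measure those authors used.

```python
def relative_amplitude(series):
    """Peak-to-trough swing expressed as a fraction of the mean."""
    average = sum(series) / len(series)
    return (max(series) - min(series)) / average

# Invented values sampled across one day/night cycle (arbitrary units).
transcript = [10.0, 30.0, 50.0, 30.0, 10.0, 20.0]
activity = [95.0, 100.0, 105.0, 100.0, 98.0, 102.0]

print(relative_amplitude(transcript))
print(relative_amplitude(activity))
```

With these made-up numbers the transcript swings by more than its own mean while the activity varies by about a tenth of its mean, which is the kind of discrepancy that would go unnoticed if only transcript levels were measured.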

Although the above clearly supports the need for examining enzyme activity, many labs may be put off by the idea of establishing a robotics platform. A more accessible approach involves a rapidly evolving area of proteomics called activity-based protein profiling. The reaction mechanism of many enzymes allows stable, covalent attachment of a reactive group to the active site. This reactive group, or chemical probe, can be modified either with fluorescent tags for sensitive detection or affinity tags for subsequent isolation and identification. Each probe is very specific for a particular class of enzymes. Enzymes for which probes currently exist include many classes of proteases, phosphatases, glucosidases, deubiquitylating enzymes, and kinases (for review, see Campbell and Szardenings, 2003; Jessani and Cravatt, 2004). The specificity of these probes allows easy purification of the target enzyme(s) from complex samples. In one of the few studies performed in Arabidopsis, an activity probe for Cys proteases was used for a one-step isolation resulting in the identification of five Cys proteases active in leaves of mature plants. The level of one active protease, AALP, remained unchanged during leaf senescence, even though the transcript level had previously been shown to increase at this time, demonstrating the added information that comes from directly measuring activity (van der Hoorn et al., 2004). These types of facile and sensitive assays will add tremendously to our studies of plant biology, particularly as new probes for other classes of enzymes are developed.

CONCLUSION

The Arabidopsis genome is sequenced. We have microarrays to look at changes in transcript levels. We have mutants for most of the genes. And after many years of development, we also have a mature proteomics technology platform. Now is the time to bring these resources and tools together. Too many genetically defined pathways exist with large black boxes connecting the known mutants. We need to fill in the gaps. We need to define molecular mechanisms. Proteomics will blend perfectly and powerfully with genetics. Let the revolution begin.

Acknowledgments

The space and the format of this Update limited the range of topics that could be covered. My apologies to researchers whose work was not cited. I wish to thank members of my laboratory for comments on the manuscript.

1 This work was supported by the Gatsby Charitable Foundation (funding to S.C.P.).

References

  1. Adam GC, Sorensen EJ, Cravatt BF (2002) Chemical strategies for functional proteomics. Mol Cell Proteomics 1: 781–790
  2. Aebersold R, Goodlett DR (2001) Mass spectrometry in proteomics. Chem Rev 101: 269–295
  3. Aebersold R, Mann M (2003) Mass spectrometry-based proteomics. Nature 422: 198–207
  4. Alban A, David SO, Bjorkesten L, Andersson C, Sloge E, Lewis S, Currie I (2003) A novel experimental design for comparative two-dimensional gel analysis: two-dimensional difference gel electrophoresis incorporating a pooled internal standard. Proteomics 3: 36–44
  5. Baginsky S, Gruissem W (2004) Chloroplast proteomics: potentials and challenges. J Exp Bot 55: 1213–1220
  6. Beausoleil SA, Jedrychowski M, Schwartz D, Elias JE, Villén J, Li J, Cohn MA, Cantley LC, Gygi SP (2004) Large-scale characterization of HeLa cell nuclear phosphoproteins. Proc Natl Acad Sci USA 101: 12130–12135
  7. Borner GHH, Lilley KS, Stevens TJ, Dupree P (2003) Identification of glycosylphosphatidylinositol-anchored proteins in Arabidopsis. A proteomic and genomic analysis. Plant Physiol 132: 568–577
  8. Borner GHH, Sherrier DJ, Stevens TJ, Arkin IT, Dupree P (2002) Prediction of glycosylphosphatidylinositol-anchored proteins in Arabidopsis: a genomic analysis. Plant Physiol 129: 486–499
  9. Campbell DA, Szardenings AK (2003) Functional profiling of the proteome with affinity labels. Curr Opin Chem Biol 7: 296–303
  10. Carter C, Pan S, Zouhar J, Avila EL, Girke T, Raikhel NV (2004) The vegetative vacuole proteome of Arabidopsis thaliana reveals predicted and unexpected proteins. Plant Cell 16: 3285–3303
  11. Denison C, Kirkpatrick DS, Gygi SP (2005) Proteomic insights into ubiquitin and ubiquitin-like proteins. Curr Opin Chem Biol 9: 69–75
  12. Dunkley TPJ, Watson R, Griffin JL, Dupree P, Lilley KS (2004) Localization of organelle proteins by isotope tagging (LOPIT). Mol Cell Proteomics 3: 1128–1134
  13. Elortza F, Nühse TS, Foster LJ, Stensballe A, Peck SC, Jensen ON (2003) Proteomic analysis of glycosylphosphatidylinositol-anchored membrane proteins. Mol Cell Proteomics 2: 1261–1270
  14. Ficarro SB, McCleland ML, Stukenberg PT, Burke DJ, Ross MM, Shabanowitz J, Hunt DF, White FM (2002) Phosphoproteome analysis by mass spectrometry and its application to Saccharomyces cerevisiae. Nat Biotechnol 20: 301–305
  15. Gibon Y, Blaesing OE, Hannemann J, Carillo P, Höhne M, Hendriks JHM, Palacios N, Cross J, Selbig J, Stitt M (2004) A robot-based platform to measure multiple enzyme activities in Arabidopsis using a set of cycling assays: comparisons of changes in enzyme activities and transcript levels during diurnal cycles in prolonged darkness. Plant Cell 16: 3304–3325
  16. Gygi SP, Rochon Y, Franza BR, Aebersold R (1999) Correlation between protein and mRNA abundance in yeast. Mol Cell Biol 19: 1720–1730
  17. Hansson M, Vener AV (2003) Identification of three previously unknown in vivo protein phosphorylation sites in thylakoid membranes of Arabidopsis thaliana. Mol Cell Proteomics 2: 550–559
  18. Jessani N, Cravatt BF (2004) The development and application of methods for activity-based protein profiling. Curr Opin Chem Biol 8: 54–59
  19. Jones AME, Thomas V, Truman B, Lilley K, Mansfield J, Grant M (2004) Specific changes in the Arabidopsis proteome in response to bacterial challenge: differentiating basal and R-gene mediated resistance. Phytochemistry 65: 1805–1816
  20. Koller A, Washburn MP, Lange BM, Andon NL, Deciu C, Haynes PA, Hays L, Schieltz D, Ulaszek R, Wei J, et al (2002) Proteomic survey of metabolic pathways in rice. Proc Natl Acad Sci USA 99: 11969–11974
  21. MacCoss MJ, McDonald WH, Saraf A, Sadygov R, Clark JM, Tasto JJ, Gould KL, Wolters D, Washburn M, Weiss A, et al (2002) Shotgun identification of protein modifications from protein complexes and lens tissue. Proc Natl Acad Sci USA 99: 7900–7905
  22. MacKay VL, Li X, Flory MR, Turcott E, Law GL, Serikawa KA, Xu XL, Lee H, Goodlett DR, Aebersold R, et al (2004) Gene expression analyzed by high-resolution state array analysis and quantitative proteomics. Mol Cell Proteomics 3: 478–489
  23. Millar AH (2004) Location, location, location: surveying the intracellular real estate through proteomics in plants. Funct Plant Biol 31: 563–571
  24. Nühse TS, Boller T, Peck SC (2003a) A plasma membrane syntaxin is phosphorylated in response to the bacterial elicitor flagellin. J Biol Chem 278: 45248–45254
  25. Nühse TS, Stensballe A, Jensen ON, Peck SC (2003b) Large-scale analysis of in vivo phosphorylated membrane proteins by immobilized metal ion affinity chromatography and mass spectrometry. Mol Cell Proteomics 2: 1234–1243
  26. Nühse TS, Stensballe A, Jensen ON, Peck SC (2004) Phosphoproteomics of the Arabidopsis plasma membrane and a new phosphorylation site database. Plant Cell 16: 2394–2405
  27. Peck SC, Nühse TS, Iglesias A, Hess D, Meins F, Boller T (2001) Directed proteomics identifies a plant-specific protein rapidly phosphorylated in response to bacterial and fungal elicitors. Plant Cell 13: 1467–1475
  28. Pradet-Balade B, Boulmé F, Beug H, Müllner EW, Garcia-Sanz JA (2001) Translational control: bridging the gap between genomics and proteomics? Trends Biochem Sci 26: 225–229
  29. Rose JKC, Bashir S, Giovannoni JJ, Jahn MM, Saravanan RS (2004) Tackling the plant proteome: practical approaches, hurdles, and experimental tools. Plant J 39: 715–733
  30. Ross PL, Huang YN, Marchese JN, Williamson B, Parker K, Hattan S, Khainovski N, Pillai S, Dey S, Daniels S, et al (2004) Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol Cell Proteomics 3: 1154–1169
  31. Shimaoka T, Ohnishi M, Sazuka T, Mitsuhashi N, Hara-Nishimura I, Shimazaki KI, Maeshima M, Yokota A, Tomizawa KI, Mimura T (2004) Isolation of intact vacuoles and proteomic analysis of tonoplast from suspension-cultured cell of Arabidopsis thaliana. Plant Cell Physiol 45: 672–683
  32. Swidzinski JA, Leaver CJ, Sweetlove LJ (2004) A proteomic analysis of plant programmed cell death. Phytochemistry 65: 1829–1838
  33. Szponarski W, Sommerer N, Boyer JC, Rossignol M, Gibrat R (2004) Large-scale characterization of integral proteins from Arabidopsis vacuolar membrane by two-dimensional liquid chromatography. Proteomics 4: 397–406
  34. Tian Q, Stepaniants SB, Mao M, Weng L, Feetham MC, Doyle MJ, Yi EC, Dai H, Thorsson V, Eng J, et al (2004) Integrated genomic and proteomic analyses of gene expression in mammalian cells. Mol Cell Proteomics 3: 960–969
  35. Tonge R, Shaw J, Middleton B, Rowlinson R, Rayner S, Young J, Pognan F, Hawkins E, Currie I, Davison M (2001) Validation and development of fluorescent two-dimensional gel electrophoresis proteomics technology. Proteomics 1: 377–396
  36. van der Hoorn RAL, Leeuwenburgh MA, Bogyo M, Joosten MHAJ, Peck SC (2004) Activity profiling of papain-like cysteine proteases in plants. Plant Physiol 135: 1170–1178
  37. Wienkoop S, Glinski M, Tanaka N, Tolstikov V, Fiehn O, Weckwerth W (2004) Linking protein fractionation with multidimensional monolithic reversed-phase peptide chromatography/mass spectrometry enhances protein identification from complex mixtures even in the presence of abundant proteins. Rapid Commun Mass Spectrom 18: 643–650

Articles from Plant Physiology are provided here courtesy of Oxford University Press
