Significance
Here we report deep, quantitative, and replicated proteome analysis of a developing multicellular organism. We quantified protein abundance and levels of protein phosphorylation during development of the maize seed. The depth and quantitative nature of the data enabled a network-based approach to identify kinase-substrate relationships as well as the reconstruction of biochemical and signaling networks that underpin seed development and seed storage product production. We found that many of the most abundant proteins are not associated with detectable levels of their mRNAs and vice versa. These data significantly add to our understanding of seed development and facilitate knowledge-based crop improvement.
Keywords: quantitative proteomics, protein phosphorylation, systems biology
Abstract
A comprehensive knowledge of proteomic states is essential for understanding biological systems. Using mass spectrometry, we mapped an atlas of developing maize seed proteotypes comprising 14,165 proteins and 18,405 phosphopeptides (from 4,511 proteins), quantified across eight tissues. We found that many of the most abundant proteins are not associated with detectable levels of their mRNAs, and we provide evidence for three potential explanations: transport of proteins between tissues; diurnal, out-of-phase accumulation of mRNAs and cognate proteins; and differential lifetimes of mRNAs compared with proteins. Likewise, many of the most abundant mRNAs were not associated with detectable levels of their proteins. Across the entire dataset, protein abundance was poorly correlated with mRNA levels and was largely independent of phosphorylation status. Comparisons between proteotypes revealed the quantitative contribution of specific proteins and phosphorylation events to the spatially and temporally regulated starch and oil biosynthetic pathways. Reconstruction of signaling networks established associations of proteins and phosphoproteins with distinct biological processes acting during seed development. Additionally, a protein kinase substrate network was reconstructed, enabling the identification of 762 potential substrates of specific protein kinases. Finally, examination of 694 transcription factors revealed remarkable constraints on patterns of expression and phosphorylation within transcription factor families. These results provide a resource for understanding seed development in a crop that is the foundation of modern agriculture.
A central goal of biology is to understand phenotype. Proteins make or regulate every component of cells, and therefore phenotype is an emergent property of the specific state of the proteome. The proteomic state of a cell is its proteotype, which integrates the constraints of its genotype, developmental history, and environment. Thus, a complete description of the proteotype should define a phenotype at the molecular level. Typically, measurements of mRNA abundance are used to infer the proteotype (1, 2). However, it has become clear that mRNA levels are poorly correlated with protein abundance (3–8). Proteome-wide surveys are crucial for bridging this gap and defining specific cellular proteotypes.
Maize is a model organism with a rich history in fundamental research in addition to being the world’s largest production crop. The maize seed is a developmentally complex structure comprised of two major compartments, the diploid embryo and the triploid endosperm, that arise from two separate fertilization events (double fertilization) and are enclosed within the maternally derived pericarp (9). As in other grasses, the maize endosperm is persistent throughout seed development (10). The endosperm consists primarily of starchy endosperm cells that are responsible for synthesis of starch and storage proteins, and its perimeter is comprised of a single layer of aleurone cells. At maturity, the embryo is comprised of a root meristem, a shoot meristem, and five or six leaf primordia enclosed within the scutellum (9, 11). Additionally, the embryo is the primary site of lipid biosynthesis in the seed. The production of storage products during seed formation is tightly regulated, and their accumulation is directly correlated with cell number and cell size (12, 13). Thus, the maize seed is an excellent model for profiling the proteotypes from a complex set of tissues that exhibit extensive spatiotemporal control and coordinated morphogenesis.
We used mass spectrometry (MS) to build an atlas of proteotypes for the developing maize seed based on protein abundance and levels of protein phosphorylation. These quantitative, highly replicated data enabled the reconstruction of protein networks for key biochemical processes and for developmental pathways.
Results
Mapping the Maize Seed Proteotype Atlas.
To enhance our understanding of regulatory events controlling seed development as well as the key harvested traits of starch, lipid, and storage protein accumulation, we hand-dissected the maize seed into compartments at seven stages of development for MS analyses (Fig. 1A). These compartments include embryo, endosperm, and aleurone/pericarp tissues. Total protein was extracted from each sample, and tryptic peptides from the samples, with or without phosphopeptide enrichment, were analyzed by MS. The spectra were searched by using the B73 RefGen_v2 5a Working Gene Set (WGS) (14). By using stringent cutoffs to maintain a low false discovery rate at the spectral, peptide, and protein level, we identified 13,459 proteins (protein groups), originating from 13,203 gene models, based solely on 108,786 distinct nonmodified peptides (Fig. 1B and Dataset S1). The genes responsible for producing 12,453 of the proteins could be unequivocally assigned by the identification of at least one uniquely mapping peptide (Dataset S1). These proteins are predominantly in the filtered gene set (FGS), which consists of 39,656 high-confidence gene models that exclude transposons, pseudogenes, and other low-confidence members of the WGS (Fig. 1 D and E and Dataset S1). Finally, we identified and measured 4,511 phosphoproteins based on 18,405 distinct phosphopeptides containing 19,049 sites of phosphorylation, 8,889 of which were localized to a specific amino acid (Fig. 1C and Dataset S2).
Fig. 1.
Overview of the maize seed proteotype atlas. (A) Nonmodified proteins (n = 4–7 biological replicates) and phosphoproteins (n = 3–6 biological replicates) were identified from diverse tissues in the developing maize seed. Tissues sampled are shown in color and ploidy is indicated in parentheses (DAI, days after imbibition). (B) Summary of sampled spectra, peptides, and proteins identified. (C) Number of total phosphorylation sites as well as localized phosphorylation sites. (D) Percentage of detected proteins that are in the FGS or WGS. (E) Breakdown of detected proteins based on annotation. For D and E, only the subset of proteins (n = 12,453) identified via uniquely mapping peptides were used.
The ∼5 million identified mass spectra, collected from several biological replicates (n = 4–7 for each nonmodified proteotype and n = 3–6 for each phosphoproteotype), were used to quantify protein abundance and phosphorylation levels by spectral counting (15, 16). To assess reproducibility and accuracy of the biological replicates, we computed Pearson correlations and found averages of 0.92 and 0.62 for the nonmodified proteome and phosphoproteome replicates, respectively (Dataset S3). Additionally, the data accurately reflect known patterns of protein accumulation (Fig. S1) (9, 17–22). Examination of the data revealed that the majority of proteins are present in multiple tissues whereas only 1,203 are tissue-specific (Fig. S2 A and B). These tissue-specific proteins are enriched in MapMan functional categories (i.e., ontological terms) (23) including “receptor kinases” and “regulation of transcription.” Additionally, most of the globally expressed proteins exhibit dynamic patterns of accumulation during seed development. Specifically, of the 4,709 globally expressed proteins, only 180 are stably expressed throughout development (change less than twofold in abundance; Fig. S2C), suggesting that dynamic changes in protein abundance underpin seed development.
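The categories used in this paragraph (tissue-specific, globally expressed, and stably expressed with less than a twofold change) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name `classify`, the rule that any nonzero count means "detected," and the example counts are all assumptions made here; the actual detection criteria are given in SI Materials and Methods.

```python
def classify(profile, max_fold=2.0):
    """Classify one protein from its per-tissue abundance.

    profile: dict mapping tissue name -> normalized spectral count.
    Assumption (not from the paper): a protein is "detected" in a tissue
    whenever its count is greater than zero.
    """
    detected = [value for value in profile.values() if value > 0]
    if not detected:
        return "undetected"
    if len(detected) == 1:
        return "tissue-specific"
    if len(detected) < len(profile):
        return "multi-tissue"
    # Detected in every tissue: stable if max/min stays below max_fold.
    if max(detected) / min(detected) < max_fold:
        return "global-stable"
    return "global-dynamic"


# Hypothetical profiles over three tissues (values are invented).
print(classify({"embryo": 0, "endosperm": 12, "per_aleu": 0}))    # tissue-specific
print(classify({"embryo": 20, "endosperm": 30, "per_aleu": 25}))  # global-stable
print(classify({"embryo": 5, "endosperm": 30, "per_aleu": 25}))   # global-dynamic
```

Under this rule a protein detected in every tissue is "global-dynamic" as soon as its highest abundance reaches twice its lowest, which mirrors the twofold criterion quoted above.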
Relationship Between Transcript Level and Protein Abundance.
We explored the relationship between mRNA and protein levels in the endosperm 12 d after pollination (DAP) and in the embryo 20 DAP by comparing our proteotype data with publicly available transcript profiling data (24). Considering only genes for which mRNA and protein were both reliably measured (endosperm, n = 5,922; embryo, n = 7,257), we found poor correlation between transcript and protein levels (endosperm r = 0.414, embryo r = 0.413; Dataset S4). Although the global correlation was low, there was a wide range of correlations dependent on the functional category (MapMan bins; Dataset S4); for instance, “aspartate metabolism” (r = 0.99), “phosphoenolpyruvate carboxylase” (r = 0.98), “oxidative pentose phosphate 6-phosphogluconate dehydrogenase” (r = 0.92), “abscisic acid signal transduction” (r = −0.17), “auxin metabolism” (r = −0.06), “auxin response factor” (ARF; r = −0.001), “basic leucine zipper (bZIP) transcription factor (TF) family” (r = 0.03), and “cell wall modification” (r = −0.70), indicating that posttranscriptional regulation of protein abundance is function-specific.
We observed that 22% (endosperm) and 21% (embryo) of the genes had matching rank abundance between their mRNA and protein (Fig. 2A, yellow dots) and 29% of these genes were the same in both tissues (i.e., protein and mRNA; Fig. 2B), suggesting that they are largely free of posttranscriptional regulation (Dataset S5). Additionally, many genes produced high abundance mRNA but little or no detectable protein (Fig. 2, Fig. S3A, and Dataset S5). Such cases may be explained by translational inhibition or targeted protein degradation.
Fig. 2.
Relationship of transcript level to protein abundance. (A) Rank order abundance of mRNA and protein in the endosperm 12 DAP and the embryo 20 DAP. Protein abundance was quantified by using unique and multimapping peptides. Detected proteins lacking a corresponding microarray probe were excluded. (B) Overlap in gene products exhibiting similar regulation in the endosperm and embryo. (C) Transcript abundance cycling behavior of protein greater than mRNA genes. (D) Number of 12 DAP endosperm protein greater than mRNA genes that are detected at the transcript level in the 12 DAP endosperm and/or WS. (E) Number of 20 DAP embryo protein greater than mRNA genes that are detected at the transcript level in the 20 DAP embryo, endosperm, and/or WS. Transcript data shown in A, B, D, and E were reported by Sekhon et al. (24), whereas transcript data for C were described in the work of Khan et al. (25).
Surprisingly, for many of the most abundant proteins there was little or no detectable mRNA (Fig. 2, Fig. S3A, and Dataset S5). To verify this remarkable finding, we performed quantitative reverse transcriptase-PCR (RT-qPCR) on the same samples that were used for proteomics and found concordance between the published array data and our RT-qPCR values (Fig. S3B). For example, the protein produced by GRMZM2G472236 is a Late Embryogenesis Abundant family member and was among the most abundant proteins in the 20 DAP embryo, but the corresponding transcript was not detected by RT-qPCR or microarray in either tissue (Fig. S3 B and C).
We explored three possible scenarios that could explain the high-protein, low-mRNA discrepancy: (i) transcript levels cycle diurnally while the protein remains, (ii) transcription and translation occur earlier in development and the proteins are stable while the mRNA is not, and (iii) transcription and translation occur in another tissue from which the protein moves. We found evidence to support all three hypotheses by comparing the protein greater than mRNA genes with microarray data characterizing maize leaf circadian cycling genes (25) as well as additional seed microarrays from Sekhon et al. (24), which profiled 12 and 20 DAP whole seed (WS), 12 DAP endosperm, and 16, 18, and 20 DAP embryos. (i) Seven of the endosperm and three of the embryo protein greater than mRNA genes encode transcripts known to be circadian regulated (Fig. 2C). (ii) Transcripts for 5 of the 20 DAP embryo protein greater than mRNA genes are detected in 16 and/or 18 DAP embryos but not 20 DAP embryos (Fig. S3D). Further, 20 DAP embryo protein greater than mRNA genes, for which mRNA is detected in 16 DAP embryos, are transcriptionally expressed at a higher level at 16 DAP compared with 20 DAP (Fig. S3E). (iii) A total of 15 of the 50 endosperm protein greater than mRNA genes, which were not detected at the mRNA level in the 12 DAP endosperm, were detected in the 12 DAP WS (Fig. 2D). Additionally, 15 of the 54 embryo protein greater than mRNA genes, which were not detected at the mRNA level in the 20 DAP embryo, were detected in the 20 DAP WS and/or 20 DAP endosperm (Fig. 2E).
Phosphorylation Levels Are Independent of Protein Abundance.
The lack of concordance between mRNA and protein levels prompted us to ask whether protein abundance dictates phosphorylation level. For this, we focused on the 3,805 nonmodified proteins that were also observed as phosphoproteins (Fig. 3A). After clustering proteins based on phosphoprotein abundance, it was apparent that phosphorylation level and protein abundance are largely independent (Fig. 3 B and C), as observed in mice (15). Further, individual sites of phosphorylation exhibit tissue-specific levels that are not dictated by protein abundance (Fig. 3D).
Fig. 3.
Phosphorylation levels are largely independent of protein abundance. (A) Venn diagram showing the overlap between nonmodified and phosphoproteins. (B) Heat maps ordered by hierarchical clustering of phosphoprotein abundance of all proteins detected at the nonmodified and phosphoprotein level. (C) Selected phosphopeptides exhibiting site-specific phosphorylation that is not dictated by protein abundance. Data are means of independent biological replicates ± SE.
Protein Kinase Network Reconstruction.
Next, we used the atlas to reconstruct a regulatory network of protein kinases and their substrates. A common feature of protein kinases is their activation loop, which requires phosphorylation to enable catalysis (26). We exploited this feature to quantify kinase activity during seed development by measuring phosphopeptides from each activation loop. Importantly, kinase activation could not have been predicted from kinase abundance (Fig. 4A). We next performed a correlation analysis to identify proteins whose phosphorylation levels corresponded with activation of a specific kinase, inferring that these proteins may be substrates of the kinase. This enabled the reconstruction of a network containing nine activated kinases and 762 potential substrate proteins (Fig. 4B and Dataset S6). For validation, we compared our predicted substrates of mitogen-activated protein kinase 6 (ZmMPK6) with known substrates of the orthologous Arabidopsis MPK6 (27) and found a significant overlap (P = 2.99 × 10⁻³). Additionally, the glycogen synthase kinase 3/SHAGGY (GSK) consensus motif, (S/T)XXX(S/T), is overrepresented (P = 0.045) in the substrates of the GSK-related kinase (GRMZM2G155836). Further, the MAPK consensus motif, PX(S/T)P, is overrepresented in the substrates of AtMPK6-like (P = 0.056) and AtMPK9-like (P = 0.023). Examination of the network revealed kinases predicted to phosphorylate a number of well-studied maize proteins (Dataset S6). For example, a GSK kinase is predicted to phosphorylate the bZIP TFs OPAQUE2 HETERODIMERIZING PROTEIN 1 and 2 (OHP1 and OHP2) (28) on a conserved serine (Fig. S4). Taken together, this approach has enabled the creation of a robust predictive network of potential kinase–substrate pairs.
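The logic of the correlation analysis, together with the two consensus motifs quoted above, can be illustrated with a short Python sketch. It is not the published method: the Pearson statistic, the cutoff `min_r = 0.9`, the function names, and all profiles and sequences are assumptions chosen for illustration; the procedure actually used is described in SI Materials and Methods.

```python
import math
import re

# Consensus motifs quoted in the text, written as regular expressions.
GSK_MOTIF = re.compile(r"[ST].{3}[ST]")  # (S/T)XXX(S/T)
MAPK_MOTIF = re.compile(r"P.[ST]P")      # PX(S/T)P


def pearson(x, y):
    """Pearson correlation; returns NaN if either profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")
    return cov / (sd_x * sd_y)


def candidate_substrates(kinase_profile, phospho_profiles, min_r=0.9):
    """Names of phosphoproteins whose profile tracks kinase activation.

    kinase_profile: activation-loop phosphopeptide counts per tissue.
    phospho_profiles: dict name -> phosphorylation counts per tissue.
    min_r is a hypothetical cutoff; NaN correlations never pass it.
    """
    return sorted(
        name
        for name, profile in phospho_profiles.items()
        if pearson(kinase_profile, profile) >= min_r
    )


# Invented example data over eight tissues.
activation_loop = [0, 2, 5, 9, 4, 1, 0, 3]
phospho = {
    "protein_A": [1, 3, 6, 10, 5, 2, 1, 4],  # follows the kinase
    "protein_B": [9, 7, 4, 0, 5, 8, 9, 6],   # mirror image of the kinase
    "protein_C": [2, 2, 2, 2, 2, 2, 2, 2],   # constant, uninformative
}
print(candidate_substrates(activation_loop, phospho))
print(MAPK_MOTIF.search("AAPLSPKK") is not None)
```

In this toy example only `protein_A` is retained, and the invented peptide contains a PX(S/T)P match. A motif check of this kind supplies the per-kinase counts for an overrepresentation test.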
Fig. 4.
Prediction of protein kinase–substrate relationships. (A) Hierarchical clustering based upon the number of spectra mapping to the kinase activation loop was used to order heat maps depicting the amount of activated kinase (Left) and nonmodified form of the kinase (Right). (B) Network of activated kinases and proteins whose phosphorylation is correlated. Hub identifiers: 1, GRMZM2G306028 (AtMPK9-like); 2, GRMZM2G424582 (AtKEG-like); 3, GRMZM2G149286 (CDKD related); 4, GRMZM2G028452 (CDKC related); 5, GRMZM2G067734; 6, GRMZM2G171987 (SRPK4-like); 7, GRMZM2G167280 (LRR Receptor Like Kinase Related); 8, GRMZM2G155836 (GSK-related); and 9, GRMZM2G020216 (AtMPK6-like).
Spatiotemporal Regulation of Starch and Lipid Biosynthesis.
In the developing maize seed, starch and triglyceride accumulation are spatiotemporally regulated, resulting in their accumulation in endosperm and embryo, respectively (13, 29, 30). Thus, we examined the starch and triglyceride pathways in detail, hypothesizing that protein abundance or phosphorylation may regulate photosynthate partitioning. For this, we manually curated the pathways (SI Materials and Methods) to identify proteins that are known or predicted, based on homology, to perform each biochemical step. This enabled identification and quantification of known and novel paralogs of enzymes and transporters at each stage of development (Fig. 5, Fig. S5, and Dataset S1). Many lipid biosynthesis enzymes, including the key determinant of seed oil content DGAT1-2 (29), peaked in the early embryo, where most of the seed oil accumulates. In contrast, proteins known to be important for seed starch biosynthesis such as SH1, BT2, SH2, BT1, WX1, SU2, DU1, AE, and SU1 exhibited maximal abundance in the endosperm crown at 27 DAP, which corresponds with the peak time of starch synthesis in our samples (Fig. 5).
Fig. 5.
Dynamics of starch biosynthetic enzymes during seed development. Heat maps depict the relative abundance of individual proteins throughout development. The green “P” denotes tissues in which the corresponding phosphoprotein was detected. The starch biosynthesis pathway was adapted from Comparot-Moss and Denyer (30).
We also discovered numerous sites of phosphorylation that may regulate starch biosynthesis. In the endosperm of maize and other grasses, glucose-1-P is converted to ADP-glucose predominantly in the cytosol and then transported into the plastid by BT1, which is a nucleotide transporter of the mitochondrial carrier family (MCF) (30). A key feature of MCF proteins is their transmembrane barrel composed of six α-helices (31). Mutations in the yeast MCF protein ANT1 that mimic dephosphorylation of α-helix four abolish transport activity (32). Accordingly, we observed a phosphopeptide that matches α-helix one of BT1, suggesting that serine phosphorylation may regulate its ADP-glucose transport activity (Fig. 5, Fig. S6, and Dataset S2). Research on maize and wheat has established that phosphorylation causes starch synthesis enzymes to form active, multiprotein complexes; the complexes include SBEI/AE/SP, SSI/SU2/AE, and SSI/SU2/SBEI/SBE2a/SP, with SBEI, AE, and SP identified as phosphoproteins (33, 34). The sites of phosphorylation on these proteins have not been reported. Our atlas of proteotypes revealed specific sites of phosphorylation for the starch synthesis complex members SBEI, SBE2a, AE, and SP (Fig. 5 and Dataset S2). Identification of these phosphorylation sites enables targeted mutational studies aimed at regulating starch synthesis complex assembly, with the goal of tailoring starch quantity or quality for specific applications.
Seed Development Proteotypes.
To gain insight into biological processes functioning throughout seed development and to associate specific proteins with key seed phenotypes, we performed hierarchical clustering-based network reconstruction (35) on protein abundance and levels of phosphorylation. Consistent with phosphorylation status being independent of protein abundance (Fig. 3), MapMan bins identified as enriched at the phosphopeptide level were largely distinct from bins enriched in the nonmodified proteome (Fig. 6 and Dataset S7). We detected enrichment of MapMan bins for well-characterized biological processes including starch synthesis and lipid metabolism at the expected time and place (10, 12, 13, 29). Additionally, enzymes of phenylpropanoid metabolism were enriched in the pericarp/aleurone (Per/Aleu) tissue, which is known to accumulate phenylpropanoids that are associated with insect and pathogen resistance (36, 37). Jasmonate enzymes were also enriched in the Per/Aleu, suggesting that this defense hormone (38) forms an additional layer of defense in the protective pericarp tissue.
Fig. 6.
Hierarchical clustering of protein abundance and phosphorylation status throughout seed development. (A) Clustering of nonmodified proteins that had at least five normalized spectral counts in one or more tissue. (B) Clustering of phosphopeptides containing localized phosphorylation sites that were detected in at least two biological replicates. Vertical bars to the right of the heat maps denote the cluster (composed of all terminal nodes in the hierarchical tree) of proteins (A) or phosphopeptides (B) selected as tissue-enriched. Selected MapMan functional categories that are significantly enriched (hypergeometric test) in a given tissue (cluster) are listed to the right of each heat map. All enriched MapMan categories are listed in Dataset S7.
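For readers unfamiliar with the hypergeometric test named in the legend, the following Python sketch computes a one-sided enrichment P value for a single functional category in a single cluster. The function and argument names and the example numbers are hypothetical, and no multiple-testing correction is applied here; the settings used for the atlas are described in SI Materials and Methods.

```python
from math import comb


def hypergeom_enrichment_p(population, category_total, cluster_size, category_in_cluster):
    """One-sided hypergeometric P value, P(X >= category_in_cluster).

    population: number of proteins considered overall.
    category_total: proteins in the population annotated to the category.
    cluster_size: number of proteins in the tissue-enriched cluster.
    category_in_cluster: cluster members annotated to the category.
    """
    upper = min(category_total, cluster_size)
    total = comb(population, cluster_size)
    tail = sum(
        comb(category_total, k) * comb(population - category_total, cluster_size - k)
        for k in range(category_in_cluster, upper + 1)
    )
    return tail / total


# Invented numbers: 40 of 1,000 proteins belong to a category,
# and 12 of them fall in a cluster of 100 proteins.
p_value = hypergeom_enrichment_p(1000, 40, 100, 12)
print(f"{p_value:.3g}")
```

The exact integer arithmetic of `math.comb` avoids rounding problems for small P values, at the cost of speed on very large populations.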
TF Dynamics.
Finally, we investigated TFs, which are key regulators of growth, development, and cell fate that have traditionally been difficult to detect by proteomics because of their low abundance (39, 40). However, in the developing seed, we identified and measured 694 (28%) of the 2,516 annotated TFs (Fig. 7). Clustering the TF data revealed extensive enrichment of protein accumulation and phosphorylation in specific tissues (Fig. 7A and Dataset S8). These tissue-enriched TFs represent candidate proteins responsible for patterning maize seed development as well as controlling the spatiotemporal expression of starch, lipid, and storage protein biosynthesis pathways.
Fig. 7.
Dynamics of TFs during seed development. (A) Hierarchical clustering showing the nonmodified and phosphoprotein abundance profiles of TFs. Vertical bars to the right of the heat maps denote the cluster (comprised of all terminal nodes in the hierarchical tree) selected as tissue-enriched. Cluster members are listed in Dataset S8. (B) Spectral counts were summed for each TF family. Red bars indicate the tissue of maximal abundance. Numbers to the right of the bars list the total number of detected TFs for each family.
As an alternative to looking at individual TFs, we searched for family-level patterns of TF accumulation by summing all the spectral counts within a tissue for each TF family. Surprisingly, the abundance of most TF families was greatest in a specific tissue, with 25 of 47 TF families collectively peaking in the 20 DAP embryo (Fig. 7B). Specifically, TF families with well-established roles in tissue pattern formation such as MYB, ARF, homeobox, and TCP accumulated predominantly in the 20 DAP embryo, whereas MADS TFs peaked in the 8 DAP endosperm and YABBY TFs built up in the Per/Aleu. Consistent with the enrichment of defense proteins in the Per/Aleu (Fig. 6), the defense-related WRKY TF family peaked in the Per/Aleu. We also observed asymmetries in the phosphorylation status of TF families. For example, an increase in bZIP family abundance corresponded to increases in bZIP phosphorylation, whereas the opposite pattern was observed for the ARF family. Considering that the DNA binding activity of OPAQUE2 (a maize bZIP) and ARF2 (in Arabidopsis) is abolished by hyperphosphorylation (41, 42), our TF phosphorylation data may be used to infer TF activities during development.
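The family-level summation described at the start of this paragraph amounts to a grouped sum followed by a per-family maximum. A minimal Python sketch is given below; the identifiers, tissue labels, and counts are invented, and ties are resolved by taking the first tissue encountered, which is an arbitrary choice made here.

```python
from collections import defaultdict


def family_peak_tissue(tf_counts, tf_family):
    """Sum spectral counts per TF family and report the peak tissue.

    tf_counts: dict tf_id -> dict tissue -> spectral count.
    tf_family: dict tf_id -> family name.
    Returns dict family -> (peak_tissue, summed_count_in_that_tissue).
    """
    totals = defaultdict(lambda: defaultdict(float))
    for tf_id, per_tissue in tf_counts.items():
        family = tf_family[tf_id]
        for tissue, count in per_tissue.items():
            totals[family][tissue] += count
    return {
        family: max(per_tissue.items(), key=lambda item: item[1])
        for family, per_tissue in totals.items()
    }


# Hypothetical TFs and counts (not taken from Dataset S8).
counts = {
    "tf1": {"embryo_20DAP": 8, "endosperm_8DAP": 1},
    "tf2": {"embryo_20DAP": 5, "endosperm_8DAP": 2},
    "tf3": {"embryo_20DAP": 0, "endosperm_8DAP": 6},
}
families = {"tf1": "MYB", "tf2": "MYB", "tf3": "MADS"}
print(family_peak_tissue(counts, families))
```

Because counts are summed before the maximum is taken, a single highly abundant family member can dominate the family-level peak, which is worth keeping in mind when reading Fig. 7B.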
Discussion
The maize seed is a developmentally complex entity composed of two major compartments, the diploid embryo and the triploid endosperm, that arise from double fertilization events. The endosperm and embryo develop inside the maternally derived pericarp, which arises from the ovary wall (9). Despite the biological complexity of the sampled tissues, relatively few proteins exhibit tissue-specific accumulation patterns. Although tissue-specific proteins are strong candidates as key regulators of cell fate, the finding suggests that knowledge of quantitative changes in protein abundance and phosphorylation status is critical for understanding seed development as well as biosynthetic processes that occur in a spatiotemporal manner during seed development. Thus, the tissue-enriched proteins and phosphorylation events identified via hierarchical clustering are likely important candidate regulators of tissue identity.
Analysis of the data highlights the complexity of accurately characterizing proteotype. Specifically, measures of mRNA correlate poorly with protein abundance, which is itself largely independent of phosphorylation status. The lack of correlation between mRNA and protein levels has been documented in other organisms (3–8). However, we were surprised by the observation that many of the most abundant proteins had little to no measurable mRNA. We found evidence suggesting that multiple mechanisms underpin this phenomenon, including (i) transcription and translation occur earlier in development and the proteins are stable while the mRNA is not; (ii) transcript levels cycle diurnally while the protein remains; and (iii) transcription and translation occur in another tissue from which the protein moves. Consistently, in mammalian cells, proteins are approximately five times more stable than mRNAs (3), and, in Arabidopsis, most transcripts cycle diurnally while the encoded proteins do not (43).
Because TFs establish regulatory networks that shape growth and development, knowledge of where and when TFs are active is of widespread interest (40). Interrogation of our proteotype atlas enabled the identification of TFs that exhibit enrichment in abundance or phosphorylation in a specific tissue, making these proteins strong candidate regulators of seed development. Consistent with this idea, we observed maximal accumulation of MADS TFs in the 8 DAP endosperm, which contains maternal nucellar tissue (Fig. S1A and ref. 44). In rice, MADS29 is expressed in the nucellus, where it regulates seed development by controlling programmed cell death of the maternal tissues (45). Finally, the observed conservation of tissue-specific accumulation and phosphorylation within many TF families suggests that there are significant evolutionary constraints on diversification of TF function.
Phosphorylation is a fundamental mechanism for regulating protein activity, and identification of thousands of phosphorylation sites by MS is now feasible. However, identification of the kinase responsible for substrate phosphorylation remains challenging. We therefore reconstructed a kinase–substrate regulatory network by correlating kinase activation with substrate phosphorylation. The resulting network predicts kinases responsible for phosphorylation of numerous “classical” maize genes that have been identified by mutant phenotypes (46). Additionally, we observed multiple instances of a kinase being predicted to phosphorylate a conserved phosphorylation site on paralogous TFs. These phosphorylation sites are of particular interest because functional phosphorylation sites are more likely to be conserved than nonfunctional sites (47). For example, a GSK kinase is predicted to phosphorylate a conserved serine on OHP1 and OHP2, which are involved in regulating zein storage protein synthesis in the endosperm (28).
In conclusion, we have created an atlas of maize seed proteotypes by using MS that quantifies protein abundance and phosphorylation levels across developmental time. The atlas comprises 14,165 proteins and 18,405 phosphopeptides, making it the most complete, quantitative proteome to date. The reconstruction of metabolic and developmental networks illustrates the utility of the atlas as well as the causal relationships between proteotypes and phenotypes. The atlas and derived protein networks add significantly to our understanding of seed development, and they should facilitate knowledge-based crop improvement.
Materials and Methods
Plant Material.
All samples were collected from Zea mays (maize) inbred line B73 grown outdoors on the University of California, San Diego, campus during summer 2009, following manual self or intersibling pollination. A detailed description of the sampled tissues is provided in SI Materials and Methods.
MS.
Sample preparation and MS are based on previously described methods (48–50) and are detailed in SI Materials and Methods. Briefly, the generated spectra were searched using the B73 RefGen_v2 5a WGS (14). Phosphorylation sites were localized to a particular amino acid within a peptide by using the variable modification localization score in Agilent Spectrum Mill software (51). Proteins that share common peptides were grouped by using principles of parsimony to address protein database redundancy. Thus, proteins within the same group share the same set or subset of peptides. Protein abundance and phosphorylation levels were quantified by spectral counting. Spectral counts for each protein represent the total number of peptide spectral matches to that protein (15, 16, 48). MS runs (replicates) were normalized so that the total number of spectral counts was equal for each run. Spectral counts from technical replicates, when present, were then averaged to get the spectral counts for each biological replicate at the protein level. Raw spectra are deposited at the Mass Spectrometry Interactive Virtual Environment (MassIVE) repository (nonmodified proteome ID MSV000078444 and phosphoproteome ID MSV000078443).
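The two quantification steps described above (equalizing total spectral counts across runs, then averaging technical replicates) can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and scaling every run to the mean of the run totals is an assumption, because the text states only that totals were made equal.

```python
def normalize_runs(runs, target_total=None):
    """Scale each MS run so that its spectral counts sum to target_total.

    runs: list of dicts mapping protein -> raw spectral count.
    target_total: common total; defaults to the mean run total (assumption).
    """
    totals = [sum(run.values()) for run in runs]
    if target_total is None:
        target_total = sum(totals) / len(totals)
    return [
        {protein: count * target_total / total for protein, count in run.items()}
        for run, total in zip(runs, totals)
    ]


def average_technical_replicates(runs):
    """Average normalized counts per protein; a missing protein counts as zero."""
    proteins = set().union(*runs)
    return {
        protein: sum(run.get(protein, 0.0) for run in runs) / len(runs)
        for protein in proteins
    }


# Invented raw counts for two technical replicates of one biological sample.
raw = [{"A": 10, "B": 30}, {"A": 40, "B": 40}]
normalized = normalize_runs(raw)
print(normalized)
print(average_technical_replicates(normalized))
```

After normalization both runs sum to the same total, so the averaged value for each protein is not dominated by the run that happened to yield more spectra.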
Relationship of mRNA to Protein.
Normalized mRNA expression data from a previous work (24), corresponding to the B73 RefGen_v2 5a Working Gene Set, were downloaded from PLEXdb Accession Zm37 (www.plexdb.org). A detailed description is provided in SI Materials and Methods.
PCR.
Detailed information on PCR is provided in SI Materials and Methods.
Functional Annotations.
Detailed information on functional annotations is provided in SI Materials and Methods.
Hierarchical Clustering.
Detailed information on hierarchical clustering is provided in SI Materials and Methods.
Functional Category Enrichment.
Detailed information on functional category enrichment is provided in SI Materials and Methods.
Pathway Analysis.
Detailed information on pathway analysis is provided in SI Materials and Methods.
Protein Kinase Substrate Network.
Detailed information on the protein kinase substrate network is provided in SI Materials and Methods.
Acknowledgments
This work was supported by National Science Foundation Grant 0924023 (to S.P.B.) and a National Institutes of Health National Research Service Award Postdoctoral Fellowship F32GM096707 (to J.W.W.).
Footnotes
The authors declare no conflict of interest.
Data deposition: Raw spectra have been deposited at the Mass Spectrometry Interactive Virtual Environment (MassIVE) repository, http://proteomics.ucsd.edu/ProteoSAFe/datasets.jsp (nonmodified proteome ID MSV000078444 and phosphoproteome ID MSV000078443).
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1319113110/-/DCSupplemental.
References
- 1. Lockhart DJ, Winzeler EA. Genomics, gene expression and DNA arrays. Nature. 2000;405(6788):827–836. doi: 10.1038/35015701.
- 2. Belmonte MF, et al. Comprehensive developmental profiles of gene activity in regions and subregions of the Arabidopsis seed. Proc Natl Acad Sci USA. 2013;110(5):E435–E444. doi: 10.1073/pnas.1222061110.
- 3. Schwanhäusser B, et al. Global quantification of mammalian gene expression control. Nature. 2011;473(7347):337–342. doi: 10.1038/nature10098.
- 4. Vogel C, et al. Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line. Mol Syst Biol. 2010;6:400. doi: 10.1038/msb.2010.59.
- 5. Ghaemmaghami S, et al. Global analysis of protein expression in yeast. Nature. 2003;425(6959):737–741. doi: 10.1038/nature02046.
- 6. Taniguchi Y, et al. Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science. 2010;329(5991):533–538. doi: 10.1126/science.1188308.
- 7. Vogel C, Marcotte EM. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat Rev Genet. 2012;13(4):227–232. doi: 10.1038/nrg3185.
- 8. Petricka JJ, et al. The protein expression landscape of the Arabidopsis root. Proc Natl Acad Sci USA. 2012;109(18):6811–6818. doi: 10.1073/pnas.1202546109.
- 9. Scanlon MJ, Takacs E. Kernel biology. In: Bennetzen J, Hake S, editors. Handbook of Maize: Its Biology. New York: Springer; 2009. pp. 121–143.
- 10. Sabelli PA, Larkins BA. The development of endosperm in grasses. Plant Physiol. 2009;149(1):14–26. doi: 10.1104/pp.108.129437.
- 11. Consonni G, Gavazzi G, Dolfini S. Genetic analysis as a tool to investigate the molecular mechanisms underlying seed development in maize. Ann Bot (Lond). 2005;96(3):353–362. doi: 10.1093/aob/mci187.
- 12. Pirona R, Hartings H, Lauria M, Rossi V, Motto M. Genetic control of endosperm development and of storage products accumulation in maize seeds. Maydica. 2005;50(3-4):515–530.
- 13. Val LD, Schwartz SH, Kerns MR, Deikman J. Development of a high oil trait for maize. In: Kriz AL, Larkins BA, editors. Biotechnology in Agriculture and Forestry. Vol 63. Berlin: Springer; 2009. pp. 303–323.
- 14. Schnable PS, et al. The B73 maize genome: Complexity, diversity, and dynamics. Science. 2009;326(5956):1112–1115. doi: 10.1126/science.1178534.
- 15. Huttlin EL, et al. A tissue-specific atlas of mouse protein phosphorylation and expression. Cell. 2010;143(7):1174–1189. doi: 10.1016/j.cell.2010.12.001.
- 16. Liu H, Sadygov RG, Yates JR 3rd. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal Chem. 2004;76(14):4193–4201. doi: 10.1021/ac0498563.
- 17. Shen B, et al. Expression of ZmLEC1 and ZmWRI1 increases seed oil production in maize. Plant Physiol. 2010;153(3):980–987. doi: 10.1104/pp.110.157537.
- 18. Bowman VB, Huang V, Huang AH. Expression of lipid body protein gene during maize seed development. Spatial, temporal, and hormonal regulation. J Biol Chem. 1988;263(3):1476–1481.
- 19. Halford NG, Shewry PR. The structure and expression of cereal storage protein genes. In: Olsen O-A, editor. Endosperm, Plant Cell Monographs. Vol 8. Berlin: Springer; 2007. pp. 195–218.
- 20. Huang AHC. Oil bodies and oleosins in seeds. Annu Rev Plant Physiol Plant Mol Biol. 1992;43(1):177–200.
- 21. Serna A, et al. Maize endosperm secretes a novel antifungal protein into adjacent maternal tissue. Plant J. 2001;25(6):687–698. doi: 10.1046/j.1365-313x.2001.01004.x.
- 22. Reyes FC, et al. Delivery of prolamins to the protein storage vacuole in maize aleurone cells. Plant Cell. 2011;23(2):769–784. doi: 10.1105/tpc.110.082156.
- 23. Thimm O, et al. MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. Plant J. 2004;37(6):914–939. doi: 10.1111/j.1365-313x.2004.02016.x.
- 24. Sekhon RS, et al. Genome-wide atlas of transcription during maize development. Plant J. 2011;66(4):553–563. doi: 10.1111/j.1365-313X.2011.04527.x.
- 25. Khan S, Rowe SC, Harmon FG. Coordination of the maize transcriptome by a conserved circadian clock. BMC Plant Biol. 2010;10(1):126. doi: 10.1186/1471-2229-10-126.
- 26. Adams JA. Activation loop phosphorylation and catalysis in protein kinases: Is there functional evidence for the autoinhibitor model? Biochemistry. 2003;42(3):601–607. doi: 10.1021/bi020617o.
- 27. Popescu SC, et al. MAPK target networks in Arabidopsis thaliana revealed using functional protein microarrays. Genes Dev. 2009;23(1):80–92. doi: 10.1101/gad.1740009.
- 28. Pysh LD, Aukerman MJ, Schmidt RJ. OHP1: A maize basic domain/leucine zipper protein that interacts with opaque2. Plant Cell. 1993;5(2):227–236. doi: 10.1105/tpc.5.2.227.
- 29. Zheng P, et al. A phenylalanine in DGAT is a key determinant of oil content and composition in maize. Nat Genet. 2008;40(3):367–372. doi: 10.1038/ng.85.
- 30. Comparot-Moss S, Denyer K. The evolution of the starch biosynthetic pathway in cereals and other grasses. J Exp Bot. 2009;60(9):2481–2492. doi: 10.1093/jxb/erp141.
- 31. Pebay-Peyroula E, et al. Structure of mitochondrial ADP/ATP carrier in complex with carboxyatractyloside. Nature. 2003;426(6962):39–44. doi: 10.1038/nature02056.
- 32. Feng J, et al. Tyrosine phosphorylation by Src within the cavity of the adenine nucleotide translocase 1 regulates ADP/ATP exchange in mitochondria. Am J Physiol Cell Physiol. 2010;298(3):C740–C748. doi: 10.1152/ajpcell.00310.2009.
- 33. Liu F, et al. The amylose extender mutant of maize conditions novel protein-protein interactions between starch biosynthetic enzymes in amyloplasts. J Exp Bot. 2009;60(15):4423–4440. doi: 10.1093/jxb/erp297.
- 34. Tetlow IJ, et al. Protein phosphorylation in amyloplasts regulates starch branching enzyme activity and protein-protein interactions. Plant Cell. 2004;16(3):694–708. doi: 10.1105/tpc.017400.
- 35. Petricka JJ, Benfey PN. Reconstructing regulatory network transitions. Trends Cell Biol. 2011;21(8):442–451. doi: 10.1016/j.tcb.2011.05.001.
- 36. Bily AC, et al. Dehydrodimers of ferulic acid in maize grain pericarp and aleurone: Resistance factors to Fusarium graminearum. Phytopathology. 2003;93(6):712–719. doi: 10.1094/PHYTO.2003.93.6.712.
- 37. de O Buanafina MM. Feruloylation in grasses: Current and future perspectives. Mol Plant. 2009;2(5):861–872. doi: 10.1093/mp/ssp067.
- 38. Browse J. Jasmonate passes muster: A receptor and targets for the defense hormone. Annu Rev Plant Biol. 2009;60(1):183–205. doi: 10.1146/annurev.arplant.043008.092007.
- 39. Kaufmann K, Pajoro A, Angenent GC. Regulation of transcription in plants: Mechanisms controlling developmental switches. Nat Rev Genet. 2010;11(12):830–842. doi: 10.1038/nrg2885.
- 40. Moreno-Risueno MA, Van Norman JM, Benfey PN. Transcriptional switches direct plant organ formation and patterning. Curr Top Dev Biol. 2012;98:229–257. doi: 10.1016/B978-0-12-386499-4.00009-4.
- 41. Ciceri P, et al. Phosphorylation of Opaque2 changes diurnally and impacts its DNA binding activity. Plant Cell. 1997;9(1):97–108. doi: 10.1105/tpc.9.1.97.
- 42. Vert G, Walcher CL, Chory J, Nemhauser JL. Integration of auxin and brassinosteroid pathways by Auxin Response Factor 2. Proc Natl Acad Sci USA. 2008;105(28):9829–9834. doi: 10.1073/pnas.0803996105.
- 43. Baerenfaller K, et al. Systems-based analysis of Arabidopsis leaf growth reveals adaptation to water deficit. Mol Syst Biol. 2012;8:606. doi: 10.1038/msb.2012.39.
- 44. Vernoud V, Hajduch M, Khaled A-S, Depege N, Rogowsky PM. Maize embryogenesis. Maydica. 2005;50(3-4):469–483.
- 45. Yin L-L, Xue H-W. The MADS29 transcription factor regulates the degradation of the nucellus and the nucellar projection during rice seed development. Plant Cell. 2012;24(3):1049–1065. doi: 10.1105/tpc.111.094854.
- 46. Schnable JC, Freeling M. Genes identified by visible mutant phenotypes show increased bias toward one of two subgenomes of maize. PLoS ONE. 2011;6(3):e17855. doi: 10.1371/journal.pone.0017855.
- 47. Landry CR, Levy ED, Michnick SW. Weak functional constraints on phosphoproteomes. Trends Genet. 2009;25(5):193–197. doi: 10.1016/j.tig.2009.03.003.
- 48. Qiao H, et al. Processing and subcellular trafficking of ER-tethered EIN2 control response to ethylene gas. Science. 2012;338(6105):390–393. doi: 10.1126/science.1225974.
- 49. Charest PG, et al. A Ras signaling complex controls the RasC-TORC2 pathway and directed cell migration. Dev Cell. 2010;18(5):737–749. doi: 10.1016/j.devcel.2010.03.017.
- 50. Castellana NE, et al. Discovery and revision of Arabidopsis genes by proteogenomics. Proc Natl Acad Sci USA. 2008;105(52):21034–21038. doi: 10.1073/pnas.0811066106.
- 51. Chalkley RJ, Clauser KR. Modification site localization scoring: Strategies and performance. Mol Cell Proteomics. 2012;11(5):3–14. doi: 10.1074/mcp.R111.015305.