Proceedings of the National Academy of Sciences of the United States of America. 2013 Jan 14;110(5):E435–E444. doi: 10.1073/pnas.1222061110

Comprehensive developmental profiles of gene activity in regions and subregions of the Arabidopsis seed

Mark F Belmonte a,1,2, Ryan C Kirkbride a,1, Sandra L Stone a,3, Julie M Pelletier a, Anhthu Q Bui b,4, Edward C Yeung c, Meryl Hashimoto a, Jiong Fei a, Corey M Harada a, Matthew D Munoz a,5, Brandon H Le b, Gary N Drews d, Siobhan M Brady a,e, Robert B Goldberg b,6, John J Harada a,6
PMCID: PMC3562769  PMID: 23319655

Significance

Seeds are complex structures composed of the embryo, endosperm, and seed coat. Despite their importance for food, fiber, and fuel, the cellular processes that characterize different regions of the seed are not known. We profiled gene activity genome-wide in every organ, tissue, and cell type of Arabidopsis seeds from fertilization through maturity. The resulting mRNA datasets provide unique insights into the cellular processes that occur in understudied seed regions, revealing unexpected overlaps in the functional identities of seed regions and enabling predictions of gene regulatory networks. This dataset is an essential resource for studies of seed biology.

Keywords: laser-capture microdissection, mRNA localization, transcriptome

Abstract

Seeds are complex structures that consist of the embryo, endosperm, and seed-coat regions that are of different ontogenetic origins, and each region can be further divided into morphologically distinct subregions. Despite the importance of seeds for food, fiber, and fuel globally, little is known of the cellular processes that characterize each subregion or how these processes are integrated to permit the coordinated development of the seed. We profiled gene activity genome-wide in every organ, tissue, and cell type of Arabidopsis seeds from fertilization through maturity. The resulting mRNA datasets offer the most comprehensive description of gene activity in seeds with high spatial and temporal resolution, providing unique insights into the function of understudied seed regions. Global comparisons of mRNA populations reveal unexpected overlaps in the functional identities of seed subregions. Analyses of coexpressed gene sets suggest that processes that regulate seed size and filling are coordinated across several subregions. Predictions of gene regulatory networks based on the association of transcription factors with enriched DNA sequence motifs upstream of coexpressed genes identify regulators of seed development. These studies emphasize the utility of these datasets as an essential resource for the study of seed biology.


The significance of seeds is reflected by their relevance to diverse biological areas. Evolutionarily, the ability of flowering plants to make seeds has conferred significant selective advantages, accounting, in part, for their dominance among the Plantae. The seed habit facilitates fertilization in nonaqueous environments, provides protection and nutrients for the developing embryo, and permits the embryo to remain quiescent until conditions are favorable for seedling development (1). Seeds are a key to global food security, because they account for the large majority of calories consumed by humans. An estimated 70–100% more food will need to be produced worldwide by 2050 without an appreciable increase in arable land and despite global climate change (2). A detailed understanding of seed development may enable the design of cogent strategies to enhance seed quality and yield.

The developmental significance of seeds is that they are complex yet elegant structures, consisting of embryo, endosperm, and seed-coat regions that are each divided into subregions (3). The complexity of the seed originates with its precursor, the ovule, which consists of the female gametophyte embedded within integument layers. Seed development is initiated with the fusion of the egg and central cells of the female gametophyte with two sperm cells from the pollen tube. This double fertilization, unique to flowering plants, produces the progenitors of the embryo and endosperm regions of the seed, respectively. Patterning and morphological differentiation occur in the embryo and endosperm regions early in seed development, during the morphogenesis phase. In many plants, including Arabidopsis, the embryo undergoes stereotypic cell-division patterns, differentiating into the embryo proper that becomes the body of the vegetative plant and the suspensor, an ephemeral structure that serves as a conduit between the embryo proper and the seed coat (Fig. 1 A–F). The primary endosperm cell undergoes nuclear but not cell divisions, and nuclei migrate to form three subregions: micropylar, which is nearest the young embryo; peripheral, in the center of the endosperm region; and chalazal, at the pole opposite to the embryo. Cellularization of the endosperm proceeds in a wave-like manner from the micropylar to chalazal end (4). Ovule integument cells divide and differentiate into the distinct cell types of the seed coat that envelop the embryo and endosperm. Late in seed development during the maturation phase, the embryo accumulates storage macromolecules and becomes tolerant of desiccation. Although development of these subregions has been well-characterized morphologically, little is known of the cellular processes that occur in these subregions or how the development of the subregions is coordinated within the context of seed development.

Fig. 1.

Gene activity in Arabidopsis seed regions and subregions throughout development. (A–F) Representation of Arabidopsis seeds at the (A) preglobular stage, (B) globular stage, (C) heart stage, (D) linear cotyledon stage, and (E) mature green stage. (F) Diagram of a seed showing micropylar (MP) and chalazal (CZ) poles. (G) Number of distinct mRNAs detected in seed subregions (colored bars), regions, and seeds (light gray bars) at different stages. Dark gray bars indicate the number of distinct transcription factor mRNAs. Lists of mRNAs and their levels are in Dataset S2. (H) Number of distinct mRNAs that accumulate specifically in a subregion or region at a given stage. Subregion and region-specific mRNAs are listed in Dataset S3. (I) Number of distinct mRNAs that accumulate at a specific stage in a subregion or region. Stage-specific mRNAs are listed in Dataset S3. Abbreviations are given in Table 1.

A key to dissecting seed development is to obtain an integrated understanding of gene activity, and therefore the cellular processes that occur in seed regions throughout development. We previously analyzed genes expressed in developing whole seeds at several developmental stages and identified seed-specific genes and transcription factors, and these data provided clues about temporally regulated processes that occur during seed development (5). Questions remain, however, about the processes that occur specifically in a subregion and the interactions among different regions. A number of recent studies have reported gene activity in specific seed regions at the whole-genome level, such as the embryo, endosperm, and seed coat (reviewed in ref. 6). However, these studies do not enable an integrated understanding of seed development, because most focused on a specific stage of seed development and few reported gene activity in more than one region. Here, we describe gene activity genome-wide in all subregions and regions of seeds of the model plant Arabidopsis, from fertilization through maturity. The temporal and spatial integration of cellular and physiological processes in multiple subregions and stages permits seminal insights into the developmental processes that characterize specific seed regions and the gene regulatory programs that underlie seed development. Use of uniform platforms of subregion isolation and RNA analyses permits direct comparisons of mRNA levels in different subregions and stages, enabling an integrated understanding of seed development.

Results

Spatial and Temporal Resolution of mRNA Profiles During Seed Development.

We profiled mRNA populations from six to seven seed subregions at five stages of development (Fig. 1 A–E and Fig. S1 A–E) to obtain the most comprehensive description of gene activity in seed development, representing 31 combinations of subregions and stages. Laser-capture microdissection (LCM) (Materials and Methods) was used to isolate the embryo proper (EP) and suspensor (SUS) of the embryo region, micropylar (MCE), peripheral (PEN), and chalazal (CZE) subregions of the endosperm region, and the chalazal (CZSC) and distal (SC) seed coat (Fig. S1 F–Q and Dataset S1, Table S1). The subregions were isolated in replicate at the preglobular, globular, and heart stages that collectively represent the morphogenesis phase (Fig. 1 A–C). Subregions isolated at the mature green stage correspond to the maturation phase (Fig. 1E), whereas the linear-cotyledon stage (Fig. 1D) is a transition between the two phases. All subregions and stages and their abbreviations are listed in Table 1. The ephemeral SUS was isolated only at the globular stage because an average of 1,700 captured subregions were needed for each biological replicate (Dataset S1, Table S2). mRNAs in each subregion were detected and quantified using stringent analyses of Affymetrix ATH1 GeneChip hybridization data (Materials and Methods and Dataset S2). These data are available at the Gene Expression Omnibus (GEO) database (www.ncbi.nlm.nih.gov/geo) as series GSE12404 and in an interactive form at http://seedgenenetwork.net. Exhaustive control experiments, including validation of relative mRNA levels using quantitative RT-PCR (qRT-PCR), comparisons of mRNA accumulation patterns with promoter activities, and confirmation that mRNA sequence amplification was unbiased, established that the LCM seed dataset is representative of subregion RNA populations, qualitatively and quantitatively (Fig. S2 and Dataset S1, Tables S3 and S4).

Table 1.

Abbreviations for developmental stages and seed subregions

Abbreviation    Stage or subregion
Stage identifiers
 pg             Preglobular
 g              Globular
 h              Heart
 lc             Linear cotyledon
 mg             Mature green
Subregion identifiers
 EP             Embryo proper
 SUS            Suspensor
 MCE            Micropylar endosperm
 PEN            Peripheral endosperm
 CZE            Chalazal endosperm
 CZSC           Chalazal seed coat
 SC             Distal seed coat

Fig. 1G summarizes gene activity in subregions, regions, and whole seeds at each developmental stage. We detected between ∼6,000 and 13,000 distinct mRNAs in each subregion. The number of mRNAs detected in a region, calculated as the union of mRNAs present in each of its constituent subregions, was not appreciably higher than that of individual subregions. Similarly, the union of mRNAs present in embryo, endosperm, and seed-coat regions, representing whole-seed mRNA number, was not appreciably higher than that of a single region. These results indicate that there is substantial overlap in the genes expressed in regions and subregions of a seed. An average of ∼14,800 distinct mRNAs was detected throughout the seed at each stage, and collectively a minimum of 17,594 distinct mRNAs were detected in at least one subregion and stage of seed development. These are minimum estimates given the stringency of the analyses, the absence of probes for ∼17.5% of Arabidopsis genes on the ATH1 GeneChip, and the lack of consistent detection of mRNAs present at the lowest prevalence levels (fraction of mRNA < 10⁻⁵) in GeneChip experiments (5).
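The region-level counts described above are set unions, which can be sketched in a few lines. The example below is only an illustration with made-up gene identifiers and set sizes, not the study's data. It shows that when subregion sets overlap heavily, the union is only modestly larger than the largest subregion set.

```python
# Hypothetical detected-mRNA sets; identifiers and sizes are invented.
detected = {
    "MCE": {f"gene{i}" for i in range(0, 100)},
    "PEN": {f"gene{i}" for i in range(5, 105)},
    "CZE": {f"gene{i}" for i in range(10, 110)},
}


def region_union(subregion_sets):
    """Return the mRNAs detected in at least one constituent subregion."""
    union = set()
    for genes in subregion_sets.values():
        union |= genes
    return union


endosperm = region_union(detected)
largest_subregion = max(len(genes) for genes in detected.values())

# The union (110 genes) exceeds the largest subregion (100 genes) by only 10%.
print(len(endosperm), largest_subregion)
```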

Global Comparisons of mRNA Populations Reveal Functional Relationships Among Seed Subregions.

To provide a framework to understand how seed development is coordinated, we compared mRNA populations from all 31 combinations of subregions and stages globally using principal component analysis (PCA). The rationale was that subregions identified as being most closely associated in the analysis were likely to share the greatest similarity in overall gene expression and, therefore, cellular functions. Fig. 2 shows that four groups of subregions were identified by the analysis: (i) the EP, SUS, MCE, and PEN subregions at the preglobular to heart/linear cotyledon stage (green); (ii) CZE subregions at these early stages (yellow); (iii) all embryo and endosperm subregions late in development (red); and (iv) all CZSC and SC subregions at all stages (blue).
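As a rough illustration of the idea behind this comparison, the sketch below computes only the first principal component of a small, invented expression matrix by power iteration. It is a minimal sketch under stated assumptions: the data, the function names (`center`, `first_principal_component`, `project`), and the use of power iteration are ours, and the software actually used for the analysis is not specified in this section.

```python
import math


def center(samples):
    """Subtract each gene's (column) mean across all samples."""
    n, p = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in samples]


def first_principal_component(samples, iterations=200):
    """Return (unit direction of PC1, fraction of variance it explains)."""
    centered = center(samples)
    n, p = len(centered), len(centered[0])
    # Sample covariance matrix between genes (p x p).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    vec = [1.0 / (j + 1) for j in range(p)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(vec[a] * cov[a][b] * vec[b]
                     for a in range(p) for b in range(p))
    total_variance = sum(cov[a][a] for a in range(p))
    return vec, eigenvalue / total_variance


def project(samples, direction):
    """Score each sample along the given direction."""
    return [sum(x * d for x, d in zip(row, direction)) for row in center(samples)]


# Invented data: rows are samples, columns are three genes.
# Rows 0-1 mimic one group of subregions, rows 2-3 another.
samples = [[10, 1, 5], [9, 2, 6], [1, 10, 5], [2, 9, 6]]
pc1, explained = first_principal_component(samples)
pc1_scores = project(samples, pc1)
print(round(explained, 3), [round(s, 2) for s in pc1_scores])
```

Samples with similar expression profiles receive similar scores along the component, which is why closely associated subregions fall near one another in a PCA plot.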

Fig. 2.

PCA of seed subregion mRNA populations. PCA plot shows four distinct groups of subregion mRNA populations: subregions of the seed coat region at all stages (blue), EP, SUS, MCE, and PEN subregions at early stages (green), CZE subregions at early stages (yellow), and the EP and all endosperm subregions at the maturation phase (red). Principal components one through three collectively represent 55.6% of the variance in the dataset. Abbreviations are given in Table 1.

The finding that the MCE and PEN early in development shared greater similarity with both embryo subregions than with the CZE was surprising, because the embryo and endosperm arise from separate fertilization events. Late in development, all embryo and endosperm subregions, including the CZE, formed a group that was distinct from the same subregions early in development, suggesting that a major shift in gene expression occurs during the transition from early to late stages. The CZSC and SC at all stages grouped, suggesting that these subregions of the seed-coat region share greater similarity with each other than with embryo and endosperm subregions. Identical results were obtained using hierarchical clustering of mRNA populations from all subregions and stages (Fig. S3A), further supporting the biological significance of the groupings. The same relationships among subregions were obtained when mRNAs at each stage were clustered hierarchically (Fig. S3 B–F). Taken together, the results suggest that the maternally derived seed coat differs fundamentally from the embryo and endosperm that both arise from fertilization events and that embryo and endosperm subregions share a complex relationship that is dependent on spatial and temporal cues.

Diverse Sets of Coexpressed Genes Underlie Seed Development.

Subregion-specific gene sets.

We identified mRNAs that accumulate specifically in a subregion to begin to discover the gene-expression programs that underlie the complex relationship among subregions of the seed. We defined subregion-specific mRNAs as those that accumulated at a statistically significant (q < 0.001, mixed-model ANOVA), fivefold or higher level in one subregion relative to all others at a given stage. The fivefold cutoff value was based on the finding that fold-change values for mRNAs significantly higher in one subregion versus all others ranged from 1.01 to 210, with a median of 4.7. Fig. 1H shows that between 0 (mature-green PEN) and 545 (heart CZE) subregion-specific mRNAs were identified (Dataset S3). Thus, few mRNAs accumulated specifically at the cell or tissue level relative to the total number of mRNAs in each subregion.
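The two criteria in this definition can be written as a small filter. This is a sketch under assumptions. We read "relative to all others" as requiring the fold threshold against every other subregion individually, the q value is taken as an input rather than computed from a mixed-model ANOVA, and the function name and example levels are hypothetical.

```python
def subregion_specific(levels, q_value, fold=5.0, q_cutoff=0.001):
    """Return the subregion in which an mRNA is specific, or None.

    levels:  dict mapping subregion -> mRNA level (linear scale) at one stage.
    q_value: multiple-testing-adjusted significance, supplied by the caller.
    """
    if q_value >= q_cutoff:
        return None
    top = max(levels, key=levels.get)
    others = [value for name, value in levels.items() if name != top]
    # Assumption: the threshold must hold against the highest other subregion,
    # which guarantees it holds against all of them.
    if others and levels[top] >= fold * max(others):
        return top
    return None


# Hypothetical levels at a single stage.
print(subregion_specific({"EP": 100, "SUS": 900, "MCE": 120}, q_value=0.0001))  # SUS
print(subregion_specific({"EP": 100, "SUS": 400, "MCE": 120}, q_value=0.0001))  # None
```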

To determine how subregion-specific mRNAs changed over time, we clustered mRNAs from all subregions and stages to define 47 dominant expression patterns (DPs) (Fig. 3A and Fig. S4) (7) and assigned mRNAs to these patterns on the basis of correlation (Pearson’s correlation > 0.8) (Dataset S4). Several of the coexpressed gene sets consisted of mRNAs that accumulated primarily in one subregion (Fig. 3A and Fig. S4) (DPs 15, 18, 20, 21, 24, 29, and 37), and an average of 70% of mRNAs in these gene sets were subregion-specific (≥fivefold enrichment, q < 0.001). The accumulation patterns show that some mRNAs accumulated predominately in one subregion at several stages. In contrast, each subregion, with the exception of the mature-green PEN, contained between 6 and 135 mRNAs that accumulated subregion-specifically at only one stage. Thus, some genes were expressed subregion-specifically at a single stage, whereas others were expressed specifically over several stages. These temporal variations of subregion-specific expression patterns add complexity to the gene regulatory networks that operate during seed development.
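The correlation-based assignment step can be sketched as follows. The pattern centroids and the profile are invented. Returning only the single best-matching pattern is our assumption, because the text states the correlation threshold but not how ties or multiple matches were handled.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile has no defined correlation; treat as 0
    return sxy / math.sqrt(sxx * syy)


def assign_pattern(profile, patterns, threshold=0.8):
    """Return the best-correlated pattern name if r > threshold, else None."""
    best_name, best_r = None, threshold
    for name, centroid in patterns.items():
        r = pearson(profile, centroid)
        if r > best_r:
            best_name, best_r = name, r
    return best_name


# Hypothetical centroids across five stages (pg, g, h, lc, mg).
patterns = {"early": [5, 4, 1, 0, 0], "late": [0, 0, 0, 1, 5]}
print(assign_pattern([1, 1, 1, 2, 9], patterns))  # late
```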

Fig. 3.

Dominant patterns of gene expression during seed development. (A) Forty-seven DPs were identified using fuzzy k-means clustering of the 50% most variant mRNAs in all seed subregions and stages. Bar graphs depict median mRNA levels in each subregion (colored bars) at each stage (left to right, preglobular to mature-green stage). DPs representing the indicated stage-specific, subregion-specific, region-specific, and CZE-delayed coexpressed gene sets are shown. Remaining DPs are shown in Fig. S4, and mRNAs in all DPs are listed in Dataset S4. The average number of mRNAs in each DP gene set was 103. (B) Heat map of conceptualized CZE-delayed mRNA accumulation patterns in embryo and endosperm subregions. mRNA accumulation in seed coat subregions is not shown.

Stage-specific gene sets.

Because global comparisons of mRNA populations suggested a concerted temporal shift in gene expression during development (Fig. 2), we identified mRNAs in each subregion that accumulated specifically at a single stage. We defined stage-specific mRNAs as those present at a statistically significant (q < 0.001, mixed-model ANOVA), fivefold or higher level at one stage relative to all others in a given subregion (Dataset S3). Fig. 1I shows that relatively few genes were expressed stage-specifically early during seed development. Rather, an average of 71% of the stage-specific mRNAs of each subregion accumulated at the mature-green stage. These results indicate that a major transition in gene activity occurs at the mature-green stage.
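The per-subregion average reported here is a mean of fractions, not a pooled fraction. The sketch below makes the distinction concrete with invented counts. The function name and the numbers are hypothetical and are not taken from Dataset S3.

```python
def mean_fraction_at_stage(counts, stage):
    """Average, over subregions, of the fraction of stage-specific mRNAs
    that fall at `stage`. Subregions with no stage-specific mRNAs are skipped."""
    fractions = []
    for per_stage in counts.values():
        total = sum(per_stage.values())
        if total:
            fractions.append(per_stage.get(stage, 0) / total)
    return sum(fractions) / len(fractions) if fractions else 0.0


# Hypothetical stage-specific mRNA counts per subregion.
counts = {
    "EP": {"pg": 10, "g": 5, "h": 5, "lc": 10, "mg": 70},
    "PEN": {"pg": 20, "g": 0, "h": 0, "lc": 20, "mg": 60},
}
print(mean_fraction_at_stage(counts, "mg"))
```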

Many of the same mature-green stage-specific mRNAs accumulated in all embryo and endosperm subregions. Two coexpressed gene sets, DP 12 and DP 19 (Fig. 3A), consisted of mRNAs that accumulated primarily at the mature-green stage in all embryo and endosperm subregions and to a lesser extent in seed-coat subregions. An average of 58% and 66% of mRNAs in DP 12 and DP 19, respectively, were also designated as mature-green stage-specific mRNAs (≥fivefold enrichment, q < 0.001) in the EP, MCE, PEN, and CZE. By contrast, averages for the CZSC and SC were 16% and 14%, respectively. Taken together, these results suggest a common set of genes is coordinately up-regulated in embryo and endosperm subregions late in seed development.

Roles of Subregion-Specific Genes in Seed Development.

We obtained insight into the cellular processes that characterize each subregion by identifying Gene Ontology (GO) terms and metabolic pathways that were significantly overrepresented (P < 0.001, hypergeometric distribution) (Dataset S3) among subregion-specific mRNAs. Fig. 4A lists GO terms and metabolic pathways for subregion-specific mRNAs that were overrepresented at two or more stages and/or for the DP gene set that exhibits subregion specificity. The analysis confirmed known functions for some subregions and provided a glimpse into the specific functions of other subregions whose roles in seed development were not known.
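For readers unfamiliar with the test, the upper-tail hypergeometric probability used for this kind of overrepresentation analysis can be computed directly. The sketch below is generic. The function name and the gene counts in the example are hypothetical, and any multiple-testing handling the authors may have applied is not shown.

```python
from math import comb


def hypergeom_pvalue(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the term of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(comb(annotated, k) * comb(population - annotated, sample - k)
               for k in range(hits, upper + 1))
    return tail / total


# Hypothetical counts: 20,000 genes on the array, 100 annotated with a term,
# 200 subregion-specific mRNAs, 10 of which carry the term (1 expected by chance).
p = hypergeom_pvalue(20000, 100, 200, 10)
print(p < 0.001)
```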

Fig. 4.

Functions of subregion-specific genes. (A) Heat map showing the P value significance of enrichment of GO terms () or metabolic pathways (*) for subregion-specific mRNAs (Dataset S3). The listed GO terms are for biological processes or cellular components and metabolic pathways that were overrepresented at two or more stages and/or for the DP gene set that exhibits subregion specificity. (B) Histochemical staining of starch granules (arrowheads) in the suspensor. (Scale bar, 3 µm.) (Inset) The location of the enlarged area relative to the embryo proper and suspensor. (C–E) Relative mRNA levels determined in GeneChip experiments (bar plots) and qRT-PCR experiments (line plots) for the indicated genes involved in (C) auxin biosynthesis, (D) auxin response, and (E) polar auxin transport.

Embryo proper.

EP-specific mRNAs were significantly enriched for GO terms known to be associated with patterning events that occur during embryo development, such as determination of bilateral symmetry and abaxial cell fate specification (8).

Seed coat.

SC-specific mRNAs were overrepresented for processes associated with the synthesis of flavonoids that serve to provide protection for seeds against biotic and abiotic stresses (9).

Peripheral endosperm.

At the preglobular and globular stages, PEN-specific mRNAs were overrepresented for GO terms for chloroplast compartments and metabolic pathways related to photosynthesis. These processes were not known to occur in the PEN, but their occurrence was validated functionally as discussed below.

Chalazal seed coat.

mRNAs associated with trehalose and cytidine metabolism were overrepresented in the CZSC. Although trehalose plays an essential role in seed development (10), localization of key enzymes required for its synthesis to the CZSC was not known previously.

Suspensor.

The SUS is an embryonic structure of 8–10 cells, and little is known about its cellular functions (11). Because SUS-specific mRNAs in DP 15 (Fig. S4) that were overrepresented for the GO term, pattern-specification process, encode efflux transporters for the hormone auxin, we analyzed a number of mRNAs involved in auxin signaling. Fig. 4 C–E shows that mRNAs encoding an auxin biosynthetic enzyme, YUC3, an auxin efflux carrier, PIN7, and a transcription factor responsive to auxin, ARF16, were more prevalent in the SUS than in the EP. These results are consistent with the SUS serving as a site of perception of the auxin gradient across the SUS and EP early in seed development that is essential for patterning of the embryo (12, 13). Additionally, they open the possibility that the SUS may serve as an auxin source for the gradient. We also discovered that mRNAs encoding all enzymes involved in starch biosynthesis were detected in the SUS (Dataset S1, Table S5) and demonstrated the presence of a functional pathway by showing that starch grains accumulate in SUS cells (Fig. 4B).

Chalazal endosperm.

The CZE is a unique subregion developmentally. Early in seed development, the CZE possessed the largest number of subregion-specific mRNAs (Fig. 1H). Moreover, the CZE was highly enriched for mRNAs that were only detected during seed development. Seed-specific mRNAs were identified as those present in at least one subregion and stage of seed development but that were not detected in GeneChip experiments in seedlings, leaves, stems, or roots of vegetatively growing plants and flower buds or ovules of reproductively growing plants, as described previously (5). Of at least 1,316 seed-specific mRNAs (Dataset S2), the largest fraction accumulated predominately in the CZE, as shown in Fig. 5A. Additionally, 244 of 788 CZE-specific mRNAs accumulated seed specifically. These results are consistent with our previous report that the promoters of several seed-specific transcription factor genes are active specifically in the CZE (5). Together, these gene expression patterns support the conclusion that the CZE differs fundamentally from other subregions early in seed development, suggesting that novel processes occur in the CZE.
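The definition of seed-specific mRNAs is a set difference: detected somewhere in the seed, and not detected in any of the non-seed tissues examined. A minimal sketch with invented gene identifiers follows. The function name and tissue labels are hypothetical.

```python
def seed_specific(seed_detected, other_tissues):
    """Return mRNAs detected in seed but in none of the other tissues.

    seed_detected: set of mRNAs present in at least one subregion and stage.
    other_tissues: dict mapping tissue name -> set of mRNAs detected there.
    """
    non_seed = set().union(*other_tissues.values())
    return seed_detected - non_seed


# Invented example.
seed = {"a", "b", "c", "d"}
others = {"leaf": {"a", "x"}, "root": {"b"}, "ovule": {"y"}}
print(sorted(seed_specific(seed, others)))  # ['c', 'd']
```

By this definition, the 244 seed-specific mRNAs among the 788 CZE-specific mRNAs correspond to roughly 31% of the CZE-specific set.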

Fig. 5.

CZE is a unique seed subregion developmentally. (A) Hierarchical clustering of seed-specific mRNAs. The largest number of seed-specific mRNAs accumulate primarily in the CZE. (B) Heat map depicting relative levels of mRNAs encoding rate-limiting enzymes for gibberellic acid (GA; GA3ox), abscisic acid (ABA; NCED), and cytokinin (CK; IPT) biosynthesis.

Given the uniqueness of the CZE, we were interested to understand its role in seed development. CZE-specific mRNAs were overrepresented for the GO term ubiquitin-dependent protein catabolism (Fig. 4A), suggesting a potential regulatory role for the CZE. We also showed that mRNAs encoding rate-limiting enzymes for the biosynthesis of the hormones gibberellic acid, abscisic acid, and cytokinin accumulated primarily, but not exclusively, in the CZE (Fig. 5B), consistent with other reports showing that hormone metabolism genes are expressed in the CZE (14–16). Because of the importance of these hormones for seed development, these results suggest that the CZE may serve as a communication hub that integrates developmental processes within the seed.

Integration of Gene Activity and Cellular Function Across Subregions and Stages.

Gene sets temporally regulated in embryo and endosperm subregions.

Although many gene sets expressed subregion- and stage-specifically were identified, we were interested to know the extent to which gene expression was coordinated across distinct subregions and stages during seed development. We identified gene sets that were coexpressed in several subregions and stages. The most prominent coexpression pattern, representing 11 DPs (Fig. 3A and Fig. S4) (DP 1, 6, 7, 8, 9, 14, 22, 26, 30, 33, and 36), involved mRNAs that accumulated in the EP and all endosperm subregions but whose accumulation in the CZE was delayed relative to the other subregions (Fig. 3B).

Functions of CZE-delayed gene sets.

The expression patterns of CZE-delayed gene sets (Fig. 3B) suggest that specific cellular processes that occur in all embryo and endosperm subregions are delayed in the CZE. We identified significantly enriched (P < 0.001, hypergeometric distribution) GO terms and metabolic pathways for the CZE-delayed gene sets and showed that several gene sets are likely to play important roles in seed development (Dataset S4). For example, the DP 26 gene set was overrepresented for GO terms related to cytokinesis and the phragmoplast, a cytoskeletal structure specific to dividing plant cells (Figs. 3A and 6E). Cytokinesis occurs in the embryo throughout the morphogenesis phase early in development. By contrast, the endosperm initially undergoes nuclear but not cell divisions, with subsequent cellularization and cell division occurring sequentially from the micropylar to the chalazal end (4). Thus, the DP 26 coexpression pattern is coincident with the patterns of cytokinesis during seed development.

Fig. 6.

Functions of CZE-delayed coexpressed gene sets. (A) Autofluorescent chloroplasts in the endosperm of a globular-stage seed. (B) Transmission electron microscopy of chloroplasts (arrows) in the PEN. (C) Histochemical staining of starch granules (arrowheads) in the SC, PEN, and EP. (D) Transmission electron micrograph of oil bodies (arrows) in cellularized PEN. (E) Heat map showing P value significance of enrichment of selected GO terms () or metabolic pathways (*) associated with the indicated CZE-delayed gene sets. (F) Heat maps showing mRNAs involved in the RNA-dependent DNA methylation pathway (Upper) and 1,155 probesets corresponding to transposons (Lower) (21). (G) Detection of protein bodies (PB) in the EP and PEN. (H) Heat map showing P values for GO-term enrichment of mature-green stage-specific mRNAs and the indicated DPs at the globular and mature-green stages. (Scale bars: 25 µm in A, 0.5 µm in B, 10 µm in C, 3 µm in D, and 10 µm in G.)

Another CZE-delayed gene set, DP 33, was significantly enriched for GO terms and metabolic pathways related to photosynthesis and carbon metabolism, including chloroplast structure and function, glycolysis, gluconeogenesis, starch biosynthesis, and fatty acid biosynthesis (Figs. 3A and 6E, and Dataset S4). Similarly, DP 8 was associated with glycolysis and gluconeogenesis. These associations were surprising, because photosynthesis and carbon metabolism are known to occur in the embryo, but much less was known about these processes in the endosperm (17). In support of the prediction that starch and lipid biosynthesis occur in the endosperm, qRT-PCR experiments showed that mRNAs encoding all enzymes required for starch and fatty acid biosynthesis were detected in endosperm subregions (Dataset S1, Table S5). The enriched GO terms were predictive of cellular function, because differentiated chloroplasts, starch grains, and lipids were detected in endosperm cells (Fig. 6 A–D). Thus, processes involved in photosynthesis and carbon metabolism that are known to characterize the embryo also occur in all three endosperm subregions, although these processes are delayed in the CZE.

Two other CZE-delayed gene sets, DP 6 and DP 36, were significantly enriched for the DNA methylation GO term (Figs. 3A and 6E, and Dataset S4). DNA methylation in plants is mediated primarily through three pathways involving METHYLTRANSFERASE1 (MET1), CHROMOMETHYLASE3 (CMT3), and the RNA-directed DNA methylation (RdDM) enzymes (18). Of the three, only mRNAs involved in RdDM exhibited a CZE-delayed accumulation pattern (Fig. 6F, Upper). DNA methylation functions primarily to silence transposon activity (19), and the endosperm is notable because transposons become hypomethylated in the central cell of the female gametophyte, the precursor of the endosperm (20). Consistent with these observations, transposon activity was high in all three endosperm subregions early in seed development but decreased late in seed development coincident with the increase in RdDM mRNA levels (Fig. 6F, Lower) (21). The anticorrelation between RdDM mRNAs and transposon activity opens the possibility that DNA methylation is required to silence transposons late in endosperm development.

Reprogramming of Seed Development During the Maturation Phase.

Maturation occurs in embryo and endosperm subregions.

The reprogramming of gene expression that occurs late in seed development appears to be associated with the onset of the maturation phase. Several gene sets composed of mRNAs that accumulated primarily at the mature-green stage, including mature-green stage-specific mRNAs in the EP, MCE, PEN, and CZE (Fig. 1I) and DP 12 and 19 (Fig. 3A), were all significantly enriched (P < 0.001, hypergeometric distribution) for GO terms characteristic of seed maturation, including nutrient reservoir activity, lipid storage, and seed oil body biogenesis, among others (Fig. 6H and Datasets S3 and S4).

The finding that the same sets of maturation-related mRNAs accumulated in the embryo and in all three endosperm subregions was unexpected. Although the embryo undergoes maturation, and lipids are known to accumulate in the endosperm (22), the extent to which the maturation program occurs in endosperm subregions was not known. We showed that cellular structures that accumulate in maturation-phase embryos, such as storage protein bodies (Fig. 6G) and oil bodies (Fig. 6D), were detected in endosperm subregions. These results provide compelling evidence that maturation processes associated with seed filling occur in the embryo and all three endosperm subregions and involve many of the same genes.

Predicted regulatory circuitry controlling genes expressed during the maturation phase.

One key to improving seeds as food is to define the gene regulatory networks that control the accumulation and composition of storage products during the maturation phase. We developed a framework to predict transcriptional modules that link transcription factors with their potential coexpressed target genes (Materials and Methods). The strategy associates DNA sequence motifs that are significantly enriched in the upstream regions of coexpressed genes (P < 0.001, hypergeometric distribution) with coexpressed transcription factors known or predicted to bind the overrepresented motifs. We showed that several transcriptional modules were predicted for seed-coat region-specific genes linking enriched MYB, HD-ZIP IV, and AG DNA sequence motifs with transcription factors known to be involved in seed coat and ovule development, such as MYB5, GL2, SHP2, STK, SHP1, and SEP (Fig. S5A and Dataset S1, Table S6) (23, 24). Thus, the approach identified known developmental regulators.
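The linking step of this strategy, which pairs each enriched motif with coexpressed transcription factors known or predicted to bind it, can be sketched as a simple lookup. The sketch assumes the motif enrichment has already been computed. The function name, the family assignments, and the binding table below are simplified illustrations and do not reproduce the annotation resources used in the study.

```python
def predict_modules(enriched_motifs, coexpressed_tfs, tf_family, family_binds):
    """Link each enriched motif to the coexpressed transcription factors
    whose family is known or predicted to bind it."""
    modules = {}
    for motif in enriched_motifs:
        linked = sorted(
            tf for tf in coexpressed_tfs
            if motif in family_binds.get(tf_family.get(tf), ())
        )
        if linked:
            modules[motif] = linked
    return modules


# Illustrative inputs only; the tables are assumptions made for this sketch.
tf_family = {"bZIP67": "bZIP", "EEL": "bZIP", "MYB5": "MYB"}
family_binds = {"bZIP": {"ABRE", "DPBF1"}, "MYB": {"MYB-motif"}}
modules = predict_modules(
    enriched_motifs=["ABRE", "DPBF1"],
    coexpressed_tfs=["bZIP67", "EEL", "MYB5"],
    tf_family=tf_family,
    family_binds=family_binds,
)
print(modules)
```

In this toy case the MYB factor is coexpressed but is not linked to any module, because no motif it is assumed to bind is enriched in the gene set.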

Fig. 7E and Dataset S1, Table S6 show that the DP 12 gene set, consisting of mRNAs that accumulate predominantly in embryo and endosperm subregions during the maturation phase, defined a transcriptional module in which significantly enriched DNA motifs with a G-box core, including ABRE, ABRE-like, and DPBF1 and -2, were associated with bZIP transcription factors. Many of these overrepresented DNA sequence motifs are known to characterize the promoters of maturation-expressed genes (25), and two of the associated transcription factors, EEL and bZIP67, play roles in regulating maturation genes (26, 27). Furthermore, transcriptional modules derived from coexpressed genes associated with a specific GO term were also identified. These submodules identify potential regulatory circuits that control processes associated with the GO term during the maturation phase. We also generated transcriptional modules for mature-green stage-specific genes expressed in the EP, MCE, PEN, and CZE and showed that there was substantial overlap in the enriched DNA motifs and associated transcription factors identified in each subregion (Fig. 7 A–D). The results suggest that maturation processes are regulated similarly but not identically in the embryo and endosperm subregions.

Fig. 7.

Predicted transcriptional modules regulating maturation in seeds. Each module shows the DNA motifs (green triangles) and GO terms (blue parallelograms) that are significantly overrepresented (P < 0.001, hypergeometric distribution) within the coexpressed gene set (open squircle). Coexpressed transcription factors are represented as circles. Transcriptional modules were predicted for mature-green stage-specific genes in the (A) EP, (B) MCE, (C) CZE, and (D) PEN, and for (E) DP 12. All four mature-green stage subregions possess transcriptional modules in which bZIP transcription factors known to regulate maturation genes, such as bZIP67 (AT3G44460), EEL (AT2G41070), or ABI5 (AT2G36270), are associated with overrepresented G box-like DNA motifs such as ABRE and DPBF1 and -2. Edges in red indicate known interactions between transcription factors and DNA motifs, whereas dashed lines represent predicted interactions. All enriched DNA motifs and GO terms are listed in Dataset S1, Table S6.

Analyses of other coexpressed gene sets identified transcription factors known to play critical roles in seed development. For example, genes expressed primarily in the MCE [Fig. 3A (DP 18), Fig. S5D, and Dataset S1, Table S6] were enriched for the W-box DNA sequence motif that is predicted to associate with MINISEED3, a WRKY transcription factor that is expressed primarily in the MCE and is a regulator of seed size (28). Similarly, CCA1, a transcription factor involved in controlling seed dormancy as a central circadian clock regulator, was associated with the CCA1 binding-site motif that is enriched in the promoter of genes expressed early in endosperm development [Fig. 3A (DP 27), Fig. S5C, and Dataset S1, Table S6] (29). Thus, the transcriptional modules identified key regulators of seed development, suggesting their utility as predictive tools to provide insight into gene regulatory networks controlling seed development.
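Motif enrichment of this kind starts from counting motif occurrences in upstream sequences. The sketch below is an illustration only: it assumes the commonly cited W-box consensus TTGAC(C/T), whereas the motif definitions actually used are given in SI Materials and Methods, and the promoter fragment is invented.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def count_motif(seq, pattern):
    """Count matches of a regex motif on both strands, allowing overlaps.

    Note: a palindromic motif (e.g., the G-box CACGTG) is counted once per
    strand, i.e., twice per site.
    """
    lookahead = re.compile(f"(?={pattern})")  # zero-width, so overlaps count
    seq = seq.upper()
    forward = len(lookahead.findall(seq))
    reverse = len(lookahead.findall(reverse_complement(seq)))
    return forward + reverse

W_BOX = "TTGAC[CT]"                 # assumed consensus, see lead-in
toy_promoter = "AATTGACCGGGTCAATT"  # hypothetical upstream fragment
w_box_sites = count_motif(toy_promoter, W_BOX)
```

In this invented fragment the motif is found once on each strand. Per-gene counts of this sort would then be compared between a coexpressed gene set and the genomic background to decide whether the motif is overrepresented.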

Discussion

We profiled RNA populations in every cell type, tissue, subregion, and region of Arabidopsis seeds throughout development to obtain an integrated understanding of the processes that underlie seed development. A minimum of 17,594 distinct mRNAs were detected in at least one subregion and stage, indicating that at least 60% of the Arabidopsis genome is expressed during seed development. The use of LCM facilitated gene discovery. Compared with our previous analyses of mRNA populations in whole-mount seeds (5), the LCM profiling experiments detected more mRNAs throughout seed development (17,594 vs. 15,577), a higher average number of mRNAs present in seeds at each stage (14,800 vs. 11,780), and a higher number of seed-specific mRNAs (1,316 vs. 289). The dataset provides the most comprehensive description of gene activity during seed development.
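The comparisons in this paragraph reduce to simple arithmetic on the reported counts. The short sketch below restates them. The variable names are ours, and the implied upper bound on the reference gene number is an inference from the "at least 60%" statement, not a figure reported in the text.

```python
# Counts reported in the text: LCM profiling vs. whole-mount seeds (5).
lcm_detected, whole_seed_detected = 17_594, 15_577
lcm_stage_mean, whole_seed_stage_mean = 14_800, 11_780
lcm_seed_specific, whole_seed_specific = 1_316, 289

# Additional mRNAs detected with LCM across seed development.
extra_mrnas = lcm_detected - whole_seed_detected

# Fold differences between the two approaches.
stage_mean_ratio = lcm_stage_mean / whole_seed_stage_mean
seed_specific_ratio = lcm_seed_specific / whole_seed_specific

# If 17,594 mRNAs are at least 60% of the genome, the reference gene
# number can be at most 17,594 / 0.60 (an inference, see lead-in).
max_reference_genes = lcm_detected / 0.60
```

On these numbers, LCM profiling detected 2,017 additional mRNAs and roughly 4.6 times as many seed-specific mRNAs as the whole-mount analysis.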

Coexpressed Gene Sets Inform the Cellular Processes that Underlie Seed Development.

The LCM profiling experiments describe global gene activity in seed subregions that were previously inaccessible to such analyses. Identification of both region-specific mRNAs and subregion-specific mRNAs (Fig. 1H) suggests that subregions within the same region have both shared and distinct functions. For example, mRNAs that accumulate specifically in the seed-coat region are overrepresented for the GO terms flavonoid biosynthetic process and proanthocyanidin biosynthetic process (Dataset S3), suggesting that processes associated with responses to biotic and abiotic stresses occur in both the SC and CZSC (9). By contrast, the CZSC alone is overrepresented for subregion-specific mRNAs associated with trehalose and cytidine metabolism, but these mRNAs are not detected at substantial levels in the SC (Fig. 4A and Dataset S3). Thus, distinct gene sets are involved in controlling region-specific and subregion-specific functions.

Our data suggest that the functional differentiation of subregions within a region occurs through at least two distinct sets of processes. First, genes expressed specifically within a subregion appear to play a significant role in specifying its function. GO terms and metabolic pathways enriched for mRNAs specifically expressed in the EP, SC, and PEN accurately predict functions that are known or that we have shown to occur in these subregions (Figs. 4 and 6). Many subregion-specific genes are active at the preglobular stage, suggesting that subregion identity is specified at the earliest stage of seed development (Figs. 1H and 3A, and Fig. S4). For example, consistent with our finding that many genes are expressed specifically in each endosperm subregion at the preglobular stage, others have shown that the MCE, PEN, and CZE can be differentiated morphologically at the 16-nuclei stage that corresponds to the zygote stage of seed development (30). Second, subregion function is also influenced by temporal differences in the expression of gene sets. CZE-delayed gene sets consist of mRNAs that accumulate later in the CZE than in the other embryo and endosperm subregions (Fig. 3B). Delayed accumulation of these mRNAs accounts, in part, for the finding that the CZE differs from the EP, SUS, MCE, and PEN early in seed development (Fig. 2). Together, these results define gene sets with diverse coexpression patterns that contribute to the overall complexity of seed development.

Gene Sets Associated with the Control of Seed Mass.

Seed mass is positively correlated with seedling survival and, therefore, is a determinant of plant fitness (31). The ability to modulate seed mass has important implications for altering crop yield. Several CZE-delayed gene sets are associated with processes that control seed mass (Figs. 3A and 6). For example, DP 26 is overrepresented for mRNAs involved in cytokinesis. Because the timing of endosperm cellularization is correlated with seed mass (32, 33), with smaller seeds undergoing early cellularization, mechanisms that regulate this gene set are likely to be involved in controlling seed mass. Similarly, a second CZE-delayed gene set, DP 33, is enriched for mRNAs involved in photosynthesis and carbon metabolism, and photosynthetic activity in seeds is correlated with seed biomass (34).

Two other CZE-delayed gene sets, DP 6 and DP 36, which are associated with DNA methylation via the RdDM pathway, may also be related to the control of seed size (Figs. 3A and 6). A potential tie between the RdDM pathway and seed mass is that DNA methylation is implicated in controlling the expression of many imprinted genes in the endosperm, genes that are expressed specifically or preferentially from either maternal or paternal alleles (20, 35). Imprinted genes are predicted and, in one case, shown to be involved in controlling resource allocation to the embryo, a process that is critical in determining seed mass (36, 37). Moreover, imprinted genes are often flanked by transposons, and the methylation status of the transposable element is thought to determine the activity of many imprinted genes (38, 39). Consistent with the interpretation that transposons affect the activity of neighboring genes, NRPD1a, a component of the RdDM pathway, is required to silence genes encoding endosperm-specific transcription factors that are adjacent to transposons (40). Thus, accumulation of RdDM mRNAs late in endosperm development, which appears to correlate with transposon silencing and, presumably, DNA methylation, may also result in the silencing of imprinted genes (Fig. 6F). We estimate that 35 of 47 imprinted genes (39) are down-regulated in the endosperm coincident with the activation of RdDM genes. Because imprinted genes are thought to enable the endosperm to promote early embryo development, silencing of these genes late in seed development may be required to allow induction of the maturation phase in the endosperm.

Coordinated Gene Expression in the Embryo and Endosperm and Its Relevance to the Origin of the Endosperm.

An overriding theme that emerges from this comprehensive developmental profile of mRNA populations is that there is extensive overlap in the gene-expression programs that characterize embryo and endosperm subregions. Although each subregion possesses mRNAs that accumulate specifically in that subregion (Fig. 1H), a global comparison of mRNA populations demonstrated unexpected similarities between embryo and endosperm subregions (Fig. 2). These similarities result, at least in part, from the large number of CZE-delayed genes that are coexpressed in EP, SUS, MCE, and PEN subregions early in seed development (Fig. 3 and Fig. S4). Thus, the same sets of genes that are associated with photosynthesis, carbon metabolism, cytokinesis, and DNA methylation are active in all embryo and endosperm subregions early in seed development, although their activity is delayed in the CZE. Consistent with the extensive overlap in embryo and endosperm gene activity, a large set of genes is up-regulated coordinately in the embryo proper and all endosperm subregions during the transition from the morphogenesis to the maturation phase, and many of the same putative regulators operate in the two seed regions [Fig. 3 (DP 12 and 19) and Fig. 7]. These results suggest that there is substantial coordination of the biological processes that occur in embryo and endosperm regions.

Parallels in embryo and endosperm expression programs have implications for resolving longstanding questions about the evolution of seeds. The endosperm region is unique to angiosperms, and two major hypotheses have been advanced to explain its evolutionary origin (41, 42). One hypothesis is that the endosperm is a modified supernumerary embryo resulting from a second fertilization event that acquired embryo-nourishing functions. The second hypothesis proposes that the endosperm is homologous with the gymnosperm female gametophyte, the development of which is promoted by a second fertilization event. Morphological analysis of endosperm development in basal angiosperm taxa suggests that there are shared features of early embryo and endosperm development, including unequal division and polarization of the initial cell and differential development at the micropylar and chalazal poles (43). Our results demonstrating strong overlap in embryo and endosperm gene activity are consistent with an embryo-based evolutionary origin of the endosperm, although we cannot exclude the possibility of homology between the endosperm and the female gametophyte.

Conclusions

The LCM seed dataset represents a robust resource to support studies of seed biology. We have demonstrated that the dataset can be used to identify sets of genes that are expressed in specific subregions and stages, as well as other gene sets whose expression patterns are integrated across multiple subregions and stages. Thus, these data define coexpressed gene sets with extremely high spatial and temporal resolution. Analysis of these coexpressed genes can accurately predict the biological function of seed subregions, providing fresh insights into the cellular processes that underlie seed development.

A key to understanding seed development is to define the regulatory circuitry that governs these diverse coexpressed gene sets. The dataset serves as a platform to identify the gene regulatory networks that operate during seed development, in part, by identifying the transcription factors that accumulate in spatially restricted locations within the seed at specific stages. We have shown that known regulators of seed development can be identified by the association of overrepresented DNA sequence motifs with coexpressed transcription factors.

The biological stories that we have presented demonstrate the utility of the dataset in uncovering new and significant information about the processes that underlie seed development. Although much remains to be learned about seed biology to obtain the basic information needed for the design of strategies to improve crops for agriculture and enhanced food security, we anticipate that this dataset will be a critical tool in enabling these discoveries.

Materials and Methods

Profiling Subregion mRNA Populations.

Siliques containing seeds of Arabidopsis thaliana (L.) Heynh., ecotype Wassilewskija (Ws-0), were staged according to criteria described previously (5) and detailed in Dataset S1, Table S1. Details about seed collection, histological protocols, and microdissection using a Leica LMD6000 Laser Microdissection System (Leica Microsystems) are given in SI Materials and Methods.

Total RNA was extracted, purified from captured microdissected subregions, analyzed, and amplified as described in SI Materials and Methods. The number of captured subregions per biological replicate and total RNA yields are summarized in Dataset S1, Table S2. Amplified cDNA was hybridized with the Affymetrix GeneChip ATH1 Arabidopsis Genome Array as previously described (5). The effects of amplification on relative RNA levels were determined using qRT-PCR experiments on cDNA amplified from 2 ng of total RNA and cDNA synthesized from 1 µg of total RNA (ThermoScript RT-PCR Systems). Dataset S1, Table S4 shows that linear cDNA amplification did not appreciably alter the representation of mRNAs in the population.

Data Analysis.

GeneChip hybridization data were analyzed and detection calls for mRNAs (present, absent, or marginal) were made as previously described (5). For quantitative comparisons of mRNA levels, signal intensities from all 75 GeneChip experiments were normalized using RMA (44). Correlation between RMA normalized biological replicates averaged 0.96 (Pearson's correlation, Dataset S2) and ranged between 0.93 and 0.98, which was higher than that obtained using other normalization methods. Relative RNA levels were validated with qRT-PCR experiments as previously described (45). DNA sequences and efficiencies of primer pairs used for qRT-PCR experiments and comparison of relative mRNA levels determined in GeneChip and qRT-PCR experiments are given in Dataset S1, Table S3.
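Replicate concordance of the kind reported here is a Pearson correlation between the normalized signal vectors of two biological replicates. A minimal sketch follows. The function name and the signal values are hypothetical, and RMA normalization itself is not reproduced.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical log2 signal intensities for five probe sets in two replicates.
replicate_1 = [6.1, 8.4, 10.2, 7.3, 12.0]
replicate_2 = [6.4, 8.1, 10.5, 7.0, 11.8]
r = pearson(replicate_1, replicate_2)
```

In practice the vectors would hold one value per probe set on the array, and the coefficient would be computed for every pair of replicates from the same subregion and stage.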

Mixed-model linear ANOVA, used to assess the significance of mRNA level comparisons in different samples, and identification of dominant expression patterns were done as previously described (7). Other analyses, including hierarchical clustering, bootstrapping analysis, and PCA, are described in SI Materials and Methods.

Identification of Transcriptional Modules.

The software package, ChipEnrich (46), was modified to identify significantly enriched DNA sequence motifs upstream of coexpressed genes that were associated with transcription factors known or predicted to bind the motifs as described in SI Materials and Methods. Files used to generate the transcriptional modules are in Dataset S1, Table S6.

Microscopy.

Procedures for light and fluorescence microscopy, transmission electron microscopy, and confocal laser scanning microscopy are described in SI Materials and Methods.

Supplementary Material

Supporting Information

Acknowledgments

We thank Samantha Duong, Maichi Phan, Emilia Madejska, Xiaohua Lu, Alexander Olson, and Chen Cheng for technical assistance, and Bob Fischer for comments about the manuscript. This work was supported by grants from the National Science Foundation Plant Genome Program (to R.B.G. and J.J.H.) and a postdoctoral fellowship from the Natural Sciences and Engineering Research Council (to M.F.B.).

Footnotes

The authors declare no conflict of interest.

Data deposition: The data reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE12404).

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1222061110/-/DCSupplemental.

References

1. Steeves TA. The evolution and biological significance of seeds. Can J Bot. 1983;61(12):3550–3560.
2. Godfray HCJ, et al. Food security: The challenge of feeding 9 billion people. Science. 2010;327(5967):812–818. doi: 10.1126/science.1185383.
3. Ohto M, Stone SL, Harada JJ. Genetic control of seed development and seed mass. In: Bradford KJ, Nonogaki H, editors. Seed Development, Dormancy, and Germination. Oxford: Blackwell; 2007. pp. 1–24.
4. Brown RC, Lemmon BE, Nguyen H, Olsen O-A. Development of endosperm in Arabidopsis thaliana. Sex Plant Reprod. 1999;12(1):32–42.
5. Le BH, et al. Global analysis of gene activity during Arabidopsis seed development and identification of seed-specific transcription factors. Proc Natl Acad Sci USA. 2010;107(18):8063–8070. doi: 10.1073/pnas.1003530107.
6. Harada JJ, Pelletier J. Genome-wide analyses of gene activity during seed development. Seed Sci Res. 2012;22(Suppl S1):S15–S22.
7. Brady SM, et al. A high-resolution root spatiotemporal map reveals dominant expression patterns. Science. 2007;318(5851):801–806. doi: 10.1126/science.1146265.
8. Capron A, Chatfield S, Provart N, Berleth T. Embryogenesis: Pattern formation from a single cell. Arabidopsis Book. 2009;7:e0126. doi: 10.1199/tab.0126.
9. Pourcel L, Routaboul J-M, Cheynier V, Lepiniec L, Debeaujon I. Flavonoid oxidation in plants: From biochemical properties to physiological functions. Trends Plant Sci. 2007;12(1):29–36. doi: 10.1016/j.tplants.2006.11.006.
10. Schluepmann H, Paul M. Trehalose metabolites in Arabidopsis-elusive, active and central. Arabidopsis Book. 2009;7:e0122. doi: 10.1199/tab.0122.
11. Kawashima T, Goldberg RB. The suspensor: Not just suspending the embryo. Trends Plant Sci. 2010;15(1):23–30. doi: 10.1016/j.tplants.2009.11.002.
12. Wang JW, et al. Control of root cap formation by MicroRNA-targeted auxin response factors in Arabidopsis. Plant Cell. 2005;17(8):2204–2216. doi: 10.1105/tpc.105.033076.
13. Weijers D, et al. Auxin triggers transient local signaling for cell specification in Arabidopsis embryogenesis. Dev Cell. 2006;10(2):265–270. doi: 10.1016/j.devcel.2005.12.001.
14. Hu JH, et al. Potential sites of bioactive gibberellin production during reproductive growth in Arabidopsis. Plant Cell. 2008;20(2):320–336. doi: 10.1105/tpc.107.057752.
15. Lefebvre V, et al. Functional analysis of Arabidopsis NCED6 and NCED9 genes indicates that ABA synthesized in the endosperm is involved in the induction of seed dormancy. Plant J. 2006;45(3):309–319. doi: 10.1111/j.1365-313X.2005.02622.x.
16. Miyawaki K, Matsumoto-Kitano M, Kakimoto T. Expression of cytokinin biosynthetic isopentenyltransferase genes in Arabidopsis: Tissue specificity and regulation by auxin, cytokinin, and nitrate. Plant J. 2004;37(1):128–138. doi: 10.1046/j.1365-313x.2003.01945.x.
17. Xiang DQ, et al. Genome-wide analysis reveals gene expression and metabolic network dynamics during embryo development in Arabidopsis. Plant Physiol. 2011;156(1):346–356. doi: 10.1104/pp.110.171702.
18. Law JA, Jacobsen SE. Establishing, maintaining and modifying DNA methylation patterns in plants and animals. Nat Rev Genet. 2010;11(3):204–220. doi: 10.1038/nrg2719.
19. Lisch D. Epigenetic regulation of transposable elements in plants. Annu Rev Plant Biol. 2009;60(1):43–66. doi: 10.1146/annurev.arplant.59.032607.092744.
20. Köhler C, Wolff P, Spillane C. Epigenetic mechanisms underlying genomic imprinting in plants. Annu Rev Plant Biol. 2012;63(1):331–352. doi: 10.1146/annurev-arplant-042811-105514.
21. Slotkin RK, et al. Epigenetic reprogramming and small RNA silencing of transposable elements in pollen. Cell. 2009;136(3):461–472. doi: 10.1016/j.cell.2008.12.038.
22. Penfield S, et al. Reserve mobilization in the Arabidopsis endosperm fuels hypocotyl elongation in the dark, is independent of abscisic acid, and requires PHOSPHOENOLPYRUVATE CARBOXYKINASE1. Plant Cell. 2004;16(10):2705–2718. doi: 10.1105/tpc.104.024711.
23. Colombo L, Battaglia R, Kater MM. Arabidopsis ovule development and its evolutionary conservation. Trends Plant Sci. 2008;13(8):444–450. doi: 10.1016/j.tplants.2008.04.011.
24. Li SF, et al. The Arabidopsis MYB5 transcription factor regulates mucilage synthesis, seed coat development, and trichome morphogenesis. Plant Cell. 2009;21(1):72–89. doi: 10.1105/tpc.108.063503.
25. Gutierrez L, Van Wuytswinkel O, Castelain M, Bellini C. Combined networks regulating seed maturation. Trends Plant Sci. 2007;12(7):294–300. doi: 10.1016/j.tplants.2007.06.003.
26. Bensmihen S, et al. The homologous ABI5 and EEL transcription factors function antagonistically to fine-tune gene expression during late embryogenesis. Plant Cell. 2002;14(6):1391–1403. doi: 10.1105/tpc.000869.
27. Yamamoto A, et al. Arabidopsis NF-YB subunits LEC1 and LEC1-LIKE activate transcription by interacting with seed-specific ABRE-binding factors. Plant J. 2009;58(5):843–856. doi: 10.1111/j.1365-313X.2009.03817.x.
28. Luo M, Dennis ES, Berger F, Peacock WJ, Chaudhury A. MINISEED3 (MINI3), a WRKY family gene, and HAIKU2 (IKU2), a leucine-rich repeat (LRR) KINASE gene, are regulators of seed size in Arabidopsis. Proc Natl Acad Sci USA. 2005;102(48):17531–17536. doi: 10.1073/pnas.0508418102.
29. Penfield S, Hall A. A role for multiple circadian clock genes in the response to signals that break seed dormancy in Arabidopsis. Plant Cell. 2009;21(6):1722–1732. doi: 10.1105/tpc.108.064022.
30. Brown RC, Lemmon BE, Nguyen H. Events during the first four rounds of mitosis establish three developmental domains in the syncytial endosperm of Arabidopsis thaliana. Protoplasma. 2003;222(3–4):167–174. doi: 10.1007/s00709-003-0010-x.
31. Westoby M, Jurado E, Leishman M. Comparative evolutionary ecology of seed size. Trends Ecol Evol. 1992;7(11):368–372. doi: 10.1016/0169-5347(92)90006-W.
32. Garcia D, et al. Arabidopsis haiku mutants reveal new controls of seed size by endosperm. Plant Physiol. 2003;131(4):1661–1670. doi: 10.1104/pp.102.018762.
33. Scott RJ, Spielman M, Bailey J, Dickinson HG. Parent-of-origin effects on seed development in Arabidopsis thaliana. Development. 1998;125(17):3329–3341. doi: 10.1242/dev.125.17.3329.
34. Goffman FD, Alonso AP, Schwender J, Shachar-Hill Y, Ohlrogge JB. Light enables a very high efficiency of carbon storage in developing embryos of rapeseed. Plant Physiol. 2005;138(4):2269–2279. doi: 10.1104/pp.105.063628.
35. Raissig MT, Baroux C, Grossniklaus U. Regulation and flexibility of genomic imprinting during seed development. Plant Cell. 2011;23(1):16–26. doi: 10.1105/tpc.110.081018.
36. Costa LM, et al. Maternal control of nutrient allocation in plant seeds by genomic imprinting. Curr Biol. 2012;22(2):160–165. doi: 10.1016/j.cub.2011.11.059.
37. Haig D, Westoby M. Genomic imprinting in the endosperm: Its effect on seed development in crosses between species, and between different ploidies of the same species, and its implications for the evolution of apomixis. Philos T R Soc B. 1991;333(1266):1–13.
38. Gehring M, Bubb KL, Henikoff S. Extensive demethylation of repetitive elements during seed development underlies gene imprinting. Science. 2009;324(5933):1447–1451. doi: 10.1126/science.1171609.
39. Hsieh TF, et al. Genome-wide demethylation of Arabidopsis endosperm. Science. 2009;324(5933):1451–1454. doi: 10.1126/science.1172417.
40. Lu J, Zhang C, Baulcombe DC, Chen ZJ. Maternal siRNAs as regulators of parental genome imbalance and gene expression in endosperm of Arabidopsis seeds. Proc Natl Acad Sci USA. 2012;109(14):5529–5534. doi: 10.1073/pnas.1203094109.
41. Baroux C, Spillane C, Grossniklaus U. Evolutionary origins of the endosperm in flowering plants. Genome Biol. 2002;3(9):reviews1026. doi: 10.1186/gb-2002-3-9-reviews1026.
42. Friedman WE. Developmental and evolutionary hypotheses for the origin of double fertilization and endosperm. C R Acad Sci III. 2001;324(6):559–567. doi: 10.1016/s0764-4469(01)01326-9.
43. Floyd SK, Friedman WE. Evolution of endosperm developmental patterns among basal flowering plants. Int J Plant Sci. 2000;161(S6):S57–S81.
44. Irizarry RA, et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003;4(2):249–264. doi: 10.1093/biostatistics/4.2.249.
45. Yamagishi K, et al. TANMEI/EMB2757 encodes a WD repeat protein required for embryo development in Arabidopsis. Plant Physiol. 2005;139(1):163–173. doi: 10.1104/pp.105.060467.
46. Orlando DA, Brady SM, Koch JD, Dinneny JR, Benfey PN. Manipulating large-scale Arabidopsis microarray expression data: identifying dominant expression patterns and biological process enrichment. Methods Mol Biol. 2009;553:57–77. doi: 10.1007/978-1-60327-563-7_4.

Associated Data

Supplementary Materials

Supporting Information
1222061110_sd01.xls (1.9MB, xls)
1222061110_sd02.xlsx (12.1MB, xlsx)
1222061110_sd03.xlsx (2.4MB, xlsx)
1222061110_sd04.xlsx (2.5MB, xlsx)
