Proceedings of the National Academy of Sciences of the United States of America. 2015 Nov 9;112(47):14734–14739. doi: 10.1073/pnas.1514670112

QQS orphan gene regulates carbon and nitrogen partitioning across species via NF-YC interactions

Ling Li a,b,1, Wenguang Zheng a,b, Yanbing Zhu a, Huaxun Ye a, Buyun Tang a, Zebulun W Arendsee a, Dallas Jones a, Ruoran Li a, Diego Ortiz c, Xuefeng Zhao d, Chuanlong Du e, Dan Nettleton e, M Paul Scott c,f, Maria G Salas-Fernandez c, Yanhai Yin a, Eve Syrkin Wurtele a,b,1
PMCID: PMC4664325  PMID: 26554020

Significance

Each species contains a subset of genes that are uniquely present in that species; the functions and origins of the vast majority of these “orphan genes” are not well-understood. Expression of the Arabidopsis QQS (Qua-Quine Starch; At3g30720) orphan transgene increases the level of protein in soybean lines with high and low protein and acts across flowering plants to increase the protein content of maize and rice. Our results begin to dissect the mechanism of QQS functions by identifying that it binds to the conserved transcription factor nuclear factor Y, subunit C4 (NF-YC4). Increased expression of NF-YC4 in Arabidopsis mimics the effects of increased expression of QQS. The ability to optimize protein productivity in plant-based foods would have far-ranging impacts on world health and sustainability.

Keywords: QQS, NF-YC4, carbon allocation, nitrogen allocation, orphan

Abstract

The allocation of carbon and nitrogen resources to the synthesis of plant proteins, carbohydrates, and lipids is complex and under the control of many genes; much remains to be understood about this process. QQS (Qua-Quine Starch; At3g30720), an orphan gene unique to Arabidopsis thaliana, regulates metabolic processes affecting carbon and nitrogen partitioning among proteins and carbohydrates, modulating leaf and seed composition in Arabidopsis and soybean. Here the universality of QQS function in modulating carbon and nitrogen allocation is exemplified by a series of transgenic experiments. We show that ectopic expression of QQS increases soybean protein independent of the genetic background and original protein content of the cultivar. Furthermore, transgenic QQS expression increases the protein content of maize, a C4 species (a species that uses 4-carbon photosynthesis), and rice, a protein-poor agronomic crop, both highly divergent from Arabidopsis. We determine that QQS protein binds to the transcriptional regulator AtNF-YC4 (Arabidopsis nuclear factor Y, subunit C4). Overexpression of AtNF-YC4 in Arabidopsis mimics the QQS-overexpression phenotype, increasing protein and decreasing starch levels. NF-YC, a component of the NF-Y complex, is conserved across eukaryotes. The NF-YC4 homologs of soybean, rice, and maize also bind to QQS, which provides an explanation of how QQS can act in species where it does not occur endogenously. These findings are, to our knowledge, the first insight into the mechanism of action of QQS in modulating carbon and nitrogen allocation across species. They have major implications for the emergence and function of orphan genes, and identify a nontransgenic strategy for modulating protein levels in crop species, a trait of great agronomic significance.


Carbon and nitrogen allocation to plant proteins, carbohydrates, and lipids is complex and under the control of many genes (1). Most of the enzymes promoting accumulation of these products have been identified; however, much less is understood about the mechanisms that regulate this complex metabolic network (2–8).

Arabidopsis thaliana QQS (Qua-Quine Starch; At3g30720) lacks sequence similarity to any other protein-coding gene, and is considered an orphan gene that has arisen de novo from noncoding sequence since the divergence of A. thaliana from other species (9, 10). Although orphans typically comprise 2–8% of the genome of eukaryotic and prokaryotic species, their origin and biological function have not been well-explored (11–14). Proteins encoded by some orphan genes provide a defensive capability by binding to a receptor of a predator organism (11). In contrast, QQS action is endogenous (3): Overexpression of QQS in Arabidopsis increases total protein content and decreases total starch content in leaf, whereas down-regulation of QQS has the converse effect. The increased starch content in QQS RNAi (RNA interference) mutants is due to increased starch biosynthesis, whereas starch degradation is not affected (10). These data indicate that the QQS orphan gene modulates carbon and nitrogen allocation in Arabidopsis (3, 10).

Expression of the QQS transgene in the soybean cultivar Williams 82 causes a similar shift in composition, increasing protein and decreasing carbohydrate in leaf and seed, even though this organism does not have a polypeptide recognizable as QQS by sequence comparisons (3, 10). Protein gels show no visually detectable differences in accumulation of a particular polypeptide and no changes in the ratios of any of the major amino acids of proteins, indicating that, in soybean seeds, this increase in total protein is not due to a specific increase in storage proteins (3). The ability of QQS to affect soybean composition led us to investigate whether QQS interacts with a molecule or process in Arabidopsis that also exists in other plant species.

Here we report that the Arabidopsis orphan gene QQS increases protein content in multiple soybean lines with different protein levels, and also in rice and maize, two major agronomic crops that are highly divergent from Arabidopsis and are not high protein producers. Furthermore, we identify the protein NF-YC4 (nuclear factor Y, subunit C4) as a QQS-interacting protein, and provide direct evidence that NF-YC4 mediates plant composition. This understanding of how QQS functions uncovers a new node in the network that determines plant partitioning of precious carbon and nitrogen resources, and informs the current view on the evolutionary significance of newly formed (orphan) genes.

Results and Discussion

QQS Functions Across Varieties of Soybean with High or Low Protein Content.

We determined more broadly the effect of ectopic expression of QQS in plants. First, we chose to evaluate whether QQS acts only in Williams 82, a single variety of soybean with a moderate protein content, or whether the effect of QQS is more general. To do this, we introduced the QQS transgene into five elite soybean lines with varied seed protein contents by crossing QQS-expressing (QQS-E) Williams 82 soybeans (as the pollen donor) with each of these lines, selecting QQS-containing plants, and allowing them to self-pollinate. The self-pollinated offspring of the F2 (second filial) and F3 generations of the crosses containing the QQS transgene were similar to their respective segregating siblings in morphology, development, flowering time, seed shape and size, and seed weight per plant (Fig. 1A and Fig. S1). However, QQS transgene expression increased seed protein content in each elite soybean line, regardless of the initial protein level in that line (Fig. 1B and Fig. S1B). Specifically, expression of QQS increased protein content in seeds of the F3 generation by 8–10% in line IA1022, 4–7% in lines IA2079 and IA2102, and 6–10% in lines IA2053 and IA3022, compared with the respective segregating WT (wild type) siblings. Seed oil content was similar or slightly decreased, and seed fiber was decreased (Fig. 1B).
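The percent increases above are relative changes of the QQS-E mean over the mean of the segregating siblings, and the figure captions report mean ± SEM with Student's t test. The following is a minimal sketch of that arithmetic, not the authors' analysis pipeline; the function names and the protein values are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


def percent_increase(treated, control):
    """Relative change of the treated mean over the control mean, in percent."""
    return 100.0 * (mean(treated) - mean(control)) / mean(control)


def student_t(a, b):
    """Two-sample Student's t statistic using the pooled (equal-variance) estimate."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))


# Hypothetical seed protein values (% at 13% moisture), three plants per group.
# These numbers are illustrative only and are NOT data from the paper.
sibling = [34.0, 35.0, 36.0]
qqs_e = [37.0, 38.0, 39.0]

print(f"sibling: {mean(sibling):.2f} ± {sem(sibling):.2f}")
print(f"QQS-E:   {mean(qqs_e):.2f} ± {sem(qqs_e):.2f}")
print(f"percent increase: {percent_increase(qqs_e, sibling):.1f}%")
print(f"t statistic (df = {len(sibling) + len(qqs_e) - 2}): {student_t(qqs_e, sibling):.3f}")
```

A P value would then be read from the t distribution with the stated degrees of freedom; that step is omitted here to keep the sketch free of external dependencies.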

Fig. 1.

Soybean plants expressing the Arabidopsis QQS gene have increased protein content. (A) Visual phenotype, developmental patterns, and seeds of QQS-E mutant lines were similar to those of their segregating sibling controls (line IA2053 crossed to QQS-E Williams 82 is shown, plants of F2 generation and seeds of F3 generation). (B) Seed composition (F3 generation) of protein, oil, and fiber was analyzed by near-infrared spectroscopy (NIRS), based on a 13% moisture content (wt/wt). Seed starch levels were below the detection limit. Percent increase of seed protein compared with their respective segregating siblings is labeled at the top of the mutant bar. (a) Segregating siblings lacking the QQS gene from QQS-E Williams 82 transformants. (b) QQS-E Williams 82 transformants. (c) Segregating sibling controls lacking the QQS gene from crosses of elite lines and QQS-E Williams 82. (d) QQS-E mutants from crosses of elite lines and QQS-E Williams 82. Soybeans were planted in a field, harvested, and processed in a completely randomized design. See Fig. S1 for additional data. All data in bar charts show mean ± SEM; n = 3 replicates from three individual plants of siblings, mutants, or wild-type controls, identified by both herbicide resistance and PCR screening of leaf genomic DNA. Student’s t test was used to compare QQS-E and controls; *P < 0.05, **P < 0.01; P > 0.05 for those with no marks.

Fig. S1.

Field-grown soybean plants expressing the QQS gene of Arabidopsis were similar to their controls. (A) QQS-E elite soybean lines were morphologically similar to their respective segregating sibling controls (F2 generation from the crosses of QQS-E Williams 82 and elite soybean lines), with similar seed size and shape (F3 generation). (B) Seed weight per plant (F3 generation) was indistinguishable, but seeds had increased protein (F3 generation) by protein combustion analysis (percent per dry weight). Percent increase of seed protein compared with respective segregating siblings is labeled at the top of the mutant bar. All data in bar charts show mean ± SEM; n = 3 replicates from three individual plants of mutants, siblings, or wild-type controls, identified by both herbicide resistance and PCR screening of leaf genomic DNA. Student’s t test was used within an ANOVA to compare QQS-E and controls; *P < 0.05, **P < 0.01; P > 0.05 for those with no marks.
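The NIRS values in Fig. 1B are expressed at 13% moisture (wt/wt), whereas the combustion values here are expressed per dry weight, so the two sets of percentages are not directly comparable. A minimal sketch of the standard moisture-basis conversion follows; the function names and the example value are hypothetical and are not taken from the paper.

```python
def to_dry_basis(percent_as_is, moisture_percent):
    """Convert a percentage measured at a given moisture content to a dry-weight basis."""
    if not 0 <= moisture_percent < 100:
        raise ValueError("moisture_percent must be in [0, 100)")
    return percent_as_is / (1 - moisture_percent / 100)


def from_dry_basis(percent_dry, moisture_percent):
    """Convert a dry-weight percentage back to a basis with the given moisture content."""
    if not 0 <= moisture_percent < 100:
        raise ValueError("moisture_percent must be in [0, 100)")
    return percent_dry * (1 - moisture_percent / 100)


# Hypothetical example: 35% protein reported at 13% moisture (illustrative value only).
dry = to_dry_basis(35.0, 13.0)
print(f"{dry:.2f}% protein on a dry-weight basis")
```

Because the conversion is a constant scaling, the percent increase of QQS-E over sibling controls is the same on either basis; only the absolute values shift.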

QQS Functions in Monocot Plants to Regulate Plant Composition.

Mono- and dicotyledonous plants have physiological differences in how seed reserves are partitioned. Typically, dicots accumulate significant amounts of protein and oil in their cotyledons, whereas monocots store starch and protein primarily in the endosperm and a small amount of oil in the germ (15). To test whether QQS could function across angiosperm lineages, we introduced the QQS transgene into the C3 monocot (a monocot that uses only 3-carbon photosynthesis) rice (cultivar Kitaake). Rice is the primary staple crop for over half of the world’s people, with global rice consumption at about 480 million metric tons per year (16). High in starch and gluten-free, rice contains only about 4–5 g of protein per cup serving (7.1–7.9% protein on a dry-weight basis). The vast majority of rice grown globally and in the United States is used for human consumption, generally as a milled grain. Independent transgenic lines of rice plants (Kitaake) expressing the QQS gene were grown in a growth chamber under long-day (LD) conditions of 16 h light/8 h dark. Plants were visually and developmentally similar to nontransgenic lines (Fig. 2A and Fig. S2A); however, starch staining of the T2 (second generation progeny of the transformants) plants from 10 independent events indicated decreased leaf starch. Three independently transformed lines bearing the QQS transgene and their respective WT sibling controls were chosen for further study. Determinations were made of plant height, panicle number per plant, seed number per panicle, seed weight, and leaf and seed composition (Fig. 2 and Fig. S2). Plant height, panicle number per plant, seed number per panicle, and seed weight of the T3 generation were similar for the transgenic and control lines (Fig. S2B). However, expression of the QQS gene decreased starch and increased protein content in leaves and seeds of the T3 and T4 generations (Fig. 2B). Leaf starch was decreased by about 60% at the end of the light cycle, and starch of mature seeds was decreased by about 6%. In contrast, protein was increased by 10% in leaves and 18% in mature seeds.

Fig. 2.

Rice plants expressing the Arabidopsis QQS gene have increased protein content. (A) Visual phenotype and developmental patterns of QQS-E mutant lines and their seeds were similar to those of the controls. Morphology of 30- and 90-d-after-planting (DAP) growth chamber-grown plants of transgenic line QQS-E 3-1 and segregating sibling controls is shown; T4 seeds from three independent transformation events (lines QQS-E 3-1, 30-2, and 33-3) are shown; see also Fig. S2. (B) Leaf and seed starch was decreased, and leaf and seed protein was increased, in rice QQS-E mutants compared with controls. Individual plants of siblings or mutants were identified by both herbicide resistance and PCR screening of leaf genomic DNA. The middle third of the second leaf from the primary tiller of 30-DAP T3 plants and the mature T4 seeds from three independent transformation events (lines QQS-E 3-1, 30-2, and 33-3, 10 plants per line, and a total of 10 sibling plants) were analyzed in triplicate. All leaf harvests were made just before the end of the light period. All data in bar charts show mean ± SEM; n = 3 independent transformation events. Student’s t test was used to compare QQS-E and controls; **P < 0.01.

Fig. S2.

Rice plants expressing the QQS gene of Arabidopsis were similar to their controls. (A) QQS-E rice plants had indistinguishable plant development and plant height (T3 generation). Morphology of growth chamber-grown plants of transgenic line QQS-E 3-1 and segregating sibling controls is shown. (B) Plant height and panicle number per plant of T3 generation, seed number per panicle, and seed weight of T4 generation compared with their controls from three independent transformation events (lines QQS-E 3-1, 30-2, and 33-3) are shown from individual plants of siblings (about 10 plants in total) or mutants (about 20 plants per line), identified by both herbicide resistance and PCR screening of leaf genomic DNA. Rice plants were grown in soil in a growth chamber under LD conditions from fluorescent lamps, 249 ± 7 µmol photons⋅m−2⋅s−1 PAR at 28/25 °C (day/night). Rice seed weight was determined from an average of 100 seeds per plant. All data in bar charts show mean ± SEM; n = 10 (sibling) or 20 (QQS-E) plants. Student’s t test was used within an ANOVA to compare QQS-E and controls; P > 0.05 for all these tests.

We also investigated the effect of QQS transgene expression in another monocot, the C4 species (a monocot that initially uses 4-carbon photosynthesis to concentrate CO2) maize. In these studies, a hybrid, transformable maize line, Hi-II, was used to determine whether QQS alters resource partitioning and effects compositional changes in maize kernels. Maize forms the basis for much of the US agricultural economy; the United States is the major exporter of maize worldwide, and maize comprises ∼11% of all US agricultural exports (www.ers.usda.gov/topics/crops/corn.aspx). In addition, maize is the most widely grown crop in the world, and provides the major staple food for populations in Latin America, eastern and southern Africa, and southern Asia. Thus, understanding the control of metabolic resource partitioning in this species is an important step toward improving the nutritional value and protein content of a crop that billions of people depend upon for sustenance.

Maize plants expressing QQS (T0 generation in the Hi-II background) were backcrossed to an inbred line, B73, avoiding a selfed hybrid Hi-II that would segregate and likely result in variation in seed composition. A major advantage of B73 is that its genome is sequenced; however, B73 is not readily transformable. The resultant QQS-E maize plants had indistinguishable morphology and seed development from their segregating sibling controls (Fig. 3A). However, QQS expression decreased starch content in mature kernels by 2–4%, increased oil content by 3–4%, and increased protein content by 10–20% (Fig. 3B).

Fig. 3.

Phenotypic and compositional characterization of field-grown transgenic maize plants expressing QQS: QQS-E maize had a higher-protein trait. (A) Transgenic QQS-E maize plants and seeds were not distinguishable in morphology from their segregating siblings throughout development. Morphology of 52-DAP field-grown plants and segregating sibling controls of the third backcross generation (BC3) to the inbred line B73 is shown (line QQS-E 6-2). Maize plants were grown in the field; mutants and sibling controls were identified by both herbicide resistance and PCR screening of leaf genomic DNA. The red arrow indicates the leaf tip of a sibling control plant damaged from the herbicide treatment. (B) Kernel composition (BC4 generation) was altered in QQS-E compared with that of the WT siblings. Composition was determined in mature kernels by the nondestructive NIRS method. All data in bar charts show mean ± SEM; n = 6 independent transgenic events from 32 QQS-E plants and 32 sibling plants. Student’s t test was used to compare QQS-E and controls; *P < 0.05, **P < 0.01.

Ectopic Expression of QQS Does Not Alter Leaf Photosynthetic Rate.

QQS down-regulation does not affect starch degradation in Arabidopsis; rather, the increased starch content is due to increased starch biosynthesis (10). The effects of QQS on starch and protein content are similar in leaves and seeds. In leaves, protein may provide a stable capacity to produce more resources, whereas starch is a transiently accumulated carbon resource that is used for the night. One potential mechanism for the observed QQS-induced changes in protein and starch content is an alteration in leaf photosynthetic rate. To evaluate whether this might be the case, we measured the photosynthetic rate in soybean and maize leaves using five plants from each of two independent transformation events and their respective WT siblings grown under LD conditions. Photosynthetic rates of soybean plants were indistinguishable among the QQS-E transgenic lines and their segregating WT siblings (Fig. S3). Also, no significant difference was detected between photosynthetic rates of QQS-E maize plants and their segregating WT siblings (Fig. S3). This indicates that change in carbon and nitrogen allocation as a result of the ectopic expression of QQS is not likely associated with an increase in the rate of photosynthesis.

Fig. S3.

No significant difference was detected in photosynthetic rate in soybean and maize plants expressing QQS compared with their controls. Plants were grown in a completely randomized design in a growth chamber under LD conditions: 16 h light/8 h dark, 27/22 °C for soybean and 28/21 °C for maize. Maize plants were moved to a greenhouse at 50 DAP (March 15) and exposed to natural light. The photosynthetic rate was measured on 39-DAP soybean and 60-DAP maize plants. All data in bar charts show mean ± SEM; n = 10 mutant plants from two independent transformation events or 10 plants of sibling controls, identified by both herbicide resistance and PCR screening of leaf genomic DNA. Photosynthesis was measured by gas exchange using a LI-COR LI-6400XT portable infrared gas analyzer. Student’s t test was used to compare QQS-E and controls; P = 0.85 for the test in soybean; P = 0.75 for the test in maize.

QQS Interacts with NF-YC4.

QQS protein has no sequence similarity to any known functional domains, which might have provided a clue as to the mechanism by which it regulates carbon and nitrogen allocation; likewise, it does not contain any domain of unknown function that is computationally recognizable as conserved among proteins. The universal ability of QQS expression to impart a high-protein phenotype in other plants is consistent with the hypothesis that QQS confers its activity by interacting with a cellular protein or other molecule that is conserved across multiple species. To investigate a possible QQS–protein interaction, we conducted a yeast two-hybrid screening using QQS as bait against a cDNA library from Arabidopsis seedlings. Arabidopsis NF-YC4 (AtNF-YC4; At5g63470) was identified as a potential QQS interactor. The QQS–AtNF-YC4 interaction was supported by yeast two-hybrid pairwise reciprocal studies (Fig. S4 A and B).

Fig. S4.

AtNF-YC4 is a QQS interactor. (A) Reciprocal yeast two-hybrid assays were consistent with a QQS and AtNF-YC4 interaction; AtNF-YC4 on bait had an autosignal; QQS-prey and AtNF-YC4-bait signal was higher than AtNF-YC4-bait. AD, prey; BK, bait. (B) Statistical analysis of quantified expression indicates AtNF-YC4-bait and QQS-prey expression was higher than AtNF-YC4-bait (*P < 0.05), whereas expression of AtNF-YC4-bait and expression of AtNF-YC4-bait and QQS-prey were higher than expression of BK and AD vectors (**P < 0.01). (C) Constructs for BiFC assays: AtNF-YC4 and QQS were fused in-frame to the C terminus of nYFP and cYFP. (D) BiFC assays in tobacco leaves indicate that QQS and AtNF-YC4 interact within the cytosol and the nucleus; no YFP signal was detected without a QQS and AtNF-YC4 interaction. Negative controls included nYFP-AtNF-YC4 and cYFP, nYFP and cYFP-QQS, and nYFP and cYFP. cYFP, C terminus of YFP; nYFP, N terminus of YFP. (Scale bars, 20 µm.)

We further confirmed the physical interaction between QQS and AtNF-YC4 by GST pull-down assays using purified recombinant fusion proteins (Fig. 4A). To identify which region of the AtNF-YC4 protein interacts with QQS, we expressed the fusion proteins containing different segments of AtNF-YC4 and used them in pull-down assays. The binding to QQS appeared to require the region from amino acids 73–162 of AtNF-YC4, corresponding to the location of the AtNF-YC4 histone fold-like domain. To evaluate whether QQS would bind generally to proteins containing histone fold-like domains, we tested QQS with AtNF-YB7 (At2g13570) in a pull-down assay. AtNF-YB7 contains a histone fold-like domain similar to that of AtNF-YC4; it is also predicted to be an AtNF-YC4 interaction partner (17). QQS did not bind to AtNF-YB7 in pull-down assays (Fig. 4A), indicating that features unique to only some histone fold-like domains are likely to confer specificity for QQS binding.

Fig. 4.

QQS interacts with NF-YC4. (A) GST pull-down assays showing an interaction between QQS and AtNF-YC4 proteins. AtNF-YC4 binds to QQS in the region between amino acids 73 and 162. (B) Bimolecular fluorescence complementation assays in tobacco leaves indicate that QQS and AtNF-YC4 interact within the cytosol and nucleus. The interaction of BES1 and MYBL2, which occurs in the nucleus, was used as positive control. See Fig. S4D for additional controls. (Scale bars, 20 µm.) (C) Co-IP assays using protein extracted from 20-DAP Arabidopsis plants stably overexpressing MYC-tagged QQS (QQS–TAP) indicate an interaction between QQS and AtNF-YC4. (D) GST pull-down assays showing an interaction between QQS and the soybean, rice, and maize NF-YC4 homologs.

Coexpression of QQS and AtNF-YC4 in tobacco leaf in vivo detected the presence of the QQS–NF-YC4 protein complex in the cytosol and in the nucleus (Fig. 4B and Fig. S4 C and D), indicating that the QQS–NF-YC4 interaction occurs in a cellular environment. This localization is different from that obtained in QQS–GFP expression studies, which indicate that the bulk of expressed QQS protein is in the cytosol (10). Coimmunoprecipitation (co-IP) assays using protein extracted from transgenic Arabidopsis plants stably overexpressing MYC-tagged QQS (QQS–TAP) further confirmed that NF-YC4 binds with QQS in vivo (Fig. 4C).

NF-YC is conserved across eukaryotes (18, 19), consistent with our hypothesis that QQS acts via a conserved protein. To investigate whether QQS indeed interacts with NF-YC from soybean, rice, and maize, we selected the soybean, rice, and maize proteins most similar to AtNF-YC4 in amino acid sequence by phylogenetic analysis of all histone fold-like domains in the protein-coding genes from genomes of 10 diverse eukaryotic species (Fig. S5 shows this analysis for NF-YC–like histone-fold domains of rice, soybean, and Arabidopsis). We then examined whether QQS physically interacts with the AtNF-YC4 homologs in soybean (Glyma06g17780 and Glyma04g37291), rice (Os3g14669, Os2g07450, and Os6g45640), and maize (GrmZm2g089812). Indeed, GST pull-down assays indicated that QQS interacted with each of these soybean, rice, and maize NF-YC4 homologs (Fig. 4D). These findings are consistent with the concept that expression of QQS confers a high-protein phenotype to soybean, rice, and maize via interaction with NF-YC4. One example of a small, conserved plant peptide binding with a transcriptional regulator to regulate flowering time is the interaction between the florigen FLOWERING LOCUS T (FT) and the bZIP transcription factor FD (20, 21). The QQS–NF-YC4 complex provides an example of a small orphan peptide–transcription factor interaction in plants that can affect metabolic composition.

Fig. S5.

Phylogenetic tree of the NF-YC histone fold-like domains. The NF-YC transcription factor is conserved across eukaryotes, and in plants has evolved into large multifunctional gene families. Analysis was conducted on 10 diverse eukaryotic species. Results are shown here only for the sequences with NF-YC histone fold-like domains from A. thaliana (14 genes), O. sativa (16 genes), G. max (26 genes), and, as an outgroup, C. reinhardtii (three genes), because of data complexity.

NF-YC protein acts in a heterotrimer complex with NF-YA and NF-YB proteins to remodel nuclear architecture and to mediate transcription of a variety of CCAAT box-containing genes, few of which have been defined (18, 22). NF-YBs and NF-YCs contain histone-fold domains, whereas NF-YAs have a conserved 56-amino acid domain that incorporates a CCAAT-binding region (18). In the canonical model, an NF-YC and NF-YB heterodimer is formed in the cytosol, transported to the nucleus, and binds with NF-YA to generate the NF-Y complex (23). This complex interacts with promoters containing CCAAT sequences and with other nuclear factors to regulate transcription and profoundly influence multiple developmental and stress- and disease-associated conditions (17, 19, 24). NF-Y also appears to play a more general role in chromatin relaxation (25). In animals, the NF-Y complex participates in the cell cycle (26). The simplest interpretation of our findings that the QQS–NF-YC4 complex is both cytosolic and nuclear (Fig. 4B) is that the predominantly cytosolic QQS (10) binds NF-YC4 in the cytosol and the resultant QQS–NF-YC4 complex moves into the nucleus; this explanation is conceptually similar to the activation of NF-Y, in which NF-YB is translocated into the nucleus only after binding with NF-YC (23).

Overexpression of NF-YC4 Mimics the QQS-Overexpression Phenotype.

In contrast to the single-copy NF-Y subunit genes reported from animal and fungal species, the regulation and function of NF-Y complexes in land plants are complicated by the presence of gene families of 10 or more members for each of NF-YC, NF-YB, and NF-YA proteins (19, 27). The fact that multiple-member gene families encode the NF-Y subunits of plants has led to significant difficulties in interpretation of the role of each protein. These genes have varied patterns of expression, and are thought to act in different heterotrimer combinations to elicit various physiological and developmental responses, including regulation of shoot meristem development, gametophyte viability, embryogenesis, photosynthesis, and flowering time, as well as drought tolerance (19, 27, 28). For example, Arabidopsis contains up to 13 AtNF-YC genes; AtNF-YC3/AtNF-YC4/AtNF-YC9 triple knockout (KO) mutants show defects in development and flowering (28).

To directly explore whether QQS might increase protein content via its interaction with NF-YC4, we evaluated the ability of the AtNF-YC4 gene itself to affect composition in Arabidopsis. Under LD conditions, Arabidopsis plants overexpressing the AtNF-YC4 transgene looked similar to the WT controls (Fig. 5A). Overexpression of AtNF-YC4 decreased leaf starch accumulation by about 15% and led to a mean increase in leaf protein content of 17% compared with WT controls (Fig. 5 B and C). These data suggest that AtNF-YC4 has a function similar to that of QQS in regulating carbon and nitrogen allocation. Arabidopsis QQS T-DNA KO mutants and AtNF-YC4 T-DNA KO mutants grew similarly to the WT controls, but QQS-KO mutants had increased leaf starch, whereas AtNF-YC4-KO mutants did not show an obvious increase in leaf starch content at the end of the light cycle (Fig. S6A). This lack of an increased-starch phenotype in AtNF-YC4-KO may be due to the redundancy of NF-YCs. Moreover, when OsNF-YC4-1 (Os3g14669) was overexpressed in Arabidopsis under LD conditions, it did not alter plant development or morphology but did decrease starch content by up to 15–20% compared with WT controls (Fig. S6B).

Fig. 5.

Overexpression of AtNF-YC4 increases protein content in Arabidopsis. (A) AtNF-YC4-OE plants had a similar visual phenotype as the WT control plants. (B) Effect of overexpressing AtNF-YC4 on starch by starch staining assay (3). (C) Effect of AtNF-YC4 overexpression on starch and protein content by quantitative determination of starch and protein. Bar charts show mean ± SEM; n = 3 replicates with 3 (for leaf starch) or 10 (for leaf protein) plants each. Student’s t test was used to compare protein and starch composition in WT and AtNF-YC4-OE lines; *P < 0.05, **P < 0.01. See Fig. S6 for information about AtNF-YC4 knockout and NF-YC4-OE mutants. Plants were grown in soil in a growth chamber under LD conditions and harvested at 20 DAP at the end of the light period.

Fig. S6.

NF-YC4’s function in different plant species. (A) Arabidopsis QQS knockout mutants (CS907367) and AtNF-YC4 knockout mutants (SALK_032163) looked similar to WT controls and QQS-KO had increased starch by starch staining, while AtNF-YC4-KO did not show an obvious increase in leaf starch accumulation. (B) Arabidopsis plants overexpressing OsNF-YC4-1 had a visual phenotype similar to that of WT controls but decreased leaf starch content by starch staining and starch quantification. The bar chart shows mean ± SEM; n = 3 replicates with three plants each. Student’s t test was used to compare starch composition in WT and OsNF-YC4-1-OE lines; **P < 0.01. (C) Thirty-day-after-planting rice plants overexpressing AtNF-YC4 or OsNF-YC4-1 looked identical to their segregating sibling control plants but had decreased starch by starch staining. Arabidopsis (T2 generation for OsNF-YC4-1-OE plants) and rice plants (T1 generation) were grown in soil in a growth chamber under LD conditions from fluorescent lamps: 114 ± 4 µmol photons⋅m−2⋅s−1 PAR at 22 °C for Arabidopsis and 249 ± 7 µmol photons⋅m−2⋅s−1 PAR at 28/25 °C (day/night) for rice. Plants for starch staining (five plants per genotype) and starch quantification (three plants per genotype) were harvested at the end of the light period.

The molecular mechanism by which the QQS–NF-YC4 interaction alters carbon and nitrogen allocation remains to be determined. Because NF-YC4-OE (overexpression) increased the total protein composition on its own, a possible explanation is that up-regulation of QQS expression increased NF-YC4 expression. However, the levels of AtNF-YC4, or any of the other 35 AtNF-Y transcripts, were not significantly affected by changes in QQS expression in Arabidopsis QQS-OE or QQS RNAi plants sampled at the end of the light cycle (Fig. S7A); furthermore, the expression pattern of the QQS gene is not correlated with that of any of the NF-Y genes (Fig. S7B). These data do not support that QQS regulates NF-Y at the transcriptional level. The transient expression analysis of localization of the QQS–NF-YC4 complex (Fig. 4B) is consistent with the idea that QQS and NF-YC4 form a complex in the cytosol that is translocated into the nucleus.

Fig. S7.

Expression and coexpression of QQS and NF-Y genes in WT and mutant lines are not correlated in Arabidopsis. (A) There was no significant change in transcript levels for any of the NF-YA, NF-YB, or NF-YC genes in QQS-OE lines or QQS RNAi lines versus the WT controls. Two replicates, each from shoots of six 20-DAP plants grown in soil in pots and harvested at the end of the light cycle under LD conditions, were used for each mutant line, with three replicates for the WT controls. Footnote a: nd, not determined. Footnote b: This QQS RNAi line was confirmed to have decreased QQS expression by real-time PCR (3) and western blot (10); however, some degraded pieces of the RNAi-targeted QQS-coding sequence may have been aligned to the QQS mRNA and counted as reads for the QQS transcript, which may have contributed to this high q value. (B) The relative accumulation of QQS and NF-YC4 transcripts (Pearson’s correlation −0.10) is shown across 1,000 conormalized samples from 80 diverse (Affymetrix platform) experiments (39). The QQS transcript is not significantly correlated (Pearson’s correlations range between 0.30 and −0.28) with any of the 36 NF-YA, NF-YB, or NF-YC transcripts. MetaOmGraph software was used for the correlations and visualization (www.metnetdb.org/MetNet_MetaOmGraph.htm).
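
The coexpression values quoted in this legend are Pearson correlation coefficients computed across samples. As a minimal sketch of that calculation (not the MetaOmGraph implementation), the function below computes the coefficient for two transcripts; the expression values are hypothetical and serve only to show the call.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Deviations of each sample from the transcript's mean expression.
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cross = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical normalized expression values for two transcripts in five samples.
qqs = [8.1, 6.4, 9.0, 7.2, 5.5]
nf_yc4 = [7.0, 7.3, 6.9, 7.1, 7.2]
r = pearson_r(qqs, nf_yc4)
print(f"Pearson r = {r:.2f}")
```

A coefficient near zero, such as the −0.10 reported for QQS and NF-YC4, indicates no linear relationship between the two transcript levels across the sampled conditions.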

Taken together, the data are consistent with a model in which QQS acts in conjunction with AtNF-YC4 to alter the allocation of nitrogen and carbon (Fig. 6). We envision three possible mechanisms. QQS protein might simply facilitate translocation of NF-YC4 from the cytosol to the nucleus. Alternatively, NF-YC4 binding to histones might be enhanced due to QQS interaction. NF-YC4 is known to affect flowering and germination in Arabidopsis (28, 29). Furthermore, the αC helix of the NF-YCs, which is required for dimerization to NF-YB/NF-YC, can also bind MYC and p53 (30). Thus, there is precedent for NF-YCs acting as a target for transcriptional regulatory factors in the nucleus. Finally, NF-YC4 might act simply to shuttle QQS into the nucleus, where QQS would then act on its own.

Fig. 6.

Model of QQS-induced change in composition. The working hypothesis is that cytosolic QQS forms a complex with NF-YC4, moves to the nucleus, and modulates transcription of target genes. The molecular mechanism has yet to be determined. QQS might increase translocation of NF-YC4 to the nucleus and/or the nuclear QQS–NF-YC4 complex might enhance NF-YC4 activity/NF-Y transcription factor complex activity. Alternatively, NF-YC4 might simply shuttle QQS into the nucleus and release it, and QQS would itself have activity. The shifts in expression of targeted genes would result in altered composition with increased protein and decreased starch. Examples of conditions that induce QQS expression (11) are included to reinforce the intimate link between QQS expression and environmental perturbations.

Overexpression of AtNF-YC4 or OsNF-YC4-1 in rice (Kitaake) decreased leaf starch content (Fig. S6C), similar to its effect in Arabidopsis. These rice NF-YC4-OE lines appeared identical to WT control plants (Fig. S6C). These data indicate that NF-YC4 can function across species to regulate plant metabolism and can function independently of QQS to alter plant composition. This is the first report, to our knowledge, showing that NF-YC4 plays a role in regulation of primary metabolism.

It has long been known that composition varies widely with environmental conditions (31), although the underlying mechanisms are little-understood. Despite its dissimilarity to any protein-coding gene model in other (non-A. thaliana) species (10, 11), QQS expression is diversely patterned over time and space (10), is highly responsive to stresses (3, 10, 11, 32), and mediates compositional change (3, 10, 33). It is our hypothesis that many orphan genes arise because they confer a selective advantage to an organism by modulating an internal pathway in response to environmental perturbations, thus facilitating adaptation to a new environment (11). The emergence of “new” orphan genes provides an alternative “explosive” means for evolutionary adaptation. The data presented herein suggest that for QQS this is achieved through QQS interaction with a conserved protein complex. Taken together, the data indicate that QQS may represent a mechanism that promotes evolutionary survival during speciation across changing environmental challenges. These data may have broad implications for how orphan genes can function and affect biological processes.

In addition to the enigma surrounding how metabolic partitioning is controlled, the regulation of seed composition has major practical implications. Low protein intake contributes to mental retardation, stunting, susceptibility to disease, wasting diseases, and sometimes death in hundreds of millions of children each year (34, 35). Because plants provide over 60% of human dietary protein and are the major source of protein for many of the world’s at-risk populations (36), increasing protein content in staple crop plants could greatly impact human health. Furthermore, the negative environmental impact of using animal-based foods as a protein source is substantial: About 100 times more water and 11 times more energy are required to produce an equivalent amount of animal-based protein compared with plant-based protein (37). Breeding efforts often fail to produce high-protein varieties without sacrificing yield or protein quality (1). Historically, it has been difficult to uncouple seed protein content from that of carbohydrate and lipid; furthermore, the partitioning of carbon and nitrogen into storage compounds is considered a multigenic trait (1). Here we reveal a piece of this molecular puzzle. Our data indicate that QQS increases protein accumulation and decreases carbohydrate accumulation at least in part via its interaction with NF-YC4 protein. This QQS effect manifests itself irrespective of total protein content, and extends to rice and maize, two monocot species that diverged from A. thaliana about 150 million y ago (38). Because NF-YC4 is conserved across eukaryotes, the results open a potential nontransgenic strategy to create high-protein crops via targeted mutagenesis approaches such as transcription activator-like effector nucleases (TALENs) or clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein-9 nuclease (Cas9) technologies. 
More broadly, this demonstration that an orphan gene from one species can interact with a metabolic network of another species via a conserved protein suggests previously unidentified approaches to investigating the modulation of complex traits.

Materials and Methods

Constructs and Transformation.

The 35S::QQS and 35S::NF-YC4 fusion constructs were made respectively by cloning the amplified full-length coding sequences of QQS and NF-YC4 into binary vectors pB2GW7 as described (3). The genes are expressed under the control of the constitutive cauliflower mosaic virus (CaMV) 35S promoter. Constructs were introduced into Agrobacterium tumefaciens strain GV3101 for transformation of A. thaliana ecotype Columbia (Col-0) (3, 10, 39), and A. tumefaciens strain EHA101 for transformation of soybean (Glycine max) cultivar Williams 82 (3), rice (Oryza sativa) cultivar Kitaake, and maize (Zea mays) hybrid Hi-II at the Iowa State University (ISU) Plant Transformation Facility (PTF) (agron-www.agron.iastate.edu/ptf). The transformed plants were delivered from the PTF at the T1 generation (soybean) or T0 generation (rice and maize).

Additional Methods.

Plant selection, crossing, and growth, plant harvest and molecular analysis, photosynthesis measurement of soybean and maize plants, yeast two-hybrid, protein expression and purification, pull-down assay, coimmunoprecipitation assay, bimolecular fluorescence complementation assay, phylogenetic inference, and statistical design and analysis are in SI Materials and Methods.

SI Materials and Methods

Plant Selection, Crossing, and Growth.

Soybean mutants and rice and maize transformants expressing QQS (QQS-E) were identified by BAR (bialaphos resistance) selection by leaf painting and subsequent PCR analysis for the presence of the QQS gene as described before (3). The vector-specific primers were pB2GW7-F: 5′-ACATTACAATTTACTATTCTAGTCGA-3′ and pB2GW7-R: 5′-GCGGACTCTAGCATGGCCG-3′; the control gene primers for soybean and rice were 18S-rRNA-F: 5′-GGGCATTCGTATTTCATAGTCAGAG-3′ and 18S-rRNA-R: 5′-CGGTTCTTGATTAATGAAAACATCCT-3′, and for maize were actin-F: 5′-ATTCAGGTGATGGTGTGAGCCACAC-3′ and actin-R: 5′-GCCACCGATCCAGACACTGTACTTCC-3′. The segregating WT plants that were not resistant to herbicide and did not express QQS were used as sibling controls (3). Soybean QQS-E Williams 82 (3) was used as a pollen donor to cross with Iowa elite soybean lines (IA1022, IA2079, IA2102, IA2053, and IA3022; Williams 82 was used as a control) in the field. The F1 generation was grown and self-fertilized in the greenhouse in pots with three plants per pot under a controlled environment of 16 h of light and 8 h of dark (long day; LD) at 27/22 °C (day/night). The F2 generation was planted in the field at Curtiss Farm in Ames, IA. All generations of rice transformants were grown in growth chambers under LD conditions from fluorescence lamps, 249 ± 7 µmol photons⋅m−2⋅s−1 PAR (photosynthetically active radiation) at 28/25 °C (day/night). Rice plants of the T2 and T3 generations and seeds of the T4 generation were analyzed. The QQS-E maize plants were backcrossed (BCed) to the inbred line B73 (B73 was used as a pollen donor). The QQS-E maize T0 generation was grown in the greenhouse in pots with one plant per pot under a controlled environment of LD conditions at 28/21 °C (day/night). Plants were backcrossed to B73 by hand pollination. The BC1 seeds were planted in the field at Woodruff Bennet Farm in Ames, IA.
The backcross process to B73 by hand pollination was conducted and repeated twice in the field at Woodruff Bennet Farm in Ames, IA. The plants of the BC3 generation and seeds of the BC4 generation were analyzed.

The NF-YC4 knockout line of SALK_032163 from the Arabidopsis Biological Resource Center (ABRC; www.arabidopsis.org/servlets/TairObject?type=germplasm&id=4634751) was screened and selected for homozygous mutants. SALK_032163 was previously proven to be a knockout line of AtNF-YC4 by the Holt group (28). Homozygous ABRC QQS knockout line CS907367 (www.arabidopsis.org/servlets/TairObject?type=germplasm&name=WiscDsLoxHs077_09G) was also generated. Arabidopsis was grown in a growth chamber under LD conditions from fluorescence lamps, 114 ± 4 µmol photons⋅m−2⋅s−1 PAR at 22 °C and harvested as described (10).

Plant Harvest and Molecular Analysis.

Leaves of Arabidopsis and rice were harvested at the end of the light period. Composition (starch staining and quantification, and protein content) and RNA-seq (RNA sequencing) were determined from Arabidopsis T2 seedling shoots grown in soil in a growth chamber (3) at 20 d after planting (DAP). For rice, the middle third of the second leaf (adjacent to the flag leaf) of the primary tiller from 30-DAP plants was harvested (T2 and T3 lines for leaf starch staining, and T3 lines for leaf starch quantification and leaf protein content). I2/KI (iodine and potassium iodide) staining for starch used the entire aerial portion from five plants for Arabidopsis (3). For rice, the middle third of the second leaf of the primary tiller from five plants was harvested and cut into small pieces about 1–1.5 cm long and used for starch staining. About 36 QQS-E rice T2 lines from 10 independent transformation events showed decreased starch in leaf by starch staining. Three representative lines (QQS-E 3-1, 30-2, and 33-3), each from an independent transformation event, were selected for further study of leaf starch/protein (T3 generation) and seed starch/protein (T4 generation). For leaf starch and protein quantification, three Arabidopsis shoots and sections of leaves from 10 rice plants were used per replicate. Three replicates were analyzed from each independent line for both Arabidopsis and rice.

Mature seeds were harvested from individual plants of soybean and rice. Ten soybean seeds (F3 generation) and 10 rice seeds (T4 generation) from each individual plant were used per replicate for protein determination by combustion analysis, with three biological replicates. For quantification of leaf protein of Arabidopsis and rice and seed protein of rice and soybean, plant materials were baked at 71 °C. Dry leaf/seed tissue (0.07 g) was used for each determination. The total protein content was measured at the Soil and Plant Analysis Laboratory of the Department of Agronomy at ISU (soiltesting.agron.iastate.edu). Total nitrogen was determined by using a LECO CHN-2000 and converted to protein content (3).
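
Converting total nitrogen to protein content is a single multiplication by a conversion factor. The sketch below assumes the conventional general-purpose factor of 6.25; the factor actually applied follows ref. 3 and is not restated in this section, so the default value here is an assumption, and the function name is hypothetical.

```python
def protein_percent(nitrogen_percent, factor=6.25):
    """Convert total nitrogen (% of dry weight) to crude protein (% of dry weight).

    factor=6.25 is the conventional general-purpose nitrogen-to-protein factor
    and is an ASSUMPTION here; the source defers to ref. 3 for the conversion.
    """
    if nitrogen_percent < 0:
        raise ValueError("nitrogen content cannot be negative")
    return nitrogen_percent * factor


# Hypothetical example: a sample measured at 6.0% nitrogen by combustion.
print(protein_percent(6.0))
```

Keeping the factor as a parameter makes it easy to substitute a crop-specific value if one is preferred.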

For starch quantification of rice mature seeds, about 10 seeds per plant were baked at 65 °C for 5 d and ground into a fine powder with a high-speed homogenizer. Rice seed starch quantification followed the procedures as described (40) with some modifications. Briefly, about 35 mg of rice seed powder was suspended in 10 mL 80% ethanol and incubated at 80–85 °C in a water bath for 30 min and centrifuged at 3,088 × g for 10 min after cooling to room temperature. The supernatant was decanted carefully. The above steps were repeated twice to remove the free sugar and glucose. The pellet was resuspended in 24 mL sterilized water and boiled for 30 min. After cooling to room temperature, the homogenized solution was subjected to starch quantification assay using the D-Glucose Assay Kit (GOPOD format; Megazyme) following the manufacturer’s instructions (3).

Compositional analysis of soybean seeds (protein, oil, and fiber) and maize seeds (protein, oil, and starch) was conducted with near-infrared spectroscopy (NIRS) using a Bruins Grain Analyzer 106110 at the ISU Iowa Grain Quality Laboratory (www.extension.iastate.edu/grain/lab); about 60 g of seeds per plant was tested, with three biological replicates for each independent line.

For Arabidopsis RNA-seq, total RNA from six shoots was extracted for each biological replicate and purified as previously described (10). Two biological replicates were used for each mutant line, and three were used for the WT controls. Both the 200-bp short-insert library and the transcriptome sequencing (an Illumina HiSeq 2000 system with V3 reagent, 91-bp paired-end sequencing) were conducted at BGI Americas (bgi-international.com/us). The cleaned reads were aligned to the reference genome of Arabidopsis thaliana (Phytozome version 8.0; phytozome.jgi.doe.gov/pz/portal.html) using TopHat (41); htseq-count (www-huber.embl.de/HTSeq/doc/overview.html) was used to count the mapped reads. Genes with an average of at least 1.5 uniquely mapped reads across samples were tested for differential expression using the negative binomial QLShrink method described by Lund et al. (42) and implemented in the R package QuasiSeq (cran.r-project.org/web/packages/QuasiSeq). Normalization was accomplished by including the log of the 0.75 quantile of read counts (43) in the log-linear model for the mean of each RNA-seq read count. The P values obtained for each genotype comparison were converted to q values (44) using an approach previously described (45) to estimate the number of genes with true null hypotheses among all genes tested. To control the false discovery rate at ∼5% when identifying genes with significant expression differences between genotypes, q values no larger than 0.05 were considered as evidence of significant differential expression.
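
The filtering, normalization offset, and q-value steps described above can be sketched in a few lines. This is an illustrative outline only: it does not reproduce the QLShrink model fit in QuasiSeq, the quantile uses simple linear interpolation (an assumption about the quantile definition), and the number of true null hypotheses (`m0`) is taken as an input rather than estimated by the method of ref. 45. All function names and the example counts are hypothetical.

```python
import math


def quantile(values, q):
    """Quantile by linear interpolation between order statistics."""
    data = sorted(values)
    pos = (len(data) - 1) * q
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def filter_genes(counts, min_mean=1.5):
    """Keep genes whose mean read count across samples is at least min_mean.

    counts maps a gene identifier to a list of per-sample read counts.
    """
    return {g: c for g, c in counts.items() if sum(c) / len(c) >= min_mean}


def log_upper_quartile_offsets(counts):
    """Per-sample log of the 0.75 quantile of read counts.

    Assumes the 0.75 quantile is positive in every sample.
    """
    n_samples = len(next(iter(counts.values())))
    return [
        math.log(quantile([c[j] for c in counts.values()], 0.75))
        for j in range(n_samples)
    ]


def q_values(p_values, m0=None):
    """Convert P values to q values given m0, the number of true nulls.

    With m0 left at its default (all tests assumed null), the result equals
    the Benjamini-Hochberg adjusted P values, a conservative stand-in.
    """
    m = len(p_values)
    if m0 is None:
        m0 = m
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, m0 * p_values[i] / rank)
        q[i] = running_min
    return q


# Hypothetical counts for three genes in two samples.
example = {"geneA": [4, 8], "geneB": [0, 1], "geneC": [2, 2]}
kept = filter_genes(example)
offsets = log_upper_quartile_offsets(kept)
significant = [q <= 0.05 for q in q_values([0.01, 0.04, 0.03, 0.5])]
```

Supplying an estimate of `m0` smaller than the number of tests lowers every q value proportionally, which is why estimating the number of true nulls gives more power than assuming all tests are null.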

Photosynthesis Measurements of Soybean and Maize Plants.

Carbon assimilation rate was measured by a gas-exchange method. Soybean and maize plants were grown in a randomized arrangement in pots (one plant per pot) in growth chambers under LD conditions at a photon flux density of 300 µmol⋅m−2⋅s−1 (provided by fluorescence lamps) at 27/22 °C for soybean and 28/21 °C for maize (day/night). Ten QQS-E Williams 82 transformants and 10 segregating siblings of the T4 generation, as well as 10 maize QQS-E mutants and 10 segregating siblings of the BC4 generation, were identified by BAR selection and PCR screening of leaf genomic DNA.

Gas-exchange values were collected from 39-DAP plants from 11:00 AM to 2:00 PM within the light period, using LI-COR LI-6400XT portable infrared gas analyzers. The light intensity was 300 µmol photons⋅m−2⋅s−1 PAR, with a reference CO2 concentration of 400 µmol CO2⋅mol−1, 50% humidity, 26 °C leaf temperature, and flow rate of 300 μmol⋅s−1.

At 50 DAP (March 15, Ames, IA), maize plants were moved to the greenhouse to prevent light damage from the fluorescence lamps. In the greenhouse, the plants received natural lighting. The temperature was set at 28/21 °C (day/night), and maximum PAR measured was 1,200 µmol photons⋅m−2⋅s−1. Gas-exchange values were collected from 60-DAP plants from 2:00 to 3:30 PM, with LI-COR LI-6400XT portable infrared gas analyzers. The light intensity was set to 1,200 µmol photons⋅m−2⋅s−1 PAR, with 50% humidity, 25 °C leaf temperature, and flow rate of 300 μmol⋅s−1.

Yeast Two-Hybrid.

For initial screens, Matchmaker System 3 (Clontech) was used to identify QQS-interacting proteins using an Arabidopsis Columbia cDNA library constructed with 3-d-old etiolated seedlings (www.arabidopsis.org/servlets/TairObject?type=library&id=23). For reciprocal yeast two-hybrid assays, QQS and AtNF-YC4 were cloned into pGBKT7 (bait vector) and pGADT7 (prey vector), respectively (Clontech).

Protein Expression and Purification.

Escherichia coli transformants containing the AtNF-YC4 gene (or other NF-Y genes or gene fragments) fused with a His-MBP (maltose-binding protein) tag on pDEST-His-MBP were grown at 37 °C and induced by 0.75 mM isopropyl-β-d-thiogalactopyranoside (IPTG) at 12 °C for 20 h. His-MBP–tagged protein was extracted and purified using Ni Sepharose 6 Fast Flow (GE Healthcare Life Sciences) affinity chromatography under native conditions at 4 °C. The cell pellet was resuspended in 15 mL binding buffer (50 mM NaH2PO4, pH 8.0, 500 mM NaCl, 30 mM imidazole). The cells were lysed by intermittent sonication. After centrifugation at 10,000 × g for 20 min, the resins were added to the cleared lysate. After shaking at 4 °C for 30 min, the lysate/resin mixture was loaded onto a column. The resins were washed with wash buffer (50 mM NaH2PO4, pH 8.0, 500 mM NaCl, 60 mM imidazole). The binding protein was eluted off the column with elution buffer (50 mM NaH2PO4, pH 8.0, 500 mM NaCl, 250 mM imidazole). The eluted proteins were dialyzed twice against phosphate-buffered saline (PBS) (140 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, pH 7.3). E. coli transformants expressing the QQS gene fused with GST on pGEX-2T vector were grown under the same conditions as above. Purification of GST-tagged QQS was conducted using glutathione Sepharose 4B (GE Healthcare Life Sciences) affinity chromatography under native conditions at 4 °C. The cell pellet was resuspended in 15 mL PBS buffer, and cells were lysed by intermittent sonication. Similar steps were performed as above to load the lysate/resin mixture onto a column. The resins were washed with PBS. The binding protein was eluted off the column with elution buffer (50 mM Tris⋅HCl, 10 mM reduced glutathione, 5 mM DTT, pH 8.0). The eluted proteins were dialyzed twice against PBS buffer. The concentration of purified protein was determined using a BCA Protein Assay Kit (Thermo Fisher Scientific) and BSA as the standard.

Pull-Down Assay.

Purified His-MBP–tagged NF-Y or NF-Y fragment protein (5 μg) was mixed with bead-immobilized GST-QQS fusion protein (10 μg) in 1 mL PBS buffer containing 1% nonyl phenoxypolyethoxylethanol (NP)-40, 1 mM DTT, and 0.5 μg/μL BSA. The mixture was incubated at 4 °C for 2 h. GST protein immobilized on beads was used in incubations with His-MBP–tagged proteins as a negative control. The beads were recovered by centrifugation, washed six times with 1 mL PBS buffer (GE Healthcare Life Sciences), and resuspended in 50 μL SDS-sample buffer, boiled in a water bath for 10 min, and centrifuged at 369 × g for 1 min. Fifteen microliters of the resultant supernatant was fractionated by SDS/PAGE on a 12% gel, followed by immunoblotting and analysis with rabbit antiserum against MBP (New England Biolabs).

Bimolecular Fluorescence Complementation Assays.

For bimolecular fluorescence complementation (BiFC) assays, AtNF-YC4 and QQS were fused in-frame to the C terminus of nYFP and cYFP, respectively (46). The YFP activity from the interaction of BES1 and MYBL2 was used as a positive control for nuclear localization (47, 48). Negative controls included nYFP-AtNF-YC4 and cYFP, nYFP and cYFP-QQS, and nYFP and cYFP. Agrobacterium tumefaciens strain GV3101 transformed with each of the five combinations of constructs was coinfiltrated into Nicotiana tabacum with three independent injections. The reconstituted YFP signal was observed 48 h after infiltration under a Zeiss Axioplan 2 fluorescence microscope at the ISU Microscopy and NanoImaging Facility (www.microscopy.biotech.iastate.edu/).

Coimmunoprecipitation Assay.

Co-IP assay was performed as previously described (49) with minor modifications as follows. Arabidopsis seedlings expressing the MYC-tagged QQS transgene were homogenized in protein lysis buffer (1 mM EDTA, 10% glycerol, 75 mM NaCl, 0.05% SDS, 100 mM Tris⋅HCl, pH 7.4, 0.1% Triton X-100, 1× complete mixture protease inhibitors). After protein extraction, anti-MYC antibody was added to total proteins. After incubation with gentle mixing for 1 h at 4 °C, 200 μL 50% protein A beads (Trisacryl immobilized protein; A-20338; Thermo Fisher Scientific) were added, and the mixture was incubated for 1 h. The mixture was centrifuged at 369 × g for 1 min, and the supernatant was removed. The precipitated beads were washed at least four times with protein extraction buffer (1 mM EDTA, 10% glycerol, 75 mM NaCl, 0.05% SDS, 100 mM Tris·HCl, pH 7.4) and eluted by boiling for 5 min in 2× SDS protein-loading buffer. Anti–NF-YC antibody (ab55799; Abcam) was used to detect AtNF-YC4 in the protein extracts from Arabidopsis seedlings in this experiment.

Phylogenetic Inference.

Sequences were selected as potential NF-Y genes with an HMMER 3.0 (50) search (hmmsearch) of the PF00808.18 PFam domain (histone fold-like domain) against the protein sequences of Glycine max, Oryza sativa, A. thaliana, Chlamydomonas reinhardtii, Zea mays, Homo sapiens, Mus musculus, Danio rerio, Saccharomyces cerevisiae, and Dictyostelium discoideum. The trees were built using the PhyML package (51) (with parameters “-a e -f m”) and visualized in Archaeopteryx (sites.google.com/site/cmzmasek/home/software/archaeopteryx).
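
The two command-line steps named above can be assembled as follows. This is a sketch under stated assumptions: the `hmmsearch` and `phyml` executables are assumed to be on the PATH, all file names are hypothetical, and the alignment step that must sit between the search and the tree building is not described in this section and is therefore not shown. Only the PhyML parameters `-a e -f m` come from the text; `--tblout`, `-i`, and `-d aa` are standard options added here for illustration.

```python
import shlex


def hmmsearch_command(hmm_file, protein_fasta, table_out):
    """Build an hmmsearch call that scans a proteome with one profile HMM.

    --tblout writes a per-sequence hit table that is easy to parse later.
    """
    return ["hmmsearch", "--tblout", table_out, hmm_file, protein_fasta]


def phyml_command(alignment_phylip):
    """Build a PhyML call for a protein alignment in PHYLIP format.

    '-a e' estimates the gamma shape parameter and '-f m' uses the
    amino acid frequencies of the substitution model, as stated in the text.
    """
    return ["phyml", "-i", alignment_phylip, "-d", "aa", "-a", "e", "-f", "m"]


# Hypothetical file names, printed as shell commands rather than executed.
print(shlex.join(hmmsearch_command("PF00808.hmm", "proteins.fasta", "hits.tbl")))
print(shlex.join(phyml_command("nfy_alignment.phy")))
```

Returning argument lists keeps the commands safe to pass to `subprocess.run` without shell quoting issues.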

Statistical Design and Analysis.

Plants were grown, collected, and analyzed in a completely randomized design. A minimum of three biological determinations from each independent transgenic line and each control was used for qualitative and quantitative analyses of composition. For compositional analyses, plant samples were assigned randomized numbers and submitted with no genotype/identity identification.
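
Blinding samples before submission amounts to assigning each one a random number and keeping the key private. The sketch below shows one way to do this; the function name, the sample labels, and the use of a seed for reproducibility are all illustrative assumptions and not a description of the procedure actually used.

```python
import random


def blind_samples(sample_labels, seed=None):
    """Assign each sample a unique random code and return the private key.

    The returned dict maps code -> original label. Only the codes are
    submitted for analysis; the key is kept by the experimenter.
    """
    rng = random.Random(seed)
    codes = list(range(1, len(sample_labels) + 1))
    rng.shuffle(codes)
    return dict(zip(codes, sample_labels))


# Hypothetical sample labels.
labels = ["QQS-E rep 1", "QQS-E rep 2", "sibling control rep 1", "sibling control rep 2"]
key = blind_samples(labels, seed=1)
submitted_codes = sorted(key)  # what the analyst sees: codes only
```

After the compositional results are returned, the key is used to map each code back to its genotype.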

Data are presented as mean ± SEM. Independent samples were compared using Student’s t test (two-tailed). These t tests were conducted as part of a one-way analysis of variance (ANOVA) for analyses involving more than two groups (in Fig. 1B and Figs. S1B and S2B). P < 0.05 was considered significant (*); P < 0.01 was considered very significant (**).
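
For a two-group comparison, the summary statistics and test statistic described above can be computed as in the following sketch. It covers only the simple two-sample case with pooled variance; for analyses with more than two groups, the variance would be pooled across all groups within the ANOVA, which is not shown. The two-tailed P value is obtained from the t distribution with the returned degrees of freedom (for example with `scipy.stats.ttest_ind`, not used here to keep the example dependency-free). The data values are hypothetical.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for one group."""
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem


def student_t(a, b):
    """Pooled-variance Student's t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / df
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff, df


# Hypothetical protein measurements (% dry weight), three replicates per group.
control = [30.1, 29.8, 30.4]
transgenic = [33.0, 32.6, 33.5]
m, sem = mean_sem(transgenic)
t, df = student_t(transgenic, control)
print(f"transgenic: {m:.2f} ± {sem:.2f}; t = {t:.2f}, df = {df}")
```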

Acknowledgments

We thank Asheesh Singh, Mark Westgate, Walter Fehr, and Randy Shoemaker for helpful advice on soybean genetics and breeding; Walter Fehr for the elite soybean lines; Grace Welke for help with soybean crossing; Diane Luth, Marcy Main, Bronwyn Frame, and Kan Wang for introducing QQS into soybean, rice, and corn; Kent Berns for field management; Charles Hurburgh and Glen Rippke for near-infrared spectroscopy analysis of soybean and maize seed composition; and Jack Horner and Randall Den Adel for help using the microscopy equipment. We are grateful to Yan Xiong, Mark Stitt, and Basil Nikolau for helpful discussions. This research was supported by the National Science Foundation (MCB-0951170 to E.S.W. and L.L.; IOS-1257631 to Y.Y.), United Soybean Board (2287 to L.L.), ISU Research Foundation (L.L.), and ISU Center for Metabolic Biology (E.S.W.).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1514670112/-/DCSupplemental.

References

  • 1.Wenefrida I, Utomo HS, Linscombe SD. Mutational breeding and genetic engineering in the development of high grain protein content. J Agric Food Chem. 2013;61(48):11702–11710. doi: 10.1021/jf4016812. [DOI] [PubMed] [Google Scholar]
  • 2.Zhang M-Z, et al. Molecular insights into how a deficiency of amylose affects carbon allocation—Carbohydrate and oil analyses and gene expression profiling in the seeds of a rice waxy mutant. BMC Plant Biol. 2012;12:230. doi: 10.1186/1471-2229-12-230. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Li L, Wurtele ES. The QQS orphan gene of Arabidopsis modulates carbon and nitrogen allocation in soybean. Plant Biotechnol J. 2015;13(2):177–187. doi: 10.1111/pbi.12238. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Stitt M, Lunn J, Usadel B. Arabidopsis and primary photosynthetic metabolism—More than the icing on the cake. Plant J. 2010;61(6):1067–1091. doi: 10.1111/j.1365-313X.2010.04142.x. [DOI] [PubMed] [Google Scholar]
  • 5.Angelovici R, et al. Deciphering transcriptional and metabolic networks associated with lysine metabolism during Arabidopsis seed development. Plant Physiol. 2009;151(4):2058–2072. doi: 10.1104/pp.109.145631. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Thum KE, et al. An integrated genetic, genomic and systems approach defines gene networks regulated by the interaction of light and carbon signaling pathways in Arabidopsis. BMC Syst Biol. 2008;2:31. doi: 10.1186/1752-0509-2-31. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Li L, et al. A systems biology approach toward understanding seed composition in soybean. BMC Genomics. 2015;16(Suppl 3):S9. doi: 10.1186/1471-2164-16-S3-S9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Gutiérrez RA, et al. Qualitative network models and genome-wide expression data define carbon/nitrogen-responsive molecular machines in Arabidopsis. Genome Biol. 2007;8:R7. doi: 10.1186/gb-2007-8-1-r7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Silveira AB, et al. Extensive natural epigenetic variation at a de novo originated gene. PLoS Genet. 2013;9(4):e1003437. doi: 10.1371/journal.pgen.1003437. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Li L, et al. Identification of the novel protein QQS as a component of the starch metabolic network in Arabidopsis leaves. Plant J. 2009;58(3):485–498. doi: 10.1111/j.1365-313X.2009.03793.x. [DOI] [PubMed] [Google Scholar]
  • 11.Arendsee ZW, Li L, Wurtele ES. Coming of age: Orphan genes in plants. Trends Plant Sci. 2014;19(11):698–708. doi: 10.1016/j.tplants.2014.07.003. [DOI] [PubMed] [Google Scholar]
  • 12.Fischer D, Eisenberg D. Finding families for genomic ORFans. Bioinformatics. 1999;15(9):759–762. doi: 10.1093/bioinformatics/15.9.759. [DOI] [PubMed] [Google Scholar]
  • 13.Carvunis AR, et al. Proto-genes and de novo gene birth. Nature. 2012;487(7407):370–374. doi: 10.1038/nature11184. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Gollery M, Harper J, Cushman J, Mittler T, Mittler R. POFs: What we don’t know can hurt us. Trends Plant Sci. 2007;12(11):492–496. doi: 10.1016/j.tplants.2007.08.018. [DOI] [PubMed] [Google Scholar]
  • 15.Olsen OA. Endosperm: Developmental and Molecular Biology. Springer; Berlin: 2007. [Google Scholar]
  • 16.Muthayya S, Sugimoto JD, Montgomery S, Maberly GF. An overview of global rice production, supply, trade, and consumption. Ann N Y Acad Sci. 2014;1324:7–14. doi: 10.1111/nyas.12540. [DOI] [PubMed] [Google Scholar]
  • 17.Hackenberg D, et al. Studies on differential nuclear translocation mechanism and assembly of the three subunits of the Arabidopsis thaliana transcription factor NF-Y. Mol Plant. 2012;5(4):876–888. doi: 10.1093/mp/ssr107. [DOI] [PubMed] [Google Scholar]
  • 18.Nardini M, et al. Sequence-specific transcription factor NF-Y displays histone-like DNA binding and H2B-like ubiquitination. Cell. 2013;152(1-2):132–143. doi: 10.1016/j.cell.2012.11.047. [DOI] [PubMed] [Google Scholar]
  • 19.Laloum T, De Mita S, Gamas P, Baudin M, Niebel A. CCAAT-box binding transcription factors in plants: Y so many? Trends Plant Sci. 2013;18(3):157–166. doi: 10.1016/j.tplants.2012.07.004. [DOI] [PubMed] [Google Scholar]
  • 20.Taoka K, et al. 14-3-3 proteins act as intracellular receptors for rice Hd3a florigen. Nature. 2011;476(7360):332–335. doi: 10.1038/nature10272. [DOI] [PubMed] [Google Scholar]
  • 21.Wigge PA, et al. Integration of spatial and temporal information during floral induction in Arabidopsis. Science. 2005;309(5737):1056–1059. doi: 10.1126/science.1114358. [DOI] [PubMed] [Google Scholar]
  • 22.Rípodas C, et al. Transcriptional regulators of legume-rhizobia symbiosis: Nuclear factors Ys and GRAS are two for tango. Plant Signal Behav. 2014;9(5):e28847. doi: 10.4161/psb.28847. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Kahle J, Baake M, Doenecke D, Albig W. Subunits of the heterotrimeric transcription factor NF-Y are imported into the nucleus by distinct pathways involving importin beta and importin 13. Mol Cell Biol. 2005;25(13):5339–5354. doi: 10.1128/MCB.25.13.5339-5354.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Ceribelli M, et al. The histone-like NF-Y is a bifunctional transcription factor. Mol Cell Biol. 2008;28(6):2047–2058. doi: 10.1128/MCB.01861-07. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Oldfield AJ, et al. Histone-fold domain protein NF-Y promotes chromatin accessibility for cell type-specific master transcription factors. Mol Cell. 2014;55(5):708–722. doi: 10.1016/j.molcel.2014.07.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Benatti P, et al. Specific inhibition of NF-Y subunits triggers different cell proliferation defects. Nucleic Acids Res. 2011;39(13):5356–5368. doi: 10.1093/nar/gkr128. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Petroni K, et al. The promiscuous life of plant NUCLEAR FACTOR Y transcription factors. Plant Cell. 2012;24(12):4777–4792. doi: 10.1105/tpc.112.105734. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Kumimoto RW, Zhang Y, Siefers N, Holt BF., III NF-YC3, NF-YC4 and NF-YC9 are required for CONSTANS-mediated, photoperiod-dependent flowering in Arabidopsis thaliana. Plant J. 2010;63(3):379–391. doi: 10.1111/j.1365-313X.2010.04247.x. [DOI] [PubMed] [Google Scholar]
  • 29.Kumimoto RW, et al. NUCLEAR FACTOR Y transcription factors have both opposing and additive roles in ABA-mediated seed germination. PLoS One. 2013;8(3):e59481. doi: 10.1371/journal.pone.0059481.
  • 30.Romier C, Cocchiarella F, Mantovani R, Moras D. The NF-YB/NF-YC structure gives insight into DNA binding and transcription regulation by CCAAT factor NF-Y. J Biol Chem. 2003;278(2):1336–1345. doi: 10.1074/jbc.M209635200.
  • 31.Spoehr HA, Milner HW. The chemical composition of Chlorella; effect of environmental conditions. Plant Physiol. 1949;24(1):120–149. doi: 10.1104/pp.24.1.120.
  • 32.Seo PJ, Kim MJ, Ryu JY, Jeong EY, Park CM. Two splice variants of the IDD14 transcription factor competitively form nonfunctional heterodimers which may regulate starch metabolism. Nat Commun. 2011;2:303. doi: 10.1038/ncomms1303.
  • 33.Li L, Wurtele ES. Iowa State University 2015. Materials and Methods for Modifying a Biochemical Component in a Plant. US Patent 9157091 and US patent publication US 20120222167 A1.
  • 34.Gomes SP, et al. Atrophy and neuron loss: Effects of a protein-deficient diet on sympathetic neurons. J Neurosci Res. 2009;87(16):3568–3575. doi: 10.1002/jnr.22167.
  • 35.Forrester TE, et al. Prenatal factors contribute to the emergence of kwashiorkor or marasmus in severe undernutrition: Evidence for the predictive adaptation model. PLoS One. 2012;7(4):e35907. doi: 10.1371/journal.pone.0035907.
  • 36.Young VR, Pellett PL. Plant proteins in relation to human protein and amino acid nutrition. Am J Clin Nutr. 1994;59(Suppl 5):1203S–1212S. doi: 10.1093/ajcn/59.5.1203S.
  • 37.Pimentel D, Pimentel M. Sustainability of meat-based and plant-based diets and the environment. Am J Clin Nutr. 2003;78(3) Suppl:660S–663S. doi: 10.1093/ajcn/78.3.660S.
  • 38.Hedges SB, Kumar S, editors. The Timetree of Life. Oxford Univ Press; New York: 2009.
  • 39.Li L, Ilarslan H, James MG, Myers AM, Wurtele ES. Genome wide co-expression among the starch debranching enzyme genes AtISA1, AtISA2, and AtISA3 in Arabidopsis thaliana. J Exp Bot. 2007;58(12):3323–3342. doi: 10.1093/jxb/erm180.
  • 40.Das S, Nayak M, Patra BC, Ramakrishnan B, Krishnan P. Characterization of seeds of selected wild species of rice (Oryza) stored under high temperature and humidity conditions. Indian J Biochem Biophys. 2010;47(3):178–184.
  • 41.Trapnell C, Pachter L, Salzberg SL. TopHat: Discovering splice junctions with RNA-seq. Bioinformatics. 2009;25(9):1105–1111. doi: 10.1093/bioinformatics/btp120.
  • 42.Lund SP, Nettleton D, McCarthy DJ, Smyth GK. Detecting differential expression in RNA-sequence data using quasi-likelihood with shrunken dispersion estimates. Stat Appl Genet Mol Biol. 2012;11(5):1544–6115. doi: 10.1515/1544-6115.1826.
  • 43.Bullard JH, Purdom E, Hansen KD, Dudoit S. Evaluation of statistical methods for normalization and differential expression in mRNA-seq experiments. BMC Bioinformatics. 2010;11:94. doi: 10.1186/1471-2105-11-94.
  • 44.Storey JD. A direct approach to false discovery rates. J R Stat Soc Series B Stat Methodol. 2002;64(3):479–498.
  • 45.Nettleton D, Hwang JTG, Caldo R, Wise R. Estimating the number of true null hypotheses from a histogram of p values. JABES. 2006;11(3):337–356.
  • 46.Yu X, et al. Modulation of brassinosteroid-regulated gene expression by Jumonji domain-containing proteins ELF6 and REF6 in Arabidopsis. Proc Natl Acad Sci USA. 2008;105(21):7618–7623. doi: 10.1073/pnas.0802254105.
  • 47.Ye H, Li L, Guo H, Yin Y. MYBL2 is a substrate of GSK3-like kinase BIN2 and acts as a corepressor of BES1 in brassinosteroid signaling pathway in Arabidopsis. Proc Natl Acad Sci USA. 2012;109(49):20142–20147. doi: 10.1073/pnas.1205232109.
  • 48.Yin Y, et al. BES1 accumulates in the nucleus in response to brassinosteroids to regulate gene expression and promote stem elongation. Cell. 2002;109(2):181–191. doi: 10.1016/s0092-8674(02)00721-3.
  • 49.Lee JH, et al. DWA1 and DWA2, two Arabidopsis DWD protein components of CUL4-based E3 ligases, act together as negative regulators in ABA signal transduction. Plant Cell. 2010;22(6):1716–1732. doi: 10.1105/tpc.109.073783.
  • 50.Finn RD, Clements J, Eddy SR. HMMER web server: Interactive sequence similarity searching. Nucleic Acids Res. 2011;39(Web Server issue):W29–W37. doi: 10.1093/nar/gkr367.
  • 51.Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 2003;52(5):696–704. doi: 10.1080/10635150390235520.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
