Genetics. 2014 Sep 25;198(4):1699–1716. doi: 10.1534/genetics.114.169979

A Foundation for Provitamin A Biofortification of Maize: Genome-Wide Association and Genomic Prediction Models of Carotenoid Levels

Brenda F Owens *,1, Alexander E Lipka †,‡,1, Maria Magallanes-Lundback §, Tyler Tiede *, Christine H Diepenbrock **, Catherine B Kandianis §,**, Eunha Kim §, Jason Cepela §§, Maria Mateos-Hernandez *, C Robin Buell §§, Edward S Buckler †,**,††, Dean DellaPenna §, Michael A Gore **,2, Torbert Rocheford *,2
PMCID: PMC4256781  PMID: 25258377

Abstract

Efforts are underway for development of crops with improved levels of provitamin A carotenoids to help combat dietary vitamin A deficiency. As a global staple crop with considerable variation in kernel carotenoid composition, maize (Zea mays L.) could have a widespread impact. We performed a genome-wide association study (GWAS) of quantified seed carotenoids across a panel of maize inbreds ranging from light yellow to dark orange in grain color to identify some of the key genes controlling maize grain carotenoid composition. Significant associations at the genome-wide level were detected within the coding regions of zep1 and lut1, carotenoid biosynthetic genes not previously shown to impact grain carotenoid composition in association studies, as well as within previously associated lcyE and crtRB1 genes. We leveraged existing biochemical and genomic information to identify 58 a priori candidate genes relevant to the biosynthesis and retention of carotenoids in maize to test in a pathway-level analysis. This revealed dxs2 and lut5, genes not previously associated with kernel carotenoids. In genomic prediction models, use of markers that targeted a small set of quantitative trait loci associated with carotenoid levels in prior linkage studies was as effective as use of genome-wide markers for predicting carotenoid traits. Based on GWAS, pathway-level analysis, and genomic prediction studies, we outline a flexible strategy involving use of a small number of genes that can be selected for rapid conversion of elite white grain germplasm, with minimal amounts of carotenoids, to orange grain versions containing high levels of provitamin A.

Keywords: genome-wide association study, pathway-level analysis, genomic prediction, carotenoid, biofortification, provitamin A


CAROTENOIDS are a group of >700 lipophilic yellow, orange, and red pigments primarily produced by photosynthetic organisms and also by certain fungi and bacteria (Britton 1995a; Khoo et al. 2011). The length and number of conjugated double bonds in the carotenoid molecule determine its spectral absorption properties (color). There are two generalized classes of carotenoids: carotenes, which are cyclic or acyclic hydrocarbons, and xanthophylls, which are carotenes to which various oxygen functional groups have been added. Carotenoids serve a variety of functions in plants including acting as antioxidants, photoprotectants, accessory pigments for light harvesting, and substrates for production of volatile compounds in flowers, fruit, and seed (Goff and Klee 2006; Moise et al. 2014). Specific xanthophylls are precursors for biosynthesis of the plant hormone abscisic acid, which is essential for seed dormancy and response to environmental stresses (Kermode 2005).

The most important and best-defined function of carotenoids in animals is as a dietary source of provitamin A. Provitamin A carotenoids are a small subset of the 700 carotenoids that are distinguished by having unhydroxylated β-rings. Provitamin A carotenoids can be converted by oxidative cleavage in the body to retinol, or vitamin A, which is stored in the liver (Stahl and Sies 2005; Combs 2012). Vitamin A (retinol) is involved in immune function and synthesis of various retinoic acid hormones and is converted to retinal, the primary light-absorbing pigment in the eye. Vitamin A deficiency can result in night blindness and increased susceptibility to infections and can eventually result in death (Combs 2012). It is estimated that 250,000–500,000 children become blind every year as a result of vitamin A deficiency and that half of these die within 1 year of losing their eyesight (http://www.who.int/nutrition/topics/vad/en/). The health benefits of vitamin A have prompted nutritional interventions including those promoting increased consumption of plant-based carotenoids, notably by the HarvestPlus maize biofortification program for Africa (http://www.harvestplus.org; Nestel et al. 2006; Tanumihardjo et al. 2008). In addition to provitamin A activities, all carotenoids are antioxidants and are generally considered nutritionally beneficial in the human diet and important for maintenance of optimal health (Jerome-Morais et al. 2011; Sen and Chakraborty 2011). As an example, specific isomers of the nonprovitamin A carotenoids, lutein and zeaxanthin, are present at high levels in the fovea of the eye where they are associated with prevention of age-related macular degeneration (Krinsky et al. 2003; Abdel-Aal et al. 2013), a leading cause of irreversible blindness in elderly populations of Western societies (Friedman et al. 2004).

Carotenoids are essential to many aspects of animal health, yet animals do not synthesize carotenoids, with the exception of the pea aphid (Moran and Jarvik 2010), and therefore must obtain them from their diet to meet minimal nutritional requirements. The most abundant provitamin A carotenoids in plant-based foods are β-carotene (two retinyl groups), β-cryptoxanthin (one retinyl group), and α-carotene (one retinyl group), but in most plant tissues they are substrates for hydroxylation reactions that produce the dihydroxyxanthophylls lutein and zeaxanthin (Figure 1)—the most prevalent carotenoids in vegetative and seed tissues (Howitt and Pogson 2006; Cazzonelli and Pogson 2010). The carotenoid biosynthetic pathway is conserved in plants and has been best characterized in the model dicot Arabidopsis thaliana (Dellapenna and Pogson 2006; Kim et al. 2009; Cuttriss et al. 2011) in which the molecular basis of these hydroxylation steps is well understood. The committed step of the carotenoid pathway is formation of phytoene from geranylgeranyl diphosphate (GGPP) by phytoene synthase (PSY) (Figure 1). A subsequent key branch point occurs at the level of lycopene cyclization. Lycopene β-cyclase activity at both ends of the molecule produces β-carotene, while addition of one β-ring and one ε-ring by lycopene ε-cyclase produces α-carotene. Hydroxylation of one β-carotene ring produces β-cryptoxanthin followed by hydroxylation of the other β-ring to produce zeaxanthin. Similarly, hydroxylation of the β-ring of α-carotene produces zeinoxanthin, and subsequent hydroxylation of the ε-ring yields lutein.

Figure 1.


Carotenoid biosynthesis and degradation pathways. Compounds derived from this pathway are diagrammed as nodes in boldface type, with compounds measured in this study shown in red type. Enzymes known to be involved in the conversion of these compounds are adjacent to node connectors. Solid arrows represent single reactions; dashed arrows represent two or more reactions. Note that for some steps maize contains multiple paralogs for a reaction. Note that, in Arabidopsis, the CCD class of enzymes has been shown to degrade additional carotenoid compounds (Gonzalez-Jorge et al. 2013). DOXP, 1-deoxy-d-xylulose 5-phosphate synthase; IPP, isopentenyl pyrophosphate synthase; GGPP, geranylgeranyl pyrophosphate synthase; PSY, phytoene synthase; PDS, phytoene desaturase; Z-ISO, ζ-carotene isomerase; ZDS, ζ-carotene desaturase; CRTISO, carotenoid isomerase; LCYE, lycopene ε-cyclase; LCYB, lycopene β-cyclase; CYP97A, β-carotene hydroxylase (P450); CYP97C, ε-carotene hydroxylase (P450); CRTRB, β-carotene hydroxylase; VDE, violaxanthin de-epoxidase; ZEP, zeaxanthin epoxidase; CCD1, carotenoid cleavage dioxygenase 1.

Maize (Zea mays L.) grain exhibits considerable phenotypic variation for carotenoid profiles (Harjes et al. 2008; Berardo et al. 2009; Burt et al. 2011), including some of the highest carotenoid concentrations for cereal crops (Abdel-Aal et al. 2013). Biochemical characterization of maize endosperm color mutants and transposon tagging helped to identify some maize-specific homologs of carotenoid pathway genes cloned in bacteria and model plant species. The first was phytoene synthase (y1), for which mutant alleles were shown to result in white endosperm grain (Buckner et al. 1990). White endosperm grain resulting from the recessive y1 allele provides negligible amounts of carotenoids compared to yellow and orange endosperm grain (Egesel et al. 2003; Howe and Tanumihardjo 2006; Burt et al. 2011). Subsequently, phytoene desaturase (pds1) (Li et al. 1996) and ζ-carotene desaturase (zds1) (Matthews et al. 2003) were cloned. The first quantitative trait loci (QTL) mapping study of maize grain carotenoids showed that some of the identified QTL were in proximity to two of three carotenoid biosynthetic genes that had been cloned at the time, y1 and zds1 (Wong et al. 2004). The finding of possible QTL association with carotenoid biosynthetic genes prompted efforts to identify and characterize alleles of genes in the carotenoid biosynthetic pathway that may be associated with quantitative levels of carotenoids. These alleles could then be selected with robust and inexpensive PCR-based assays for marker-assisted selection (MAS) efforts for desirable carotenoids, as opposed to high-performance liquid chromatography (HPLC), which is considerably more expensive and technically challenging to deploy in breeding programs.

Advances in genomics and bioinformatics resulted in the identification of additional genes in the maize carotenoid biosynthetic pathway (Wurtzel et al. 2012). This enabled discovery of an association of lycopene ε-cyclase (lcyE) with the ratio of α- to β-branch carotenoids (Harjes et al. 2008) and of β-carotene hydroxylase 1 (crtRB1) with β-carotene concentration and conversion (Yan et al. 2010). lcyE and crtRB1 alleles with substantially reduced transcript levels increased accumulation of β-branch carotenoids and decreased hydroxylation of β-carotene, respectively, resulting in higher provitamin A levels in maize kernels (Harjes et al. 2008; Yan et al. 2010). Genetic variation in crtRB3 has been associated with α-carotene levels in maize (Zhou et al. 2012), and favorable alleles of y1 have been associated with higher total carotenoid content (Z. Fu et al. 2013).

Several candidate genes from the carotenoid biosynthetic pathway lie within QTL intervals associated with visual scores of relative orange endosperm color intensity (Chandler et al. 2013). Darker orange color in maize grain is associated with higher total carotenoids but does not necessarily result in higher provitamin A concentrations (Harjes et al. 2008; Burt et al. 2011). These results suggest that selection of visibly darker orange grain to increase synthesis and retention of total carotenoids needs to be combined with MAS for favorable QTL alleles of carotenoid biosynthetic genes such as crtRB1 to increase provitamin A carotenoid levels. Selection for orange color has important ramifications, given that people in most Sub-Saharan African countries generally prefer to eat maize dishes that are prepared from white grain, in part because yellow maize grain is considered suitable only for animal consumption. Thus much of the maize grain grown for human consumption in Africa has white endosperm that provides inadequate levels of provitamin A carotenoids (Pfeiffer and McClafferty 2007; Stevens and Winter-Nelson 2008). Consequently, HarvestPlus has developed an integrated outreach, education, and consumer acceptance strategy in parallel with the breeding efforts to address vitamin A deficiency. This program uses darker orange endosperm color maize grain with elevated provitamin A carotenoids to distinguish maize varieties having elevated provitamin A carotenoids from white grain and yellow feed grain. The approach of using orange grain, which essentially has not been grown in Africa previously and thus is new to the consumer, appears initially to be effective in gaining acceptance in Zimbabwe (Muzhingi et al. 2008) and Zambia (Meenakshi 2010).

Maize carotenoids are a promising model system for the continued exploration of quantitative variation in a biochemical pathway, and the fundamental knowledge obtained can be directly applied in maize provitamin A biofortification breeding programs. A genome-wide association study (GWAS) of these phenotypes is a powerful approach that can be used to identify additional key genes and favorable alleles that affect carotenoid levels in maize grain. Furthermore, given that many of the genes in the carotenoid pathway have been well characterized, pathway-level association analysis serves as a potentially useful complement to GWAS that allows less stringent significance thresholds because fewer hypothesis tests are conducted (Califano et al. 2012). Various pathway-based association approaches have been pursued in human genetics, typically defining a pathway as a set of genes grouped together based on function or network analysis and testing its association with a disease phenotype (Lantieri et al. 2009; Wang et al. 2010). Alternatively, nontargeted metabolite profiling approaches can be used in combination with GWAS to dissect kernel phenotypes, as utilized in several recent maize studies (Riedelsheimer et al. 2012; J. Fu et al. 2013; Wen et al. 2014). In contrast, our targeted analysis of maize grain carotenoids takes advantage of the genetic basis of a well-characterized biosynthetic pathway. Thus, as shown for the tocochromanol biosynthetic pathway in maize (Lipka et al. 2013), readjustment of the multiple testing problem to account only for the markers within or near these a priori candidate pathway genes is a viable approach to identify weaker-effect and relatively rare alleles contributing to carotenoid phenotypic variation.

The potential application of association results in breeding can be assessed by using marker data to predict grain carotenoid levels in statistical models commonly applied in genomic selection (GS). Previous work has suggested that GS approaches can accelerate the breeding cycle, enhancing genetic gain per unit of time by enabling selection of lines that show favorable genomic signatures for traits of interest but have not been phenotyped (Meuwissen et al. 2001; Lorenz et al. 2011). The statistical models and marker densities optimizing prediction of carotenoid levels have not been tested and are especially in question, given that the traits are likely oligogenic in genetic architecture but have been only partially characterized in maize grain (Wong et al. 2004; Chander et al. 2008; Kandianis et al. 2013). Information regarding a priori candidate pathway genes, QTL, or the combination thereof can be used to generate marker sets that more directly target the carotenoid phenotypes of interest, potentially achieving higher prediction accuracies than genome-wide models for these traits (Rutkoski et al. 2012). Importantly, the relative prediction accuracies of models built on marker sets with different levels of genome coverage, or that differ in the genes they target, provide a metric for the relative gains that each marker set could be expected to confer in a selection program.

We sought in this study to determine the controllers of natural variation for carotenoid content in grain and to develop a prediction model that can be used for biofortification of maize. Therefore, we conducted (i) a GWAS and a pathway-level analysis to identify novel genes responsible for quantitative variation of grain carotenoid levels in a maize inbred panel and (ii) genome-wide, pathway-level, and carotenoid QTL-targeted prediction studies to determine the model parameterizations and extent of marker density needed to accurately predict maize grain carotenoid levels. The results of this study will also be used to develop efficient strategies to convert locally adapted maize germplasm with white grain to orange, high provitamin A grain throughout Sub-Saharan Africa.

Materials and Methods

Germplasm

The 281-member maize inbred association panel that represents a significant portion of maize allelic diversity (Flint-Garcia et al. 2005) was grown in West Lafayette, IN, at Purdue University’s Agronomy Center for Research and Education during the 2009 and 2010 growing seasons. The inbred association panel was grown in a field design and grain samples were produced as described previously (Chandler et al. 2013; Lipka et al. 2013). Because of poor agronomic performance or late maturity of some lines, high-quality grain samples were obtained from only a total of 252 lines.

Carotenoid extraction and quantification

The general procedure used for extraction of lipid-soluble compounds from maize kernels for HPLC has been previously described (Lipka et al. 2013), except that 1 mg of β-apo-8′-carotenal was added per milliliter of extraction buffer as an internal recovery control. Twenty microliters of maize seed extract were injected onto a C30 YMC column (3 μm, 100 × 3 mm, Waters Inc., Wilmington, MA) at 30° and a flow rate of 0.8 ml/min. HPLC mobile phases were buffer A (methyl tert-butyl ether) and buffer B (methanol:H2O) (90:10, v:v). Carotenoids were resolved using the following gradient: 0–12 min: 100% B to 60% B; 12–17.5 min: 60% B to 22.5% B; 17.5–19.5 min: 22.5% B to 100% B; 19.5–21 min held at 100% B, for re-equilibration. Carotenoid spectra were collected from 200 to 600 nm using a photo-diode-array detector model SPD-M20A (Shimadzu, Kyoto, Japan). Individual carotenoids were identified by a combination of their order of elution in the chromatogram, retention times, characteristic spectral peaks, and additional fine spectral characteristics (Britton 1995b).

Carotenoid levels were quantified at 450 nm relative to five-point standard curves for purified all trans lutein, zeaxanthin or β-carotene standards except for ζ-carotene and phytofluene, which were done at 400 and 350 nm, respectively. Antheraxanthin, zeinoxanthin, and α-carotene were quantified using the lutein curve; zeaxanthin using the zeaxanthin curve; and lycopene, tetrahydrolycopene, β-cryptoxanthin, β-carotene, and δ-carotene using the β-carotene curve. Relative phytofluene and ζ-carotene levels were estimated from the β-carotene curve. While the major carotenoid species in most samples were in the all trans configuration, the system used was able to resolve one or more cis isomers for zeinoxanthin, α- and β-carotenes, lutein, zeaxanthin, tetrahydrolycopene, β-cryptoxanthin, phytofluene, and ζ-carotene. When cis isomers were present for a given carotenoid, these were quantified using the corresponding curve for their all trans isomers, and the values for all isomers for the carotenoid were summed.

Phenotypic data analysis

Nine carotenoid compounds were measured in grain samples from a 252-line subset of the 281-line association panel (Table 1). In addition, a series of 15 sums, ratios, and proportions were calculated from the measured values of these nine compounds. The additional derivative traits may reveal biochemical and genetic relationships not detectable from the measured carotenoids or provide information relevant to future biofortification efforts. The peak signal from a GWAS for white vs. nonwhite (yellow/orange) kernel color in this panel of 252 inbreds was a single nucleotide polymorphism (SNP) located 1141 bp upstream of the y1 transcription start site showing a P-value of $4.17 \times 10^{-31}$ (Supporting Information, Figure S1). The white inbreds are homozygous for the recessive allele of y1 (Emerson 1921; Buckner et al. 1990) and do not produce measurable carotenoids in the endosperm. To adjust for this, the white inbreds were excluded from further analysis. White endosperm lines were identified and excluded based on very low carotenoid levels determined by HPLC and confirmation with grain color descriptors in the GRIN database (http://www.ars-grin.gov). Consequently, a total of 201 lines with a range from light-yellow to dark-orange kernel color and adequate amounts of mature grain for analysis were used.

Table 1. List of 24 grain carotenoid traits that were analyzed.

Traits listed in Table 2 | Traits listed in Table S5
β-Carotene [a] | Phytofluene [a]
β-Cryptoxanthin [a] | ζ-Carotene [a]
Zeaxanthin [a] | Tetrahydrolycopene [a]
α-Carotene [a] | Total β-xanthophylls [b]
Zeinoxanthin [a] | Total α-xanthophylls [b]
Lutein [a] | Provitamin A [c]/total carotenoids [b]
Acyclic and monocyclic carotenes [b] | Acyclic carotenes/cyclic carotenes [b]
Total carotenoids [b] | β-Carotene/(β-cryptoxanthin + zeaxanthin) [b]
β-Carotenoids/α-carotenoids [b] | Total carotenes/total xanthophylls [b]
β-Xanthophylls/α-xanthophylls [b] |
β-Carotene/β-cryptoxanthin [b] |
β-Cryptoxanthin/zeaxanthin [b] |
α-Carotene/zeinoxanthin [b] |
Zeinoxanthin/lutein [b] |
Provitamin A [b,c] |

The means of the BLUP values and heritability estimates for the 15 traits listed in the left column are reported in Table 2 and the values and estimates for the remaining traits are listed in Table S5.

[a] Individual carotenoid compound measured by HPLC.

[b] Derivative carotenoid trait.

[c] Provitamin A is calculated as the sum of β-carotene, 1/2 α-carotene, and 1/2 β-cryptoxanthin.

A total of 48, 117, 112, 15, 10, 5, 2, and 2 samples had phytofluene, tetrahydrolycopene, ζ-carotene, α-carotene, β-carotene, zeinoxanthin, β-cryptoxanthin, and lutein values, respectively, that were below the HPLC detection threshold. For these samples, uniform random variables between 0 and the minimum detected value were generated to approximate the compound values. This approach is similar to the one described in Lubin et al. (2004). Outliers were removed from all traits using SAS version 9.3 (SAS Institute 2012) following examination of the Studentized deleted residuals obtained from mixed linear models fitted for each trait with the line and field explanatory variables set as random effects (Kutner 2005).
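The substitution rule for values below the detection threshold can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name, the use of `None` to mark below-threshold samples, the seed, and the example values are all assumptions.

```python
import random


def fill_below_detection(values, rng):
    """Replace None (below detection) with draws from Uniform(0, min detected)."""
    detected = [v for v in values if v is not None]
    if not detected:
        raise ValueError("no detected values to define the upper bound")
    upper = min(detected)  # smallest value the HPLC actually detected
    return [v if v is not None else rng.uniform(0.0, upper) for v in values]


rng = random.Random(2014)  # fixed seed so the sketch is reproducible
phytofluene = [0.42, None, 0.31, None, 0.87]  # hypothetical values (ug/g)
filled = fill_below_detection(phytofluene, rng)
```

Detected values pass through unchanged, and each substituted value lies between 0 and the smallest detected value for that compound.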

For each of the 24 carotenoid traits, a best linear unbiased predictor (BLUP) for each line (Table S1) was obtained by fitting a mixed linear model across all environments in ASREML version 3.0 (Gilmour 2009). The model-fitting procedure has been previously described (Chandler et al. 2013). The variance component estimates from these models were used to calculate heritabilities ($\hat{h}^2_l$) on a line mean basis (Holland et al. 2003; Hung et al. 2012), and standard errors of the heritability estimates were calculated using the delta method (Holland et al. 2003). To assess the relationship between carotenoid BLUPs, Pearson’s correlation coefficient (r) was calculated. Finally, the Box–Cox procedure (Box and Cox 1964) was conducted on BLUPs of each trait to find the optimal transformation that corrected for unequal error variances and non-normality of error terms. This procedure is critical for preventing violations of the statistical assumptions made for the models used in GWAS and genomic prediction.
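As a rough illustration of the Box–Cox step, the sketch below picks the transformation parameter by maximizing the profile log-likelihood over a grid. It makes several assumptions that the text does not confirm: an intercept-only model (the study's model likely included other terms), strictly positive input values, a grid from -2 to 2 in steps of 0.1, and hypothetical data.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform; the log transform is the limiting case lam -> 0."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]


def profile_loglik(y, lam):
    """Profile log-likelihood of lam for an intercept-only normal model."""
    n = len(y)
    z = box_cox(y, lam)
    mean_z = sum(z) / n
    var_z = sum((v - mean_z) ** 2 for v in z) / n
    jacobian = (lam - 1.0) * sum(math.log(v) for v in y)
    return -0.5 * n * math.log(var_z) + jacobian


def best_lambda(y, grid):
    """Return the grid value with the highest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(y, lam))


grid = [i / 10 for i in range(-20, 21)]
# Hypothetical, strictly positive trait values that are symmetric on the log scale.
y = [math.exp(v) for v in (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0)]
lam_hat = best_lambda(y, grid)
transformed = box_cox(y, lam_hat)
```

Because the toy data are symmetric on the log scale, the selected parameter is 0, which corresponds to a log transformation.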

Genome-wide association study

We conducted a GWAS for each of the 24 carotenoid grain traits in the 201 lines with light-yellow to dark-orange kernel color. The SNP markers used in the GWAS have been previously described (Lipka et al. 2013). The genotyping-by-sequencing marker data set (partially imputed genotypes; January 10, 2012, version) is available for download from the Panzea database (http://www.panzea.org/dynamic/derivative_data/genotypes/Maize282_GBS_genos_imputed_20120110.zip). After removal of monomorphic and low-quality SNPs, a total of 462,702 SNPs were available for the 201-member association panel. Additionally, seven indels and one SNP (lcyE SNP216) located within or close to the coding regions of four carotenoid biosynthetic pathway and degradation genes (y1, lcyE, crtRB1, and ccd1) that had been previously analyzed were included (Harjes et al. 2008; Yan et al. 2010; Z. Fu et al. 2013; Kandianis et al. 2013) (Table S2). Prior to the GWAS, all missing SNP genotypes were conservatively imputed with the major allele.
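The major-allele imputation and the removal of monomorphic markers can be sketched as below. The single-letter allele coding, the `"N"` missing-data symbol, and the function names are assumptions for illustration; they are not taken from the Panzea file format or from the authors' pipeline.

```python
from collections import Counter


def impute_major_allele(genotypes, missing="N"):
    """Replace missing calls at one marker with that marker's most common allele."""
    observed = [g for g in genotypes if g != missing]
    if not observed:
        return list(genotypes)  # nothing observed, so nothing to impute from
    major = Counter(observed).most_common(1)[0][0]
    return [major if g == missing else g for g in genotypes]


def is_monomorphic(genotypes, missing="N"):
    """True when at most one allele is observed among the non-missing calls."""
    return len({g for g in genotypes if g != missing}) <= 1


snp = ["A", "A", "G", "N", "A", "G", "A", "N"]  # hypothetical inbred-line calls
imputed = impute_major_allele(snp)
```

Imputing with the major allele is conservative in the sense that it can only pull the minor allele frequency of a marker downward.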

The procedure for the GWAS has been previously described (Lipka et al. 2013). Briefly, the BLUPs of each carotenoid trait (Table S1) were used to test for an association at the 284,180 SNPs with minor allele frequencies (MAFs) ≥0.05 in the panel. Similarly, unified mixed linear models were fitted to each of the aforementioned seven indel markers (Table S2) using PROC MIXED in SAS version 9.3. To account for multiple allelic states, indels were analyzed as class explanatory variables in PROC MIXED. All unified mixed linear models included principal components (Price et al. 2006) and a kinship matrix (Loiselle et al. 1995) that were calculated from a subset of 34,368 non-industry SNPs from the Illumina MaizeSNP50 BeadChip. For each carotenoid trait, the Bayesian information criterion (Schwarz 1978) was implemented to determine the optimal number of principal components to include in the model as covariates. The amount of phenotypic variation explained by the model was estimated using a likelihood-ratio-based $R^2$ statistic, denoted $R^2_{LR}$ (Sun et al. 2010). The Benjamini and Hochberg (1995) procedure was used to adjust for the multiple testing problem by controlling the false-discovery rate (FDR) at 5% and 10%.
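The Benjamini–Hochberg step-up rule is short enough to sketch directly. The function below is a generic illustration with hypothetical P-values; it is not the code used in the study, which applied the procedure to the full set of marker tests.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return booleans marking which tests are significant at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:  # every test ranked at or below the cutoff
        significant[idx] = True
    return significant


# Hypothetical P-values, for illustration only.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.058, 0.074, 0.205, 0.212, 0.216]
n_sig_5 = sum(benjamini_hochberg(p, fdr=0.05))
n_sig_10 = sum(benjamini_hochberg(p, fdr=0.10))
```

With these toy values, two tests pass at 5% FDR and six pass at 10% FDR. The third and fourth smallest P-values are declared significant at 10% even though each exceeds its own rank threshold, because a larger P-value further down the list meets its threshold.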

A multi-locus mixed model (MLMM) procedure (Segura et al. 2012) was conducted to clarify the signals from major-effect loci identified in GWAS. This method employs a stepwise mixed-model regression procedure with forward selection and backward elimination. The variance components of the model are re-estimated at each step. Because it is possible to have multiple polymorphisms in the optimal model, the MLMM approach allows for an exhaustive search of the model space. All markers on the same chromosome of a major-effect locus were considered for inclusion as explanatory variables in the optimal model. The extended Bayesian information criterion (Chen and Chen 2008) was used to determine the optimal model. To examine the influence of polymorphisms identified through MLMM on our results, GWAS was conducted again with these polymorphisms included as covariates in the unified mixed linear model.

Pathway-level analysis

We performed an analysis that used prior knowledge relevant to the biosynthesis and degradation of carotenoids to identify a subset of candidate genes. These genes encode isoprenoid and carotenoid biosynthetic pathway enzymes and carotenoid degradation enzymes, and all either have been shown to influence carotenoid phenotypes in previous work or were identified through homology with carotenoid, isoprenoid, and degradation-related genes in Arabidopsis (Dellapenna and Pogson 2006; Moise et al. 2014). A total of 37 genes related to carotenoid biosynthesis and degradation and 21 genes related to prenyl group synthesis were used to identify regions in the B73 Refgen_v2 genome to be used in the analysis (Table S3). The genes involved in isoprenoid synthesis were chosen because these compounds are in precursor pathways to carotenoids (Dellapenna and Pogson 2006; Cuttriss et al. 2011). The degradation enzymes were included on the basis of reported rates of degradation for one or more carotenoids (Vallabhaneni et al. 2010). Ultimately, the association results for 7408 SNP markers and 7 indels located within ±250 kb of these 58 genes were considered in what we term the pathway-level analysis. For each trait, the unadjusted P-values of these markers were corrected for the multiple testing problem by using the Benjamini–Hochberg procedure (Benjamini and Hochberg 1995) to control the FDR at 5%.
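Selecting the markers that fall within ±250 kb of the candidate genes amounts to an interval check per marker. The sketch below assumes the window is measured from the gene start and end coordinates, which the text does not specify; the tuple layouts and all coordinates are hypothetical.

```python
def markers_near_genes(markers, genes, window_bp=250_000):
    """Return names of markers lying within window_bp of any candidate gene.

    markers: list of (name, chromosome, position)
    genes:   list of (name, chromosome, start, end)
    """
    selected = []
    for m_name, m_chrom, m_pos in markers:
        for _g_name, g_chrom, g_start, g_end in genes:
            same_chrom = m_chrom == g_chrom
            in_window = g_start - window_bp <= m_pos <= g_end + window_bp
            if same_chrom and in_window:
                selected.append(m_name)
                break  # count each marker once even if windows overlap
    return selected


# Hypothetical coordinates, for illustration only.
genes = [("geneA", 1, 1_000_000, 1_005_000)]
markers = [
    ("m1", 1, 760_000),
    ("m2", 1, 740_000),
    ("m3", 1, 1_255_000),
    ("m4", 2, 1_000_000),
    ("m5", 1, 1_255_001),
]
pathway_markers = markers_near_genes(markers, genes)
```

Only the P-values of the selected markers would then enter the FDR correction, which is what makes the pathway-level threshold less stringent than the genome-wide one.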

Linkage disequilibrium analysis

The procedure used for calculating linkage disequilibrium (LD) has been previously described (Lipka et al. 2013). Briefly, the squared allele-frequency correlations ($r^2$) were calculated in TASSEL version 3.0 (Bradbury et al. 2007). Only markers with <10% missing data and MAF ≥ 0.05 were considered for estimating LD. To ensure accurate estimation of LD, the markers were not imputed prior to LD analysis.
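For two biallelic markers scored on inbred lines, the squared allele-frequency correlation can be computed from allele and haplotype frequencies. The sketch below is a generic illustration of that statistic and is not TASSEL's implementation; the 0/1 allele coding, the use of `None` for missing calls, and the example data are assumptions.

```python
def ld_r2(locus_a, locus_b):
    """Squared allele-frequency correlation between two biallelic loci.

    Alleles are coded 0/1 per inbred line; None marks a missing call.
    Lines missing at either locus are skipped rather than imputed.
    """
    pairs = [(a, b) for a, b in zip(locus_a, locus_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no lines with calls at both loci")
    p_a = sum(a for a, _ in pairs) / n
    p_b = sum(b for _, b in pairs) / n
    p_ab = sum(1 for a, b in pairs if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("at least one locus is monomorphic")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


# Hypothetical calls for eight lines.
locus_a = [1, 1, 0, 0, 1, 0, None, 1]
locus_b = [1, 1, 0, 0, 0, 0, 1, 1]
r2 = ld_r2(locus_a, locus_b)
```

Skipping lines with a missing call mirrors the decision not to impute before LD estimation, since imputed calls would bias the frequency estimates.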

Carotenoid prediction

To assess the ability of markers to predict carotenoid levels among the 201 lines, we examined the prediction accuracy of three statistical models commonly used in genomic selection and prediction approaches: ridge regression best linear unbiased prediction (RR-BLUP) (Meuwissen et al. 2001), least absolute shrinkage and selection operator (LASSO) (Tibshirani 1996), and elastic net analysis (Zou and Hastie 2005) (Table S4). The RR-BLUP method was conducted using the rrBLUP R package (Endelman 2011), while the other two methods were conducted in the glmnet R package (Friedman et al. 2010). The same 24 carotenoid traits tested in a GWAS were included in the prediction analyses.
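For orientation, the three models differ mainly in how marker effects are penalized. The notation below is ours rather than the authors': $y_i$ is the trait value of line $i$, $\mathbf{x}_i$ its marker vector, $\beta$ the marker effects, and $\lambda$ and $\alpha$ the tuning parameters in the form documented for glmnet. The values of $\lambda$ and $\alpha$ used in the study are not stated in this section.

```latex
% Elastic net objective as parameterized in glmnet (Gaussian response):
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - \mathbf{x}_i^{\top}\beta\bigr)^2
  + \lambda\Bigl[\tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2
  + \alpha\,\lVert\beta\rVert_1\Bigr]
% alpha = 1 gives LASSO, 0 < alpha < 1 gives the elastic net,
% and alpha = 0 gives ridge regression.

% RR-BLUP as a mixed model with random marker effects:
\mathbf{y} = \mathbf{1}\mu + \mathbf{Z}\mathbf{u} + \mathbf{e},\qquad
\mathbf{u}\sim N(\mathbf{0},\,\mathbf{I}\sigma_u^2),\qquad
\mathbf{e}\sim N(\mathbf{0},\,\mathbf{I}\sigma_e^2)
% The BLUP of u is a ridge solution whose penalty is the variance ratio
% sigma_e^2 / sigma_u^2 applied to the unscaled residual sum of squares.
```

The $\ell_1$ term can shrink marker effects exactly to zero, so LASSO and the elastic net perform marker selection, whereas RR-BLUP retains every marker with a shrunken effect.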

Each statistical model was tested with three different data sets that varied in marker scope: genome-wide, pathway-level, and carotenoid QTL-targeted. The genome-wide data set consisted of the 284,180 SNP markers and seven indels used for GWAS, whereas the pathway-level data set included the 7408 SNP markers and seven indels within ±250 kb of the 58 candidate genes from the pathway-level analysis. The carotenoid QTL-targeted data set included 944 SNP markers and seven indels within ±250 kb of eight key candidate genes underlying QTL associated with carotenoid biosynthesis and retention. These genes are considered important for selecting for individual carotenoids, higher total carotenoids, and higher provitamin A based on their function in the carotenoid pathway and previous results. The eight candidate genes, y1, zds1, lcyE, crtRB3, lut1, crtRB1, zep1, and ccd1, are all in chromosome regions associated with QTL for carotenoids (Wong et al. 2004; Chander et al. 2008; Zhou et al. 2012; Chandler et al. 2013; Kandianis et al. 2013). Six of eight genes were also associated with QTL for intensity of orange color, crtRB3 and lut1 being the exceptions (Chandler et al. 2013). A darker orange color is associated with higher total carotenoids, particularly lutein and zeaxanthin in maize (Pfeiffer and McClafferty 2007; Burt et al. 2011).

The full complement of 201 lines was used to generate the marker sets for prediction analyses, regardless of whether or not all 201 lines were phenotyped for a particular trait. The prediction accuracy of each model was assessed using the approach described in Resende et al. (2012). Briefly, the data were randomized into five folds for cross-validation. To enable a direct comparison between RR-BLUP, LASSO, and elastic net, the same fold assignments were used throughout this study. For each model, the correlations between observed and predicted trait values were standardized by dividing the average correlation estimates across the five folds by the square root of the heritability on a line mean basis estimated for that trait in the 201 lines.
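The standardized accuracy described above can be sketched as follows: compute the correlation between observed and predicted values within each fold, average across folds, and divide by the square root of the heritability. The helper names, the seed, and the simulated values are assumptions; in practice each fold's predictions would come from a model trained on the other four folds.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def assign_folds(n, k, rng):
    """Randomly assign n lines to k folds of near-equal size."""
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [0] * n
    for position, line in enumerate(idx):
        folds[line] = position % k
    return folds


def standardized_accuracy(observed, predicted, folds, h2):
    """Mean within-fold correlation divided by sqrt(line-mean heritability)."""
    k = max(folds) + 1
    cors = []
    for f in range(k):
        obs = [o for o, g in zip(observed, folds) if g == f]
        pred = [p for p, g in zip(predicted, folds) if g == f]
        cors.append(pearson(obs, pred))
    return (sum(cors) / k) / math.sqrt(h2)


rng = random.Random(1)
observed = [rng.gauss(30.0, 10.0) for _ in range(50)]   # hypothetical BLUPs
predicted = [o + rng.gauss(0.0, 8.0) for o in observed]  # stand-in predictions
folds = assign_folds(len(observed), 5, rng)  # reuse for every model compared
accuracy = standardized_accuracy(observed, predicted, folds, h2=0.91)
```

Reusing one `folds` assignment for all three models keeps the comparison paired, and dividing by the square root of the heritability expresses accuracy relative to the best a phenotype-based predictor could achieve, so values slightly above 1 are possible.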

Results

Phenotypic variation

Phenotypic variation for grain carotenoid content and composition was assessed in an association panel of 201 maize inbreds with kernel color ranging from light yellow to dark orange. Of the nine carotenoid compounds measured via HPLC in grain samples, the most abundant was zeaxanthin, and the least abundant was tetrahydrolycopene (Table 2 and Table S5). The strongest Pearson’s correlation among the nine carotenoid compounds was between β-cryptoxanthin and zeaxanthin ($r_p$ = 0.63), and the lowest correlations were between β-cryptoxanthin and α-carotene; zeinoxanthin and zeaxanthin; and zeaxanthin and ζ-carotene ($r_p$ < 0.01) (Table S6). As expected, compounds tended to be highly correlated with their corresponding precursor compounds in the carotenoid biosynthetic pathway. The average heritability on a line mean basis for the nine carotenoid compounds and the 15 sums, ratios, and proportions was 0.80, with a range from 0.98 for the ratio of β-branch to α-branch carotenoids to 0.25 for α-carotene. The relatively lower heritability of α-carotene may be related to technical limitations for reliable separation of it from other more abundant carotenes that overlap in elution on the HPLC system. Overall, the high heritabilities for carotenoids suggest that variation for these compounds in maize grain is largely influenced by genetic rather than environmental effects (Table S7).

Table 2. Summary statistics of 15 grain carotenoid traits.

Trait | No. of lines | BLUP mean | BLUP SD (a) | BLUP range | Heritability estimate | Heritability SE (b)
β-Carotene | 199 | 1.31 | 0.61 | 0.31–3.27 | 0.82 | 0.035
β-Cryptoxanthin | 199 | 1.44 | 1.05 | 0.13–5.17 | 0.95 | 0.009
Zeaxanthin | 196 | 12.90 | 6.86 | 1.44–32.40 | 0.94 | 0.008
α-Carotene | 201 | 1.24 | 0.38 | 0.45–2.65 | 0.25 | 0.049
Zeinoxanthin | 198 | 0.82 | 0.82 | 0.12–5.29 | 0.88 | 0.016
Lutein | 200 | 11.16 | 4.73 | 1.23–23.93 | 0.94 | 0.011
Acyclic and monocyclic carotenes | 200 | 5.54 | 1.05 | 3.39–8.92 | 0.57 | 0.060
Total carotenoids | 201 | 32.66 | 10.66 | 9.55–62.96 | 0.91 | 0.013
β-Carotenoids/α-carotenoids | 190 | 1.92 | 1.17 | 0–7.87 | 0.98 | 0.002
β-Xanthophylls/α-xanthophylls | 196 | 1.74 | 1.23 | 0.45–6.37 | 0.83 | 0.022
β-Carotene/β-cryptoxanthin | 198 | 1.13 | 0.49 | 0.51–3.06 | 0.89 | 0.029
β-Cryptoxanthin/zeaxanthin | 196 | 0.12 | 0.05 | 0.04–0.39 | 0.90 | 0.021
α-Carotene/zeinoxanthin | 196 | 2.57 | 1.62 | 0.52–8.88 | 0.90 | 0.019
Zeinoxanthin/lutein | 195 | 0.10 | 0.06 | 0.03–0.42 | 0.89 | 0.023
Provitamin A (c) | 199 | 2.68 | 1.01 | 0.81–5.55 | 0.80 | 0.033

Means and ranges (μg/g) for untransformed BLUPs of 15 carotenoid traits evaluated on a maize inbred association panel and estimated heritability on a line mean basis in two summer environments in West Lafayette, Indiana, across 2 years.

(a) SD, standard deviation.

(b) SE, standard error.

(c) Provitamin A is calculated as the sum of β-carotene, 1/2 α-carotene, and 1/2 β-cryptoxanthin.

Average quantities of the provitamin A carotenoids, α-carotene, β-carotene, and β-cryptoxanthin, were low relative to lutein and zeaxanthin (Table 2). The three provitamin A compounds, respectively, composed ∼23, 49, and 27% of the average provitamin A concentration of 2.68 μg/g present in this panel. The heritabilities of β-carotene and β-cryptoxanthin, the more predominant provitamin A compounds, were high: 0.82 and 0.95, respectively. High heritabilities were also observed for the ratios of β-branch to α-branch carotenoids (0.98) and β-carotene to β-cryptoxanthin (0.89). Because higher heritability traits are more responsive to selection than low heritability traits, these high heritabilities indicate that selection for the more predominant provitamin A compounds should be effective.
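As a worked check of the provitamin A formula and the percentage contributions quoted above, the short sketch below applies the formula to the Table 2 trait means. The function and variable names are illustrative only. The sum of the trait means comes to 2.65 μg/g, slightly below the 2.68 μg/g reported for the provitamin A trait itself; one plausible reason is that the component traits were scored on slightly different numbers of lines, so the sum of per-trait means need not equal the mean of per-line sums.

```python
def provitamin_a(beta_carotene, alpha_carotene, beta_cryptoxanthin):
    """Provitamin A = beta-carotene + 1/2 alpha-carotene + 1/2 beta-cryptoxanthin."""
    return beta_carotene + 0.5 * alpha_carotene + 0.5 * beta_cryptoxanthin

# Mean BLUPs (ug/g) taken from Table 2.
means = {"alpha_carotene": 1.24, "beta_carotene": 1.31, "beta_cryptoxanthin": 1.44}

total = provitamin_a(**means)

# Each compound's weighted share of the total.
shares = {
    "alpha_carotene": 0.5 * means["alpha_carotene"] / total,
    "beta_carotene": means["beta_carotene"] / total,
    "beta_cryptoxanthin": 0.5 * means["beta_cryptoxanthin"] / total,
}
percentages = {name: round(100 * share) for name, share in shares.items()}
```

Rounded to whole percentages, the shares agree with the ∼23, 49, and 27% given in the text.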

Genome-wide association study

The genetic basis of variation for carotenoids in maize grain was dissected in the 201-member panel using 462,703 genome-wide SNPs and seven indels. Unified mixed linear models (Yu et al. 2006) that accounted for population structure and familial relatedness were fitted to a subset of 284,180 SNPs with MAF ≥ 0.05 and the seven indels. A total of 24 unique SNPs and two indels were significantly associated with one or more carotenoid traits at a genome-wide FDR of 5% (Table S8A, Figure S2). Because the statistical power from an association panel of 201 inbreds is limited, generally only capable of repeatedly detecting large-effect QTL (Long and Langley 1999), we searched for relatively smaller-effect QTL at a genome-wide FDR of 10%. Under this less conservative criterion, an additional 11 SNPs and one indel were significantly associated with at least one carotenoid trait (Table S8A). Most of the additional SNPs identified at 10% FDR were located in the vicinity of the significant polymorphisms detected at 5% FDR.
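To illustrate how relaxing the FDR threshold from 5% to 10% admits additional markers, the sketch below implements the Benjamini-Hochberg step-up procedure. It is an assumption that this is the FDR procedure in question, since this section does not name it, and the P-values are invented for the example.

```python
import numpy as np

def benjamini_hochberg(p_values, fdr=0.05):
    """Return a boolean mask of tests declared significant at the given FDR."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The i-th smallest P-value is compared with fdr * i / m.
    thresholds = fdr * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    significant = np.zeros(m, dtype=bool)
    if passed.any():
        # Reject every hypothesis up to the largest rank that passes.
        k = np.max(np.nonzero(passed)[0])
        significant[order[: k + 1]] = True
    return significant

# Hypothetical P-values for ten markers.
p_values = [1e-9, 4e-8, 2e-4, 0.03, 0.2, 0.5, 0.8, 0.9, 0.95, 0.99]
at_5_percent = benjamini_hochberg(p_values, fdr=0.05)
at_10_percent = benjamini_hochberg(p_values, fdr=0.10)
```

In this toy set, three markers pass at 5% FDR and a fourth (P = 0.03) is added at 10% FDR.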

Peak associations significant at 5% FDR for zeaxanthin, total β-xanthophylls, and β-xanthophylls/α-xanthophylls were found at two SNPs within the gene encoding zeaxanthin epoxidase (zep1, GRMZM2G127139) on chromosome 2 (uncorrected P-values 4.82 × 10−8 to 2.22 × 10−9). Zeaxanthin epoxidase carries out a two-step reaction that produces violaxanthin from zeaxanthin through the intermediate antheraxanthin (Figure 1). Weaker associations were detected for zeaxanthin and β-xanthophylls/α-xanthophylls with five SNPs located ∼26 kb downstream of zep1 (P-values 7.57 × 10−6 to 1.19 × 10−6) in the vicinity of a gene encoding a eukaryotic aspartyl protease (GRMZM2G062559). To better clarify the signals of association in this 1.2-Mb genomic interval, the MLMM procedure (Segura et al. 2012) was conducted on a chromosome-wide basis for all three zeaxanthin-related traits. The resultant optimal model for two of the three traits, zeaxanthin and total β-xanthophylls, included peak SNP S2_44448432 located within zep1. No SNP was selected by MLMM for the third trait, β-xanthophylls/α-xanthophylls. When GWAS was conducted with SNP S2_44448432 as a covariate for all three traits, the remaining signals on chromosome 2 were no longer significant (Figure 2, Figure S3, Figure S4, Table S8B).
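The logic of the conditional analysis (refitting the association model with the peak SNP as a covariate) can be sketched with ordinary least squares. This is a deliberate simplification: the unified mixed model used in the study also includes population structure and kinship terms, which are omitted here, and all genotypes and phenotypes below are simulated.

```python
import numpy as np

def marker_t_statistic(y, marker, covariates=()):
    """t-statistic for `marker` in an OLS fit with an intercept and optional covariates."""
    X = np.column_stack([np.ones_like(y), marker, *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = residuals @ residuals / dof
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)
    return float(beta[1] / np.sqrt(cov_beta[1, 1]))

# Hypothetical inbred genotypes coded 0/2: a causal "peak" SNP and a
# neighbor in strong LD with it (about 5% of lines differ).
rng = np.random.default_rng(0)
n = 200
peak = rng.choice([0.0, 2.0], size=n)
linked = peak.copy()
flip = rng.random(n) < 0.05
linked[flip] = 2.0 - linked[flip]
y = 1.0 * peak + rng.normal(size=n)

t_marginal = marker_t_statistic(y, linked)                        # unconditional scan
t_conditional = marker_t_statistic(y, linked, covariates=[peak])  # peak SNP as covariate
```

The linked SNP shows a strong marginal signal only because it tags the peak SNP; once the peak SNP is in the model, its test statistic drops sharply, which is the pattern reported for the remaining chromosome 2 signals.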

Figure 2.

GWAS for zeaxanthin content in maize grain. (A) Scatter plot of association results from a unified mixed model analysis of zeaxanthin and LD estimates (r2) across the zep1 chromosome region. Negative log10-transformed P-values (left y-axis) from a GWAS for zeaxanthin and r2 values (right y-axis) are plotted against physical position (B73 RefGen_v2) for a 1.2-Mb region on chromosome 2 that encompasses zep1. The blue vertical lines are –log10 P-values for SNPs that are statistically significant for zeaxanthin at 5% FDR, while the gray vertical lines are –log10 P-values for SNPs that are nonsignificant at 5% FDR. Triangles are the r2 values of each SNP relative to the peak SNP (indicated in red) at 44,448,432 bp. The black horizontal dashed line indicates the –log10 P-value of the least statistically significant SNP at 5% FDR. The black vertical dashed lines indicate the start and stop positions of zep1 (GRMZM2G127139). (B) Scatter plot of association results from a conditional unified mixed model analysis of zeaxanthin and LD estimates (r2) across the zep1 chromosome region, as in A. The peak SNP from the unconditional GWAS (S2_44448432; 44,448,432 bp) was included as a covariate in the unified mixed model to control for the zep1 effect.

The lut1 gene (GRMZM2G143202) on chromosome 1 contains an intronic SNP (ss196425306; 86,844,203 bp) that was significantly associated with α-carotene/zeinoxanthin, zeinoxanthin, and zeinoxanthin/lutein (P-values 8.95 × 10−8 to 3.47 × 10−10). The lut1 gene encodes CYP97C, a cytochrome P450-type monooxygenase responsible for hydroxylating the ε-ring of zeinoxanthin to yield lutein (Tian et al. 2004; Quinlan et al. 2012). The only other statistically significant SNP (ss196425308; 86,945,134 bp) in this region was located ∼100 kb downstream of lut1 and was in perfect LD (r2 = 1) with the peak SNP (ss196425306) in lut1. To further resolve the signals in the lut1 region, the MLMM procedure was run on these three carotenoid traits, with all SNPs on chromosome 1 considered for inclusion into the optimal models. All optimal models contained only the peak GWAS SNP in the lut1 intron (Figure 3, Figure S5, Figure S6, Table S8C).
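For readers unfamiliar with the LD statistic quoted above, r2 between two biallelic markers scored on inbred lines can be estimated as the squared Pearson correlation of their genotype codes. The sketch below uses this common estimator with invented genotypes; it is not necessarily the exact routine used to produce the figures.

```python
import numpy as np

def ld_r2(snp_a, snp_b):
    """Squared Pearson correlation between two genotype vectors, ignoring missing calls."""
    a = np.asarray(snp_a, dtype=float)
    b = np.asarray(snp_b, dtype=float)
    keep = ~(np.isnan(a) | np.isnan(b))
    return float(np.corrcoef(a[keep], b[keep])[0, 1] ** 2)

# Hypothetical genotypes for eight inbred lines (0 and 2 are the two homozygotes).
peak = [0, 0, 2, 2, 0, 2, 0, 2]
perfect = [2, 2, 0, 0, 2, 0, 2, 0]   # same partition of lines, opposite allele coding
partial = [0, 0, 2, 2, 0, 2, 2, 0]   # two lines differ from the peak SNP

r2_perfect = ld_r2(peak, perfect)
r2_partial = ld_r2(peak, partial)
```

Two markers in perfect LD (r2 = 1) split the panel into identical groups, so an association test cannot distinguish between them; this is why physical proximity to lut1 and the MLMM results, rather than P-values alone, are used to prioritize the intronic SNP.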

Figure 3.

GWAS for the ratio of α-carotene to zeinoxanthin content in maize grain. (A) Scatter plot of association results from a unified mixed model analysis of the ratio of α-carotene to zeinoxanthin and LD estimates (r2) across the lut1 chromosome region. Negative log10-transformed P-values (left y-axis) from a GWAS for the ratio of α-carotene to zeinoxanthin and r2 values (right y-axis) are plotted against physical position (B73 RefGen_v2) for a 1-Mb region on chromosome 1 that encompasses lut1. The blue vertical lines are –log10 P-values for SNPs that are statistically significant for the ratio of α-carotene to zeinoxanthin at 5% FDR, while the gray vertical lines are –log10 P-values for SNPs that are nonsignificant at 5% FDR. Triangles are the r2 values of each SNP relative to the peak SNP (indicated in red) at 86,844,203 bp. The black horizontal dashed line indicates the –log10 P-value of the least statistically significant SNP at 5% FDR. The black vertical dashed lines indicate the start and stop positions of lut1 (GRMZM2G143202). (B) Scatter plot of association results from a conditional unified mixed model analysis of the ratio of α-carotene to zeinoxanthin and LD estimates (r2) across the lut1 chromosome region, as in A. The peak SNP from the unconditional GWAS (ss196425306; 86,844,203 bp) was included as a covariate in the unified mixed model to control for the lut1 effect.

A cluster of association signals was detected in an 11-Mb region surrounding the lcyE gene (GRMZM2G012966) on chromosome 8, involving 16 markers at 10% FDR and six traits: lutein, zeaxanthin, total α-xanthophylls, total β-xanthophylls, β-xanthophylls/α-xanthophylls, and β-carotenoids/α-carotenoids. lcyE encodes lycopene ε-cyclase, the committed step toward α-carotene biosynthesis whose activity influences flux between the α- and β-branches of the carotenoid pathway (Cunningham et al. 1996). The most significant associations in this region were from nine markers within ±3 kb of the lcyE-coding region (P-values 8.99 × 10−7 to 5.05 × 10−16). The MLMM procedure with all chromosome 8 SNPs produced optimal models for lutein, total α-xanthophylls, β-xanthophylls/α-xanthophylls, and β-carotenoids/α-carotenoids with two lcyE polymorphisms, S8_138882897 and lcyE SNP216. When GWAS was conducted for these four traits using these two lcyE polymorphisms as covariates, the signals from remaining polymorphisms in the 11-Mb region surrounding lcyE disappeared (Figure 4, Figure S7, Figure S8, Figure S9, Figure S10, Figure S11, Table S8D). The optimal MLMM for zeaxanthin and total β-xanthophylls also included one SNP (S8_171705574; 171,705,574 bp) located within a gene encoding a 3-hydroxyacyl-CoA dehydrogenase (GRMZM2G106250). When GWAS was performed using S8_171705574 as a covariate, the signal associated with 3-hydroxyacyl-CoA dehydrogenase disappeared, but the signals in the lcyE region remained (Figure 5, Figure S12, Table S8E).

Figure 4.

GWAS for the ratio of β-xanthophylls to α-xanthophylls content in maize grain. (A) Scatter plot of association results from a unified mixed model analysis of the ratio of β-xanthophylls to α-xanthophylls and LD estimates (r2) across the lcyE chromosome region. Negative log10-transformed P-values (left y-axis) from a GWAS for the ratio of β-xanthophylls to α-xanthophylls and r2 values (right y-axis) are plotted against physical position (B73 RefGen_v2) for a 12-Mb region on chromosome 8 that encompasses lcyE. The blue vertical lines are –log10 P-values for SNPs that are statistically significant for the ratio of β-xanthophylls to α-xanthophylls at 5% FDR, while the gray vertical lines are –log10 P-values for SNPs that are nonsignificant at 5% FDR. Triangles are the r2 values of each SNP relative to the peak SNP (indicated in red) at 138,883,206 bp. The black horizontal dashed line indicates the –log10 P-value of the least statistically significant SNP at 5% FDR. The black vertical dashed lines indicate the start and stop positions of lcyE (GRMZM2G012966). (B) Scatter plot of association results from a conditional unified mixed model analysis of the ratio of β-xanthophylls to α-xanthophylls and LD estimates (r2) across the lcyE chromosome region, as in A. The two SNPs (lcyE SNP216 and S8_138882897) from the optimal MLMM model were included as covariates in the unified mixed model to control for the lcyE effect.

Figure 5.

GWAS for total β-xanthophylls content in maize grain. (A) Scatter plot of association results from a unified mixed model analysis of total β-xanthophylls and LD estimates (r2) across the surrounding chromosome region. Negative log10-transformed P-values (left y-axis) from a GWAS for total β-xanthophylls and r2 values (right y-axis) are plotted against physical position (B73 RefGen_v2) for a 1.2-Mb region on chromosome 8. The blue vertical lines are –log10 P-values for SNPs that are statistically significant for total β-xanthophylls at 5% FDR, while the gray vertical lines are –log10 P-values for SNPs that are nonsignificant at 5% FDR. Triangles are the r2 values of each SNP relative to the peak SNP (indicated in red) at 171,705,574 bp. The black horizontal dashed line indicates the –log10 P-value of the least statistically significant SNP at 5% FDR. (B) Scatter plot of association results from a conditional unified mixed model analysis of total β-xanthophylls and LD estimates (r2) across the 1.2-Mb chromosome region, as in A. The peak SNP from the unconditional GWAS (S8_171705574; 171,705,574 bp) was included as a covariate in the unified mixed model to control for the novel effect detected on chromosome 8.

A significant association at 5% FDR was identified between zeaxanthin and an insertion in the 3′ end (3′TE indel marker) of the crtRB1 gene (GRMZM2G152135) on chromosome 10 (P-value 1.11 × 10−6). At 10% FDR, signals for β-carotene/(β-cryptoxanthin+zeaxanthin) were detected by crtRB1 InDel4, a coding region indel, and SNP ss196501627, with P-values of 2.23 × 10−7 and 3.51 × 10−7, respectively. crtRB1 encodes a nonheme dioxygenase that hydroxylates β-rings of carotenoids. Significant associations with β-carotene, ratios of β-carotene/β-cryptoxanthin and β-carotene/β-cryptoxanthin+zeaxanthin, and total carotenoid content were previously reported for crtRB1 (Yan et al. 2010). The MLMM analysis produced an optimal model that contained only crtRB1 InDel4, which, when included as a covariate in GWAS, removed other signals in the region (Figure 6, Figure S13, Figure S14, Table S8F).

Figure 6.

GWAS for the ratio of β-carotene to β-cryptoxanthin plus zeaxanthin content in maize grain. (A) Scatter plot of association results from a unified mixed model analysis of the ratio of β-carotene to β-cryptoxanthin plus zeaxanthin and LD estimates (r2) across the crtRB1 chromosome region. Negative log10-transformed P-values (left y-axis) from a GWAS for the ratio of β-carotene to β-cryptoxanthin plus zeaxanthin and r2 values (right y-axis) are plotted against physical position (B73 RefGen_v2) for a 1.2-Mb region on chromosome 10 that encompasses crtRB1. The vertical lines are –log10 P-values for all tested SNPs in this region. Triangles are the r2 values of each SNP relative to the peak polymorphism (indicated in red) at 136,059,748 bp. The black vertical dashed lines indicate the start and stop positions of crtRB1 (GRMZM2G152135). (B) Scatter plot of association results from a conditional unified mixed model analysis of the ratio of β-carotene to β-cryptoxanthin plus zeaxanthin and LD estimates (r2) across the crtRB1 chromosome region, as in A. The peak polymorphism from the unconditional GWAS (crtRB1 InDel4; 136,059,748 bp) was included as a covariate in the unified mixed model to control for the crtRB1 effect.

The zep1, lut1, lcyE, and crtRB1 genes were the only carotenoid biosynthetic genes identified in the GWAS with peak signals located within or adjacent to their coding regions. To simultaneously account for the potential confounding effects of these moderate-to-strong association signals (Platt et al. 2010), a more stringent conditional analysis was conducted. Inclusion of peak polymorphisms for each of the genes individually eliminated signals for that gene, but signals for the other three genes remained (Table S8, B–D and F). When polymorphisms tagging all four genes were simultaneously included as covariates in the GWAS model, however, only two SNPs remained statistically significant at 5% FDR (Table S8G). The first of these SNPs—S7_13843351 (chromosome 7; 13,843,351 bp; associated with β-cryptoxanthin at P-value 4.86 × 10−8)—lies within GRMZM2G001938, an exostosin family protein. The second SNP—S8_171705574 (chromosome 8; 171,705,574 bp; associated with zeaxanthin at P-value 1.54 × 10−7)—lies in the putative 3-hydroxyacyl-CoA dehydrogenase (GRMZM2G106250). This gene was also found to be associated with zeaxanthin in the MLMM analysis of chromosome 8 presented above.

Pathway-level analysis

The large number of markers used for GWAS requires a very conservative adjustment for the multiple testing problem, permitting detection of only the strongest association signals. To assess weaker association signals, we performed a pathway-level analysis with a set of 58 a priori metabolic genes that are potentially involved in the genetic control of natural variation for carotenoid synthesis or degradation. The FDR procedure was conducted on a subset of 7408 SNPs and seven indels located within ±250 kb of these 58 candidate genes tested for all 24 carotenoid traits, and a total of 38 SNPs and three indels were significant at 5% FDR (Table S9). Seven SNPs were in the vicinity of three genes involved in plastidic synthesis of isopentenyl pyrophosphate (IPP): IPP isomerase 3 (ippi3, GRMZM2G133082), 1-deoxy-D-xylulose 5-phosphate synthase 2 (dxs2, GRMZM2G493395), and geranylgeranyl pyrophosphate synthase 2 (ggps2, GRMZM2G102550). The remaining markers were within ±250 kb of eight carotenoid biosynthetic pathway genes: β-carotene hydroxylase 6 (hyd6, GRMZM2G090051), CYP97A β-ring hydroxylase (lut5, GRMZM5G837869), carotenoid isomerase 3 (crti3, GRMZM2G144273), ζ-carotene desaturase (zds1, GRMZM2G454952), zep1, lut1, lcyE, and crtRB1.
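The marker subsetting step for the pathway-level analysis (keeping polymorphisms within ±250 kb of a candidate gene) amounts to an interval filter. In the sketch below, the SNP names and positions come from the text, but the gene start and stop coordinates are hypothetical placeholders and the function name is illustrative.

```python
def snps_near_genes(snps, genes, flank=250_000):
    """Pair each SNP with the first candidate gene whose flanked interval contains it.

    snps:  iterable of (snp_name, chromosome, position)
    genes: iterable of (gene_name, chromosome, start, end)
    """
    selected = []
    for snp_name, chrom, pos in snps:
        for gene_name, gene_chrom, start, end in genes:
            if chrom == gene_chrom and start - flank <= pos <= end + flank:
                selected.append((snp_name, gene_name))
                break
    return selected

# Gene coordinates are HYPOTHETICAL placeholders, not B73 RefGen_v2 annotations.
genes = [
    ("zep1", 2, 44_440_000, 44_450_000),
    ("lcyE", 8, 138_880_000, 138_890_000),
]
snps = [
    ("S2_44448432", 2, 44_448_432),
    ("S7_13843351", 7, 13_843_351),
    ("S8_138882897", 8, 138_882_897),
    ("S8_171705574", 8, 171_705_574),
]
pathway_snps = snps_near_genes(snps, genes)
```

Restricting the test set in this way reduces the number of hypotheses from several hundred thousand to a few thousand, so a given P-value is more likely to pass the FDR threshold.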

To account for the signals from zep1, lut1, lcyE, and crtRB1, an additional pathway-level analysis was performed as per GWAS using models with covariate markers of each gene individually and one model accounting for all four genes (Table S9, B–G). When a SNP tagging zep1 or lut1 was used as a covariate, signals in the vicinity of hyd6 and ippi3 were eliminated. When two markers tagging lcyE were used as covariates, no significant SNPs were detected in the regions of crti3, ippi3, or zds1. When crtRB1 InDel4 was used as a covariate, signal was lost for ggps2 and zds1. When covariates from zep1, lut1, lcyE, and crtRB1 were placed into the model, the only significant signals remaining were from markers within ±250 kb of dxs2 (GRMZM2G493395) and lut5 (GRMZM5G837869).

Prediction of carotenoid levels

We assessed the potential of genomic selection as a method for breeding maize grain with higher levels of carotenoids. Specifically, the predictive abilities of marker data sets with three different levels of coverage—genome-wide (284,180 SNP markers and seven indels); 58 pathway-level genes (7408 SNP markers and seven indels); and eight candidate genes (y1, zds1, lcyE, crtRB3, lut1, crtRB1, zep1, and ccd1) underlying QTL associated with carotenoid levels in prior linkage population studies (944 SNP markers and seven indels)—were assessed and compared. These marker sets were tested in three types of linear regression models commonly used for genomic selection and prediction: RR-BLUP, LASSO, and elastic net analysis. While previous studies have shown that these approaches produce similar prediction accuracies (Riedelsheimer et al. 2012), it was useful to test multiple statistical models in this study, given the potential oligogenic architecture of carotenoid levels in maize grain (Wong et al. 2004; Chander et al. 2008; Kandianis et al. 2013).

We performed prediction analyses for 24 traits in total (Table 1): 15 traits expected to be of most interest to breeders (Table 2) and 9 traits capturing additional compounds, sums, ratios, and proportions (Table S5). Results for the two sets of traits (Table S10) showed equivalent trends; thus we will focus our reporting on the 15 highest-priority traits for breeding (Figure 7). We observed no consistent differences in predictive ability across the three statistical approaches (Table S10). Notably, there were no differences observed across the three marker sets for each of the traits tested; inclusion of more markers beyond those within ±250 kb of eight candidate genes underlying maize grain carotenoid QTL did not confer additional predictive ability. Additionally, we determined that the carotenoid QTL-targeted marker set yielded substantially better prediction accuracies than marker sets generated from eight 500-kb regions selected at random throughout the genome (2.765-fold mean difference; paired t = 10.68, d.f. = 23, P-value = 1.09 × 10−10) (Table S11). The carotenoid QTL-targeted marker set also outperformed markers within ±250 kb of eight genes randomly selected from the other 50 a priori candidate genes represented in the pathway-level prediction set (2.709-fold mean difference; paired t = 10.21, d.f. = 23, P-value = 2.59 × 10−10).

Figure 7.

Comparison of genomic prediction methods and marker sets for 15 grain carotenoid traits. Three prediction methods (RR-BLUP, LASSO, and elastic net analysis) were tested using three marker sets as predictors: carotenoid QTL-targeted prediction (the 944 markers and seven indels within ±250 kb of eight a priori candidate genes), pathway-level prediction (the 7408 markers and seven indels within ±250 kb of 58 a priori candidate genes), and genome-wide prediction (all 284,180 markers and seven indels used in genome-wide association studies). Standardized average correlations resulting from the fivefold cross-validation are reported. A superscript “a” indicates that no markers were selected in one or two of the five folds, or in three of the five folds in one case (α-carotene using the pathway-level prediction marker set in elastic net).

On average, we obtained a prediction accuracy of 0.43 across the 15 traits, with the highest prediction accuracies (averaged across the three marker sets and three models tested) for β-xanthophylls/α-xanthophylls (0.71), β-carotenoids/α-carotenoids (0.59), zeaxanthin (0.52), lutein (0.51), α-carotene/zeinoxanthin (0.51), zeinoxanthin (0.49), β-cryptoxanthin (0.44), and zeinoxanthin/lutein (0.43) (Table 3, Figure 7). We found a weak but significant positive relationship between trait heritabilities and unstandardized prediction correlations (rsp = 0.57, P-value = 0.026). This relationship was no longer significant at a significance level of α = 0.05 when α-carotene, the least heritable trait (line-mean heritability estimate ĥ² = 0.25), was excluded (rsp = 0.49, P-value = 0.079). In contrast, standardized prediction accuracies for the 15 traits were observed to scale consistently with the number of significant marker associations observed in GWAS (rsp = 0.91, P-value = 2.2 × 10−6) (Table 3). The eight traits with prediction accuracies above or at the mean had at least one significant marker association in a GWAS at a genome-wide FDR of 10%. Given that the standardized prediction accuracies were also strongly positively correlated with the partial r2 value of the most significantly associated marker for a given trait (rsp = 0.85, P-value = 6.9 × 10−5) and strongly negatively correlated with the P-values of that marker (rsp = −0.94, P-value = 2.09 × 10−7), these results also suggest that effect size of associated markers is an important factor driving prediction accuracy.
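The rank correlation between prediction accuracy and the number of significant GWAS associations can be recomputed from the two corresponding columns of Table 3. The sketch below implements Spearman's coefficient as the Pearson correlation of average ranks; the helper names are illustrative, and the inputs are the rounded table values rather than the underlying per-model results.

```python
import numpy as np

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    sorted_values = values[order]
    ranks = np.empty(len(values))
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and sorted_values[j + 1] == sorted_values[i]:
            j += 1
        ranks[order[i : j + 1]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

# Columns copied from Table 3, in row order.
mean_accuracy = [0.714, 0.587, 0.518, 0.509, 0.506, 0.488, 0.439, 0.432,
                 0.395, 0.390, 0.345, 0.342, 0.332, 0.231, 0.208]
n_associations = [24, 4, 11, 3, 3, 4, 1, 3, 0, 0, 0, 0, 0, 0, 0]

rho = spearman(mean_accuracy, n_associations)  # approximately 0.91
```

The seven traits with no significant associations share a single tied rank, which is why a tie-aware ranking is needed here.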

Table 3. Mean prediction accuracies and significant marker associations for 15 grain carotenoid traits.

Significant marker associations within ±3 kb of a candidate gene
Trait Mean prediction accuracy Significant marker associations (10% FDR) Partial r2 of most significant marker P-value of most significant marker Total Per candidate gene
β-Xanthophylls/α-xanthophylls 0.714 24 0.14 5.05E-16 13 zep1 (2), lcyE (11)
β-Carotenoids/α-carotenoids 0.587 4 0.17 2.08E-09 3 lcyE (3)
Zeaxanthin 0.518 11 0.19 2.22E-09 4 zep1 (2), lcyE, crtRB1
Lutein 0.509 3 0.34 6.28E-09 2 lcyE (2)
α-Carotene/zeinoxanthin 0.506 3 0.19 3.31E-10 1 lut1
Zeinoxanthin 0.488 4 0.14 8.95E-08 1 lut1
β-Cryptoxanthin 0.439 1 0.13 1.66E-07 0
Zeinoxanthin/lutein 0.432 3 0.15 4.97E-08 1 lut1
β-Carotene/β-cryptoxanthin 0.395 0 0.12 5.38E-07 0
α-Carotene 0.390 0 0.1 4.93E-06 0
Acyclic and monocyclic carotenes 0.345 0 0.1 5.72E-06 0
Provitamin A 0.342 0 0.1 5.81E-06 0
β-Cryptoxanthin/zeaxanthin 0.332 0 0.1 3.41E-06 0
Total carotenoids 0.231 0 0.11 5.80E-06 0
β-Carotene 0.208 0 0.09 1.46E-05 0

Mean prediction accuracies, significant marker associations, and the partial r2 and P-values of the most significant marker of each trait from a GWAS for the 15 priority grain carotenoid traits. Mean prediction accuracies were obtained by averaging across RR-BLUP, LASSO, and elastic net analysis prediction methods and carotenoid QTL-targeted, pathway-level, and genome-wide marker sets. A 10% FDR threshold was used to determine significance. A full list of significant marker associations detected for each trait in GWAS without covariates, including those located within ±3 kb of a candidate gene, can be found in Table S8A.

Discussion

Provitamin A biofortification efforts are strengthened by association studies that further characterize the underlying genetic basis of variation for maize grain carotenoids and thus provide more loci that can be used in different combinations in MAS and GS programs. Four major-effect loci were identified in GWAS: the previously reported associations of lcyE and crtRB1 with maize grain carotenoids and, notably, new associations with zep1 and lut1. MLMMs and covariate analyses were used to distinguish and eliminate noncausal variation in LD with putative causal variants. We also demonstrated higher genetic mapping resolution with genome-wide SNP markers than previous QTL studies in biparental mapping populations that identified candidate genes associated with levels of carotenoids and orange kernel color in maize grain (Wong et al. 2004; Chander et al. 2008; Chandler et al. 2013; Kandianis et al. 2013).

A series of prediction analyses was used to compare the relative usefulness of the full set of GWAS markers with a pathway-level set of markers and with a smaller carotenoid QTL-targeted marker set. Alleles or haplotypes with effect estimates falling below the conservative detection thresholds applied in GWAS are fitted in genomic selection and prediction models in addition to more strongly associated loci. This increased genome coverage compared to traditional MAS may prove an effective selection strategy for maize grain carotenoid traits, including provitamin A.

Significant SNPs associated with zeaxanthin and total β-xanthophylls were identified in the coding region of zep1, which fits well with the activity of the encoded enzyme in converting zeaxanthin to violaxanthin via antheraxanthin (Hieber et al. 2000). In the zep1 region, QTL have been identified for levels of β-branch carotenoids, zeaxanthin, β-cryptoxanthin, and β-carotene (Kandianis et al. 2013) and for degree of orange color (Chandler et al. 2013), a trait associated with higher levels of zeaxanthin (Pfeiffer and McClafferty 2007). These linkage studies provide independent support for our association results for zep1.

A SNP in the lut1-coding region was associated through GWAS with α-carotene/zeinoxanthin, zeinoxanthin/lutein, and zeinoxanthin, again consistent with the enzymatic activity of lut1 in forming lutein by hydroxylation of the ε-ring of zeinoxanthin (Tian et al. 2004; Quinlan et al. 2012). A QTL for lutein was reported near the lut1 region in a low-resolution biparental mapping population (Chander et al. 2008). Pathway-level analysis with covariates for lcyE detected two additional SNPs ∼240 kb upstream of the lut1 start codon that were also associated with the ratio of zeinoxanthin to lutein. However, it may be difficult to determine whether or not these additional signals indicate an enhancer element upstream of lut1 because this region is part of the chromosome 1 pericentromeric region (Gore et al. 2009). Substantially larger association panels that better exploit the recombinational history of maize, such as the Ames diversity panel (Romay et al. 2013), are needed to provide more statistical power and precision in the lut1 interval.

Significant SNPs associated with zeaxanthin and total β-xanthophylls were identified in the coding regions of lcyE and a gene encoding a 3-hydroxyacyl-CoA dehydrogenase. Given that allelic variation in lcyE influences relative flux into the α- and β-branches of the carotenoid pathway (Harjes et al. 2008), this is a logical candidate gene for influencing levels of zeaxanthin and total β-xanthophylls. Although the 3-hydroxyacyl-CoA dehydrogenase gene does not have a known function in the carotenoid pathway or in regulating the pathway, when a SNP in the 3-hydroxyacyl-CoA dehydrogenase-coding region (S8_171705574) was used as a covariate in GWAS, the signal in the lcyE region was still present for zeaxanthin and total β-xanthophylls. Determining whether there is a true association of 3-hydroxyacyl-CoA dehydrogenase with levels of zeaxanthin and total β-xanthophylls, or if the presence of these associations is due to long-range LD with lcyE or another gene on chromosome 8, merits further investigation. Again, this two-gene region could be better resolved in a larger association panel.

The crtRB1 gene showed a relatively weak signal in GWAS with no significant SNPs at 5% FDR and only one significant SNP associated with the ratio of β-carotene to β-cryptoxanthin+zeaxanthin at a genome-wide FDR of 10%. The inclusion of two indel markers for crtRB1 revealed signals between the 3′ TE indel marker and zeaxanthin and total β-xanthophylls and between the InDel4 marker and ratio of β-carotene to β-cryptoxanthin+zeaxanthin. There was only one SNP in our data set within the coding region of crtRB1 and, as a result, the SNPs did not capture the relevant variation described in Yan et al. (2010). Notably, the detection of a significant association with the two indel markers showed that the contribution of the crtRB1 gene was similar to that previously reported.

The analysis of a pathway-level, 58 a priori candidate gene set revealed additional weaker signals within ±250 kb of 7 of these candidate genes. However, when covariates identified from MLMM analysis as tagging the signals of zep1, lut1, lcyE, and crtRB1 were added to the model, polymorphisms in the vicinity of 5 of these candidate genes lost significance and only dxs2 and lut5 remained significant. These results suggest that dxs2 and lut5 should be further investigated, as they logically could affect carotenoid traits. The gene regions and polymorphisms that were or were not significant depended on the analysis performed: GWAS, pathway-level analysis, MLMM, and covariate analysis. The polymorphisms significant in one or more of these analyses should be evaluated in much larger association and linkage panels that provide greater genetic diversity, power, and precision. The pathway-level analysis that we performed was designed in part to minimize the multiple hypothesis testing penalty (Califano et al. 2012). Other statistical methodologies that consider all significant loci from GWAS, along with transcriptional and protein interaction networks, have the potential to identify genes outside of the pathway that affect carotenoid accumulation (Baranzini et al. 2009; Chan et al. 2011) as well as polymorphisms surrounding these gene regions that may be useful in selection programs for higher levels of provitamin A, total carotenoids, and orange grain color.

To evaluate the relative gains to be expected from conducting genomic selection for carotenoid traits in maize grain, we tested multiple prediction methods and marker sets. The RR-BLUP method assigns equal variance to all included markers (Meuwissen et al. 2001). This approach is optimal for complex traits having many underlying QTL of small effect. Given that carotenoid traits are likely largely explained by a small number of moderate- to large-effect loci (Wong et al. 2004; Chander et al. 2008; Kandianis et al. 2013), we hypothesized that a variable selection method that shrinks the variance explained by noncontributing markers to near or equal to zero, such as LASSO or elastic net analysis, would show higher predictive ability. While no differences were found among the three statistical approaches used in this study, we recommend continued model comparison for carotenoid traits in future analyses that employ larger maize populations with higher marker densities (Gore et al. 2009; McMullen et al. 2009; Chia et al. 2012; Romay et al. 2013).
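The contrast drawn above between uniform shrinkage (RR-BLUP, which corresponds to ridge regression) and variable selection (LASSO, elastic net) has a closed form in the textbook special case of an orthonormal marker matrix. The sketch below uses that simplifying assumption, which does not hold for real SNP data, and the least-squares effects and penalty values are invented.

```python
import numpy as np

def ridge_shrink(b_ols, lam):
    """Ridge solution for orthonormal X: every effect is scaled by 1 / (1 + lam)."""
    return b_ols / (1.0 + lam)

def lasso_shrink(b_ols, lam):
    """LASSO solution for orthonormal X: soft-thresholding at lam."""
    return np.sign(b_ols) * np.maximum(np.abs(b_ols) - lam, 0.0)

def elastic_net_shrink(b_ols, lam1, lam2):
    """Naive elastic net for orthonormal X: soft-threshold, then ridge-scale."""
    return lasso_shrink(b_ols, lam1) / (1.0 + lam2)

# Hypothetical least-squares marker effects: two large, three near zero.
b_ols = np.array([2.0, -1.5, 0.3, -0.1, 0.05])

ridge = ridge_shrink(b_ols, lam=1.0)
lasso = lasso_shrink(b_ols, lam=0.5)
enet = elastic_net_shrink(b_ols, lam1=0.25, lam2=0.5)
```

Ridge keeps all five markers in the model with proportionally reduced effects, whereas LASSO sets the three small effects exactly to zero and elastic net falls in between, which is why the latter two were expected to suit an oligogenic trait architecture.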

Across the 15 traits tested, the three statistical approaches achieved a wide range of mean prediction accuracies: from 0.21 for β-carotene to 0.71 for β-xanthophylls/α-xanthophylls (Table 3). Standard errors were generally equivalent in size across the statistical methods and marker sets tested (Table S10). Notably, the seven traits showing below-average prediction accuracy also showed no significant marker associations in GWAS (Table 3). This result, along with the strong positive correlation observed between prediction accuracy and the partial r2 value of the most strongly associated marker for each trait, suggests that markers in strong LD with causative variants of at least moderate effect likely contributed to higher prediction accuracy of particular carotenoid traits in maize grain. Additionally, the comparable predictive abilities observed between the eight-gene QTL-targeted set and the larger candidate gene and genome-wide marker sets support the hypothesis that density of marker coverage in carotenoid candidate gene regions was the primary driver in determining relative and absolute predictive power for carotenoid traits in this panel.

Most notably, linear regression models trained on only the 944 SNP markers and seven indels within ±250 kb of the eight candidate genes in the carotenoid QTL-targeted data set were generally as predictive as models trained on all 284,180 genome-wide SNP markers and seven indels (Figure 7). A similar result was reported by Rutkoski et al. (2012) for another oligogenic trait, deoxynivalenol levels in wheat: adding genome-wide markers decreased prediction accuracies relative to a model containing only markers associated with QTL. This key finding of our study, that a targeted approach based on ∼300-fold fewer markers was as predictive as genome-wide coverage, suggests that QTL-targeted approaches will be effective for favorably modifying and improving carotenoid composition in maize grain. However, continued prediction analyses in panels with larger sample size and greater genetic diversity, as well as studies in breeding populations, are needed to examine whether more extensive genome coverage affords higher prediction accuracies than the carotenoid QTL-targeted prediction sets, owing to increased power to detect weaker QTL effects and rarer alleles in a larger panel or population.
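
The construction of a QTL-targeted marker set and the size of the resulting reduction can be sketched as follows. The function name, the gene and marker coordinates, and the choice to measure the ±250-kb window from the gene start and end are assumptions made for illustration; only the marker counts are taken from the text.

```python
WINDOW_BP = 250_000  # +/-250 kb flank around each candidate gene, as stated in the text


def markers_near_genes(markers, genes, window=WINDOW_BP):
    """Return names of markers lying within `window` bp of any gene interval."""
    selected = []
    for name, chrom, pos in markers:
        for g_chrom, g_start, g_end in genes:
            if chrom == g_chrom and g_start - window <= pos <= g_end + window:
                selected.append(name)
                break  # count each marker once even if windows overlap
    return selected


# Hypothetical coordinates for illustration only (not real maize positions).
genes = [(10, 1_000_000, 1_005_000)]
markers = [
    ("m1", 10, 760_000),    # inside the upstream flank
    ("m2", 10, 1_250_000),  # inside the downstream flank
    ("m3", 10, 1_300_000),  # beyond the downstream flank
    ("m4", 2, 1_000_000),   # different chromosome
]
print(markers_near_genes(markers, genes))

# Marker counts reported in the text: SNPs plus indels in each set.
genome_wide = 284_180 + 7
qtl_targeted = 944 + 7
print(round(genome_wide / qtl_targeted))  # fold reduction in marker number
```

With the reported counts, 284,187 markers divided by 951 markers gives a ratio of about 299, in line with the ∼300-fold reduction cited above.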

In the panel we studied, many of the most significant SNP associations are related to known carotenoid genes. Given these results and the likely oligogenic nature of maize grain carotenoid traits, it was logical to confine pathway-level prediction efforts to genes within the biochemical pathway. Recent efforts have made use of transcriptional networks to identify groups of genes showing subthreshold associations with phenotypes of interest (Baranzini et al. 2009; Chan et al. 2011). Additionally, an experimental study of general combining ability in hybrid maize found that models using metabolite profiles as predictor variables, although without network analysis, achieved prediction accuracies similar to those of models based on SNP marker data, but observed no further gains when the two types of data were combined (Riedelsheimer et al. 2012). Our understanding of the genetics underlying maize carotenoid levels may benefit from the integration of network analysis and prediction approaches. Targeted or nontargeted gene expression and metabolite profiling approaches could feasibly be used together, particularly in larger panels, to identify transcriptional and metabolite networks that exhibit associations with carotenoid phenotypes but may not be represented in pathway-level analyses. The constituents of these networks could then be included as additional predictor variables in models for potential further gains in accuracy.

Prior to our study, the best-characterized genes for provitamin A biofortification in maize grain were lcyE and crtRB1 (Harjes et al. 2008; Yan et al. 2010; Burt et al. 2011; Babu et al. 2013). Before these genes were characterized, breeding programs for developing countries with vitamin A deficiency performed selection based on HPLC analysis to directly measure carotenoid levels in maize kernels. These efforts had achieved only 6–8 μg/g provitamin A in their experimental maize breeding materials (http://www.harvestplus.org; Pixley et al. 2013). This was only about half of the initial HarvestPlus biofortification target level of 15 μg/g, and only small incremental gains in provitamin A levels were achieved during cycles of selection. MAS for a favorable crtRB1 allele has rapidly increased provitamin A content to >20 μg/g in maize grain from experimental lines soon to be released (Azmach et al. 2013; Pixley et al. 2013).

Despite the excellent progress in breeding for higher levels of provitamin A, even higher levels are needed to account for postharvest degradation, which can result in a 70% reduction in provitamin A content in a 4- to 6-month storage period. Furthermore, in the second phase of HarvestPlus, higher target levels of provitamin A will be set so that smaller, more attainable quantities of maize grain can be consumed in a day to provide a beneficial level of provitamin A. This will broaden the impact of high provitamin A maize intervention programs. Thus genetic research that enables continual increases in levels of provitamin A is needed. To this end, use of GWAS and pathway-level gene sets with covariate analysis has revealed additional potentially useful genes.
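
The consequence of postharvest degradation for breeding targets can be made concrete with a short calculation. The sketch assumes that the 70% loss cited above applies uniformly to the grain, and it treats 15 μg/g as a level to be retained after storage, which is an illustrative assumption rather than a stated HarvestPlus definition; the function names are hypothetical.

```python
def content_after_storage(initial_ug_per_g, fraction_lost=0.70):
    """Provitamin A remaining after storage, given the fraction degraded."""
    return initial_ug_per_g * (1.0 - fraction_lost)


def required_at_harvest(target_ug_per_g, fraction_lost=0.70):
    """Harvest-time content needed to retain a target level after storage."""
    return target_ug_per_g / (1.0 - fraction_lost)


# Grain at 20 ug/g at harvest, after a 70% loss during storage.
print(f"{content_after_storage(20.0):.1f} ug/g remaining")

# Harvest-time content needed to retain 15 ug/g after the same loss.
print(f"{required_at_harvest(15.0):.1f} ug/g needed at harvest")
```

Under these assumptions, grain harvested at 20 μg/g would retain about 6 μg/g, and retaining 15 μg/g would require about 50 μg/g at harvest, which illustrates why continual increases in provitamin A levels remain a breeding objective.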

For maize provitamin A biofortification to be effective in Africa, breeders are faced with the challenge of converting white maize germplasm that has had no direct selection for alleles in the carotenoid pathway to germplasm that has a dark-orange endosperm, high total carotenoids, and high provitamin A. In addition to the two genes already tapped for biofortification efforts, our GWAS results demonstrate the substantial contribution of two new genes, zep1 and lut1, to carotenoid variation in maize grain. The improved knowledge of the associated effects of these two genes may lead to better prediction and selection of carotenoid levels in breeding populations, particularly for xanthophylls, total carotenoids, and the color orange, given the roles of zep1 and lut1 in the biosynthetic pathway. Zeaxanthin and lutein are the predominant carotenoid compounds in maize, and accessions with darker orange kernels generally have higher levels of these two compounds (Pfeiffer and McClafferty 2007; Burt et al. 2011).

Although the four genes detected in GWAS (zep1, lut1, lcyE, and crtRB1) are clearly important, they may not be sufficient for efficient breeding in all contexts and genetic backgrounds. We propose that favorable alleles at y1, zds1, lcyE, crtRB3, lut1, crtRB1, zep1, and ccd1 could be selected for rapid conversion endeavors. Our prediction analyses show that these eight genes are at least as effective for predicting carotenoid levels as a genome-wide set of predictors. While simultaneously selecting for eight genes would be resource-intensive, testing for the presence or absence of specific favorable alleles at y1, zds1, crtRB3, zep1, lut1, and ccd1 in addition to lcyE and crtRB1 in the elite adapted white-grain germplasm to be improved and the respective orange-grain donor germplasm should help breeders design effective MAS conversion strategies. We hypothesize that, in lines that have yellow or orange endosperm color or in lines already in selection programs for provitamin A, fewer genes will need to be selected. While crtRB1 has been shown to be very useful for improving β-carotene, current objectives include selecting for the color orange. This eight-gene set is proposed to meet this need. Future breeding objectives will also include (i) increasing the β-cryptoxanthin component of provitamin A since studies have shown that β-cryptoxanthin appears to be twice as bioavailable as β-carotene (Davis et al. 2008; Burri et al. 2011; Turner et al. 2013) and (ii) selecting for higher zeaxanthin and lutein levels for prevention of macular degeneration. The zep1 and lut1 genes should be especially useful in selection programs designed to meet these targets.
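
The allele survey proposed above, in which recipient and donor germplasm are tested for favorable alleles before a conversion strategy is designed, can be sketched as a simple comparison. The function name and the presence/absence calls for both lines are hypothetical; only the eight gene names come from the text.

```python
GENES = ["y1", "zds1", "lcyE", "crtRB3", "lut1", "crtRB1", "zep1", "ccd1"]


def genes_to_select(recipient, donor, genes=GENES):
    """Genes where the donor carries the favorable allele and the recipient does not."""
    return [g for g in genes if donor.get(g, False) and not recipient.get(g, False)]


# Hypothetical presence (True) or absence (False) of the favorable allele
# in an elite white-grain recipient line.
white_recipient = {
    "y1": False, "zds1": True, "lcyE": False, "crtRB3": True,
    "lut1": True, "crtRB1": False, "zep1": False, "ccd1": True,
}

# Hypothetical orange-grain donor carrying the favorable allele at all genes but zep1.
orange_donor = {g: True for g in GENES}
orange_donor["zep1"] = False  # this donor cannot supply a favorable zep1 allele

print(genes_to_select(white_recipient, orange_donor))
```

In this toy case only y1, lcyE, and crtRB1 would need to be tracked with markers in the cross, and a different donor would be required to introduce a favorable zep1 allele.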

Use of the genes or subsets of genes in the carotenoid prediction sets could have a transformational effect on maize in Sub-Saharan Africa, starting with Ethiopia and Zimbabwe, the next HarvestPlus target countries. The rapid, cost-effective development of high-yielding, locally adapted germplasm with high provitamin A and total carotenoids and dark-orange kernel color could effectively create a new widespread biofortified grain crop. Consumption of this grain will provide essential provitamin A carotenoids and a broad carotenoid profile exhibiting an array of nutritional attributes.

Supplementary Material

Supporting Information

Acknowledgments

We thank Evan J. Klug and Xiodan Xi for assistance with processing samples and HPLC assays; Kristin Chandler, Jerry Chandler, and Jason Morales for assistance in field work and seed processing; and Jean-Luc Jannink, Nicolas Heslot, Jessica Rutkoski, and Vahid Edriss for assistance in genomic prediction. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture (USDA). The USDA is an equal opportunity provider and employer. This research was supported by National Science Foundation grants DBI-0922493 (D.D.P., T.R., E.S.B., and C.R.B.), DBI-0096033 (E.S.B.), DBI-0820619 (E.S.B.), and DBI-1238014 (E.S.B.); by HarvestPlus (T.R.); by Purdue University startup funds and Patterson Chair funds (T.R.); by the USDA–Agricultural Research Service (E.S.B.); by Cornell University startup funds (M.A.G.); by a USDA National Needs Fellowship (C.H.D.); and by a Borlaug Fellowship (B.F.O.).

Footnotes

Communicating editor: A. H. Paterson

Literature Cited

  1. Abdel-Aal E. M., Akhtar H., Zaheer K., Ali R., 2013. Dietary sources of lutein and zeaxanthin carotenoids and their role in eye health. Nutrients 5: 1169–1185.
  2. Azmach G., Gedil M., Menkir A., Spillane C., 2013. Marker-trait association analysis of functional gene markers for provitamin A levels across diverse tropical yellow maize inbred lines. BMC Plant Biol. 13: 227.
  3. Babu R., Rojas N. P., Gao S. B., Yan J. B., Pixley K., 2013. Validation of the effects of molecular marker polymorphisms in LcyE and CrtRB1 on provitamin A concentrations for 26 tropical maize populations. Theor. Appl. Genet. 126: 389–399.
  4. Baranzini S. E., Galwey N. W., Wang J., Khankhanian P., Lindberg R., et al., 2009. Pathway and network-based analysis of genome-wide association studies in multiple sclerosis. Hum. Mol. Genet. 18: 2078–2090.
  5. Benjamini Y., Hochberg Y., 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Series B Stat. Methodol. 57: 289–300.
  6. Berardo N., Mazzinelli G., Valoti P., Lagana P., Redaelli R., 2009. Characterization of maize germplasm for the chemical composition of the grain. J. Agric. Food Chem. 57: 2378–2384.
  7. Box G. E. P., Cox D. R., 1964. An analysis of transformations. J. R. Stat. Soc., B 26: 211–252.
  8. Bradbury P. J., Zhang Z., Kroon D. E., Casstevens T. M., Ramdoss Y., et al., 2007. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23: 2633–2635.
  9. Britton G., 1995a. Structure and properties of carotenoids in relation to function. FASEB J. 9: 1551–1558.
  10. Britton G., 1995b. Carotenoids: Isolation and Analysis. Birkhäuser, Basel.
  11. Buckner B., Kelson T. L., Robertson D. S., 1990. Cloning of the y1 locus of maize, a gene involved in the biosynthesis of carotenoids. Plant Cell 2: 867–876.
  12. Burri B. J., Chang J. S. T., Neidlinger T. R., 2011. β-Cryptoxanthin- and α-carotene-rich foods have greater apparent bioavailability than β-carotene-rich foods in Western diets. Br. J. Nutr. 105: 212–219.
  13. Burt A. J., Grainger C. M., Smid M. P., Shelp B. J., Lee E. A., 2011. Allele mining of exotic maize germplasm to enhance macular carotenoids. Crop Sci. 51: 991–1004.
  14. Califano A., Butte A. J., Friend S., Ideker T., Schadt E., 2012. Leveraging models of cell regulation and GWAS data in integrative network-based association studies. Nat. Genet. 44: 841–847.
  15. Cazzonelli C. I., Pogson B. J., 2010. Source to sink: regulation of carotenoid biosynthesis in plants. Trends Plant Sci. 15: 266–274.
  16. Chan E. K. F., Rowe H. C., Corwin J. A., Joseph B., Kliebenstein D. J., 2011. Combining genome-wide association mapping and transcriptional networks to identify novel genes controlling glucosinolates in Arabidopsis thaliana. PLoS Biol. 9: e1001125.
  17. Chander S., Guo Y. Q., Yang X. H., Zhang J., Lu X. Q., et al., 2008. Using molecular markers to identify two major loci controlling carotenoid contents in maize grain. Theor. Appl. Genet. 116: 223–233.
  18. Chandler K., Lipka A. E., Owens B. F., Li H. H., Buckler E. S., et al., 2013. Genetic analysis of visually scored orange kernel color in maize. Crop Sci. 53: 189–200.
  19. Chen J. H., Chen Z. H., 2008. Extended Bayesian information criteria for model selection with large model spaces. Biometrika 95: 759–771.
  20. Chia J. M., Song C., Bradbury P. J., Costich D., de Leon N., et al., 2012. Maize HapMap2 identifies extant variation from a genome in flux. Nat. Genet. 44: 803–807.
  21. Combs G. F., 2012. Vitamin A, pp. 93–138 in Vitamins: Fundamental Aspects in Nutrition and Health, Ed 4. Academic Press, New York, NY.
  22. Cunningham F. X., Pogson B., Sun Z. R., McDonald K. A., DellaPenna D., et al., 1996. Functional analysis of the beta and epsilon lycopene cyclase enzymes of Arabidopsis reveals a mechanism for control of cyclic carotenoid formation. Plant Cell 8: 1613–1626.
  23. Cuttriss A. J., Cazzonelli C. I., Wurtzel E. T., Pogson B. J., 2011. Carotenoids. Adv. Bot. Res. 58: 1–36.
  24. Davis C., Jing H., Howe J. A., Rocheford T., Tanumihardjo S. A., 2008. β-Cryptoxanthin from supplements or carotenoid-enhanced maize maintains liver vitamin A in Mongolian gerbils (Meriones unguiculatus) better than or equal to β-carotene supplements. Br. J. Nutr. 100: 786–793.
  25. DellaPenna D., Pogson B. J., 2006. Vitamin synthesis in plants: tocopherols and carotenoids. Annu. Rev. Plant Biol. 57: 711–738.
  26. Egesel C. O., Wong J. C., Lambert R. J., Rocheford T. R., 2003. Gene dosage effects on carotenoid concentration in maize grain. Maydica 48: 183–190.
  27. Emerson R. A., 1921. The Genetic Relations of Plant Colors in Maize. Cornell University Press, Ithaca, NY.
  28. Endelman J. B., 2011. Ridge regression and other kernels for genomic selection with R package rrBLUP. Plant Genome 4: 250–255.
  29. Flint-Garcia S. A., Thuillet A. C., Yu J. M., Pressoir G., Romero S. M., et al., 2005. Maize association population: a high-resolution platform for quantitative trait locus dissection. Plant J. 44: 1054–1064.
  30. Friedman D. S., O’Colmain B., Tomany S. C., McCarty C., de Jong P. T., et al., 2004. Prevalence of age-related macular degeneration in the United States. Arch. Ophthalmol. 122: 564–572.
  31. Friedman J., Hastie T., Tibshirani R., 2010. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33: 1–22.
  32. Fu J., Cheng Y., Linghu J., Yang X., Kang L., et al., 2013. RNA sequencing reveals the complex regulatory network in the maize kernel. Nat. Commun. 4: 2832.
  33. Fu Z. Y., Chai Y. C., Zhou Y., Yang X. H., Warburton M. L., et al., 2013. Natural variation in the sequence of PSY1 and frequency of favorable polymorphisms among tropical and temperate maize germplasm. Theor. Appl. Genet. 126: 923–935.
  34. Gilmour, A. R. G., B.; B. Cullis, R. Thompson, D. Butler, 2009. Asreml User Guide Release 3.0. VSN International Ltd, Hemel Hempstead, UK.
  35. Goff S. A., Klee H. J., 2006. Plant volatile compounds: Sensory cues for health and nutritional value? Science 311: 815–819.
  36. Gonzalez-Jorge S., Ha S. H., Magallanes-Lundback M., Gilliland L. U., Zhou A. L., et al., 2013. Carotenoid cleavage dioxygenase4 is a negative regulator of β-carotene content in Arabidopsis seeds. Plant Cell 25: 4812–4826.
  37. Gore M. A., Chia J. M., Elshire R. J., Sun Q., Ersoz E. S., et al., 2009. A first-generation haplotype map of maize. Science 326: 1115–1117.
  38. Harjes C. E., Rocheford T. R., Bai L., Brutnell T. P., Kandianis C. B., et al., 2008. Natural genetic variation in lycopene epsilon cyclase tapped for maize biofortification. Science 319: 330–333.
  39. Hieber A. D., Bugos R. C., Yamamoto H. Y., 2000. Plant lipocalins: violaxanthin de-epoxidase and zeaxanthin epoxidase. Biochim. Biophys. Acta 1482: 84–91.
  40. Holland J. B., Nyquist W. E., Cervantes-Martinez C. T., 2003. Estimating and interpreting heritability for plant breeding: an update. Plant Breed. Rev. 22: 9–112.
  41. Howe J. A., Tanumihardjo S. A., 2006. Carotenoid-biofortified maize maintains adequate vitamin A status in Mongolian gerbils. J. Nutr. 136: 2562–2567.
  42. Howitt C. A., Pogson B. J., 2006. Carotenoid accumulation and function in seeds and non-green tissues. Plant Cell Environ. 29: 435–445.
  43. Hung H. Y., Browne C., Guill K., Coles N., Eller M., et al., 2012. The relationship between parental genetic or phenotypic divergence and progeny variation in the maize nested association mapping population. Heredity 108: 490–499.
  44. Jerome-Morais A., Diamond A. M., Wright M. E., 2011. Dietary supplements and human health: For better or for worse? Mol. Nutr. Food Res. 55: 122–135.
  45. Kandianis C. B., Stevens R., Liu W. P., Palacios N., Montgomery K., et al., 2013. Genetic architecture controlling variation in grain carotenoid composition and concentrations in two maize populations. Theor. Appl. Genet. 126: 2879–2895.
  46. Kermode A. R., 2005. Role of abscisic acid in seed dormancy. J. Plant Growth Regul. 24: 319–344.
  47. Khoo H. E., Prasad K. N., Kong K. W., Jiang Y., Ismail A., 2011. Carotenoids and their isomers: color pigments in fruits and vegetables. Molecules 16: 1710–1738.
  48. Kim J., Smith J. J., Tian L., DellaPenna D., 2009. The evolution and function of carotenoid hydroxylases in Arabidopsis. Plant Cell Physiol. 50: 463–479.
  49. Krinsky N. I., Landrum J. T., Bone R. A., 2003. Biologic mechanisms of the protective role of lutein and zeaxanthin in the eye. Annu. Rev. Nutr. 23: 171–201.
  50. Kutner M. H., 2005. Applied Linear Statistical Models. McGraw-Hill Irwin, Boston.
  51. Lantieri F., Jhun M. A., Park J., Park T., Devoto M., 2009. Comparative analysis of different approaches for dealing with candidate regions in the context of a genome-wide association study. BMC Proc. 3(Suppl 7): S93.
  52. Li Z. H., Matthews P. D., Burr B., Wurtzel E. T., 1996. Cloning and characterization of a maize cDNA encoding phytoene desaturase, an enzyme of the carotenoid biosynthetic pathway. Plant Mol. Biol. 30: 269–279.
  53. Lipka A. E., Gore M. A., Magallanes-Lundback M., Mesberg A., Lin H. N., et al., 2013. Genome-wide association study and pathway-level analysis of tocochromanol levels in maize grain. G3 (Bethesda) 3: 1287–1299.
  54. Loiselle B. A., Sork V. L., Nason J., Graham C., 1995. Spatial genetic-structure of a tropical understory shrub, Psychotria officinalis (Rubiaceae). Am. J. Bot. 82: 1420–1425.
  55. Long A. D., Langley C. H., 1999. The power of association studies to detect the contribution of candidate genetic loci to variation in complex traits. Genome Res. 9: 720–731.
  56. Lorenz A. J., Chao S. M., Asoro F. G., Heffner E. L., Hayashi T., et al., 2011. Genomic selection in plant breeding: knowledge and prospects. Adv. Agron. 110: 77–123.
  57. Lubin J. H., Colt J. S., Camann D., Davis S., Cerhan J. R., et al., 2004. Epidemiologic evaluation of measurement data in the presence of detection limits. Environ. Health Perspect. 112: 1691–1696.
  58. Matthews P. D., Luo R. B., Wurtzel E. T., 2003. Maize phytoene desaturase and zeta-carotene desaturase catalyse a poly-Z desaturation pathway: implications for genetic engineering of carotenoid content among cereal crops. J. Exp. Bot. 54: 2215–2230.
  59. McMullen M. D., Kresovich S., Villeda H. S., Bradbury P., Li H. H., et al., 2009. Genetic properties of the maize nested association mapping population. Science 325: 737–740.
  60. Meenakshi J. V., Banerji A., Manyong V., Tomlins K., Hamukwala P., et al., 2010. Consumer acceptance of provitamin A orange maize in rural Zambia. HarvestPlus Working Paper 4. IFPRI, Washington, DC.
  61. Meuwissen T. H. E., Hayes B. J., Goddard M. E., 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157: 1819–1829.
  62. Moise A. R., Al-Babili S., Wurtzel E. T., 2014. Mechanistic aspects of carotenoid biosynthesis. Chem. Rev. 114: 164–193.
  63. Moran N. A., Jarvik T., 2010. Lateral transfer of genes from fungi underlies carotenoid production in aphids. Science 328: 624–627.
  64. Muzhingi T., Langyintuo A. S., Malaba L. C., Banziger M., 2008. Consumer acceptability of yellow maize products in Zimbabwe. Food Policy 33: 352–361.
  65. Nestel P., Bouis H. E., Meenakshi J. V., Pfeiffer W., 2006. Biofortification of staple food crops. J. Nutr. 136: 1064–1067.
  66. Pfeiffer W. H., McClafferty B., 2007. HarvestPlus: breeding crops for better nutrition. Crop Sci. 47: S88–S105.
  67. Pixley K., Rojas N. P., Babu R., Mutale R., Surles R., et al., 2013. Biofortification of maize with provitamin A carotenoids, pp. 271–292 in Carotenoids and Human Health, edited by Tanumihardjo S. Humana Press, New York.
  68. Platt A., Vilhjálmsson B. J., Nordborg M., 2010. Conditions under which genome-wide association studies will be positively misleading. Genetics 186: 1045–1052.
  69. Price A. L., Patterson N. J., Plenge R. M., Weinblatt M. E., Shadick N. A., et al., 2006. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38: 904–909.
  70. Quinlan R. F., Shumskaya M., Bradbury L. M. T., Beltran J., Ma C. H., et al., 2012. Synergistic interactions between carotene ring hydroxylases drive lutein formation in plant carotenoid biosynthesis. Plant Physiol. 160: 204–214.
  71. Resende M. F. R., Munoz P., Resende M. D. V., Garrick D. J., Fernando R. L., et al., 2012. Accuracy of genomic selection methods in a standard data set of loblolly pine (Pinus taeda L.). Genetics 190: 1503–1510.
  72. Riedelsheimer C., Czedik-Eysenberg A., Grieder C., Lisec J., Technow F., et al., 2012. Genomic and metabolic prediction of complex heterotic traits in hybrid maize. Nat. Genet. 44: 217–220.
  73. Romay M. C., Millard M. J., Glaubitz J. C., Peiffer J. A., Swarts K. L., et al., 2013. Comprehensive genotyping of the USA national maize inbred seed bank. Genome Biol. 14: R55.
  74. Rutkoski J., Benson J., Jia Y., Brown-Guedira G., Jannink J. L., et al., 2012. Evaluation of genomic prediction methods for Fusarium head blight resistance in wheat. Plant Genome 5: 51–61.
  75. SAS Institute, 2012. The SAS System for Windows. Version 9.3. SAS Institute, Cary, NC.
  76. Schwarz G., 1978. Estimating the dimension of a model. Ann. Stat. 6: 461–464.
  77. Segura V., Vilhjalmsson B. J., Platt A., Korte A., Seren U., et al., 2012. An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations. Nat. Genet. 44: 825–830.
  78. Sen S., Chakraborty R., 2011. The role of antioxidants in human health, pp. 1–37 in Oxidative Stress: Diagnostics, Prevention, and Therapy, edited by Andreescu S., Hepel M. Oxford University Press, New York.
  79. Stahl W., Sies H., 2005. Bioactivity and protective effects of natural carotenoids. Biochim. Biophys. Acta 1740: 101–107.
  80. Stevens R., Winter-Nelson A., 2008. Consumer acceptance of provitamin A-biofortified maize in Maputo, Mozambique. Food Policy 33: 341–351.
  81. Sun G., Zhu C., Kramer M. H., Yang S. S., Song W., et al., 2010. Variation explained in mixed-model association mapping. Heredity 105: 333–340.
  82. Tanumihardjo S. A., Bouis H., Hotz C., Meenakshi J. V., McClafferty B., 2008. Biofortification of staple crops: an emerging strategy to combat hidden hunger. Symposium on “Food Technology for Better Nutrition.” Comp. Rev. Food Sci. Safety 7: 329–334.
  83. Tian L., Musetti V., Kim J., Magallanes-Lundback M., DellaPenna D., 2004. The Arabidopsis LUT1 locus encodes a member of the cytochrome P450 family that is required for carotenoid epsilon-ring hydroxylation activity. Proc. Natl. Acad. Sci. USA 101: 402–407.
  84. Tibshirani R., 1996. Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Series B Stat. Methodol. 58: 267–288.
  85. Turner T., Burri B. J., Jamil K. M., Jamil M., 2013. The effects of daily consumption of β-cryptoxanthin-rich tangerines and β-carotene-rich sweet potatoes on vitamin A and carotenoid concentrations in plasma and breast milk of Bangladeshi women with low vitamin A status in a randomized controlled trial. Am. J. Clin. Nutr. 98: 1200–1208.
  86. Vallabhaneni R., Bradbury L. M. T., Wurtzel E. T., 2010. The carotenoid dioxygenase gene family in maize, sorghum, and rice. Arch. Biochem. Biophys. 504: 104–111.
  87. Wang K., Li M. Y., Hakonarson H., 2010. Analysing biological pathways in genome-wide association studies. Nat. Rev. Genet. 11: 843–854.
  88. Wen W., Li D., Li X., Gao Y., Li W., et al., 2014. Metabolome-based genome-wide association study of maize kernel leads to novel biochemical insights. Nat. Commun. 5: 3438.
  89. Wong J. C., Lambert R. J., Wurtzel E. T., Rocheford T. R., 2004. QTL and candidate genes phytoene synthase and ζ-carotene desaturase associated with the accumulation of carotenoids in maize. Theor. Appl. Genet. 108: 349–359.
  90. Wurtzel E. T., Cuttriss A., Vallabhaneni R., 2012. Maize provitamin A carotenoids, current resources, and future metabolic engineering challenges. Front. Plant Sci. 3: 1–12.
  91. Yan J. B., Kandianis C. B., Harjes C. E., Bai L., Kim E. H., et al., 2010. Rare genetic variation at Zea mays crtRB1 increases beta-carotene in maize grain. Nat. Genet. 42: 322–327.
  92. Yu J. M., Pressoir G., Briggs W. H., Bi I. V., Yamasaki M., et al., 2006. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 38: 203–208.
  93. Zhou Y., Han Y. J., Li Z. G., Fu Y., Fu Z. Y., et al., 2012. ZmcrtRB3 encodes a carotenoid hydroxylase that affects the accumulation of α-carotene in maize kernel. J. Integr. Plant Biol. 54: 260–269.
  94. Zou H., Hastie T., 2005. Regularization and variable selection via the elastic net. J. R. Stat. Soc. B. 67: 301–320.

Articles from Genetics are provided here courtesy of Oxford University Press
