Abstract
The use of molecular markers has revolutionized the pace and precision of plant genetic analysis which in turn facilitated the implementation of molecular breeding of crops. The last three decades have seen tremendous advances in the evolution of marker systems and the respective detection platforms. Markers based on single nucleotide polymorphisms (SNPs) have rapidly gained the center stage of molecular genetics during the recent years due to their abundance in the genomes and their amenability for high-throughput detection formats and platforms. Computational approaches dominate SNP discovery methods due to the ever-increasing sequence information in public databases; however, complex genomes pose special challenges in the identification of informative SNPs warranting alternative strategies in those crops. Many genotyping platforms and chemistries have become available making the use of SNPs even more attractive and efficient. This paper provides a review of historical and current efforts in the development, validation, and application of SNP markers in QTL/gene discovery and plant breeding by discussing key experimental strategies and cases exemplifying their impact.
1. Introduction
Allelic variations within a genome of the same species can be classified into three major groups: differences in the number of tandem repeats at a particular locus [microsatellites, or simple sequence repeats (SSRs)] [1], segmental insertions/deletions (InDels) [2], and single nucleotide polymorphisms (SNPs) [3]. In order to detect and track these variations in the individuals of a progeny at the DNA level, researchers have been developing and using genetic tools called molecular markers [4]. Although SSRs, InDels, and SNPs are the three major allelic variations discovered so far, a plethora of molecular markers have been developed to detect the polymorphisms that result from these three types of variation [5]. The evolution of molecular markers has been primarily driven by the throughput and cost of the detection method and the level of reproducibility [6]. Depending on detection method and throughput, all molecular markers can be divided into three major groups: (1) low-throughput, hybridization-based markers such as restriction fragment length polymorphisms (RFLPs) [4]; (2) medium-throughput, PCR-based markers that include random amplification of polymorphic DNA (RAPD) [7], amplified fragment length polymorphism (AFLP) [8], and SSRs [9]; and (3) high-throughput (HTP) sequence-based markers: SNPs [3]. In the late eighties, RFLPs were the most popular molecular markers and were widely used in plant molecular genetics because they were reproducible and codominant [10]. However, the detection of RFLPs was an expensive, labor-intensive, and time-consuming process, which eventually made these markers obsolete. Moreover, RFLP markers were not amenable to automation. The invention of PCR technology and its application to the rapid detection of polymorphisms displaced low-throughput RFLP markers, and a new generation of PCR-based markers emerged at the beginning of the nineties. 
RAPD, AFLP, and SSR markers are the major PCR-based markers that the research community has been using in various plant systems. RAPDs are able to simultaneously detect polymorphic loci in various regions of a genome [11]. However, they are anonymous and their reproducibility is very low due to the non-specific binding of short, random primers. Although AFLPs are anonymous too, their reproducibility and sensitivity are very high owing to the longer +1 and +3 selective primers and the presence of discriminatory nucleotides at the 3′ end of each primer. That is why AFLP markers are still popular in molecular genetics research in crops with little or no reference genome sequence available [12]. However, AFLP markers did not find widespread application in molecular breeding owing to the lengthy and laborious detection method, which was not amenable to automation either. Therefore, it was not surprising that soon after the discovery of SSR markers in the genome of a plant, they were declared “markers of choice” [13], because SSRs were able to eliminate all drawbacks of the above-mentioned DNA marker technologies. SSRs were no longer anonymous; they were highly reproducible, highly polymorphic, and amenable to automation. Although the cost of detection remained high, SSR markers had pervaded all areas of plant molecular genetics and breeding by the late 1990s and the beginning of the 21st century. However, during the last five years, the hegemony of medium-throughput SSRs was eventually broken by SNP markers. First discovered in the human genome, SNPs proved to be universal as well as the most abundant form of genetic variation among individuals of the same species [14]. Although SNPs are less polymorphic than SSR markers because of their biallelic nature, they easily compensate for this drawback by being abundant, ubiquitous, and amenable to high- and ultra-high-throughput automation. 
However, despite these obvious advantages, there were only a limited number of examples of the application of SNP markers in plant breeding by 2009 [15]. In this paper, we summarize recent progress in the use of SNP markers in plant breeding.
2. SNP Discovery in Complex Plant Genomes
While SNP discovery in crops with simple genomes is a relatively straightforward process, complex genomes pose serious obstacles for the researchers interested in developing SNPs. One of the major problems is the highly repetitive nature of the plant genomes [16]. Prior to the emergence of next-generation sequencing (NGS) technologies, researchers used to rely on different experimental strategies to avoid repetitive portions of the genome. These include discovery of SNPs experimentally by resequencing of unigene-derived amplicons using Sanger's method [17] and in silico SNP discovery through the mining of SNPs within EST databases followed by PCR-based validation [18]. Although these approaches allowed the detection of gene-based SNPs, their frequency is generally low in conserved genic regions, and they were unable to discover SNPs located in low-copy noncoding regions and intergenic spaces. Additionally, amplicon resequencing was an expensive and labor-intensive procedure [15]. As many crops are ancient tetraploids with mosaics of scattered duplicated regions [19], in silico and experimental mining of EST databases resulted in the discovery of a large number of nonallelic SNPs that represented paralogous sequences and were suboptimal for application in molecular breeding [20]. Recent emergence of NGS technologies such as 454 Life Sciences (Roche Applied Science, Indianapolis, IN), HiSeq (Illumina, San Diego, CA), SOLiD and Ion Torrent (Life Technologies Corporation, Carlsbad, CA) has eliminated the problems associated with low throughput and high cost of SNP discovery [21]. Transcriptome resequencing using NGS technologies allows rapid and inexpensive SNP discovery within genes and avoids highly repetitive regions of a genome [22]. This methodology was successfully applied in several plant genomes, including maize [23], canola [24], eucalyptus [25], sugarcane [26], tree species [27], wheat [28], avocado [29], and black currant [30]. 
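The in silico mining idea described above can be illustrated with a minimal Python sketch. It assumes that the ESTs or reads have already been aligned into equal-length strings, and it reports a position as a candidate SNP only when the minor allele is seen in at least `min_minor_count` sequences. The function name, the input format, and the threshold are hypothetical choices for illustration and do not reproduce any of the cited pipelines. As the text notes, such candidates may still represent paralogous sequences and would need validation.

```python
from collections import Counter

def candidate_snps(aligned_reads, min_minor_count=2):
    """Report candidate biallelic SNPs from pre-aligned, equal-length sequences.

    aligned_reads: list of strings over A/C/G/T, with '-' (gap) or 'N' (unknown)
    allowed; those characters are ignored when counting alleles.
    Returns a list of (position, major_allele, minor_allele) tuples.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned sequences must have the same length")

    snps = []
    for pos in range(length):
        counts = Counter(read[pos] for read in aligned_reads if read[pos] in "ACGT")
        if len(counts) < 2:
            continue  # monomorphic or uncovered column
        (major, _), (minor, minor_n) = counts.most_common(2)
        # Require redundancy for the minor allele to reduce sequencing-error calls.
        if minor_n >= min_minor_count:
            snps.append((pos, major, minor))
    return snps

reads = ["ACGTA", "ACGTA", "ATGTA", "ATGTA", "ACGTC"]
print(candidate_snps(reads))
```

In this toy alignment, the single C at the last position is ignored because it is supported by only one sequence, whereas the C/T difference at the second position is reported.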
Originally developed for human disease diagnostic research, the NimbleGen sequence capture technology (Roche Applied Science, IN) [31] brought the detection of gene-based SNPs in plants into higher throughput and coverage level [32]. This technology consists of exon sequence capture and enrichment by microarray followed by NGS for targeted resequencing. Similar in-solution target capture technologies, such as Agilent SureSelect, are also commercially available for genome/exome mining studies. However, this technology would be efficient only for crops with available reference genome sequence or large transcriptome (EST) datasets, since the design of capture probes requires these reference resources.
Despite the attractiveness of SNP discovery via transcriptome or exome resequencing, this process is targeted, focusing solely on coding regions. It is obvious that the availability of SNPs within coding sequences is a very powerful tool for molecular geneticists to detect a causative mutation [33]. However, often QTL are located in noncoding regulatory sequences such as enhancers or locus control regions, which could be located several megabases away from genes within intergenic spaces [34]. Discovery of SNPs located within those regulatory elements via transcriptome or exon sequencing is limited. In order to discover SNPs in a genome-wide fashion and avoid repetitive and duplicated DNA, it is very important to employ genome complexity reduction techniques coupled with NGS technologies. Several genome complexity reduction techniques have been developed over the years, including High Cot selection [35], methylation filtering [36], and microarray-based genomic selection [37]. These techniques mainly reduce the number of repetitive sequences but lack the power to recognize and eliminate duplicated sequences, which cause the detection of false-positive SNPs. Unlike the above-mentioned techniques, recently developed genome complexity reduction technologies such as Complexity Reduction of Polymorphic Sequences (CRoPS) (Keygene N.V., Wageningen, The Netherlands) [38] and Restriction Site Associated DNA (RAD) (Floragenics, Eugene, OR, USA) [39] are computationally well equipped and capable of filtering out duplicated SNPs. These systems were successfully applied to discover SNPs in crops with [40] and without reference genome sequences [41].
Although several complexity reduction approaches are being developed to generate data from NGS platforms, it is often challenging to identify candidate SNPs in polyploid crop species such as potato, tobacco, cotton, canola, and wheat. In general, minor allele frequency could be used as a measure to identify candidate SNPs in diploid species [42]. However, polyploid crops often contain loci that appear polymorphic within a single genotype due to the presence of either homoeologous loci from the individual subgenomes (homoeologous SNPs) or paralogous loci from duplicated regions of the genome. Such false-positive SNPs are not useful for genetic mapping purposes and often lead to a lower validation rate during assays. Successful SNP validation in allopolyploids depends upon differentiation of the sequence variation classes [43]. The use of haplotype information in addition to allelic frequency would help to distinguish homologous SNPs (true SNPs) from those of homoeologous loci (false positives). Bioinformatic programs such as HaploSNPer [44] would facilitate identification of candidate loci for assay design purposes in polyploid crops. Elimination of homoeologous loci from the assay design process would improve the validation rate. Such approaches could also be extended to other complex and highly repetitive diploid genomes such as barley. Complexity reduction approaches, combined with sophisticated computational tools, would expedite SNP discovery and validation efforts in polyploids.
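As a rough illustration of the filtering logic discussed above, the following Python sketch computes the minor allele frequency of a biallelic locus and flags loci that appear heterozygous in most lines of a panel, a pattern that, in inbred polyploid material, may point to homoeologous or paralogous sequences rather than true allelic variation. The genotype encoding, function names, and threshold values are hypothetical; this is not the algorithm of HaploSNPer or of any cited study.

```python
from collections import Counter

def minor_allele_frequency(genotypes):
    """Minor allele frequency for one biallelic locus.

    genotypes: two-character calls such as "AA", "AB", "BB"; None = missing.
    """
    counts = Counter()
    for call in genotypes:
        if call is not None:
            counts.update(call)  # adds one count per allele character
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # no data or monomorphic
    return min(counts.values()) / total

def heterozygous_fraction(genotypes):
    """Fraction of non-missing calls that are heterozygous."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    return sum(1 for g in called if g[0] != g[1]) / len(called)

def flag_locus(genotypes, min_maf=0.05, max_het=0.5):
    """Label a locus using illustrative (assumed) thresholds."""
    if heterozygous_fraction(genotypes) > max_het:
        return "suspect_homoeolog_or_paralog"
    if minor_allele_frequency(genotypes) < min_maf:
        return "low_maf"
    return "candidate"

print(flag_locus(["AA", "BB", "AA", "BB", None]))  # polymorphic between lines
print(flag_locus(["AB", "AB", "AB", "AB"]))        # "heterozygous" in every line
```

Suitable thresholds depend on the species, the mating system, and the panel, so the defaults here should be read only as placeholders.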
Although CRoPS and RAD technologies are powerful tools to detect SNPs in a genome-wide fashion, they can hardly be called HTP, because on average only ~1,000 SNPs pass stringent quality control [40]. While these numbers are enough to generate genetic linkage maps of reasonable saturation and carry out preliminary QTL mapping, they are not adequate to implement genome-wide association studies (GWAS). Depending on the rate of linkage disequilibrium decay, GWAS might require several million genetic landmarks. From this point of view, the genotyping-by-sequencing (GBS) technique offers many more opportunities. Discovery of a large number of SNPs using GBS was demonstrated in maize [45] and sorghum [46]. GBS not only increases the sequencing throughput by several orders of magnitude but also has multiplexing capabilities [47]. To eliminate a large portion of repetitive sequences, a type II restriction endonuclease, ApeKI, is applied to digest DNA prior to sequencing to generate reduced representation libraries (the genome complexity reduction component), which are then sequenced [47]. In polyploid crops, GBS might be challenging, but the associated complexity reduction methods could be used for SNP discovery. For discovery purposes, the availability of a reference genome is not an absolute requirement to implement the GBS approach. However, in organisms that do not have a reference genome, GBS-derived SNPs must be validated using one of the techniques that are described in the following section, which might dramatically increase the per-marker price. Validation needs to be done primarily to discard paralogous SNPs. For organisms with a reference genome sequence, the validation step is replaced by in silico mapping of the sequenced fragments to the genome. Although GBS has the potential to discover several million SNPs, one of the major drawbacks of this technique is the large amount of missing data. 
To solve this problem, computational biologists developed data imputation models such as BEAGLE v3.0.2 [48] and IMPUTE v2 [49], to bring imputed data as close as possible to the real data [50, 51].
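The complexity reduction step of GBS, digestion with ApeKI followed by sequencing of the resulting fragments, can be mimicked in silico. The Python sketch below is illustrative only. It assumes that ApeKI recognizes GCWGC (W = A or T) and cuts after the first G; this detail is not given in the text above and should be checked against the enzyme supplier's documentation. The size window used for fragment selection is likewise a hypothetical parameter.

```python
import re

# Assumed recognition site: GCWGC (W = A or T). A lookahead is used so that
# overlapping sites (e.g. GCAGCAGC) are all found.
APEKI_SITE = re.compile(r"(?=GC[AT]GC)")

def digest(sequence, cut_offset=1):
    """Cut a sequence at every assumed ApeKI site.

    cut_offset=1 places the cut after the first base of the site (G^CWGC),
    which is an assumption of this sketch.
    """
    seq = sequence.upper()
    cuts = [match.start() + cut_offset for match in APEKI_SITE.finditer(seq)]
    fragments, start = [], 0
    for cut in cuts:
        fragments.append(seq[start:cut])
        start = cut
    fragments.append(seq[start:])
    return fragments

def size_selected(fragments, min_len, max_len):
    """Keep fragments within a length window (hypothetical library size range)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]

fragments = digest("AAAGCAGCTTTTGCTGCAA")
print(fragments)
print(size_selected(fragments, 5, 8))
```

Running such a digest over a reference sequence gives a first impression of how many fragments a given enzyme and size window would sample; it does not model methylation sensitivity or sequencing depth.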
3. SNP Validation and Modern Genotyping Platforms and Chemistries
The availability of a reference sequence and sophisticated software does not always guarantee that a discovered SNP can be converted into a valid marker. In order to ensure that the discovered SNP is a Mendelian locus, it has to be validated. The validation of a marker is the process of designing an assay based on the discovered polymorphism and then genotyping a panel of diverse germplasm and a segregating population. Compared to a collection of unrelated lines, a segregating population is more informative as a validation panel because it allows the inspection of the discriminatory ability and segregation pattern of a marker, which helps the researcher to understand whether it is a Mendelian locus or a duplicated/repetitive sequence that escaped the software filter [40].
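One common way to inspect the segregation pattern of a marker in a validation panel is a chi-square goodness-of-fit test against the expected Mendelian ratio. The Python sketch below assumes an F2 population and a codominant marker, so the expected ratio is 1:2:1 (AA:AB:BB). The population type, the 5% significance level, and the function names are assumptions made for illustration; the text above does not prescribe a particular test.

```python
# Critical value of the chi-square distribution for 2 degrees of freedom
# at alpha = 0.05.
CHI2_CRITICAL_DF2_ALPHA_05 = 5.991

def chi_square_f2(n_aa, n_ab, n_bb):
    """Chi-square statistic for a 1:2:1 ratio in an F2 population."""
    total = n_aa + n_ab + n_bb
    if total == 0:
        raise ValueError("no genotype calls supplied")
    expected = (0.25 * total, 0.5 * total, 0.25 * total)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def fits_mendelian_f2(n_aa, n_ab, n_bb):
    """True if the counts do not deviate significantly from 1:2:1."""
    return chi_square_f2(n_aa, n_ab, n_bb) <= CHI2_CRITICAL_DF2_ALPHA_05

print(fits_mendelian_f2(30, 45, 25))  # close to 1:2:1
print(fits_mendelian_f2(0, 100, 0))   # every individual scored heterozygous
```

A marker for which every individual is scored as heterozygous, as in the second call, is the kind of pattern expected from a duplicated sequence rather than a single Mendelian locus.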
The most popular HTP assays/chemistries and genotyping platforms that are currently being used for SNP validation are Illumina's BeadArray technology-based Golden Gate (GG) [52] and Infinium assays [53], Life Technologies' TaqMan [54] assay coupled with OpenArray platform (TaqMan OpenArray Genotyping system, Product bulletin), and KBiosciences' Competitive Allele Specific PCR (KASPar) combined with the SNP Line platform (SNP Line XL; http://www.kbioscience.co.uk). These modern genotyping assays and platforms differ from each other in their chemistry, cost, and throughput of samples to genotype and number of SNPs to validate. The choice of chemistry and genotyping platform depends on many factors that include the length of SNP context sequence, overall number of SNPs to genotype, and finally the funds available to the researcher, because most of these chemistries still remain cost intensive. Comparative analyses of these four genotyping assays and platforms were described in Kumpatla et al. [55].
Though all genotyping chemistries and platforms are applicable to generate genotypic data in polyploid crops, analysis of SNP calls is somewhat challenging in polyploids due to multiallele combinations in the genotypes. SNPs in polyploid species can be broadly classified as simple SNPs, hemi-SNPs, and homoeo-SNPs. Here, we describe simple, hemi-, and homoeo-SNPs using an example of allele calls in tetraploid and diploid cotton species (Figure 1). Genomes of tetraploid cotton species, Gossypium hirsutum (AD1) and G. barbadense (AD2), consist of two subgenomes, A and D, where the A genome was derived from diploid progenitors such as G. herbaceum (A1) and G. arboreum (A2), and the D genome resulted from another diploid progenitor, G. raimondii (D5). Simple, or true, SNPs are markers that detect allelic variation between homologous loci of the same subgenome of two tetraploid samples. For example, in Figure 1(a), a SNP marker clearly detects polymorphism within the A subgenomes of G. hirsutum (AD1) and G. barbadense (AD2) and separates samples into homozygous A (blue) and B (red) clusters. This marker does not discriminate polymorphism in the D subgenome, because the D genome allele is absent there (pink dot in G. raimondii). In contrast to simple SNPs, hemi-SNPs detect allelic variation in the homozygous state in one sample and the heterozygous state in the other sample. In Figure 1(b), the SNP marker detects both alleles (A and B) in G. hirsutum (heterozygous green cluster) and one allele, A, in G. barbadense (homozygous blue cluster); the reverse pattern is also possible. Homoeo-SNPs detect homoeologous and possibly paralogous loci in both the A and D subgenomes and result in monomorphic loci in tetraploid species. In Figure 1(c), the A genome progenitors (G. herbaceum and G. arboreum) had allele A (blue) and the D genome progenitor (G. raimondii) had allele B (red), but both tetraploid species (G. hirsutum and G. barbadense) were grouped into the heterozygous AB (green) cluster. 
Because homoeo-SNPs detect homoeologous or paralogous loci rather than allelic variation, the A and D genome diploid progenitors carry different alleles.
Simple SNPs as well as hemi-SNPs are useful markers for genetic mapping and diversity screening studies. Simple SNPs segregate like the markers in diploids in most of the mapping populations and would account for approximately 10–30% of total polymorphic SNPs in various polyploid crop species. Hemi-SNPs form a major category (30–60%) of polymorphic SNPs in a polyploid crop species and could be used for genetic mapping purposes in F2, RIL, and DH populations. Homoeo-SNPs are of lesser value for mapping purposes as most of the genotypes result in heterologous loci due to polymorphism between the homoeologous genomes or duplicated loci within each of the polyploid genotypes [56].
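The three classes described in this section can be expressed as a simple decision rule on the cluster calls of two tetraploid samples. The Python sketch below is a simplified illustration under the assumption that each sample receives one of the calls "AA", "AB", or "BB" from the genotyping platform. The function names are hypothetical, and real data would also require inspection of the diploid progenitors, as in Figure 1.

```python
def normalize(call):
    """Sort the two allele characters so that 'BA' and 'AB' compare equal."""
    return "".join(sorted(call))

def is_heterozygous(call):
    return call[0] != call[1]

def classify_polyploid_snp(call_sample_1, call_sample_2):
    """Classify a SNP from the cluster calls of two tetraploid samples."""
    c1, c2 = normalize(call_sample_1), normalize(call_sample_2)
    het_1, het_2 = is_heterozygous(c1), is_heterozygous(c2)
    if het_1 and het_2:
        return "homoeo-SNP"   # both samples fall in the AB cluster
    if het_1 != het_2:
        return "hemi-SNP"     # one homozygous call, one heterozygous call
    if c1 != c2:
        return "simple SNP"   # opposite homozygous clusters
    return "monomorphic"      # same homozygous call in both samples

# Calls in the style of Figure 1(a)-(c): G. hirsutum versus G. barbadense.
print(classify_polyploid_snp("AA", "BB"))
print(classify_polyploid_snp("AB", "AA"))
print(classify_polyploid_snp("AB", "AB"))
```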
4. Application of SNP Markers in Gene/QTL Discovery
4.1. Biparental Approach
Genetic mapping studies involve genetic linkage analysis, which is based on the concept of genetic recombination during meiosis [57]. This encompasses developing genetic linkage maps following genotyping of individuals in segregating populations with DNA markers covering the genome of that organism. Since their discovery in the 1980s, DNA-based markers have been widely used in developing saturated genetic linkage maps as well as for the mapping and discovery of genes/QTL. With the large-scale availability of sequence information and the development of HTP technologies for SNP genotyping, SNP markers have been increasingly used for QTL mapping studies. This is primarily because SNPs are highly abundant in genomes and can therefore provide the highest map resolution compared to other marker systems [58, 59]. A review of selected examples of QTL and gene discovery using SNP markers is presented below.
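To make the link between recombination and map distance concrete, the Python sketch below estimates the recombination fraction between two markers from counts of recombinant and total gametes and converts it to centimorgans with the Haldane and Kosambi mapping functions. These two functions are standard textbook choices and are used here as an assumption; the text above does not specify which mapping function the cited studies applied.

```python
import math

def recombination_fraction(n_recombinant, n_total):
    """Observed proportion of recombinant gametes between two markers."""
    if n_total <= 0 or not 0 <= n_recombinant <= n_total:
        raise ValueError("counts must satisfy 0 <= n_recombinant <= n_total, n_total > 0")
    return n_recombinant / n_total

def haldane_cm(r):
    """Haldane map distance in cM (assumes no crossover interference)."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)

def kosambi_cm(r):
    """Kosambi map distance in cM (allows for some interference)."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

r = recombination_fraction(10, 100)
print(round(haldane_cm(r), 2), round(kosambi_cm(r), 2))
```

For small recombination fractions both functions give distances close to 100 × r, which is why densely spaced SNP markers translate almost directly into fine map resolution.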
4.1.1. Examples in Rice
A recent QTL analysis in rice for yield and three yield-component traits (number of tillers per plant, number of grains per panicle, and grain weight) compared a SNP-based map with a previous RFLP/SSR-based QTL map generated using the same mapping population [42]. Using the ultra-high-density SNP map, the authors showed that it had more power and resolution than the RFLP/SSR map. This was clearly evident from the analysis of the two main QTL for grain weight, kgw3a (GS3) and kgw5 (GW5/qSW5). Using the SNP bin map, the GW5/qSW5 QTL for grain width was accurately narrowed down to a 123 kb region as compared to the 12.4 Mb region based on the RFLP/SSR genetic map. Likewise, the GS3 QTL for grain length was mapped to a 197 kb interval in comparison to a 6 Mb region with the RFLP/SSR genetic map. Besides their power and resolution, maps based on high-density SNP markers are also highly suitable for fine mapping and cloning of QTL, and at times SNPs on these maps are also functionally associated with the natural variation in the trait. In another QTL mapping project, SNP and InDel markers were used to fine map the qSH1 gene, a major QTL for the seed shattering trait in rice [60]. The QTL were initially detected using RFLP and RAPD markers on F2 plants. Using large BC4F2 and BC3F2 populations in a fine mapping approach with SNP and InDel markers, the authors mapped the functional natural variation to a 612 bp interval between the QTL flanking markers and discovered only one SNP. They further showed that this SNP in the 5′ regulatory region of the qSH1 gene caused loss of seed shattering. A fine mapping approach was also taken to positionally clone the rice bacterial blight resistance gene xa5, by isolating the recombination breakpoints to a pair of SNPs followed by sequencing of the corresponding 5 kb region [61]. Several studies have shown that SNPs and InDels are highly abundant and present throughout the genome in various species including plants [62–64]. 
SNP genotyping is a valuable tool for gene mapping, map-based cloning, and marker-assisted selection (MAS) in crops [65]. A study was conducted to assess the feasibility of SNPs and InDels as DNA markers in genetic analysis and marker-assisted breeding in rice by analyzing these sequence polymorphisms in the genomic region containing the Piz and Piz-t rice blast resistance genes and developing PCR-based SNP markers [65]. The authors discovered that SNPs were abundant in the Piz and Piz-t region (averaging one SNP every 248 bp), while InDels were much less frequent. This dense distribution of SNPs helped in developing SNP markers in the vicinity of these genes. Advancements in rice genomics have led to the mapping and cloning of several genes and QTL controlling agronomically important traits and have enabled routine use of SNP markers for MAS, gene pyramiding, and marker-assisted breeding (MAB) [66–68].
4.1.2. Examples in Maize
SNP markers have facilitated the dissection of complex traits such as flowering time in maize. In one study, a set of 5,000 RILs representing the nested association mapping (NAM) population was genotyped with 1,200 SNP markers, and the authors discovered that the genetic architecture of flowering time is controlled by small additive QTL rather than a single large-effect QTL [69]. The same NAM population was used for mapping resistance to northern leaf blight disease [70]. Twenty-nine QTL were discovered and candidate genes were identified with a genome-wide NAM approach using 1.6 million SNPs. Proprietary SNP markers developed by companies are predominantly used in their private breeding programs. A study from Pioneer Hi-Bred International Inc. reported the identification of a high-oil QTL (qHO6) affecting maize seed oil and oleic acid contents. This QTL encodes an acyl-CoA:diacylglycerol acyltransferase (DGAT1-2), which catalyzes the final step of oil synthesis [71].
4.1.3. Examples in Wheat
Recent advances in wheat genomics have led to the implementation of high-density SNP genotyping in wheat [72–75]. Gene-based SNP markers were developed for Lr34/Yr18/Pm38 locus that confers resistance to leaf rust, stripe rust, and powdery mildew diseases [76]. These markers serve as efficient tools for MAS and MAB of disease resistant wheat lines. Another economically important wheat disease, Fusarium head blight (FHB), has been extensively studied. Several QTL controlling FHB resistance have been identified, with the most important being Fhb1 [77]. Recently, SNP markers were mapped between the known flanking markers for Fhb1 [78]. These new markers would be useful for MAS and fine mapping towards cloning the Fhb1 gene. MAS in wheat has been extensively applied for simple traits that are difficult to score [79].
4.1.4. Examples in Soybean
In order to improve the effectiveness of MAS and clone the soybean aphid resistance gene Rag1, fine mapping was done to accurately position the gene, which had previously been mapped to a 12 cM interval [80]. The authors mapped the gene between two SNP markers that corresponded to a physical distance of 115 kb and identified several candidate genes. Similarly, another aphid resistance gene, Rag2, originally mapped to a 10 cM interval, was fine mapped to a 54 kb interval using SNP markers that were developed by resequencing of target intervals and sequence-tagged sites [81]. In another study that used a similar approach, the authors identified SNP markers tightly linked to a QTL conferring resistance to southern root-knot nematode by developing these SNP markers from bacterial artificial chromosome (BAC) ends and SSR-containing genomic DNA clones [82]. In all of these examples, the main idea behind the identification of closely linked SNP markers was to enhance the efficiency and cost effectiveness of MAS and to increase the resolution within the target locus.
4.1.5. Examples in Other Crops
In a study conducted in canola to map the fad2 and fad3 genes, single nucleotide mutations were identified by sequencing the genomic clones of these genes, and SNP markers were subsequently developed [83]. Allele-specific PCR assays were developed to enable direct selection of desirable fad2 and fad3 alleles in marker-assisted trait introgression and breeding. In barley, SNP markers linked to a covered smut resistance gene, Ruh.7H, were identified by using the high-resolution melting (HRM) technique [84]. In sugar beet, an anchored linkage map based on AFLP, SNP, and RAPD markers was developed to map QTL for the Beet necrotic yellow vein virus resistance genes Rz4 [85] and Rz5 [86]. A consensus genetic map based on EST-derived SNPs was developed for cowpea that would be an important resource for genomic and QTL mapping studies in this crop [87]. In one of the post-genomic era studies, in 2002, fine mapping and map-based cloning approaches were used to clone the VTC2 gene in Arabidopsis [88]. The authors fine mapped the gene interval from a ~980 kb region to a 20 kb interval with SNP and InDel markers. Nine additional candidate genes were identified in that interval, and the underlying mutation was subsequently discovered. Although only a few examples that demonstrate the application of SNP markers in QTL mapping and genomic studies have been mentioned here, several other studies have been published in this area. Recent advances in HTP genotyping technologies and sequence information will further pave the way for rapid identification of causative variations and cloning of QTL of interest for use in MAB.
4.2. Genome-Wide Association Study Approach
GWAS is increasingly becoming a popular tool for dissecting complex traits in plants [89–92]. The idea behind GWAS is to genotype a large number of markers distributed across the genome so that the phenotype or the functional alleles will be in linkage disequilibrium (LD) with one or a few markers that could then be used in the breeding program. However, due to the limited extent of LD, a greater number of markers are required for sufficient power to detect association between the marker and the underlying phenotypic variation. Several studies on association mapping in plants have been published and reviewed in the past [89, 90, 92, 93]. A few selected examples of GWAS and candidate gene association (CGA) studies that utilized SNP markers are described below.
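Because the power of GWAS depends on LD between markers and functional alleles, it can help to see how LD is quantified. The Python sketch below computes the commonly used r² statistic for two biallelic loci from phased haplotypes, with alleles coded as 0 and 1. The coding scheme and function name are illustrative assumptions, and real analyses usually rely on dedicated software and must also account for population structure.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic loci.

    haplotypes: list of (allele_at_locus_1, allele_at_locus_2) tuples,
    with alleles coded 0 or 1 (phased data are assumed).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(h[0] for h in haplotypes) / n          # frequency of allele 1 at locus 1
    p_b = sum(h[1] for h in haplotypes) / n          # frequency of allele 1 at locus 2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n  # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("at least one locus is monomorphic")
    return d * d / denominator

print(ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)]))  # alleles always co-occur
print(ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)]))  # alleles independent
```

When r² decays quickly with physical distance, a marker only tags functional variants in its immediate neighborhood, which is why rapid LD decay raises the number of markers needed.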
The first successful demonstration of the power of GWAS was the identification of a putative gene associated with a QTL in maize [94]. In that study, a single locus with a major effect on oleic acid was mapped to a 4 cM genetic interval by using SNP haplotypes at 8,590 loci. The authors identified a fatty acid desaturase gene, fad2, at ~2 kb from one of the associated markers, and this was considered a likely causative gene. With the discovery of millions of SNPs in maize and the availability of tools such as NAM populations, GWAS was effectively applied to dissect the genetic architecture of leaf traits, and it was also shown that variations at the liguleless genes contributed to a more upright leaf phenotype [95]. The utility of the GWAS approach was demonstrated in barley through the mapping of QTL for spot blotch disease resistance [96]. Using diversity array technology (DArT) and SNP markers, the authors identified several QTL, some of which had not been identified for this trait earlier. Another variant of the association mapping method is CGA, in which the association between one or a few candidate gene loci and the trait of interest is tested. Using this approach, 24 candidate genes were analyzed for association with field resistance to late blight disease and with plant maturity in potato. Nine SNPs were identified to be associated with maturity-corrected resistance, explaining 50% of the genetic variance of this trait [97]. Two SNPs at the allene oxide synthase 2 (StAOS2) gene locus were associated with the largest effect on the trait of interest. A GWAS approach was also successfully applied to understand the genetic architecture of complex diseases such as northern and southern corn leaf blights [70, 98]. Although the number of papers dedicated to the application of GWAS to reveal the genetic basis of agronomic traits is growing, the practical utility of minor QTL in molecular breeding is yet to be shown. 
As GWAS requires a large number of molecular markers, its utility in dissecting the molecular basis of traits in polyploid crops such as canola, wheat, and cotton has been fairly limited due to the insufficient number of polymorphic markers and the absence of a reference genome. However, the recently developed associative transcriptomics method has the potential to overcome these limitations [99]. Harper et al. [99] leveraged differentially expressed transcriptome sequences to develop molecular markers in the tetraploid crop Brassica napus and associated them with glucosinolate content variation in seeds. Due to the precision of this method, the scientists were able to correlate specific deletions in the canola genome with two QTL controlling the trait. Annotation of the deleted regions revealed orthologs of the transcription factor HAG1, which controls aliphatic glucosinolate biosynthesis in A. thaliana. This work gives reason for optimism about the successful application of GWAS in polyploid crops.
5. Implementation of SNP Markers in Plant Breeding
Due to the availability of HTP SNP detection and validation technologies, the development of SNP markers has become a routine process, especially in crops with a reference genome. How has that influenced the application of SNP markers in plant breeding? In a review article, Xu and Crouch [100] noted a fairly low number of articles dedicated to marker-assisted selection for the 1986–2005 period. Indeed, a search combining three key phrases (“marker-assisted selection” AND “SNP” AND “plant breeding”) returns only 637 articles in Google Scholar for that period. However, a similar search for the period spanning 2006 through 2012 shows an almost sevenfold increase (~4,560) in the number of articles indicating the application of SNPs in MAS. The vast majority of those publications are from the public sector and primarily describe mapping QTL using SNPs and state the potential usefulness of those markers in MAS without any experimental support. For most of those research studies, QTL mapping is the final destination, and further application of those markers in actual MAS leading to the development of varieties seldom happens. The fairly low impact of academic research on MAS-based variety development can be explained by the lack of funding to complete the entire marker development pipeline (MDP), which can be long term and cost intensive. The MDP includes several steps such as (1) population development, (2) initial QTL mapping, (3) QTL validation (testing in several locations and years and implementing fine mapping), and (4) marker validation (development of inexpensive but HTP and automation-amenable assays) [101]. Every step of the development of markers linked to QTL is associated with numerous constraints, which may take several years and substantial funding to resolve. 
However, since 2006, publications derived from academic research have reported a few success stories in developing varieties using SNPs, including submergence-tolerant rice cultivars [102], rice cultivars with improved eating, cooking, and sensory quality [103], the leaf rust resistant wheat variety “Patwin” [104], and a maize cultivar with low phytic acid [105]. Although the private sector does not normally release details of its breeding methodologies to the public, several papers published by Monsanto [106, 107], Pioneer Hi-Bred [71], Syngenta [108], and Dow AgroSciences [109] indicate that commercial organizations are the main drivers in the application of SNP markers in MAS [110].
Current MAS strategies suit breeding programs that target traits which are highly heritable and governed by a single gene or one major QTL that explains a large portion of the phenotypic variability. In reality, most agronomic traits, such as yield, drought and heat tolerance, nitrogen and water use efficiency, and fiber quality in cotton, have complex inheritance controlled by multiple QTL with minor effects. Using any one of those minor QTL in MAS is inefficient because of its negligible effect on the phenotype.
An MAS scheme using paternity testing has recently been proposed to address challenges associated with the selection gains achievable in outbred forage crops [111]. Paternity testing, a nonlinkage-based MAS scheme, improves selection gains by increasing parental control in the selection gain equation. The authors demonstrated paternity testing MAS in three red clover breeding populations by using permutation-based truncation selection for a biomass-persistence index trait and achieved paternity-based selection gains that were more than double the selection gains based on maternity alone. Paternity was determined using a small set of 11 SSR markers. SNP markers can also be used for paternity testing, but a larger number of SNP loci would be required [112].
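The text invokes the selection gain equation without writing it out. One common textbook form, given here only as an assumption about the equation meant in [111], is:

```latex
\Delta G = \frac{c \, i \, h^{2} \, \sigma_{P}}{L}
```

Here all symbols are labels introduced for this sketch: \(c\) is the parental control factor, \(i\) the selection intensity, \(h^{2}\) the heritability, \(\sigma_{P}\) the phenotypic standard deviation, and \(L\) the length of one selection cycle. In this form, \(c = 1/2\) when only the maternal parent is selected and \(c = 1\) when both parents are selected, so identifying the pollen parent can by itself double the expected gain if all other terms are held fixed.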
Meuwissen et al. [113] described a new methodology in plant breeding called genomic selection (GS), which was intended to solve problems related to MAS of complex traits. This methodology also uses molecular markers, but in a different fashion, and applies to both diploid and polyploid crop species. Unlike in MAS, markers in GS are not used to track a trait. Instead, GS needs high-density marker coverage so that potentially every QTL is in LD with at least one marker. The comprehensive information on all possible loci, haplotypes, and marker effects across the entire genome is then used to calculate the genomic estimated breeding value (GEBV) of a particular line in the breeding population.
GS of superior lines can be carried out within any breeding population. To enable successful GS, an experimental population must first be identified. This population need not be derived from a bi-parental cross but must be representative of the selection candidates in the breeding program to which GS will be applied [114]. The experimental population must be phenotyped for the trait of interest and genotyped with a large number of markers. Given the low cost of sequencing, the best genotyping choice is GBS, which yields the maximum number of polymorphisms. The order of phenotypic and genotypic data collection is arbitrary, and the two can be done in parallel. When both phenotypic and genotypic data are ready, one can start “training” the molecular markers [115]. To train the GS model, the effect of each marker is calculated computationally. The effect of a marker is a number whose positive or negative sign indicates the positive or negative effect, respectively, of a particular locus on the phenotype. Once the effects of all markers are known, the markers are considered “trained” and ready to assess, for the same trait, any breeding population other than the experimental one. With a trained GS model available, phenotypic data need not be collected from new breeding populations. The new breeding population is genotyped with the same set of “trained” markers. Based on the genotypic data, the known effects of the markers are summed to calculate the GEBV of each line. The higher the GEBV of a line, the greater the chance that it will be selected and advanced in the breeding cycle. Thus, GS using high-density marker coverage has the potential to capture QTL with major and minor effects and to eliminate the need to collect phenotypic data in every breeding cycle. The application of GS has also been shown to reduce the number of breeding cycles and increase the annual gain [114].
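The training and prediction steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure of [114] or [115]: it assumes ridge regression as the estimator of marker effects (one common choice; the text does not name an estimator), allele counts coded 0/1/2, and simulated data. The names train_marker_effects, gebv, and ridge_lambda are hypothetical.

```python
import numpy as np


def train_marker_effects(genotypes, phenotypes, ridge_lambda=1.0):
    """Estimate one signed effect per marker by ridge regression.

    genotypes  : (n_lines, n_markers) array, allele counts coded 0/1/2
    phenotypes : (n_lines,) array of trait values for the same lines
    Returns (marker_means, trait_mean, effects).
    """
    X = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotypes, dtype=float)
    marker_means = X.mean(axis=0)
    trait_mean = y.mean()
    Xc = X - marker_means          # center marker scores
    yc = y - trait_mean            # center phenotypes
    # Ridge solution (Xc'Xc + lambda*I)^-1 Xc'yc. The penalty keeps the
    # system solvable when markers outnumber phenotyped lines.
    lhs = Xc.T @ Xc + ridge_lambda * np.eye(X.shape[1])
    effects = np.linalg.solve(lhs, Xc.T @ yc)
    return marker_means, trait_mean, effects


def gebv(genotypes, marker_means, trait_mean, effects):
    """GEBV of each line: trait mean plus the sum of its marker effects."""
    X = np.asarray(genotypes, dtype=float)
    return trait_mean + (X - marker_means) @ effects


# Hypothetical, simulated data: 200 experimental lines, 1,000 markers.
rng = np.random.default_rng(0)
train_geno = rng.integers(0, 3, size=(200, 1000))
true_effects = rng.normal(0.0, 0.1, size=1000)   # simulated, not real
train_pheno = train_geno @ true_effects + rng.normal(0.0, 1.0, size=200)

means, mu, effects = train_marker_effects(
    train_geno, train_pheno, ridge_lambda=50.0
)

# A new breeding population is genotyped only; no phenotypes are needed.
new_geno = rng.integers(0, 3, size=(20, 1000))
scores = gebv(new_geno, means, mu, effects)
ranking = np.argsort(-scores)      # candidate indices, highest GEBV first
print("Top five candidates:", ranking[:5])
```

The penalty value is arbitrary here; in practice it would be tuned, for example by cross-validation within the experimental population, and the accuracy of the resulting GEBVs would be checked before any selection decision is made.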
One of the problems of GS is the level of GEBV accuracy. Studies based on simulated and empirical data demonstrated that GEBV accuracy could be within 0.62–0.85. Heffner et al. [114] used a previously reported GEBV accuracy of 0.53 and reported three- and twofold annual gains in maize and winter barley, respectively. The obvious advantages of GS over traditional MAS have been successfully demonstrated in animal breeding [116]. The rapid evolution of sequencing technologies and HTP SNP genotyping systems is enabling the generation and validation of millions of markers, giving “cautious optimism” for the successful application of GS in breeding for complex traits [117–120].
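To see how a moderate GEBV accuracy can still translate into a higher annual gain, it may help to write the gain per year in its standard quantitative-genetics form. The expression below is a sketch under stated assumptions, not a formula taken from [114]; the symbols and the labels GS and PS (phenotypic selection) are introduced here for illustration.

```latex
\Delta G_{\text{year}} = \frac{i \, r \, \sigma_{A}}{L},
\qquad
\frac{\Delta G_{\text{year}}^{\mathrm{GS}}}{\Delta G_{\text{year}}^{\mathrm{PS}}}
  = \frac{r_{\mathrm{GS}} / L_{\mathrm{GS}}}{r_{\mathrm{PS}} / L_{\mathrm{PS}}}
```

Here \(i\) is the selection intensity, \(r\) the selection accuracy, \(\sigma_{A}\) the additive genetic standard deviation, and \(L\) the cycle length in years; the ratio assumes equal \(i\) and \(\sigma_{A}\) for both schemes. Under these assumptions, GS outperforms phenotypic selection whenever its loss in accuracy is smaller than its proportional saving in cycle time.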
6. Conclusion
SNP markers have become extremely popular in plant molecular genetics owing to their genome-wide abundance and amenability to high- and ultra-high-throughput detection platforms. Unlike earlier marker systems, SNPs have made it possible to create saturated, if not supersaturated, genetic maps, thereby enabling genome-wide tracking, fine mapping of target regions, rapid association of markers with a trait, and accelerated cloning of genes/QTL of interest. On the flip side, some challenges need to be addressed when using SNPs. For example, the biallelic nature of SNPs must be compensated for by discovering and using a larger number of SNPs to reach the same or higher power as earlier-generation molecular markers. This could be cost prohibitive, depending on the crop and the sequence resources available for that genome. Polyploid crops pose another challenge, because useful SNPs are only a small percentage of the total available polymorphisms; creative strategies need to be employed to generate a reasonable number of SNPs in those species. The use of SNP markers in MAB programs has been growing rapidly, as has the development of technologies and platforms for the discovery and HTP screening of SNPs in many crops. SNP chips are currently available for several crops; one disadvantage, however, is that these readily available chips are based on SNPs discovered in certain genotypes and, therefore, may not be ideal for projects utilizing unrelated genotypes. This necessitates the creation of multiple chips or the use of technologies that permit design flexibility yet remain economical.
Although GBS creates great opportunities to discover a large number of SNPs within the genotypes of interest at a lower per-sample cost, the lack of adequate computational capabilities, such as reliable data imputation algorithms and powerful computers for quickly processing and storing large amounts of sequencing data, becomes a major bottleneck. Despite these challenges, it is clear that SNP markers, in combination with genomics and other next-generation technologies, have been accelerating the pace and gains of plant breeding.
Acknowledgments
The authors would like to thank Drs. Shunxue Tang and Peizhong Zheng of the Trait Genetics and Technologies Department of Dow AgroSciences (DAS) and Raghav Ram of the IP Portfolio Development Department of DAS for careful review of the paper and the DAS Seeds and Traits R & D leaders Drs. David Meyer and Steve Thompson for general support and help.
References
- 1. Weber JL, May PE. Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. American Journal of Human Genetics. 1989;44(3):388–396.
- 2. Ophir R, Graur D. Patterns and rates of indel evolution in processed pseudogenes from humans and murids. Gene. 1997;205(1-2):191–202. doi: 10.1016/s0378-1119(97)00398-3.
- 3. Wang DG, Fan JB, Siao CJ, et al. Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome. Science. 1998;280(5366):1077–1082. doi: 10.1126/science.280.5366.1077.
- 4. Botstein D, White RL, Skolnick M, Davis RW. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. American Journal of Human Genetics. 1980;32(3):314–331.
- 5. Gupta PK, Varshney RK, Sharma PC, Ramesh B. Molecular markers and their applications in wheat breeding. Plant Breeding. 1999;118(5):369–390.
- 6. Bernardo R. Molecular markers and selection for complex traits in plants: learning from the last 20 years. Crop Science. 2008;48(5):1649–1664.
- 7. Welsh J, McClelland M. Fingerprinting genomes using PCR with arbitrary primers. Nucleic Acids Research. 1990;18(24):7213–7218. doi: 10.1093/nar/18.24.7213.
- 8. Vos P, Hogers R, Bleeker M, et al. AFLP: a new technique for DNA fingerprinting. Nucleic Acids Research. 1995;23(21):4407–4414. doi: 10.1093/nar/23.21.4407.
- 9. Jacob HJ, Lindpaintner K, Lincoln SE, et al. Genetic mapping of a gene causing hypertension in the stroke-prone spontaneously hypertensive rat. Cell. 1991;67(1):213–224. doi: 10.1016/0092-8674(91)90584-l.
- 10. Lander ES, Botstein D. Mapping mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics. 1989;121(1):185. doi: 10.1093/genetics/121.1.185.
- 11. Williams JGK, Kubelik AR, Livak KJ, Rafalski JA, Tingey SV. DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Research. 1990;18(22):6531–6535. doi: 10.1093/nar/18.22.6531.
- 12. Zhang Z, Guo X, Liu B, Tang L, Chen F. Genetic diversity and genetic relationship of Jatropha curcas between China and Southeast Asian revealed by amplified fragment length polymorphisms. African Journal of Biotechnology. 2011;10(15):2825–2832.
- 13. Powell W, Machray GC, Proven J. Polymorphism revealed by simple sequence repeats. Trends in Plant Science. 1996;1(7):215–222.
- 14. Ghosh S, Malhotra P, Lalitha PV, Guha-Mukherjee S, Chauhan VS. Novel genetic mapping tools in plants: SNPs and LD-based approaches. Plant Science. 2002;162(3):329–333.
- 15. Ganal MW, Altmann T, Röder MS. SNP identification in crop plants. Current Opinion in Plant Biology. 2009;12(2):211–217. doi: 10.1016/j.pbi.2008.12.009.
- 16. Meyers BC, Tingey SV, Morgante M. Abundance, distribution, and transcriptional activity of repetitive elements in the maize genome. Genome Research. 2001;11(10):1660–1676. doi: 10.1101/gr.188201.
- 17. Wright SI, Bi IV, Schroeder SC, et al. Evolution: the effects of artificial selection on the maize genome. Science. 2005;308(5726):1310–1314. doi: 10.1126/science.1107891.
- 18. Batley J, Barker G, O’Sullivan H, Edwards KJ, Edwards D. Mining for single nucleotide polymorphisms and insertions/deletions in maize expressed sequence tag data. Plant Physiology. 2003;132(1):84–91. doi: 10.1104/pp.102.019422.
- 19. Pratap A, Gupta S, Kumar J, Solanki R. Soybean. Technological Innovations in Major World Oil Crops. 2012;1:293–321.
- 20. Choi IY, Hyten DL, Matukumalli LK, et al. A soybean transcript map: gene distribution, haplotype and single-nucleotide polymorphism analysis. Genetics. 2007;176(1):685–696. doi: 10.1534/genetics.107.070821.
- 21. Mardis ER. The impact of next-generation sequencing technology on genetics. Trends in Genetics. 2008;24(3):133–141. doi: 10.1016/j.tig.2007.12.007.
- 22. Morozova O, Marra MA. Applications of next-generation sequencing technologies in functional genomics. Genomics. 2008;92(5):255–264. doi: 10.1016/j.ygeno.2008.07.001.
- 23. Barbazuk WB, Emrich SJ, Chen HD, Li L, Schnable PS. SNP discovery via 454 transcriptome sequencing. The Plant Journal. 2007;51(5):910–918. doi: 10.1111/j.1365-313X.2007.03193.x.
- 24. Trick M, Long Y, Meng J, Bancroft I. Single nucleotide polymorphism (SNP) discovery in the polyploid Brassica napus using Solexa transcriptome sequencing. Plant Biotechnology Journal. 2009;7(4):334–346. doi: 10.1111/j.1467-7652.2008.00396.x.
- 25. Novaes E, Drost DR, Farmerie WG, et al. High-throughput gene and SNP discovery in Eucalyptus grandis, an uncharacterized genome. BMC Genomics. 2008;9, article 312. doi: 10.1186/1471-2164-9-312.
- 26. Bundock PC, Eliott FG, Ablett G, et al. Targeted single nucleotide polymorphism (SNP) discovery in a highly polyploid plant species using 454 sequencing. Plant Biotechnology Journal. 2009;7(4):347–354. doi: 10.1111/j.1467-7652.2009.00401.x.
- 27. Parchman TL, Geist KS, Grahnen JA, Benkman CW, Buerkle CA. Transcriptome sequencing in an ecologically important tree species: assembly, annotation, and marker discovery. BMC Genomics. 2010;11, article 180. doi: 10.1186/1471-2164-11-180.
- 28. Lai K, Duran C, Berkman PJ, et al. Single nucleotide polymorphism discovery from wheat next-generation sequence data. Plant Biotechnology Journal. 2012;10(6):743–749. doi: 10.1111/j.1467-7652.2012.00718.x.
- 29. Kuhn D. Design of an Illumina Infinium 6k SNPchip for genotyping two large avocado mapping populations. Proceedings of the 20th Conference on Plant and Animal Genome; January 2012; San Diego, Calif, USA.
- 30. Russell JR, Bayer M, Booth C, et al. Identification, utilisation and mapping of novel transcriptome-based markers from blackcurrant (Ribes nigrum). BMC Plant Biology. 2011;11, article 147. doi: 10.1186/1471-2229-11-147.
- 31. Hodges E, Xuan Z, Balija V, et al. Genome-wide in situ exon capture for selective resequencing. Nature Genetics. 2007;39(12):1522–1527. doi: 10.1038/ng.2007.42.
- 32. Springer NM, Ying K, Fu Y, et al. Maize inbreds exhibit high levels of copy number variation (CNV) and presence/absence variation (PAV) in genome content. PLoS Genetics. 2009;5(11):e1000734. doi: 10.1371/journal.pgen.1000734.
- 33. Varshney RK. Gene-based marker systems in plants: high throughput approaches for marker discovery and genotyping. In: Jain SM, Brar DS, editors. Molecular Techniques in Crop Improvement. 2009. pp. 119–142.
- 34. Dean A. On a chromosome far, far away: LCRs and gene expression. Trends in Genetics. 2006;22(1):38–45. doi: 10.1016/j.tig.2005.11.001.
- 35. Yuan Y, SanMiguel PJ, Bennetzen JL. High-Cot sequence analysis of the maize genome. The Plant Journal. 2003;34(2):249–255. doi: 10.1046/j.1365-313x.2003.01716.x.
- 36. Emberton J, Ma J, Yuan Y, SanMiguel P, Bennetzen JL. Gene enrichment in maize with hypomethylated partial restriction (HMPR) libraries. Genome Research. 2005;15(10):1441–1446. doi: 10.1101/gr.3362105.
- 37. Okou DT, Steinberg KM, Middle C, Cutler DJ, Albert TJ, Zwick ME. Microarray-based genomic selection for high-throughput resequencing. Nature Methods. 2007;4(11):907–909. doi: 10.1038/nmeth1109.
- 38. van Orsouw NJ, Hogers RCJ, Janssen A, et al. Complexity reduction of polymorphic sequences (CRoPS): a novel approach for large-scale polymorphism discovery in complex genomes. PLoS ONE. 2007;2(11):e1172. doi: 10.1371/journal.pone.0001172.
- 39. Baird NA, Etter PD, Atwood TS, et al. Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS ONE. 2008;3(10):e3376. doi: 10.1371/journal.pone.0003376.
- 40. Mammadov JA, Chen W, Ren R, et al. Development of highly polymorphic SNP markers from the complexity reduced portion of maize (Zea mays L.) genome for use in marker-assisted breeding. Theoretical and Applied Genetics. 2010;121(3):577–588. doi: 10.1007/s00122-010-1331-8.
- 41. Chutimanitsakun Y, Nipper RW, Cuesta-Marcos A, et al. Construction and application for QTL analysis of a Restriction Site Associated DNA (RAD) linkage map in barley. BMC Genomics. 2011;12, article 4. doi: 10.1186/1471-2164-12-4.
- 42. Yu H, Xie W, Wang J, et al. Gains in QTL detection using an ultra-high density SNP map based on population sequencing relative to traditional RFLP/SSR markers. PLoS ONE. 2011;6(3):e17595. doi: 10.1371/journal.pone.0017595.
- 43. Bus A, Hecht J, Huettel B, Reinhardt R, Stich B. High-throughput polymorphism detection and genotyping in Brassica napus using next-generation RAD sequencing. BMC Genomics. 2012;13, article 281. doi: 10.1186/1471-2164-13-281.
- 44. Tang J, Leunissen JAM, Voorrips RE, van der Linden CG, Vosman B. HaploSNPer: a web-based allele and SNP detection tool. BMC Genetics. 2008;9, article 23. doi: 10.1186/1471-2156-9-23.
- 45. Narechania A, Gore MA, Buckler ES, et al. Large-scale discovery of gene-enriched SNPs. The Plant Genome. 2009;2(2):121–133.
- 46. Nelson JC, Wang S, Wu Y, et al. Single-nucleotide polymorphism discovery by high-throughput sequencing in sorghum. BMC Genomics. 2011;12, article 352. doi: 10.1186/1471-2164-12-352.
- 47. Elshire RJ, Glaubitz JC, Sun Q, et al. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE. 2011;6(5):e19379. doi: 10.1371/journal.pone.0019379.
- 48. Browning SR, Browning BL. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. American Journal of Human Genetics. 2007;81(5):1084–1097. doi: 10.1086/521987.
- 49. Howie BN, Donnelly P, Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genetics. 2009;5(6):e1000529. doi: 10.1371/journal.pgen.1000529.
- 50. Huang X, Wei X, Sang T, et al. Genome-wide association studies of 14 agronomic traits in rice landraces. Nature Genetics. 2010;42(11):961–967. doi: 10.1038/ng.695.
- 51. Marchini J, Howie B. Genotype imputation for genome-wide association studies. Nature Reviews Genetics. 2010;11(7):499–511. doi: 10.1038/nrg2796.
- 52. Fan JB, Oliphant A, Shen R, et al. Highly parallel SNP genotyping. Cold Spring Harbor Symposia on Quantitative Biology. 2003;68:69–78. doi: 10.1101/sqb.2003.68.69.
- 53. Steemers FJ, Gunderson KL. Whole genome genotyping technologies on the BeadArray™ platform. Biotechnology Journal. 2007;2(1):41–49. doi: 10.1002/biot.200600213.
- 54. Livak KJ, Flood SJA, Marmaro J, Giusti W, Deetz K. Oligonucleotides with fluorescent dyes at opposite ends provide a quenched probe system useful for detecting PCR product and nucleic acid hybridization. Genome Research. 1995;4(6):357–362. doi: 10.1101/gr.4.6.357.
- 55. Kumpatla SP, Buyyarapu R, Abdurakhmonov IY, Mammadov JA. Genomics-assisted plant breeding in the 21st century: technological advances and progress. In: Abdurakhmonov IY, editor. Plant Breeding. pp. 131–184.
- 56. Buyyarapu R, Ren R, Kumpatla S, et al. In silico discovery and validation of SNP markers for molecular breeding in cotton. Proceedings of the 19th Conference on Plant & Animal Genome; January 2011; San Diego, Calif, USA.
- 57. Tanksley SD. Mapping polygenes. Annual Review of Genetics. 1993;27:205–233. doi: 10.1146/annurev.ge.27.120193.001225.
- 58. Bhattramakki D, Dolan M, Hanafey M, et al. Insertion-deletion polymorphisms in 3′ regions of maize genes occur frequently and can be used as highly informative genetic markers. Plant Molecular Biology. 2002;48(5-6):539–547. doi: 10.1023/a:1014841612043.
- 59. Jones ES, Sullivan H, Bhattramakki D, Smith JSC. A comparison of simple sequence repeat and single nucleotide polymorphism marker technologies for the genotypic analysis of maize (Zea mays L.). Theoretical and Applied Genetics. 2007;115(3):361–371. doi: 10.1007/s00122-007-0570-9.
- 60. Konishi S, Izawa T, Lin SY, et al. An SNP caused loss of seed shattering during rice domestication. Science. 2006;312(5778):1392–1396. doi: 10.1126/science.1126410.
- 61. Iyer AS, McCouch SR. The rice bacterial blight resistance gene xa5 encodes a novel form of disease resistance. Molecular Plant-Microbe Interactions. 2004;17(12):1348–1354. doi: 10.1094/MPMI.2004.17.12.1348.
- 62. Drenkard E, Richter BG, Rozen S, et al. A simple procedure for the analysis of single nucleotide polymorphism facilitates map-based cloning in Arabidopsis. Plant Physiology. 2000;124(4):1483–1492. doi: 10.1104/pp.124.4.1483.
- 63. Garg K, Green P, Nickerson DA. Identification of candidate coding region single nucleotide polymorphisms in 165 human genes using assembled expressed sequence tags. Genome Research. 1999;9(11):1087–1092. doi: 10.1101/gr.9.11.1087.
- 64. Nasu S, Suzuki J, Ohta R, et al. Search for and analysis of single nucleotide polymorphisms (SNPs) in rice (Oryza sativa, Oryza rufipogon) and establishment of SNP markers. DNA Research. 2002;9(5):163–171. doi: 10.1093/dnares/9.5.163.
- 65. Hayashi K, Hashimoto N, Daigen M, Ashikawa I. Development of PCR-based SNP markers for rice blast resistance genes at the Piz locus. Theoretical and Applied Genetics. 2004;108(7):1212–1220. doi: 10.1007/s00122-003-1553-0.
- 66. Ashikari M, Matsuoka M. Identification, isolation and pyramiding of quantitative trait loci for rice breeding. Trends in Plant Science. 2006;11(7):344–350. doi: 10.1016/j.tplants.2006.05.008.
- 67. Jena KK, Mackill DJ. Molecular markers and their use in marker-assisted selection in rice. Crop Science. 2008;48(4):1266–1276.
- 68. Varshney RK, Hoisington DA, Tyagi AK. Advances in cereal genomics and applications in crop breeding. Trends in Biotechnology. 2006;24(11):490–499. doi: 10.1016/j.tibtech.2006.08.006.
- 69. Buckler ES, Holland JB, Bradbury PJ, et al. The genetic architecture of maize flowering time. Science. 2009;325(5941):714–718. doi: 10.1126/science.1174276.
- 70. Poland JA, Bradbury PJ, Buckler ES, Nelson RJ. Genome-wide nested association mapping of quantitative resistance to northern leaf blight in maize. Proceedings of the National Academy of Sciences of the United States of America. 2011;108(17):6893–6898. doi: 10.1073/pnas.1010894108.
- 71. Zheng P, Allen WB, Roesler K, et al. A phenylalanine in DGAT is a key determinant of oil content and composition in maize. Nature Genetics. 2008;40(3):367–372. doi: 10.1038/ng.85.
- 72. Akhunov E, Nicolet C, Dvorak J. Single nucleotide polymorphism genotyping in polyploid wheat with the Illumina GoldenGate assay. Theoretical and Applied Genetics. 2009;119(3):507–517. doi: 10.1007/s00122-009-1059-5.
- 73. Allen AM, Barker GL, Berry ST, et al. Transcript-specific, single-nucleotide polymorphism discovery and linkage analysis in hexaploid bread wheat (Triticum aestivum L.). Plant Biotechnology Journal. 2011;9(9):1086–1099. doi: 10.1111/j.1467-7652.2011.00628.x.
- 74. Bérard A, Le Paslier MC, Dardevet M, et al. High-throughput single nucleotide polymorphism genotyping in wheat (Triticum spp.). Plant Biotechnology Journal. 2009;7(4):364–374. doi: 10.1111/j.1467-7652.2009.00404.x.
- 75. Winfield MO, Wilkinson PA, Allen AM, et al. Targeted re-sequencing of the allohexaploid wheat exome. Plant Biotechnology Journal. 2012;10(6):733–742. doi: 10.1111/j.1467-7652.2012.00713.x.
- 76. Lagudah ES, Krattinger SG, Herrera-Foessel S, et al. Gene-specific markers for the wheat gene Lr34/Yr18/Pm38 which confers resistance to multiple fungal pathogens. Theoretical and Applied Genetics. 2009;119(5):889–898. doi: 10.1007/s00122-009-1097-z.
- 77. Buerstmayr H, Ban T, Anderson JA. QTL mapping and marker-assisted selection for Fusarium head blight resistance in wheat: a review. Plant Breeding. 2009;128(1):1–26.
- 78. Bernardo AN, Ma H, Zhang D, Bai G. Single nucleotide polymorphism in wheat chromosome region harboring Fhb1 for Fusarium head blight resistance. Molecular Breeding. 2012;29(2):477–488.
- 79. Gupta PK, Langridge P, Mir RR. Marker-assisted wheat breeding: present status and future possibilities. Molecular Breeding. 2010;26(2):145–161.
- 80. Kim KS, Bellendir S, Hudson KA, et al. Fine mapping the soybean aphid resistance gene Rag1 in soybean. Theoretical and Applied Genetics. 2010;120(5):1063–1071. doi: 10.1007/s00122-009-1234-8.
- 81. Kim KS, Hill CB, Hartman GL, Hyten DL, Hudson ME, Diers BW. Fine mapping of the soybean aphid-resistance gene Rag2 in soybean PI 200538. Theoretical and Applied Genetics. 2010;121(3):599–610. doi: 10.1007/s00122-010-1333-6.
- 82. Ha BK, Hussey RS, Boerma HR. Development of SNP assays for marker-assisted selection of two southern root-knot nematode resistance QTL in soybean. Crop Science. 2007;47(2):S73–S82.
- 83. Hu X, Sullivan-Gilbert M, Gupta M, Thompson SA. Mapping of the loci controlling oleic and linolenic acid contents and development of fad2 and fad3 allele-specific markers in canola (Brassica napus L.). Theoretical and Applied Genetics. 2006;113(3):497–507. doi: 10.1007/s00122-006-0315-1.
- 84. Lehmensiek A, Sutherland MW, McNamara RB. The use of high resolution melting (HRM) to map single nucleotide polymorphism markers linked to a covered smut resistance gene in barley. Theoretical and Applied Genetics. 2008;117(5):721–728. doi: 10.1007/s00122-008-0813-4.
- 85. Grimmer MK, Trybush S, Hanley S, Francis SA, Karp A, Asher MJC. An anchored linkage map for sugar beet based on AFLP, SNP and RAPD markers and QTL mapping of a new source of resistance to Beet necrotic yellow vein virus. Theoretical and Applied Genetics. 2007;114(7):1151–1160. doi: 10.1007/s00122-007-0507-3.
- 86. Grimmer MK, Kraft T, Francis SA, Asher MJC. QTL mapping of BNYVV resistance from the WB258 source in sugar beet. Plant Breeding. 2008;127(6):650–652.
- 87. Muchero W, Diop NN, Bhat PR, et al. A consensus genetic map of cowpea [Vigna unguiculata (L) Walp.] and synteny based on EST-derived SNPs. Proceedings of the National Academy of Sciences of the United States of America. 2009;106(43):18159–18164. doi: 10.1073/pnas.0905886106.
- 88. Jander G, Norris SR, Rounsley SD, Bush DF, Levin IM, Last RL. Arabidopsis map-based cloning in the post-genome era. Plant Physiology. 2002;129(2):440–450. doi: 10.1104/pp.003533.
- 89. Abdurakhmonov IY, Abdukarimov A. Application of association mapping to understanding the genetic diversity of plant germplasm resources. International Journal of Plant Genomics. 2008;2008:574927. doi: 10.1155/2008/574927.
- 90. Hall D, Tegström C, Ingvarsson PK. Using association mapping to dissect the genetic basis of complex traits in plants. Briefings in Functional Genomics and Proteomics. 2010;9(2):157–165. doi: 10.1093/bfgp/elp048.
- 91. Myles S, Peiffer J, Brown PJ, et al. Association mapping: critical considerations shift from genotyping to experimental design. Plant Cell. 2009;21(8):2194–2202. doi: 10.1105/tpc.109.068437.
- 92. Gore M, Buckler ES, Yu J, Zhu C. Status and prospects of association mapping in plants. The Plant Genome. 2008;1(1):5–20.
- 93. Rafalski JA. Association genetics in crop improvement. Current Opinion in Plant Biology. 2010;13(2):174–180. doi: 10.1016/j.pbi.2009.12.004.
- 94. Beló A, Zheng P, Luck S, et al. Whole genome scan detects an allelic variant of fad2 associated with increased oleic acid levels in maize. Molecular Genetics and Genomics. 2008;279(1):1–10. doi: 10.1007/s00438-007-0289-y.
- 95. Tian F, Bradbury PJ, Brown PJ, et al. Genome-wide association study of leaf architecture in the maize nested association mapping population. Nature Genetics. 2011;43(2):159–162. doi: 10.1038/ng.746.
- 96. Roy JK, Smith KP, Muehlbauer GJ, Chao S, Close TJ, Steffenson BJ. Association mapping of spot blotch resistance in wild barley. Molecular Breeding. 2010;26(2):243–256. doi: 10.1007/s11032-010-9402-8.
- 97. Pajerowska-Mukhtar K, Stich B, Achenbach U, et al. Single nucleotide polymorphisms in the Allene Oxide Synthase 2 gene are associated with field resistance to late blight in populations of tetraploid potato cultivars. Genetics. 2009;181(3):1115–1127. doi: 10.1534/genetics.108.094268.
- 98. Kump KL, Bradbury PJ, Wisser RJ, et al. Genome-wide association study of quantitative resistance to southern leaf blight in the maize nested association mapping population. Nature Genetics. 2011;43(2):163–168. doi: 10.1038/ng.747.
- 99. Harper AL, Trick M, Higgins J, et al. Associative transcriptomics of traits in the polyploid crop species Brassica napus. Nature Biotechnology. 2012;30(8):798–802. doi: 10.1038/nbt.2302.
- 100. Xu Y, Crouch JH. Marker-assisted selection in plant breeding: from publications to practice. Crop Science. 2008;48(2):391–407.
- 101.Collard BCY, Mackill DJ. Marker-assisted selection: an approach for precision plant breeding in the twenty-first century. Philosophical Transactions of the Royal Society B. 2008;363(1491):557–572. doi: 10.1098/rstb.2007.2170. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Septiningsih EM, Pamplona AM, Sanchez DL, et al. Development of submergence-tolerant rice cultivars: the Sub1 locus and beyond. Annals of Botany. 2009;103(2):151–160. doi: 10.1093/aob/mcn206.
- 103.Jin L, Lu Y, Shao Y, et al. Molecular marker assisted selection for improvement of the eating, cooking and sensory quality of rice (Oryza sativa L.). Journal of Cereal Science. 2010;51(1):159–164.
- 104.Asif M, Shaheen T, Tabbasam N, Zafar Y, Paterson AH. Marker-assisted breeding in higher plants. Alternative Farming Systems, Biotechnology, Drought Stress and Ecological Fertilisation. 2011;6:39–76.
- 105.Naidoo R, Watson GMF, Derera J, Tongoona P, Laing M. Marker-assisted selection for low phytic acid (lpa1-1) with single nucleotide polymorphism marker and amplified fragment length polymorphisms for background selection in a maize backcross breeding programme. Molecular Breeding. 2012;30:1207–1217.
- 106.Eathington SR, Crosbie TM, Edwards MD, Reiter RS, Bull JK. Molecular markers in a commercial breeding program. Crop Science. 2007;47(supplement 3):S154–S163.
- 107.Rosso ML, Burleson SA, Maupin LM, Rainey KM. Development of breeder-friendly markers for selection of MIPS1 mutations in soybean. Molecular Breeding. 2011;28(1):127–132.
- 108.Ribaut JM, Ragot M. Marker-assisted selection to improve drought adaptation in maize: the backcross approach, perspectives, limitations, and alternatives. Journal of Experimental Botany. 2007;58(2):351–360. doi: 10.1093/jxb/erl214.
- 109.Ren R, Nagel BA, Kumpatla SP, et al. Maize Cytoplasmic Male Sterility (Cms) C-Type Restorer Rf4 Gene, Molecular Markers And Their Use. Google Patents, 2011.
- 110.Ragot M, Lee M, Guimarães E, et al. Marker-assisted selection in maize: current status, potential, limitations and perspectives from the private and public sectors. Marker-Assisted Selection, Current Status and Future Perspectives in Crops, Livestock, Forestry and Fish. 2007:117–150.
- 111.Riday H. Paternity testing: a non-linkage based marker-assisted selection scheme for outbred forage species. Crop Science. 2011;51(2):631–641.
- 112.Gjertson DW, Brenner CH, Baur MP, et al. ISFG: recommendations on biostatistics in paternity testing. Forensic Science International: Genetics. 2007;1(3-4):223–231. doi: 10.1016/j.fsigen.2007.06.006.
- 113.Meuwissen THE, Hayes BJ, Goddard ME. Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001;157(4):1819–1829. doi: 10.1093/genetics/157.4.1819.
- 114.Heffner EL, Sorrells ME, Jannink JL. Genomic selection for crop improvement. Crop Science. 2009;49(1):1–12.
- 115.Zhong S, Dekkers JCM, Fernando RL, Jannink JL. Factors affecting accuracy from genomic selection in populations derived from multiple inbred lines: a barley case study. Genetics. 2009;182(1):355–364. doi: 10.1534/genetics.108.098277.
- 116.Hayes B, Goddard M. Genome-wide association and genomic selection in animal breeding. Genome. 2010;53(11):876–883. doi: 10.1139/G10-076.
- 117.Jannink JL, Lorenz AJ, Iwata H. Genomic selection in plant breeding: from theory to practice. Briefings in Functional Genomics and Proteomics. 2010;9(2):166–177. doi: 10.1093/bfgp/elq001.
- 118.Mastrangelo AM, Mazzucotelli E, Guerra D, Vita P, Cattivelli L. Improvement of drought resistance in crops: from conventional breeding to genomic selection. Crop Stress and Its Management. 2012:225–259.
- 119.Resende MDV, Resende MFR, Jr., Sansaloni CP, et al. Genomic selection for growth and wood quality in Eucalyptus: capturing the missing heritability and accelerating breeding for complex traits in forest trees. New Phytologist. 2012;194(1):116–128. doi: 10.1111/j.1469-8137.2011.04038.x.
- 120.Zhao Y, Gowda M, Liu W, et al. Accuracy of genomic selection in European maize elite breeding populations. Theoretical and Applied Genetics. 2012;124(4):769–776. doi: 10.1007/s00122-011-1745-y.