Abstract
Seed-cotton yield (SY) and lint yield (LY) are the most important yield traits of cotton. Thus, it is critical to dissect their genetic architecture. Upland cotton (Gossypium hirsutum) is widely grown worldwide. In this study, genome-wide association mapping was performed based on the CottonSNP80K array to dissect the genetic architecture of SY and LY in Upland cotton. Twenty-three significant associations were detected within four environments, including 11 associated with SY and 12 associated with LY. Seven single nucleotide polymorphisms (SNPs), TM234, TM237, TM247, TM255, TM256, TM263, and TM264, were co-associated with the two traits, which may indicate pleiotropy or tight intergenic linkage. Five SNPs, TM13332, TM39771, TM57119, TM81653, and TM81660, coincided with markers/QTLs from previous reports and could be used in marker-assisted selection. Combining functional annotations with expression analyses of the genes identified within 400 kb of the significantly associated SNPs, we hypothesize that three genes, Gh_D05G1077 and Gh_D13G1571 for SY, and Gh_A11G0775 for LY, may have the potential to increase cotton yield. These results provide useful information for understanding the genetic basis of yield traits in Upland cotton and for facilitating its high-yield breeding through molecular design.
Keywords: genome-wide association mapping, single nucleotide polymorphisms, seed-cotton yield, lint yield, Upland cotton
Introduction
Improving yield continues to be a main goal in cotton breeding. Among the four cultivated cotton species, Gossypium hirsutum L., Gossypium barbadense L., Gossypium arboreum L., and Gossypium herbaceum L., the allotetraploid G. hirsutum (2n = 4x = 52), also known as “Upland cotton”, is grown on 30.9 million ha of land in more than 80 countries worldwide and accounts for more than 95% of the world’s cotton production (https://www.usda.gov/oce/forum/2016_speeches/Cotton_Outlook_2016.pdf). Dissecting the genetic architecture of yield traits is of great significance to Upland cotton’s yield breeding. Cotton yield traits consist of seed-cotton yield (SY), lint yield (LY), boll number per plant (BN), boll weight (BW), lint percentage (LP), lint index, and seed index (SI), all of which are complicated quantitative traits controlled by quantitative trait loci (QTLs) and environmental factors. It is difficult to improve these traits simultaneously using traditional breeding methods. The application of QTL-linked or QTL-associated molecular markers for target traits in marker-assisted selection (MAS) can help avoid environmental interference and improve breeding efficiency.
Using linkage mapping, many QTLs related to the yield traits of Upland cotton have been identified (An et al. 2010, Guo et al. 2006, Li et al. 2008, Liu et al. 2012, Qin et al. 2008, 2009, Wang et al. 2007, Wu et al. 2009, Xia et al. 2014, Zhang et al. 2010). These QTLs laid an important foundation for dissecting the genetic basis of cotton yield. As an alternative for detecting QTLs, association mapping based on linkage disequilibrium (LD), has been applied in dissecting many important and complicated cotton phenotypes, such as fiber quality (Abdurakhmonov et al. 2008a, 2009, Cai et al. 2014, Qin et al. 2015, Zhang et al. 2013), disease resistance (Mei et al. 2014, Zhao et al. 2014), salt resistance (Saeed et al. 2014), plant type (Li et al. 2016), and seed quality (Liu et al. 2015). For yield traits, Zhang et al. (2013) identified 102 simple sequence repeats (SSRs) associated with seven yield traits in 81 Upland cotton cultivars; Mei et al. (2013) detected 55 marker-trait associations between 26 SSRs and seven yield traits in 356 Upland cotton cultivars; Jia et al. (2014) detected 251 SSRs associated with LY, BN, BW, and LP in 323 G. hirsutum accessions; and Qin et al. (2015) identified 28 associated SSRs for BN, BW, and LP across more than one environment based on 241 G. hirsutum collections. In our previous study, 93 significantly associated SSRs for seven yield traits across more than one environment were detected (Li et al. 2017). However, these studies were mainly based on a limited number of SSR markers; consequently, the genetic bases of the quantitative traits could not be fully revealed at the genome-wide level.
With the wide application of high-density genotyping platforms, the development of numerous single nucleotide polymorphisms (SNPs) makes it possible to dissect the genetic architecture of quantitative traits through genome-wide association (GWA) mapping. Based on the genome-wide resequencing, Fang et al. (2017) detected 71 and 45 associated SNPs for LY and fiber quality, respectively, in 258 G. hirsutum accessions; Wang et al. (2017) identified 19 candidate SNPs for fiber-quality-related traits in 267 G. hirsutum accessions; and Du et al. (2018) identified 98 significant peak associations for 11 agronomically important traits in 215 G. arboreum accessions. Compared with a re-sequencing analysis, SNP arrays can produce large-scale genotyping data through one hybridization procedure at a relatively low cost. The SNP genotyping platform of Illumina (Infinium® technology) has been widely used over the last few years. In cotton, the first commercial high-density CottonSNP63K array, developed from 13 different discovery sets that represent a diverse range of G. hirsutum germplasms, as well as five other species, provided a new resource for the genetic dissection of agronomically and economically important traits (Hulse-Kemp et al. 2015). Using the CottonSNP63K array, Gapare et al. (2017) identified 17 and 50 significant SNP associations for fiber length and micronaire, respectively; Sun et al. (2017) detected 46 significant SNPs associated with five fiber quality traits; Huang et al. (2017) identified 324 significant SNPs associated with growth period, plant type, yield, and fiber quality. Recently, a new cotton SNP-chip, the high-throughput CottonSNP80K array, covering the Upland cotton genome, was successfully created based on sequencing information from “TM-1” together with re-sequencing data from 100+ diverse accessions of Upland cotton (Cai et al. 2017). Thus, the selected SNPs in CottonSNP80K could be distributed along the entire Upland cotton genome. 
To date, using this array, Cai et al. (2017) distinguished Upland cotton accessions and detected eight significant SNPs for three salt stress traits; Yuan et al. (2018) detected 47 SNPs significantly associated with seven cottonseed nutrients using 196 G. hirsutum accessions; and Dong et al. (2019) identified 30 and 23 significant SNPs associated with five fiber quality traits in six environments and the best linear unbiased predictions using 408 Upland cotton accessions, respectively. Their diverse application tests indicated that CottonSNP80K played important roles in germplasm genotyping, variety verification, functional genomics studies, and molecular breeding in cotton.
SY and LY are the most important cotton yield traits. SY is the total weight of cottonseed and fiber and reflects the potential productivity contributing to cotton fiber, whereas LY is the weight of fiber alone and is directly relevant to the textile industry. Thus, it is critical to dissect their genetic architecture. To date, Huang et al. (2017) identified three and 28 associated SNPs for SY and LY, respectively, using the CottonSNP63K array. However, there have been no reports on the association mapping of SY and LY based on the CottonSNP80K array. In this study, GWA mapping was performed for SY and LY using the CottonSNP80K array. This study can provide useful information for further understanding the genetic basis of Upland cotton yield traits and for facilitating its high-yield breeding through molecular design.
Materials and Methods
Experimental materials
The association mapping panel consisted of 169 Upland cotton backbone cultivars (lines). Among them, 159 cultivars were selected from major cotton regions in China (Yellow River, Yangtze River, Northwestern China, and Northern China), and 10 were introduced from abroad (Li et al. 2018a). All the accessions were legally planted in the ecological cotton-growing areas of the Yellow River (Xinxiang City, Henan Province) (113°52′ E, 35°18′ N, 95 m above sea level) and Northwestern China (Shihezi City, Xinjiang Province) (85°56′ E, 44°16′ N, 442 m above sea level) in 2012 and 2013 with two replications. The climate characteristics of the four environments are listed in Table 1. Because of the large climatic differences between the Yellow River region, with a longer frost-free season, and Northwestern China, with a shorter frost-free season, we applied cultivation measures suited to each location. In Xinxiang, 14–16 plants were arranged in each row, with a row length of 5 m and a row interval of 1.0 m, while in Shihezi, 38–40 plants were arranged in each row, with a row length of 5 m and a row interval of 0.45 m. All the activities were performed as per normal local management practices. The four environments, 2012Xinxiang, 2013Xinxiang, 2012Shihezi, and 2013Shihezi, were designated E1, E2, E3, and E4, respectively.
Table 1.
Climate characteristics of the four environments
| Env. | Lowest temperature (°C) | Highest temperature (°C) | Average temperature (°C) | Average relative humidity (%) | Average lowest temperature (°C) | Average highest temperature (°C) | Sunshine duration (h) |
|---|---|---|---|---|---|---|---|
| E1 | 14.25 | 34.78 | 24.27 | 57.17 | 19.78 | 29.28 | 187.48 |
| E2 | 13.55 | 36.67 | 24.95 | 54.67 | 20.18 | 30.03 | 174.72 |
| E3 | 8.39 | 33.71 | 20.87 | 42.29 | 14.29 | 28.21 | 291.29 |
| E4 | 7.59 | 34.30 | 19.91 | 48.14 | 13.76 | 27.17 | 272.09 |
E1, E2, E3, and E4 indicate four environments: 2012Xinxiang, 2013Xinxiang, 2012Shihezi, and 2013Shihezi, respectively.
Trait phenotyping
Two cotton yield traits, SY per plant (g/plant) and LY per plant (g/plant), were investigated. Ten consecutive plants in the middle of each row were tagged for trait phenotyping. At the cotton harvest stage, all of the opening bolls from the 10 tagged plants in each row were gathered to calculate SY, and LY was subsequently determined after ginning. Statistical analyses were carried out using SAS 9.4 software (SAS Institute Inc., Cary, NC, USA). Analyses of phenotypic change trends for SY and LY are shown as boxplots for the environments and were drawn using the “R” program. Broad-sense heritability (H2) was calculated for each trait using the lme4 package in the “R” program (Bates et al. 2013).
SNP genotyping
Detailed descriptions of the SNP genotyping processes were published previously (Li et al. 2018a). Briefly, single-base extension was performed, and the chip was scanned using Illumina iScan (Illumina Inc., San Diego, CA, USA). Image files were saved and analyzed using the GenomeStudio Genotyping Module (v 1.9.4, Illumina). The SNPs with minor allele frequencies of ≥ 0.05 and integrities of ≥ 50% in the population were used for screening polymorphic loci through SNP filtering. Finally, a set of 49,650 high-quality SNPs from 77,774 original SNPs was obtained for further analysis.
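The filtering criteria above can be sketched in a few lines. This is a minimal illustration, not the GenomeStudio workflow used in the study: it assumes genotypes are coded as 0, 1, or 2 copies of one allele with `None` for a missing call, and it interprets "integrity" as the call rate (the fraction of accessions with a non-missing call). The function names and the toy genotypes are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

# Assumed coding: 0, 1, or 2 copies of one allele; None marks a missing call.
Genotype = Optional[int]


def call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of accessions with a non-missing genotype call."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


def minor_allele_frequency(calls: Sequence[Genotype]) -> float:
    """Minor allele frequency computed from the non-missing calls only."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    allele_freq = sum(observed) / (2 * len(observed))
    return min(allele_freq, 1.0 - allele_freq)


def filter_snps(
    genotypes: Dict[str, List[Genotype]],
    min_maf: float = 0.05,
    min_call_rate: float = 0.5,
) -> Dict[str, List[Genotype]]:
    """Keep SNPs that meet both the MAF and the call-rate thresholds."""
    return {
        snp: calls
        for snp, calls in genotypes.items()
        if call_rate(calls) >= min_call_rate
        and minor_allele_frequency(calls) >= min_maf
    }


# Hypothetical toy data: one informative SNP, one monomorphic, one mostly missing.
toy_genotypes = {
    "snp_polymorphic": [0, 2, 2, 0, None, 2, 0, 0],
    "snp_monomorphic": [0, 0, 0, 0, 0, 0, 0, 0],
    "snp_mostly_missing": [0, 2, None, None, None, None, None, None],
}
kept = filter_snps(toy_genotypes)
print(sorted(kept))
```

Only the first toy SNP passes both thresholds; the monomorphic SNP fails on MAF and the third fails on call rate.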
Kinship and GWA mapping
The kinship coefficient of each pair of individuals was calculated using the factored spectrally transformed linear mixed models (FaST-LMM) program, and the GWA mapping was performed using the LMM provided by the FaST-LMM program (Lippert et al. 2011) (http://www.nature.com/naturemethods/), which has been widely applied in GWA mapping (Chen et al. 2014, Fang et al. 2016, Yang et al. 2014). In the FaST-LMM program, the population structure was modeled as a random effect in the LMM using the kinship (K) matrix. Manhattan plots were drawn using the qqman R package (Turner 2014). The significance threshold for associations between SNPs and traits was set by Bonferroni correction for multiple tests (1/n), where n was the total number of SNPs used in the association mapping (Sun et al. 2017). To confirm the stability of trait-associated SNPs in the current study, we investigated whether they were consistent with previously mapped markers/QTLs for SY and LY (Huang et al. 2017, Jia et al. 2014, Keerio et al. 2018, Li et al. 2008, 2018b, Liu et al. 2012, Mei et al. 2014, Qin et al. 2009, Wang et al. 2007, Wu et al. 2009, Xia et al. 2014). The SSR primer sequences were obtained from the CottonGen Database (http://www.cottongen.org), and the physical locations were mapped to the reference genome (Zhang et al. 2015) using electronic PCR. For SNPs, we compared the physical locations directly.
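The 1/n threshold can be made concrete with a short calculation. The sketch below assumes only the marker count reported in this study (49,650 SNPs); the function names are hypothetical and the code is not part of FaST-LMM or qqman.

```python
import math


def bonferroni_p_threshold(n_tests: int, alpha: float = 1.0) -> float:
    """Per-test P-value cut-off alpha / n_tests (alpha = 1 gives the 1/n rule)."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_significant(neg_log10_p: float, n_tests: int) -> bool:
    """True when -log10(P) exceeds the -log10 of the 1/n threshold."""
    return neg_log10_p > -math.log10(bonferroni_p_threshold(n_tests))


n_snps = 49_650  # high-quality SNPs retained after filtering
threshold = -math.log10(bonferroni_p_threshold(n_snps))
# About 4.696, which the text reports truncated as 4.69.
print(f"-log10(1/{n_snps}) = {threshold:.4f}")
```

With this rule, a SNP with −log10P = 6.71 is declared significant, whereas one with −log10P = 4.5 is not.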
Gene annotation and expression analysis
In our previous study, the average LD decay distance of our population for the AD genome was estimated to be ~400 kb, where r2 dropped to half of the maximum value (Li et al. 2018a). Considering this LD decay distance and other publications, such as Sun et al. (2017) and Su et al. (2016), which assumed that the regions of SNP-associated genes for target traits were 200 kb and 1 Mb, respectively, we identified the genes located within 400 kb (200 kb upstream and downstream) of significant trait-associated SNPs. Further gene annotations were obtained from non-redundant protein sequences (nr) (ftp://ftp.ncbi.nih.gov/blast/db/FASTA/) (Altschul et al. 1997), a gene ontology analysis (GO) (http://www.geneontology.org/) (Ashburner et al. 2000), eukaryotic orthologous groups (KOG) (ftp://ftp.ncbi.nih.gov/pub/COG/KOG/) (Koonin et al. 2004), and the Kyoto Encyclopedia of Genes and Genomes (KEGG) (ftp://ftp.genome.jp/pub/kegg/) (Kanehisa et al. 2004). RNA-seq data from 19 G. hirsutum tissues (root, stem, leaf, torus, seed, ovules from −3, −1, 0, 1, 3, 5, 10, 20, 25, and 35 days post-anthesis, and fibers from 5, 10, 20 and 25 days post-anthesis) were downloaded from the NCBI SRA database under accession code PRJNA248163 (http://www.ncbi.nlm.nih.gov/sra/?term=PRJNA248163) (Zhang et al. 2015). Raw data in the fastq format were filtered using NGS QC Toolkit (Version 2.3) (Patel and Jain 2012). Clean data were obtained by removing reads containing adapters, poly-Ns, and low-quality reads from raw data. All the downstream analyses were based on clean data of a high quality determined by Q30. RNA-seq clean reads were mapped to the TM-1 genome using Tophat (Version 2.0.8) (Trapnell et al. 2009). The fragments per kilobase of transcript per million mapped fragments (FPKM) values were calculated using Cufflinks (Version 2.1.1) (Trapnell et al. 2010), and the normalized values indicated the expression level of each gene. 
A heat map of the candidate gene expression pattern was created using Mev 4.9 (Saeed et al. 2003).
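For readers unfamiliar with the FPKM unit, the textbook definition is sketched below. This is only the basic normalization formula; Cufflinks estimates abundances with a statistical model, so its values need not equal this simple ratio. The function name and the example numbers are hypothetical.

```python
def fpkm(fragment_count: float, transcript_length_bp: int,
         total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000-bp transcript
# in a library of 20 million mapped fragments.
example_value = fpkm(500, 2_000, 20_000_000)
print(example_value)
```

Dividing by transcript length and library size makes values comparable between genes of different lengths and between tissues sequenced to different depths.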
Results
Phenotypic variations for SY and LY
Phenotypic values for SY and LY of the 169 accessions in four environments were used for variation analysis. Both traits varied widely (Table 2). SY showed 3.207–7.538-fold variations across the four environments, ranging from 18.943 to 123.640 g in E1, 17.488 to 131.826 g in E2, 14.700 to 80.800 g in E3, and 22.100 to 70.871 g in E4. LY showed 2.990–10.724-fold variations across the four environments, ranging from 6.638 to 50.143 g in E1, 4.779 to 51.250 g in E2, 5.313 to 36.605 g in E3, and 9.184 to 27.456 g in E4. The mean coefficients of variation for SY and LY were 27.81% and 30.19%, respectively. The analysis of variance indicated that the variances in genotype (G), environment (E), and the interaction between genotype and environment (G × E) were all highly significant, which suggested that both SY and LY were complicated quantitative traits controlled by G, E, and G × E effects. The H2 values of SY and LY were 55.8% and 62.7%, respectively, which suggested that these two traits were under moderate genetic control. Furthermore, SY and LY were significantly positively correlated, with a correlation coefficient of 0.96 (data not shown).
Table 2.
Phenotypic variation and broad-sense heritability for seed-cotton and lint yields across four environments
| Trait | Env. | Max | Min | Average | Std | CV | G | E | G × E | H2 (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| SY | E1 | 123.640 | 18.943 | 69.588 | 23.044 | 0.331 | 3.16*** | 306.58*** | 1.70*** | 55.8 |
| | E2 | 131.826 | 17.488 | 73.258 | 20.555 | 0.281 | | | | |
| | E3 | 80.800 | 14.700 | 42.531 | 11.924 | 0.280 | | | | |
| | E4 | 70.871 | 22.100 | 45.153 | 9.947 | 0.220 | | | | |
| LY | E1 | 50.143 | 6.638 | 26.097 | 9.503 | 0.364 | 3.91*** | 202.93*** | 1.71*** | 62.7 |
| | E2 | 51.250 | 4.779 | 26.771 | 8.558 | 0.320 | | | | |
| | E3 | 36.605 | 5.313 | 16.787 | 4.984 | 0.297 | | | | |
| | E4 | 27.456 | 9.184 | 18.324 | 4.157 | 0.227 | | | | |
SY: seed-cotton yield, LY: lint yield; E1, E2, E3, and E4 indicate four environments: 2012Xinxiang, 2013Xinxiang, 2012Shihezi, and 2013Shihezi, respectively;
***: significant at the P = 0.001 level;
H2: broad-sense heritability.
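The fold variations and coefficients of variation quoted in the text follow directly from the Max, Min, Average, and Std columns of Table 2. The short check below recomputes them from the tabulated values; the variable and function names are hypothetical, and the analysis in the study itself was carried out in SAS and R.

```python
# (max, min, mean, std) per environment, copied from Table 2.
table2 = {
    "SY": {
        "E1": (123.640, 18.943, 69.588, 23.044),
        "E2": (131.826, 17.488, 73.258, 20.555),
        "E3": (80.800, 14.700, 42.531, 11.924),
        "E4": (70.871, 22.100, 45.153, 9.947),
    },
    "LY": {
        "E1": (50.143, 6.638, 26.097, 9.503),
        "E2": (51.250, 4.779, 26.771, 8.558),
        "E3": (36.605, 5.313, 16.787, 4.984),
        "E4": (27.456, 9.184, 18.324, 4.157),
    },
}


def fold_variation(max_value: float, min_value: float) -> float:
    """Ratio of the largest to the smallest phenotypic value."""
    return max_value / min_value


def coefficient_of_variation(std: float, mean: float) -> float:
    """Standard deviation expressed as a fraction of the mean."""
    return std / mean


fold = {}
mean_cv = {}
for trait, environments in table2.items():
    fold[trait] = {env: fold_variation(mx, mn)
                   for env, (mx, mn, _, _) in environments.items()}
    cvs = [coefficient_of_variation(sd, avg)
           for (_, _, avg, sd) in environments.values()]
    mean_cv[trait] = sum(cvs) / len(cvs)
    print(trait,
          f"fold range {min(fold[trait].values()):.3f}-{max(fold[trait].values()):.3f},",
          f"mean CV {100 * mean_cv[trait]:.2f}%")
```

The recomputed ranges and mean CVs agree with the values reported in the text to the stated precision.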
Estimates of relative kinship
A set of 49,650 high-quality SNPs was used to assess the kinship coefficient. The kinship coefficient of each pair of individuals was calculated using the FaST-LMM program. Pairwise relative kinship values of 0 accounted for 57.38% of all the kinship coefficients. In addition, kinship values from 0 to 0.05 accounted for more than 81.64% of all the pairwise kinship coefficients (Fig. 1). Only 0.18% of the pairwise relative kinship coefficients were greater than 0.5. The kinship analysis indicated that most accessions had no, or only a weak, relationship with others in this cotton panel, which agreed with their various sources. In the FaST-LMM program, population structure was modeled as a random effect in the LMM using the kinship (K) matrix (Supplemental Table 1). This controlled for spurious associations, as indicated by the genomic inflation factor being near 1 in all the GWA analyses.
Fig. 1.
Histogram frequency distribution of pairwise relative kinship coefficients.
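The genomic inflation factor mentioned above is commonly computed as the median of the observed association test statistics divided by the median expected under the null hypothesis. The sketch below shows one standard way to obtain it from P-values, assuming 1-degree-of-freedom chi-square statistics; it is not taken from FaST-LMM, and the function name and toy P-values are hypothetical.

```python
from statistics import NormalDist, median
from typing import Iterable

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(p_values: Iterable[float]) -> float:
    """Median observed 1-df chi-square statistic over its null expectation."""
    std_normal = NormalDist()
    chi2_stats = []
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("P-values must lie in (0, 1]")
        # Two-sided P-value -> z-score -> 1-df chi-square statistic.
        z = std_normal.inv_cdf(p / 2.0)
        chi2_stats.append(z * z)
    if not chi2_stats:
        raise ValueError("at least one P-value is required")
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Toy data: evenly spaced P-values mimic a well-calibrated null distribution,
# and squaring them mimics systematically inflated test statistics.
uniform_p = [i / 1000 for i in range(1, 1000)]
inflated_p = [p * p for p in uniform_p]

lambda_null = genomic_inflation_factor(uniform_p)
lambda_inflated = genomic_inflation_factor(inflated_p)
print(round(lambda_null, 3), lambda_inflated > 1.0)
```

A value close to 1 suggests that population structure and relatedness are adequately accounted for, whereas values well above 1 point to residual confounding.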
GWA mapping for SY and LY
GWA mapping for SY and LY was performed using the LMM provided by the FaST-LMM program. The quantile–quantile plots showed that the model could be used to identify association signals (Supplemental Figs. 1, 2). To obtain reliable results, the SNPs with −log10P values greater than 4.69 (1/49,650) were selected as significant trait-associated SNPs. In total, 23 SNPs associated with SY and/or LY were found to be significant within the four environments (Table 3). For SY, 11 significant SNPs were detected, which were distributed on the four chromosomes, A1, A11, D5, and D13. The proportion of phenotypic variation explained by markers ranged from 3.40% to 23.44% across the four environments. For LY, 12 significant SNPs were detected on the three chromosomes, A1, A5, and A11, and explained 3.73% to 23.92% of the phenotypic variation across the different environments. In addition, seven SNPs, TM234, TM237, TM247, TM255, TM256, TM263, and TM264, were identified as co-associated with SY and LY in E4 and were located close to each other in the peak region (4.62–4.83 Mb) on chromosome A1 (Fig. 2).
Table 3.
Single nucleotide polymorphisms (SNPs) significantly associated with seed-cotton and lint yields at the genome-wide level
| Trait | SNP | Chr. | Position | Alleles | −log10P | R2 (%) | Env. |
|---|---|---|---|---|---|---|---|
| SY | TM234 | A1 | 4619010 | T/C | 4.7 | 17.96 | E4 |
| | TM237 | A1 | 4658823 | C/G | 5.26 | 19.41 | E4 |
| | TM247 | A1 | 4701228 | A/G | 6.71 | 23.44 | E4 |
| | TM255 | A1 | 4768647 | T/C | 5.45 | 20.37 | E4 |
| | TM256 | A1 | 4772798 | T/C | 4.91 | 19.93 | E4 |
| | TM263 | A1 | 4813977 | A/G | 5.83 | 20.42 | E4 |
| | TM264 | A1 | 4834038 | T/C | 6.13 | 20.33 | E4 |
| | TM39771 | A11 | 90796512 | T/A | 4.94 | 3.40 | E3 |
| | TM57119 | D5 | 9348880 | T/C | 4.94 | 13.02 | E2 |
| | TM81653 | D13 | 48302193 | T/C | 5 | 19.02 | E1 |
| | TM81660 | D13 | 48354365 | T/C | 5 | 20.15 | E1 |
| LY | TM234 | A1 | 4619010 | T/C | 5.11 | 18.92 | E4 |
| | TM237 | A1 | 4658823 | C/G | 5.57 | 20.02 | E4 |
| | TM241 | A1 | 4674079 | T/G | 4.76 | 2.39 | E4 |
| | TM247 | A1 | 4701228 | A/G | 6.86 | 23.92 | E4 |
| | TM254 | A1 | 4750704 | A/G | 4.98 | 18.28 | E4 |
| | TM255 | A1 | 4768647 | T/C | 5.73 | 20.95 | E4 |
| | TM256 | A1 | 4772798 | T/C | 4.96 | 20.57 | E4 |
| | TM261 | A1 | 4805646 | T/C | 4.71 | 3.53 | E4 |
| | TM263 | A1 | 4813977 | A/G | 6.11 | 20.99 | E4 |
| | TM264 | A1 | 4834038 | T/C | 6.27 | 20.97 | E4 |
| | TM13332 | A5 | 88305539 | T/C | 4.75 | 8.34 | E3 |
| | TM37263 | A11 | 7633456 | T/A | 5.32 | 3.73 | E3 |
SY: seed-cotton yield, LY: lint yield; R2 (%): phenotypic variation explained by marker; E1, E2, E3, and E4 indicate four environments: 2012Xinxiang, 2013Xinxiang, 2012Shihezi, and 2013Shihezi, respectively.
Fig. 2.
Manhattan plots of genome-wide association (GWA) mapping results for seed-cotton yield (A) and lint yield (B) on chromosome A1 in E4. The dashed line represents the significance threshold. SY: seed-cotton yield, LY: lint yield.
To confirm the stability of trait-associated SNPs in the current study, we investigated previously mapped markers/QTLs for SY and LY. A total of 213 primer sequences and/or the physical locations were obtained. Finally, five SNPs, TM13332, TM39771, TM57119, TM81653, and TM81660, which were identified in the current study and located on chromosomes A5, A11, D5, and D13, respectively, coincided with previously reported markers/QTLs (Fig. 3). Specifically, the marker TM13332 on chromosome A5 associated with LY was near NAU3036, which was detected in our previous study (Li et al. 2017), with the distance between them being 2.3 Mb. The marker TM39771 on chromosome A11 associated with SY was near NAU5428 and GH256, which were identified in our previous study (Li et al. 2017), with the distances between them and TM39771 being 1.4 Mb and 3 Mb, respectively. The marker TM57119 on chromosome D5 associated with SY was near NAU3269, which was detected by Mei et al. (2013), with the distance between them being 6.3 Mb. The markers TM81653 and TM81660 on chromosome D13, which are associated with SY, were on the qSCY-07X-c18-1 (BNL0193–BNL0569) overlap detected by Yu et al. (2013). These two markers were also positioned near GH501, NAU2443, and DPL0864, which were detected in our previous study (Li et al. 2017), with distances of 0.7 Mb, 3 Mb, and 8 Mb, respectively, from TM81653 and TM81660.
Fig. 3.
Five SNPs, TM13332, TM39771, TM57119, TM81653 and TM81660, identified in the current study coincided with previously reported markers/QTLs on four chromosomes. The unit of physical distance for the chromosomes is Mb.
Gene annotation and expression analysis
We counted the genes located near the significant trait-associated SNPs using the G. hirsutum TM-1 genome (Zhang et al. 2015). A total of 177 genes were identified within 400 kb (200 kb upstream and downstream) of the significant SNPs (Supplemental Table 2), including 116 for SY and 96 for LY (because seven SNPs on chromosome A1 were co-associated, 35 genes were common to SY and LY). We investigated the distribution of these genes by referring to the genome information. In total, 114 genes were in the A subgenome and 63 were in the D subgenome. Comprehensive functional analyses using the nr, GO, KOG, and KEGG databases (Supplemental Table 2) showed that 43 genes were correlated with binding; 40 genes were correlated with cell wall/membrane/envelope/ribosomal biogenesis pathways; 30 genes were components of proteins/enzymes; 19 genes were correlated with transferase activities; 16 genes were involved in substance transport and/or metabolism; 16 genes were correlated with kinase activities; 13 were correlated with morphogenesis and organ development; 8 genes were correlated with defense/resistance responses; 7 genes were correlated with signal transduction mechanisms; 4 genes had functions in energy production and conversion; 4 were correlated with posttranslational modification, protein turnover, and chaperones; and 3, 2, 3, 3, and 2 were involved in cell division, seed dormancy or germination control, root development or cell differentiation, leaf morphogenesis, and regulation of flower development, respectively. The other 40 genes were uncharacterized, putative, or coded for a hypothetical protein.
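The window-based gene search and the gene tally can be illustrated as follows. This is a minimal sketch under stated assumptions: a gene is counted when any part of it overlaps the interval from 200 kb upstream to 200 kb downstream of the SNP, the gene records are hypothetical placeholders rather than TM-1 annotations, and only the SNP position (TM247 on A1) and the gene counts come from the text.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    gene_id: str
    chrom: str
    start: int  # bp
    end: int    # bp


def genes_near_snp(snp_chrom: str, snp_pos: int, genes: List[Gene],
                   flank: int = 200_000) -> List[str]:
    """IDs of genes overlapping [snp_pos - flank, snp_pos + flank] on the SNP's chromosome."""
    lower, upper = snp_pos - flank, snp_pos + flank
    return [g.gene_id for g in genes
            if g.chrom == snp_chrom and g.start <= upper and g.end >= lower]


# Hypothetical gene models used only to exercise the function.
toy_genes = [
    Gene("gene_a", "A1", 4_600_000, 4_603_000),  # inside the window
    Gene("gene_b", "A1", 4_950_000, 4_955_000),  # beyond the downstream limit
    Gene("gene_c", "D5", 4_700_000, 4_702_000),  # different chromosome
    Gene("gene_d", "A1", 4_499_000, 4_502_000),  # straddles the upstream limit
]
nearby = genes_near_snp("A1", 4_701_228, toy_genes)  # TM247 position from Table 3
print(nearby)

# Genes shared by both traits are counted once in the combined total.
n_sy, n_ly, n_shared = 116, 96, 35
n_total = n_sy + n_ly - n_shared
print(n_total)
```

The last lines reproduce the reported total: 116 genes for SY plus 96 for LY, minus the 35 shared genes, gives 177.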
To investigate which genes were responsible for SY and/or LY, the expression levels of the 177 genes were analyzed using RNA-seq data (Supplemental Table 3) from 19 G. hirsutum tissues (Zhang et al. 2015). A heat map of the expression patterns of the 177 genes in G. hirsutum tissues is shown in Fig. 4. For SY, 23 genes presented higher expression levels in nearly all of the tissues, except torus; 26 genes were barely expressed; and 10 genes (Gh_A01G0320, Gh_A01G0325, Gh_A01G0335, Gh_A01G0336, Gh_A01G0350, Gh_A11G2795, Gh_D05G1109, Gh_D05G1112, Gh_D05G1116, and Gh_D13G1581) were highly expressed in ovules. For LY, 18 genes presented higher expression levels in nearly all of the tissues, except torus; 19 genes were barely expressed; and 2 (Gh_A01G0347 and Gh_A11G0760) were highly expressed during fiber development.
Fig. 4.
Heat map of the expression of the 177 genes in tissues of G. hirsutum (TM-1). Red indicates high expression, and green indicates low expression.
Discussion
A suitable association mapping panel should embrace as much phenotypic and genotypic variation as can be reliably measured in common environments (Flint-Garcia et al. 2005). In the present study, the phenotypic evaluation of SY and LY suggested there was relatively abundant variation and significant differences among genotypes and environments. Based on the CottonSNP80K array, the marker data showed that 49,650 of the 77,774 SNPs (63.84%) were high-quality polymorphic markers. Similarly, Cai et al. (2017) detected 59,502 SNPs (76.51%) using 352 cotton accessions; Yuan et al. (2018) detected 41,815 SNPs (53.76%) using 196 G. hirsutum accessions; and Dong et al. (2019) identified 48,072 SNPs (61.81%) using 408 Upland cotton accessions. In recent reports based on the CottonSNP63K array, Gapare et al. (2017) detected 32,183 SNPs (51.04%) using 103 cotton accessions; Huang et al. (2017) detected 11,975 quantified SNPs (18.99%) using 503 G. hirsutum accessions; and Sun et al. (2017) detected 10,511 SNPs (16.67%) using 719 diverse accessions of Upland cotton. The level of DNA polymorphism detected with the CottonSNP80K array was therefore higher than that reported in studies using the CottonSNP63K array, which improves the detection power of association mapping for target traits. In this study, the H2 values of SY and LY were 55.8% and 62.7%, respectively. Although these two traits were under moderate genetic control, 11 SNPs associated with SY and 12 SNPs associated with LY were detected within four environments, and most of these associations (18 of 23) explained more than 10% of the phenotypic variation (Table 3). This further indicated the high efficiency of the CottonSNP80K array for GWA mapping. Among the 23 associated SNPs, 7 SNPs, TM234, TM237, TM247, TM255, TM256, TM263, and TM264, located close to each other in the peak region (4.62–4.83 Mb) on chromosome A1, were co-associated with SY and LY. 
Similarly, our previous study identified 23 co-associated markers with two or more different traits, of which seven were co-associated with SY and LY (Li et al. 2017). Similar conclusions were reached in other publications. Mei et al. (2013) reported that 17 markers were co-associated with two or more different yield traits, of which four were co-associated with SY and LY; Huang et al. (2017) detected one associated marker that was co-associated with SY and LY. Furthermore, the phenomenon of co-localization QTLs in linkage mapping also existed. Li et al. (2008) and Yu et al. (2013) found four and seven QTL clusters, respectively, which affect SY and LY simultaneously. Thus, pleiotropy or tight intergenic linkage could exist, which may explain the positive correlations between SY and LY. In breeding programs, it would be useful to improve SY and LY simultaneously using closely linked advantageous genes.
Obtaining stable markers/QTLs is the prerequisite and foundation for MAS. To confirm the stability of trait-associated SNPs in the current study, we investigated whether they were consistent with previously mapped markers/QTLs for SY and LY. Using the physical positions of the previously identified SSR markers, five SNPs (TM13332, TM39771, TM57119, TM81653, and TM81660) on chromosomes A5, A11, D5, and D13 identified in the current study coincided with the previously reported markers/QTLs. In addition, previous studies identified associated SNPs for yield traits, but most of the results focused on BN, BW, LP, and SI (Du et al. 2018, Fang et al. 2017, Wang et al. 2017). For SY and LY, Huang et al. (2017) detected 28 SNPs associated with LY. Of them, three SNPs, i02589Gh, i19569Gh, and i07452Gh, were on chromosomes A1, A5, and A11, respectively. Keerio et al. (2018) detected one SNP, M561, linked to a QTL for LY, qLW-A05-1. However, these four SNPs were beyond the LD decay distance of ~400 kb for the SNPs identified in our study. Li et al. (2018b) detected a few QTLs for SY and LY, but we could not compare the results owing to the unknown physical positions of the SNPs related to these QTLs. In summary, the five SNPs, which were detected simultaneously in different populations with different genetic backgrounds, have sufficient stability and can potentially be used in MAS breeding. However, we did not find significant trait-associated SNPs that were detected simultaneously in different environments in the current study. This might be attributed to: (1) the complex inheritance of yield traits, especially a highly significant interaction between yield-related genes and environment; (2) the limited number of SNP markers on the cotton SNP-chip, which results in the inability to determine all the SNPs in a region; (3) insufficient information on SY and LY; and (4) the relatively small sample size. 
Thus, we will enlarge the research population and focus on an analysis of interactions between genotype and environment using resequencing technology in the future.
GWA mapping is especially useful for identifying candidate genes underlying complicated quantitative traits (Flister et al. 2013). In this study, 177 genes were identified within 400 kb of the significantly associated SNPs. Among these, eight genes, Gh_A01G0320, Gh_A11G0775, Gh_A11G0777, Gh_D05G1077, Gh_D05G1078, Gh_D05G1091, Gh_D05G1113, and Gh_D13G1571, are directly related to plant growth and development, and thus may affect cotton yield. Specifically, Gh_A11G0775 is homologous to COP9, which, together with COP8 and COP11, defines a novel signaling step mediating the light control of plant development (Wei et al. 1994). Abdurakhmonov et al. (2008b) also reported COP9 as a photomorphogenesis-related factor that was targeted by putative ovule-derived short interfering RNAs during the initiation and elongation phases of fiber development in Upland cotton, indirectly supporting a role for light signaling in fiber traits. Gh_D05G1113 is homologous to HCF173; the hcf173 mutant shows a high chlorophyll fluorescence phenotype with severely affected accumulation of photosystem II subunits in Arabidopsis thaliana (Schult et al. 2007), and the gene may affect the photochemical reflectance index in soybean (Herritt et al. 2016). Gh_D05G1091 is homologous to AtFAAH, whose overexpression alters phytohormone accumulation and signaling, which, in turn, compromises innate immunity against bacterial pathogens (Kang et al. 2008). Gh_D05G1078 is homologous to ACA9, a key regulator of pollen tube growth and fertilization in flowering plants (Schiøtt et al. 2004). Gh_D05G1077 is homologous to AT2G18840, which plays a critical role in pollen tube-pistil signaling, fertilization, and seed production (Hafidh et al. 2016). Gh_A01G0320 encodes LZF1/STH3, which acts as a positive regulator of photomorphogenesis, affects anthocyanin accumulation and chloroplast biogenesis, and regulates light-dependent development in Arabidopsis (Chang et al. 2008, Datta et al. 2008). Gh_D13G1571 is homologous to AT1G53190 (TEAR1), which positively regulates CIN-like TCP activity to promote leaf development by mediating the degradation of the TCP repressor TIE1 in Arabidopsis (Zhang et al. 2017). Gh_A11G0777 is homologous to AT4G00730, which affects the accumulation of anthocyanin in subepidermal cells and the organization of cells in the primary root in A. thaliana, suggesting that this homeobox gene determines the nature of the epidermis and/or subepidermis (Kubo et al. 1999). Although these genes are predicted to affect cotton yield, direct evidence linking their functions to cotton yield is still lacking.
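The candidate-gene search described above, collecting annotated genes within 400 kb of each significantly associated SNP, can be illustrated with a minimal sketch. It assumes that "within 400 kb" means a gene overlapping the interval from 400 kb upstream to 400 kb downstream of the SNP position; the gene names, chromosome labels, and coordinates are hypothetical placeholders, not values from this study.

```python
from typing import List, NamedTuple

WINDOW_BP = 400_000  # assumed: 400 kb flank on each side of the SNP


class Gene(NamedTuple):
    gene_id: str
    chrom: str
    start: int  # gene start coordinate (bp)
    end: int    # gene end coordinate (bp)


def genes_near_snp(chrom: str, pos: int, genes: List[Gene],
                   window: int = WINDOW_BP) -> List[str]:
    """Return IDs of genes on `chrom` that overlap [pos - window, pos + window]."""
    lo, hi = pos - window, pos + window
    return [g.gene_id for g in genes
            if g.chrom == chrom and g.start <= hi and g.end >= lo]


# Hypothetical annotation records, for illustration only.
genes = [
    Gene("gene_a", "D05", 1_000_000, 1_005_000),
    Gene("gene_b", "D05", 1_450_000, 1_452_000),
    Gene("gene_c", "D05", 2_000_000, 2_003_000),  # beyond the window
    Gene("gene_d", "A11", 1_100_000, 1_102_000),  # different chromosome
]

# Hypothetical significant SNP at D05:1,100,000.
candidates = genes_near_snp("D05", 1_100_000, genes)
print(candidates)
```

Under these assumptions, only the two genes on the same chromosome that overlap the window are reported; a stricter or looser window definition would change the candidate list.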
RNA-seq can quantify both overall and allele-specific gene expression, which can then be linked to or associated with genetic markers, even across experimental conditions, to detect expression variation. The analysis of gene expression levels is a preliminary approach to gene functional analysis. In the present study, the expression levels of 177 genes, including 116 genes for SY and 95 for LY (35 candidate genes were common to SY and LY), were analyzed based on RNA-seq data from 19 tissues. For SY, 10 genes, Gh_A01G0320, Gh_A01G0325, Gh_A01G0335, Gh_A01G0336, Gh_A01G0350, Gh_A11G2795, Gh_D05G1109, Gh_D05G1112, Gh_D05G1116, and Gh_D13G1581, were highly expressed in the ovule, suggesting that these genes may affect the formation and development of cottonseed and thus contribute to SY. Two genes, Gh_D05G1077 and Gh_D13G1571, were highly expressed in all of the tissues except the torus. As mentioned above, the homolog of Gh_D05G1077 plays a critical role in pollen tube-pistil signaling, fertilization, and seed production (Hafidh et al. 2016), and the homolog of Gh_D13G1571 plays an important role in leaf development by mediating the degradation of the TCP repressor TIE1 (Zhang et al. 2017). These two genes may therefore contribute to the biological production of cotton, and in turn to SY. For LY, two genes, Gh_A01G0347 and Gh_A11G0760, were highly expressed during fiber development, suggesting that they have potential roles in increasing LY. Gh_A11G0775 was also highly expressed in all of the tissues except the torus, suggesting that it contributes to the biological production of cotton. Additionally, as mentioned above, Gh_A11G0775, a COP9 homolog, might be involved in a novel signaling step that mediates the light control of fiber development in Upland cotton (Abdurakhmonov et al. 2008b, Wei et al. 1994).
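The screening for genes that are highly expressed in every tissue except one (here, the torus) can be sketched as a simple filter over a gene-by-tissue expression table. This is only an illustration: the expression values, the tissue subset, and the cut-off of 20 are hypothetical, and the threshold actually used to call a gene "highly expressed" is not stated in this section.

```python
from typing import Dict, FrozenSet, List


def broadly_expressed(expr: Dict[str, Dict[str, float]],
                      threshold: float,
                      exempt: FrozenSet[str] = frozenset()) -> List[str]:
    """Return sorted gene IDs with expression >= threshold in every
    tissue that is not listed in `exempt`."""
    return sorted(
        gene for gene, by_tissue in expr.items()
        if all(value >= threshold
               for tissue, value in by_tissue.items()
               if tissue not in exempt)
    )


# Hypothetical expression values (arbitrary units), for illustration only.
expr = {
    "gene_x": {"ovule": 50.0, "fiber": 42.0, "leaf": 61.0, "torus": 1.2},
    "gene_y": {"ovule": 80.0, "fiber": 3.0, "leaf": 2.5, "torus": 0.4},
    "gene_z": {"ovule": 30.0, "fiber": 25.0, "leaf": 28.0, "torus": 27.0},
}

# Genes above the (assumed) cut-off everywhere except possibly the torus.
hits = broadly_expressed(expr, threshold=20.0, exempt=frozenset({"torus"}))
print(hits)
```

Exempting a tissue only relaxes the requirement for that tissue, so a gene that is also expressed in the torus still passes the filter.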
Although the candidate genes identified in this study showed great potential for increasing the yield of Upland cotton, their functions still need to be characterized by analyzing overexpressing and down-regulated lines. Additionally, analyzing the interactions between QTLs and previously reported genes or environmental response factors, and investigating the factors that regulate co-association, could be of interest.
Acknowledgments
This work was supported by the National Key Research and Invention Program of the Thirteenth (2016YFD0101413), the National Natural Science Foundation of China (31671743), the Technology Demonstration and Industrialization of Seed Industry Facing the Five Central Asian Countries (161100510100), and the Key Scientific Research Projects of Higher Education of Henan Province (17A210012).
Footnotes
Author Contribution Statement
CQL, TD and QYY performed the experiments. GRL, XLG and RRS analyzed the data. YYW, CQL and QLW prepared the manuscript. QLW provided the materials. All authors have read, edited and approved the current version of the manuscript.
Literature Cited
- Abdurakhmonov, I.Y., Kohel, R.J., Yu, J.Z., Pepper, A.E., Abdullaev, A.A., Kushanov, F.N., Salakhutdinov, I.B., Buriev, Z.T., Saha, S., Scheffler, B.E. et al. (2008a) Molecular diversity and association mapping of fiber quality traits in exotic G. hirsutum L. germplasm. Genomics 92: 478–487.
- Abdurakhmonov, I.Y., Devor, E.J., Buriev, Z.T., Huang, L.Y., Makamov, A., Shermatov, S.E., Bozorov, T., Kushanov, F.N., Mavlonov, G.T. and Abdukarimov, A. (2008b) Small RNA regulation of ovule development in the cotton plant, G. hirsutum L. BMC Plant Biol. 8: 93.
- Abdurakhmonov, I.Y., Saha, S., Jenkins, J.N., Buriev, Z.T., Shermatov, S.E., Scheffler, B.E., Pepper, A.E., Yu, J.Z., Kohel, R.J. and Abdukarimov, A. (2009) Linkage disequilibrium based association mapping of fiber quality traits in G. hirsutum L. variety germplasm. Genetica 136: 401–417.
- Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J.H., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25: 3389–3402.
- An, C.F., Jenkins, J.N., Wu, J.X., Guo, Y.F. and McCarty, J.C. (2010) Use of fiber and fuzz mutants to detect QTL for yield components, seed, and fiber traits of Upland cotton. Euphytica 172: 21–34.
- Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T. et al. (2000) Gene ontology: tool for the unification of biology. Nat. Genet. 25: 25–29.
- Bates, D., Maechler, M. and Bolker, B. (2013) Lme4: Linear mixed-effects models using S4 classes, R package version 0.999999-2.
- Cai, C.P., Ye, W.X., Zhang, T.Z. and Guo, W.Z. (2014) Association analysis of fiber quality traits and exploration of elite alleles in Upland cotton cultivars/accessions (Gossypium hirsutum L.). J. Integr. Plant Biol. 56: 51–62.
- Cai, C.P., Zhu, G.Z., Zhang, T.Z. and Guo, W.Z. (2017) High-density 80K SNP array is a powerful tool for genotyping G. hirsutum accessions and genome analysis. BMC Genomics 18: 654.
- Chang, C.S., Li, Y.H., Chen, L.T., Chen, W.C., Hsieh, W.P., Shin, J., Jane, W.N., Chou, S.J., Choi, G., Hu, J.M. et al. (2008) LZF1, a HY5-regulated transcriptional factor, functions in Arabidopsis deetiolation. Plant J. 54: 205–219.
- Chen, W., Gao, Y., Xie, W., Gong, L., Lu, K., Wang, W., Li, Y., Liu, X., Zhang, H.Y., Dong, H.X. et al. (2014) Genome-wide association analyses provide genetic and biochemical insights into natural variation in rice metabolism. Nat. Genet. 46: 714–721.
- Datta, S., Johansson, H., Hettiarachchi, C., Irigoyen, M.L., Desai, M., Rubio, V. and Holm, M. (2008) LZF1/SALT TOLERANCE HOMOLOG3, an Arabidopsis B-box protein involved in light-dependent development and gene expression, undergoes COP1-mediated ubiquitination. Plant Cell 20: 2324–2338.
- Dong, C.G., Wang, J., Yu, Y., Ju, L.Z., Zhou, X.F., Ma, X.M., Mei, G.F., Han, Z.G., Si, Z.F., Li, B.C. et al. (2019) Identifying functional genes influencing Gossypium hirsutum fiber quality. Front. Plant Sci. 9: 1968.
- Du, X., Huang, G., He, S.P., Yang, Z.E., Sun, G.F., Ma, X.F., Li, N., Zhang, X.Y., Sun, J.L., Liu, M. et al. (2018) Resequencing of 243 diploid cotton accessions based on an updated A genome identifies the genetic basis of key agronomic traits. Nat. Genet. 50: 796–802.
- Fang, C.Y., Zhang, H., Wan, J., Wu, Y.Y., Li, K., Jin, C., Chen, W., Wang, S.C., Wang, W.S., Zhang, H.W. et al. (2016) Control of leaf senescence by an MeOH-jasmonates cascade that is epigenetically regulated by OsSRT1 in rice. Mol. Plant 9: 1366–1378.
- Fang, L., Wang, Q., Hu, Y., Jia, Y.H., Chen, J.D., Liu, B.L., Zhang, Z.Y., Guan, X.Y., Chen, S.Q., Zhou, B.L. et al. (2017) Genomic analyses in cotton identify signatures of selection and loci associated with fiber quality and yield traits. Nat. Genet. 49: 1089–1098.
- Flint-Garcia, S.A., Thuillet, A.C., Yu, J., Pressoir, G., Romero, S.M., Mitchell, S.E., Doebley, J., Kresovich, S., Goodman, M.M. and Buckler, E.S. (2005) Maize association population: a high-resolution platform for quantitative trait locus dissection. Plant J. 44: 1054–1064.
- Flister, M.J., Tsaih, S., O’Meara, C., Endres, B., Hoffman, M.J., Geurts, A.M., Dwinell, M.R., Lazar, J., Jacob, H.J. and Moreno, C. (2013) Identifying multiple causative genes at a single GWAS locus. Genome Res. 23: 1996–2002.
- Gapare, W., Conaty, W., Zhu, Q.H., Liu, S.M., Stiller, W., Llewellyn, D. and Wilson, I. (2017) Genome-wide association study of yield components and fibre quality traits in a cotton germplasm diversity panel. Euphytica 213: 66.
- Guo, W.Z., Ma, G.J., Zhu, Y.C., Yi, C.X. and Zhang, T.Z. (2006) Molecular tagging and mapping of quantitative trait loci for lint percentage and morphological marker genes in Upland cotton. J. Integr. Plant Biol. 48: 320–326.
- Hafidh, S., Potěšil, D., Fíla, J., Čapková, V., Zdráhal, Z. and Honys, D. (2016) Quantitative proteomics of the tobacco pollen tube secretome identifies novel pollen tube guidance proteins important for fertilization. Genome Biol. 17: 81.
- Herritt, M., Dhanapal, A.P. and Fritschi, F.B. (2016) Identification of genomic loci associated with the photochemical reflectance index by genome-wide association study in soybean. Plant Genome 9.
- Huang, C., Nie, X.H., Shen, C., You, C.Y., Li, W., Zhao, W.X., Zhang, X.L. and Lin, Z.X. (2017) Population structure and genetic basis of the agronomic traits of Upland cotton in China revealed by a genome-wide association study using high-density SNPs. Plant Biotechnol. J. 15: 1374–1386.
- Hulse-Kemp, A.M., Lemm, J., Plieske, J., Ashrafi, H., Buyyarapu, R., Fang, D.D., Frelichowski, J., Giband, M., Hague, S., Hinze, L.L. et al. (2015) Development of a 63k SNP array for cotton and high-density mapping of intraspecific and interspecific populations of Gossypium spp. G3 (Bethesda) 5: 1187–1209.
- Jia, Y.H., Sun, X.W., Sun, J.L., Pan, Z.E., Wang, X.W., He, S.P., Xiao, S.H., Shi, W.J., Zhou, Z.L., Pang, B.Y. et al. (2014) Association mapping for epistasis and environmental interaction of yield traits in 323 cotton cultivars under 9 different environments. PLoS ONE 9: e95882.
- Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y. and Hattori, M. (2004) The KEGG resource for deciphering the genome. Nucleic Acids Res. 32: 277–280.
- Kang, L., Wang, Y.S., Uppalapati, S.R., Wang, K., Tang, Y., Vadapalli, V., Venables, B.J., Chapman, K.D., Blancaflor, E.B. and Mysore, K.S. (2008) Overexpression of a fatty acid amide hydrolase compromises innate immunity in Arabidopsis. Plant J. 56: 336–349.
- Keerio, A.A., Shen, C., Nie, Y., Ahmed, M.M., Zhang, X. and Lin, Z. (2018) QTL mapping for fiber quality and yield traits based on introgression lines derived from Gossypium hirsutum × G. tomentosum. Int. J. Mol. Sci. 19: 243.
- Koonin, E.V., Fedorova, N.D., Jackson, J.D., Jacobs, A.R., Krylov, D.M., Makarova, K.S., Mazumder, R., Mekhedov, S.L., Nikolskaya, A.N., Rao, B.S. et al. (2004) A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes. Genome Biol. 5: R7.
- Kubo, H., Peeters, A.J., Aarts, M.G., Pereira, A. and Koornneef, M. (1999) ANTHOCYANINLESS2, a homeobox gene affecting anthocyanin distribution and root development in Arabidopsis. Plant Cell 11: 1217–1226.
- Li, C.Q., Guo, W.Z., Ma, X.L. and Zhang, T.Z. (2008) Tagging and mapping of QTL for yield and its components in Upland cotton (Gossypium hirsutum L.) population with varied lint percentage. Cotton Sci. 20: 163–169.
- Li, C.Q., Ai, N.J., Zhu, Y.J., Wang, Y.Q., Chen, X.D., Li, F., Hu, Q.Y. and Wang, Q.L. (2016) Association mapping and favourable allele exploration for plant architecture traits in Upland cotton (Gossypium hirsutum L.) accessions. J. Agric. Sci. 154: 567–583.
- Li, C.Q., Dong, N., Fu, Y.Z., Sun, R.R. and Wang, Q.L. (2017) Marker detection and elite allele mining for yield traits in Upland cotton (Gossypium hirsutum L.) by association mapping. J. Agric. Sci. 155: 613–628.
- Li, C.Q., Wang, Y.Y., Ai, N.J., Li, Y. and Song, J.F. (2018a) A genome-wide association study of early-maturation traits in Upland cotton based on the CottonSNP80K array. J. Integr. Plant Biol. 60: 970–985.
- Li, C., Zhao, T.L., Yu, H.R., Li, C., Deng, X.L., Dong, Y.T., Zhang, F., Zhang, Y., Mei, L., Chen, J.H. et al. (2018b) Genetic basis of heterosis for yield and yield components explored by QTL mapping across four genetic populations in upland cotton. BMC Genomics 19: 910.
- Lippert, C., Listgarten, J., Liu, Y., Kadie, C.M., Davidson, R.I. and Heckerman, D. (2011) FaST linear mixed models for genome-wide association studies. Nat. Methods 8: 833–835.
- Liu, G.Z., Mei, H.X., Wang, S., Li, X.H., Zhu, X.F. and Zhang, T.Z. (2015) Association mapping of seed oil and protein contents in Upland cotton. Euphytica 205: 637–645.
- Liu, R.Z., Wang, B.H., Guo, W.Z., Qin, Y.S., Wang, L.G., Zhang, Y.M. and Zhang, T.Z. (2012) Quantitative trait loci mapping for yield and its components by using two immortalized populations of a heterotic hybrid in Gossypium hirsutum L. Mol. Breed. 29: 297–311.
- Mei, H.X., Zhu, X.F. and Zhang, T.Z. (2013) Favorable QTL alleles for yield and its components identified by association mapping in Chinese Upland cotton cultivars. PLoS ONE 8: e82193.
- Mei, H.X., Ai, N.J., Zhang, X., Ning, Z.Y. and Zhang, T.Z. (2014) QTLs conferring FOV7 resistance detected by linkage and association mapping in Upland cotton. Euphytica 197: 237–249.
- Patel, R.K. and Jain, M. (2012) NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS ONE 7: e30619.
- Qin, H.D., Guo, W.Z., Zhang, Y.M. and Zhang, T.Z. (2008) QTL mapping of yield and fiber traits based on a four-way cross population in Gossypium hirsutum L. Theor. Appl. Genet. 117: 883–894.
- Qin, H.D., Chen, M., Yi, X.D., Bie, S., Zhang, C., Zhang, Y.C., Lan, J.Y., Meng, Y.Y., Yuan, Y.L. and Jiao, C.H. (2015) Identification of associated SSR markers for yield component and fiber quality traits based on frame map and Upland cotton collections. PLoS ONE 10: e0118073.
- Qin, Y.S., Liu, R.Z., Mei, H.X., Zhang, T.Z. and Guo, W.Z. (2009) QTL mapping for yield traits in Upland cotton (Gossypium hirsutum L.). Acta Agron. Sin. 35: 1812–1821.
- Saeed, A.I., Sharov, V., White, J., Li, J., Liang, W., Bhagabati, N., Braisted, J., Klapa, M., Currier, T., Thiagarajan, M. et al. (2003) TM4: a free, open-source system for microarray data management and analysis. Biotechniques 34: 374–378.
- Saeed, M., Guo, W.Z. and Zhang, T.Z. (2014) Association mapping for salinity tolerance in cotton (Gossypium hirsutum L.) germplasm from US and diverse regions of China. Aust. J. Crop Sci. 8: 338–346.
- Schiøtt, M., Romanowsky, S.M., Bækgaard, L., Jakobsen, M.K., Palmgren, M.G. and Harper, J.F. (2004) A plant plasma membrane Ca2+ pump is required for normal pollen tube growth and fertilization. Proc. Natl. Acad. Sci. USA 101: 9502–9507.
- Schult, K., Meierhoff, K., Paradies, S., Töller, T., Wolff, P. and Westhoff, P. (2007) The nuclear-encoded factor HCF173 is involved in the initiation of translation of the psbA mRNA in Arabidopsis thaliana. Plant Cell 19: 1329–1346.
- Su, J.J., Pang, C.Y., Wei, H.L., Li, L.B., Liang, B., Wang, C.X., Song, M.Z., Wang, H.T., Zhao, S.Q., Jia, X.Y. et al. (2016) Identification of favorable SNP alleles and candidate genes for traits related to early maturity via GWAS in Upland cotton. BMC Genomics 17: 687.
- Sun, Z.W., Wang, X.F., Liu, Z.W., Gu, Q.S., Zhang, Y., Li, Z.K., Ke, H.F., Yang, J., Wu, J., Wu, L.Q. et al. (2017) Genome-wide association study discovered genetic variation and candidate genes of fibre quality traits in Gossypium hirsutum L. Plant Biotechnol. J. 15: 982–996.
- Trapnell, C., Pachter, L. and Salzberg, S.L. (2009) TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25: 1105–1111.
- Trapnell, C., Williams, B.A., Pertea, G., Mortazavi, A., Kwan, G., Baren, M.J., Salzberg, S.L., Wold, B.J. and Pachter, L. (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28: 511–515.
- Turner, S.D. (2014) qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. bioRxiv. 10.1101/005165.
- Wang, B.H., Guo, W.Z., Zhu, X.F., Wu, Y.T., Huang, N.T. and Zhang, T.Z. (2007) QTL mapping of yield and yield components for elite hybrid derived-RILs in Upland cotton. J. Genet. Genomics 34: 35–45.
- Wang, M.J., Tu, L.L., Lin, M., Lin, Z.X., Wang, P.C., Yang, Q.Y., Ye, Z.X., Shen, C., Li, J.Y., Zhang, L. et al. (2017) Asymmetric subgenome selection and cis-regulatory divergence during cotton domestication. Nat. Genet. 49: 579–587.
- Wei, N., Chamovitz, D.A. and Deng, X.W. (1994) Arabidopsis COP9 is a component of a novel signaling complex mediating light control of development. Cell 78: 117–124.
- Wu, J.X., Osman Ariel, G., Johnien, J., Jack, M.C. and Zhu, J. (2009) Quantitative analysis and QTL mapping for agronomic and fiber traits in an RI population of Upland cotton. Euphytica 165: 231–245.
- Xia, Z., Zhang, X., Liu, Y.Y., Jia, Z.F., Zhao, H.H., Li, C.Q. and Wang, Q.L. (2014) Major gene identification and quantitative trait locus mapping for yield-related traits in Upland cotton (Gossypium hirsutum L.). J. Integr. Agric. 13: 299–309.
- Yang, W.N., Guo, Z.L., Huang, C.L., Duan, L.F., Chen, G.X., Jiang, N., Fang, W., Feng, H., Xie, W.B., Lian, X.M. et al. (2014) Combining high-throughput phenotyping and genome-wide association studies to reveal natural genetic variation in rice. Nat. Commun. 5: 5087.
- Yu, J.W., Zhang, K., Li, S.Y., Yu, S.X., Zhai, H.H., Wu, M., Li, X.L., Fan, S.L., Song, M.Z., Yang, D.G. et al. (2013) Mapping quantitative trait loci for lint yield and fiber quality across environments in a Gossypium hirsutum × Gossypium barbadense backcross inbred line population. Theor. Appl. Genet. 126: 275–287.
- Yuan, Y.C., Wang, X.L., Wang, L.J., Xing, H.X., Wang, Q.K., Saeed, M., Tao, J.C., Feng, W., Zhang, G.H., Song, X.L. et al. (2018) Genome-wide association study identifies candidate genes related to seed oil composition and protein content in Gossypium hirsutum L. Front. Plant Sci. 9: 1359.
- Zhang, J., Chen, X., Zhang, K., Liu, D.J., Wei, X.Q. and Zhang, Z.S. (2010) QTL mapping of yield traits with composite cross population in Upland cotton (Gossypium hirsutum L.). J. Agric. Biotechnol. 18: 476–481.
- Zhang, J.Z., Wei, B.Y., Yuan, R.R., Wang, J.H., Ding, M.X., Chen, Z.Y., Yu, H. and Qin, G.J. (2017) The Arabidopsis RING-type E3 ligase TEAR1 controls leaf development by targeting the TIE1 transcriptional repressor for degradation. Plant Cell 29: 243–259.
- Zhang, T.Z., Qian, N., Zhu, X.F., Chen, H., Wang, S., Mei, H.X. and Zhang, Y.M. (2013) Variations and transmission of QTL alleles for yield and fiber qualities in Upland cotton cultivars developed in China. PLoS ONE 8: e57220.
- Zhang, T.Z., Hu, Y., Jiang, W.K., Fang, L., Guan, X.Y., Chen, J.D., Zhang, J.B., Saski, C.A., Scheffler, B.E., Stelly, D.M. et al. (2015) Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nat. Biotechnol. 33: 531–537.
- Zhao, Y., Wang, H., Chen, W. and Li, Y.H. (2014) Genetic structure, linkage disequilibrium and association mapping of verticillium wilt resistance in elite cotton (Gossypium hirsutum L.) germplasm population. PLoS ONE 9: e86308.