Skip to main content
Plant Biotechnology Journal logoLink to Plant Biotechnology Journal
. 2021 Feb 7;19(6):1110–1124. doi: 10.1111/pbi.13532

Large‐fragment insertion activates gene GaFZ (Ga08G0121) and is associated with the fuzz and trichome reduction in cotton (Gossypium arboreum)

Xiaoyang Wang 1,3,, Yuchen Miao 2,, Yingfan Cai 2,, Gaofei Sun 4, Yinhua Jia 1, Song Song 1, Zhaoe Pan 1, Yuanming Zhang 3, Liyuan Wang 1, Guoyong Fu 1, Qiong Gao 1, Gaoxiang Ji 1, Pengpeng Wang 1, Baojun Chen 1, Zhen Peng 1, Xiaomeng Zhang 1, Xiao Wang 1, Yi Ding 1, Daowu Hu 1, Xiaoli Geng 1, Liru Wang 1, Baoyin Pang 1, Wenfang Gong 1,5,, Shoupu He 1,, Xiongming Du 1,
PMCID: PMC8196653  PMID: 33369825

Summary

Cotton seeds are typically covered by lint and fuzz fibres. Natural ‘fuzzless’ mutants are an ideal model system for identifying genes that regulate cell initiation and elongation. Here, using a genome‐wide association study (GWAS), we identified a ~ 6.2 kb insertion, larINDEL FZ , located at the end of chromosome 8, composed of a ~ 5.0 kb repetitive sequence and a ~ 1.2 kb fragment translocated from chromosome 12 in fuzzless Gossypium arboreum. The presence of larINDEL FZ was associated with a fuzzless seed and reduced trichome phenotypes in G. arboreum. This distant insertion was predicted to be an enhancer, located ~ 18 kb upstream of the dominant‐repressor GaFZ (Ga08G0121). Ectopic overexpression of GaFZ in Arabidopsis thaliana and G. hirsutum suggested that GaFZ negatively modulates fuzz and trichome development. Co‐expression and interaction analyses demonstrated that GaFZ might impact fuzz fibre/trichome development by repressing the expression of genes in the very‐long‐chain fatty acid elongation pathway. Thus, we identified a novel regulator of fibre/trichome development while providing insights into the importance of noncoding sequences in cotton.

Keywords: Gossypium arboreum, fuzzless mutant, structural variation, genome‐wide association study

Introduction

Mutations are distributed extensively in the genome, including single‐nucleotide polymorphisms (SNPs), small insertions/deletions (InDels), copy number (CNVs) and structural variations (SVs). The precision and integrity of large‐scale SVs detection are limited by technological restrictions. However, with the development of next‐generation sequencing technologies and algorithms, many InDels/CNVs/SVs have been identified and found to be associated with important traits in animals and plants (Chia et al., 2012; Mills et al., 2011; Yang et al., 2013).

Enhancers are cis‐regulatory elements (CREs) derived from SVs or the movement of transposable elements (TEs) and control gene expression in an orchestrated spatiotemporal pattern during development (Paliou and Andrey, 2018). A classic example of an enhancer in maize (Zea mays) is a noncoding segment located ~ 58 kb upstream of the teosinte branched 1 (tb1) gene, which was found to enhance tb1 transcription and affect quantitative phenotypes (Clark et al., 2006). Further study demonstrated that this segment included a Hopscotch retrotransposon inserted in a regulatory region of tb1 and acting as an enhancer to increased tb1 expression; this explains why maize exhibits more distinctive apical dominance than its progenitor teosinte (Studer et al., 2011). Similar studies have been reported since the identification of tb1. The introduction of a 366‐bp CRE in the ZmVPP1 promoter increased drought tolerance in maize (Wang et al., 2016). Additionally, a Harbinger‐like TE located ~ 57 kb upstream of ZmCCT9 negatively regulates this flowering‐associated CCT transcription factor (Huang et al., 2018). In rice, insertion of a 17.1‐kb tandem duplication segment into the OsGL7 locus led to elevated expression of GL7, resulting in improvements in grain yield and quality (Wang et al., 2015). In wheat, a terminal‐repeat retrotransposon in miniature element was found in the TaMs2 locus and demonstrated to cause male sterility (Ni et al., 2017; Xia et al., 2017). Together, these studies indicate that some complex and important agricultural traits are determined by large‐scale genomic variations rather than small‐scale nucleotide mutations.

Cotton (Gossypium) fibres are the main source of natural textile materials. These fibres are divided into two different types, viz. long lint and short fuzz, both of which are single‐celled tubular outgrowths arising from seed coat epidermal cells. The cells initiated from the epidermal layer at −3 to 0 days post‐anthesis (DPA) develop into lint fibres, whereas fuzz fibres originate from these cells at + 3 to 5 DPA (Du et al., 2013). Most studies have focused on the development of lint, with little attention given to fuzz fibre. Studies of fuzz initiation are essential for understanding the mechanism of cotton fibre development. Moreover, during the cotton seed production process, fuzz fibres are anatomically differentiated from lint fibre, integrated tightly with the seed epidermis and not easy to gin, thus disturbing the seed coating. Strong acids must be used to remove fuzz, increasing production cost and leading to severe environmental pollution. Hence, a ‘fuzzless’ seed phenotype would be beneficial and more environment‐friendly.

Fuzzless mutants are important genetic resources for discovering genes associated with fibre development. Several important G. hirsutum mutants, including Xu142fl, N1 and n2 , have been well studied and characterized (Wan et al., 2014, 2016). The dominant fuzzless gene N1 was mapped to chromosome A12 in G. hirsutum using forward genetics, and designated GhMML3_A12, which encodes a MYBMIXTA‐like transcription factor (Wan et al., 2016). Furthermore, by combining map‐based cloning and qRT‐PCR, the recessive fuzzless gene n2 was mapped to chromosome D12, and MYB25‐like_Dt was proposed as the candidate gene in G. barbadense (Zhu et al., 2018). A recent survey indicated that non‐synonymous SNPs in the coding region of MML3‐A were common in all six mutants (G. hirsutum) and responsible for both the dominant and recessive fuzzless phenotype (Naoumkina et al., 2020). A previous study showed that GaHD‐1, a recessive locus on Chr06, contains an alternative splicing mutation and may confer the lintless trait in G. arboreum mutant A2_106 (Liu et al., 2020 ). Another study demonstrated that dysfunctional GaHD1 may cause fibreless mutant sma‐4 (ha) (Ding et al., 2020). A genome‐wide association study (GWAS) identified a major quantitative trait locus responsible for the fuzzless phenotype on chromosome 8 (Du et al., 2018), and another report fine‐mapped the quantitative trait locus using a large F2 population, suggesting GaFZ (Ga08G0121) as the candidate gene (Feng et al., 2019). A recent study identified two regions associated with fuzz development, one of which contains GaFZ (Liu et al., 2020). Although GaFZ was considered as the most likely causal gene responsible for the fuzzless phenotype in G. arboreum, however, the function and causal variation of this gene remains unknown.

Cotton fibres, like Arabidopsis thaliana leaf trichomes, are elongated single‐cell structures of epidermal origin and may share similar regulatory mechanisms (Serna and Martin, 2006). Trichome development in A. thaliana is regulated by MYB/bHLH/WD40 (MBW) complexes, including R2R3 MYB, GLABROUS1 (GL1), the basic helix–loop–helix protein‐encoding gene GLABRA3 (GL3), ENHANCER OF GLABRA3 (EGL3), the WD40 protein‐encoding gene and TRANSPARENT TESTA GLABRA1(TTG1), which either promote or inhibit trichome initiation and growth (Serna and Martin, 2006). The homeodomain (HD‐Zip IV) transcription factor GLABRA2 (GL2) is the downstream target of the MBW complexes and is involved in regulating trichome morphogenesis (Khosla et al., 2014). In cotton, GaMYB2 has a similar function as AtGL1, which can rescue gl1 glabrous phenotype in trichome formation (Wang et al., 2004). GhMYB109, a homologous gene of AtGL1, is involved in the positive regulation of fibre initiation and elongation (Pu et al., 2008; Suo et al., 2003). GhMYB25‐like, belonging to the MYBMIXTA‐like (MML) transcription factor family, acts upstream of the homeodomain leucine zipper protein GhHD‐1 and positively regulates cotton seed trichome development (Walford et al., 2011, 2012). Two MYB25‐like transcription factors, GhMML4_D12 and GhMML3_A12, were isolated as known regulators of fibre development (Wan et al., 2016; Wu et al., 2018). Moreover, three cotton HD‐Zip IV factors (HOX1/2/3) have been identified, and the interaction of GhHOX3 with GhHD1 affects fibre elongation in cotton (Guan et al., 2008; Shan et al., 2014). Using a map‐based cloning strategy, HD1 genes (GaHD1, GbHD1 and GhHD1), were identified as key regulators of lint fibre, stem hair and leaf trichome initiation in the diploids G. arboreum, allotetraploids G. barbadense and G. hirsutum, respectively (Ding et al., 2020; Ding et al., 2015; Liu et al., 2020; Niu et al., 2018; Tang et al., 2019). However, whether additional regulators of trichome development exist in cotton and their association with the MBW‐GL2 system are unknown.

Very‐long‐chain fatty acid (VLCFA) synthesis‐related genes have been suggested as positive regulators of cotton fibre elongation (Lassner et al., 1996; Qin et al., 2007a). Many cotton genes involved in fatty acid chain elongation and nonspecific lipid transfer are significantly upregulated during early fibre development, indicating that this pathway may be essential for fibre‐cell elongation (Ji et al., 2003; Qin et al., 2007a; Qin et al., 2007b; Shi et al., 2006). VLCFA biosynthesis occurring in the endoplasmic reticulum is controlled by the fatty acid elongase complex which comprises four proteins, 3‐ketoacyl‐CoA synthase (KCS), 3‐ketoacyl‐CoA reductase (KCR), 3‐hydroxyacyl‐CoA dehydratase (HACD) and trans‐2‐enoyl‐CoA reductase (ECR) (Fehling and Mukherjee, 1991). KCS is the first rate‐limiting enzyme determining VLCFA production in higher plants (Lassner et al., 1996). In Arabidopsis thaliana lacking KCS, abnormal plant morphology and a glabrous phenotype were observed (Pruitt et al., 2000; Reina‐Pinto et al., 2009; Yephremov et al., 1999), suggesting that the VLCFA biosynthesis pathway is involved in trichome development. Therefore, identifying regulators affecting VLCFA and elongation pathway genes may be important for understanding cotton fibre and trichome development.

Following up on previous work (Du et al., 2018; Feng et al., 2019; Liu et al., 2020), we identified a large insertion in G. arboreum associated with the fuzzless/glabrous trait by upregulating GaFZ expression. GaFZ may function independently of the MBW‐GL2 system by directly repressing the fatty acid elongation pathway. Our results provide insight into the mechanisms of fibre/trichome development in plants.

Results

A large noncoding insertion is responsible for the fuzzless and the reduced leaf trichome number in G. arboreum acc. GA0149

Previous studies of G. arboreum suggested that seed fuzz was controlled by a single dominant locus (designated FZ) on chromosome 8, although the causal variation remained unknown (Du et al., 2018; Feng et al., 2019; Liu et al., 2020). We found that seed fuzz correlated significantly with the amount of leaf trichomes (Figure 1a; Figure 2c); therefore, based on published genotypes (Du et al., 2018), we supplemented the GWAS for the leaf trichome trait using non‐SNP‐based markers (InDels of any size, copy number duplications and deletions). Large‐insertion markers (larINDEL FZ ) for the seed fuzz and leaf trichome traits co‐localized to a common region (FZ) on chromosome 8, where the signal intensity for larINDEL FZ (‐log P = 33.60) was much stronger than that of the highest SNP FZ (‐log P = 18.95) (Figure 1a‐b). Except for SNP FZ and larINDEL FZ , we found no association for all other variations (Figure S1‐S3). Four linkage disequilibrium (LD) blocks were identified in FZ, whereas SNP FZ , which is not localized to any LD blocks, was located in the second intron of Casparian‐strip membrane protein (Ga08G0117) (Figure 1c‐d). Investigation of the association between genotypes of SNP FZ and larINDEL FZ and seed fuzz revealed that the presence of larINDEL FZ was highly correlated with the fuzzless phenotype (Figure 1e). Together, these results indicate that the causal variation behind the seed fuzz and leaf trichome traits is likely larINDEL FZ , rather than SNP FZ .

Figure 1.

Figure 1

Both the seed fuzz and leaf hair traits are associated with large insertions/deletions in chromosome 8 in G. arboreum. (a) Manhattan plots of the GWAS of seed fuzz and leaf hairs using SNP and SV (large insertion) markers. The strongest associations, designated by red arrows and regions, are highlighted by red transparent columns. The thresholds of GWAS are indicated by red dot lines. (b) Leaf hair and seed fuzz phenotypes between fuzzy and fuzzless accessions. Scale bar = 2cm (top) and 1 cm (bottom). (c) Local Manhattan plot of the GWAS on chromosome 8. The positions of SNP FZ and larINDEL FZ are marked. (d) Location of the gene models and linkage disequilibrium (LD) status in the local Manhattan plot region. Four putative LD blocks were identified, and the corresponding genomic regions are represented by grey bars. The levels of pairwise correlation between SNPs are represented by different colours. (e) Association between the seed fuzz traits and genotypes of SNP FZ and larINDEL FZ in the GWAS population determined by resequencing mapping reads.

Figure 2.

Figure 2

Genomic structure of larINDEL FZ and its association with seed fuzz and trichome phenotypes (a) Schematic diagram of the larINDEL FZ region with continuous contigs designed by overlap PCR in chromosome 8. Genomic structure of the fuzzy accession (acc. GA0146, left) and position of the insertion fragment in the fuzzless accession (acc. GA0149, right). All PCR products are shown as red lines with primer pair numbers (middle). The gel indicates the product amplified by Primer 6 (6F/6R). (b) Schematic diagram of genomic components for both fuzzy (acc. GA0146) and fuzzless (acc. GA0149) accessions nearby larINDEL FZ . The five types of repeat elements are shown by arrowed boxes in different colours (Figure S5). The repeat element combinations are circled by red solid lines and categorized as A, B and C (red letters). The breakpoints (BPs) duplicated translocation from Chr.12 and duplication from Chr.8 are marked by red vertical dotted lines. Read depth is shown by grey columns; the horizontal black dotted lines indicate the average sequencing depth for these two accessions (~25×); and the region with an abnormally high read depth is circled by a black dotted line. The location of Primer 6F/6R on the two accessions is marked by blue arrows. (c) Relationships among seed fuzz, leaf hair and larINDEL FZ diversity for each individual in the F2 population and GWAS population. The number of leaf hairs was determined by microscopy with a fixed field (n = 9, three fields were investigated for each individual, three individuals were randomly selected). larINDEL FZ (+/+), larINDEL FZ (+/‐) and larINDEL FZ (‐/‐) indicate the insertion was homozygous present, heterozygous present and absent in individuals, respectively. The upper, median and lower quartiles for each group are denoted by three vertical lines. Significances of difference between groups were derived with an unpaired two‐tailed Student’s t‐test.

Bioinformatic tools can estimate only the type and approximate location of SVs. larINDEL FZ was amplified from representative fuzzy and fuzzless accessions (GA0146 and GA0149, respectively) using 19 overlapping primer sets, and the whole sequence across this region (Chr08: 880–903 kb) was assembled (Figure 2a and Table S1). Only the fragment amplified by Primer‐6F6R, with a 6226‐bp larINDEL FZ , showed a difference between accessions. Sequencing and alignment of the PCR products from fuzzy and fuzzless accessions identified Primer‐6F6R and breakpoint‐1 (Figure 2a, Figure S4 and S5).

To determine the genomic structure of larINDEL FZ , we first constructed the fuzzless sequence (Figure S6a, right panel) by integrating the insertion (larINDEL FZ ) into the fuzzy sequence (Figure S6a, left panel). Collinearity analysis of this region showed that five major repeat elements exist in both sequences, which were categorized into three combinations: A, B and C (Figure S6a). The fuzzless accession contained all three combinations, whereas the fuzzy accession had only B and C. In the fuzzless accession, combination A sequences included three repeat elements because of the duplication of combination C upstream of GaFZ (Ga08G0121). Another group A element (deep blue) was a translocated duplication from chromosome 12 (Figure 2b). Combination B may be a duplication relic of C, which was interrupted by the insertion of two inverted (orange, Figure 2b and Figure S6a) and two very small (dark green, Figure 2b and Figure S6a) repeat elements. The different orientations and coverage depth of Illumina read pairs in combination C, between the fuzzless and the fuzzy accessions, support this duplication event (Figure 2b), for which a total of three possible breakpoints (BPs) were identified. The position of BP‐1 (Chr08: 887 151 bp), located at the left boundary where larINDEL FZ inserted, was further identified by aligning ten sequenced clones (Figure S5). BP‐2 (Chr12: 100 898 047 bp) and BP‐3 (Chr12: 100 899 234 bp) were located on chromosome 12 (Figure 2b, Figure S6b‐e). Therefore, the ~ 6.2‐kb larINDEL FZ insertion identified in the fuzzless accession is composed of a homolog repeat combination (A, ~5.0 kb) and a ~ 1.2‐kb homologous fragment derived from chromosome 12. The sequence similarity compared with its original copy (combination C) emphasized this duplicate insertion as a recent event (Figure 2b, Figure S7). According to sequence identity analysis of repeat elements near locus FZ, multiple duplication events likely occurred in this region and continuously accumulated SNPs, but only the most recent duplication caused the fuzzless phenotype (Figure S7). Characterization of the genomic structure of locus FZ may improve the understanding of both the evolutionary history and biological function of this region.

Linkage/association between larINDEL FZ and traits of interest

Previous work demonstrated that the fuzzless trait is controlled by a single dominant gene in GA0149 (Du et al., 2018; Feng et al., 2019). To assess the linkage/association between larINDEL FZ and traits of interest, an F2 population consisting of 1212 individuals was developed by crossing the fuzzless mutant with fewer leaf trichomes (GA0149, female parent) to wild type (GA0146, male parent). The number of leaf trichomes per individual was quantified from the F2 and GWAS populations. The fuzzless accessions/individuals had significantly fewer leaf trichomes and less seed fuzz in the F2 (Figure 2c). Primers 6F/6R and 6F/6R' were used to distinguish heterozygous (larINDEL FZ (+/−)), homozygous fuzzless (larINDEL FZ (−/−)) and homozygous fuzzy (larINDEL FZ (+/+)) plants in the F2 population (Figure 2c, Figure S8a and Table S2). As expected, a large number of heterozygous insertions (n = 624) was detected in fuzzless individuals, which were consistent with a 1:2:1 segregation ratio (homozygous fuzzy: heterozygous fuzzless: homozygous fuzzless; Chi‐square test, P = 0.192). Primer 6F/6R could easily distinguish between homozygous fuzzy and fuzzless accessions in the GWAS population, as they were predominant (Figure 2c and Figure S8b). Overall, our results indicate that the presence or absence of the 6226‐bp larINDEL FZ insertion on chromosome 8 is strongly associated with leaf trichome and seed fuzz phenotypes in G. arboreum.

larINDEL FZ was the key regulator enhancing GaFZ expression

We investigated the expression patterns of eight genes in FZ during different developmental stages in fibres, roots, hypocotyls and leaves. Consistent with previous reports (Feng et al., 2019), only GaFZ exhibited specific expression patterns. GaFZ was highly expressed during the fuzz critical initiation stages (from + 3 to + 5 DPA) in the fuzzless compared with the fuzzy accession (Figure 3a, Figure S9). To verify the relationship between GaFZ expression and larINDEL FZ , we randomly selected six accessions each of the fuzzy and fuzzless accessions from the GWAS population to analyse the sequence diversity of the genic and promoter regions. Five variations were detected in GaFZ, including InDel‐2 (3 bp) and SNP‐2 (C/G), both of which were located in the genic region of GaFZ (Figure 3b). InDel‐1 (24 bp) and SNP‐1 (A/T) were found in the GaFZ promoter (Figure 3b). Haplotype data from this region showed that larINDEL FZ was present in all fuzzless accessions (Figure 3b). Further, the GaFZ expression was higher during developmental stages of 0 to + 5DPA (Figure 3c) in the fuzzless accessions. Several additional SNPs were detected in the regions flanking GaFZ but did not appear to be relevant to the fuzzless trait (Figure S10).

Figure 3.

Figure 3

larINDEL FZ activates GaFZ gene expression. (a) Expression profiles of eight genes in the FZ locus region during critical stages of fuzz development. DPA, day post‐anthesis. (b, c) Schematic diagram of all variation polymorphisms in the FZ region and their association with fuzz phenotypes and downstream gene (GaFZ) expression at three stages in the fuzzy and fuzzless accessions. The genotypes of larINDEL FZ in these accessions are marked by red boxes. The genotypes of InDels were represented by symbols ‘+’ or ‘‐’, and the genotypes of SNPs are shown by bases. (d) GUS promoter activity of GaFZ in fuzzy (GA0146) and fuzzless (GA0149) accessions. Each experiment was performed with three independent biological replicates. Data are shown as the mean ± SD. The P values were determined by Student's t‐test. (e) Schematic representation of the reporter vector used to determine the effect of the insert fragment on gene expression in the cotton leaf protoplast transient expression system. This vector contained the minimal CaMV 35S (35S) promoter (minimal promoter of the cauliflower mosaic virus), Renilla luciferase (REN) controlled by 35S, firefly luciferase (LUC) and nopaline synthase terminator (Ter). The red arrow indicates the multiple cloning site (MCS). (f) Dual‐luciferase transient expression assay in cotton protoplasts. The fragments with various lengths and orientations (left) were cloned into the reporter vector, and the corresponding effects on LUC were tested (right). Each experiment was performed with four independent biological replicates. The relative fluorescence value of the empty vector was set as a control. Data are shown as the mean ± SD. Comparison was performed between the vector with greatest (6F6R‐2) and second‐greatest (LUC‐2) average of LUC expression. The P value was determined by Dunnett's multiple comparison test.

To further confirm whether the promoter‐localized polymorphism was involved in regulating GaFZ expression, we cloned a 2000‐bp promoter sequence of GaFZ from accessions GA0146 and GA0149. Vectors pGaFZ GA0149::GUS and pGaFZ GA0146::GUS were constructed for transient expression in tobacco (Nicotiana benthamiana) leaves. The expression level showed no significant difference (P = 0.9746) between pGaFZ GA0149::GUS and pGaFZ GA0146::GUS, but both were significantly lower than that of pCaMV35S::GUS (P = 0.0010) (Figure 3d). These findings suggest that larINDEL FZ is the genomic variation responsible for upregulation of GaFZ.

To test the effect of larINDEL FZ on GaFZ expression, we cloned various lengths of larINDEL FZ into a reporter vector carrying the minimal promoter of the cauliflower mosaic virus, firefly luciferase (LUC) open reading frame, and nopaline synthase terminator (Ter) (Figure 3e). The effects of these constructs on LUC expression were compared in cotton protoplasts. Only the construct carrying the complete fragment (amplified with primer 6F6R) with a specific orientation significantly activated luciferase (Figure 3f). Therefore, enhancement of GaFZ expression by larINDEL FZ may be activated only when the inserted fragment is linked to the flanking sequence in a specific orientation.

To evaluate the impact of larINDEL FZ on gene expression within 1 Mb of the flanking FZ region (Chr. 8: 405 400‐1 405 400 bp), we compared the expression of 117 genes. GaFZ was the only gene significantly up‐regulated during fuzz‐initiation stages (0 to + 5DPA) in the fuzzless accession (Figure S11). Moreover, PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) predictions indicated that larINDEL FZ contained an abundance of enhancer cis‐acting and core promoter elements (Figure S12; Table S3). These results suggest that larINDEL FZ might act as an enhancer, specifically by upregulating GaFZ expression during the fuzz‐initiation stages.

GaFZ suppresses trichome and fuzz fibre development

To confirm the subcellular location of GaFZ, constructs 35S::GaFZ‐GFP and 35S::GaFZ‐RFP were prepared and transformed into A. thaliana and tobacco, respectively. The fluorescent signal was observed predominantly in the membrane and nucleus, with some fluorescence detected in the cytoplasm (Figure S13a–c), indicating that GaFZ acts as a receptor at the membrane and regulatory factor in the nucleus. Phylogenetic analysis of the conserved GaFZ domain from various species indicated that homologous proteins can be classified into three groups (Figure S14). GaFZ clustered with group 1, with most of Gossypium homologs, and was most closely related to GhA08G010900 (Figure S14). The A. thaliana homologs of GaFZ (AtGIR1 and AtGIR2), AtGIR1 and AtGIR2 are known to act as negatively regulate of root hair development (Wu and Citovsky, 2017a, 2017b), clustered in group 3. Together, these data suggest that the functions of these homologs differentiated between Gossypium and Arabidopsis.

To investigate the effects of GaFZ on trichome and fibre development, we transformed GaFZ into A. thaliana and Upland cotton (G. hirsutum) (Figure 4 and Figure S15‐S18). Compared to wild‐type A. thaliana, 35S::GaFZ transgenic lines exhibited abnormal leaves and significantly reduced (~fivefold) leaf trichome numbers (Figure S16a–e). To confirm whether these phenotypes resulted from overexpression of GaFZ, we examined the expression levels of GaFZ in the T3 generation and found that GaFZ was dramatically up‐regulated (>10 000‐fold) in the overexpression lines compared with that in the wild type (Figure S16f). For Upland cotton overexpressing GaFZ, compared with WT (Figure 4a–c), transgenic lines exhibited a glabrous phenotype in young cauline leaves, stems and veins, which were clearly distinct from the wild type (Figure 4d–f, h, Figure S17). Three positive lines (OE‐L11, OE‐L14 and OE‐L25) showed high GaFZ expression, whereas the negative control lines (OE‐L18) showed very low expression levels similar to wild type, as determined by qRT‐PCR analysis (Figure 4i–j). However, no leaf abnormalities were observed (Figure S18). Lint fibre length in OE‐L14 plants appeared to be slightly shorter than that of the wild type and controls (OE‐L18 and OE‐L29), but was not evident in the other three lines (OE‐L11, OE‐L14 and OE‐L25) (Figure S19). The amount of fuzz fibre was reduced in the transgenic cotton lines (Figure 4g, Figure S19), but root hairs showed no significant difference between the wild‐type and overexpression lines in both A. thaliana and cotton (Figure S20). Our results confirm that the function of GaFZ may differ from that of AtGIR, as GaFZ suppressed trichome and fuzz fibre development but not that of root hairs, in A. thaliana and cotton.

Figure 4.

Figure 4

GaFZ repressed cotton trichome and fuzz development. (a‐f) Trichome numbers and fibre phenotype in WT (top) and transgenic lines (below). Young leaves (a, d), stems (b, e), leaf veins (c, f) and the fuzz fibre phenotype (g). Bars = 1 cm in a, d and g and 0.5 cm in b, c, e and f. (h) Statistical analysis of trichome numbers of transgenic and control cotton plants (n = 3). Data are shown as the mean ± SD, and the P values were determined by Student’s t‐test. (i) Electrophoretogram of PCR verification. M, DNA marker, DL2000; lanes 1‐8, PCR products from the genomic DNA of transgenic plant leaves; lane 7, PCR product by deionized water; lane 8, PCR product by the recombinant plasmid of GaFZ, the primers were 35S‐F/OEGaFZ‐Sac1. (j) Quantitative real‐time PCR analysis of different GaFZ‐overexpressing transgenic lines (n = 3). WT, wild type. Data are shown as the mean ± SD.

GaFZ might repress a key step in VLCFA elongation

To investigate whether GaFZ is involved in the MBW‐GL2 system or other reported genes regulating fuzz/trichome development, we analysed global gene co‐expression profiles between fuzzy and fuzzless cotton accessions, as well as in A. thaliana overexpressing GaFZ. A total of 1343 and 367 differentially expressed genes (DEGs) were identified in A. thaliana lines and G. arboreum at fuzz critical development stages (+3 to + 5 DPA), respectively (Figure S21). In both species, these genes were mainly involved in secondary metabolisms, such as fatty acid elongation (ko00062) and cutin, suberin and wax biosynthesis (ko00073) (Figure 5, Figure S22 and Table S4). The expression of cotton homologs GaMYB109 and GaWER (homolog of AtGL1), GaHOX1 (homolog of AtGL2)/GaHOX2/GaHOX3, GaDEL65 (AtGL3) and GaWD40 (AtTTG1) in the MBW system was not significantly different between the two cotton accessions, and only a slight decrease in AtGL2 transcript levels was observed in GaFZ‐overexpressing A. thaliana (Table S5–S6). Yeast two‐hybrid assay results also indicated that no direct interaction occurred between GaFZ and the MBW‐GL2 system (Figure S23a). At the transcriptional level, a transient dual‐luciferase and GUS activity assay in tobacco demonstrated that the LUC/REN level and GUS activity of pGaHOX1 + GaFZ was unchanged compared with that of pGaHOX1 (Figure S23b–c). For the expression level of other reported trichome/fuzz‐regulating genes, different expression was found in GaHD1, but no difference was detected in GaMML3, between the fuzzy and fuzzless accession at the critical stage of fuzz development (Figure S24). Additionally, most DEGs were involved in the fatty acid elongation pathway in fuzzless cotton and A. thaliana overexpressing GaFZ (Figure 5). Several key DEGs in G. arboreum, including β‐ketoacyl‐CoA synthase 19 (GaKCS19), β‐ketoacyl‐CoA synthase 12 (GaKCS12) and β‐ketoacyl‐CoA synthase 2 (GaKCS2), showed significantly reduced expression patterns at the fuzz fibre formation stage (+3 to + 5DPA) in fuzzless cotton (Figure 5; Table S7). GaFZ‐overexpressing A. thaliana plants also exhibited decreased trichome levels and reduced expression that was five‐ to sixfold lower than in non‐transformed plants which involved the β‐ketoacyl‐CoA synthase (AtKCS) genes—AtKCS1, AtKCS3, AtKCS6, AtKCS10 and AtKCS12 (Figure S25a; Table S8). Genes encoding a subset of wax biosynthetic enzymes, such as cytochrome P450 (AtCYP86A2), ECERIFERUM1 (CER1) and midchain alkane hydroxylase (AtMAH1), were down‐regulated in OE plants (Figure S25a; Table S8), as were genes encoding putative wax ABC transporters (GDSL lipase and AtCER5) (Figure S25a; Table S9). A toluidine blue (TB) test confirmed that wax biosynthesis and transport were inhibited in OE plants (Figure S25b). We predicted that GaFZ might have down‐regulated the first step of fatty acid elongation (KCS genes) and further repressed downstream pathways in fatty acid elongation to impact fuzz or trichome development.

Figure 5.

Figure 5

Proposed model showing the mechanism of GaFZ in regulating cotton fuzz/leaf hair development. Heat map indicates gene transcript profiles with FPKM value in different tissues and developmental stages between fuzzy (GA0146) and fuzzless accessions (GA0149), each horizonal block represents one gene. The simplified regulation model of fatty acid elongation and wax, cutin and suberin biosynthesis was modified from Li‐Beisson et al. (2013). Simply, GaFZ was activated by an inserted enhancer located at ~ 18 kb upstream, indicating that it inhibits the expression of genes involved with the fatty acid elongation pathway, further repressed the development of cotton fuzz and trichome through wax, cutin and subering biosynthesis pathway in G. arboreum fuzzless. KCR, ketoacyl‐CoA reductase; HCD, β‐hydroxyacyl‐CoA dehydratase; ECR, enoyl‐CoA reductase.

Discussion

Insertion larINDEL FZ is a major regulator responsible for the fuzzless seed phenotype in G. arboreum

Lack of fuzz (fuzzless) is a common mutant seed phenotype in cotton. In commercial cotton seed production, natural naked seeds do not require treatment with strong acids during the seed‐coating process, thus reducing environmental pollution. However, commercial fuzzless varieties are rare, as the fuzzless phenotype is associated with poor yield (lint fibre). In G. arboreum, nearly a quarter (57 of 215 accessions) of investigated germplasms were found to be fuzzless (Du et al., 2018), though little is known about the fuzzless regulatory mechanism in these germplasms. As a diploid, G. arboreum is an ideal model for studying the genetic basis of fuzzless or hairless mutations in cotton. Recently, using varied technologies, the major locus for fuzzless in G. arboreum was confirmed on chromosome 8 (Du et al., 2018; Feng et al., 2019; Liu et al., 2020). In contrast to previous studies (Feng et al., 2019; Liu et al., 2020), we found that GaFZ function may differ from that of its A. thaliana homologs GIR1/GIR2. In A. thaliana, GIR1/GIR2 represses root hair development by regulating GL2 (Wu and Citovsky, 2017a, 2017b). However, there was no difference in root hair density between the wild‐type and overexpressed cotton (Figure S20).

Moreover, Feng et al. (2019) and Liu et al. (2020) suggested that SNPs or small InDels are responsible for regulating GaFZ; however, we confirmed that variations such as those in GaFZ were not correlated with its expression (Figure 3b‐c, Figure S10). We performed follow‐up work analysis to determine the genetic basis of the fuzzless/glabrous trait in G. arboreum. However, in contrast to the results obtained by traditional GWAS, the strongest SNP signal (SNP FZ ) identified was not responsible for the fuzzless phenotype (Figure 1e) or expression of GaFZ (Figure 3a). The associated SNP cluster identified may be regarded as tags, and thus other variations nearby may be important. Therefore, we compared raw read mapping data at this region between fuzzy and fuzzless accessions and successfully identified a large‐insertion (larINDEL FZ ) upstream of GaFZ in the fuzzless mutant. This SV was also confirmed in non‐SNP GWAS.

In humans, large‐scale genomic changes have been considered as major causes of disease (Stankiewicz and Lupski, 2010). Similarly, in‐depth studies of plant genomes revealed that many important agricultural traits are also controlled by genomic SVs. CNVs often increase the transcript levels of associated genes, such as the genetic basis of hybrid incompatibility between japonica and indica and grain size diversity in rice (Shen et al., 2018; Wang et al., 2015). SVs can act as enhancers and disrupters by inserting or deleting sequences in the host genome, which is often induced by TE activity (Duan et al., 2017; Studer et al., 2011). The G. arboreum genome harbours > 85% repeat sequences (Du et al., 2018), providing an abundance of resources for yielding regulators under natural selection or domestication (Wang et al., 2017). Here, we found evidence that distant structural variations in certain genes can effectively impact important phenotypes in cotton, and highlighted the importance of noncoding regulators in crops. Enhancers are CREs, which control the spatiotemporal expression of some functional genes (Paliou and Andrey, 2018), and are typically derived from the insertion of TEs or genomic SVs; including deletions, inversions, duplications and translocations. We identified a large insertion with a translocation, larINDEL FZ , which acts as a distant directional enhancer (~18 kb) for GaFZ, ultimately regulating the fuzzless trait in cotton.

Fuzz development is controlled by multiple loci in cotton

Complex metabolic networks coordinately regulate cotton fibre and trichome initiation. In our study, the fuzzless phenotype in most accessions was associated with larINDEL FZ (Figure 1). Rong et al. (2005) and Liu et al. (2020) also reported several G. arboreum fuzzless accessions with a non‐dominant inheritance pattern, which may explain these fuzzless accessions without larINDEL FZ . Perhaps because of the low allelic frequency in the population (minor allele frequency < 0.05), other regulators responsible for fuzzless phenotype were not identified in the previous GWAS project (Du et al., 2018). To date, loci influencing seed fuzz in the G. hirsutum germplasm population have not been identified at the genome‐wide level. However, both dominant and recessive inheritance patterns have been reported to regulate seed fuzz (Rong et al., 2005). Additionally, Zhu et al. (2018) identified five loci associated with the recessive fuzzless trait, implying the presence of a complicated regulating network in G. barbadense. Therefore, larger population size should be used to comprehensively screen loci resulting in fuzz reduction in G. arboreum. More surveys, such as investigation of hybrid phenotype between extensive fuzzless mutants and fuzzy accession, should be conducted to comprehensively explore other genetic factors responsible for the fuzzless trait in G. arboreum.

The function of the homologous gene (GaFZ) or sequence variations of the locus FZ region is unknown in the other two cultivated tetraploid cottons (G. hirsutum and G. barbadense). Deficient expression of GhMML3 (GhMYB25‐like) was suggested to be associated with the dominant fuzzless phenotype in G. hirsutum mutant N1 (Wan et al., 2016). However, we found that the expression level of GaMML3 did not differ between the fuzzy and dominant fuzzless accession in G. arboreum (Figure S24). This inconsistency suggests that the dominant fuzzless phenotype was controlled by different loci in G. hirsutum and G arboreum. Recently, a study of a new naked‐tufted mutant (n4 t ) was the first to mention a relationship between the fuzzless phenotype and GaFZ homolog in G. hirsutum (Naoumkina et al., 2020). We demonstrated that overexpression of GaFZ in G. hirsutum led to a significant reduction in the fuzz and trichome (Figure 4), indicating that GaFZ (or its homologs in other cotton species) might regulate similar downstream pathway influencing fuzz/trichome development in both G. arboreum and G. hirsutum. However, larINDEL FZ is most likely a mutation that specifically exists in the G. arboreum genome, which may have resulted the activity of noncoding sequences. Studies are needed detect whether the large‐scale sequence variations also present in the homologous region of FZ locus in both G. hirsutum and G. barbadense genomes, as well as their impact on the expression of GaFZ homologs and fuzz or trichome phenotypes in the future study. Further analysis of the pattern of sequence variations of this active noncoding region among cotton species is an ideal model for understanding the genome evolution of cotton.

GaFZ may repress key VLCFA genes (KCSs) and further regulate fuzz/trichome

According to Qin et al., 2007, long‐fibre cotton contains almost twice the amount of KCS transcripts than short‐fibre cotton at the fast‐growing + 10 DPA stage. Our findings indicate that GaFZ may be associated with fatty acid elongation and cutin, suberin and wax biosynthesis (Figure 4). Transcription of all 11 KCS genes was reduced in GaFZ‐OE A. thaliana, accompanied by a distinct decrease in trichome numbers (Table S9). Particularly, KCS2 was 15‐fold lower in GaFZ‐OE compared with that in wild‐type A. thaliana. Genes encoding two major components of the fatty acid elongation complex, KCR (YBR159) and 3‐hydroxyacyl‐CoA dehydratase (PAS2) (Qin et al., 2007b; Roudier et al., 2010), were also transcriptionally down‐regulated in GaFZ‐OE A. thaliana (Table S9). These findings suggest that GaFZ might regulate KCSs.

Plant waxes are typically composed of primary alcohols, wax esters, aldehydes and alkanes, derived from VLCFAs, with chain lengths ranging from C20:0 to C34:0 (Yeats and Rose, 2013). Epidermal cells and cotton fibres require wax and cutin for elongation (Degani et al., 2002; Suh et al., 2005). Two genes, CER1 and CER3, are involved in alkane biosynthesis and highly expressed in trichomes (Bernard et al., 2012; Chen et al., 2003). The levels of enzymes CER1 and MAH1, which are involved in the reactions associated with cutin, suberin and wax biosynthesis in the decarbonylation pathway, and WSD1, which is involved in the acyl reduction pathway (Greer et al., 2007; Li et al., 2008; Rowland et al., 2007), were reduced in GaFZ‐OE plants. Additionally, CER5, encoding a plasma membrane‐localized ABC transporter that exports cutin monomers and waxes (Luo et al., 2007; Panikashvili et al., 2007; Ukitsu et al., 2007) were regulated similarly by GaFZ. GDSL lipases were recently suggested to be plant cutinases (Takahashi et al., 2010), and the levels of GDSL lipase genes were decreased by 23‐fold in GaFZ‐OE plants. A TB test revealed dark blue staining in GaFZ‐OE plants, which is consistent with observations of cuticle‐defective 35S:MYB106‐SRDX plants (Oshima et al., 2013). Therefore, GaFZ may also act as a significant negative regulator of the biosynthesis and export of cuticular cutin, suberin and wax, which form the outermost layer of cotton fibre.

Based on the genomic variations and gene functional analysis, we proposed a model for fuzz development in G. arboreum (Figure 5). Our results indicated that GaFZ is a dominant inhibitory regulator of fuzz initiation, activated by a ~ 18‐kb large‐insertion (larINDEL FZ ) upstream of the gene, and acts predominantly by repressing genes involved in fatty acid biosynthesis (Figure 5). Identification of the regulatory mechanism behind GaFZ provides insight into towards the noncoding sequence function, as well as provides opportunities for the development of environmentally‐friendly (by avoiding usage of strong acids to remove fuzz during seed production process) cotton varieties by combining favourable alleles related to fibre yield and fibre quality.

Experimental procedures

Plant materials, growth conditions and phenotyping

A GWAS population containing 215 accessions was used (Du et al., 2018). The two representatives, G. arboreum cultivars, fuzzy and fuzzless‐type (accessions GA0146 and GA0149, respectively), were described previously (Wang et al., 2008). The F2 population (1218 individuals) was developed by crossing GA0146 and GA0149. For GWAS and segregating populations, plants were grown in the experimental field in Anyang, China. The seed fuzz trait was easily identified after ginning. The fourth youngest leaf (the fourth leaf closest to the shoot apical meristem) was collected from field or greenhouse plants, and leaf trichomes on the main veins were counted using an inverted microscope. Three fields of view were counted on each of three leaves for each sample.

Transgenic cotton lines were grown in a controlled environment (28°C day/25°C night, 75% relative humidity, and 16‐h photoperiod with a light intensity of ~ 100 molm2/s). Arabidopsis thaliana and N. benthamiana seeds were sterilized with 3% NaClO for 10 min and 70% ethanol for 1 min, and then washed with sterile ddH2O five times. Seeds were grown on MS media with 0.8% agar and 3% sucrose, vernalized at 4°C for 2 days and transferred to a growth chamber (21°C day/18°C night, 60–80% relative humidity and 16‐h photoperiod with a light intensity of ~ 100 molm2/s) (Liu et al., 2017; Murashige and Skoog, 1962). Seven‐day‐old wild‐type and transgenic A. thaliana seedlings grown on vertical on solid MS media were used for root hair density determination (Wu and Citovsky, 2017a, 2017b).

GWAS

SNP‐based GWAS was performed as described previously (Du et al., 2018). Non‐SNP markers were first transformed into a nucleotide format suitable for EMMAX (ver. emmax‐beta‐07Mar2010) (Kang et al., 2010). SNPs and InDels were called using GATK (ver. 4.0.12.0) (Du et al., 2018). SVs and CNVs from the whole population (215 accessions of G. arboreum and 1218 F2 individuals) were called using BreakDancer (v1.1) and CNVnator (v0.3), respectively (Abyzov et al., 2011; Chen et al., 2009). The existence of an InDel marker in an accession was marked as ‘T’, and ‘C’ if not present. For SV markers, chromosomes were divided into 100‐bp fragments as windows, and each window was regarded as a single marker. Windows containing SVs were marked as ‘T’, and those without were marked as ‘C’. The transformation of CNVs markers was similar to SVs. Genome‐wide significance thresholds of all tested traits were evaluated with the formula P = 0.05/n (where n is the number of variations).

Structural variation validation and visualization

For PCR validation, primers flanking the breakpoint of the interposition were designed using Primer 5.0 (Table S1). SV6 was amplified from GA0146 and GA0149 genomic DNA using Advantage 2 PCR Enzyme (Clontech, Mountain View, CA, USA) with an extension time of 4.5 min. PCR products were cloned into T‐vectors and sequenced. The GenBank number of GA0149‐SV6 sequence is MT584306. We distinguish homozygous fuzzy and fuzzless accessions using the primer sets of 6F/6R and 6F/6R’ and amplified the inserted sequence in GA0146, GA0149, hybrid F1 and F2 populations using Phanta Max Super‐Fidelity DNA Polymerase (Vazyme, Nanjing, China) with an extension time of 1.5 min.

The variations in the FZ region (Figure S3, Figure S6b‐e, Figure S10) were also exhibited by Integrative Genomics Viewer (IGV, ver. 2.3.97) (Robinson et al., 2011). The introduction of viewing alignment data in Integrative Genomics Viewer is found at http://software.broadinstitute.org/software/igv/AlignmentData.

RNA sequencing and expression analyses

For RNA extraction, the root and hypocotyls of GA0149 and GA0146 were collected from two‐day‐old seedlings, and fresh leaves were collected from three‐week‐old greenhouse‐grown seedlings. Fibre and ovules (at −2 day post‐anthesis −2, 0, +3, +5, +8, +13, +18, +23DPA), flowers, bracts and stamens were harvested from the field. Samples were immediately frozen in liquid nitrogen and stored at −80°C. Total RNA was extracted using the RNAprep Pure Plant Kit (polysaccharides and polyphenolic‐rich) (TIANGEN, Beijing, China) followed by DNase I (TIANGEN) treatment, according to the manufacturer’s instructions.

Simultaneously, A. thaliana RNA of wild type and 35S::GaFZ at the vegetative stage (samples were named as WT‐1 and OE‐1) and reproductive growth stage (samples were named as WT‐2 and OE‐2) was extracted (three replications each) using TRIzol (Invitrogen, Carlsbad, CA, USA), respectively. A total of 15 μg RNA was used to construct Strand‐specific Illumina RNA‐Seq libraries (Zhong et al., 2011), and RNA sequencing was performed on an Illumina HiSeq 2000 platform (Biomarker technologies, Beijing). After removing the adaptor sequences and low‐quality reads, the remaining reads were mapped to the G. arboreum reference genome (Du et al., 2018) and A. thaliana (https://www.arabidopsis.org/) reference genomes using TopHat (ver. 2.0.13) (Kim et al., 2013). Only reads with a perfect match or one mismatch were further analysed and annotated. The raw transcriptome sequencing data are available in the NCBI Sequence Read Archive (SRA) under the BioProject accession ID PRJNA622455.

Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway functional enrichment and Gene Ontology were performed using the KEGG Ortholog database (https://www.kegg.jp) and Gene Ontology database (http://www.geneontology.org), respectively. Nr (non‐redundant protein sequence database, https://blast.ncbi.nlm.nih.gov/Blast.cgi) and Swiss‐Prot (https://www.uniprot.org/blast/) were used for gene annotation. The Clusters of Orthologous Groups database was used for functional classification (Ashburner et al., 2000; Kanehisa et al., 2004).

Gene expression was calculated by FPKM using Cufflinks (Trapnell et al., 2010). Differentially expressed genes were identified as log2(fold change) ≥1.5, with a false discovery rate of ≤ 0.05.

The GaFZ expression level was validated by quantitative real‐time PCR and RT‐PCR. A total of 1 μg RNA was synthesized using a PrimeScript RT Reagent Kit with gDNA Eraser (Perfect Real Time) (Takara, Shiga, Japan) according to the manufacturer’s instructions. Primers for real‐time and RT‐PCR were designed using NCBI Primer‐BLAST (https://www.ncbi.nlm.nih.gov/tools/primer‐blast/index.cgi?LINK_LOC=BlastHome) (Table S1). Real‐time PCR was performed using the 7500 Fast Real‐Time PCR system (Applied Biosystems, Foster City, CA, USA) and SYBR Premix Ex Taq II (Tli RNaseH Plus) (Takara) in 20‐μL reactions following the manufacturer’s instructions. The PCR efficiency of all primers was firstly evaluated by LinRegPCR (Ruijter et al., 2009), which revealed an efficiency of 90.5%‐110.2%. Histone 3 (AF024716) from cotton was selected as an internal control. The relative gene expression level was calculated by the 2−ΔΔCt method (Livak and Schmittgen, 2001). Three technical replicates and independent experiments were tested.

Semi‐quantitative PCR was performed to further verify the expression levels of GaFZ. The reverse‐transcribed cDNA template was serially diluted. Histone3 (AF024716) was amplified using PCR Mix (CWBIO, Beijing, China) to adjust the initial template concentration following gel band brightness. The appropriate concentration of the template was used to amplify GaFZ from various samples.

Subcellular localization of GaFZ

The full‐length CDS of GaFZ (without stop codons) was amplified from GA0149 with specific primers (Table S1) and inserted into the transient expression vectors pBinGFP438 and pBinRFP438, generating constructs 35S::GaFZ‐GFP and 35S::GaFZ‐RFP, respectively (Liu et al., 2014). Recombined plasmids 35S::GaFZ‐GFP and 35S::GaFZ‐RFP and positive control plasmids 35S::GFP and 35S::RFP were transferred into the Agrobacterium tumefaciens strain LBA4404 by liquid nitrogen freezing and thawing. Vector 35S::GaFZ‐GFP was transferred into A. tumefaciens strain GV3101. Agrobacterium tumefaciens cells were cultured overnight at 28°C with shaking at 200 rpm, collected by centrifugation and resuspended in infiltration media (10 mM MgCl2, 10 mM MES‐KOH at pH 5.6 and 200 μM acetosyringone) to an OD600 of 0.8‐1.0 and incubated at room temperature for 2 h. Nicotiana benthamiana leaves were infiltrated with A. tumefaciens LBA4404 containing the expression and control vectors. Arabidopsis thaliana was transformed by floral dip (Clough and Bent, 1998) with A. tumefaciens GV3101 carrying 35S::GaFZ‐GFP. The fluorescent protein signal was observed using a Zeiss LSM 710 confocal microscope (Zeiss, Oberkochen, Germany) with excitation and emission wavelengths of 488 and 495‐530 nm, respectively.

CDS and promoter sequence analyses of GaFZ alleles

Following the CTAB method (Song et al., 1998 ), genomic DNA was extracted from fresh cotton leaves (acc. GA0146 and acc. GA0149), from which the full‐length of GaFZ and the upstream 2000 bp promoters of GaFZ promoters were cloned. All PCRs were run with KOD FX enzyme (Toyobo, Osaka, Japan), following standard PCR protocols. The products were cloned into the T‐19 simple vector (Takara Kit, Code No. 3271) and sequenced (General Biosystem, Durham, NC, USA). The diversity of nucleotide sequences was analysed using DNAMAN (ver. 6.0).

GaFZ promoter activity

GaFZ promoters, 2000 bp upstream of the transcription start site, were cloned from GA0146 and GA0149 genomes with primers pGaFZ −2000‐Sca I and pGaFZ −2000‐Xba I (Table S1). pGaFZ −2000‐GA0146 and pGaFZ −2000‐GA0149 were fused into the binary vector pBI121 with GUS, respectively. The recombinant plasmids were induced into A. tumefaciens strain LBA4404 and cultured in LB containing 50 μg/mL kanamycin, 30 μg/mL rifampicin and 50 μg/mL streptomycin sulphate at 28°C with shaking at 200 rpm for 12‐16 h. The Agrobacteria cells were centrifuged and resuspended in infiltration buffer (10 mM MgCl2, 10 mM MES and 200 μM acetosyringone at pH 5.7) to an OD600 of 0.8, and incubated at room temperature for 2 h. The abaxial sides of N. benthamiana leaves were infiltrated with the suspension and cultured in a dark greenhouse for 24 h, after which they were moved to a growth chamber with a 16‐h photoperiod for two days.

For GUS histochemical staining, fresh transgenic N. benthamiana tissue was cut into 3‐5 mm strips, followed immediately by incubation in staining solution (1 mM X‐Gluc, 100 mM pH 7.0 sodium phosphate, 10 mM EDTA, 0.5 mM potassium ferricyanide, 0.5 mM potassium ferrocyanide and 0.3% Triton X‐100) at 37°C for 12 h. To remove chlorophyll, the tissue was washed with an ethanol series (50, 70, and 95%) and stored in 70% ethanol. Samples were imaged with a stereomicroscope (Leica Microsystems, Wetzlar, Germany), and GUS activity was detected as described previously (Jefferson, 1987). Briefly, tissue samples were ground into a fine powder, added to the GUS‐extraction solution and centrifuged for 2 min at 1000 rpm. The supernatant contained the crude GUS enzyme. The substrate 4‐methylumbelliferyl‐d‐glucopyranoside (4‐MUG) (Sigma‐Aldrich, Shanghai) was added to the supernatant, and fluorescence of the 4‐methylumbelliferone (4‐MU) (Sigma‐Aldrich, St. Louis, MO, USA) product was determined with excitation and emission wavelengths of 365 and 455 nm, respectively. The amount of 4‐MU was calculated using the 4‐MU standard curve. Protein concentration was used to normalize GUS activity. Each experiment was performed with at least three biological and technical replicates.

Protoplast transient expression

The insertion–deletion fragments of different lengths were cloned and inserted upstream of the 35S minimal promoter in the pGreenII 0800‐LUC vector, as described previously (Huang et al., 2018). The primers used to construct all vectors are listed in Table S1. For all transfections, plasmids were purified from Escherichia coli DH5α (CWBIO) using the Plasmid Maxi Kit (Tiangen) and adjusted to 1 µg/µL. Cotton protoplasts were isolated from 10‐day‐old etiolated GA0149 seedling cotyledons, following (slightly modified) established methods (Huang et al., 2018). Transfection was performed as described for polyethylene glycol‐mediated transformation (Kuroha et al., 2018). Protoplasts harbouring the plasmid were transferred into 6‐well plates and cultured the dark at 25°C for 16‐18 h. The LUC assay was measured using the Dual‐Luciferase Reporter Assay System (Promega Madison, WI, USA), following the manufacturer's protocol. Six biological replicates and technical replicates were assayed.

GaFZ overexpression vector construction and transformation

To construct the overexpression vector, the GaFZ full‐length CDS was PCR‐amplified from GA0149 with primer sets OEGaFZ‐XabI and OEGaFZ‐SacI, subcloned into T‐Vector pMD19 (Takara) and confirmed by sequencing. The cloned fragments were digested with XbaI and SacI, and then ligated into the pBI1221 (containing the CaMV 35S promoter) XbaI‐SacI cloning site. This vector was transformed into A. tumefaciens strains GV3101 and LBA4404. The strain GV3101 including vector was introduced into Arabidopsis following the floral dip method (Clough and Bent, 1998), and the strain LBA4404 including vector was transferred into cotton (HM1) plants following the A. tumefaciens‐mediated hypocotyl transformation method (Jin et al., 2006). Transgenic plants were selected on media containing 40 μg/mL of kanamycin and 50 μg/mL of cefotaxime. To identify the transgenic positive lines, the OE lines were genotyped by PCR using a specific primer set (CaMV35S‐F/OEGaFZ‐SacI) (Table S1). The GaFZ expression level was also determined in all OE lines using a specific primer set (Table S1).

Yeast two‐hybrid

The full‐length CDS for the transcription factors (GaWER, GaHOX1, GaDEL65 and GaWD40) was amplified and cloned into the bait vector, pGBKT7. Full‐length GaFZ CDS was inserted into the prey vector, pGADT7. pGBKT7 was transformed into Y2HGold yeast, and pGADT7 into Y187 yeast following the Transformation System (Clontech). Non‐autoactivation of the bait and non‐toxicity of the prey vectors were determined. Interactions between GaFZ and the transcription factors were detected using the mating method following the Matchmaker Gold Yeast Two‐Hybrid System on SD/‐Leu/‐Trp and SD/‐Ade/‐His/‐Leu/‐Trp media containing X‐gal (5‐bromo‐4‐chloro‐3‐indolyl‐b‐D‐galactopyranoside) media (Clontech).

Transcriptional activation assays

The GaHOX1 promoter (2000 bp upstream of the ATG) was amplified and cloned into pGreenII 0800‐LUC to drive expression of the firefly LUC reporter gene. To determine relative effectiveness, the Renilla reporter gene was driven by the CaMV 35S promoter in the same vector. The coding sequence of GaFZ was amplified and fused to pGreenII 62‐SK as an effector plasmid. The dual‐reporter system was performed as described previously (Liu et al., 2008). Relative luciferase activities were determined by dividing the LUC value with the REN value. To generate the GUS reporter vector, the GaHOX1 promoter was cloned into pcambia1391, to drive expression of GUS. The CDS of GaFZ was amplified and fused to the 2300 flag as an effector plasmid. The recombinant vectors were transferred into A. tumefaciens GV3101 with the helper plasmid pSoup‐P19 (ensuring high levels of transient gene expression). Agrobacteria cells containing the effector and reporter plasmids were mixed 1:2, respectively, and then co‐transformed into tobacco leaves. GUS activity was measured as described previously (Jefferson, 1987). At least six biological replicates were performed for each transient assay.

Phylogenetic analysis

The full‐length FZ sequences from cotton and other species were aligned using CLUSTALX (v2.1) (Larkin et al., 2007). A neighbour‐joining phylogenetic tree was constructed using MEGA (v.7.0) with 1000 bootstraps.

Trypan blue staining

Three‐week‐old A. thaliana were stained with trypan blue (TB) as described previously (Tanaka et al., 2004).

Conflict of interest

The authors declare no competing financial interests.

Author contributions

X.M.D., W.F.G. and S.P.H. conceived the project and designed the research. S.P.H., G.F.S. and Y.M. Z. performed built the non‐SNP analysis pipeline and performed GWAS. G.F.S. performed the bioinformatic analysis. Z.E.P., X.M.Z., P.P.W., Y.D., Y.H.J., X.L. G., B.Y.P., D.W.H. and L.R.W. designed and carried out field collection work. X.Y.W, Y.C.M and Y.F.C performed most of the experiments. X.Y.W. and L.Y.W. performed protoplast separation experiment. G.Y.F. performed the locus homozygous identification. S.S, Z.P and G.X.J performed LUC activity. Q.G., B.J.C. and G.Y.F. performed yeast two‐hybrid assays. S.P.H, W.F.G., X.Y.W., Y.C.M. and X.M.D. wrote the paper. All authors read and approved the manuscript.

Supporting information

Figure S1 GWAS generated by different types of variations on seed fuzz.

Figure S2 GWAS generated by different types of variations on leaf hair.

Figure S3 Comparison of mapping paired‐end reads in candidate region between fuzzy and fuzzless accessions.

Figure S4 Gels of all designated markers for continuous contigs designing overlapped PCR in locus FZ region on chromosome 8 between fuzzy (GA0146) and fuzzless (GA0146) accessions.

Figure S5 Identification of the exact breakpoint and sequence of larINDEL FZ .

Figure S6 Deciphering of sequence details in locus FZ region.

Figure S7 Proposed evolutionary model nearby the genomic region of larINDEL FZ .

Figure S8 Test of genotype‐phenotype linkage in both GWAS and F2 segregation populations.

Figure S9 Expression of GaFZ in different tissues between fuzzy (GA0146) and fuzzless (GA0149) accessions by RT‐PCR (a) and RNA‐seq (b).

Figure S10 Alignment of GaFZ (Ga08G0121) flanking sequences between GA0146 (fuzzy) and GA0149 (fuzzless).

Figure S11 Enhancement effect on gene expression within 1Mb‐length region flanking larINDEL FZ .

Figure S12 Illustration of cis‐regulatory elements in the sequence amplified by primer 6F6R in fuzzless and fuzzy accession.

Figure S13 Subcellular localization of GaFZ.

Figure S14 Phylogenetic tree analysis GaFZ among different species.

Figure S15 Selection of transgenic T0 generation‐positive seedlings in MS included kanamycin and cefotaxime.

Figure S16 Phenotypes and gene expression of GaFZ in overexpressing Arabidopsis plants.

Figure S17 Stem trichome phenotype of wild type and GaFZ‐overexpressing cotton.

Figure S18 Whole plant phenotype of Wild‐type (WT) and 35S::GaFZ (OE) in cotton.

Figure S19 Fuzz and lint phenotypes in WT and OE cottons.

Figure S20 Root hair phenotype in OE Arabidopsis and OE cotton.

Figure S21 Venn diagrams showed the common differential expressed genes (DEGs) in two developmental stages of GaFZ‐OE Arabidopsis lines vs. WT (a), and in two fuzz initiation stages (+3, +5DPA) of fuzzless G. arboreum line vs. fuzzy line (b).

Figure S22 KEGG enrichment analysis of DEGs between WT and OE Arabidopsis, and between fuzzy and fuzzless G. arboreum.

Figure S23 Interactions of GaFZ with GaHOX1, GaWER, GaDEL65 and GaWD40 at the protein and transcriptional level.

Figure S24 Gene expression of GaHD1 and GaMML3 between fuzzy (GA0146) and fuzzless (GA0149) accession.

Figure S25 GaFZ affects the gene expression of fatty acid elongation (FAE) pathway and cutin, suberin and wax biosynthesis pathway in Arabidopsis.

PBI-19-1110-s002.pdf (20.4MB, pdf)

Table S1 All primers used in this study.

Table S2 Genotype of larINDEL FZ , seed fuzz and leaf hair in both GWAS population and F2 segregation population.

Table S3 Detailed information of cis‐regulatory elements in larINDEL FZ .

Table S4 KEGG enriched pathways in GaFZ‐OE Arabidopsis line vs. WT.

Table S5 Expression of GL1, GL2, GL3 and TTG1 in Arabidopsis.

Table S6 Expression of GL1, GL2, GL3 and TTG1 in G. arboreum.

Table S7 Gene expression of biosynthesis of unsaturated fatty acids, fatty acid elongation, and cutin, suberin and wax biosynthesis pathway in G. arboreum.

Table S8 DEGs in cutin, suberin and wax biosynthesis, and fatty acid elongation pathway in Arabidopsis.

Table S9 DEGs in fatty acid and wax transporter, and fatty acid biosynthesis in Arabidopsis.

PBI-19-1110-s001.xlsx (145.2KB, xlsx)

Acknowledgements

We thank academician Dr. Xiaoya Chen (Chinese Academy of Sciences, Shanghai, China) for providing the Arabidopsis mutant gl2 , carefully reviewing and giving valuable suggestions for the paper; Dr. Cheng. Huang (China Agricultural University) for providing the dual‐reporter vector with mini promoter and the technology of separation protoplast; Dr. Hengling Wei and Dr. Zhaoen Yang (Institute of Cotton Research of CAAS) for providing pGreenII 0800‐LUC, pGreenII 62‐SK and 2300 flag vector; Prof. Zhonghu Liu (College of Animal Science and Veterinary Medicine, Henan Agricultural University) for providing condition to determine LUC activity; Prof. Lihua Ma, Mr. Tianping Suo and Prof. Xiaoxuan Song (State Key Laboratory of Cotton Biology) for helping of scanning electron microscopy (SEM); and Prof. Fushun Hao of Henan University for providing pCAMBIA1391 Vector. We also thank Dr. Mian Fasal Nazir for editing our manuscript. This work was supported by funding from the National Key Technology R&D Program, the Ministry of Science and Technology (2016YFD0100203, 2016YFD0100306 and 2017YFD0101601) the National Natural Science Foundation of China (grant 31621005).

Wang, X. , Miao, Y. , Cai, Y. , Sun, G. , Jia, Y. , Song, S. , Pan, Z. , Zhang, Y. , Wang, L. , Fu, G. , Gao, Q. , Ji, G. , Wang, P. , Chen, B. , Peng, Z. , Zhang, X. , Wang, X. , Ding, Y. , Hu, D. , Geng, X. , Wang, L. , Pang, B. , Gong, W. , He, S. and Du, X. (2021) Large‐fragment insertion activates gene GaFZ (Ga08G0121) and is associated with the fuzz and trichome reduction in cotton (Gossypium arboreum). Plant Biotechnol J, 10.1111/pbi.13532

Contributor Information

Wenfang Gong, Email: gwf018@126.com.

Shoupu He, Email: heshoupu@caas.cn.

Xiongming Du, Email: dujeffrey8848@hotmail.com.

References

  1. Abyzov, A. , Urban, A.E. , Snyder, M. and Gerstein, M. (2011) CNVnator: An approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing. Genome Res. 21, 974–984. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Ashburner, M. , Ball, C.A. , Blake, J.A. , Botstein, D. , Butler, H. , Cherry, J.M. , Davis, A.P. et al. (2000) Gene Ontology: tool for the unification of biology. Nat. Genet 25, 25–29. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Bernard, A. , Domergue, F. , Pascal, S. , Jetter, R. , Renne, C. , Faure, J.D. , Haslam, R.P. et al. (2012) Reconstitution of plant alkane biosynthesis in yeast demonstrates that Arabidopsis ECERIFERUM1 and ECERIFERUM3 are core components of a very‐long‐chain alkane synthesis complex. Plant Cell 24, 3106–3118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Chen, X. , Goodwin, S.M. , Boroff, V.L. , Liu, X. and Jenks, M.A. (2003) Cloning and characterization of the WAX2 gene of Arabidopsis involved in cuticle membrane and wax production. Plant Cell 15, 1170–1185. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Chen, K. , Wallis, J.W. , McLellan, M.D. , Larson, D.E. , Kalicki, J.M. , Pohl, C.S. , McGrath, S.D. et al. (2009) BreakDancer: an algorithm for high‐resolution mapping of genomic structural variation. Nat. Methods 6, 677–681. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Clark, R.M. , Wagler, T.N. , Quijada, P. and Doebley, J. (2006) A distant upstream enhancer at the maize domestication gene tb1 has pleiotropic effects on plant and inflorescent architecture. Nat. Genet. 38, 594–597. [DOI] [PubMed] [Google Scholar]
  7. Clough, S.J. and Bent, A.F. (1998) Floral dip: a simplified method for Agrobacterium‐mediated transformation of Arabidopsis thaliana . Plant J. 16, 735–743. [DOI] [PubMed] [Google Scholar]
  8. Degani, O. , Gepstein, S. and Dosoretz, C.G. (2002) Potential Use of cutinase in enzymatic scouring of cotton fiber cuticle. Appl. Biochem. Biotechnol. 102–103, 277–289. [DOI] [PubMed] [Google Scholar]
  9. Ding, M. , Cao, Y. , He, S. , Sun, J. , Dai, H. , Zhang, H. , Sun, C. et al. (2020) GaHD1, a candidate gene for the Gossypium arboreum SMA‐4 mutant, promotes trichome and fiber initiation by cellular H2O2 and Ca2+ signals. Plant Mol. Biol. 103, 409–423. [DOI] [PubMed] [Google Scholar]
  10. Ding, M. , Ye, W. , Lin, L. , He, S. , Du, X. , Chen, A. , Cao, Y. et al. (2015) The hairless stem phenotype of cotton (Gossypium barbadense) is linked to a copia‐like retrotransposon insertion in a Homeodomain‐Leucine Zipper Gene (HD1). Genetics 201, 143–154. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Du, S.J. , Dong, C.J. , Zhang, B. , Lai, T.F. , Du, X.M. and Liu, J.Y. (2013) Comparative proteomic analysis reveals differentially expressed proteins correlated with fuzz fiber initiation in diploid cotton (Gossypium arboreum L.). J. Proteomics 82, 113–129. [DOI] [PubMed] [Google Scholar]
  12. Du, X. , Huang, G. , He, S. , Yang, Z. , Sun, G. , Ma, X. , Li, N. et al. (2018) Resequencing of 243 diploid cotton accessions based on an updated A genome identifies the genetic basis of key agronomic traits. Nat. Genet. 50, 796–802. [DOI] [PubMed] [Google Scholar]
  13. Duan, P. , Xu, J. , Zeng, D. , Zhang, B. , Geng, M. , Zhang, G. , Huang, K. et al. (2017) Natural variation in the promoter of GSE5 contributes to grain size diversity in rice. Mol. Plant 10, 685–694. [DOI] [PubMed] [Google Scholar]
  14. Fehling, E. and Mukherjee, K.D. (1991) Acyl‐CoA elongase from a higher plant (Lunaria annua): metabolic intermediates of very‐long‐chain acyl‐CoA products and substrate specificity. Biochim. Biophys. Acta. 1082, 239–246. [DOI] [PubMed] [Google Scholar]
  15. Feng, X. , Cheng, H. , Zuo, D. , Zhang, Y. , Wang, Q. , Liu, K. , Ashraf, J. et al. (2019) Fine mapping and identification of the fuzzless gene GaFzl in DPL972 (Gossypium arboreum). Theor. Appl. Genet. 132, 2169–2179. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Greer, S. , Wen, M. , Bird, D. , Wu, X. , Samuels, L. , Kunst, L. and Jetter, R. (2007) The cytochrome P450 enzyme CYP96A15 is the midchain alkane hydroxylase responsible for formation of secondary alcohols and ketones in stem cuticular wax of Arabidopsis . Plant Physiol. 145, 653–667. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Guan, X.Y. , Li, Q.J. , Shan, C.M. , Wang, S. , Mao, Y.B. , Wang, L.J. and Chen, X.Y. (2008) The HD‐Zip IV gene GaHOX1 from cotton is a functional homologue of the Arabidopsis GLABRA2 . Physiol. Plant. 134, 174–182. [DOI] [PubMed] [Google Scholar]
  18. Huang, C. , Sun, H. , Xu, D. , Chen, Q. , Liang, Y. , Wang, X. , Xu, G. et al. (2018) ZmCCT9 enhances maize adaptation to higher latitudes. Proc. Natl. Acad. Sci. USA 115, E334–E341. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Jefferson, R.A. (1987) GUS fusions: β‐glucuronidase as a sensitive and versatile gene fusion marker in higher plants. The EMBO Journal 6, 3901–3907. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Ji, S.J. , Lu, Y.C. , Feng, J.X. , Wei, G. , Li, J. , Shi, Y.H. , Fu, Q. et al.(2003) Isolation and analyses of genes preferentially expressed during early cotton fiber development by subtractive PCR and cDNA array. Nucleic Acids Res. 31, 2534–2543. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Jin, S. , Zhang, X. , Nie, Y. , Guo, X. , Liang, S. and Zhu, H. (2006) Identification of a novel elite genotype for in vitro culture and genetic transformation of cotton. Biol. Plant. 50, 519–524. [Google Scholar]
  22. Kanehisa, M. , Goto, S. , Kawashima, S. , Okuno, Y. and Hattori, M. (2004) The KEGG resource for deciphering the genome. Nucleic Acids Res. 32, D277–D280. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Kang, H.M. , Sul, J.H. , Service, S.K. , Zaitlen, N.A. , Kong, S.‐Y. , Freimer, N.B. , Sabatti, C. et al. (2010). Variance component model to account for sample structure in genome‐wide association studies. Nat. Genet. 42, 348–354. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Khosla, A. , Paper, J.M. , Boehler, A.P. , Bradley, A.M. , Neumann, T.R. and Schrick, K. (2014) HD‐Zip Proteins GL2 and HDG11 have redundant functions in Arabidopsis Trichomes, and GL2 activates a positive feedback loop via MYB23. Plant Cell 26, 2184–2200. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Kim, D. , Pertea, G. , Trapnell, C. , Pimentel, H. , Kelley, R. and Salzberg, S.L. (2013) TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. Evol. 14, R36. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Kuroha, T. , Nagai, K. , Gamuyao, R. , Wang, D.R. , Furuta, T. , Nakamori, M. , Kitaoka, T. et al. (2018) Ethylene‐gibberellin signaling underlies adaptation of rice to periodic flooding. Science 361, 181–186. [DOI] [PubMed] [Google Scholar]
  27. Larkin, Ma , Blackshields, G. , Brown, Np , Chenna, R. , McGettigan, Pa , McWilliam, H. , Valentin, F. et al. (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947–2948. [DOI] [PubMed] [Google Scholar]
  28. Lassner, M.W. , Lardizabal, K. and Metz, J.G. (1996) A Jojoba β‐Ketoacyl‐CoA Synthase cDNA complements the canola fatty acid elongation mutation in transgenic plants. Plant Cell 8, 281–292. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Li, F. , Wu, X. , Lam, P. , Bird, D. , Zheng, H. , Samuels, L. , Jetter, R. et al. (2008) Identification of the wax ester synthase/Acyl‐Coenzyme A: diacylglycerol acyltransferase WSD1 required for stem wax ester biosynthesis in Arabidopsis . Plant Physiol. 148, 97–107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Li‐Beisson, Y. , Shorrosh, B. , Beisson, F. , Andersson, M.X. , Arondel, V. , Bates, P.D. , Baud, S. et al. (2013) Acyl‐lipid metabolism. Arabidopsis Book 11, e0161. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Liu, X. , Moncuquet, P. , Zhu, Q.H. , Stiller, W. , Zhang, Z. and Wilson, I. (2020) Genetic identification and transcriptome analysis of lintless and fuzzless traits in Gossypium arboreum L. Int. J. Mol. Sci. 21, 1675–1696. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Liu, T. , Song, T. , Zhang, X. , Yuan, H. , Su, L. , Li, W. , Xu, J. et al. (2014) Unconventionally secreted effectors of two filamentous pathogens target plant salicylate biosynthesis. Nat. Commun. 5, 4686–4696. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Liu, B. , Sun, L. , Ma, L. and Hao, F.S. (2017) Both AtrbohD and AtrbohF are essential for mediating responses to oxygen deficiency in Arabidopsis . Plant Cell Rep. 36, 947–957. [DOI] [PubMed] [Google Scholar]
  34. Liu, H.T. , Yu, X. , Li, K. , Klejnot, J. , Yang, H. , Lisiero, D. and Lin, C. (2008) Photoexcited CRY2 interacts with CIB1 to regulate transcription and floral initiation in Arabidopsis . Science 322, 1525–1539. [DOI] [PubMed] [Google Scholar]
  35. Livak, K.J. and Schmittgen, T.D. (2001) Analysis of relative gene expression data using real‐time quantitative PCR and the 2‐△△C T Method. Methods 25, 402–408. [DOI] [PubMed] [Google Scholar]
  36. Luo, B. , Xue, X.Y. , Hu, W.L. , Wang, L.J. and Chen, X.Y. (2007) An ABC transporter gene of Arabidopsis thaliana, AtWBC11, is involved in cuticle development and prevention of organ fusion. Plant Cell Physiol. 48, 1790–1802. [DOI] [PubMed] [Google Scholar]
  37. Mills, R.E. , Walter, K. , Stewart, C. , Handsaker, R.E. , Chen, K. , Alkan, C. , Abyzov, A. et al. (2011) Mapping copy number variation by population‐scale genome sequencing. Nature 470, 59–65. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Murashige, T. and Skoog, F. (1962) A revised medium for rapid growth and bio assays with tobacco tissue cultures. Physiol. Plantarum. 15, 473–497. [Google Scholar]
  39. Naoumkina, M. , Thyssen, G.N. , Fang, D.D. , Li, P. and Florane, C.B. (2020) Elucidation of sequence polymorphism in fuzzless‐seed cotton lines. Mol. Genet. Genomics. https://link.springer.com/article/10.1007/s00438-020-01736-z [DOI] [PubMed] [Google Scholar]
  40. Ni, F. , Qi, J. , Hao, Q. , Lyu, B. , Luo, M.‐C. , Wang, Y. , Chen, F. et al. (2017) Wheat Ms2 encodes for an orphan protein that confers male sterility in grass species. Nat. Commun. 8, 15121–15133. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Niu, E.L. , Cai, C.P. , Bao, J.H. , Wu, S. , Zhao, L. and Guo, W.Z. (2018) Up‐regulation of a homeodomain‐leucine zipper gene HD‐1 contributes to trichome initiation and development in cotton. J. Integr. Agr. 18, 361–371. [Google Scholar]
  42. Oshima, Y. , Shikata, M. , Koyama, T. , Ohtsubo, N. , Mitsuda, N. and Ohme‐Takagi, M. (2013) MIXTA‐like transcription factors and WAX INDUCER1/SHINE1 coordinately regulate cuticle development in Arabidopsis and Torenia fournieri . Plant Cell 25, 1609–1624. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Paliou, C. and Andrey, G. (2018) Large genomic insertion at the Shh locus results in hammer toes through enhancer adoption. Proc. Natl. Acad. Sci. USA 115, 839–841. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Panikashvili, D. , Savaldi‐Goldstein, S. , Mandel, T. , Yifhar, T. , Franke, R.B. , Hofer, R. , Schreiber, L. et al. (2007) The Arabidopsis DESPERADO/AtWBC11 transporter is required for cutin and wax secretion. Plant Physiol. 145, 1345–1360. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Pruitt, R.E. , Vielle‐Calzada, J.P. , Ploense, S.E. , Grossniklaus, U. and Lolle, S.J. (2000) FIDDLEHEAD, a gene required to suppress epidermal cell interactions in Arabidopsis, encodes a putative lipid biosynthetic enzyme. Proc. Natl. Acad. Sci. USA 97, 1311–1316. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Pu, L. , Li, Q. , Fan, X. , Yang, W. and Xue, Y. (2008) The R2R3 MYB transcription factor GhMYB109 is required for cotton fiber development. Genetics 180, 811–820. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Qin, Y.M. , Hu, C.Y. , Pang, Y. , Kastaniotis, A.J. , Hiltunen, J.K. and Zhu, Y.X. (2007b) Saturated very‐long‐chain fatty acids promote cotton fiber and Arabidopsis cell elongation by activating ethylene biosynthesis. Plant Cell 19, 3692–3704. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Qin, Y.M. , Pujol, F.M. , Hu, C.Y. , Feng, J.X. , Kastaniotis, A.J. , Hiltunen, J.K. and Zhu, Y.X. (2007a) Genetic and biochemical studies in yeast reveal that the cotton fibre‐specific GhCER6 gene functions in fatty acid elongation. J. Exp. Bot. 58, 473–481. [DOI] [PubMed] [Google Scholar]
  49. Reina‐Pinto, J.J. , Voisin, D. , Kurdyukov, S. , Faust, A. , Haslam, R.P. , Michaelson, L.V. , Efremova, N. et al. (2009) Misexpression of FATTY ACID ELONGATION1 in the Arabidopsis epidermis induces cell death and suggests a critical role for phospholipase A2 in this process. Plant Cell 21, 1252–1272. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Robinson, J.T. , Thorvaldsdóttir, H. , Winckler, W. , Guttman, M. , Lander, E.S. , Getz, G. and Mesirov, J.P. (2011) Integrative genomics viewer. Nat. Biotechnol. 29, 24–26. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Rong, J. , Pierce, G.J. , Waghmare, V.N. , Rogers, C.J. , Desai, A. , Chee, P.W. , May, O.L. et al. (2005) Genetic mapping and comparative analysis of seven mutants related to seed fiber development in cotton. Theor. Appl. Genet. 111, 1137–1146. [DOI] [PubMed] [Google Scholar]
  52. Roudier, F. , Gissot, L. , Beaudoin, F. , Haslam, R. , Michaelson, L. , Marion, J. , Molino, D. et al. (2010) Very‐long‐chain fatty acids are involved in polar auxin transport and developmental patterning in Arabidopsis . Plant Cell 22, 364–375. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Rowland, O. , Lee, R. , Franke, R. , Schreiber, L. and Kunst, L. (2007) The CER3 wax biosynthetic gene from Arabidopsis thaliana is allelic to WAX2/YRE/FLP1 . FEBS Lett. 581, 3538–3544. [DOI] [PubMed] [Google Scholar]
  54. Ruijter, J.M. , Ramakers, C. , Hoogaars, W.M.H. , Karlen, Y. , Bakker, O. , van den Hoff, M.J.B. and Moorman, A.F.M. (2009) Amplification efficiency: linking baseline and bias in the analysis of quantitative PCR data. Nucleic Acids Res. 37, e45. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Serna, L. and Martin, C. (2006) Trichomes: different regulatory networks lead to convergent structures. Trends Plant Sci. 11, 274–280. [DOI] [PubMed] [Google Scholar]
  56. Shan, C.‐M. , Shangguan, X.‐X. , Zhao, B. , Zhang, X.‐F. , Chao, L.‐M. , Yang, C.‐Q. , Wang, L.‐J. et al. (2014) Control of cotton fibre elongation by a homeodomain transcription factor GhHOX3. Nat. Commun. 5, 5519–5528. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Shen, W. , Paxton, C. N. , Szankasi, P. , Longhurst, M. , Schumacher, J. A. , Frizzell, K. A. , Sorrells, S. M. et al. (2018) Detection of genome‐wide copy number variants in myeloid malignancies using next‐generation sequencing. J. Clin. Pathol. 71, 372–378. [DOI] [PubMed] [Google Scholar]
  58. Shi, Y.‐H. , Zhu, S.‐W. , Mao, X.‐Z. , Feng, J.‐X. , Qin, Y.‐M. , Zhang, L. , Cheng, J. et al. (2006) Transcriptome profiling, molecular biological, and physiological studies reveal a major role for ethylene in cotton fiber cell elongation. Plant Cell 18, 651–664. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Song, G. , Cui, R. , Wang, K. , Guo, L. , Li, S. , Wang, C. and Zhang, X. (1998) A rapid improved CTAB method for extraction of cotton genomic DNA. Acta. Go. Ssypii. Sinica. 10, 273–275. [Google Scholar]
  60. Stankiewicz, P. and Lupski, J.R. (2010) Structural variation in the human genome and its role in disease. Annu. Rev. Med. 61, 437–455. [DOI] [PubMed] [Google Scholar]
  61. Studer, A. , Zhao, Q. , Ross‐Ibarra, J. and Doebley, J. (2011) Identification of a functional transposon insertion in the maize domestication gene tb1 . Nat. Genet. 43, 1160–1163. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Suh, M.C. , Samuels, A.L. , Jetter, R. , Kunst, L. , Pollard, M. , Ohlrogge, J. and Beisson, F. (2005) Cuticular lipid composition, surface structure, and gene expression in Arabidopsis stem epidermis. Plant Physiol. 139, 1649–1665. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Suo, J. , Liang, X. , Pu, L. , Zhang, Y. and Xue, Y. (2003) Identification of GhMYB109 encoding a R2R3 MYB transcription factor that expressed specifically in fiber initials and elongating fibers of cotton (Gossypium hirsutum L.). Biochim. Biophys. Acta. 1630, 25–34. [DOI] [PubMed] [Google Scholar]
  64. Takahashi, K. , Shimada, T. , Kondo, M. , Tamai, A. , Mori, M. , Nishimura, M. and Hara‐Nishimura, I. (2010) Ectopic expression of an esterase, which is a candidate for the unidentified plant cutinase, causes cuticular defects in Arabidopsis thaliana . Plant Cell Physiol. 51, 123–131. [DOI] [PubMed] [Google Scholar]
  65. Tanaka, T. , Tanaka, H. , Machida, C. , Watanabe, M. and Machida, Y. (2004) A new method for rapid visualization of defects in leaf cuticle reveals five intrinsic patterns of surface defects in Arabidopsis . Plant J. 37, 139–146. [DOI] [PubMed] [Google Scholar]
  66. Tang, M. , Wu, X. , Cao, Y. , Qin, Y. Ding, M. , Jiang, Y. , Sun, C. et al. (2019) Preferential insertion of a Ty1 LTR‐retrotransposon into the A sub‐genome's HD1 gene significantly correlated with the reduction in stem trichomes of tetraploid cotton. Mol. Genet. Genomics 295, 47–54. [DOI] [PubMed] [Google Scholar]
  67. Trapnell, C. , Williams, B. A. , Pertea, G. , Mortazavi, Ali , Kwan, G. , van Baren, M. J. , Salzberg, S. L. et al. (2010) Transcript assembly and quantification by RNA‐Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Ukitsu, H. , Kuromori, T. , Toyooka, K. , Goto, Y. , Matsuoka, K. , Sakuradani, E. , Shimizu, S. et al. (2007) Cytological and biochemical analysis of COF1, an Arabidopsis mutant of an ABC transporter gene. Plant Cell Physiol. 48, 1524–1533. [DOI] [PubMed] [Google Scholar]
  69. Walford, S.A. , Wu, Y. , Llewellyn, D.J. and Dennis, E.S. (2011) GhMYB25‐like: a key factor in early cotton fibre development. Plant J. 65, 785–797. [DOI] [PubMed] [Google Scholar]
  70. Walford, S.A. , Wu, Y. , Llewellyn, D.J. and Dennis, E.S. (2012) Epidermal cell differentiation in cotton mediated by the homeodomain leucine zipper gene, GhHD‐1 . Plant J. 71, 464–478. [DOI] [PubMed] [Google Scholar]
  71. Wan, Q. , Guan, X. , Yang, N. , Wu, H. , Pan, M. , Liu, B. , Fang, L. et al. (2016) Small interfering RNAs from bidirectional transcripts of GhMML3_A12 regulate cotton fiber development. New Phytol. 210, 1298–1310. [DOI] [PubMed] [Google Scholar]
  72. Wan, Q. , Zhang, H. , Ye, W. , Wu, H. and Zhang, T. (2014) Genome‐wide transcriptome profiling revealed cotton fuzz fiber development having a similar molecular model as Arabidopsis trichome. PLoS One 9, e97313. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Wang, M. , Tu, L. , Lin, M. , Lin, Z. , Wang, P. , Yang, Q. , Ye, Z. et al. (2017) Asymmetric subgenome selection and cis‐regulatory divergence during cotton domestication. Nat. Genet. 49, 579–587. [DOI] [PubMed] [Google Scholar]
  74. Wang, X. , Wang, H. , Liu, S. , Ferjani, A. , Li, J. , Yan, J. , Yang, X. et al. (2016) Genetic variation in ZmVPP1 contributes to drought tolerance in maize seedlings. Nat. Genet. 48, 1233–1241. [DOI] [PubMed] [Google Scholar]
  75. Wang, S. , Wang, J.‐W. , Yu, N. , Li, C.‐H. , Luo, B. , Gou, J.‐Y. , Wang, L.‐J. et al. (2004) Control of plant trichome development by a cotton fiber MYB gene. Plant Cell 16, 2323–2334. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Wang, Y. , Xiong, G. , Hu, J. , Jiang, L. , Yu, H. , Xu, J. , Fang, Y. et al. (2015) Copy number variation at the GL7 locus contributes to grain size diversity in rice. Nat. Genet. 47, 944–948. [DOI] [PubMed] [Google Scholar]
  77. Wang, S. , Zhao, G.H. , Jia, Y.H. and Du, X.M. (2008) Cloning and characterization of a CAP gene expressed in Gossypium arboreum fuzzless mutant. Crop Sci. 48, 2314–2320. [Google Scholar]
  78. Wu, R. and Citovsky, V. (2017a) Adaptor proteins GIR1 and GIR2. I. Interaction with the repressor GLABRA2 and regulation of root hair development. Biochem. Biophys. Res. Commun. 488, 547–553. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Wu, R. and Citovsky, V. (2017b) Adaptor proteins GIR1 and GIR2. II. Interaction with the co‐repressor TOPLESS and promotion of histone deacetylation of target chromatin. Biochem. Biophys. Res. Commun. 488, 609–613. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Wu, H. , Tian, Y. , Wan, Q. , Fang, L. , Guan, X. , Chen, J. , Hu, Y. et al. (2018) Genetics and evolution of MIXTA genes regulating cotton lint fiber development. New Phytol. 217, 883–895. [DOI] [PubMed] [Google Scholar]
  81. Xia, C. , Zhang, L. , Zou, C. , Gu, Y. , Duan, J. , Zhao, G. , Wu, J. et al. (2017) A TRIM insertion in the promoter of Ms2 causes male sterility in wheat. Nat. Commun. 8, 15407–15416. [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Yang, L. , Luquette, L.J. , Gehlenborg, N. , Xi, R. , Haseley, P.S. , Hsieh, C.‐H. , Zhang, C. et al. (2013) Diverse mechanisms of somatic structural variations in human cancer genomes. Cell 153, 919–929. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Yeats, T.H. and Rose, J.K. (2013) The formation and function of plant cuticles. Plant Physiol. 163, 5–20. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Yephremov, A. , Wisman, E. , Huijser, P. , Huijser, C. , Wellesen, K. and Saedler, H. (1999) Characterization of the FIDDLEHEAD gene of Arabidopsis reveals a link between adhesion response and cell differentiation in the epidermis. Plant Cell 11, 2187–2201. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Zhong, S. , Joung, J.‐G. , Zheng, Y. , Chen, Y.‐R. , Liu, B. , Shao, Y. , Xiang, J.Z. et al. (2011) High‐throughput illumina strand‐specific RNA sequencing library preparation. Cold Spring Harb. Protoc. 2011, 940–949. [DOI] [PubMed] [Google Scholar]
  86. Zhu, Q.H. , Yuan, Y. , Stiller, W. , Jia, Y. , Wang, P. , Pan, Z. , Du, X. et al. (2018) Genetic dissection of the fuzzless seed trait in Gossypium barbadense . J. Exp. Bot. 69, 997–1009. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Figure S1 GWAS generated by different types of variations on seed fuzz.

Figure S2 GWAS generated by different types of variations on leaf hair.

Figure S3 Comparison of mapping paired‐end reads in candidate region between fuzzy and fuzzless accessions.

Figure S4 Gels of all designated markers for continuous contigs designing overlapped PCR in locus FZ region on chromosome 8 between fuzzy (GA0146) and fuzzless (GA0146) accessions.

Figure S5 Identification of the exact breakpoint and sequence of larINDEL FZ .

Figure S6 Deciphering of sequence details in locus FZ region.

Figure S7 Proposed evolutionary model nearby the genomic region of larINDEL FZ .

Figure S8 Test of genotype‐phenotype linkage in both GWAS and F2 segregation populations.

Figure S9 Expression of GaFZ in different tissues between fuzzy (GA0146) and fuzzless (GA0149) accessions by RT‐PCR (a) and RNA‐seq (b).

Figure S10 Alignment of GaFZ (Ga08G0121) flanking sequences between GA0146 (fuzzy) and GA0149 (fuzzless).

Figure S11 Enhancement effect on gene expression within 1Mb‐length region flanking larINDEL FZ .

Figure S12 Illustration of cis‐regulatory elements in the sequence amplified by primer 6F6R in fuzzless and fuzzy accession.

Figure S13 Subcellular localization of GaFZ.

Figure S14 Phylogenetic tree analysis GaFZ among different species.

Figure S15 Selection of transgenic T0 generation‐positive seedlings in MS included kanamycin and cefotaxime.

Figure S16 Phenotypes and gene expression of GaFZ in overexpressing Arabidopsis plants.

Figure S17 Stem trichome phenotype of wild type and GaFZ‐overexpressing cotton.

Figure S18 Whole plant phenotype of Wild‐type (WT) and 35S::GaFZ (OE) in cotton.

Figure S19 Fuzz and lint phenotypes in WT and OE cottons.

Figure S20 Root hair phenotype in OE Arabidopsis and OE cotton.

Figure S21 Venn diagrams showed the common differential expressed genes (DEGs) in two developmental stages of GaFZ‐OE Arabidopsis lines vs. WT (a), and in two fuzz initiation stages (+3, +5DPA) of fuzzless G. arboreum line vs. fuzzy line (b).

Figure S22 KEGG enrichment analysis of DEGs between WT and OE Arabidopsis, and between fuzzy and fuzzless G. arboreum.

Figure S23 Interactions of GaFZ with GaHOX1, GaWER, GaDEL65 and GaWD40 at the protein and transcriptional level.

Figure S24 Gene expression of GaHD1 and GaMML3 between fuzzy (GA0146) and fuzzless (GA0149) accession.

Figure S25 GaFZ affects the gene expression of fatty acid elongation (FAE) pathway and cutin, suberin and wax biosynthesis pathway in Arabidopsis.

PBI-19-1110-s002.pdf (20.4MB, pdf)

Table S1 All primers used in this study.

Table S2 Genotype of larINDEL FZ , seed fuzz and leaf hair in both GWAS population and F2 segregation population.

Table S3 Detailed information of cis‐regulatory elements in larINDEL FZ .

Table S4 KEGG enriched pathways in GaFZ‐OE Arabidopsis line vs. WT.

Table S5 Expression of GL1, GL2, GL3 and TTG1 in Arabidopsis.

Table S6 Expression of GL1, GL2, GL3 and TTG1 in G. arboreum.

Table S7 Gene expression of biosynthesis of unsaturated fatty acids, fatty acid elongation, and cutin, suberin and wax biosynthesis pathway in G. arboreum.

Table S8 DEGs in cutin, suberin and wax biosynthesis, and fatty acid elongation pathway in Arabidopsis.

Table S9 DEGs in fatty acid and wax transporter, and fatty acid biosynthesis in Arabidopsis.

PBI-19-1110-s001.xlsx (145.2KB, xlsx)

Articles from Plant Biotechnology Journal are provided here courtesy of Society for Experimental Biology (SEB) and the Association of Applied Biologists (AAB) and John Wiley and Sons, Ltd

RESOURCES