Abstract
During the domestication of the goose, its feather color changed; however, the molecular mechanisms responsible for this change are not completely understood. Here, we performed whole-genome resequencing on three pooled samples of geese (feral and domestic), representing two distinct feather colors, to identify genes that might regulate feather color. We identified around 8 million SNPs within each of the three pools and validated allele frequencies for a subset of these SNPs using PCR and Sanger sequencing. Several genomic regions with signatures of differential selection were found when we compared the gray and white feather color populations using the FST and Hp approaches. By combining previous functional studies with our genomic analyses, we identified 26 genes (KITLG, MITF, TYRO3, KIT, AP3B1, SMARCA2, ROR2, CSNK1G3, CCDC112, VAMP7, SLC16A2, LOC106047519, RLIM, KIAA2022, ST8SIA4, LOC106044163, TRPM6, TICAM2, LOC106038556, LOC106038575, LOC106038574, LOC106038594, LOC106038573, LOC106038604, LOC106047489, and LOC106047492) that potentially regulate feather color in geese. These results substantially expand the catalog of potential feather color regulators in geese and provide a basis for further studies on domestication and avian feather coloration.
Keywords: goose, feather color, genome, pool-seq, SNP
Introduction
Through more than 5000 years of constant artificial selection, domesticated geese have acquired a number of modifications to their appearance compared to their wild ancestors and relatives (Zeuner, 1963; Albarella, 2005). Most Asian and some European domestic goose breeds were derived from the swan goose (Anser cygnoides) (Buckland and Gérard, 2002). Domestication involved a complex set of metabolic, physiological and behavioral changes, including traits involving the liver, meat, eggs and feathers, but the most visible difference between wild and domestic swan geese is their feather coloration (Sossinka, 1982). Wild swan geese are characterized by their iconic feathers with gray stripes (Figure 1A), while domestic swan geese have an all-white color appearance (Figures 1B,C; Gao et al., 2016).
Since feather coloration and patterns are prominent features in birds, and play essential roles in their survival, the mechanisms that regulate the differentiation of feather color have been intensively studied (Abolins-Abols et al., 2018; Gao et al., 2018). Feather color is the consequence of two different, but related, physical processes, pigmentation and structural coloration, where pigmentation is the primary basis for the color diversity in animals (D’Alba et al., 2012). Melanins and carotenoids are widely distributed pigments in avian feathers and are the main contributors to the diversity of feather color in birds. The melanin content is usually higher than that of carotenoids; for example, studies have shown that the melanin content of feathers in swallows is four orders of magnitude greater than that of carotenoids (McGraw et al., 2004; McGraw, 2006; Delhey, 2015). The genetic control of melanogenesis in birds is achieved through genes that encode specific enzymes involved in melanin synthesis as well as other regulatory and structural proteins required for the distribution of melanin (Galván and Solano, 2016).
Investigations in some avian species have identified a limited number of genes involved in the mechanisms controlling feather coloration; however, only a few studies have focused on changes in feather color in geese or swans (Theron et al., 2001; Mundy et al., 2004; Bed’hom et al., 2012; Emaresi et al., 2013; Wang et al., 2014; Poelstra et al., 2015; Zhou et al., 2018; Wen et al., 2021). Melanic plumage polymorphisms in the lesser snow geese (Anser caerulescens caerulescens) and arctic skuas (Stercorarius parasiticus) correlate with changes in the copy number of variant MC1R alleles (Mundy et al., 2004). In the black and black-necked swans (Cygnus atratus and C. melanocoryphus), independently derived nucleotide substitutions in MC1R, which cause amino acid changes at important functional sites, have been identified that are consistent with increased MC1R activity and melanism pigment synthesis (Pointer and Mundy, 2008). In the domestic swan geese, three SNPs in TYR and one in MITF have been reported to be associated with white plumage (Wang et al., 2014). Recently, in a genome-level examination of plumage color in domestic geese, Wen et al. (2021) reported an 18 bp deletion in an intron region of KIT (NW_013185664.1, 11,785,718–11,785,736 bp) that was associated with white feather color.
Although TYR, MITF and KIT have been found to be associated with differences in feather coloration in domestic geese (Wang et al., 2014; Wen et al., 2021), the genetic basis of feather color formation in this species remains incompletely understood. With the rapid development of high-throughput sequencing, it has become possible to examine the genetic basis of differences in feather color at the genomic level (Zhou et al., 2018).
In this study, we performed whole-genome pooled sequencing (Pool-Seq) on three populations of swan geese with wild type and white-colored feathers. By identifying genomic regions that experienced selective sweeps, we aimed to identify genes that have experienced artificial selection and thus might explain the change in feather color in domesticated geese.
Materials and Methods
Whole-Genome Pooled Sequencing of Goose DNA Samples
A total of 117 feral gray (Anser cygnoides, 60 females, 57 males, group Gray; Figure 1A), 25 feral white (Anser cygnoides, 10 females, 15 males, group White_1; Figure 1B), and 87 domesticated white (Anser cygnoides domesticus, 52 females, 35 males, group White_2; Figure 1C) geese were sampled. These samples were collected from large populations to minimize genetic relationships. We chose our sample sizes for this Pool-Seq study based on previous reports on Darwin’s finches (sample size range 8–35) and monarch butterflies (sample size range 9–101) (Zhan et al., 2014; Lamichhaney et al., 2015). The accuracy of Pool-Seq increases with larger numbers of individuals included in the pool (Futschik and Schlötterer, 2010; Gautier et al., 2013). This suggests that our samples should be sufficient to identify SNPs and genes associated with feather color. A subset of SNPs identified from Pool-Seq were validated by Sanger sequencing to assess the accuracy of estimating allele frequencies (AFs) using Pool-Seq. Blood samples were collected by venipuncture. Gray and White_1 geese were acquired from a population maintained at the Xianghai breeding base in Jilin city, Jilin province, China. White_2 geese, belonging to the Huoyan breed, were obtained from the Liaoyang Animal Science Research Institute, Liaoning province, China. Geese in the Gray group, with wild-type feather color, were the offspring of a mating between a population of male wild geese and female domestic geese. After several generations of breeding, a sub-population of feral white (White_1) geese appeared among the feral gray geese. Although fed by humans, unlike the domesticated white geese (White_2), both the feral gray and feral white geese possess flight abilities similar to those of wild geese, which is considered to be a signature of feralization (Gering et al., 2019).
Genomic DNA was extracted individually from blood samples of each goose using a Blood Genome DNA Extraction Kit (TIANGEN, DP348) following the manufacturer’s instructions. Equimolar quantities (3 μg/ml) of DNA from each individual were pooled to establish the three sequencing libraries. The first pooled sample was from 117 feral gray geese, the second from 25 feral white geese and the third from 87 domestic white geese. The concentrations and purity of genomic DNA were checked before library construction. Libraries were generated via adapter ligation and DNA cluster preparation and subjected to 150 bp paired-end sequencing on an Illumina HiSeq 4000 platform. Sequencing depth of each library was at least 30×. Library construction and genome sequencing were conducted by the Beijing Genomics Institute Co., Ltd. (Shenzhen, China).
Data Processing, Mapping and SNP Calling
We applied the PoolParty pipeline (Micheletti and Narum, 2018), which was designed for pool sequencing, to analyze the sequence data. The module PPAlign was used to align each read to the reference genome and for SNP calling. The parameters of module PPAlign were: “THREADZ = 32 BQUAL = 20 MAPQ = 5 SNPQ = 20 MINLENGTH = 25 INWIN = 3 MAF = 0.05 KMEM = Xmx4g MINDP = 10”. Briefly, BBDuk was used to obtain clean data by trimming primer dimers and adapter sequences from the reads, discarding bases with quality lower than Q20 and reads with lengths less than 25 bp. BWA-MEM (Li and Durbin, 2009) was then used to map the clean data to the goose reference genome (AnsCyg_PRJNA183603_v1.0) (Lu et al., 2015).
Prior to SNP calling, SAMBLASTER was used to mark duplicate read pairs and compress the alignment to eliminate any bias generated during the PCR amplification for library preparation and/or sequencing (Faust and Hall, 2014). Aligned results were then sorted by Picard Tools, and ambiguously mapped or unaligned reads were removed with SAMtools (Li et al., 2009). BCFtools (Li, 2011) was then used to call and filter the SNPs into a VCF file. Filtered alignments were combined in mpileup format for downstream analyses. SNPs with sequencing depth < 10-fold, quality < 20, minor allele frequency (MAF) < 0.05 or within 15 bp of an indel were discarded.
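The SNP filtering thresholds listed above can be illustrated with a minimal Python sketch. This is not the PoolParty or BCFtools implementation; the `Snp` record, the `passes_filters` function and the `indel_positions` lookup are hypothetical names introduced only for illustration, and treating a distance of exactly 15 bp from an indel as "within 15 bp" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    """Hypothetical minimal SNP record for one pool."""
    scaffold: str
    pos: int
    depth: int    # total read depth at the site
    qual: float   # variant quality score
    maf: float    # minor allele frequency estimated from the pool


def passes_filters(snp, indel_positions, min_depth=10, min_qual=20.0,
                   min_maf=0.05, indel_window=15):
    """Return True if the SNP survives the four filters described in the text.

    indel_positions maps a scaffold name to a list of indel coordinates
    (an illustrative data structure, not taken from the pipeline).
    """
    if snp.depth < min_depth or snp.qual < min_qual or snp.maf < min_maf:
        return False
    # Assumption: a SNP exactly indel_window bp from an indel is also discarded.
    nearby = indel_positions.get(snp.scaffold, ())
    return not any(abs(snp.pos - p) <= indel_window for p in nearby)


# Usage: one SNP far from any indel, one 10 bp away from an indel.
indels = {"scaf1": [100]}
kept = passes_filters(Snp("scaf1", 200, 30, 50.0, 0.2), indels)
dropped = passes_filters(Snp("scaf1", 110, 30, 50.0, 0.2), indels)
```

In practice these filters are applied by the pipeline itself through the MINDP, SNPQ, MAF and indel-window settings; the sketch only makes the decision rule explicit.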
Variant Discovery and SNP Annotation
SNPs were annotated and the functional consequences of sequence variants were predicted with the Ensembl Variant Effect Predictor (VEP) tool, using Ensembl database version 103 and the VCF file as input (McLaren et al., 2016). Annotated results of VEP included transcripts, proteins, regulatory regions, and phenotype (McLaren et al., 2016). We grouped loss-of-function (LoF) variants into four categories (1, stop-gain and stop-loss; 2, frameshift indel; 3, donor and acceptor splice-site; and 4, initiator codon variants) (Sveinbjornsson et al., 2016). Marker coverage for each gene included 10 kb of upstream and downstream flanking region (Potter et al., 2010). We focused on LoF variant annotation results for the downstream analysis.
Sanger Sequencing Validation of SNP Allele Frequencies (AFs)
SNP AFs were calculated from the read depths of each allele in the Pool-Seq data. To confirm the accuracy of AFs estimated from the Pool-Seq data, we performed a Kendall’s W coefficient of concordance test on a subset of the SNPs (28 loci) (Dodge and Commenges, 2006). Of these SNPs, 15 (SNP01, SNP06 and SNP11-23 in Supplementary Table 1) were selected as they had the lowest P-values in the comparison of the Gray and White_2 groups by Fisher’s exact test based on read depth of alleles. Eight SNPs (SNP02-05 and SNP07-10 in Supplementary Table 1) were selected as they were adjacent to SNP01 and SNP06 and could be amplified with the same primer pairs used for them. Five SNPs (SNP24-28 in Supplementary Table 1) located in four genes (KITLG, MITF, TYRO3, and KIT) were also selected as these genes had previously been reported to be associated with the regulation of feather or coat color (Wehrle-Haller, 2003; Zhu et al., 2009; Zhou et al., 2018; Wu et al., 2019). The SNP alleles selected for validation were genotyped in all 229 individual geese by Sanger sequencing and the AFs were calculated from the genotype data. The two estimates of AFs, which were obtained from Pool-Seq and Sanger sequencing data, were compared using the Kendall’s W coefficient of concordance test. Chi-square tests were performed to test the significance of the associations between the five SNPs in the color-related genes (KITLG, MITF, TYRO3, and KIT) and feather color phenotype. Primers used for the amplification of the selected SNPs are listed in Supplementary Table 1.
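To make the validation procedure concrete, the following Python sketch shows one way to estimate an allele frequency from pooled read counts and to compute Kendall's W between two sets of AF estimates (for example Pool-Seq versus Sanger). The function names are hypothetical, the W statistic is computed without a tie correction, and the authors' exact software and settings are not stated, so this should be read as an illustration only.

```python
def allele_frequency(allele_reads, other_reads):
    """Allele frequency estimated from pooled read depths."""
    total = allele_reads + other_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return allele_reads / total


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W for m methods (rows) scoring the same n SNPs (columns).

    Uses W = 12 S / (m^2 (n^3 - n)), where S is the sum of squared
    deviations of the per-SNP rank sums from their mean. No tie correction
    is applied (a simplifying assumption).
    """
    m, n = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    rank_sums = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Usage with made-up AFs for three SNPs estimated by two methods.
pool_af = [allele_frequency(30, 70), allele_frequency(55, 45), allele_frequency(90, 10)]
sanger_af = [0.28, 0.57, 0.91]
w = kendalls_w([pool_af, sanger_af])
```

A value of W close to 1 indicates that the two methods rank the SNPs in nearly the same order, which is the sense in which concordance is reported in the Results.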
Detection of Selective Sweeps
To accurately detect genomic regions in geese that had experienced selection during domestication and to estimate the patterns of genetic diversity across the goose genome, we conducted selective sweep analyses including the fixation index (FST) and pooled heterozygosity (Hp) approaches (Rubin et al., 2010; Micheletti and Narum, 2018). FST values in 10-kb non-overlapping sliding windows were calculated using the “fst-sliding.pl” module in Popoolation2 (Kofler et al., 2011), according to Weir and Cockerham’s method (Weir and Cockerham, 1984). The global parameters of the FST approach were: “MINCOV = 10 MAXCOV = 100 MAF = 0.05.” Hp and negative Z-transformed Hp (−ZHp) were calculated using a custom python3 script in 10-kb non-overlapping sliding windows. The Hp approach determines, for each pool and SNP, the numbers of reads corresponding to the most (nMAJ) and least (nMIN) abundant alleles. For each window in each breed pool, the heterozygosity score of the pool was calculated as:
$$H_p = \frac{2 \sum n_{MAJ} \sum n_{MIN}}{\left(\sum n_{MAJ} + \sum n_{MIN}\right)^2}$$
where nMAJ and nMIN represent the numbers of reads corresponding to the most and least abundant allele, summed over all SNPs in the window. Individual Hp values were then Z-transformed as follows:
$$ZH_p = \frac{H_p - \mu_{H_p}}{\sigma_{H_p}}$$
where μHp and σHp are the mean and standard deviation of Hp across all windows.
Windows with fewer than 10 SNPs were discarded to avoid spurious signals. Windows located in the top 3% of the FST distribution and top 3% of the −ZHp distribution were regarded as candidate regions for selective sweeps (Wang et al., 2015). Genes overlapping these regions were identified using Ensembl genome annotation.
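The Hp and −ZHp calculations described above can be sketched in a few lines of Python. This is not the authors' custom script; it follows the pooled heterozygosity definition of Rubin et al. (2010), Hp = 2 ΣnMAJ ΣnMIN / (ΣnMAJ + ΣnMIN)^2, and assumes that the Z-transformation uses the population standard deviation over all retained windows. The function names and the input layout (a dictionary of per-window read-count pairs) are hypothetical.

```python
from statistics import mean, pstdev


def pooled_heterozygosity(snp_counts):
    """Hp for one window.

    snp_counts is a list of (n_major, n_minor) read counts, one pair per SNP.
    """
    sum_maj = sum(n_maj for n_maj, _ in snp_counts)
    sum_min = sum(n_min for _, n_min in snp_counts)
    total = sum_maj + sum_min
    return 2 * sum_maj * sum_min / total ** 2


def negative_zhp(windows, min_snps=10):
    """Return {window_id: -ZHp}, skipping windows with fewer than min_snps SNPs."""
    hp = {wid: pooled_heterozygosity(counts)
          for wid, counts in windows.items() if len(counts) >= min_snps}
    values = list(hp.values())
    mu = mean(values)
    sd = pstdev(values)  # assumption: population SD; the paper does not specify
    if sd == 0:
        raise ValueError("Hp is constant across windows; ZHp is undefined")
    return {wid: -(h - mu) / sd for wid, h in hp.items()}


# Usage with toy windows: "A" is nearly fixed, "B" is highly heterozygous.
toy = {
    "A": [(20, 0)] * 10,
    "B": [(10, 10)] * 10,
    "C": [(15, 5)] * 10,
    "D": [(10, 10)] * 2,   # too few SNPs, discarded
}
scores = negative_zhp(toy)
```

Windows with low heterozygosity receive high −ZHp values, which is why the upper tail of the −ZHp distribution is inspected for sweep candidates.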
Genomic regions that might have experienced selective sweeps were identified through three steps: (1) windows in the top 3% of the FST distributions of both the Gray vs. White_1 and the Gray vs. White_2 comparisons were identified; (2) windows in the top 3% of the −ZHp distributions of both the White_1 and the White_2 populations were identified; (3) the intersection of the regions identified in (1) and (2) was considered to have experienced a selective sweep. Genes located in these overlapping regions might be involved in the change of goose feather color.
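The three-step intersection described above can be expressed with simple set operations. The sketch below is illustrative only: `top_fraction` and `candidate_sweeps` are hypothetical names, each input is assumed to be a dictionary mapping a window identifier to its FST or −ZHp value, and the way the 3% cutoff is rounded and ties are broken is an assumption rather than the authors' procedure.

```python
def top_fraction(scores, fraction=0.03):
    """Return the set of window IDs whose score lies in the top `fraction`."""
    n_top = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_top])


def candidate_sweeps(fst_gray_w1, fst_gray_w2, nzhp_w1, nzhp_w2, fraction=0.03):
    """Windows in the top fraction of both FST comparisons and both -ZHp scans."""
    # Step 1: shared FST outliers (Gray vs. White_1 and Gray vs. White_2).
    fst_hits = top_fraction(fst_gray_w1, fraction) & top_fraction(fst_gray_w2, fraction)
    # Step 2: shared -ZHp outliers in the two white populations.
    hp_hits = top_fraction(nzhp_w1, fraction) & top_fraction(nzhp_w2, fraction)
    # Step 3: windows supported by both approaches.
    return fst_hits & hp_hits


# Usage with eight toy windows and a 25% cutoff (two windows per scan).
ids = ["w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8"]
fst_a = dict(zip(ids, [0.9, 0.8, 0.1, 0.1, 0.2, 0.1, 0.0, 0.3]))
fst_b = dict(zip(ids, [0.7, 0.9, 0.2, 0.1, 0.1, 0.3, 0.0, 0.2]))
hp_a = dict(zip(ids, [0.1, 3.0, 2.5, 0.2, 0.0, -1.0, -0.5, 0.3]))
hp_b = dict(zip(ids, [0.0, 2.8, 0.1, 2.9, 0.2, -0.7, -1.2, 0.4]))
sweeps = candidate_sweeps(fst_a, fst_b, hp_a, hp_b, fraction=0.25)
```

Requiring support from both white populations and from both statistics trades sensitivity for specificity, which fits the stated aim of shortlisting a small set of high-confidence candidate regions.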
Gene Ontology (GO) and KEGG Pathway Enrichment Analysis
To determine the possible function of genes that were located in the selective sweep regions, we identified orthologous human genes using the BioMart online tool. The orthologous genes were then uploaded into the DAVID online tool to test for enrichment in gene ontology (GO) terms (Huang et al., 2009). KEGG pathway analysis was conducted using the online KOBAS tool (Xie et al., 2011). A Fisher’s exact test was then used to determine the significance of the enrichments of the GO terms and KEGG pathways, with a significance level of P < 0.05.
Results
Statistics of the Genome Resequencing Data
A total of 148.26 Gb clean data was obtained from the three Pool-Seq libraries (Table 1). Mapping rates for the libraries varied between 98.14 and 98.23%, with the final effective mapping depths ranging from 44.09- to 44.13-fold. The Q20 rates for the three libraries were all over 98%. An average of 8,476,172 SNPs was identified in each library.
TABLE 1.
Parameter | Gray | White_1 | White_2 |
Clean data (Gb) | 49.42 | 49.40 | 49.44 |
Reads (M) | 329.45 | 329.30 | 329.59 |
Mapped reads rate (%) | 98.14 | 98.20 | 98.23 |
Q20 rate (%) | 98.08 | 98.14 | 98.17 |
Sequencing depth (×) | 44.11 | 44.09 | 44.13 |
Total SNPs | 8,785,296 | 8,680,731 | 7,962,489 |
Sanger Sequencing Validation
To assess the reliability of estimating allele frequencies of SNPs using the population genomic sequencing (Pool-Seq) data, we genotyped 28 SNPs from an average of 210 individuals using Sanger sequencing (Supplementary Tables 2, 3). AFs calculated from the Sanger sequencing data, based on individual amplifications and sequencing, were in accord with the AFs calculated from the Pool-Seq data. Kendall’s W coefficients of concordance for the comparisons of the AFs estimated from the Pool-Seq and Sanger genotypes for the Gray, White_1 and White_2 populations were 0.96, 0.97 and 0.94 (P < 0.05), respectively, showing that there is good concordance between the results obtained using the two different methods.
Among the 28 SNPs examined above, five were SNPs that are located in four genes (KITLG, MITF, TYRO3, and KIT) previously reported to be associated with feather or coat color (Wehrle-Haller, 2003; Zhu et al., 2009; Zhou et al., 2018; Wu et al., 2019). Of these five SNPs, two are located in the 3′ UTR of KITLG (NW_013185706.1: G232853A and NW_013185706.1: C232854T), one in the 5′ UTR of MITF (NW_013185692.1: G4400553C), one in the TYRO3 (S772G) coding region and one in the KIT (T887A) coding region (Table 2). Results from a Chi-square test showed an extremely significant association between the SNP genotypes and feather color phenotypes (P < 0.001) for these SNPs.
TABLE 2.
SNP information (a) | Gene | Genotype | Gray, n (b) | White_1, n (b) | White_2, n (b) | χ2 value (P) |
NW_013185706.1: 3′UTR_G232853A | KITLG | GG | 37 | 22 (88%) | 84 (100%) | 102.89 (P < 0.001) |
NW_013185706.1: 3′UTR_G232853A | KITLG | GA | 77 (68%) | 3 | 0 | |
NW_013185706.1: 3′UTR_G232853A | KITLG | AA | 0 | 0 | 0 | |
NW_013185706.1: 3′UTR_C232854T | KITLG | CC | 37 | 22 (88%) | 84 (100%) | 102.89 (P < 0.001) |
NW_013185706.1: 3′UTR_C232854T | KITLG | CT | 77 (68%) | 3 | 0 | |
NW_013185706.1: 3′UTR_C232854T | KITLG | TT | 0 | 0 | 0 | |
NW_013185692.1: 5′UTR_G4400553C | MITF | GG | 83 (74%) | 6 | 14 | 80.44 (P < 0.001) |
NW_013185692.1: 5′UTR_G4400553C | MITF | GC | 29 | 17 (68%) | 49 (59%) | |
NW_013185692.1: 5′UTR_G4400553C | MITF | CC | 0 | 2 | 20 | |
NW_013185657.1: cds_A2638G:S772G | TYRO3 | AA | 31 | 20 (80%) | 86 (100%) | 106.57 (P < 0.001) |
NW_013185657.1: cds_A2638G:S772G | TYRO3 | AG | 76 (71%) | 5 | 0 | |
NW_013185657.1: cds_A2638G:S772G | TYRO3 | GG | 0 | 0 | 0 | |
NW_013185664.1: cds_A2659G:T887A | KIT | AA | 0 | 0 | 0 | 93.80 (P < 0.001) |
NW_013185664.1: cds_A2659G:T887A | KIT | AG | 67 (66%) | 2 | 0 | |
NW_013185664.1: cds_A2659G:T887A | KIT | GG | 35 | 23 (92%) | 78 (100%) | |
(a) SNP information is displayed in the following format: NCBI GenBank accession number: region in the goose reference genome (cds, 5′UTR or 3′UTR) + reference nucleotide + SNP position + mutant nucleotide; for coding SNPs, this is followed by the reference amino acid + codon position in the transcript + mutant amino acid.
(b) Numbers of individuals (Gray | White_1 | White_2) for which reliable genotypes could be inferred from the Sanger sequencing data. The percentage in parentheses is the proportion of verified individuals in that group carrying the genotype.
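As a worked illustration of the Chi-square association test, the Python sketch below recomputes the statistic for the first KITLG SNP from the genotype counts in Table 2. The function names are hypothetical, and dropping the genotype class that was never observed (AA) before testing is an assumption, since the handling of empty classes is not described; under that assumption the test has 2 degrees of freedom, for which the upper-tail probability has the closed form exp(−χ2/2).

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    Rows are groups, columns are genotype classes. Columns with no
    observations are dropped first (an assumption made for this sketch).
    """
    keep = [j for j in range(len(table[0])) if any(row[j] for row in table)]
    obs = [[row[j] for j in keep] for row in table]
    row_tot = [sum(row) for row in obs]
    col_tot = [sum(col) for col in zip(*obs)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(obs):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(obs) - 1) * (len(keep) - 1)
    return stat, dof


def p_value_two_df(stat):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-stat / 2)


# Genotype counts (GG, GA, AA) for NW_013185706.1: G232853A, from Table 2.
kitlg = [
    [37, 77, 0],   # Gray
    [22, 3, 0],    # White_1
    [84, 0, 0],    # White_2
]
stat, dof = chi_square(kitlg)
p = p_value_two_df(stat)
```

With these counts the statistic agrees with the tabulated value of 102.89 to two decimal places, and the corresponding P-value is far below 0.001.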
Selective Sweep Analysis
Hp and −ZHp distributions are presented in Supplementary Figure 1. The selective sweep analyses identified (1) 317 regions in the top 3% of the FST distributions of the intersection of the Gray vs. White_1 (FST value, mean = 0.119, range 0.095–0.457) and the Gray vs. White_2 (FST value, mean = 0.258, range 0.202–0.727) comparisons and (2) 253 regions in the top 3% of the −ZHp distributions of the intersection of the White_1 (−ZHp value, mean = 2.835, range 2.014–4.952) and White_2 (−ZHp value, mean = 2.904, range 2.317–3.755) populations (Figure 2). A total of 99 genes were identified in the 317 regions identified by the FST distributions and 103 genes identified in the 253 regions identified by the −ZHp distributions (Supplementary Tables 4, 5). Among the 99 genes identified from the FST distributions, four (SLC16A2, AP3B1, SMARCA2, and VAMP7) have previously been associated with animal coloration (Table 3). Similarly, 5 of the 103 genes from the −ZHp distributions (SLC16A2, ROR2, CSNK1G3, CCDC112, and VAMP7) were previously associated with animal coloration (Table 3).
TABLE 3.
Method | Scaffold (a) | Gene symbol | Summary of gene function |
FST (Top 3%) | NW_013185770.1 | AP3B1 | Melanin formation (Jing et al., 2014) |
FST (Top 3%) | NW_013185722.1 | SLC16A2 | Pigment-related (Baxter et al., 2019) |
FST (Top 3%) | NW_013185909.1 | SMARCA2 | Melanin formation (Mehrotra et al., 2014) |
FST (Top 3%) | NW_013185915.1 | VAMP7* | Melanin formation (Yatsu et al., 2013) |
Hp (Top 3%) | NW_013185722.1 | SLC16A2 | Pigment-related (Baxter et al., 2019) |
Hp (Top 3%) | NW_013185840.1 | ROR2 | Melanin formation (O’Connell et al., 2013) |
Hp (Top 3%) | NW_013185881.1 | CSNK1G3 | Pigment-related (Al Robaee et al., 2020) |
Hp (Top 3%) | NW_013185883.1 | CCDC112* | Pigment-related (Tian et al., 2014) |
Hp (Top 3%) | NW_013185915.1 | VAMP7* | Melanin formation (Yatsu et al., 2013) |
(a) AnsCyg_PRJNA183603_v1.0 primary assembly.
Gene names marked with an asterisk (*) have LoF variants according to VEP annotation.
More importantly, 27 regions, which included 17 genes, were found in the overlap of the 317 FST and 253 −ZHp top 3% regions, including two, SLC16A2 and VAMP7, that had previously been associated with animal coloration (Table 4). The 15 novel genes identified here are: LOC106038556, LOC106038575, LOC106038574, LOC106038594, LOC106038573, RLIM, LOC106038604, KIAA2022, ST8SIA4, LOC106044163, TRPM6, TICAM2, LOC106047489, LOC106047492, and LOC106047519 (Table 4). LoF (loss-of-function) variants were found for 9 of the genes in the selective sweep regions of geese with different feather colors (or in the 10 kb region upstream and downstream of the sweep regions): eight detected by both the FST and the Hp approaches (LOC106038574, LOC106038604, LOC106044163, TICAM2, LOC106047489, VAMP7, LOC106047492, and LOC106047519) and one detected only by Hp (CCDC112) (Supplementary Table 6).
TABLE 4.
Scaffold (a) | Position (b) | Gene symbol | Gene description |
NW_013185722.1 | 261,870 | LOC106038556 | Homeobox protein CDX-4-like |
NW_013185722.1 | 288,464 | LOC106038575 | Uncharacterized LOC106038575 |
NW_013185722.1 | 301,490 | LOC106038574* | Pre-mRNA 3′-end-processing factor FIP1-like |
NW_013185722.1 | 338,779 | LOC106038594 | Ligand of Numb protein X 2-like |
NW_013185722.1 | 371,156 | LOC106038573 | Uncharacterized LOC106038573 |
NW_013185722.1 | 502,677 | SLC16A2 | Solute carrier family 16 member 2 |
NW_013185722.1 | 517,620 | RLIM | Ring finger protein, LIM domain interacting |
NW_013185722.1 | 537,179 | LOC106038604* | Uncharacterized LOC106038604 |
NW_013185722.1 | 548,033 | KIAA2022 | Neurite extension and migration factor |
NW_013185807.1 | 1,231,519 | ST8SIA4 | ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 4 |
NW_013185817.1 | 975,069 | LOC106044163* | Proprotein convertase subtilisin/kexin type 5-like |
NW_013185817.1 | 1,430,086 | TRPM6 | Transient receptor potential cation channel subfamily M member 6 |
NW_013185883.1 | 304,453 | TICAM2* | Toll like receptor adaptor molecule 2 |
NW_013185915.1 | 689,355 | LOC106047489* | SLAIN motif-containing protein-like |
NW_013185915.1 | 717,937 | VAMP7* | Vesicle associated membrane protein 7 |
NW_013185915.1 | 737,829 | LOC106047492* | DNA-directed RNA polymerases I and III subunit RPAC2-like |
NW_013185915.1 | 745,589 | LOC106047519* | Endothelin B receptor-like |
(a) Same as Table 3.
(b) Start position of the gene in the genome.
The eight gene names marked with an asterisk (*) have LoF variants according to VEP annotation.
GO and KEGG Pathway Enrichment Analysis
We conducted GO and KEGG pathway analyses of the 17 genes identified in the selective sweep regions. These genes were found to be significantly enriched for the GO term “late endosome membrane” (GO: 0031902, P < 0.05) and in three pathways, “RNA polymerase,” “SNARE interactions in vesicular transport” and “cytosolic DNA-sensing pathway” (P < 0.05, Supplementary Table 7).
Discussion
In this study, we performed whole-genome Pool-Seq on three populations of geese with two different colors of feathers to identify SNPs and genes that might be responsible for these differing phenotypes. The Kendall’s W coefficients for the AFs calculated from the Pool-Seq and Sanger sequencing data indicated good concordance between them, which suggests that our Pool-Seq data are adequate for identifying loci that are differentiated between the goose phenotypes. Selective sweep analyses of these SNP data were used to identify genomic regions that show signatures of selection during the domestication of geese. This led to the identification of 17 genes located in candidate regions identified by both the FST and Hp approaches, suggesting a high probability that selection occurred on these genes and that they might be associated with the change in feather color seen in these geese. VEP annotation of these 17 genes identified eight with loss-of-function (LoF) alleles potentially involved in regulating feather color.
Among the 17 identified genes (Table 4), three (VAMP7, SLC16A2, and LOC106047519) have previously been associated with the regulation of coat color in animals (Imokawa et al., 1992; Yatsu et al., 2013; Baxter et al., 2019). VAMP7, vesicle associated membrane protein 7, is localized to Tyrp1-containing vesicles/organelles and acts as part of the SNARE machinery with syntaxin-3 and SNAP23 on melanosomes to regulate Tyrp1 transport in mouse melanocytes (Yatsu et al., 2013). VAMP7 may play a key role in melanin formation and thus influence goose feather color. SLC16A2 has an effect on pigmentation phenotypes in the zebrafish, and has the GO term “pigmentation” annotated in the Zebrafish Information Network database (Baxter et al., 2019). LOC106047519 belongs to the ETB-R gene family, which also includes the Endothelin B receptor (EDNRB), and has been described as an EDNRB-like gene (Kanehisa, 1997; Kanehisa and Goto, 2000). EDNRB is reported to be associated with the development of cells of the melanocytic lineage (Imokawa et al., 1992), suggesting that LOC106047519 might also perform a function similar to EDNRB to regulate feather color. Our VEP analysis identified LoF mutations in VAMP7 and LOC106047519, but not in SLC16A2. These results suggest that VAMP7 and LOC106047519 might not only regulate pigmentation in the previously investigated animals but also play a role in the change in feather color in the goose.
Using either the FST or the Hp approach we identified five other genes (AP3B1, SMARCA2, ROR2, CSNK1G3, and CCDC112) that have been reported to be associated with animal coloration (Table 3). Substitutions in AP3B1 cause distinct phenotypes in the pigmented cells in mouse eyes and possibly play a role in organelle biogenesis associated with melanosomes (Jing et al., 2014). SMARCA2, a member of the SWI/SNF family, is involved in melanocyte differentiation and melanoma (Mehrotra et al., 2014; Markiewicz and Idowu, 2020). ROR2 is involved in the formation of melanoma in humans, suggesting a role in melanin formation (O’Connell et al., 2013). Expression of CSNK1G3, a gene related to human vitiligo, is significantly reduced in C57BL/6 black mice with tyrosinase-induced depigmented skin (Ocampo-Candiani et al., 2018; Al Robaee et al., 2020). CCDC112 regulates pigmentation and the expression level of this gene differs between Silkie and White Leghorn chickens (Tian et al., 2014). We found a LoF mutation in CCDC112 in the Gray group that might partly explain the difference in feather color in geese. Our results suggest that these five genes may also affect feather color in geese.
We also focused on four genes (KITLG, MITF, TYRO3, and KIT) that were previously reported to be associated with feather color (Wehrle-Haller, 2003; Zhu et al., 2009; Zhou et al., 2018; Wu et al., 2019). The SNP genotypes for these genes were also validated by Sanger sequencing (Table 2). Changes in untranslated regions (UTRs) can lead to changes in the expression of genes (Barrett et al., 2012). Here, we identified three SNPs, two located in the 3′ UTR of KITLG and one in the 5′ UTR of MITF, which are significantly associated with feather color phenotypes in our geese. This suggests that these three SNPs may affect the expression of KITLG and MITF, resulting in a change in feather color. Non-synonymous mutations are more likely to affect the biological function of a gene. Here, we identified two non-synonymous substitutions in KIT (T887A) and TYRO3 (S772G) that are significantly associated with feather color phenotypes, indicating that they may regulate goose feather color.
GO and KEGG enrichment analyses showed that the 17 candidate genes in the most significant sweep areas are significantly enriched in the GO term late endosome membrane (Supplementary Table 7). Two of the candidate genes (VAMP7 and TICAM2) are associated with this term. We also identified three pathways (RNA polymerase, SNARE interactions in vesicular transport and cytosolic DNA-sensing pathway) that are significantly enriched; VAMP7 is also involved in SNARE interactions in vesicular transport. Although the enriched GO term and the pathways do not seem to directly correlate with animal coloration, it is still possible that the genes involved in them could regulate feather coloration in geese.
In conclusion, we identified 26 genes (17 detected by both the FST and Hp approaches, five by either FST or Hp and four previously reported color-related genes) from our genomic Pool-Seq data that might be responsible for the change in feather color that occurred during the domestication of geese (Anser cygnoides). Among these 26 genes, 12 have previously been found to be associated with animal coloration in other studies. The roles of the other genes in feather coloration require further investigation. Additional studies, including functional experimentation, are needed to confirm the roles of these genes, and the consequences of the mutations caused by the SNPs, on phenotypic variation in feather color in geese.
Data Availability Statement
Whole-genome sequencing data reported in this study were deposited into the NCBI Sequence Read Archive under the accession number PRJNA532466 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA532466/).
Ethics Statement
The animal study was reviewed and approved by the Animal Care and Use Committee of Shenyang Agricultural University. Written informed consent was obtained from the owners for the participation of their animals in this study.
Author Contributions
SR, XL, CF, and RL performed the experiments. SR and GL analyzed the data. JZ, YS, and SZ collected the samples. SR, GL, SS, DI, and ZW wrote the manuscript. SZ and ZW designed the study and supervised the work. All authors contributed to the article and approved the submitted version.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Funding
This work was supported by grants from the National Natural Science Foundation of China (No. 31672274), the Organization Department of Liaoning Provincial Committee (No. XLYC1907018), and the Educational Department of Liaoning Province of China (Climbing Scholar).
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2021.650013/full#supplementary-material
References
- Abolins-Abols M., Kornobis E., Ribeca P., Wakamatsu K., Peterson M. P., Ketterson E. D., et al. (2018). Differential gene regulation underlies variation in melanic plumage coloration in the dark-eyed Junco (Junco hyemalis). Mol. Ecol. 27 4501–4515. 10.1111/mec.14878 [DOI] [PubMed] [Google Scholar]
- Al Robaee A. A., Alzolibani A. A., Rasheed Z. (2020). Autoimmune response against tyrosinase induces depigmentation in C57BL/6 black mice. Autoimmunity 53 459–466. 10.1080/08916934.2020.1836489 [DOI] [PubMed] [Google Scholar]
- Albarella U. (2005). Alternate fortunes? The role of domestic ducks and geese from Roman to Medieval times in Britain. Doc. Archaeobiol. 3 249–258. [Google Scholar]
- Barrett L. W., Fletcher S., Wilton S. D. (2012). Regulation of eukaryotic gene expression by the untranslated gene regions and other non-coding elements. Cell. Mol. life Sci. 69 3613–3634. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Baxter L. L., Watkins-Chow D. E., Pavan W. J., Loftus S. K. (2019). A curated gene list for expanding the horizons of pigmentation biology. Pigment Cell Melanoma Res. 32 348–358. 10.1111/pcmr.12743 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bed’hom B., Vaez M., Coville J.-L., Gourichon D., Chastel O., Follett S., et al. (2012). The lavender plumage colour in Japanese quail is associated with a complex mutation in the region of MLPH that is related to differences in growth, feed consumption and body temperature. BMC Genomics 13:442. 10.1186/1471-2164-13-442 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Buckland R. B., Gérard G. (2002). “Origins and Breeds of Domestic Geese,” in Goose Production, ed. Buckland R. (Rome: Food & Agriculture Org; ). [Google Scholar]
- D’Alba L., Kieffer L., Shawkey M. D. (2012). Relative contributions of pigments and biophotonic nanostructures to natural color production: a case study in budgerigar (Melopsittacus undulatus) feathers. J. Exp. Biol. 215 1272–1277. 10.1242/jeb.064907 [DOI] [PubMed] [Google Scholar]
- Delhey K. (2015). The colour of an avifauna: a quantitative analysis of the colour of Australian birds. Sci. Rep. 5:18514. 10.1038/srep18514 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dodge Y., Commenges D. (2006). The Oxford Dictionary of Statistical Terms. New York, NY: Oxford University Press on Demand. [Google Scholar]
- Emaresi G., Ducrest A. L., Bize P., Richter H., Simon C., Roulin A. (2013). Pleiotropy in the melanocortin system: expression levels of this system are associated with melanogenesis and pigmentation in the tawny owl (Strix aluco). Mol. Ecol. 22 4915–4930. 10.1111/mec.12438 [DOI] [PubMed] [Google Scholar]
- Faust G. G., Hall I. M. (2014). SAMBLASTER: fast duplicate marking and structural variant read extraction. Bioinformatics 30 2503–2505. 10.1093/bioinformatics/btu314 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Futschik A., Schlötterer C. (2010). The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186 207–218. 10.1534/genetics.110.114397 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Galván I., Solano F. (2016). Bird integumentary melanins: biosynthesis, forms, function and evolution. Int. J. Mol. Sci. 17:520. 10.3390/ijms17040520 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gao G., Xu M., Bai C., Yang Y., Li G., Xu J., et al. (2018). Comparative genomics and transcriptomics of Chrysolophus provide insights into the evolution of complex plumage coloration. Gigascience 7:giy113. 10.1093/gigascience/giy113 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gao G., Zhao X., Li Q., He C., Zhao W., Liu S., et al. (2016). Genome and metagenome analyses reveal adaptive evolution of the host and interaction with the gut microbiota in the goose. Sci. Rep. 6:32961. 10.1038/srep32961 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gautier M., Foucaud J., Gharbi K., Cézard T., Galan M., Loiseau A., et al. (2013). Estimation of population allele frequencies from next-generation sequencing data: pool-versus individual-based genotyping. Mol. Ecol. 22 3766–3779. 10.1111/mec.12360 [DOI] [PubMed] [Google Scholar]
- Gering E., Incorvaia D., Henriksen R., Conner J., Getty T., Wright D. (2019). Getting back to nature: feralization in animals and plants. Trends Ecol. Evol. 34 1137–1151. 10.1016/j.tree.2019.07.018 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huang D. W., Sherman B. T., Lempicki R. A. (2009). Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37 1–13. 10.1093/nar/gkn923 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Imokawa G., Yada Y., Miyagishi M. (1992). Endothelins secreted from human keratinocytes are intrinsic mitogens for human melanocytes. J. Biol. Chem. 267 24675–24680. [PubMed] [Google Scholar]
- Jing R., Dong X., Li K., Yan J., Chen X., Feng L. (2014). The Ap3b1 gene regulates the ocular melanosome biogenesis and tyrosinase distribution differently from the Hps1 gene. Exp. Eye Res. 128 57–66. 10.1016/j.exer.2014.08.010 [DOI] [PubMed] [Google Scholar]
- Kanehisa M. (1997). A database for post-genome analysis. Trends Genet. 13 375–376. [DOI] [PubMed] [Google Scholar]
- Kanehisa M., Goto S. (2000). KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28 27–30. 10.1093/nar/28.1.27 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kofler R., Pandey R. V., Schlötterer C. (2011). PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27 3435–3436. 10.1093/bioinformatics/btr589 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lamichhaney S., Berglund J., Almén M. S., Maqbool K., Grabherr M., Martinez-Barrio A., et al. (2015). Evolution of Darwin’s finches and their beaks revealed by genome sequencing. Nature 518 371–375. 10.1038/nature14181 [DOI] [PubMed] [Google Scholar]
- Li H. (2011). A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27 2987–2993. 10.1093/bioinformatics/btr509 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li H., Durbin R. (2009). Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25 1754–1760. 10.1093/bioinformatics/btp324 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., et al. (2009). The sequence alignment/map format and SAMtools. Bioinformatics 25 2078–2079. 10.1093/bioinformatics/btp352 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lu L., Chen Y., Wang Z., Li X., Chen W., Tao Z., et al. (2015). The goose genome sequence leads to insights into the evolution of waterfowl and susceptibility to fatty liver. Genome Biol. 16:89. 10.1186/s13059-015-0652-y [DOI] [PMC free article] [PubMed] [Google Scholar]
- Markiewicz E., Idowu O. C. (2020). Melanogenic difference consideration in ethnic skin type: a balance approach between skin brightening applications and beneficial sun exposure. Clin. Cosmet. Invest. Dermatol. 13:215. 10.2147/CCID.S245043 [DOI] [PMC free article] [PubMed] [Google Scholar]
- McGraw K. J. (2006). Pterins, porphyrins, and Psittacofulvins. Bird Color. Mech. Meas. 1 354–398. [Google Scholar]
- McGraw K. J., Safran R. J., Evans M. R., Wakamatsu K. (2004). European barn swallows use melanin pigments to color their feathers brown. Behav. Ecol. 15 889–891. 10.1093/beheco/arh109 [DOI] [Google Scholar]
- McLaren W., Gil L., Hunt S. E., Riat H. S., Ritchie G. R. S., Thormann A., et al. (2016). The ensembl variant effect predictor. Genome Biol. 17 1–14. 10.1186/s13059-016-0974-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mehrotra A., Mehta G., Aras S., Trivedi A., de la Serna I. (2014). SWI/SNF chromatin remodeling enzymes in melanocyte differentiation and melanoma. Crit. Rev. Eukaryot. Gene Expr 24 151–161. 10.1615/CritRevEukaryotGeneExpr.v24.i2 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Micheletti S. J., Narum S. R. (2018). Utility of pooled sequencing for association mapping in nonmodel organisms. Mol. Ecol. Resour. 18 825–837. 10.1111/1755-0998.12784 [DOI] [PubMed] [Google Scholar]
- Mundy N. I., Badcock N. S., Hart T., Scribner K., Janssen K., Nadeau N. J. (2004). Conserved genetic basis of a quantitative plumage trait involved in mate choice. Science 303 1870–1873. 10.1126/science.1093834 [DOI] [PubMed] [Google Scholar]
- Ocampo-Candiani J., Salinas-Santander M., Trevino V., Ortiz-López R., Ocampo-Garza J., Sanchez-Dominguez C. N. (2018). Evaluation of skin expression profiles of patients with vitiligo treated with narrow-band UVB therapy by targeted RNA-seq. An. Bras. Dermatol. 93 843–851. 10.1590/abd1806-4841.20187589 [DOI] [PMC free article] [PubMed] [Google Scholar]
- O’Connell M. P., Marchbank K., Webster M. R., Valiga A. A., Kaur A., Vultur A., et al. (2013). Hypoxia induces phenotypic plasticity and therapy resistance in melanoma via the tyrosine kinase receptors ROR1 and ROR2. Cancer Discov. 3 1378–1393. 10.1158/2159-8290 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Poelstra J. W., Vijay N., Hoeppner M. P., Wolf J. B. W. (2015). Transcriptomics of colour patterning and coloration shifts in crows. Mol. Ecol. 24 4617–4628. 10.1111/mec.13353 [DOI] [PubMed] [Google Scholar]
- Pointer M. A., Mundy N. I. (2008). Testing whether macroevolution follows microevolution: Are colour differences among swans (Cygnus) attributable to variation at the MC1R locus? BMC Evol. Biol. 8:249. 10.1186/1471-2148-8-249 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Potter C., Cordell H. J., Barton A., Daly A. K., Hyrich K. L., Mann D. A., et al. (2010). Association between anti-tumour necrosis factor treatment response and genetic variants within the TLR and NFκB signalling pathways. Ann. Rheum. Dis. 69 1315–1320. 10.1136/ard.2009.117309 [DOI] [PubMed] [Google Scholar]
- Rubin C.-J., Zody M. C., Eriksson J., Meadows J. R. S., Sherwood E., Webster M. T., et al. (2010). Whole-genome resequencing reveals loci under selection during chicken domestication. Nature 464 587–591. 10.1038/nature08832 [DOI] [PubMed] [Google Scholar]
- Sossinka R. (1982). Domestication in birds. Avian Biol. 6 373–403. [Google Scholar]
- Sveinbjornsson G., Albrechtsen A., Zink F., Gudjonsson S. A., Oddson A., Másson G., et al. (2016). Weighting sequence variants based on their annotation increases power of whole-genome association studies. Nat. Genet. 48:314. 10.1038/ng.3507 [DOI] [PubMed] [Google Scholar]
- Theron E., Hawkins K., Bermingham E., Ricklefs R. E., Mundy N. I. (2001). The molecular basis of an avian plumage polymorphism in the wild: a melanocortin-1-receptor point mutation is perfectly associated with the melanic plumage morph of the bananaquit. Coereba Flaveola. Curr. Biol. 11 550–557. 10.1016/S0960-9822(01)00158-0 [DOI] [PubMed] [Google Scholar]
- Tian M., Hao R., Fang S., Wang Y., Gu X., Feng C., et al. (2014). Genomic regions associated with the sex-linked inhibitor of dermal melanin in Silkie chicken. Front. Agric. Sci. Eng 1:242. 10.15302/J-FASE-2014018 [DOI] [Google Scholar]
- Wang C., Wang H., Zhang Y., Tang Z., Li K., Liu B. (2015). Genome-wide analysis reveals artificial selection on coat colour and reproductive traits in Chinese domestic pigs. Mol. Ecol. Resour. 15 414–424. 10.1111/1755-0998.12311 [DOI] [PubMed] [Google Scholar]
- Wang Y., Li S. M., Huang J., Chen S. Y., Liu Y. P. (2014). Mutations of TYR and MITF genes are associated with plumage colour phenotypes in geese. Asian Austr J. Anim. Sci. 27 778–783. 10.5713/ajas.2013.13350 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wehrle-Haller B. (2003). The role of Kit-ligand in melanocyte development and epidermal homeostasis. Pigment Cell Res. 16 287–296. 10.1034/j.1600-0749.2003.00055.x [DOI] [PubMed] [Google Scholar]
- Weir B. S., Cockerham C. C. (1984). Estimating F-statistics for the analysis of population structure. Evolution (N.Y.) 38 1358–1370. 10.2307/2408641 [DOI] [PubMed] [Google Scholar]
- Wen J., Shao P., Chen Y., Wang L., Lv X., Yang W., et al. (2021). Genomic scan revealed KIT gene underlying white/gray plumage color in Chinese domestic geese. Anim. Genet 52 356–360. 10.1111/age.13050 [DOI] [PubMed] [Google Scholar]
- Wu Z., Deng Z., Huang M., Hou Y., Zhang H., Chen H., et al. (2019). Whole genome resequencing identifies KIT new alleles that affect coat color phenotypes in pigs. Front. Genet. 10:218. 10.3389/fgene.2019.00218 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xie C., Mao X., Huang J., Ding Y., Wu J., Dong S., et al. (2011). KOBAS 2.0: a web server for annotation and identification of enriched pathways and diseases. Nucleic Acids Res. 39 W316–W322. 10.1093/nar/gkr483 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yatsu A., Ohbayashi N., Tamura K., Fukuda M. (2013). Syntaxin-3 is required for melanosomal localization of Tyrp1 in melanocytes. J. Invest. Dermatol. 133 2237–2246. 10.1038/jid.2013.156 [DOI] [PubMed] [Google Scholar]
- Zeuner F. E. (1963). A History of Domesticated Animals. London: Hutchinson. [Google Scholar]
- Zhan S., Zhang W., Niitepold K., Hsu J., Haeger J. F., Zalucki M. P., et al. (2014). The genetics of monarch butterfly migration and warning colouration. Nature 514 317–321. 10.1038/nature13812 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhou Z., Li M., Cheng H., Fan W., Yuan Z., Gao Q., et al. (2018). An intercross population study reveals genes associated with body size and plumage color in ducks. Nat. Commun. 9:2648. 10.1038/s41467-018-04868-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhu S., Wurdak H., Wang Y., Galkin A., Tao H., Li J., et al. (2009). A genomic screen identifies TYRO3 as a MITF regulator in melanoma. Proc. Natl. Acad. Sci. U.S.A. 106 17025–17030. 10.1073/pnas.0909292106 [DOI] [PMC free article] [PubMed] [Google Scholar]
Data Availability Statement
Whole-genome sequencing data reported in this study were deposited into the NCBI Sequence Read Archive under the accession number PRJNA532466 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA532466/).