Genome Biology and Evolution. 2023 Nov 3;15(11):evad183. doi: 10.1093/gbe/evad183

Evolution, Inheritance, and Strata Formation of the W Chromosome in Duck (Anas platyrhynchos)

Hongchang Gu, Junhui Wen, Xiurong Zhao, Xinye Zhang, Xufang Ren, Huan Cheng, Lujiang Qu
Editor: Rachel O'Neill
PMCID: PMC10630070  PMID: 37931036

Abstract

The nonrecombining female-limited W chromosome is predicted to experience unique evolutionary processes. Difficulties in assembling W chromosome sequences have hindered the identification of duck W-linked sequences and their evolutionary footprint. To address this, we generated contig-level genome assemblies for three duck breeds and developed a rigorous pipeline that expanded the W-linked data set to 35 genes, comprising 11 known genes and 24 newly identified genes. Our results indicate that W chromosome expression may not be subject to female-specific selection; a significant convergent pattern of upregulation associated with increased female-specific selection was not detected. The genetic stability of the W chromosome is also reflected in the strong evolutionary correlation between it and the mitochondrial genome; the complete consistency of the cladogram topologies constructed from their gene sequences supports shared maternal coinheritance. By tracing the evolutionary trajectories of W-linked sequences, we found that recombination suppression occurred in four distinct strata, of which three were conserved across Neognathae. Taken together, our results reveal a distinctive evolutionary pattern and an independent pattern of stratum formation for sex chromosomes.

Keywords: sex chromosome, sex-specific selection, maternal inheritance, genome assembly, recombination suppression, sequence evolution


Significance.

The evolutionary history of duck sex chromosomes is difficult to trace because recombination suppression has evolved repeatedly in different lineages, and accurately distinguishing Z- and W-linked genes is very challenging. We identified Anas platyrhynchos Z–W orthologs using a strict multistep method. Our combined data set comprised 11 known genes and 24 newly identified genes, together with 27 previously identified W-linked genes in the red junglefowl. We found four Z chromosome strata in ducks, in which recombination was suppressed progressively over 125 Myr.

Introduction

Males and females within a species share most of their genomes, except for the sex-specific regions of the Y and W chromosomes (Mank 2017), which diverged from the corresponding X and Z chromosomes once recombination was halted (Charlesworth et al. 2005). In birds, recombination suppression around the sex-determining gene (DMRT1) evolved in a common ancestor to all modern birds and eventually resulted in the development of Z–W sex chromosomes that are shared across Neognathae (Fridolfsson et al. 1998; Smith et al. 2009). Once recombination was halted between Z and W, the W chromosomes would experience Hill–Robertson effects, accelerated accumulation of deleterious mutations, and the degradation of W-linked coding content typical of nonrecombining sex-specific chromosomes (Bachtrog and Charlesworth 2002; Berlin et al. 2007; Bachtrog 2008; Kaiser and Charlesworth 2010; Mank 2012; Moghadam et al. 2012; Smeds et al. 2015).

Previous studies have shown that Y chromosomes are only subject to selection for male-specific effects and thus play a large role in male fitness (Lemos et al. 2008; Lange et al. 2009). In the Z–W sex chromosome system, the W chromosome is unique to females, and it is thus expected that genes on the nonrecombining region of the W chromosome (NRW) will regulate female-related phenotypes and that their expression may be sensitive to female-specific selection. The role of female-specific selection in shaping the avian W chromosome remains unclear (Bellott et al. 2017). Although sex-specific selection has been systematically studied in chickens (Moghadam et al. 2012), the comparison of modern duck breeds to chickens is of value as they have also been subjected to artificial selection, but the selection process and its effects may be different (Li et al. 2010; Zhou et al. 2018). After mallards (MDs) were first domesticated more than 2,000 years ago, they experienced subsequent selection to develop meat- and egg-specific breeds (Zhang, Jia, Almeida, et al. 2018).

The W chromosome is inherited through the maternal line and should thus be coinherited with mitochondrial DNA (mtDNA) (Sager and Lane 1972; Berlin and Ellegren 2001; Nishimura et al. 2006; White et al. 2008). Although rare cases of paternal mtDNA transmission have been observed in Homo sapiens (Luo et al. 2018), paternal mtDNA transmission has not been reported in birds (Ellegren 2010), and this shared maternal inheritance should result in identical phylogenies for the mtDNA and W chromosome (Berlin et al. 2004; Berlin and Ellegren 2006).

Previous work has indicated that in many species, recombination suppression occurs over time, resulting in distinct regions with different degrees of X–Y or Z–W divergence, often referred to as strata (Jackson 2011; Kapun and Flatt 2019). In line with this, research on birds has shown evidence for strata that have formed independently in a stepwise process over millions of years (Handley et al. 2004; Nam and Ellegren 2008; Wright et al. 2012). However, information on the evolutionary history of duck sex chromosomes is sparse: only 12 Z–W orthologs have been used to identify an “Anseriform-specific” stratum (Wright et al. 2014), and thus the evolutionary history of recombination suppression in this clade may not yet be fully understood.

Bird sex chromosomes have undergone a complex evolution, shaped by both chance and necessity, over approximately 130 Myr. To enable a systematic and comprehensive analysis of sex sequence evolution, we assembled contig-level genomes from the Pacific Biosciences (PacBio) reads of three duck breeds to expand the duck W gene catalog (fig. 1). This expanded catalog allows for a more detailed reconstruction of the history of duck W chromosomes and provides evidence for up to four evolutionary strata in sex chromosomes, the youngest of which may be Anseriform specific.

Fig. 1.



Procedural approach used to identify true Z–W orthologs.

Results

Assembly of Three Contig-Level Genomes

Approximately 46.96, 54.95, and 47.80 Gb of sequence were obtained from PacBio single-molecule real-time (SMRT) sequencing of MD, Pekin duck (PK), and Shao Xing duck (SX), respectively, and these reads were used for contig assembly with Falcon and Canu. The final assemblies, following correction and polishing with SMRT and Illumina data, are summarized in Table 1. SMRT data were used for self-correction of the contig sequences, and short reads were used to further polish the contigs and correct indel and single nucleotide variant (SNV) errors. Three iterations of error correction can greatly improve assembly quality. The assembly statistics support the reliability of the assembly results (Table 1).

Table 1.

Falcon and Canu Genome Assembly Results for PK, SX, and MD

                        Falcon v0.4                  Canu v2.0
                        PK       SX       MD         PK       SX       MD
Assembly size (Mb)      1,100    1,089    1,134      1,231    1,206    1,267
Number of contigs       3,691    4,150    2,959      7,482    7,030    6,693
Contig N50 (Mb)         1.21     1.01     2.27       1.49     1.23     1.82
GC (%)                  41.78    41.67    41.86      42.00    41.94    42.01
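For readers unfamiliar with the summary statistics in Table 1, the following minimal Python sketch shows how a contig N50 and a GC percentage are conventionally computed. It is an illustration only: the function names and the toy inputs are hypothetical and are not taken from the assemblies reported here.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


def gc_percent(sequence):
    """Return GC content as a percentage of A/C/G/T bases (Ns are ignored)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / (gc + at)


# Hypothetical toy values, not real contigs.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
print(gc_percent("ATGCGCNNAT"))
```

In practice these statistics are usually reported by the assembler or by a dedicated assembly-evaluation tool; the sketch only makes the definitions explicit.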

Identifying and Expanding W-Linked Coding Content in Duck

A multipronged strategy that utilized genome, transcriptome, and molecular experiments was used to identify previously unknown W-linked coding content in ducks, which greatly expanded the existing catalog as 24 new W-linked genes were identified. Ultimately, 35 W-linked genes were identified, most of which have previously been reported in chickens (Table 2). In addition to protein-coding genes, a few W-linked genes have been annotated as “noncoding RNA” and “small nucleolar RNA.” Although these genes cannot be verified by expression abundance, they are located in the same contigs as the other W-linked genes and were verified using polymerase chain reaction (PCR); because the correct PCR bands appeared only in females, these genes were retained in this study. Gene names and IDs are consistent with the annotation file of the reference genome (GCA_003850225.1). The genomic location gives the chromosome(s) where the gene sequences can be found in the duck reference genome (GCA_003850225.1). Interestingly, a potential W-linked gene, HNRNPK, was not found in our BLAST results, and its expression profile did not meet the conditions for sex-limited expression. Although HNRNPK has been identified as a W-linked gene in many bird species, we suspect that its absence here may result from an incorrect annotation in the reference genome or from the Z and W homologs being too similar to distinguish; therefore, we removed it from subsequent analyses.

Table 2.

Anas platyrhynchos W-linked Genes and Their Z-Linked Gametologs

Name                            W-Linked Gene ID^1      Z-Linked Gene ID^1  Position on Z (Mb)
CHD1 (known)                    101804420               101801287           49.66
KCMF1 (known)                   101805329               101793658           47.53
MIER3 (known)                   101793178               101795833           17.14
NIPBL (known)                   101793805               101802956           11.35
RASA1 (known)                   101805096               101801452           71.34
SPIN1 (known)                   101799270               -                   -
UBAP2 (known)                   101790407               101805008           7.84
UBE2R2 (known)                  113840994               101795475           7.80
VCP (known)                     101793310               101802671           8.79
ZSWIM6 (known)                  113842241, 101797738    101802677           18.65
ZFR (known)                     101800730               101801470           9.86
MAP3K1 (newly identified)       101801490               -                   -
GNAQ (newly identified)         101790479               101794123           35.92
SNORD121A (newly identified)    113841760, 113841761    -                   -
STOML2 (newly identified)       106020385               101802102           8.84
GOLPH3 (newly identified)       113842275, 113842276    101792495           9.77
CZH18ORF25 (newly identified)   101796452               101795462           2.98
KIF2A (newly identified)        101797743               101795197           18.97
GVQW3 (newly identified)        106020314               -                   -
GPBP1 (newly identified)        101795454               113840191           17.12
SMAD2 (newly identified)        101803787               101805334           2.33
RNF165 (newly identified)       101795545               101792499           2.94
SCAMP1 (newly identified)       113840869               101805238           22.05
PIAS2 (newly identified)        101789923               101800058           2.67
DNAJB5 (newly identified)       101800405               -                   -
FAM219A (newly identified)      101790734               101803136           8.10
SKOR2 (newly identified)        113842240               -                   -
SMAD7 (newly identified)        110352107               113840330           2.03
HINT1 (newly identified)        113841927               101798532           43.44
SREK1 (newly identified)        113841302               101800668           19.73
Novel (newly identified)        113841129               -                   -
RPTC15L (newly identified)      113841554               -                   -
Novel (newly identified)        113841759               -                   -
Novel (newly identified)        110352428               -                   -
Novel (newly identified)        113841183               -                   -

^1 LOC identifiers (GCA_003850225.1). A dash indicates that no entry is given.

Female-Specific Selection and Duck W Chromosome Expression

The evolution of W chromosome expression was assessed. After removing genes with low expression levels (transcripts per million [TPM] < 1), a total of 28 genes remained with which to assess the response of the W chromosome to female-specific selection (fig. 2A). The W-linked genes were expressed similarly in the wild population (MD) and the meat-type population (PK), whereas the egg-type population (SX), which is under elevated female-specific selection, clustered separately (fig. 2). Of the 28 expressed W-linked genes, nine showed higher average expression in the female-selected SX than in MD, while 14 loci were expressed at lower levels in SX than in MD (P < 0.05, χ² test). More W-linked genes were upregulated in PK relative to MD: 16 of the 28 genes were expressed at higher levels, and only eight loci were expressed at lower levels (P < 0.05, χ² test). Expression patterns similar to those of W-linked genes were also detected for autosomal genes in both sexes. The differences in Euclidean clustering between the W-linked genes and female-biased genes indicate their disparate evolutionary trajectories (fig. 2B and C).

Fig. 2.



Heatmaps showing hierarchical clustering of gene expression patterns. (A) Expressed W-linked genes in females (n = 28); (B) autosomal female-biased genes in females (>2-fold change level across all breeds, n = 46); and (C) autosomal genes averaged across males and females (n = 9,176). Heatmaps are based on the average breed expression.

The contrast in transcript abundance of the W-linked genes between breeds did not appear to reflect the expected expression effects of sex-specific selection; specifically, the RNA-sequencing (RNA-seq) results did not show the significant increase in expression expected in the female-selected SX. Thus, the sensitivity of W chromosome expression to changes in the intensity of sex-specific selection may be low.

Stable Matrilineal Inheritance of the W Chromosome and mtDNA

Although not physically linked, the W chromosomes and mitochondrial genomes are hypothesized to be coinherited matrilineally. To test this, we constructed maximum-likelihood (ML) cladograms using ten individuals and their data from both mtDNA and NRW sequences to determine their evolutionary consistency. Both markers yielded a fully resolved tree with identical topology (fig. 3), which is in line with the supposed coinheritance association of W and mtDNA.

Fig. 3.



ML trees representing the NRW (left) and mtDNA (right) gene genealogies. Bootstrap percentages for ML trees (100 replicates) are shown above the branches. Different rectangular box areas represent different clusters.

Z Chromosome Formed by Four Independent Recombination Suppression Events

The Z–W gametolog list was expanded in this study and now includes 24 genes (Table 2), double the 12 gametologs in the previous catalog (Wright et al. 2014). These genes were subsequently used to estimate dS, which was then spatially mapped on the Z chromosome to reconstruct the history of recombination suppression in ducks. Overall, the gametologs were clustered into four groups (fig. 4, Table S1), representing putative strata. Importantly, dS estimates differed significantly among the putative strata (P < 0.05, t-test). The k-means clustering of dS into four groups returned identical stratum boundaries.

Fig. 4.



Evolutionary history of the recombination suppression strata for the duck Z chromosome. The distribution on the Z-linked physical position of the synonymous divergence estimates (dS) for gametologs is shown. The 95% confidence intervals are based on 1,000 bootstrap replicates. AZ, GZ, MZ, OZ, and TZ indicate A. platyrhynchos, G. gallus, M. gallopavo, O. jamaicensis, and T. guttata Z-linked genes, respectively. AW, GW, MW, and OW indicate A. platyrhynchos, G. gallus, M. gallopavo, and O. jamaicensis W-linked genes, respectively. The physical Z-linked positions of the loci on the gene trees are indicated by *.

Using a molecular clock based on the mutation rate of avian Z–W orthologs, we estimated the period during which recombination suppression occurred in each stratum. Divergence time estimates were clustered into four groups based on the stratum, corresponding to 88–125 Ma (Stratum I, dS = 0.332–0.473), 78 Ma (Stratum II, dS = 0.297), 37–70 Ma (Stratum III, dS = 0.140–0.269), and 15–51 Ma (Stratum IV, dS = 0.060–0.195; fig. 4).
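The conversion from dS to divergence time can be illustrated with a short Python sketch. It assumes the clock rate of 3.8 × 10⁻⁹ substitutions/site/year stated in the Materials and Methods; the function and variable names are hypothetical, and the printed values may differ by roughly a million years from the ranges quoted above, presumably because of rounding.

```python
# Clock rate from the Materials and Methods (Z rate 2.6e-9 plus W rate 1.25e-9,
# reported there as approximately 3.8e-9 substitutions/site/year).
CLOCK_RATE = 3.8e-9


def divergence_time_ma(ds, rate=CLOCK_RATE):
    """Convert synonymous divergence (dS) to divergence time in millions of years."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return ds / rate / 1e6


# dS ranges per stratum as reported in the text.
STRATUM_DS = {
    "I": (0.332, 0.473),
    "II": (0.297, 0.297),
    "III": (0.140, 0.269),
    "IV": (0.060, 0.195),
}

for name, (low, high) in STRATUM_DS.items():
    print(f"Stratum {name}: {divergence_time_ma(low):.1f}-"
          f"{divergence_time_ma(high):.1f} Ma")
```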

Discussion

Recombination suppression has occurred in avian sex chromosomes over approximately 130 Myr (Fridolfsson et al. 1998; Pigozzi 1999; Ellegren and Carmichael 2001; Wright et al. 2012). The female-limited W chromosome is a useful contrast to male-limited Y chromosomes and allows us to understand the consequences of female-specific selection and maternal inheritance (Rice 1984; Connallon et al. 2010; Smeds et al. 2015). However, difficulties in identifying W-linked genes have limited our understanding of this process. A sufficiently large data set of W-linked content is expected to provide a more complete temporal overview of sex-linked evolution. Based on the assembled genome contigs from third-generation sequencing and 12 known W-linked genes (Wright et al. 2014), we developed a coherent method to identify new W-linked genes and their Z-linked orthologs that combines BLAST-based sequence alignment, expression-based verification, and PCR-based molecular verification. The expanded data set contains 35 W-linked genes and 24 gametologs; not every W-linked gene could be matched with a Z-linked ortholog in this study. In addition to missing annotation information for related genes, one reason for this is that we adopted strict standards to reduce false positives and thus enable the identification of true gametologs.

Expression of W Genes

The domestic duck originated from MD more than 2,200 years ago. Following domestication, some duck breeds were selected explicitly for increased female fecundity (eggs, representing increased female-specific selection), whereas others were selected only for growth traits (meat, representing relaxed female-specific selection) (Zhang, Jia, Almeida, et al. 2018; Zhou et al. 2018). We used three duck populations under different intensities of female-specific selection to explore the effects of selection strength on the expression of W-linked loci. No significant positive correlation was identified between the magnitude of female-specific selection and the transcript abundance of the identified W-linked genes; in brief, duck W chromosome expression was not found to be sensitive to this selection. We did not observe the increase in expression in the egg-selected breed that was previously observed in chickens from layer lines (Moghadam et al. 2012), and there may be several reasons for this. First, increased expression in chickens was observed at embryonic day 19 (ed19), a developmental stage previously shown to be under strong selection in females (Schoenmakers et al. 2009). Importantly, this is not the case for expression data taken from other stages of development (Mank et al. 2010). It is possible that expression at ed25, the age sampled in ducks, is less important for female fecundity than expression at ed19. Second, egg production and other female fitness traits may be achieved through other genomic routes, such as gene sequence variation, that do not involve W expression. Additionally, it is possible that selection for egg production in ducks is weaker than artificial selection in industrial chicken breeds, which produce orders of magnitude more eggs than meat or wild populations. To avoid expression interference caused by environmental or acquired factors, duck samples at the late embryonic stage were used. Strictly speaking, this sampling choice may limit our conclusions on expression stability to the early developmental stage of ducks.

Coinheritance of W Chromosome and Mitochondrial Genome

The maternal inheritance pattern of the sex-limited W chromosome is shared by mtDNA, although the two molecules are not physically linked. Under strict maternal coinheritance, the W chromosome and mtDNA are expected to have evolved along almost the same path. In some rare cases, paternal mitochondria enter the ovum during fertilization and pass on their mtDNA to the offspring; thus, mtDNA is not always inherited exclusively from the mother (Luo et al. 2018). Such occasional paternal contributions of mtDNA, or recombination in either molecule, would weaken this evolutionary concordance. Our finding that the NRW and mtDNA trees have identical topologies and gene genealogies indicates a coinheritance association between these two maternally inherited genomes and their evolutionary stability in the duck, consistent with the prevailing view (Ellegren 2007).

Evolutionary History of Z–W Recombination Suppression

The expanded catalog of Z–W orthologs was used to refine our understanding of sex chromosome recombination suppression by defining strata based on divergence and the molecular clock (Lahn and Page 1999). The results revealed at least four strata along the duck Z chromosome in which recombination was halted between the Z and W chromosomes. The same stepwise suppression pattern has been observed in Galliformes and songbirds (Wright et al. 2014; Xu et al. 2019).

Our data suggest that recombination ceased first in the 45- to 52-Mb region of the Z chromosome, 88–125 Ma (Stratum I). In rapid succession, recombination suppression occurred in two additional regions: Stratum II arose 78 Ma, and Stratum III arose 37–70 Ma. Stratum IV was the youngest stratum, corresponding to the final suppression event, 15–51 Ma. While previous results identified only three strata, our expanded W catalog has allowed us to resolve the youngest region into two distinct strata (III and IV). The identification of additional Z–W gametologs may thus reveal additional strata.

Stratum IV is the youngest of our recovered strata, and although the ML gene trees show that evolutionary patterns are conserved in Strata I–III, the gene tree topologies in Stratum IV are more variable. In particular, the gene tree of NIPBL suggests that recombination suppression between Z and W predates the divergence of Galliformes and Anseriformes, which is estimated to have occurred 90 Ma (Van Laere et al. 2008), while the gene tree of MADH2 (listed as SMAD2 in Table 2) suggests that Z–W recombination suppression occurred well after the split between Galliformes and Anseriformes. This may be due to gene-specific variation in the underlying molecular clock, additional cryptic strata within Stratum IV, or possible gene conversion between some Z–W orthologs, as has previously been observed in chickens. Although a divergence date has been estimated for the Anseriformes-specific Stratum IV using the molecular clock, given the imprecise estimates of the mutation rate and the evidence that six gametologs have dS < 0.01, we conjecture that this is a very recent divergence or even an ongoing recombination suppression event. Previous research (Wright et al. 2014) found evidence of significant intragenic gene conversion, which indicates the possibility of ongoing recombination between the gametologs and led to the hypothesis of a fourth stratum; our results confirm that the “Anseriform-specific Stratum III” of that study is actually made up of two distinct strata. While a large number of W-linked genes have been conserved across 90 Myr of avian evolution, a comparison of W chromosome lengths and gene numbers suggests that the chicken W chromosome retains more coding content, possibly reflecting a lower degeneration rate than that of the duck W chromosome. The formation of Stratum IV in ducks over a longer time span may explain this “elimination” effect. The W chromosome is therefore expected to degenerate further.

Taken together, our results demonstrate the feasibility of in silico subtraction of multiomic data to identify sex-linked genes. Given the multicopy nature and high sequence similarity of avian sex chromosomes (Rogers et al. 2021), this interlocking approach provides a reliable and efficient method by which to detect Z–W orthologs and can be used to study the evolutionary history of sex chromosomes. The W-linked expression pattern, which appears unrelated to sex-specific selection at the early developmental stage, and the strict genetic concordance at the genomic level indicate the evolutionary stability of the duck sex chromosome system compared with other animals. The use of gametologs and related genomic and sequence variant information enabled sex chromosome strata to be identified, reflecting the temporal process of recombination suppression in ducks. Our results indicate that there are at least four strata spanning approximately 110 Myr, including three strata (Strata I, II, and III) conserved across the Galloanserae and one younger Anseriform-specific Stratum IV.

Material and Methods

Samples

A combined approach based on sequence similarity, RNA-seq data, and PCR was used to identify W-linked genes based on our current knowledge of W-linked genomic material (fig. 1), from two domestic duck breeds and one wild duck breed. The domesticated ducks included one meat-type breed, PK (n = 1), for which female-specific selection has been relaxed as its reproductive performance has been ignored in favor of meat production (Zhang, Jia, Chen, et al. 2018), and one egg-type breed, SX (n = 1), for which elevated female-specific selection has occurred because of its excellent reproductive performance (Zhu et al. 2021). MD (n = 1) is a wild breed that has not been subjected to artificial sex-specific selection. All three female ducks were sampled as adults, and genomic DNA was extracted and sequenced using a combination of two technologies: SMRT sequencing with 42× coverage per individual on the PacBio Sequel platform (Pacific Biosciences) and paired-end sequencing with 60× coverage on the Illumina HiSeq platform. RNA-seq data for expression level verification was obtained from the same populations (MD, PK, and SX), and in each case, the left gonad was collected on embryonic day (ed)25 separately from five males and five females and stored in RNAlater (Invitrogen, Carlsbad, CA, USA) until the total RNA was extracted using the TRIzol reagent (Invitrogen, Carlsbad, CA, USA). Total RNA was sequenced on an Illumina NovaSeq 6000 (Illumina Inc., San Diego, CA, USA) with 150-bp paired-end reads.

Raw SMRT data were filtered using SMRTlink v5.0, with the following criteria: minLength = 50 and minReadScore = 0.8. The Illumina raw data were filtered according to the default parameters of Fastp v0.20.1 (Chen et al. 2018).

Genome Assembly

To expand the data set to validate putative W-linked genes and improve the reliability of the results, the MD, PK, and SX genomes were assembled separately, using Falcon v0.4 and Canu v2.0 (Koren et al. 2017) for the initial assembly. All PacBio SMRT reads were reused to improve the initial assembly with Blasr v5.3.3 (Chaisson and Tesler 2012) and the Arrow tools from GenomicConsensus v2.3.3 (https://github.com/PacificBiosciences/). Illumina paired-end reads were used to polish the corrected contigs with BWA-mem v0.7.17 (Li and Durbin 2009) and Pilon v1.23 (Walker et al. 2014).

Identification, Verification, and Female-Specific Selection of W-Linked Genes

For ducks, a limited catalog of 12 W-linked genes was previously identified, and this was used to identify additional W-linked genes in this study. First, we used BLASTN with “best hit only” with these 12 known genes to identify potential W-linked contigs in the six contig-level genome assemblies and retained only contigs that included the full-length protein-coding regions. We then BLASTed these “putative W contigs” to the reference genome (GCA_003850225.1) and took the intersection of the genes identified in the contigs for the three individuals as the putative W-linked genes.
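The intersection step described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that the BLAST results have already been reduced to one set of gene names per individual; the function name and the placeholder gene names are hypothetical.

```python
def shared_genes(genes_by_individual):
    """Return the genes found in the putative W contigs of every individual."""
    gene_sets = [set(genes) for genes in genes_by_individual.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical per-individual gene sets (placeholder names, not real results).
candidates = {
    "MD": {"gene_a", "gene_b", "gene_c"},
    "PK": {"gene_a", "gene_b", "gene_d"},
    "SX": {"gene_a", "gene_b"},
}
print(sorted(shared_genes(candidates)))
```

Requiring a gene to appear in all three individuals is conservative: a gene missed in one assembly is dropped, which lowers false positives at the cost of sensitivity.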

The RNA-seq data were mapped to the reference genome (GCA_003850225.1) using Hisat2 (Kim et al. 2015), requiring both reads of a pair to map concordantly to the reference sequence and removing mismatched alignments, to avoid mistakenly identifying Z-linked genes with a high similarity to W-linked genes. Stringtie (Pertea et al. 2015) was used to estimate transcript abundance. TPM values were calculated for males and females to normalize expression abundance. Genes with TPM < 1 in all samples were considered unexpressed and discarded. We defined genes with TPM_male < 5 and TPM_female/TPM_male > 10 as strongly female-biased genes, and genes expressed only in females (TPM_male < 1 and TPM_female > 1) as female-limited genes. Genes of these two types were treated as putative W-linked genes validated by their expression profiles. These putative genes were further verified to be located on the W chromosome through PCR. Specifically, primers were designed using Primer5 based on the exon sequences of these genes, and the genes were amplified from DNA samples of both sexes. When females yielded a band of the target length and males did not, the gene was considered W-linked. Genes that met the requirements of high sequence similarity, transcriptome verification, and PCR verification were regarded as high-confidence W-linked genes. Global transcriptomic patterns were visualized via hierarchical clustering of the expression levels of the W-linked genes using Euclidean distance and complete linkage in RStudio.
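The expression thresholds above can be written as a small classification function. The Python sketch below is illustrative only: it assumes one summary TPM value per sex for each gene (how replicates were combined is not specified here), and the function name and category labels are hypothetical.

```python
def classify_expression(tpm_female, tpm_male):
    """Classify a gene using the TPM thresholds stated in the text.

    Assumes one summary TPM value per sex; this is an assumption of the sketch.
    """
    if tpm_female < 1 and tpm_male < 1:
        return "unexpressed"
    # Female-limited: expressed only in females.
    if tpm_male < 1 and tpm_female > 1:
        return "female-limited"
    # Strongly female-biased: low male expression and a >10-fold female excess.
    if 0 < tpm_male < 5 and tpm_female / tpm_male > 10:
        return "female-biased"
    return "other"


# Hypothetical TPM values for illustration.
print(classify_expression(12.0, 0.0))   # female-limited
print(classify_expression(40.0, 2.0))   # female-biased
print(classify_expression(8.0, 6.0))    # other
```

Genes in either of the two female-enriched categories would then proceed to PCR verification, as described in the text.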

Sex Chromosome Strata

The identification of strata on the Z chromosome relies on the gametologs of the Z and W chromosomes. Wright et al. (2014) identified 12 Z–W gametologs in ducks using an older version of the reference genome (GCA_000355885.1) (fig. 1). To identify additional gametologs, we extracted the longest transcript from the reference genome GTF file and BLASTed our newly identified putative W-linked sequences reciprocally against the Z-linked genes. Gene pairs with the best BLAST hits were retained as gametologs. For genes with multiple Z or W paralogs, we used the method outlined by Wright et al. (2012). In these cases, synonymous divergence (dS) was calculated individually for each potential gametolog pair, and the pair with the lowest dS was retained to minimize any problems caused by relaxed or diversifying selection acting on the gene copies.
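The two selection rules described above (reciprocal best hits, then the lowest-dS pair among paralogs) can be sketched as follows. This is a simplified illustration, assuming the BLAST output has already been reduced to one best hit per query; the function names and gene labels are hypothetical.

```python
def reciprocal_best_hits(best_w_to_z, best_z_to_w):
    """Keep W-Z pairs in which each gene is the other's best hit."""
    return {
        w: z
        for w, z in best_w_to_z.items()
        if best_z_to_w.get(z) == w
    }


def lowest_ds_pair(candidates):
    """From (w_gene, z_gene, dS) tuples for one gene family, keep the lowest dS."""
    if not candidates:
        raise ValueError("no candidate pairs supplied")
    return min(candidates, key=lambda pair: pair[2])


# Hypothetical best-hit tables and dS values for illustration.
w_to_z = {"W_geneA": "Z_geneA", "W_geneB": "Z_geneC"}
z_to_w = {"Z_geneA": "W_geneA", "Z_geneC": "W_geneD"}
print(reciprocal_best_hits(w_to_z, z_to_w))

paralog_pairs = [("W_geneE", "Z_geneE1", 0.31), ("W_geneE", "Z_geneE2", 0.12)]
print(lowest_ds_pair(paralog_pairs))
```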

We next estimated dS for each Z–W gametolog pair as a measure of relative time since recombination suppression. For true gametologs, the coding sequences (CDS) were identified and translated into protein sequences using the gffread function in Cufflinks v2.2.1. Translated protein sequences were aligned with MUSCLE in MEGA-X (Edgar 2004; Kumar et al. 2018), and poorly aligned regions were removed after visual inspection of the alignment. The aligned protein sequences were used to guide the alignment of the CDS using the pal2nal.pl script. ML estimates of synonymous (dS) and nonsynonymous (dN) divergence were calculated from the aligned CDS with the CODEML program of PAML (Yang 2007), using pairwise comparisons and 1,000 bootstrap replicates.
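The idea behind protein-guided codon alignment can be shown with a short Python sketch. It is a simplified illustration of the principle, not the pal2nal.pl script itself: it assumes the CDS is in frame and corresponds residue by residue to the ungapped protein, and it does not check the translation. The function name and sequences are hypothetical.

```python
def thread_codons(aligned_protein, cds):
    """Map an ungapped, in-frame CDS onto a gapped protein alignment row.

    Each residue consumes one codon; each protein gap becomes '---'.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) < 3 * n_residues:
        raise ValueError("CDS is too short for the aligned protein")
    codons = []
    pos = 0
    for aa in aligned_protein:
        if aa == "-":
            codons.append("---")
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


# Hypothetical toy sequences: protein row 'MK-L' with CDS ATG AAA CTG.
print(thread_codons("MK-L", "ATGAAACTG"))
```

Because gaps are inserted in whole codons, the reading frame is preserved, which is what codon-based estimators of dN and dS require.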

In addition to considering dS, physical location, and gene density on the Z chromosome, we determined the number and boundaries of strata by k-means clustering of the dS estimates into defined groups using RStudio. The k-means function and the flexclust package (An and Mattausch 2013) were used for the analysis, the estimated K = 3 was used as an input parameter, and a t-test was used to verify the results. Divergence dates for gametologs were calculated using a molecular clock based on previous work on Galliformes, showing that the mutation rate for the Z chromosome is 2.6 × 10⁻⁹/site/year and that for the W chromosome is 1.25 × 10⁻⁹/site/year (roughly half the autosomal mutation rate). The combined mutation rate of the Z–W orthologs was taken to be approximately 3.8 × 10⁻⁹/site/year (approximately the sum of the Z and W rates), and the divergence time was calculated as dS/(3.8 × 10⁻⁹) years (Dimcheff et al. 2002; Hedges 2002; Axelsson et al. 2004; Berlin and Ellegren 2006).
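To make the clustering step concrete, the sketch below implements a plain one-dimensional k-means (Lloyd's algorithm) in Python. It is not the R kmeans/flexclust analysis used in this study: the deterministic initialization, the function name, and the toy dS values are all assumptions made for illustration, and the number of groups is left as a parameter.

```python
def kmeans_1d(values, k, max_iter=100):
    """Cluster 1D values into k groups with Lloyd's algorithm.

    Initial centres are spread evenly across the sorted data so that the
    result is deterministic (an assumption of this sketch).
    """
    data = sorted(values)
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and the number of values")
    if max_iter < 1:
        raise ValueError("max_iter must be at least 1")
    centres = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster happens to be empty.
        new_centres = [
            sum(c) / len(c) if c else centres[j]
            for j, c in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return centres, clusters


# Hypothetical dS values for illustration, not the study data.
toy_ds = [0.06, 0.08, 0.10, 0.30, 0.31, 0.45, 0.47]
centres, clusters = kmeans_1d(toy_ds, k=3)
print(clusters)
```

Because the data are one-dimensional and sorted, each cluster is a contiguous range of dS values, so cluster edges can be read directly as candidate stratum boundaries.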

To assess the degree of recombination suppression for each stratum and to verify the accuracy of the divergence time estimates, ML gene trees were constructed using MEGA-X (Kumar et al. 2018) from the gametologs of five bird species (Gallus gallus, Meleagris gallopavo, Anas platyrhynchos, Oxyura jamaicensis, and Taeniopygia guttata), with the Z-linked orthologs used as the outgroup in all cases and sequences aligned following the same methodology as described above. The resulting tree files were visualized using FigTree (http://tree.bio.ed.ac.uk/software/FigTree/).

Evolutionary Pattern of Mitochondria and W Chromosome Coinheritance

The development of a comprehensive W-linked gene catalog allowed us, for the first time, to determine whether the W-linked coding content and mtDNA followed precise maternal coinheritance. The resequencing data for multiple breeds from a previous study were downloaded from NCBI (Zhang, Jia, Almeida, et al. 2018). We randomly chose ten female individuals representing three breeds (MD, n = 3; PK, n = 4; and SX, n = 3) to verify the inheritance pattern, and we expected the genealogies of individuals to be identical between the two types of markers. Specifically, these next-generation sequencing (NGS) data were aligned to the reference genome to obtain single nucleotide polymorphism (SNP) data, and we extracted those corresponding to the W chromosome and mitochondrial sequences. ML cladograms for these SNPs were constructed, corresponding to ten individuals from both mtDNA and W gene sequences. mtDNA was obtained from the reference genome. Protein-coding content was retained to ensure the accuracy of the tree construction.
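Whether two cladograms share the same topology can be checked by comparing the sets of clades they contain. The Python sketch below does this for rooted trees written as nested tuples; it is an illustration under stated assumptions rather than the procedure used in this study (for unrooted ML trees, bipartitions would be compared instead), and the sample labels are hypothetical.

```python
def clades(tree):
    """Return the set of clades (as frozensets of leaf names) in a rooted tree.

    A tree is a nested tuple whose leaves are strings.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def same_topology(tree_a, tree_b):
    """Two rooted trees have the same topology if they contain the same clades."""
    return clades(tree_a) == clades(tree_b)


# Hypothetical sample labels; branch order within a node does not matter.
w_tree = ((("MD1", "MD2"), "PK1"), ("SX1", "SX2"))
mt_tree = (("SX2", "SX1"), ("PK1", ("MD2", "MD1")))
other_tree = ((("MD1", "PK1"), "MD2"), ("SX1", "SX2"))

print(same_topology(w_tree, mt_tree))
print(same_topology(w_tree, other_tree))
```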

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).

Acknowledgments

We gratefully acknowledge our colleagues in the Poultry Team at the National Engineering Laboratory for Animal Breeding of China Agricultural University for their assistance with sample collection and helpful comments on the manuscript. This research was supported by the High-performance Computing Platform of China Agricultural University. This work was supported by the Beijing Innovation Team of the Modern Agro-industry Technology Research System for Poultry (BJJQ-G01-2022).

Contributor Information

Hongchang Gu, Department of Animal Genetics and Breeding, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China; Institute of Animal Husbandry and Veterinary Medicine, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China.

Junhui Wen, Department of Animal Genetics and Breeding, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China.

Xiurong Zhao, Department of Animal Genetics and Breeding, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China.

Xinye Zhang, Department of Animal Genetics and Breeding, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China.

Xufang Ren, Department of Animal Genetics and Breeding, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China.

Huan Cheng, Department of Animal Genetics and Breeding, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China.

Lujiang Qu, Department of Animal Genetics and Breeding, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China.

Author Contributions

H.G. contributed to the conception and design of the study, analyzed and interpreted the sequencing data, and drafted and revised the manuscript. J.W., X.Z., and X.Z. contributed reagents, materials, and analysis tools. X.R. and H.C. provided suggestions for the improvement of the study. L.Q. conceived the study, participated in its design and coordination, and approved the publication of the manuscript. All the authors have read and approved the final manuscript.

Data Availability

PacBio SMRT data and Illumina data for genome assembly and polishing are accessible at the National Center for Biotechnology Information (NCBI) under BioProject accession numbers SRP393835 and SRP392982. All RNA-seq data are accessible at NCBI under BioProject accession number PRJNA978377. The resequencing data for identifying maternal inheritance were deposited in the NCBI Sequence Read Archive under accession number SRP125660. The preliminary assembled genome is accessible at NCBI under BioProject accession number PRJNA1017119. The code generated during the current study is available from the author on reasonable request.

Ethics

Our animal experiments were approved by the Animal Care and Use Committee of China Agricultural University (Approval ID: XXCB-20090209). All the animals were fed and handled according to the regulations and guidelines established by this committee, and all efforts were made to minimize suffering.

Literature Cited

  1. An F, Mattausch HJ. 2013. K-means clustering algorithm for multimedia applications with flexible HW/SW co-design. J Syst Archit. 59:155–164.
  2. Axelsson E, Smith NGC, Sundstrom H, Berlin S, Ellegren H. 2004. Male-biased mutation rate and divergence in autosomal, Z-linked and W-linked introns of chicken and turkey. Mol Biol Evol. 21:1538–1547.
  3. Bachtrog D. 2008. The temporal dynamics of processes underlying Y chromosome degeneration. Genetics 179:1513–1525.
  4. Bachtrog D, Charlesworth B. 2002. Reduced adaptation of a non-recombining neo-Y chromosome. Nature 416:323–326.
  5. Bellott DW, et al. 2017. Avian W and mammalian Y chromosomes convergently retained dosage-sensitive regulators. Nat Genet. 49:387–394.
  6. Berlin S, Ellegren H. 2001. Evolutionary genetics: clonal inheritance of avian mitochondrial DNA. Nature 413:37–38.
  7. Berlin S, Ellegren H. 2006. Fast accumulation of nonsynonymous mutations on the female-specific W chromosome in birds. J Mol Evol. 62:66–72.
  8. Berlin S, Smith NGC, Ellegren H. 2004. Do avian mitochondria recombine? J Mol Evol. 58:163–167.
  9. Berlin S, Tomaras D, Charlesworth B. 2007. Low mitochondrial variability in birds may indicate Hill-Robertson effects on the W chromosome. Heredity (Edinb). 99:389–396.
  10. Chaisson MJ, Tesler G. 2012. Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory. BMC Bioinformatics 13:238.
  11. Charlesworth D, Charlesworth B, Marais G. 2005. Steps in the evolution of heteromorphic sex chromosomes. Heredity (Edinb). 95:118–128.
  12. Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:884–890.
  13. Connallon T, Cox RM, Calsbeek R. 2010. Fitness consequences of sex-specific selection. Evolution 64:1671–1682.
  14. Dimcheff DE, Drovetski SV, Mindell DP. 2002. Phylogeny of Tetraoninae and other galliform birds using mitochondrial 12S and ND2 genes. Mol Phylogenet Evol. 24:203–215.
  15. Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.
  16. Ellegren H. 2007. Molecular evolutionary genomics of birds. Cytogenet Genome Res. 117:120–130.
  17. Ellegren H. 2010. Evolutionary stasis: the stable chromosomes of birds. Trends Ecol Evol. 25:283–291.
  18. Ellegren H, Carmichael A. 2001. Multiple and independent cessation of recombination between avian sex chromosomes. Genetics 158:325–331.
  19. Fridolfsson AK, et al. 1998. Evolution of the avian sex chromosomes from an ancestral pair of autosomes. Proc Natl Acad Sci U S A. 95:8147–8152.
  20. Handley LL, Ceplitis H, Ellegren H. 2004. Evolutionary strata on the chicken Z chromosome: implications for sex chromosome evolution. Genetics 167:367–376.
  21. Hedges SB. 2002. The origin and evolution of model organisms. Nat Rev Genet. 3:838–849.
  22. Jackson BC. 2011. Recombination-suppression: how many mechanisms for chromosomal speciation? Genetica 139:393–402.
  23. Kaiser VB, Charlesworth B. 2010. Muller's ratchet and the degeneration of the Drosophila miranda neo-Y chromosome. Genetics 185:339–348.
  24. Kapun M, Flatt T. 2019. The adaptive significance of chromosomal inversion polymorphisms in Drosophila melanogaster. Mol Ecol. 28:1263–1282.
  25. Kim D, Langmead B, Salzberg SL. 2015. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 12:357–360.
  26. Koren S, et al. 2017. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27:722–736.
  27. Kumar S, Stecher G, Li M, Knyaz C, Tamura K. 2018. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 35:1547–1549.
  28. Lahn BT, Page DC. 1999. Four evolutionary strata on the human X chromosome. Science 286:964–967.
  29. Lange J, et al. 2009. Isodicentric Y chromosomes and sex disorders as byproducts of homologous recombination that maintains palindromes. Cell 138:855–869.
  30. Lemos B, Araripe LO, Hartl DL. 2008. Polymorphic Y chromosomes harbor cryptic variation with manifold functional consequences. Science 319:91–93.
  31. Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25:1754–1760.
  32. Li H-F, et al. 2010. Origin and genetic diversity of Chinese domestic ducks. Mol Phylogenet Evol. 57:634–640.
  33. Luo S, et al. 2018. Biparental inheritance of mitochondrial DNA in humans. Proc Natl Acad Sci U S A. 115:13039–13044.
  34. Mank JE. 2012. Small but mighty: the evolutionary dynamics of W and Y sex chromosomes. Chromosome Res. 20:21–33.
  35. Mank JE. 2017. Population genetics of sexual conflict in the genomic era. Nat Rev Genet. 18:721–730.
  36. Mank JE, Nam K, Brunstrom B, Ellegren H. 2010. Ontogenetic complexity of sexual dimorphism and sex-specific selection. Mol Biol Evol. 27:1570–1578.
  37. Moghadam HK, Pointer MA, Wright AE, Berlin S, Mank JE. 2012. W chromosome expression responds to female-specific selection. Proc Natl Acad Sci U S A. 109:8207–8211.
  38. Nam K, Ellegren H. 2008. The chicken (Gallus gallus) Z chromosome contains at least three nonlinear evolutionary strata. Genetics 180:1131–1136.
  39. Nishimura Y, et al. 2006. Active digestion of sperm mitochondrial DNA in single living sperm revealed by optical tweezers. Proc Natl Acad Sci U S A. 103:1382–1387.
  40. Pertea M, et al. 2015. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. 33:290.
  41. Pigozzi LI. 1999. Origin and evolution of the sex chromosomes in birds. Biocell 23:79–95.
  42. Rice WR. 1984. Sex-chromosomes and the evolution of sexual dimorphism. Evolution 38:735–742.
  43. Rogers TF, Pizzari T, Wright AE. 2021. Multi-copy gene family evolution on the avian W chromosome. J Hered. 112:250–259.
  44. Sager R, Lane D. 1972. Molecular basis of maternal inheritance. Proc Natl Acad Sci U S A. 69:2410.
  45. Schoenmakers S, et al. 2009. Female meiotic sex chromosome inactivation in chicken. PLoS Genet. 5:e1000466.
  46. Smeds L, et al. 2015. Evolutionary analysis of the female-specific avian W chromosome. Nat Commun. 6:7330.
  47. Smith CA, et al. 2009. The avian Z-linked gene DMRT1 is required for male sex determination in the chicken. Nature 461:267–271.
  48. Van Laere A-S, Coppieters W, Georges M. 2008. Characterization of the bovine pseudoautosomal boundary: documenting the evolutionary history of mammalian sex chromosomes. Genome Res. 18:1884–1895.
  49. Walker BJ, et al. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963.
  50. White DJ, Wolff JN, Pierson M, Gemmell NJ. 2008. Revealing the hidden complexities of mtDNA inheritance. Mol Ecol. 17:4925–4942.
  51. Wright AE, Harrison PW, Montgomery SH, Pointer MA, Mank JE. 2014. Independent stratum formation on the avian sex chromosomes reveals inter-chromosomal gene conversion and predominance of purifying selection on the W chromosome. Evolution 68:3281–3295.
  52. Wright AE, Moghadam HK, Mank JE. 2012. Trade-off between selection for dosage compensation and masculinization on the avian Z chromosome. Genetics 192:1433.
  53. Xu LH, et al. 2019. Dynamic evolutionary history and gene content of sex chromosomes across diverse songbirds. Nat Ecol Evol. 3:834–844.
  54. Yang Z. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.
  55. Zhang Z, Jia Y, Almeida P, et al. 2018. Whole-genome resequencing reveals signatures of selection and timing of duck domestication. Gigascience 7:giy027.
  56. Zhang Z, Jia Y, Chen Y, et al. 2018. Genomic variation in Pekin duck populations developed in three different countries as revealed by whole-genome data. Anim Genet. 49:132–136.
  57. Zhou Z, et al. 2018. An intercross population study reveals genes associated with body size and plumage color in ducks. Nat Commun. 9:2648.
  58. Zhu F, et al. 2021. Three chromosome-level duck genome assemblies provide insights into genomic variation during domestication. Nat Commun. 12:5932.

Articles from Genome Biology and Evolution are provided here courtesy of Oxford University Press
