Molecular Biology and Evolution. 2017 Sep 25;34(12):3186–3204. doi: 10.1093/molbev/msx250

Sequence and Structural Diversity of Mouse Y Chromosomes

Andrew P Morgan 1, Fernando Pardo-Manuel de Villena 1,*
PMCID: PMC5850875  PMID: 29029271

Abstract

Over the 180 My since their origin, the sex chromosomes of mammals have evolved a gene repertoire highly specialized for function in the male germline. The mouse Y chromosome is unique among mammalian Y chromosomes characterized to date in that it is large, gene-rich and euchromatic. Yet, little is known about its diversity in natural populations. Here, we take advantage of published whole-genome sequencing data to survey the diversity of sequence and copy number of sex-linked genes in three subspecies of house mice. Copy number of genes on the repetitive long arm of both sex chromosomes is highly variable, but sequence diversity in nonrepetitive regions is decreased relative to expectations based on autosomes. We use simulations and theory to show that this reduction in sex-linked diversity is incompatible with neutral demographic processes alone, but is consistent with recent positive selection on genes active during spermatogenesis. Our results support the hypothesis that the mouse sex chromosomes are engaged in ongoing intragenomic conflict.

Keywords: sex chromosome evolution, intragenomic conflict, mouse evolution

Introduction

Sex chromosomes have emerged many times in independent plant and animal lineages. The placental mammals share a sex chromosome pair that originated ∼180 Ma (Hughes and Page 2015). In the vast majority of mammal species, the Y chromosome is sex-determining: presence of the Y-encoded protein SRY is sufficient to initiate the male developmental program (Berta et al. 1990). Since their divergence from the ancestral X chromosome, mammalian Y chromosomes have lost nearly all of their ancestral gene content (fig. 1A). Although these losses have occurred independently along different lineages within the mammals, the small subset of genes that are retained in each lineage tend to be dosage-sensitive and have housekeeping functions in core cellular processes such as transcription and protein degradation (Bellott et al. 2014; Cortez et al. 2014). Contrary to bold predictions that the mammalian Y chromosome is bound for extinction (Graves 2006), empirical studies of Y chromosomes have demonstrated that most gene loss occurs in early proto-sex chromosomes, and that the relatively old sex chromosomes of mammals are more stable (Bellott et al. 2014). The evolutionary diversity of Y chromosomes in mammals arises from the set of Y-acquired genes, which make up a small fraction of some Y chromosomes and a much larger fraction in others, ranging from 5% in rhesus to 45% in human (Hughes and Page 2015) (fig. 1B). These genes are often present in many copies and are highly specialized for function in the male germline (Lahn and Page 1997; Soh et al. 2014).

Fig. 1.

Evolution of mammalian Y chromosomes. (A) Evolution of heteromorphic sex chromosomes. (B) Y chromosomes of mammals. The Y chromosome of therian mammals, characterized by the sex-determining factor SRY, diverged from the X chromosome ∼180 Ma. (The monotremata have a different sex-determining factor, AMH, and an idiosyncratic five-pair sex chromosome system.) Y chromosome sizes and the fraction of sequence occupied by multicopy, Y-acquired genes are shown at the tips of the tree. (C) Structure of the Y chromosome in the C57BL/6J reference strain. The short arm of the Y chromosome (Yp) consists primarily of genes shared with the X chromosome and retained since the sex chromosomes diverged from the ancestral autosome pair. These genes are interspersed with blocks of segmental duplications (light grey). The sex-determining factor Sry is encoded on the short arm. The long arm (Yq) consists of ∼200 copies of a 500 kb repeating unit containing the acquired genes Sly, Ssty1, Ssty2, and Srsy. The sequence in the repeat unit can be roughly divided into three families (“red,” “yellow,” and “blue”) following Soh et al. (2014). (D) The X chromosome, unlike the Y chromosome, is acrocentric. Homologs of the acquired genes on the Y chromosome (Slx, Slxl1, Sstx, and Srsx; shown above using colored blocks as on the Y) are present in high copy number but are arranged in tandem chunks, rather than intermingled as on the Y.

The Y chromosome of the house mouse (Mus musculus) stands out among mammalian Y chromosomes both for its sheer size and its unusual gene repertoire. Early molecular studies of the mouse Y chromosome hinted that it consisted mostly of repetitive sequences, with copy number in the hundreds, and that it was evolving rapidly (Nishioka and Lamothe 1986; Eicher et al. 1989). Unlike other mammalian Y chromosomes, which are dominated by large blocks of heterochromatin (Hughes and Page 2015), the mouse Y chromosome was also known to be large and almost entirely euchromatic. Spontaneous mutations in laboratory stocks allowed the mapping of male-specific tissue antigens and the sex-determining factor Sry to the short arm of the chromosome (Yp) (McLaren et al. 1988), whereas lesions on the long arm (Yq) were associated with infertility and defects in spermatogenesis (Styrna et al. 1991; Burgoyne et al. 1992; Touré et al. 2004).

Sequencing, assembly and annotation of the mouse Y chromosome in the inbred strain C57BL/6J was finally completed in 2014 after more than a decade of painstaking effort (Soh et al. 2014). Ancestral genes are restricted to Yp and are fewer in number on the mouse Y chromosome than in other studied mammals. Yq was shown to consist of ∼200 copies of a 500 kb unit (the “huge repeat array”) containing the acquired genes Sly, Ssty1, Ssty2, and Srsy (fig. 1C). Sly and its X-linked homologs Slx and Slxl1 are found only in the genus Mus and have sequence similarity to the synaptonemal complex protein SYCP3 (Ellis et al. 2011). Ssty1/2 and Sstx are most similar to members of the spindlin family (Oh et al. 1997) and are present in taxa at least as phylogenetically distant as rats. The coding potential of Srsy and Srsx is unclear, but they have sequence similarity to melanoma-related cancer/testis antigens typified by the human MAGEA family. Their phylogenetic origins remain unresolved. The genes of the huge repeat array are expressed almost exclusively in postmeiotic round spermatids and function in chromatin condensation and sperm maturation (Burgoyne et al. 1992; Touré et al. 2004, 2005; Yamauchi et al. 2009, 2010).

Independent amplification of homologous genes on the X and Y chromosomes is thought to be a byproduct of competition between the X and Y chromosomes for transmission to the next generation. The current consensus favors an unidentified X-linked sex-ratio distorter whose action is suppressed by one or more Y-linked factors (Ellis et al. 2011). Consistent with this hypothesis, SLY acts directly to maintain transcriptional repression of postmeiotic sex chromatin (PSCR; Hendriksen et al. 1995) by recruiting a suite of repressive histone marks (Ellis et al. 2005; Cocquet et al. 2009; Moretti et al. 2016); its action is opposed by SLX and SLXL1. Imbalance between SLY and SLX/SLXL1 tilts the progeny sex ratio in favor of the overexpressing chromosome and causes defects in sperm morphology and sperm count (Touré et al. 2004; Cocquet et al. 2009, 2010). Disruption of PSCR and the related process of meiotic sex chromosome inactivation (MSCI) is also associated with male sterility in intersubspecific hybrids between M. m. domesticus and M. m. musculus (Good et al. 2010; Campbell et al. 2013; Bhattacharyya et al. 2013; Larson et al. 2017). Together these observations suggest that the intragenomic conflict between the sex chromosomes in mouse is played out in postmeiotic spermatids and may have mechanistic overlap with hybrid male sterility.

Intragenomic conflict can have a profound impact on the genetic diversity of sex chromosomes in natural populations. Sex-ratio-distorter systems in Drosophila provide some of the best-known examples (Jaenike 2001; Derome et al. 2004; Kingan et al. 2010). The extent to which diversity on mouse sex chromosomes is influenced by intragenomic conflict remains an open question. The differential impact of selection on the mouse X chromosome versus autosomes (the “faster-X” effect) is well-studied, mostly through the lens of speciation (Torgerson and Singh 2003; Kousathanas et al. 2014; Larson et al. 2016, 2017). Larson et al. (2016) used pairwise comparisons between wild-derived strains of M. m. musculus and M. m. domesticus to show that the “faster-X” effect is most prominent in two groups of genes: those expressed primarily in the testis and early in spermatogenesis (before MSCI), and those up-regulated in spermatids (after PSCR). The former set of genes is also prone to aberrant expression in sterile hybrids (Larson et al. 2017). By contrast, selective pressures imposed by intragenomic conflict between the sex chromosomes should be exerted in spermatids after the onset of PSCR. Genes with spermatid-specific expression are expected to respond most rapidly, whereas those with broad expression are expected to be constrained by putative functional requirements in other tissues or cell types.

In this manuscript, we take advantage of the relatively recent high-quality assembly of the mouse Y chromosome (Soh et al. 2014) and public sequencing data from a diverse sample of wild mice to perform a survey of sequence and copy-number diversity on the sex chromosomes. We use complementary gene-expression data and annotations to partition the analysis into functionally coherent groups of loci. We find that sequence diversity is markedly reduced on both the X and Y chromosomes relative to expectations for a stationary population. This reduction cannot be fully explained by any of several demographic models fit to autosomal data, but Y-linked diversity in M. m. domesticus is consistent with a recent selective sweep on Y chromosomes. Copy number of genes expressed in spermatids supports the hypothesis that intragenomic conflict between the sex chromosomes during spermiogenesis is an important selective pressure. These analyses broaden our understanding of the evolution of sex chromosomes in murid rodents and support an important role for positive selection in the male germline.

Results

A Survey of Y-Linked Coding Variation in Mouse

Whole-genome or whole-exome sequence data for 91 male mice was collected from published sources (Keane et al. 2011; Doran et al. 2016; Harr et al. 2016; Morgan, Didion, et al. 2016; Neme and Tautz 2016; Sarver et al. 2017). The final set consists of 62 wild-caught mice; 21 classical inbred strains; and 8 wild-derived inbred strains (table 1 and supplementary table S1, Supplementary Material online). The three cardinal subspecies of M. musculus (domesticus, musculus, and castaneus) are all represented, with Mus spretus and Mus spicilegus as close outgroups and Mus caroli, Mus cookii, and Nannomys minutoides as more distant outgroups. Our sample spans the native geographic range of the house mouse and its sister taxa (fig. 2A).

Table 1.

Wild and Laboratory Mice Used in This Study.

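Type           Population            Country  Males  Females
Wild           M. m. domesticus      DE       8      1
Wild           M. m. domesticus      FR       7      0
Wild           M. m. domesticus      IR       5      0
Wild           M. m. musculus        AF       5      1
Wild           M. m. musculus        CZ       2      5
Wild           M. m. musculus        KZ       2      4
Wild           M. m. castaneus       IN       3      7
Wild           M. spretus            ES       4      2
Wild           M. spretus            MA       1      0
Wild           M. macedonicus        MK       1      0
Wild           M. spicilegus         HU       1      0
Wild           M. caroli             TH       0      1
Wild           M. cookii             TH       1      0
Wild           Nannomys minutoides   KE       1      0
Wild-derived   M. m. domesticus      IT       1      0
Wild-derived   M. m. domesticus      US       1      1
Wild-derived   M. m. musculus        CZ       1      1
Wild-derived   M. m. castaneus       TH       1      1
Wild-derived   M. spretus            ES       1      0
Classical lab  -                     -        21     4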

Fig. 2.

Patrilineal and matrilineal phylogeography in a geographically diverse sample from the genus Mus. (A) Sampling locations of mice used in this study. (B) Phylogenetic tree from coding sites on the Y chromosome. Samples are colored according to nominal ancestry; laboratory strains are shown in light grey. (C) Phylogenetic tree from coding sites on the mitochondrial genome. Deep nodes with posterior support < 0.9 indicated with shaded circles.

Single-nucleotide variants (SNVs) and small indels were ascertained in 41.6 kb of sequence on Yp targeted by the Roche NimbleGen exome-capture array. To mitigate the effect of alignment errors and cryptic copy-number variation on our analyses, we discarded sites with evidence of heterozygosity, sites with fewer than 60 samples with a called genotype, and sites with evidence of strand bias (see Materials and Methods). In total, we identified 1,136 SNVs and 128 indels, with transition:transversion ratio 2:1.

One group of inbred strains in our data set (C57BL/6J [reference genome], C57BL/10J, C57L/J and C57BR/cdJ) has a known common ancestor in the year 1929, and a common ancestor with the strain C58/J in 1915 (Beck et al. 2000). Assuming an average of three generations per year, the total branch length of the pedigree connecting the C57 and C58 strains is 5,280 generations, during which time three mutations occurred. We used these values to obtain a direct estimate of the male-specific point mutation rate: $1.8 \times 10^{-8}$ (95% Poisson CI $4.5 \times 10^{-9}$ to $4.7 \times 10^{-8}$) per bp per generation. This interval contains the sex-averaged autosomal rate of $5.4 \times 10^{-9}$ per bp per generation recently estimated from whole-genome sequencing of mutation-accumulation lines (Uchimura et al. 2015). Using the ratio of paternal to maternal mutations in mouse estimated in classic studies from Russell and colleagues (2.78; reviewed in Drost and Lee [1995]), that estimate corresponds to a male-specific autosomal rate of $7.9 \times 10^{-9}$ per bp per generation, again within our confidence interval. We note that these estimates assume that selection has been negligible in laboratory colonies.
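The conversion from a sex-averaged to a male-specific rate can be checked with a few lines of arithmetic. The sketch below assumes that the sex-averaged autosomal rate is the simple mean of the male and female germline rates, which gives a male rate of 2α/(1 + α) times the sex-averaged rate, where α is the paternal-to-maternal ratio. The function name is illustrative and is not taken from the original analysis.

```python
def male_specific_rate(mu_sex_averaged: float, alpha: float) -> float:
    """Male-germline mutation rate implied by a sex-averaged autosomal rate.

    Assumes mu_avg = (mu_male + mu_female) / 2 and alpha = mu_male / mu_female,
    so that mu_male = mu_avg * 2 * alpha / (1 + alpha).
    """
    return mu_sex_averaged * 2.0 * alpha / (1.0 + alpha)


# Values quoted in the text: 5.4e-9 per bp per generation, alpha = 2.78.
mu_male = male_specific_rate(5.4e-9, 2.78)
print(f"{mu_male:.2e}")  # 7.94e-09
```

Under this assumption the result rounds to the 7.9 × 10^-9 quoted above.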

Phylogeny of Y Chromosomes Recovers Geographic Relationships

Phylogenetic trees for exonic regions of the Y chromosome and mitochondrial genome were constructed with BEAST (fig. 2B and C). The estimated time to most recent common ancestor (MRCA) of M. musculus Y chromosomes is 900,000 years ago (95% highest posterior density interval [HPDI] 100,000–1,800,000 years ago). Within M. musculus, the domesticus subspecies diverges first, although the internal branch separating it from the MRCA of musculus and castaneus is very short. Consistent with several previous studies, we find that the “old” classical inbred strains share a single Y haplogroup within M. m. musculus. This haplogroup is distinct from that of European and central Asian wild mice and is probably of east Asian origin (Bishop et al. 1985; Nagamine et al. 1992; Tucker et al. 1992). Strains related to “Swiss” outbred stocks (FVB/NJ, NOD/ShiLtJ, HR8) and those of less certain American origin (AKR/J, BUB/BnJ) (Beck et al. 2000) have Y chromosomes with affinity to western European populations. M. m. castaneus harbors two distinct and paraphyletic lineages: one corresponding to the Indian subcontinent and another represented only by the wild-derived inbred strain CAST/EiJ (from Thailand). The latter haplogroup corresponds to a southeast Asian lineage identified in previous reports that sampled more extensively from that geographic region (Geraldes et al. 2008; Yang et al. 2011). It remains unclear whether this haplogroup originated in M. m. musculus and displaced the M. m. castaneus Y chromosome in southeast Asia, or instead represents a deep branching within the (large and unsampled) population ancestral to musculus and castaneus in central Asia.

The Y-chromosome tree otherwise shows perfect concordance between clades and geographic locations. Within the M. m. domesticus lineage we can recognize two distinct haplogroups corresponding roughly to western Europe and Iran and the Mediterranean basin, respectively. Similarly, within M. m. musculus, the eastern European mice (from Bavaria, Czech Republic) are well-separated from the central Asian mice (Kazakhstan and Afghanistan). Relationships between geographic origins and phylogenetic affinity are considerably looser for the mitochondrial genome. We even found evidence for interspecific hybridization: one nominally M. spretus individual from central Spain (SP36) carries a M. spretus Y chromosome and a M. m. domesticus mitochondrial genome (arrowhead in fig. 2B). Several previous studies have found evidence for introgression between M. musculus and M. spretus where their geographic ranges overlap (Orth et al. 2002; Song et al. 2011; Liu et al. 2015).

Copy-Number Variation Is Pervasive on the Y Chromosome

We examined copy number along Yp using depth of coverage. Approximately 779 kb (24%) of Yp consists of segmental duplications or gaps in the reference assembly (fig. 1); for duplicated regions we scaled the normalized read depth by the genomic copy number in the reference sequence to arrive at a final copy-number estimate for each individual. All of the known duplications on Yp are polymorphic in laboratory and natural populations (fig. 3A). The distribution of CNV alleles follows the SNV-based phylogenetic tree. Only one CNV region on Yp, adjacent to the centromere, contains a known protein-coding gene (Rbmy). Consistent with a previous report (Ellis et al. 2011), we find that musculus Y chromosomes have more copies of Rbmy than domesticus or castaneus chromosomes.
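The depth-based scaling described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes per-window read depths are already available, that a set of windows known to be single-copy serves as the baseline, and that reads from a duplicated region are spread evenly over its reference copies. The function and argument names are hypothetical.

```python
from statistics import median


def copy_number(region_depths, single_copy_depths, ref_copies=1):
    """Estimate the copy number of a region from per-window read depths.

    region_depths: depths in windows inside the (possibly duplicated) region
    single_copy_depths: depths in windows assumed single-copy in this sample
    ref_copies: number of copies of the region in the reference assembly
    """
    baseline = median(single_copy_depths)
    if baseline <= 0:
        raise ValueError("single-copy baseline depth must be positive")
    # Depth relative to a single copy, per reference copy of the region.
    normalized = median(region_depths) / baseline
    # Scale by the reference copy number to get copies in the sample.
    return normalized * ref_copies


# Toy depths: 1.5x the baseline over a region present twice in the reference.
print(copy_number([30, 32, 28], [20, 21, 19], ref_copies=2))  # 3.0
```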

Fig. 3.

Copy-number variation on Yp. Schematic view of copy-number variable regions of the Y chromosome short arm (Yp) superposed on SNV-based phylogenetic tree. All CNVs shown overlap a segmental duplication in the reference sequence (strain C57BL/6J). One CNV overlaps a known protein-coding gene: an expansion of the ampliconic Rbmy cluster (green) in M. m. musculus. Color scheme for Mus taxa follows figure 2.

The highly repetitive content of Yq precludes a similarly detailed characterization of copy-number variation along this chromosome arm. However, we can estimate the copy number of each of the three gene families present (Sly, Ssty1/2, and Srsy) by counting the total number of reads mapped to each and normalizing for sequencing depth. The hypothesis of X-Y intragenomic conflict predicts that, if expression levels are at least roughly proportional to copy number, amplification of gene families on Yq should be countered by amplification of their antagonistic homologs on the X chromosome (or vice versa). We tested this hypothesis by comparing the copy number of X- and Y-linked homologs of the Slx/y, Sstx/y, and Srsx/y families in wild mice. Figure 4 shows that copy numbers on the X and Y chromosomes are indeed correlated for Slx/y. The relationship between Slx-family and Sly-family copy number is almost exactly linear (slope = 0.98 [95% CI 0.87–1.09]; $R^2 = 0.87$). We note that samples are not phylogenetically independent, so the statistical significance of the regression is exaggerated, but the qualitative result clearly supports previous evidence that conflict between X and Y chromosomes is mediated primarily through Slx and Sly (Cocquet et al. 2012). Size differences estimated from Sly copy number are also concordant with cytological observations that the Y chromosomes of wild-caught M. m. musculus appear much larger than those of M. spicilegus or M. spretus (Bulatova and Kotenkova 1990; Yakimenko et al. 1990).
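The slope and R² reported for the Slx/Sly comparison are the outputs of a simple linear regression. A minimal ordinary-least-squares sketch is given below; the copy-number values are made-up placeholders rather than data from this study, and it is an assumption that the fit was done on untransformed copy numbers.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical X-linked (Slx family) and Y-linked (Sly family) copy numbers.
slx_copies = [10.0, 20.0, 30.0, 40.0]
sly_copies = [12.0, 22.0, 32.0, 42.0]
slope, intercept, r2 = simple_linear_regression(slx_copies, sly_copies)
```

Because individuals share ancestry, a standard error computed from such a fit would understate the uncertainty, which is the caveat raised in the text.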

Fig. 4.

Approximate copy number of coamplified gene families on X and Yq. Each dot represents a single individual. Grey dashed line is simple linear regression of Y-linked versus X-linked copy number.

It has recently been shown that two regions of the autosomes—on chromosomes 5 and 14—have a suite of epigenetic marks similar to the sex chromosomes in postmeiotic spermatids (Moretti et al. 2016). These autosomal regions harbor many copies of a family of genes (known alternatively as Speer [Spiess et al. 2003] or α-takusan [Tu et al. 2007]) expressed in spermatids. The copy number of Speer family members is, like Sly, correlated with that of Slx/Slxl1 (supplementary fig. S1, Supplementary Material online). This finding supports the hypothesis that the Speer family may be involved in sex-chromosome conflict in spermatids.

The scale of copy number change within the M. musculus lineage suggests a high underlying mutation rate. We used whole-genome sequence data from a panel of 69 recombinant inbred lines (RILs) from the Collaborative Cross (CC; Srivastava et al. 2017) to estimate the rate of copy-number change on Yq. Each CC line is independently derived from eight inbred founder strains via two generations of outcrossing followed by sibling matings until inbreeding is achieved (Consortium 2012). Distinct CC lines inheriting a Y chromosome from the same founder strain thus share an obligate male ancestor in the recent past, but no more recently than the start of inbreeding (fig. 5A). We estimated read depth in 100 kb bins across Yq and normalized each bin against the median for CC lines inheriting a Y chromosome from the same founder strain. This normalization effectively removes noise from mapping of short reads to repetitive sequence and uncovers CNVs from 6 to 30 Mb in size in 5 CC lines carrying three different Y chromosomes (table 2, supplementary table S2, Supplementary Material online, and fig. 5B). Because the pedigree of each CC line is known, mutation rates (for each Y haplogroup, and overall) can be estimated directly, assuming each new allele corresponds to a single mutational event. Our estimate of 0.30 (95% Poisson CI 0.098–0.70) mutations per 100 father-son transmissions is about tenfold higher than the rate in ampliconic regions of the human Y chromosome (Repping et al. 2006), and places the mouse Yq among the most labile sequences known in mouse or human (Egan et al. 2007; Itsara et al. 2010; Morgan, Holt, et al. 2016). New Yq alleles also provide opportunities to investigate the effects of Yq copy number on fertility, sperm phenotypes and sex ratio (as in, among others, Styrna et al. 1991; Touré et al. 2004; Yamauchi et al. 2010; Cocquet et al. 2012; Fischer et al. 2016).
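The within-founder normalization can be illustrated with a small sketch. It assumes that per-bin depths have already been scaled for overall sequencing depth in each line; the data layout, the function names and the 25% tolerance used to flag bins are illustrative choices, not parameters from the original analysis.

```python
from statistics import median


def normalize_by_founder(depth_by_line):
    """Divide each bin's depth by the median of that bin across lines.

    depth_by_line: dict mapping line name -> list of per-bin depths, for
    lines that all inherit their Y chromosome from the same founder strain.
    """
    lines = list(depth_by_line)
    n_bins = len(depth_by_line[lines[0]])
    bin_medians = [
        median(depth_by_line[line][i] for line in lines) for i in range(n_bins)
    ]
    return {
        line: [
            d / m if m > 0 else float("nan")
            for d, m in zip(depth_by_line[line], bin_medians)
        ]
        for line in lines
    }


def candidate_bins(normalized_depths, tolerance=0.25):
    """Indices of bins whose normalized depth departs from 1 by more than tolerance."""
    return [i for i, r in enumerate(normalized_depths) if abs(r - 1.0) > tolerance]


# Toy example: line C carries a duplication covering the last two bins.
toy = {"A": [10, 10, 10, 10], "B": [10, 10, 10, 10], "C": [10, 10, 20, 20]}
normalized = normalize_by_founder(toy)
print(candidate_bins(normalized["C"]))  # [2, 3]
```

Bins that are noisy in every line for mapping reasons have a shifted median too, so the ratio stays near 1 unless a line truly differs from its siblings.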

Fig. 5.

Copy-number variation on Yq in the Collaborative Cross. (A) Pedigree-based estimates of mutation rate on the Y chromosome long arm (Yq). Multiple recombinant inbred lines (RILs) from the Collaborative Cross (CC) panel share the same Y chromosome haplotype, with (filled shape) or without (open shape) a putative de novo CNV. These Y chromosome lineages are separated from their common male ancestor by an unknown number of generations prior to the initiation of the CC (grey dashed lines), plus a known number of generations of CC breeding (solid lines). Representatives of the founder strains of the CC were sequenced at the Sanger Institute; the number of generations separating the Sanger mouse from the common male ancestor is also unknown. (B) Normalized read depth across Yq for CC lines with de novo CNVs on Yq. Points are colored according to founder Y chromosome haplogroup. A representative wild-type line is shown for each mutant.

Table 2.

Pedigree-Based Estimates of Mutation Rates on Yq.

Haplogroup    Events  N   G     Rate (events/100 gen; 95% CI)
A/J           0       7   155   0.00 (0.00–2.4)
C57BL/6J      2       10  236   0.85 (0.10–3.1)
129S1/SvImJ   0       10  247   0.00 (0.00–1.5)
NOD/ShiLtJ    0       12  301   0.00 (0.00–1.2)
NZO/HlLtJ     2       13  326   0.61 (0.074–2.2)
CAST/EiJ      0       8   194   0.00 (0.00–1.9)
PWK/PhJ       1       6   133   0.75 (0.019–4.2)
WSB/EiJ       0       3   68    0.00 (0.00–5.4)
Overall       5       69  1660  0.30 (0.098–0.70)

Note.—N, number of CC lines with each Y chromosome haplogroup; G, total number of breeding generations.
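The rates and intervals in table 2 follow from the event counts and generation totals alone. The sketch below computes an exact (equal-tailed) Poisson confidence interval by bisection on the Poisson distribution function, using only the standard library. That the published intervals were obtained with this particular construction is our inference from the fact that it reproduces the tabulated values, not something stated in the text.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam)."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def _bisect_decreasing(f, lo, hi, iters=200):
    """Root of a function that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_poisson_ci(k: int, conf: float = 0.95):
    """Equal-tailed exact confidence limits for a Poisson mean given k events."""
    a = (1.0 - conf) / 2.0
    bound = 10.0 * (k + 10)  # generous upper search limit
    # Upper limit solves P(X <= k | lam) = a.
    upper = _bisect_decreasing(lambda lam: poisson_cdf(k, lam) - a, 0.0, bound)
    # Lower limit solves P(X >= k | lam) = a; it is 0 when no events are seen.
    if k == 0:
        lower = 0.0
    else:
        lower = _bisect_decreasing(
            lambda lam: poisson_cdf(k - 1, lam) - (1.0 - a), 0.0, bound
        )
    return lower, upper


def rate_per_100(events: int, generations: int):
    """Point estimate and 95% limits, in events per 100 generations."""
    lower, upper = exact_poisson_ci(events)
    scale = 100.0 / generations
    return events * scale, lower * scale, upper * scale


# "Overall" row of table 2: 5 events in 1,660 generations.
overall = rate_per_100(5, 1660)  # approximately (0.30, 0.098, 0.70)
```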

Sequence Diversity Is Markedly Reduced on Both Sex Chromosomes

We next used whole-genome sequence data to examine patterns of nucleotide diversity within Mus musculus in nonrepetitive sequence on Yp compared with the autosomes and X chromosome. To do so, we first identified a subset of wild mice without evidence of cryptic relatedness (see Materials and Methods); this left 20 male and 1 female M. m. domesticus (hereafter dom), 9 male and 10 female M. m. musculus (mus) and 3 male and 7 female M. m. castaneus (cas). Analyses of autosomes used both males and females from each population; sex chromosome analyses used males only to avoid introducing technical artifacts associated with differences in sample ploidy. Diversity statistics were calculated from the joint site frequency spectrum (SFS), which in turn was estimated directly from genotype likelihoods rather than hard genotype calls (Korneliussen et al. 2014).
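As a reminder of how diversity statistics are read off a site frequency spectrum, the sketch below computes pairwise diversity and Watterson's estimator from a one-population unfolded SFS using the standard formulas. It is a generic illustration and not the estimation code used here; the SFS entries may be expected (non-integer) counts, as they would be when estimated from genotype likelihoods. Function names are illustrative.

```python
from math import comb


def theta_pi(sfs, n_sites):
    """Per-site pairwise diversity from an unfolded SFS.

    sfs[i - 1] is the (expected) number of sites at which the derived allele
    is carried by i of the n sampled chromosomes, for i = 1 .. n - 1.
    """
    n = len(sfs) + 1
    # A site with i derived copies contributes i * (n - i) pairwise differences.
    diffs = sum(i * (n - i) * count for i, count in enumerate(sfs, start=1))
    return diffs / comb(n, 2) / n_sites


def theta_watterson(sfs, n_sites):
    """Per-site Watterson estimator from the number of segregating sites."""
    n = len(sfs) + 1
    a1 = sum(1.0 / i for i in range(1, n))
    return sum(sfs) / a1 / n_sites


# Toy spectrum for n = 4 chromosomes over 100 sites.
toy_sfs = [3, 1, 0]
pi_toy = theta_pi(toy_sfs, 100)
```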

We estimated nucleotide diversity in four classes of sites: intergenic sites (putatively neutral); introns; 4-fold degenerate sites; and 0-fold degenerate sites. Putatively neutral sites are useful for estimating demographic parameters, whereas the latter three classes are useful for assessing the impact of selection. Sites on the sex chromosomes are subject to different selective pressures than autosomal sites, both because they are “exposed” in the hemizygous state in males and because, in mammals, the sex chromosomes are enriched for genes with sex-specific expression patterns. To evaluate these effects, we further subdivided genic sites according to gene-expression patterns inferred from two expression data sets, one in 18 adult tissues and one a time course across spermatogenesis (see Materials and Methods). Genes on the autosomes and X chromosome were classified along two independent axes: testis-specific versus ubiquitously expressed; or expressed early in meiosis, prior to MSCI, versus expressed in postmeiotic spermatids. (Y chromosome genes are not subdivided, since they are few in number and inherited as a single linkage block.) All diversity estimates are shown in supplementary table S3, Supplementary Material online. For putatively neutral sites on the autosomes, our estimates of pairwise diversity ($\pi_{\mathrm{dom}} = 0.339\%$, $\pi_{\mathrm{mus}} = 0.325\%$, $\pi_{\mathrm{cas}} = 0.875\%$) are consistent with previous reports based on overlapping samples (Geraldes et al. 2008; Halligan et al. 2013; Kousathanas et al. 2014; Harr et al. 2016). Within each chromosome type, levels of diversity follow the expected rank order: intergenic sites > introns ≈ 4-fold degenerate (synonymous) sites > 0-fold degenerate (nonsynonymous) sites.

For the X chromosome, we further examined the relationship between sequence diversity and local sequence features including recombination rate, X-Y gametologous amplicons, gene sets described above and blocks of conserved synteny with rat (supplementary fig. S2, Supplementary Material online). Diversity is reduced across the entire X chromosome in all three populations, in marked contrast to local “troughs” observed in great apes (Nam et al. 2015). Regression of pairwise diversity ($\theta_\pi$) on distance away from ubiquitously expressed genes, meiosis genes, spermatid genes, and X-Y ampliconic genes was significant only in musculus for ubiquitously expressed genes (t = 6.6, Bonferroni-corrected $p = 6.8 \times 10^{-11}$). Similarly, and surprisingly, there was no relationship (t = 1.2, p = 0.23) between sequence diversity and recombination rate at 100 kb resolution, as estimated from the Diversity Outbred mouse stock (Morgan et al. 2017). We speculate that characterizing recombination at finer scale from linkage disequilibrium (Auton and McVean 2007) would provide a more powerful test.

In a panmictic population with equal effective number of breeding males and breeding females (i.e., with equal variance in reproductive success between sexes), there are three X chromosomes and a single Y chromosome for every four autosomes. The expected ratios of X:A and Y:A diversity are therefore 3/4 and 1/4, respectively, if mutation rates in males and females are equal (Charlesworth et al. 1987). We estimated X:A and Y:A for putatively neutral sites and find that diversity on both sex chromosomes is markedly reduced relative to expectations in all three populations (table 3). The effect is strongest in M. m. domesticus (X:A = 0.244, Y:A = 0.0858) and weakest in M. m. musculus (X:A = 0.563, Y:A = 0.216). The mutation rate is higher in the male than the female germline in most mammals (recently reviewed in Scally [2016]), including mice, which might contribute to differences in observed diversity between chromosomes. We used divergence between mouse and rat at synonymous, one-to-one orthologous sites (drat) on autosomes, X and Y chromosome as a proxy for the long-term average mutation rate, and corrected X:A and Y:A estimates for differences in mutation rate (“corrected” rows in table 3). Even with this correction, X- and Y-linked diversity remains below expectations. Scaled diversity estimates for each class of sites are shown in figure 6. Reduction in X:A diversity has been described previously on the basis of targeted sequencing of a few loci in all three subspecies (Baines and Harr 2007), and for M. m. castaneus on the basis of whole-genome sequencing (Halligan et al. 2013; Kousathanas et al. 2014). A reduction in Y:A has not, to our knowledge, been reported.
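The 3/4 and 1/4 expectations are the equal-sex-ratio case of standard textbook expressions for effective population size on autosomes, the X and the Y. The sketch below uses those expressions, which are not taken from this paper, and assumes a stationary, neutral population with equal mutation rates across chromosomes, to show how the expected ratios shift with the numbers of breeding males and females. The function name is illustrative.

```python
def expected_diversity_ratios(n_males: float, n_females: float):
    """Expected X:A and Y:A diversity ratios at neutral equilibrium.

    Uses the standard effective sizes
      N_A = 4 * Nm * Nf / (Nm + Nf)
      N_X = 9 * Nm * Nf / (4 * Nm + 2 * Nf)
      N_Y = Nm / 2
    and assumes equal mutation rates on all three chromosome types.
    """
    nm, nf = float(n_males), float(n_females)
    n_a = 4.0 * nm * nf / (nm + nf)
    n_x = 9.0 * nm * nf / (4.0 * nm + 2.0 * nf)
    n_y = nm / 2.0
    return n_x / n_a, n_y / n_a


equal = expected_diversity_ratios(1000, 1000)        # (0.75, 0.25)
female_biased = expected_diversity_ratios(1000, 3000)  # X:A rises, Y:A falls
```

A skewed breeding sex ratio moves X:A and Y:A in opposite directions under these formulas, so it cannot by itself lower both ratios.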

Table 3.

Diversity Ratios between Pairs of Chromosome Types Relative to Neutral Expectations, with 95% Confidence Intervals.

Comparison  Expected  Scaling    dom                     mus                  cas
X:A         3/4       Raw        0.244 (0.219–0.268)     0.563 (0.506–0.613)  0.501 (0.455–0.538)
X:A         3/4       Corrected  0.291 (0.261–0.319)     0.670 (0.603–0.729)  0.597 (0.542–0.641)
Y:A         1/4       Raw        0.0858 (0.0805–0.0911)  0.216 (0.207–0.227)  0.128 (0.122–0.134)
Y:A         1/4       Corrected  0.0924 (0.0867–0.0981)  0.233 (0.223–0.244)  0.137 (0.131–0.144)

Note. Both raw diversity and diversity corrected for divergence to rat are shown.

Fig. 6.

Scaled nucleotide diversity by population, site class, and chromosome type. First panel from left shows estimates from intergenic sequence; remaining panels are site classes within protein-coding gene boundaries.

Reduction in Sex-Linked Diversity Is Inconsistent with Simple Demographic Models

Sex chromosomes are affected differently than autosomes by both neutral forces, such as changes in population size (Pool and Nielsen 2007), and by natural selection (reviewed in, e.g., Ellegren [2011]). The X chromosomes of humans (Arbiza et al. 2014) and several other primate species (Nam et al. 2015) are substantially less diverse than the demographic histories of these species would predict, as a result of both purifying selection and recurrent selective sweeps. For humans, the pattern extends to the Y chromosome (Sayres et al. 2014). Having observed a deficit of polymorphism on both sex chromosomes in mouse, the central question arising in this paper is: to what extent is sex-chromosome diversity reduced by natural selection? A rich body of literature already exists for the influence of selection on the mouse X chromosome, especially in the context of speciation (Baines and Harr 2007; Good et al. 2008; Teeter et al. 2008; Kousathanas et al. 2014; Larson et al. 2016, 2017), so we directed our focus to the lesser-studied Y chromosome.

To establish an appropriate null against which to test hypotheses about natural selection on the sex chromosomes, we followed an approach similar to Sayres et al. (2014). We fit four simple demographic models to SFS from putatively neutral intergenic sites on the autosomes using the maximum-likelihood framework implemented in ∂a∂i (Gutenkunst et al. 2009) (fig. 7A). Each model is parameterized by an initial effective population size (N0), a size change (expressed as fraction f of starting size for models involving instantaneous size changes, or the ending population size Ne for the exponential growth models), and a time of onset of size change (τ). The relative fit of each model was quantified using the method of Akaike weights (Akaike 1978). All four models can be viewed as nested in the family of three-epoch, piecewise-exponential histories. In principle, such models are identifiable with sample size of 4 × 3 = 12 or more chromosomes (six diploid individuals; Bhaskar and Song 2014). In practice, more than one model fits each population about equally well (or equally poorly), with the exception of M. m. castaneus, which is best described by the “step-change” model (fig. 7B and C and table 4). Of course the true history of each population is almost certainly more complex than any of our models. Our goal is not to obtain a comprehensive description of mouse population history as such, but rather to pick an appropriate null model against which we can test hypotheses about selection. For domesticus, the stationary model is the most parsimonious; for musculus, the exponential-growth model.
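Akaike weights turn the maximized log-likelihoods of competing models into relative support values that sum to one. The sketch below shows the usual calculation (AIC = 2k − 2 ln L, then weights proportional to exp(−Δ/2), where Δ is the AIC difference from the best model); the log-likelihoods and parameter counts are invented placeholders, not fitted values from this study.

```python
import math


def akaike_weights(log_likelihoods, n_params):
    """Akaike weights for a set of models.

    log_likelihoods: dict model name -> maximized log-likelihood
    n_params: dict model name -> number of free parameters
    """
    aic = {m: 2 * n_params[m] - 2 * ll for m, ll in log_likelihoods.items()}
    best = min(aic.values())
    # Relative likelihood of each model given its AIC difference from the best.
    rel = {m: math.exp(-0.5 * (a - best)) for m, a in aic.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}


# Placeholder values: same fit quality, one extra parameter in the second model.
weights = akaike_weights(
    {"neutral": -100.0, "growth": -100.0},
    {"neutral": 1, "growth": 2},
)
```

When two models fit equally well, the weight shifts toward the one with fewer parameters, which is the sense in which the stationary model is "most parsimonious" for domesticus.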

Fig. 7.

Inference of demographic histories from autosomal sites. (A) Four simple demographic models fit with ∂a∂i. Each model is parameterized by one or more of an ancestral effective population size (N0), time of population-size change (τ), change in population size as fraction of initial size (f), and present effective population size (Ne). (B) Observed site frequency spectra by population, with fitted spectra from the four models in panel A. (C) Relative support for each model, quantified by Akaike weight, by population.

Table 4.

Parameter Estimates for Models Shown in Figure 7.

| Population | Model | N0 | Ne | f | τ |
|---|---|---|---|---|---|
| dom | Neutral | 162 (2) | | | |
| dom | Growth | 160 (50) | 230 (80) | | 1.53 (0.06) |
| dom | Step-change | 230 (50) | 400 (700) | 2 (4) | 7 (6) |
| dom | Bottleneck | 164 (3) | 30 (50) | 0.2 (0.3) | 0 (2) |
| mus | Neutral | 165 (2) | | | |
| mus | Growth | 150 (40) | 500 (200) | | 7 (5) |
| mus | Step-change | 150 (50) | 270 (100) | 1.8 (0.5) | 1 (3) |
| mus | Bottleneck | 164 (4) | 20 (8) | 0.12 (0.05) | 1 (1) |
| cas | Neutral | 429 (3) | | | |
| cas | Growth | 350 (100) | 2000 (1000) | | 0 (2) |
| cas | Step-change | 300 (100) | 800 (700) | 3 (1) | 1 (4) |
| cas | Bottleneck | 431 (7) | 30 (10) | 0.08 (0.03) | 0.6 (0.9) |

Note. Population sizes are given in thousands and times in units of N0; bootstrap standard errors in parentheses.

We also compared our estimates of sex chromosome diversity to predictions from coalescent theory (Pool and Nielsen 2007; Polanski et al. 2017) for the four models considered above. For this analysis, we focused on the ratios X:A and Y:A, which are independent of autosomal effective population size. Results are shown in supplementary figure S3, Supplementary Material online, with observed X:A and Y:A ratios superposed. We summarize some relevant trends here and refer to previous reviews (Pool and Nielsen 2007; Webster and Wilson Sayres 2016) for further details. Qualitatively, both X:A and Y:A are reduced after an instantaneous contraction in population size, eventually recovering to their stationary values after about 4Ne generations. For a bottleneck—a contraction followed by instantaneous recovery to the initial size—X:A and Y:A are at first sharply reduced and then increased relative to a stationary population, again returning to stationary values after about 4Ne generations. With exponential growth, X:A and Y:A are actually increased relative to their stationary values. These patterns are modulated by the breeding sex ratio; X:A increases and Y:A decreases when females outnumber males, and vice versa. In brief, some combination of a male-biased breeding ratio and a very strong (f0.1) population contraction would be required to explain the observed reductions in X:A and Y:A in domesticus, with somewhat milder effects required to explain the reduction in musculus or castaneus. These histories are not consistent with population histories inferred from autosomal SFS. We hypothesize that this discrepancy is explained, at least in part, by selection.

Both Sex Chromosomes Have Been Shaped by Positive Selection in the Male Germline

We used two approaches to investigate the role of selection on the sex chromosomes. First, we used a variant of the McDonald–Kreitman test (McDonald and Kreitman 1991) to obtain a nonparametric estimate of the proportion α of sites fixed by positive selection (loosely, the “evolutionary rate”) in genes with different expression and inheritance patterns (Smith and Eyre-Walker 2002). The rate of adaptive evolution should be faster on the X chromosome when new mutations tend to have greater fitness effect in males than in females, to be on average recessive, or both (Charlesworth et al. 1987). We might expect genes with testis-biased expression or genes expressed during spermatogenesis to be targets of male-specific selection. Consistent with previous work on the “faster-X” effect in mouse (Kousathanas et al. 2014; Larson et al. 2016, 2017), we find that a greater proportion of X-linked than autosomal substitutions are adaptive. The pattern holds in all three populations (fig. 8). In domesticus and musculus, X-linked genes whose expression is biased towards early meiosis or round spermatids evolve faster than X-linked genes with ubiquitous expression or expression across spermatogenesis. By contrast, non-ampliconic Y-linked genes—all expressed during male meiosis—have evolutionary rates closer to autosomal genes, with heterogeneity across populations. Unfortunately, we cannot assess the rate of sequence evolution in ampliconic gene families on the Y chromosome using short-read data.
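To make the estimator concrete, the sketch below computes α from per-gene counts of nonsynonymous and synonymous divergence (Dn, Ds) and polymorphism (Pn, Ps). It assumes the multi-gene form usually attributed to Smith and Eyre-Walker (2002), α = 1 − (mean Ds / mean Dn) × mean[Pn / (Ps + 1)]; the function name and input layout are illustrative, and the bootstrap confidence intervals are omitted.

```python
def alpha_sew(genes):
    """Estimate the proportion of adaptive substitutions.

    genes: iterable of (Dn, Ds, Pn, Ps) count tuples, one per gene.
    """
    genes = list(genes)
    n = len(genes)
    mean_dn = sum(g[0] for g in genes) / n
    mean_ds = sum(g[1] for g in genes) / n
    # Adding 1 to Ps avoids division by zero for genes without
    # synonymous polymorphism
    mean_ratio = sum(g[2] / (g[3] + 1) for g in genes) / n
    return 1.0 - (mean_ds / mean_dn) * mean_ratio
```

Because the estimator averages over genes before taking ratios, it remains defined even when individual genes have no synonymous polymorphisms.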

Fig. 8.

Proportion of sites fixed by positive selection (using the nonparametric estimator of Smith and Eyre-Walker [2002], αSEW) according to gene-expression class, chromosome, and population. Error bars represent 95% bootstrap CIs. Ampliconic genes on X and Y are excluded.

Second, we used forward simulations from the models fit to autosomal SFS to explore the possible contribution of natural selection to the SFS of Y chromosomes. We simulated two modes of selection independently: purifying selection on linked deleterious alleles (background selection, BGS; Hudson and Kaplan 1995), and hard selective sweeps on newly arising beneficial alleles. For the BGS model, we varied the proportion of sites under selection α and the mean population-scaled selection coefficient $\gamma = N\bar{s}$; for the sweep model, we varied only the γ for the sweeping allele (simulation details are provided in Materials and Methods). Posterior distributions for these parameters were inferred using an approximate Bayesian computation (ABC) approach (Pritchard et al. 1999; Beaumont et al. 2002); Bayes factors were used for model comparison. The castaneus population was excluded from these analyses because sample size (only three chromosomes) was not sufficient for calculating some of the summary statistics chosen for ABC.

Results of the ABC procedure are shown in supplementary figure S4, Supplementary Material online. The Y chromosomes of domesticus are best approximated by the selective-sweep model. For musculus the result is less clear: the neutral null model actually provides the best fit, and among models with selection, the BGS model is superior. However, over the parameter ranges used in our simulations, we have limited power to discriminate between different models at the current sample size (n20 chromosomes; supplementary fig. S4B, Supplementary Material online). In the best case—the selective-sweep model—we achieve only 49% recall. This reflects both the constraints of a small sample and the more fundamental limits on model identifiability for a single nonrecombining locus like the Y chromosome.

If a selective sweep did occur on domesticus Y chromosomes, it was moderately strong: we estimate $N\bar{s} = 9.29$ (50% HPDI 0–9.88) (table 5). For comparison, $N\bar{s} \approx 500$ for adaptive alleles in the human lactase gene (LCT), a well-characterized example of recent positive selection (Tishkoff et al. 2007). Posterior distributions of several estimators of nucleotide diversity recapitulate the values observed in real data (supplementary fig. S4D, Supplementary Material online). We note that, because the Y chromosome is inherited without recombination, our estimate of $N\bar{s}$ reflects the cumulative selection intensity on the entire chromosome and not necessarily on a single site.

Table 5.

Parameter Estimates from ABC.

| Model | Population | Best? | α | $N\bar{s}$ |
|---|---|---|---|---|
| BGS | dom | | 0.675 (0.624–1.00) | 43.7 (12.2–96.4) |
| BGS | mus | | 0.481 (0.118–0.553) | 2.4 (0–20.2) |
| BGS+growth | dom | | 0.647 (0.643–0.960) | 25.7 (0–26.2) |
| BGS+growth | mus | | 0.473 (0.317–0.649) | 8.39 (0–12.9) |
| Sweep | dom | | | 9.29 (0–9.88) |
| Sweep | mus | | | 0.663 (0–0.934) |

Note. Values are shown as posterior median and 50% highest posterior density interval (HPDI). Best-fitting model for each population indicated by check mark.

Sex-Linked Gene Expression Diverges Rapidly in the Testis

Given the dramatic differences in Y-linked gene content between even closely related Mus taxa, we finally asked whether patterns of gene expression showed similar divergence. In particular, we sought to test the prediction that expression patterns of Y-linked genes diverge more rapidly than autosomal genes in the testis. To that end we reanalyzed published gene expression data from the brain, liver and testis of wild-derived outbred individuals representing seven (sub)species spanning an 8 My evolutionary transect across the murid rodents (Neme and Tautz 2016; fig. 9A). For genes on the autosomes and X chromosome, the great majority of expression variance lies between tissues rather than between (sub)species (PC1 and PC2, cumulative 77.1% of variance explained; fig. 9B). For Y-linked genes, highly enriched for function in the male germline, most variance (PC1, 59.6% of variance explained) naturally lies between the testis and the nongermline tissues.

Fig. 9.

Divergence of sex-linked gene expression in murid rodents. (A) Schematic phylogeny of taxa in the multitissue expression data set. Node labels are approximate divergence times (Ma); branch lengths not to scale. (B) Projection of samples onto the top two principal components of expression values for autosomal, X-linked and Y-linked genes. (C) Expression trees computed from rank-correlations between taxa for autosomal (A), X-linked (X) and Y-linked (Y) genes (across columns) for brain, liver and testis (across rows). (D) Total tree length by chromosome type and tissue. (E) Expression trees as in panel C, with genes partitioned according to expression pattern: testis-specific; ubiquitously expressed; early spermatogenesis (meiosis prior to MSCI); and late spermatogenesis (spermatids). (F) Total tree length by chromosome type and expression pattern.

To quantify divergence in gene expression patterns we computed the rank-correlation (Spearman’s ρ) between species for each tissue type separately for autosomal, X-linked, and Y-linked genes, and constructed trees by neighbor-joining (fig. 9C). We use total tree length as an estimator of expression divergence. The topology of these trees for the autosomes and X chromosome in brain and testis is consistent with known phylogenetic relationships within the Muridae. Consistent with previous comparative analyses of gene expression in mammals (Brawand et al. 2011), we find that expression patterns are most constrained in brain and least constrained in testis (fig. 9D). Expression divergence is equal between autosomes and X chromosome in brain and liver, but greater for X-linked genes in testis. Y-linked expression diverges much more rapidly in all three tissues, but the effect is most extreme in the testis. We caution that the precision of these estimates is limited by the small number of Y-linked relative to autosomal or X-linked genes.
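The tree-length calculation can be sketched as follows. The example converts Spearman correlations between taxa into distances and sums the branch lengths of a neighbor-joining tree. The distance definition (1 − ρ), the function names, and the simplified rank computation (ties broken arbitrarily) are assumptions made for illustration and are not taken from the original analysis.

```python
import numpy as np

def spearman_distance(expr):
    """expr: genes x taxa array; returns 1 - Spearman rho between columns.
    Ties are ranked arbitrarily, which is a simplification."""
    ranks = np.argsort(np.argsort(expr, axis=0), axis=0).astype(float)
    return 1.0 - np.corrcoef(ranks, rowvar=False)

def nj_tree_length(dist):
    """Total branch length of the neighbor-joining tree for a distance matrix."""
    d = np.array(dist, dtype=float)
    total = 0.0
    while d.shape[0] > 2:
        n = d.shape[0]
        r = d.sum(axis=1)
        q = (n - 2) * d - r[:, None] - r[None, :]
        np.fill_diagonal(q, np.inf)
        i, j = np.unravel_index(np.argmin(q), q.shape)
        # The two branches created by joining i and j sum to d[i, j]
        total += d[i, j]
        new = 0.5 * (d[i] + d[j] - d[i, j])
        keep = [k for k in range(n) if k not in (i, j)]
        d2 = np.zeros((n - 1, n - 1))
        d2[:-1, :-1] = d[np.ix_(keep, keep)]
        d2[-1, :-1] = new[keep]
        d2[:-1, -1] = new[keep]
        d = d2
    return total + d[0, 1]
```

For distances that are exactly additive on a tree, the function returns the sum of that tree's branch lengths; for noisy data, individual branches can be negative, so the total should be read as a rough summary.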

This “faster-X” effect should be limited to functional elements subject to male-specific selection. Genes expressed in the male germline (testis-biased and/or expressed during spermatogenesis) might be enriched for such elements, relative to genes with ubiquitous expression. We therefore estimated expression divergence for autosomal, X-linked, and Y-linked genes within each of four sets of genes with different expression patterns (fig. 9E). X-linked expression diverges more rapidly than autosomal expression only among genes with testis-biased expression. In contrast to Larson et al. (2016) but in keeping with other predictions (Good and Nachman 2005), we find that the “faster-X” effect on expression is larger for genes expressed late than early in meiosis (fig. 9F). The number of Y-linked genes in each group is too small to permit any strong conclusions.

Discussion

We have shown that nucleotide diversity in M. musculus is reduced on both sex chromosomes relative to expectations for a stationary population, and that the effect appears strongest in M. m. domesticus and weakest in M. m. musculus (table 3). Sex differences in the long-term average mutation rate, estimated from synonymous-sites divergence to rat, are not sufficient to explain the deficit. Because sex chromosomes respond differently than autosomes to changes in population size, we fit several (simple) models of demographic history to autosomal site-frequency spectra (fig. 7) and compared their predictions to observed values. At least for the models we considered (see Supplementary Material), neither gradual nor instantaneous changes in population size—of magnitude feasible given autosomal SFS—can account for the reduction in diversity on both the X and Y chromosomes, even if we allow for a sex ratio different than 1:1 (supplementary fig. S3, Supplementary Material online). Estimates of effective size of each population (from autosomal sites) are in agreement with previous work on house mice (Din et al. 1996; Baines and Harr 2007; Salcedo et al. 2007; Geraldes et al. 2008).

Using demographic histories from autosomes as a baseline, we simulated two modes of selection—background selection and hard selective sweeps—on Y chromosomes of domesticus and musculus. Although discrimination between models was limited by both technical factors and theoretical constraints, we have shown that the Y-linked SFS in domesticus is consistent with a moderately strong selective sweep (supplementary fig. S4, Supplementary Material online). The background selection model is the best-fitting in musculus, but is only 1.4-fold more likely (log10BF=0.16) than the next-best model. We conclude that recent positive selection accounts, at least in part, for the reduction in Y-linked relative to autosomal diversity in domesticus. Furthermore, coding sequences of X-linked genes with germline expression are disproportionately shaped by positive selection (fig. 8). Both X- and Y-linked genes have rapidly diverging expression patterns in the testis, especially in spermatids (fig. 9). Together these findings provide strong support for the idea that positive selection in the male germline is a potent and ongoing force shaping both mammalian sex chromosomes (Mueller et al. 2013; Larson et al. 2016).

To what extent are these pressures a consequence of intragenomic conflict? The reciprocal actions of SLX/SLXL1 and SLY on sex-linked gene expression in spermatids establish the conditions for conflict between the X and Y chromosomes that implicates any gene whose expression after meiosis is beneficial for sperm maturation and fertilizing ability. X-linked alleles that meet the functional requirement for postmeiotic expression in the face of repression by SLY (via a stronger promoter, a more stable transcript, a more active protein product, or increased copy number) should be favored by selection (Ellis et al. 2011). The same should be true, in reverse, for successful Y chromosomes.

Although we cannot directly identify the putative target(s) or mechanism(s) of selective sweeps on the Y chromosome, several independent lines of evidence point to the ampliconic genes on Yq active in the X-Y conflict. First, the copy numbers of Slx/Slxl1 and Sly have increased 3-fold within M. musculus and are correlated across populations (fig. 4), consistent with an “arms race” between the sex chromosomes in which the Y chromosome is the lagging player. The absolute expression of ampliconic X genes and their Yq homologs (in whole testis) increases with copy number across Mus (supplementary fig. S5, Supplementary Material online). Larson et al. (2017) have shown that, in spermatids from reciprocal F1 hybrids between domesticus and musculus that are “mismatched” for Slx/Slxl1 and Sly, global X-chromosome expression is indeed perturbed in the direction predicted by the copy number and actions of SLX/SLXL1 and SLY. Second, several independent deletions of Yq in laboratory stocks converge on a similar phenotype, namely low fertility, abnormal sperm morphology due to problems with chromatin compaction, and sex-ratio distortion in favor of females (Styrna et al. 1991; Conway et al. 1994; Touré et al. 2004; Fischer et al. 2016; MacBride et al. 2017). Third, Y chromosomes from musculus (the subspecies with highest Sly copy number) are more successful at introgressing across the domesticus–musculus hybrid zone in Europe, and in localities where they do, the census sex ratio is shifted towards males (Macholán et al. 2008). Consomic strains differing only by their Y chromosomes show similar deviation in the sex ratio from parity (Case et al. 2015). Finally, although modeling predicts moderately strong positive selection on Y, there is little evidence that it occurs within coding sequences of single-copy genes on Yp (fig. 8). This observation permits several explanations but is consistent with the idea that Yp alleles are hitchhiking with favorable alleles on Yq.

It is more difficult to ascertain the contribution of intragenomic conflict to the paucity of diversity on the X chromosome. Although the mammalian X chromosome is enriched for genes with expression in the male germline (e.g., Rice 1984; Mueller et al. 2013), its functional portfolio is considerably more broad than that of the Y chromosome (Bellott et al. 2014, 2017). The X chromosome also has a major role in hybrid sterility in mouse (Forejt and Iványi 1974; Forejt 1996; Payseur et al. 2004; Storchová et al. 2004; Good et al. 2008; Teeter et al. 2008; Campbell et al. 2013; Turner et al. 2014); the Y chromosome does not (Turner et al. 2012; Campbell and Nachman 2014). We corroborate the “faster-X” effect on protein evolution that has been previously described by others (Kousathanas et al. 2014; Larson et al. 2016) and show that it is strongest for genes expressed in the male germline (fig. 8), which are widely scattered across the X chromosome (supplementary fig. S2D, Supplementary Material online). We conclude that selection is pervasive on the mouse X chromosome and reduces diversity chromosome-wide. This stands in contrast to the pattern observed in great apes, which has apparently been driven by a few strong selective sweeps (Hvilsom et al. 2012; Veeramah et al. 2014; Nam et al. 2015).

Many open questions remain with respect to the evolution of mouse Y chromosomes. How many of the hundreds of copies in each gene family retain coding potential? Which copies are functionally equivalent? Does suppression of recombination promote the spread of clusters of genes like Slx, similar to sex-ratio drivers in other species (Jaenike 2001)? What evolutionary trade-offs does success in the sex-chromosome conflict entail, in the context of sperm competition and polyandry in natural populations (Simmons and Fitzpatrick 2012)? Does the conflict lead to oscillations between a male-biased and female-biased population over time, and if so, what is the effect on patterns of diversity on the sex chromosomes? All of these are important avenues of future study as we seek to understand the forces shaping sex chromosomes.

Materials and Methods

Alignment and Variant-Calling

Whole-genome sequencing reads were obtained from the European Nucleotide Archive (PRJEB9450, PRJEB11742, PRJEB14673, PRJEB14167, PRJEB2176, and PRJEB15190) and whole-exome reads from the NCBI Short Read Archive (PRJNA323493). Reads were aligned to the mm10 reference sequence using bwa mem v0.7.15-r1140 (Li 2013) with default parameters. Optical duplicates were marked using samblaster and excluded from downstream analyses. Regions of the Y chromosome accessible for variant calling were identified using the CallableLoci tool in the GATK v3.3-0-g37228af (McKenna et al. 2010). To be declared “callable” within a single sample, sites were required to have depth consistent with a single haploid copy (3<depth<50) and <25% of overlapping reads having mapping quality (MQ) zero. The analysis was restricted to Yp. The final set of callable sites was defined as any site counted as callable within > 10 samples. In total, 2,289,336 bp (77% of the non-gap length of Yp) were deemed callable.

SNVs and short indels on the Y chromosome were ascertained using GATK HaplotypeCaller v3.3-0-g37228af in the intersection of callable regions and exons targeted by the Roche NimbleGen exome-capture array, lifted over to mm10 with CrossMap v0.2.3 and the mm9-to-mm10 chain file from the UCSC Genome Browser (http://hgdownload.soe.ucsc.edu/goldenPath/mm9/liftOver/mm9ToMm10.over.chain.gz; last accessed September 1, 2017). To minimize artifacts from cryptic copy-number variation, X-Y homology, and the like, only biallelic sites with a “homozygous” (i.e., single-copy hemizygous) call in all male samples were used. Sites called in fewer than 60 samples or with strand-bias P value < 0.01 were filtered. Raw VCF files are provided in supplementary file S1, Supplementary Material online.

For the Y chromosome phylogenetic tree shown in figure 2A, data from Collaborative Cross lines carrying A/J, 129S1/SvImJ, NOD/ShiLtJ, NZO/HlLtJ, CAST/EiJ, PWK/PhJ, and WSB/EiJ Y chromosomes were used in place of the inbred strains themselves (see aliases in supplementary table S1, Supplementary Material online). Whole-genome sequence from male representatives of these lines has not (to our knowledge) been published.

Estimation of Site Frequency Spectra and Summary Statistics

Site frequency spectra (SFS) were calculated from genotype likelihoods at callable sites using ANGSD v0.917 (Korneliussen et al. 2014). Genotype likelihoods for the autosomes were calculated under the GATK diploid model after applying base alignment quality (BAQ) recalibration with the recommended settings for bwa alignments (-baq 1 -c 50, effectively discarding evidence from reads aligning at <95% identity). Sites were filtered to have per-individual coverage consistent with the presence of a single diploid copy (3<depth<80) and to be nonmissing in at least three individuals per population. Genotype likelihoods for the X and Y chromosomes were calculated under the GATK haploid model with depth filters appropriate for haploid sites (3<depth<40). Only reads with MQ>20 and bases with call quality > 13 were considered, and ampliconic regions (plus a 100 kb buffer on each side) were masked. Site-wise allele frequencies were computed within each population separately, and the joint SFS across nonmissing sites in the three populations was estimated from these frequencies. The consensus genotype from a single Mus spicilegus male was used as the ancestral sequence to polarize alleles as ancestral or derived. Ensembl v87 reference annotations were used to define intergenic sites, intronic sites, 0-fold, and 4-fold degenerate sites.

Diversity statistics and neutrality tests were calculated from joint SFS using standard formulae implemented in a custom Python package, sfspy (http://github.com/andrewparkermorgan/sfspy; last accessed September 1, 2017). Uncertainties for autosomal and X-linked sites were obtained by bootstrapping over loci, since the X and autosomes recombine; and for Y-linked sites using the built-in bootstrapping method of ANGSD.
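For readers unfamiliar with the standard formulae, the sketch below computes Watterson's estimator, pairwise diversity, and Tajima's D from an unfolded SFS. It is an independent illustration using the textbook expressions and is not the sfspy implementation; the function name and input layout are assumptions.

```python
import math

def sfs_summary(sfs):
    """sfs[i-1] = number of sites with derived-allele count i, i = 1..n-1.
    Returns (theta_W, theta_pi, Tajima's D) as totals across sites."""
    n = len(sfs) + 1                      # number of sampled chromosomes
    s = sum(sfs)                          # segregating sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    theta_w = s / a1
    theta_pi = sum(x * 2.0 * i * (n - i)
                   for i, x in enumerate(sfs, start=1)) / (n * (n - 1))
    # Variance constants from Tajima (1989)
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    var = (c1 / a1) * s + (c2 / (a1 ** 2 + a2)) * s * (s - 1)
    d = (theta_pi - theta_w) / math.sqrt(var) if var > 0 else float("nan")
    return theta_w, theta_pi, d
```

An SFS proportional to 1/i, the neutral expectation for a stationary population, gives D = 0, while an excess of singletons gives negative D. Because spectra estimated from genotype likelihoods need not be integer-valued, the function accepts fractional counts.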

Models of Sex-Chromosome Diversity under Neutral Coalescent

The expected ratio of X-to-autosome (X:A) and Y-to-autosome (Y:A) pairwise diversity was obtained from the formulae derived in Pool and Nielsen (2007). Define the inheritance factors hA = 1, hX = 3/4 and hY = 1/4; and mutation rates μA, μX, μY. For an instantaneous change in population size of depth f from starting size N, occurring g generations before the present, the expected value of X:A is:

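$$
\frac{\theta_{\pi,X}}{\theta_{\pi,A}} = \frac{h_X}{h_A}\,\frac{\mu_X}{\mu_A}\,\frac{f-(f-1)\left(1-\frac{1}{2Nh_Xf}\right)^{g}}{f-(f-1)\left(1-\frac{1}{2Nh_Af}\right)^{g}}.
$$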

The expression for Y:A can be written similarly. Note that X:A and Y:A depend only on the ratio between mutation rates on different chromosomes, not the absolute mutation rate. For a bottleneck of depth f, starting g1 generations before the present and ending at g1+g2 generations before the present:

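$$
\frac{\theta_{\pi,X}}{\theta_{\pi,A}} = \frac{h_X}{h_A}\,\frac{\mu_X}{\mu_A}\,\exp\!\left(\frac{(fg_1+g_2)(h_X-h_A)}{2Nh_Xh_Af}\right)\frac{1-f+\exp\!\left(\frac{g_2}{2Nh_Xf}\right)\left(f-1+\exp\!\left(\frac{g_1}{2Nh_X}\right)\right)}{1-f+\exp\!\left(\frac{g_2}{2Nh_Af}\right)\left(f-1+\exp\!\left(\frac{g_1}{2Nh_A}\right)\right)}.
$$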

For a model with exponential growth with rate constant r, we used the approximation provided in Polanski et al. (2017):

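$$
\frac{\theta_{\pi,X}}{\theta_{\pi,A}} = \frac{h_X}{h_A}\,\frac{\mu_X}{\mu_A}\,\frac{\log\left\{2rNh_X\left(1-\frac{1}{N}\right)+1\right\}}{2Nr}.
$$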

Unequal sex ratios were modeled by calculating the number of X and Y chromosomes per autosome, given fixed autosomal effective population size, using standard formulae as in Sayres et al. (2014), and passing these into the equations above via the parameter hX or hY.
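The following sketch shows how the inheritance factors and the instantaneous size-change formula fit together. The effective numbers of X and Y chromosomes are computed from textbook expressions for a population with Nm breeding males and Nf breeding females (N_A = 4NmNf/(Nm+Nf), N_X = 9NmNf/(4Nm+2Nf), N_Y = Nm/2), which may differ in form from those used in the cited work. Function names and example values are hypothetical.

```python
def inheritance_factors(prop_male):
    """Return (h_X, h_Y) relative to h_A = 1 for a given proportion of
    breeding males, p = Nm / (Nm + Nf), with 0 < p < 1."""
    p = prop_male
    h_x = 9.0 / (8.0 * (1.0 + p))   # N_X / N_A
    h_y = 1.0 / (8.0 * (1.0 - p))   # N_Y / N_A
    return h_x, h_y

def step_change_ratio(h, f, g, n, h_a=1.0, mu_ratio=1.0):
    """Expected X:A or Y:A diversity ratio g generations after an
    instantaneous change from size n to size f * n."""
    def scaled_diversity(h_c):
        return f - (f - 1.0) * (1.0 - 1.0 / (2.0 * n * h_c * f)) ** g
    return (h / h_a) * mu_ratio * scaled_diversity(h) / scaled_diversity(h_a)

# Hypothetical example: tenfold contraction 1,000 generations ago, N = 10,000
h_x, h_y = inheritance_factors(0.5)
xa = step_change_ratio(h_x, f=0.1, g=1000, n=10000)
```

With an even breeding sex ratio the factors reduce to 3/4 and 1/4, and the ratio returns to h when f = 1 or when g is very large, in line with the qualitative behavior described in the Results.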

In supplementary figure S3, Supplementary Material online, we plot X:A against Y:A. For the bottleneck and step-change models, X:A and Y:A vary with time since the onset of size change; these trajectories can be traced clockwise along each curve from t = 0 to t=4N (backwards in time).

Demographic Inference

The four demographic models illustrated in figure 7A were fit to autosomal SFS using ∂a∂i v1.7 (Gutenkunst et al. 2009). We fit each model separately to each population, using the sum of marginal spectra from 1,000 approximately unlinked, putatively neutral intergenic regions each 100 kb in size, spanning a total of 85.4 Mb of callable sites after removing those missing in one or more populations. Because the depth, duration, and onset of a bottleneck are confounded in the SFS, we fixed the duration of the bottleneck to be short (0.1Ne generations) and attempted to estimate the remaining two parameters. We additionally constrained the bottleneck model to include recovery to exactly the starting population size.

Convergence of model fits was assessed qualitatively by refitting each model from ten sets of randomly drawn initial values. We confirmed that the best-fitting models shown in figure 7 represent the “modal” result, in that a majority of independent runs reach a solution within five log-likelihood units of the one shown. Parameter estimates should nonetheless be interpreted with caution, as their uncertainties are wide.

Models of Natural Selection

To model the effect of natural selection on Y-linked diversity while accounting for possible nonstationary demographic processes, we used forward simulations implemented in SLiM v2.2.1 (Haller and Messer 2017). For M. m. domesticus, we simulated from a stationary model; for M. m. musculus, from an exponential growth model.

We considered two modes of selection: background selection (BGS) due to purifying selection against deleterious mutations at linked sites; and hard selective sweeps on newly arising beneficial mutations. Relative fitness in SLiM is modeled as 1+s for sex-limited chromosomes. BGS was modeled by introducing mutations whose selection coefficients s were drawn from a mixture of a gamma distribution with mean $\gamma = N\bar{s}$ (100×α% of mutations) and a point mass at zero (100×(1−α)% of mutations). BGS simulations were run for 10 N generations, sufficient to reach mutation-selection-drift equilibrium. For selective sweeps, the simulation was first run for 10 N generations of burn-in, and then a beneficial variant was introduced with s drawn from a gamma distribution with mean $\gamma = N\bar{s}$. The simulation was then tracked until the beneficial variant was fixed or lost; in the case of loss, the run was restarted from the end of the burn-in period with a new mutation. We confirmed the integrity of simulations by checking that the pairwise diversity achieved by runs with selection coefficients fixed at zero matched the observed neutral values for each population (not shown). Values of α were drawn from a uniform distribution on (0,1), and values of γ were drawn from a log-uniform distribution on ($10^{-6}$, $10^{3}$). Runs were scaled for computational performance.
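The prior draws and the mixture distribution of selection coefficients described above can be sketched with NumPy as follows. This is an illustration only and is not the SLiM configuration used for the simulations. The gamma shape parameter, the population size, and the function names are hypothetical, and the function returns magnitudes of s without a sign convention.

```python
import numpy as np

def draw_priors(n_draws, rng):
    """Draw alpha ~ Uniform(0, 1) and gamma ~ log-uniform(1e-6, 1e3)."""
    alpha = rng.uniform(0.0, 1.0, size=n_draws)
    gamma = 10.0 ** rng.uniform(-6.0, 3.0, size=n_draws)
    return alpha, gamma

def draw_selection_coefficients(alpha, gamma, pop_size, n_mut, rng, shape=1.0):
    """Mixture: a fraction alpha of mutations gets |s| from a gamma
    distribution with mean gamma / pop_size; the rest are neutral (s = 0).
    The shape parameter is an assumed value."""
    mean_s = gamma / pop_size
    selected = rng.random(n_mut) < alpha
    s = np.zeros(n_mut)
    s[selected] = rng.gamma(shape, mean_s / shape, size=int(selected.sum()))
    return s

rng = np.random.default_rng(1)
alphas, gammas = draw_priors(1000, rng)
s_values = draw_selection_coefficients(0.5, 10.0, 1000, 5000, rng)
```

Sampling γ on a log scale spreads the prior evenly across orders of magnitude, which matters when the plausible range spans nine of them.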

Simulations were connected to an approximate Bayesian computation (ABC) inference procedure implemented with the R package abc (Csilléry et al. 2012). Briefly, 500,000 simulations were performed for each model. Five summary statistics were calculated from the SFS generated by each simulation: Watterson’s estimator θw; Tajima’s estimators θπ and θζ; Tajima’s D; and Fu and Li’s D. The same set of statistics was computed for the observed joint SFS. The 0.1% of simulations with smallest Euclidean distance to the observed summary statistics were retained, accounting for collinearity between summary statistics using the “neuralnet” method of the function abc::abc(). Posterior distributions were computed via kernel smoothing over the parameter values of the retained simulations using an Epanechnikov kernel and plug-in bandwidth estimate.
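The acceptance step can be illustrated with a plain rejection sampler. The sketch below is a simplification of the procedure described above: it keeps the fraction of simulations closest to the observed summary statistics but omits the neural-network regression adjustment and kernel smoothing performed by the R package. Scaling each statistic by its standard deviation is an assumption of this sketch, and the function name is hypothetical.

```python
import numpy as np

def abc_rejection(sim_params, sim_stats, obs_stats, tol=0.001):
    """Keep the fraction `tol` of simulations nearest the observed statistics.

    sim_params: array of simulated parameter values (one row per simulation)
    sim_stats:  simulations x statistics array of summary statistics
    obs_stats:  observed summary statistics
    """
    sim_params = np.asarray(sim_params, dtype=float)
    sim_stats = np.asarray(sim_stats, dtype=float)
    obs_stats = np.asarray(obs_stats, dtype=float)
    # Put statistics on a common scale before computing distances
    scale = sim_stats.std(axis=0)
    scale[scale == 0] = 1.0
    dist = np.sqrt((((sim_stats - obs_stats) / scale) ** 2).sum(axis=1))
    n_keep = max(1, int(round(tol * len(dist))))
    keep = np.argsort(dist)[:n_keep]
    return sim_params[keep], dist[keep]
```

The retained parameter values approximate a sample from the posterior, and their median and highest-density interval can then be summarized as in table 5.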

Models were compared via their Bayes factors, calculated using the abc::postpr() function. To confirm the fidelity of the best-fitting model, summary statistics for pseudo-observed data sets (i.e., simulations from the posterior distributions) were checked against the observed summary statistics.

Size Estimation of Coamplified Regions of Yq and X

Copy number of ampliconic genes on Yq and X was estimated as follows. First, all paralogs in each family were identified by BLAT and BLAST searches using the sequences of canonical family members from Ensembl. These searches were necessary because many members of each family are annotated only as “predicted genes” (gene symbols “GmXXXX”). Based on BLAST results we assigned the Spin2/4 family—with members in several clusters on the proximal X chromosome—as Sstx. Normalized coverage was estimated for each nonoverlapping paralog by counting the total number of reads mapped and dividing by the genome-wide average read depth.

Identification of De Novo CNVs in Collaborative Cross Lines

Whole-genome sequencing reads (2 × 150 bp paired-end) from a single male individual from each of 69 distinct Collaborative Cross (CC) lines were obtained from Srivastava et al. (2017). Alignment and quality control were performed as for wild mice. Read depth was estimated in 100 kb bins across the Y chromosome for each individual, and normalized for the effective depth of sequencing in that sample. Unambiguous alignment of 150 bp reads to the highly repetitive sequence on Yq is clearly not possible. However, each of the eight founder Y chromosome haplogroups in the CC produces a characteristic read depth profile when reads are aligned with bwa-mem. We exploited this fact to remove noise from ambiguous read mapping by renormalizing the estimated depth in each bin for each sample against the median depth in that bin for CC lines sharing the same Y chromosome haplogroup (listed in supplementary table S2, Supplementary Material online). Any remaining deviations in read depth represent variation among lines sharing the same Y chromosome haplogroup, that is, candidate de novo CNVs. CNVs were ascertained by manual inspection of the renormalized read depth profile of each CC line.
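The renormalization step can be sketched as follows, assuming a matrix of depth values with one row per CC line and one column per 100 kb bin, plus a haplogroup label for each line. The function name and the toy values are hypothetical.

```python
import numpy as np

def renormalize_by_haplogroup(depth, haplogroups):
    """Divide each sample's binned depth by the per-bin median depth of all
    samples sharing its Y chromosome haplogroup."""
    depth = np.asarray(depth, dtype=float)
    labels = np.asarray(haplogroups)
    out = np.empty_like(depth)
    for hap in np.unique(labels):
        rows = labels == hap
        median = np.median(depth[rows], axis=0)
        # Bins with zero median depth are uninformative; mark them as NaN
        median = np.where(median > 0, median, np.nan)
        out[rows] = depth[rows] / median
    return out

# Toy example: the third line carries a duplication in the second bin
depth = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 2.0, 1.0], [3.0, 3.0, 3.0]]
ratio = renormalize_by_haplogroup(depth, ["A", "A", "A", "B"])
```

After renormalization, bins with a ratio near 1 match the haplogroup's characteristic profile, and sustained departures from 1 flag candidate de novo CNVs for inspection.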

Analyses of Gene Expression

Multitissue, Multispecies Data Set

Neme and Tautz (2016) measured gene expression in whole testis from wild-derived outbred mice from several species (fig. 9A) using RNA-seq. Reads were retrieved from the European Nucleotide Archive (PRJEB11513). Transcript-level expression was estimated using kallisto (Bray et al. 2016) using the Ensembl 85 transcript catalog augmented with all Slx/y, Sstx/y, and Srsx/y transcripts identified by Soh et al. (2014). In the presence of redundant transcripts (i.e., from multiple copies of a coamplified gene family), kallisto uses an expectation-maximization algorithm to distribute the “weight” of each read across transcripts without double-counting. Transcript-level expression estimates were aggregated to the gene level for differential expression testing using the R package tximport. As for the microarray data, “predicted” genes (with symbols “GmXXXX”) on the Y chromosome were assigned to a coamplified family where possible using Ensembl Biomart.

Gene-level expression estimates were transformed to log scale and gene-wise dispersion parameters estimated using the voom() function in the R package limma. Genes with total normalized abundance (length-scaled transcripts per million, TPM) < 10 in aggregate across all samples were excluded, as were genes with TPM > 1 in fewer than three samples.
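The two abundance filters amount to a simple mask over a genes-by-samples TPM matrix. The sketch below is a NumPy illustration of that rule (the original analysis was carried out in R); the function name and example values are hypothetical.

```python
import numpy as np

def expressed_gene_mask(tpm, min_total=10.0, min_tpm=1.0, min_samples=3):
    """Return a boolean mask of genes to keep.

    A gene is kept if its TPM summed over samples is at least `min_total`
    and it exceeds `min_tpm` in at least `min_samples` samples.
    """
    tpm = np.asarray(tpm, dtype=float)
    enough_total = tpm.sum(axis=1) >= min_total
    enough_samples = (tpm > min_tpm).sum(axis=1) >= min_samples
    return enough_total & enough_samples

tpm = [[5.0, 5.0, 5.0, 0.0],    # kept
       [20.0, 0.0, 0.0, 0.0],   # high total, but expressed in one sample
       [1.0, 1.0, 1.0, 1.0],    # total below threshold
       [0.5, 2.0, 3.0, 4.5]]    # kept
mask = expressed_gene_mask(tpm)
```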

Spermatogenesis Time Course

Larson et al. (2017) measured gene expression in isolated spermatids of three males from each of four F1 crosses—CZECHII/EiJ × PWK/PhJ; LEWES/EiJ × PWK/PhJ; PWK/PhJ × LEWES/EiJ; and WSB/EiJ × LEWES/EiJ—using RNA-seq. Reads were retrieved from NCBI Short Read Archive (SRP065082). Transcript-level expression was estimated using kallisto (Bray et al. 2016) using the Ensembl 85 transcript catalog augmented with all Slx/y, Sstx/y, and Srsx/y transcripts identified by Soh et al. (2014). In the presence of redundant transcripts (i.e., from multiple copies of a coamplified gene family), kallisto uses an expectation-maximization algorithm to distribute the “weight” of each read across transcripts without double-counting. Transcript-level expression estimates were aggregated to the gene level for differential expression testing using the R package tximport. As for the microarray data, “predicted” genes (with symbols “GmXXXX”) on the Y chromosome were assigned to a coamplified family where possible using Ensembl Biomart.

Gene-level expression estimates were transformed to log scale and gene-wise dispersion parameters estimated using the voom() function in the R package limma. Genes with total normalized abundance (length-scaled transcripts per million, TPM) < 10 in aggregate across all samples were excluded, as were genes with TPM > 1 in fewer than three samples.

Definition of Tissue-Specific Gene Sets

The “tissue specificity index” (τ) of Yanai et al. (2005) was used to define tissue- or cell-type-specific gene sets. The index was first proposed for microarray data, and was adapted for RNA-seq as follows. Define $T_i$ to be the mean log-scaled expression of a gene in tissue or cell type $i$ (of $N$ total), as estimated by limma. We require expression values to be strictly positive, so let $q = \min_i T_i$ and define $\tilde{T}_i = T_i + q$. Finally, calculate τ as

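$$\tau = \frac{1}{N-1} \sum_{i=1}^{N} \left( 1 - \frac{\tilde{T}_i}{\tilde{T}_{\max}} \right), \qquad \tilde{T}_{\max} = \max_i \tilde{T}_i .$$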
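As a worked illustration of the index, the short Python sketch below evaluates τ for one gene. It is not the authors' code: the function name `tau` is hypothetical, and the input is assumed to be the already-shifted, positive values (one per tissue or cell type), so the shifting step itself is not reproduced here.

```python
def tau(shifted):
    """Tissue specificity index for one gene.

    shifted: shifted expression values, one per tissue (assumed positive).
    """
    n = len(shifted)
    t_max = max(shifted)
    if n < 2 or t_max <= 0:
        raise ValueError("need at least two tissues and a positive maximum")
    # Average, over the N - 1 degrees of freedom, of each tissue's
    # shortfall relative to the highest-expressing tissue.
    return sum(1.0 - t / t_max for t in shifted) / (n - 1)


uniform = tau([2.0, 2.0, 2.0, 2.0])   # same level in every tissue
specific = tau([1.0, 1.0, 1.0, 8.0])  # one tissue dominates
```

Uniform expression gives τ = 0, and the index approaches 1 as expression becomes confined to a single tissue; the second example evaluates to 0.875.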

The set of testis-biased genes was defined as all those with τ>0.5 and higher expression in testis than in any of the other 17 tissues in the multitissue data set (PRJEB11897; Harr et al. 2016). The set of ubiquitously expressed genes was defined as those with τ<0.25 and whose expression was above the median expression in the highest-expressing tissue. The set of early-meiosis genes was defined as those with τ>0.5 and highest expression in leptotene/zygotene spermatocytes; spermatid genes were defined as those with τ>0.5 and highest expression in round spermatids. We analyzed expression specificity during spermatogenesis separately in the two intrasubspecific F1 crosses, and took the union of the resulting gene sets.
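The specificity-based definitions, and the union across crosses, can be sketched as follows. This Python example is illustrative only and assumes expression values that are already shifted to be positive; the helper names (`tau`, `specific_to`), the cell-type labels, and all numbers are hypothetical. The ubiquitously expressed set is not shown.

```python
def tau(expr):
    """Tissue specificity index from a {cell_type: shifted expression} mapping."""
    t_max = max(expr.values())
    return sum(1.0 - v / t_max for v in expr.values()) / (len(expr) - 1)


def specific_to(expr, target, cutoff=0.5):
    """True if tau exceeds the cutoff and expression is strictly highest in target."""
    top_elsewhere = max(v for k, v in expr.items() if k != target)
    return tau(expr) > cutoff and expr[target] > top_elsewhere


# Made-up per-gene expression profiles for two crosses.
cross_a = {
    "g1": {"leptotene_zygotene": 1.0, "other": 1.0, "round_spermatid": 8.0},
    "g2": {"leptotene_zygotene": 4.0, "other": 4.0, "round_spermatid": 4.0},
}
cross_b = {
    "g1": {"leptotene_zygotene": 3.0, "other": 3.0, "round_spermatid": 4.0},
    "g3": {"leptotene_zygotene": 1.0, "other": 2.0, "round_spermatid": 8.0},
}

# Define the spermatid set separately in each cross, then take the union.
spermatid_a = {g for g, e in cross_a.items() if specific_to(e, "round_spermatid")}
spermatid_b = {g for g, e in cross_b.items() if specific_to(e, "round_spermatid")}
spermatid_genes = spermatid_a | spermatid_b
```

Taking the union means a gene needs to meet the criteria in only one cross: `g1` qualifies in the first cross but not the second, and still enters the final set alongside `g3`.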

Supplementary Material

Supplementary data are available at Molecular Biology and Evolution online.

Acknowledgments

The authors thank Jeff Good, Erica Larson, Michael Nachman, Megan Phifer-Rixey, Jacob Mueller, Alyssa Kruger, Marty Ferris, Peter Ellis, and additional anonymous reviewers for many insightful comments and suggestions. This work was supported by the National Institutes of Health (National Institute of Mental Health [F30MH103925], Eunice Kennedy Shriver National Institute of Child Health and Human Development [R01HD065024], National Institute of Allergy and Infectious Diseases [U19AI100625], and Office of the Director [U42OD010924]).

References

  1. Akaike H. 1978. On the likelihood of a time series model. J Roy Stat Soc D. 27(3/4): 217–235.
  2. Arbiza L, Gottipati S, Siepel A, Keinan A. 2014. Contrasting X-linked and autosomal diversity across 14 human populations. Am J Hum Genet. 94(6): 827–844.
  3. Auton A, McVean G. 2007. Recombination rate estimation in the presence of hotspots. Genome Res. 17(8): 1219–1227.
  4. Baines JF, Harr B. 2007. Reduced X-linked diversity in derived populations of house mice. Genetics 175(4): 1911–1921.
  5. Beaumont MA, Zhang W, Balding DJ. 2002. Approximate Bayesian computation in population genetics. Genetics 162(4): 2025–2035.
  6. Beck JA, Lloyd S, Hafezparast M, Lennon-Pierce M, Eppig JT, Festing MFW, Fisher EMC. 2000. Genealogies of mouse inbred strains. Nat Genet. 24(1): 23–25.
  7. Bellott DW, Hughes JF, Skaletsky H, Brown LG, Pyntikova T, Cho T-J, Koutseva N, Zaghlul S, Graves T, Rock S. 2014. Mammalian Y chromosomes retain widely expressed dosage-sensitive regulators. Nature 508(7497): 494–499.
  8. Bellott DW, Skaletsky H, Cho T-J, Brown L, Locke D, Chen N, Galkina S, Pyntikova T, Koutseva N, Graves T, et al. 2017. Avian W and mammalian Y chromosomes convergently retained dosage-sensitive regulators. Nat Genet. 49(3): 387–394.
  9. Berta P, Hawkins JB, Sinclair AH, Taylor A, Griffiths BL, Goodfellow PN, Fellous M. 1990. Genetic evidence equating SRY and the testis-determining factor. Nature 348(6300): 448–450.
  10. Bhaskar A, Song YS. 2014. Descartes’ rule of signs and the identifiability of population demographic models from genomic variation data. Ann Stat. 42(6): 2469–2493.
  11. Bhattacharyya T, Gregorova S, Mihola O, Anger M, Sebestova J, Denny P, Simecek P, Forejt J. 2013. Mechanistic basis of infertility of mouse intersubspecific hybrids. Proc Natl Acad Sci U S A. 110(6): E468–E477.
  12. Bishop CE, Boursot P, Baron B, Bonhomme F, Hatat D. 1985. Most classical Mus musculus domesticus laboratory mouse strains carry a Mus musculus musculus Y chromosome. Nature 315(6014): 70–72.
  13. Brawand D, Soumillon M, Necsulea A, Julien P, Csárdi G, Harrigan P, Weier M, Liechti A, Aximu-Petri A, Kircher M, et al. 2011. The evolution of gene expression levels in mammalian organs. Nature 478(7369): 343–348.
  14. Bray NL, Pimentel H, Melsted P, Pachter L. 2016. Near-optimal probabilistic RNA-seq quantification. Nat Biotech. 34(8): 525–527.
  15. Bulatova N, Kotenkova E. 1990. Variants of the Y-chromosome in sympatric taxa of Mus in southern USSR. Boll Zool. 57(4): 357–360.
  16. Burgoyne PS, Mahadevaiah SK, Sutcliffe MJ, Palmer SJ. 1992. Fertility in mice requires X-Y pairing and a Y-chromosomal “Spermiogenesis” gene mapping to the long arm. Cell 71(3): 391–398.
  17. Campbell P, Good JM, Nachman MW. 2013. Meiotic sex chromosome inactivation is disrupted in sterile hybrid male house mice. Genetics 193(3): 819–828.
  18. Campbell P, Nachman MW. 2014. X-Y interactions underlie sperm head abnormality in hybrid male house mice. Genetics 196(4): 1231–1240.
  19. Case LK, Wall EH, Osmanski EE, Dragon JA, Saligrama N, Zachary JF, Lemos B, Blankenhorn EP, Teuscher C. 2015. Copy number variation in Y chromosome multicopy genes is linked to a paternal parent-of-origin effect on CNS autoimmune disease in female offspring. Genome Biol. 16(1): 28.
  20. Charlesworth B, Coyne JA, Barton NH. 1987. The relative rates of evolution of sex chromosomes and autosomes. Am Nat. 130(1): 113–146.
  21. Cocquet J, Ellis PJI, Mahadevaiah SK, Affara NA, Vaiman D, Burgoyne PS. 2012. A genetic basis for a postmeiotic X versus Y chromosome intragenomic conflict in the mouse. PLoS Genet. 8(9): e1002900.
  22. Cocquet J, Ellis PJI, Yamauchi Y, Mahadevaiah SK, Affara NA, Ward MA, Burgoyne PS. 2009. The multicopy gene Sly represses the sex chromosomes in the male mouse germline after meiosis. PLoS Biol. 7(11): e1000244.
  23. Cocquet J, Ellis PJI, Yamauchi Y, Riel JM, Karacs TPS, Rattigan Á, Ojarikre OA, Affara NA, Ward MA, Burgoyne PS. 2010. Deficiency in the multicopy Sycp3-like X-linked genes Slx and Slxl1 causes major defects in spermatid differentiation. Mol Biol Cell. 21(20): 3497–3505.
  24. Collaborative Cross Consortium. 2012. The genome architecture of the collaborative cross mouse genetic reference population. Genetics 190(2): 389–401.
  25. Conway SJ, Mahadevaiah SK, Darling SM, Capel B, Rattigan AM, Burgoyne PS. 1994. Y353/B: a candidate multiple-copy spermiogenesis gene on the mouse Y chromosome. Mamm Genome 5(4): 203–210.
  26. Cortez D, Marin R, Toledo-Flores D, Froidevaux L, Liechti A, Waters PD, Grützner F, Kaessmann H. 2014. Origins and functional evolution of Y chromosomes across mammals. Nature 508(7497): 488–493.
  27. Csilléry K, François O, Blum MGB. 2012. abc: an R package for approximate Bayesian computation (ABC). Methods Ecol Evol. 3(3): 475–479.
  28. Derome N, Métayer K, Montchamp-Moreau C, Veuille M. 2004. Signature of selective sweep associated with the evolution of sex-ratio drive in Drosophila simulans. Genetics 166(3): 1357–1366.
  29. Din W, Anand R, Boursot P, Darviche D, Dod B, Jouvin-Marche E, Orth A, Talwar G, Cazenave PA, Bonhomme F. 1996. Origin and radiation of the house mouse: clues from nuclear genes. J Evol Biol. 9(5): 519–539.
  30. Doran AG, Wong K, Flint J, Adams DJ, Hunter KW, Keane TM. 2016. Deep genome sequencing and variation analysis of 13 inbred mouse strains defines candidate phenotypic alleles, private variation and homozygous truncating mutations. Genome Biol. 17(1): 167.
  31. Drost JB, Lee WR. 1995. Biological basis of germline mutation: comparisons of spontaneous germline mutation rates among Drosophila, mouse, and human. Environ Mol Mutagen. 25(S2): 48–64.
  32. Egan CM, Sridhar S, Wigler M, Hall IM. 2007. Recurrent DNA copy number variation in the laboratory mouse. Nat Genet. 39(11): 1384–1389.
  33. Eicher EM, Hutchison KW, Phillips SJ, Tucker PK, Lee BK. 1989. A repeated segment on the mouse Y chromosome is composed of retroviral-related, Y-enriched and Y-specific sequences. Genetics 122(1): 181–192.
  34. Ellegren H. 2011. Sex-chromosome evolution: recent progress and the influence of male and female heterogamety. Nat Rev Genet. 12(3): 157–166.
  35. Ellis PJI, Bacon J, Affara NA. 2011. Association of Sly with sex-linked gene amplification during mouse evolution: a side effect of genomic conflict in spermatids? Hum Mol Genet. 20(15): 3010–3021.
  36. Ellis PJI, Clemente EJ, Ball P, Touré A, Ferguson L, Turner JMA, Loveland KL, Affara NA, Burgoyne PS. 2005. Deletions on mouse Yq lead to upregulation of multiple X- and Y-linked transcripts in spermatids. Hum Mol Genet. 14(18): 2705–2715.
  37. Fischer M, Kosyakova N, Liehr T, Dobrowolski P. 2016. Large deletion on the Y-chromosome long arm (Yq) of C57BL/6JBomTac inbred mice. Mamm Genome. 28: 1–7.
  38. Forejt J. 1996. Hybrid sterility in the mouse. Trends Genet. 12(10): 412–417.
  39. Forejt J, Iványi P. 1974. Genetic studies on male sterility of hybrids between laboratory and wild mice (Mus musculus L.). Genet Res. 24(2): 189–206.
  40. Geraldes A, Basset P, Gibson B, Smith KL, Harr B, Yu HT, Bulatova N, Ziv Y, Nachman MW. 2008. Inferring the history of speciation in house mice from autosomal, X-linked, Y-linked and mitochondrial genes. Mol Ecol. 17(24): 5349–5363.
  41. Good JM, Dean MD, Nachman MW. 2008. A complex genetic basis to X-linked hybrid male sterility between two species of house mice. Genetics 179(4): 2213–2228.
  42. Good JM, Giger T, Dean MD, Nachman MW. 2010. Widespread over-expression of the X chromosome in sterile F1 hybrid mice. PLoS Genet. 6(9): e1001148.
  43. Good JM, Nachman MW. 2005. Rates of protein evolution are positively correlated with developmental timing of expression during mouse spermatogenesis. Mol Biol Evol. 22(4): 1044–1052.
  44. Graves JAM. 2006. Sex chromosome specialization and degeneration in mammals. Cell 124(5): 901–914.
  45. Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD. 2009. Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet. 5(10): e1000695.
  46. Haller BC, Messer PW. 2017. SLiM 2: flexible, interactive forward genetic simulations. Mol Biol Evol. 34(1): 230–240.
  47. Halligan DL, Kousathanas A, Ness RW, Harr B, Eöry L, Keane TM, Adams DJ, Keightley PD. 2013. Contributions of protein-coding and regulatory change to adaptive molecular evolution in murid rodents. PLoS Genet. 9(12): e1003995.
  48. Harr B, Karakoc E, Neme R, Teschke M, Pfeifle C, Pezer Ž, Babiker H, Linnenbrink M, Montero I, Scavetta R, et al. 2016. Genomic resources for wild populations of the house mouse, Mus musculus and its close relative Mus spretus. Sci Data. 3: 160075.
  49. Hendriksen PJ, Hoogerbrugge JW, Themmen AP, Koken MH, Hoeijmakers JH, Oostra BA, van der Lende T, Grootegoed JA. 1995. Postmeiotic transcription of X and Y chromosomal genes during spermatogenesis in the mouse. Dev Biol. 170(2): 730–733.
  50. Hudson RR, Kaplan NL. 1995. The coalescent process and background selection. Philos Trans R Soc Lond B Biol Sci. 349(1327): 19–23.
  51. Hughes JF, Page DC. 2015. The biology and evolution of mammalian Y chromosomes. Ann Rev Genet. 49: 507–527.
  52. Hvilsom C, Qian Y, Bataillon T, Li Y, Mailund T, Sallé B, Carlsen F, Li R, Zheng H, Jiang T, et al. 2012. Extensive X-linked adaptive evolution in central chimpanzees. Proc Natl Acad Sci U S A. 109(6): 2054–2059.
  53. Itsara A, Wu H, Smith JD, Nickerson DA, Romieu I, London SJ, Eichler EE. 2010. De novo rates and selection of large copy number variation. Genome Res. 20(11): 1469–1481.
  54. Jaenike J. 2001. Sex chromosome meiotic drive. Annu Rev Ecol Syst. 32(1): 25–49.
  55. Keane TM, Goodstadt L, Danecek P, White MA, Wong K, Yalcin B, Heger A, Agam A, Slater G, Goodson M, et al. 2011. Mouse genomic variation and its effect on phenotypes and gene regulation. Nature 477(7364): 289–294.
  56. Kingan SB, Garrigan D, Hartl DL. 2010. Recurrent selection on the Winters sex-ratio genes in Drosophila simulans. Genetics 184(1): 253–265.
  57. Korneliussen TS, Albrechtsen A, Nielsen R. 2014. ANGSD: analysis of next generation sequencing data. BMC Bioinformatics 15: 356.
  58. Kousathanas A, Halligan DL, Keightley PD. 2014. Faster-X adaptive protein evolution in house mice. Genetics 196(4): 1131–1143.
  59. Lahn BT, Page DC. 1997. Functional coherence of the human Y chromosome. Science 278(5338): 675–680.
  60. Larson EL, Vanderpool D, Keeble S, Zhou M, Sarver BAJ, Smith AD, Dean MD, Good JM. 2016. Contrasting levels of molecular evolution on the mouse X chromosome. Genetics 203(4): 1841–1857.
  61. Larson EL, Keeble S, Vanderpool D, Dean MD, Good JM. 2017. The composite regulatory basis of the large X-effect in mouse speciation. Mol Biol Evol. 34(2): 282–295.
  62. Li H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997.
  63. Liu KJ, Steinberg E, Yozzo A, Song Y, Kohn MH, Nakhleh L. 2015. Interspecific introgressive origin of genomic diversity in the house mouse. Proc Natl Acad Sci U S A. 112(1): 196–201.
  64. MacBride MM, Navis A, Dasari A, Perez AV. 2017. Mild reproductive impact of a Y chromosome deletion on a C57BL/6J substrain. Mamm Genome. 28(5–6): 155–165.
  65. Macholán M, Baird SJ, Munclinger P, Dufková P, Bímová B, Piálek J. 2008. Genetic conflict outweighs heterogametic incompatibility in the mouse hybrid zone? BMC Evol Biol. 8: 271.
  66. McDonald JH, Kreitman M. 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351(6328): 652–654.
  67. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20(9): 1297–1303.
  68. McLaren A, Simpson E, Epplen JT, Studer R, Koopman P, Evans EP, Burgoyne PS. 1988. Location of the genes controlling H-Y antigen expression and testis determination on the mouse Y chromosome. Proc Natl Acad Sci U S A. 85(17): 6442–6445.
  69. Moretti C, Vaiman D, Tores F, Cocquet J. 2016. Expression and epigenomic landscape of the sex chromosomes in mouse post-meiotic male germ cells. Epigenetics Chromatin 9: 47.
  70. Morgan AP, Didion JP, Doran AG, Holt JM, McMillan L, Keane TM, Pardo-Manuel de Villena F. 2016. Genome report: whole genome sequence of two wild-derived Mus musculus domesticus inbred strains, LEWES/EiJ and ZALENDE/EiJ, with different diploid numbers. G3 6(12): 4211–4216.
  71. Morgan AP, Gatti DM, Najarian ML, Keane TM, Galante RJ, Pack AI, Mott R, Churchill GA, Pardo-Manuel de Villena F. 2017. Structural variation shapes the landscape of recombination in mouse. Genetics 206(2): 603–619.
  72. Morgan AP, Holt JM, McMullan RC, Bell TA, Clayshulte AM-F, Didion JP, Yadgary L, Thybert D, Odom DT, Flicek P, et al. 2016. The evolutionary fates of a large segmental duplication in mouse. Genetics 204(1): 267–285.
  73. Mueller JL, Mahadevaiah SK, Park PJ, Warburton PE, Page DC, Turner JMA. 2008. The mouse X chromosome is enriched for multicopy testis genes showing postmeiotic expression. Nat Genet. 40(6): 794–799.
  74. Mueller JL, Skaletsky H, Brown LG, Zaghlul S, Rock S, Graves T, Auger K, Warren WC, Wilson RK, Page DC. 2013. Independent specialization of the human and mouse X chromosomes for the male germ line. Nat Genet. 45(9): 1083–1087.
  75. Nagamine CM, Nishioka Y, Moriwaki K, Boursot P, Bonhomme F, Lau YFC. 1992. The musculus-type Y chromosome of the laboratory mouse is of Asian origin. Mamm Genome. 3(2): 84–91.
  76. Nam K, Munch K, Hobolth A, Dutheil JY, Veeramah KR, Woerner AE, Hammer MF, Mailund T, Schierup MH. 2015. Extreme selective sweeps independently targeted the X chromosomes of the great apes. Proc Natl Acad Sci U S A. 112(20): 6413–6418.
  77. Neme R, Tautz D. 2016. Fast turnover of genome transcription across evolutionary time exposes entire non-coding DNA to de novo gene emergence. eLife 5: e09977.
  78. Nishioka Y, Lamothe E. 1986. Isolation and characterization of a mouse Y chromosomal repetitive sequence. Genetics 113(2): 417–432.
  79. Oh B, Hwang SY, Solter D, Knowles BB. 1997. Spindlin, a major maternal transcript expressed in the mouse during the transition from oocyte to embryo. Development 124: 493–503.
  80. Orth A, Belkhir K, Britton-Davidian J, Boursot P, Benazzou T, Bonhomme F. 2002. Natural hybridization between 2 sympatric species of mice, Mus musculus domesticus L. and Mus spretus Lataste. C R Biol. 325(2): 89–97.
  81. Payseur BA, Krenz JG, Nachman MW. 2004. Differential patterns of introgression across the X chromosome in a hybrid zone between two species of house mice. Evolution 58(9): 2064–2078.
  82. Polanski A, Szczesna A, Garbulowski M, Kimmel M. 2017. Coalescence computations for large samples drawn from populations of time-varying sizes. PLoS One 12(2): e0170701.
  83. Pool JE, Nielsen R. 2007. Population size changes reshape genomic patterns of diversity. Evolution 61(12): 3001–3006.
  84. Pritchard JK, Seielstad MT, Perez-Lezaun A, Feldman MW. 1999. Population growth of human Y chromosomes: a study of Y chromosome microsatellites. Mol Biol Evol. 16(12): 1791–1798.
  85. Repping S, van Daalen SKM, Brown LG, Korver CM, Lange J, Marszalek JD, Pyntikova T, van der Veen F, Skaletsky H, Page DC, et al. 2006. High mutation rates have driven extensive structural polymorphism among human Y chromosomes. Nat Genet. 38(4): 463–467.
  86. Rice WR. 1984. Sex chromosomes and the evolution of sexual dimorphism. Evolution 38(4): 735–742.
  87. Salcedo T, Geraldes A, Nachman MW. 2007. Nucleotide variation in wild and inbred mice. Genetics 177(4): 2277–2291.
  88. Sarver BAJ, Keeble S, Cosart T, Tucker PK, Dean MD, Good JM. 2017. Phylogenomic insights into mouse evolution using a pseudoreference approach. Genome Biol Evol. 9(3): 726–739.
  89. Sayres MAW, Lohmueller KE, Nielsen R. 2014. Natural selection reduced diversity on human Y chromosomes. PLoS Genet. 10(1): e1004064.
  90. Scally A. 2016. Mutation rates and the evolution of germline structure. Philos Trans R Soc Lond B Biol Sci. 371: 20150137.
  91. Simmons LW, Fitzpatrick JL. 2012. Sperm wars and the evolution of male fertility. Reproduction 144(5): 519–534.
  92. Smith NGC, Eyre-Walker A. 2002. Adaptive protein evolution in Drosophila. Nature 415(6875): 1022–1024.
  93. Soh YQS, Alföldi J, Pyntikova T, Brown LG, Graves T, Minx PJ, Fulton RS, Kremitzki C, Koutseva N, Mueller JL, et al. 2014. Sequencing the mouse Y chromosome reveals convergent gene acquisition and amplification on both sex chromosomes. Cell 159(4): 800–813.
  94. Song Y, Endepols S, Klemann N, Richter D, Matuschka FR, Shih CH, Nachman MW, Kohn MH. 2011. Adaptive introgression of anticoagulant rodent poison resistance by hybridization between old world mice. Curr Biol. 21(15): 1296–1301.
  95. Spiess AN, Walther N, Müller N, Balvers M, Hansis C, Ivell R. 2003. SPEER: a new family of testis-specific genes from the mouse. Biol Reprod. 68(6): 2044–2054.
  96. Srivastava A, Morgan AP, Najarian ML, Sarsani VK, Sigmon JS, Shorter JR, Kashfeen A, McMullan RC, Williams LH, Giusti-Rodríguez P, et al. 2017. Genomes of the mouse collaborative cross. Genetics 206(2): 537–556.
  97. Storchová R, Gregorová S, Buckiová D, Kyselová V, Divina P, Forejt J. 2004. Genetic analysis of X-linked hybrid sterility in the house mouse. Mamm Genome. 15(7): 515–524.
  98. Styrna J, Klag J, Moriwaki K. 1991. Influence of partial deletion of the Y chromosome on mouse sperm phenotype. J Reprod Fertil. 92(1): 187–195.
  99. Teeter KC, Payseur BA, Harris LW, Bakewell MA, Thibodeau LM, O'Brien JE, Krenz JG, Sans-Fuentes MA, Nachman MW, Tucker PK. 2008. Genome-wide patterns of gene flow across a house mouse hybrid zone. Genome Res. 18(1): 67–76.
  100. Tishkoff SA, Reed FA, Ranciaro A, Voight BF, Babbitt CC, Silverman JS, Powell K, Mortensen HM, Hirbo JB, Osman M, et al. 2007. Convergent adaptation of human lactase persistence in Africa and Europe. Nat Genet. 39(1): 31–40.
  101. Torgerson DG, Singh RS. 2003. Sex-linked mammalian sperm proteins evolve faster than autosomal ones. Mol Biol Evol. 20(10): 1705–1709.
  102. Touré A, Clemente EJ, Ellis P, Mahadevaiah SK, Ojarikre OA, Ball PAF, Reynard L, Loveland KL, Burgoyne PS, Affara NA. 2005. Identification of novel Y chromosome encoded transcripts by testis transcriptome analysis of mice with deletions of the Y chromosome long arm. Genome Biol. 6: R102.
  103. Touré A, Szot M, Mahadevaiah SK, Rattigan Á, Ojarikre OA, Burgoyne PS. 2004. A new deletion of the mouse Y chromosome long arm associated with the loss of Ssty expression, abnormal sperm development and sterility. Genetics 166(2): 901–912.
  104. Tu S, Shin Y, Zago WM, States BA, Eroshkin A, Lipton SA, Tong GG, Nakanishi N. 2007. Takusan: a large gene family that regulates synaptic activity. Neuron 55(1): 69–85.
  105. Tucker PK, Lee BK, Lundrigan BL, Eicher EM. 1992. Geographic origin of the Y chromosomes in “old” inbred strains of mice. Mamm Genome. 3(5): 254–261.
  106. Turner LM, Schwahn DJ, Harr B. 2012. Reduced male fertility is common but highly variable in form and severity in a natural house mouse hybrid zone. Evolution 66(2): 443–458.
  107. Turner LM, White MA, Tautz D, Payseur BA. 2014. Genomic networks of hybrid sterility. PLoS Genet. 10(2): e1004162.
  108. Uchimura A, Higuchi M, Minakuchi Y, Ohno M, Toyoda A, Fujiyama A, Miura I, Wakana S, Nishino J, Yagi T. 2015. Germline mutation rates and the long-term phenotypic effects of mutation accumulation in wild-type laboratory mice and mutator mice. Genome Res. 25(8): 1125–1134.
  109. Veeramah KR, Gutenkunst RN, Woerner AE, Watkins JC, Hammer MF. 2014. Evidence for increased levels of positive and negative selection on the X chromosome versus autosomes in humans. Mol Biol Evol. 31(9): 2267–2282.
  110. Webster TH, Wilson Sayres MA. 2016. Genomic signatures of sex-biased demography: progress and prospects. Curr Opin Genet Dev. 41: 62–71.
  111. Yakimenko LV, Korobitsyna KV, Frisman LV, Muntianu AI. 1990. Cytogenetic and biochemical comparison of Mus musculus and Mus hortolanus. Experientia 46(10): 1075–1077.
  112. Yamauchi Y, Riel JM, Stoytcheva Z, Burgoyne PS, Ward MA. 2010. Deficiency in mouse Y chromosome long arm gene complement is associated with sperm DNA damage. Genome Biol. 11(6): R66.
  113. Yamauchi Y, Riel JM, Wong SJ, Ojarikre OA, Burgoyne PS, Ward MA. 2009. Live offspring from mice lacking the Y chromosome long arm gene complement. Biol Reprod. 81(2): 353–361.
  114. Yanai I, Benjamin H, Shmoish M, Chalifa-Caspi V, Shklar M, Ophir R, Bar-Even A, Horn-Saban S, Safran M, Domany E, et al. 2005. Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics 21(5): 650–659.
  115. Yang H, Wang JR, Didion JP, Buus RJ, Bell TA, Welsh CE, Bonhomme F, Yu AH-T, Nachman MW, Pialek J, et al. 2011. Subspecific origin and haplotype diversity in the laboratory mouse. Nat Genet. 43(7): 648–655.

Articles from Molecular Biology and Evolution are provided here courtesy of Oxford University Press
