Abstract
The human X and Y chromosomes evolved from a pair of autosomes approximately 180 million years ago. Despite their shared evolutionary origin, extensive genetic decay has resulted in the human Y chromosome losing 97% of its ancestral genes while gene content and order remain highly conserved on the X chromosome. Five ‘stratification’ events, most likely inversions, reduced the Y chromosome’s ability to recombine with the X chromosome across the majority of its length and subjected its genes to the erosive forces associated with reduced recombination. The remaining functional genes are ubiquitously expressed, functionally coherent, dosage-sensitive genes, or have evolved male-specific functionality. It is unknown, however, whether functional specialization is a degenerative phenomenon unique to sex chromosomes, or if it conveys a potential selective advantage aside from sexual antagonism. We examined the evolution of mammalian orthologs to determine if the selective forces that led to the degeneration of the Y chromosome are unique in the genome. The results of our study suggest these forces are not exclusive to the Y chromosome, and chromosomal degeneration may have occurred throughout our evolutionary history. The reduction of recombination could additionally result in rapid fixation through isolation of specialized functions resulting in a cost-benefit relationship during times of intense selective pressure.
Subject terms: Genomics, Genome evolution, Genetics, Evolutionary biology
Introduction
The human Y chromosome has lost its ability to recombine with its once homologous partner, the X chromosome, except in its pseudoautosomal regions (PARs) at the termini of the X and Y chromosomes1–3. This has resulted in the majority of the Y chromosome’s gene content being inherited as a unit, known as the human MSY (male-specific region of the Y chromosome)4. Suppression of recombination occurred at five discrete time points, probably caused by inversions, that integrated each segment into the MSY and initiated the degradative processes5 that resulted in widespread gene deletion and loss1,2,6,7. These evolutionary strata show a continuum of degeneration that is highly correlated with the age of X-Y gene pairs within each stratum1,2,4; the oldest stratum contains only four remaining genes, including the sex-determining factor SRY8. The degenerative nature of the Y chromosome has led some researchers to suggest it may lose all functional genes and become extinct in as little as 5 million years8–10, an evolutionary phenomenon that has been observed in other species11–13. Recent research, however, suggests that the Y chromosome has maintained a stable assortment of genes for the last 25 million years3,14,15 through effective purifying selection on single-copy genes16, and intrachromosomal gene conversion of ampliconic sequences17–21. Despite conflicting views on the terminal fate of the Y chromosome, functional specialization and biased gene retention22–24 on the Y chromosome are believed to be unique in the genome25 and may have played an essential role in Y chromosome degeneration.
The remaining functional genes in the human MSY fall into three classes: X-degenerate, ampliconic, and X-transposed3,4,26. The X-transposed sequences are a result of an X-to-Y transposition that occurred after the divergence of the human and chimpanzee lineages, approximately 3-4 million years ago4,26. These sequences remain 99% identical to their X counterparts4. In contrast, the X-degenerate sequences are single-copy MSY genes that are surviving relics of the ancestral autosomes from which the sex chromosomes evolved4. With the notable exception of SRY, these genes are functionally coherent25, and ubiquitously expressed1,14,17. Their homologous X counterparts also disproportionately escape X-inactivation and are subject to stronger purifying selection than other X-linked genes17. Thus, researchers have suggested that this class of sequences is dosage-sensitive and potentially haplolethal17. The last class of functional genes in the MSY consists of nine protein-coding gene families that have undergone various levels of amplification4. Unlike the ubiquitously expressed X-degenerate genes, the ampliconic gene families are expressed primarily or exclusively in the testes4,18 and rely on intrachromosomal gene conversion to offset the degenerative nature of the MSY18–21. Surviving Y-linked genes were therefore retained through two evolutionary mechanisms: effective purifying selection on single copy dosage-sensitive genes16 and intrachromosomal gene conversion of ampliconic sequences17–21.
Widespread gene loss accompanied by preferential retention appears to be a phenomenon unique to the Y chromosome. A review of genomic evolution, however, suggests that these trends are not unique to the Y chromosome, with the relevant literature rarely being cross-cited27. Recent research suggests that the ancestral vertebrate karyotype was much larger than previously estimated, consisting of an estimated 54 chromosomes28 resulting from two ancestral whole-genome duplication (WGD) events28–31. The majority of genes following a WGD event are rapidly lost or pseudogenized due to loss-of-function mutations7,32–34. This loss has also been shown to continue on a power scale33,35. Consequently, a large portion of the ancestral vertebrate chromosomes has been subsequently lost through fusion in the descent of the human lineage28,31, explaining the apparently haphazard gene content of most autosomes1. Highly expressed genes36, dosage-sensitive protein complexes34,37, and transcriptional and developmental regulators and signal transducers, however, are preferentially retained30,33,38–40. Furthermore, these genes have been maintained through purifying selection37, a trend that has been observed in ubiquitously expressed genes throughout the genome41–45. The factors that led to the biased retention of ubiquitously expressed single-copy genes, therefore, appear not to be restricted to the evolutionary history of the Y chromosome and have been observed in the events following large-scale duplications. The biased acquisition of male-advantage traits on the Y chromosome is a subject of more considerable ambiguity in the context of genomic duplications.
Subfunctionalization has been shown to increase the likelihood a gene will be preserved in duplicate due to partial loss of function mutations in both copies46. This targeted divergence of the duplicates may lead to differential tissue expression of the paralogs34,35,45,47–49 and has been proposed to occur frequently following WGD events50. If the remaining functions are under selective constraint, the duplicates will likely remain in the population47. A lack of genome-wide representation of subfunctionalized gene pairs, however, suggests that this may be a transition phase to neofunctionalization due to an absence of purifying selection on the redundant portions of the gene51, an evolutionary phenomenon known as the subneofunctionalization model52. In 2009, Wilson and Makova suggested that suppression of recombination could be thought of as a duplication event and showed X-Y genes followed similar patterns of evolution following recombination suppression as duplicated paralogs53,54. Following a review of experimental data, they also concluded that the acquisition of unique expression patterns and functions might have contributed to the retention of Y-linked genes. Strong expression reduction has also been implicated in the evolution of Y genes towards testis specificity55. The biased content of male reproductive genes on both sex chromosomes4,56–58, therefore, suggests that subfunctionalization of Y-linked genes could explain the initial retention and accelerated divergence of male-advantage genes, as new evolutionary features typically bear marks of their ancestry50.
The WGD events at the origin of the vertebrate lineage may have had significant impacts on biological complexity and evolutionary novelties of the time due to the large-scale increase in genetic redundancy38. The mechanisms by which this was achieved and the selective pressures resulting in differential chromosome survival remain unknown. If the evolutionary history of the Y chromosome provides a model of genomic evolution, it would suggest that large-scale duplication events allowed genes to subfunctionalize and experience periods of relaxed purifying selection through relief from pleiotropic constraints that were operating on single-gene loci46. It has also been hypothesized that the Y chromosome’s long-term fragility may be driven by short-term selective pressures59, the most obvious of which is the accumulation of sexually antagonistic alleles in a non-recombining portion of the genome59–65, a phenomenon that is supported by the transposition of male-advantage genes into the MSY from autosomes1,3,4,14,66–68. The rapid evolution of male reproductive genes45,69–71 and the implication of inversions in local adaptation7,72, however, suggest that functional isolation may become selectively favored even in the absence of sexually antagonistic traits under certain circumstances, despite the deleterious effects of reduced recombination. To test these hypotheses, and in light of the large amount of literature pertaining to expression, we analyzed the ratio of nonsynonymous to synonymous substitution rates (Ka/Ks) of 6,734 human genes with surviving mammalian orthologs in the context of their Gene Ontology (GO) annotations and chromosomal locations to determine if functional specialization and genomic isolation, respectively, convey a selective advantage.
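For reference, the ratio used as the measure of selection throughout can be written compactly. The symbol ω is a common shorthand and is an assumption here, not notation taken from this article; the interpretation bands are the conventional ones.

```latex
\omega = \frac{K_a}{K_s}, \qquad
\begin{cases}
\omega < 1 & \text{purifying selection} \\
\omega \approx 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive selection}
\end{cases}
```

Here K_a is the number of nonsynonymous substitutions per nonsynonymous site and K_s is the number of synonymous substitutions per synonymous site.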
Results
Functional diversity of orthologs
Two primary paths to survival have occurred on the Y chromosome. Broadly expressed dosage-sensitive genes have been maintained through purifying selection16, while amplification and gene conversion have supported testis-specific genes17–21. Selection for conservation through amplification and gene conversion of testis-specific sequences and the rapid evolution of male reproductive genes45,69–71 suggest that adaptability may be a selected phenomenon. Evolution of testis-specific functions is also believed to have preceded amplification on the Y chromosome17, suggesting subfunctionalization may have facilitated their initial retention45 and subsequent divergence by relieving redundant portions of the genes from adaptive constraint.
To determine if this is a universal trend, we analyzed the divergence of human genes across their mammalian orthologs to provide a conservative estimate of the degree to which newly duplicated genes may diverge73 following subfunctionalization. Summary statistics can be found in Supplementary Table S1. The results of our analysis suggest that a human gene’s average Ka/Ks across its related orthologs and its number of GO annotations are both positively skewed (skew = 3.92 and 3.06, respectively) (Supplementary Fig. S1). Additionally, average Ka/Ks is zero-inflated. This suggests that the majority of orthologous genes are under purifying selection and related to a small set of functions. As a gene’s average Ka/Ks value appears to be negatively associated with its number of GO annotations (Fig. 1), a gamma-hurdle model was employed (see Methods) to determine the statistical significance of this relationship. Our results suggest that a gene’s average Ka/Ks decreases with increasing numbers of functional annotations (p = ) (Supplementary Fig. S2), and the probability that it is entirely conserved increases (p = , odds-ratio = 1.011) (Supplementary Fig. S3).
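The zero part of a hurdle model can be sketched as a logistic regression on whether a gene's average Ka/Ks is exactly zero. The sketch below is illustrative only: the data are hypothetical, the function name `fit_logistic` is invented for this example, and the fitting routine is a plain Newton iteration rather than the authors' software. The positive values, not fitted here, would go to the gamma part of the model.

```python
import math

def fit_logistic(x, z, iters=25):
    """Fit P(z = 1) = sigmoid(b0 + b1 * x) by Newton's method (one predictor)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, zi in zip(x, z):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += zi - p            # gradient of the log-likelihood
            g1 += (zi - p) * xi
            h00 += w                # entries of X'WX (negative Hessian)
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X'WX) * delta = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical toy data: number of GO annotations and average Ka/Ks per gene.
n_go = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15]
kaks = [0.30, 0.0, 0.25, 0.18, 0.0, 0.20, 0.10, 0.0, 0.0, 0.08, 0.0, 0.0]

# Hurdle step 1: model whether a gene is entirely conserved (average Ka/Ks == 0).
is_zero = [1 if k == 0 else 0 for k in kaks]
b0, b1 = fit_logistic(n_go, is_zero)
odds_ratio = math.exp(b1)  # multiplicative change in odds per extra annotation

# Hurdle step 2 (not fitted here): positive values go to a gamma regression.
positive = [(n, k) for n, k in zip(n_go, kaks) if k > 0]

# Reading the reported odds ratio of 1.011 per annotation: because the model is
# linear in log-odds, ten extra annotations multiply the odds by 1.011 ** 10.
reported_or_per_10 = 1.011 ** 10  # about 1.12
print(odds_ratio, len(positive), reported_or_per_10)
```

An odds ratio close to 1 per annotation can still matter across the range of the predictor, which is why the compounded value is shown alongside the fitted one.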
The results of this analysis suggest that genes with high functional diversity are under more intensive purifying selection than their more functionally specific counterparts. These findings parallel those showing higher levels of purifying selection on broadly expressed essential genes throughout the genome41–45 as well as on the Y chromosome17 and suggest an association between the two. We conclude that functional specificity and reduced expression are associated with relaxed purifying selection, suggesting that subfunctionalization of duplicated paralogs could result in differential tissue expression34,35,45,47–49 and accelerated protein divergence.
Genomic isolation of functional annotations
Next, we were interested in determining if functional isolation can provide a selective advantage in the absence of sexual antagonism. The rapid evolution of male reproductive genes45,69–71 and the implication of inversions in local adaptation7,72 suggest that the localization of functionally related genes may accelerate protein divergence and facilitate adaptability. To determine if localization of genes related to a given function is associated with reduced purifying selection, we analyzed the average Ka/Ks of each GO annotation’s related sequence comparisons with respect to the genomic distribution of its related genes. Summary statistics can be found in Supplementary Table S2. GO annotations show positively skewed distributions for their number of associated genes (skew = 34.68), number of chromosome arms they are expressed on (skew = 2.58), and average Ka/Ks (skew = 2.85) (Supplementary Fig. S4). This suggests that the majority of functional annotations we analyzed are carried out by a limited number of genes, are expressed in specific locations, and are under purifying selection. Their relationships with one another, however, suggest all three trends are not typically present at the same time.
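As an illustration of what the reported skew values measure, the following sketch computes a moment-based skewness coefficient. Which estimator the authors used is not stated here, so the uncorrected formula below is an assumption, and the input lists are hypothetical.

```python
def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical counts of genes per GO annotation: most terms have few genes,
# a handful have very many, which produces a long right tail.
genes_per_term = [1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 8, 40, 250]
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(genes_per_term))  # positive: right-skewed
print(sample_skewness(symmetric))       # 0.0: symmetric
```

A large positive value, such as the one reported for the number of associated genes, indicates that a few annotations account for a disproportionate share of genes.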
A GO term’s average Ka/Ks value appears to be negatively associated with the number of chromosome arms it is expressed on and its number of related genes (Fig. 2). Due to the positively skewed nature of these distributions (Supplementary Fig. S4), a gamma model with a log link was used to determine if a function’s number of related genes or expressed chromosome arms is significantly associated with its average Ka/Ks. Increasing the number of genes or expressed chromosome arms related to a given function, however, increases the probability that at least one value is non-zero. Zero average Ka/Ks values in the context of this analysis, therefore, are not informative and were removed from the analysis, negating the need for a hurdle method. The two predictive variables were fit separately to determine their individual effects. The results of our analyses suggest a function’s average Ka/Ks decreases with the number of chromosome arms it is expressed on (p = , Supplementary Fig. S5); however, a function’s number of related genes was non-significant (p = 0.05, Supplementary Fig. S6). This suggests that genomic isolation is more strongly associated with relaxed purifying selection than a function’s number of related genes. The non-significance of a function’s number of related genes additionally suggests that low Ka/Ks values for annotations expressed on a large number of chromosome arms cannot be attributed to convergence to the genome-wide average alone.
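A gamma regression with a log link, as described above, can be sketched with iteratively reweighted least squares. This is a minimal single-predictor illustration with hypothetical data and an invented function name (`fit_gamma_log`); it is not the authors' code and it does not produce p-values.

```python
import math

def fit_gamma_log(x, y, iters=50):
    """Gamma GLM with log link, one predictor, fitted by IRLS.

    With a log link and gamma variance V(mu) = mu**2 the working weights are
    constant, so each iteration is an ordinary least-squares fit of the
    working response on x.
    """
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b0, b1 = math.log(sum(y) / n), 0.0  # start from the intercept-only fit
    for _ in range(iters):
        z = []
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            mu = math.exp(eta)
            z.append(eta + (yi - mu) / mu)  # working response
        zbar = sum(z) / n
        b1 = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z)) / sxx
        b0 = zbar - b1 * xbar
    return b0, b1

# Hypothetical data: chromosome arms per GO term and the term's average Ka/Ks.
arms = [1, 2, 3, 5, 8, 12, 20, 30, 41]
kaks = [0.40, 0.35, 0.30, 0.28, 0.22, 0.20, 0.17, 0.15, 0.14]

# Zero averages are dropped first, mirroring the filtering described above.
pairs = [(a, k) for a, k in zip(arms, kaks) if k > 0]
b0, b1 = fit_gamma_log([a for a, _ in pairs], [k for _, k in pairs])
print(math.exp(b1))  # multiplicative change in mean Ka/Ks per additional arm
```

Because of the log link, the slope is read on a multiplicative scale: a value of exp(b1) below 1 corresponds to average Ka/Ks shrinking as a function is spread over more chromosome arms.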
We expect that the majority of functions related to a large number of genes and expressed throughout the genome are higher-level ontology functions. Genes that are beneficial in increased dosage, however, are preferentially retained following duplication events73,74. Thus, large scale duplications may have resulted in the stability of higher-level functions, while relieving more redundant duplicates from adaptive conflict. We conclude that functional isolation is associated with relaxed purifying selection on the genes related to that function, potentially through relief from background selection acting on more highly conserved linked sites75. This finding parallels the accelerated evolution of Y-linked genes following recombination suppression and suggests isolation of functions may accelerate sequence divergence of their related genes through relaxation of purifying selection. These findings provide only a modest estimate of the extent to which protein functions may diverge in isolation when recombination is suppressed or following a WGD event when genetic redundancy is at its peak.
Potential retention of functionally related haplogroups
A GO annotation’s number of associated genes also appears to increase exponentially with the number of chromosome arms it is expressed on (Fig. 3). This relationship was fit using gamma regression and a log link, the results of which were highly significant (p < 0.0005, Supplementary Fig. S7). For a GO annotation’s number of related genes to increase in this manner, the genes on a given chromosome arm must be moderately functionally related. This suggests that the retention of genes following large-scale duplication events may operate at the haplogroup level, a trend that is predicted due to the dosage-sensitive nature of protein complexes. Chromosomes enriched with blocks of functionally related genes that are beneficial in increased dosage would show the highest levels of gene retention. Thus, the functional coherence of the Y chromosome could be attributed to a lower content of functionally related haplogroups that were beneficial in increased dosage on the ancestral autosomes.
Biased retention of orthologs on existing chromosomes
The ancestral vertebrate chromosomes displayed substantial differences in gene number, potentially as a result of more significant gene deletion and loss on chromosomes with a smaller number of resulting genes28. This has led to speculation of systematic biases in the deletion of duplicates on a subset of chromosomes following rediploidization, which may have resulted in the chromosome’s eventual loss28. We were interested in determining if this systematic bias could be attributed to the gene content of the pre-duplicated chromosomes from which they were derived. The results of our chromosome analysis show human chromosome arms have normally distributed numbers of orthologous genes (Shapiro-Wilk 0.97, p = 0.34) and average Ka/Ks values (Shapiro-Wilk 0.98, p = 0.72) (Supplementary Table S3 and Fig. S8), and that a chromosome arm’s number of genes and GO annotations are linearly related (p = , adjusted R2 = 0.985, Supplementary Figs. S9 and S10). Density of orthologs on existing chromosome arms, however, was found to be non-normally distributed (Shapiro-Wilk 0.912, p = 0.002). This is due to biased ancestral gene conservation on a subset of chromosomes. Despite differing average Ka/Ks values, selection at the chromosome arm level since the divergence of mammals does not appear to influence the number or density of orthologs on a given chromosome (Supplementary Fig. S11).
In contrast, we found that arms of chromosomes that have retained large clusters of genes resulting from the ancestral WGD events contain a disproportionate number of mammalian orthologs in our analysis. Approximately 35% of genes still exist in duplicate copies39, and several paralogs resulting from WGD events31,32 (also known as ohnologs) were retained as quadruplicates28. These include clusters containing the four HOX regions on chromosomes 2, 7, 12, and 17, as well as the MHC region on chromosome 6 containing ohnologs on chromosomes 1, 9, and 19 that are a result of single pre-duplicated regions28. The gene content of chromosomes 14 and 15 has also been shown to be almost entirely derived from individual pre-duplicated chromosomes28. The arms of these chromosomes show some of the higher levels of mammalian ortholog retention in our analysis, and chromosomes 17 and 19 have the highest ortholog densities in the genome (Table 1 and Supplementary Fig. S12).
Table 1.
Chromosome Arm | Number of Genes | Number of GO Terms | Density | Avg Ka/Ks |
---|---|---|---|---|
1p | 394 | 3057 | 3.19 | 0.14 |
1q | 354 | 3096 | 2.82 | 0.17 |
2q | 295 | 2710 | 1.99 | 0.14 |
11q | 290 | 2426 | 3.55 | 0.17 |
17q | 289 | 2630 | 4.97 | 0.15 |
5q | 272 | 2401 | 2.05 | 0.14 |
12q | 258 | 2369 | 2.64 | 0.12 |
15q | 231 | 2130 | 2.78 | 0.13 |
7q | 229 | 2352 | 2.31 | 0.16 |
10q | 228 | 2172 | 2.43 | 0.13 |
3q | 218 | 1950 | 2.03 | 0.14 |
6p | 200 | 1682 | 3.34 | 0.16 |
14q | 197 | 1881 | 2.19 | 0.12 |
3p | 197 | 2035 | 2.17 | 0.12 |
2p | 193 | 2015 | 2.06 | 0.14 |
6q | 192 | 1754 | 1.73 | 0.15 |
4q | 188 | 2099 | 1.34 | 0.14 |
9q | 178 | 1857 | 1.87 | 0.15 |
19q | 170 | 1782 | 5.24 | 0.15 |
8q | 168 | 1685 | 1.68 | 0.15 |
19p | 154 | 1362 | 5.88 | 0.14 |
11p | 136 | 1451 | 2.55 | 0.13 |
Xq | 128 | 1322 | 1.35 | 0.13 |
16p | 122 | 1029 | 3.32 | 0.14 |
13q | 121 | 1156 | 1.25 | 0.14 |
16q | 120 | 1296 | 2.24 | 0.13 |
17p | 115 | 1457 | 4.58 | 0.12 |
7p | 114 | 1213 | 1.90 | 0.12 |
22q | 112 | 1399 | 3.13 | 0.13 |
20q | 109 | 1296 | 3.00 | 0.12 |
12p | 99 | 1026 | 2.79 | 0.17 |
8p | 89 | 1085 | 1.97 | 0.15 |
18q | 89 | 1127 | 1.44 | 0.14 |
Xp | 80 | 894 | 1.31 | 0.13 |
4p | 74 | 878 | 1.48 | 0.13 |
10p | 67 | 790 | 1.68 | 0.11 |
5p | 64 | 787 | 1.31 | 0.15 |
9p | 62 | 865 | 1.44 | 0.19 |
20p | 58 | 838 | 2.06 | 0.15 |
21q | 49 | 542 | 1.41 | 0.14 |
18p | 29 | 455 | 1.57 | 0.12 |
Yq | 1 | 10 | 0.02 | 0.09 |
Yp | 1 | 7 | 0.10 | 0.16 |
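The linear relationship between an arm's number of genes and its number of GO terms can be checked directly against the rows of Table 1. The sketch below fits an ordinary least-squares line and computes adjusted R2. It is a rough illustration: the reported statistic may rest on different inputs or transformations, so the value printed here need not match it, and the helper name `ols_adjusted_r2` is invented for this example.

```python
# (arm, number of genes, number of GO terms), copied from Table 1.
TABLE1 = [
    ("1p", 394, 3057), ("1q", 354, 3096), ("2q", 295, 2710), ("11q", 290, 2426),
    ("17q", 289, 2630), ("5q", 272, 2401), ("12q", 258, 2369), ("15q", 231, 2130),
    ("7q", 229, 2352), ("10q", 228, 2172), ("3q", 218, 1950), ("6p", 200, 1682),
    ("14q", 197, 1881), ("3p", 197, 2035), ("2p", 193, 2015), ("6q", 192, 1754),
    ("4q", 188, 2099), ("9q", 178, 1857), ("19q", 170, 1782), ("8q", 168, 1685),
    ("19p", 154, 1362), ("11p", 136, 1451), ("Xq", 128, 1322), ("16p", 122, 1029),
    ("13q", 121, 1156), ("16q", 120, 1296), ("17p", 115, 1457), ("7p", 114, 1213),
    ("22q", 112, 1399), ("20q", 109, 1296), ("12p", 99, 1026), ("8p", 89, 1085),
    ("18q", 89, 1127), ("Xp", 80, 894), ("4p", 74, 878), ("10p", 67, 790),
    ("5p", 64, 787), ("9p", 62, 865), ("20p", 58, 838), ("21q", 49, 542),
    ("18p", 29, 455), ("Yq", 1, 10), ("Yp", 1, 7),
]

def ols_adjusted_r2(x, y):
    """Simple linear regression y = intercept + slope * x with adjusted R^2."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)  # one predictor
    return slope, intercept, adj_r2

genes = [row[1] for row in TABLE1]
go_terms = [row[2] for row in TABLE1]
slope, intercept, adj_r2 = ols_adjusted_r2(genes, go_terms)
print(sum(genes), slope, intercept, adj_r2)
```

The gene counts in the table sum to the 6,734 genes analyzed, which serves as a quick consistency check on the transcription.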
To determine the extent to which these chromosomes remain functionally related, aside from their conserved gene families, we performed hierarchical clustering of the chromosome arms based on a weighted frequency (see methods) for each GO annotation on a given arm in our dataset. The results of our dendrogram (Fig. 4) indicate several trends in the functional relationships between genes of chromosome arms. The two top-level clusters appear to be differentiated based on the number of related functions retained on the chromosome arms (Table 1), a result that was expected given clustering with Euclidean distance. We additionally found lower level clustering of chromosome arms that include both the ancestral HOX and MHC regions. These include the clustering of chromosome 2q, 12q, and 17q, as well as 6pq, 19pq, and 9q. This suggests that the ancestral WGD events have had a profound impact on the retention and organization of mammalian orthologs throughout the human genome.
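The remark that the top-level split was expected under Euclidean distance can be illustrated with a toy example: when profiles differ mainly in overall magnitude, Euclidean distance groups large profiles with large ones regardless of their shape. The arm names and frequency values below are hypothetical, and the weighting scheme described in the methods is not reproduced.

```python
import math

# Hypothetical weighted GO-term frequencies for four chromosome arms.
profiles = {
    "armA_large": [30, 28, 25, 31],
    "armB_large": [29, 30, 24, 27],
    "armC_small": [3, 2, 4, 3],
    "armD_small": [2, 3, 3, 4],
}

def euclidean(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_neighbor(name):
    """Return the other arm whose profile is closest to the named arm."""
    others = [other for other in profiles if other != name]
    return min(others, key=lambda other: euclidean(profiles[name], profiles[other]))

for arm in profiles:
    print(arm, "->", nearest_neighbor(arm))
```

In an agglomerative clustering, these nearest-neighbor pairs would merge first, so the final split separates arms with many retained annotations from arms with few.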
As stated earlier, specific classes of genes are preferentially retained following WGD events. Chromosomes that have maintained a large portion of their ancestral genes are therefore a result of the gene content and functional annotations of the pre-duplicated chromosomes from which they were derived. It has been hypothesized that the specialization of the Y chromosome is a result of the number of functional genes initially present on the ancestral autosomes57, a hypothesis supported by the low functional gene density of the X chromosome2,58. In our present analysis, we also found low ortholog density on both the X chromosome and chromosomes that are orthologous to the chicken Z chromosome (chromosomes 5, 9, and 1858). However, we did not find as large a disparity in ortholog density between the X chromosome and the human genome-wide average (2.33) relative to overall gene density comparisons58, suggesting that ancestral gene density may be less heavily influenced by the invasion of interspersed repeats. Biased gene deletion following the ancestral WGD events resulting in low gene density on a subset of chromosomes suggests that several chromosomes were pre-adapted to specialize in a manner similar to the sex chromosomes. Furthermore, chromosomal rearrangements would be under less negative selection in these regions.
Discussion
Since its discovery, the perceived functional importance of the Y chromosome has grown exponentially within the scientific community and now may provide further insight into chromosomal evolution following the ancestral WGD events at the origin of the vertebrate lineage. Our present analysis, in conjunction with existing literature, has shown that evolutionary trends believed to be unique to the Y chromosome are observed in the events following large-scale duplications and are still present in mammalian ortholog comparisons. These include higher levels of purifying selection on functionally diverse, ubiquitously expressed genes41–45, as well as reduced purifying selection on genomically isolated protein functions. The biased distribution of ancestral mammalian genes on chromosomes primarily derived from single pre-duplication chromosomes additionally suggests that gene retention was dependent on the gene content of the ancestral chromosome from which they were derived, and this retention may persist over long periods of evolutionary time. The conservation of the functionally coherent, potentially haplolethal X-degenerate sequences through purifying selection16, the rapid evolution of ampliconic sequences expressed primarily in the testes14, and the pseudogenization and loss of redundant sequences are consistent with a large-scale duplication event. This suggests that the Y chromosome may serve as a model for chromosome evolution following a large-scale duplication event.
Examining suppression of recombination on the Y chromosome in the light of large-scale duplication events has essential implications for karyotypic evolution at the onset of the vertebrate lineage. It has been proposed that the ancestral WGD events contributed to the proliferation of vertebrates during the Cambrian period due to the increase in genetic variation and tolerance to environmental conditions38,76. Recent research suggests that all extant vertebrate karyotypes are descendants of an ancestral marine chordate consisting of 17 chromosome pairs that underwent two successive WGDs28. Rapid loss through the fusion of seven chromosomes between duplications and the loss of an additional five chromosomes following the second duplication resulted in an ancestral Amniota karyotype of 49 chromosomes with highly differential gene content28. The smaller size of extant genomes, therefore, suggests a consistent pattern of karyotype reduction following the ancestral WGD events28, and speciation rates have been shown to be strongly correlated with chromosomal evolution rates77.
Duplication events should occur at a fitness cost, and an optimal gene copy number should exist73. Duplicate genes, therefore, would be subjected to three potential evolutionary fates: retention of genes that are beneficial in increased dosage, inactivation of genes that are harmful in increased dosage, and a period of neutral evolution of redundant sequences. Consequently, the observed gene retention on a given chromosome could be attributed to its density of genes that were beneficial in an increased dosage and subsequently retained through purifying selection. Loss of chromosomes due to widespread gene inactivation of detrimental duplicates, however, should have occurred early and ubiquitously, contributing little to the evolutionary novelties and speciation observed at the time. Considering Y chromosome evolution as a potential model for other chromosomes following a large-scale duplication event and high selective pressure would suggest an alternative hypothesis: chromosomal rearrangements may have protected large regions of the genome from gene flow, which allowed isolated genes to diverge until complete reproductive barriers were formed78.
The specialization of SRY as the sex-determining factor appears to have played a significant role in X-Y divergence, as its emergence is correlated with the first stratification event that reduced recombination between the neo-sex chromosomes79. Single-gene sex determination alone should not select for recombination suppression80. However, the presence of gonadal dysgenesis in XY individuals with an SRY deletion81 and sterility of XX individuals containing an inactivated copy of SRY82–87 suggests that multiple genes are required to produce fertile offspring. The accumulation of sexually antagonistic alleles in a non-recombining portion of the genome could have provided a sufficient selective advantage that outweighed the deleterious effects of reduced recombination due to their synergistic effects on fertility. Despite Muller’s ratchet being implicated in the early stages of Y chromosome degeneration88, genetic decay due to strong positive selection resulting in hitchhiking events is believed to be responsible for its extensive divergence and continued degeneration88–91. This is supported by the stepwise repression of recombination1 and the correlation of Y-degeneration with levels of female promiscuity in related species comparisons3,14,15,92. This suggests that strong positive selection on mutually beneficial alleles at linked sites may drive recombination suppression to become selectively favored.
Ohno’s original model of genetic evolution suggesting that newly duplicated genes would be functionally redundant and able to escape purifying selection93 has mixed empirical evidence54,73. The events of large-scale duplications, however, create an environment in which newly duplicated genes or complexes that are beneficial in increased dosage may be retained through purifying selection, while the remainder of duplicates would show a continuum of redundancy. The scale of such duplications would allow a small subset of genes to achieve a beneficial mutation. If one mutation resulted in a novel function or further specialized a gene towards one of its respective functions, the likelihood of the gene being retained would increase46. Our analysis has also shown that increasing its functional specificity may relax purifying selection, resulting in further divergence. In the event that this new, beneficial mutation occurred on a highly redundant chromosome, the additional reduction of purifying selection due to isolation of a function in an environment with little to no background selection may selectively specialize the chromosome. The survival of the remaining neutrally evolving sequences on that chromosome would depend on their acquisition of functionally related beneficial mutations. If the chromosome bearing this specialized function captured a pair of alleles that together significantly increased the organism’s fitness, selection for recombination suppression may result in an inversion becoming prevalent in the population. As observed on the Y chromosome, the resulting chromosome would now contain a complex of genes maintained through purifying selection, as well as a subset of specialized genes that are rapidly evolving resulting in a period of extensive divergence from its homologous counterpart.
The probability that a new inversion captures an advantageous haplotype can be high72; however, for an inversion to become fixed when sexually antagonistic alleles are not present the selective advantage would have to strongly outweigh the negative fitness consequences of reduced recombination94. In 1973, Leigh Van Valen showed that the probability of extinction of a population was constant over time and suggested an evolutionary arms race where survival is dependent on a population’s ability to adapt to changing selective pressures95. During times of intense selective pressure, selection for rapid fixation of a highly advantageous haplotype may have driven recombination suppression to become selectively favored due to the reduced effective population size and increased fixation rate. Recombination suppression events, such as inversions, in the absence of sexual antagonism, would have markedly different evolutionary consequences. This is due in part to inversions only reducing recombination in heterozygotes7. If the inversion is driven to fixation, recombination would resume between the new homologous chromosomes. In isolated populations, this divergence from the ancestral chromosome may have been sufficient to create a reproductive barrier, such as in the divergence of ancestral Equus populations96. As evidenced by the Y chromosome, recombination suppression can also occur progressively80 and may be related to continued selection for newly introduced, functionally related, beneficial alleles. Subsequent inversions resulting from extended periods of intense selective pressure on the associated functions would continue to drive the degeneration of the chromosome through successive hitchhiking events, increasing its long-term fragility.
For the Y chromosome, or any significantly degraded chromosome, to go extinct, its functions would need to be replaced elsewhere or no longer be under selective constraint to prevent fitness consequences24. Relaxation of selection at the locally adapted sites (e.g., predator/prey adaptation), however, would render the genes functionally inert, and selection for recombination resumption between ubiquitously expressed genes would result in fusion events becoming selectively favored. Thus, prolonged strong selection for specialization would have driven a subset of ancestral chromosomes to extinction. As this pertains to the fate of the human Y chromosome, continued selection for localization of sexually antagonistic traits, reduction of female promiscuity resulting in less intensive sexual selection, and its recent stability may suggest it is here to stay. Its continued survival in species still experiencing strong sexual selection, however, may be suspect.
An apparent contradiction in this logic is the ZW sex chromosome system of avian lineages, in which females are the heterogametic sex. Similar to the evolution of the Y chromosome, suppression of recombination in the W chromosome has resulted in significant degeneration of its ancestral gene content97. The genes that remain functionally active have been shown to be ubiquitously expressed and are believed to be essential to both sexes97. In contrast to the Y chromosome, the W chromosome lacks genes coding for female-advantage traits97, suggesting that selection for specialization has not resulted in its degeneration. To reduce the recombination between mutually beneficial, sexually antagonistic alleles, however, an inversion can occur in either chromosome80. DMRT1 has also been implicated as the sex-determining locus in the ZW system and is present only on the Z chromosome98,99, suggesting testis development may function through a dosage-dependent mechanism. In conjunction with a lack of dosage compensation observed in the ZW system100, this suggests that the degeneration of the W chromosome is still a result of male-driven positive selection, only on the opposite chromosome.
The chicken Z has been found to be orthologous to portions of human chromosomes 5, 9, and 18, while the human X is orthologous to chicken chromosomes 1 and 458,97. Due to a lack of structural similarity with their respective orthologous regions, researchers have suggested the chicken Z and human X chromosome were not predisposed to become sex chromosomes and their low gene density is a result of convergent evolution58. Chromosomes 5, 9, and 18, however, show similar levels of ortholog density as the X chromosome in our analysis, as well as some of the most substantial differences in mammalian ortholog content between the arms of individual chromosomes. This phenomenon is most likely a result of the arms being derived from different ancestral chromosomes that underwent fusion events28. Chromosome 9p also contains the ortholog of the putative avian sex-determining locus DMRT1 and is functionally related to the short arm of the X chromosome in our analysis. This may indicate that the convergent evolution of the sex chromosomes is a result of the differential fusion of ancestral chromosomes containing sex-related genes, and that the process of sex chromosome evolution further lowered overall gene density to its current state. However, further research needs to be conducted to determine the significance of this relationship.
The results of our analysis in the context of existing literature present a model by which chromosomes, and therefore populations, rapidly evolved at the onset of the vertebrate lineage. The large-scale duplication events allowed a subset of genes to subfunctionalize, thereby reducing pleiotropic constraints and accelerating evolutionary rates. The isolation of these genes on redundant chromosomes further relieved purifying selection, resulting in a period of rapid chromosomal evolution and divergence due to specialization. If this divergence alone did not create a reproductive barrier, the chromosome’s eventual loss due to a change in adaptive pressures would have resulted in differential karyotypes of isolated populations. Thus, the extinction of chromosomes due to specialization is not unique to the Y chromosome, or sex chromosomes in general.
Methods
In the present analysis, we examined the divergence of human genes from their mammalian orthologs with respect to their GO annotations and chromosomal locations. For mammalian genes, the probability that a newly arisen nonsynonymous mutation is fixed, relative to what is expected under neutrality, is determined by the strength of selection101. The ratio of the nonsynonymous to the synonymous mutation rate is commonly utilized as a measurement of this selective strength, with low values suggesting strong purifying selection and high values indicating relaxed purifying selection and/or positive selection101. By definition, human orthologs are identical by descent102 with at least one other species in our analysis. The use of mammalian ortholog comparisons in conjunction with GO annotations, therefore, allowed us to analyze how protein functions influence the patterns of selection that led to the differential divergence of ancestral mammalian genes.
Data collection
Divergence data were collected from The Searchable Prototype Experimental Evolutionary Database (SPEED)103. SPEED contains orthologous sequence comparisons of nine species including human (Homo sapiens), chimp (Pan troglodytes), rhesus macaque (Macaca mulatta), mouse (Mus musculus), rat (Rattus norvegicus), dog (Canis familiaris), cow (Bos taurus), opossum (Monodelphis domestica), and chicken (Gallus gallus) as a true outgroup. Methodology on the identification of orthologous groups and calculation of divergence data can be found elsewhere103.
Orthologous sequence pairs were queried from SPEED for genetic summary information and their related divergence values. Data cleaning was performed using PySpark (Spark version 2.3.1, Python version 2.7.10). Sequence comparisons containing a value of zero or less due to computational error were removed from the analysis. Any sequence comparison set with unusually low synonymous rate values was removed, as such values gave spuriously high ratios. Where multiple comparisons existed, divergence data inconsistencies were resolved by computing a zero-corrected harmonic mean; therefore, greater weight was given to conservative estimates104, and comparisons containing at least one zero value were assigned a value of zero. Lastly, sequence comparisons that did not include a human comparison with an associated gene name and chromosomal location were excluded. Our resulting dataset included a total of 68,006 comparisons across 10,849 genes.
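The zero-corrected harmonic mean described above can be sketched as follows. This is a minimal illustration rather than the cleaning code used in the analysis; the function name and the example estimates are hypothetical, and the only behaviour taken from the text is that any zero estimate forces the combined value to zero.

```python
def zero_corrected_harmonic_mean(values):
    """Combine several divergence estimates for one comparison.

    Returns 0.0 if any estimate is zero (the stated zero correction);
    otherwise returns the ordinary harmonic mean, which is pulled
    towards the smaller (more conservative) estimates.
    """
    values = [float(v) for v in values]
    if not values:
        raise ValueError("at least one estimate is required")
    if any(v < 0 for v in values):
        raise ValueError("estimates must be non-negative")
    if any(v == 0 for v in values):
        return 0.0
    return len(values) / sum(1.0 / v for v in values)


# Hypothetical pair of inconsistent estimates for the same comparison.
estimates = [0.10, 0.40]
combined = zero_corrected_harmonic_mean(estimates)
arithmetic = sum(estimates) / len(estimates)
print(combined, arithmetic)
```

For these two estimates the harmonic mean is about 0.16, below the arithmetic mean of 0.25, which illustrates why the smaller estimate carries more weight.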
Gene ontology information was collected from the European Bioinformatics Institute105. The most recent version of human gene ontology annotations (9/19/19) was downloaded and joined to the respective genes. The dataset included 19,395 genes and 18,211 GO terms. The validity of GO terms with IEA (Inferred from Electronic Annotation) evidence codes has been questioned due to their inferential nature106. The quality of IEA terms, however, has significantly improved and now rivals that of terms inferred by curators107. To alleviate potentially biased numbers of GO annotations on well-studied genes, IEA terms were also retained. After joining with the ortholog dataset and removing genes that lacked annotation, our final dataset included 6,734 annotated genes across 14,121 GO terms. IEA, IDA (Inferred from Direct Assay), ISS (Inferred from Sequence or Structural Similarity), IBA (Inferred from Biological aspect of Ancestor), IMP (Inferred from Mutant Phenotype) and TAS (Traceable Author Statement) evidence codes were the primary methods of annotation in our dataset at 30.2%, 20.96%, 13.2%, 12.5%, 10.3%, and 7.3%, respectively.
Data preparation
Single-value human gene rates were derived by averaging their respective values across all species comparisons present in the dataset. GO annotation values were obtained by averaging the values of all related genes across all species comparisons. Chromosome arm values were calculated by averaging the values of all genes present on the respective chromosome arm across all species comparisons. Ortholog density was calculated by dividing the number of orthologs present on a given chromosome arm by arm size in Mb. Lastly, the chromosome arms and their related GO annotations were cross-tabulated to obtain the number of times a given function occurs on each arm. Due to their hierarchical nature, GO terms can be broad106. This issue was addressed in a context-dependent manner for each analysis. Prior to clustering the chromosome arms based on functional relatedness of their genes, the number of times each GO annotation occurred on each chromosome arm was weighted using an algorithm adapted from Martinez and Reyes-Valdés108. We considered the average frequency of the GO term among j chromosome arms as,
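$$p_i = \frac{1}{t}\sum_{j=1}^{t} p_{ij} \tag{1}$$
where $p_{ij}$ is the frequency of GO term $i$ on chromosome arm $j$ and $t$ is the number of chromosome arms, and defined GO term specificity as the information that its expression provides about the identity of the chromosome arm as
$$S_i = \frac{1}{t}\sum_{j=1}^{t} \frac{p_{ij}}{p_i}\log_2\frac{p_{ij}}{p_i} \tag{2}$$
$S_i$ will give zero if the GO term is expressed equally on all chromosome arms and a maximum of $\log_2(t)$ if the function is exclusively expressed on a single chromosome arm. We then assigned a weighted frequency for each GO term on each chromosome arm as the product of the GO term specificity and its frequency on a given chromosome arm.
$$w_{ij} = S_i\, p_{ij} \tag{3}$$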
Thus, a higher degree of functional similarity would be found between chromosome arms if their shared functions were absent elsewhere in the genome. This method was also applied to the relationship between genes and their related GO terms. Weighted GO term counts were derived for each gene by summing the specificities of their related GO terms in equation (3). However, the weighted GO counts did not alter the distributions or significances of our ortholog regression analyses. Therefore, raw counts were used for ease of interpretability.
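The weighting scheme can be illustrated with a short sketch. It assumes that the specificity takes the information-theoretic form of Martinez and Reyes-Valdés108 and that the frequency of a GO term on an arm is its count divided by the total annotation count on that arm; both are assumptions made for illustration, and the term labels and counts are hypothetical.

```python
import math


def relative_frequencies(counts):
    """Convert {term: [count per arm]} into per-arm relative frequencies."""
    terms = list(counts)
    t = len(counts[terms[0]])
    arm_totals = [sum(counts[term][j] for term in terms) for j in range(t)]
    return {
        term: [counts[term][j] / arm_totals[j] if arm_totals[j] else 0.0
               for j in range(t)]
        for term in terms
    }


def specificity(p_row):
    """Information a term provides about arm identity (0 up to log2(t))."""
    t = len(p_row)
    mean_p = sum(p_row) / t
    if mean_p == 0:
        return 0.0
    total = 0.0
    for p in p_row:
        if p > 0:  # the limit of x*log2(x) as x -> 0 is taken as 0
            ratio = p / mean_p
            total += ratio * math.log2(ratio)
    return total / t


def weighted_frequencies(counts):
    """Return (specificity per term, specificity-weighted frequency per arm)."""
    freqs = relative_frequencies(counts)
    spec = {term: specificity(row) for term, row in freqs.items()}
    weighted = {term: [spec[term] * p for p in row]
                for term, row in freqs.items()}
    return spec, weighted


# Hypothetical counts on three chromosome arms: term "A" is spread evenly,
# terms "B", "C" and "D" each occur on a single arm only.
counts = {"A": [5, 5, 5], "B": [5, 0, 0], "C": [0, 5, 0], "D": [0, 0, 5]}
spec, weighted = weighted_frequencies(counts)
print(spec["A"], spec["B"], weighted["B"])
```

In this toy example the evenly distributed term receives a specificity of zero and therefore contributes nothing to the distance between arms, while each arm-exclusive term reaches the maximum of log2(3).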
A primary goal of our GO annotation regression analyses was to determine if genomic representation influences the selective pressures exerted on a function. Therefore, all ontology terms were retained in this analysis in order to examine our hypothesis that large-scale duplication events may relieve pleiotropic constraints in a subset of genes through increased dosage of essential functions. However, where multiple GO terms contained the same set of related genes, only one term was retained to remove redundant data points. Of the original 14,121 terms, 11,020 were found to have unique sets of related genes. Of these, 11,016 were found to have non-zero values and were used in regression analyses.
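The removal of GO terms with identical gene sets can be sketched as below. The text does not state which of several redundant terms was kept, so retaining the first term in sorted order is an assumption; the term and gene labels are hypothetical placeholders.

```python
def unique_gene_set_terms(term_to_genes):
    """Keep one GO term per distinct set of related genes.

    Terms are visited in sorted order and the first term seen for each
    gene set is retained (an assumed tie-breaking rule).
    """
    kept = {}
    for term in sorted(term_to_genes):
        gene_set = frozenset(term_to_genes[term])
        kept.setdefault(gene_set, term)
    return sorted(kept.values())


# Hypothetical annotation table: term_a and term_b share the same genes.
term_to_genes = {
    "term_a": ["gene1", "gene2"],
    "term_b": ["gene2", "gene1"],
    "term_c": ["gene1", "gene2", "gene3"],
}
retained = unique_gene_set_terms(term_to_genes)
print(retained)
```

Using a frozenset as the key makes the comparison independent of gene order, so two terms are treated as redundant only when their gene sets match exactly.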
Statistical analysis
All regression and distribution analyses were performed in Python (see version above) using the statsmodels API. Due to the strong positive skew of several variables in our dataset, generalized linear models (GLM) were used where appropriate for hypothesis testing. Fitting positively skewed continuous data with a gamma distribution has been shown to perform comparably to or outperform lognormal transformations without the need for manual manipulation of the variables109 and provides a more flexible model when assumptions of ordinary least squares are violated. A log link was used to maintain a non-linear fit with interpretable coefficients while respecting the domain of the gamma distribution. The exponentials of the coefficients for the intercept and predictor variable, therefore, represent the initial predicted outcome value and the rate of change for a one-unit increase in the predictor variable, respectively. This methodology was applied to the relationship between an ortholog’s number of GO terms and its nonsynonymous to synonymous ratio, a GO term’s number of expressed chromosome arms and its ratio, a GO term’s number of related genes and its ratio, and a GO term’s number of expressed chromosome arms and number of related genes. Where statistically meaningful zero values were present, a hurdle method was employed to counteract the calculation error introduced. This entails fitting a gamma distribution to all non-zero data, as well as a binomial distribution to the full dataset to determine the influence of the predictor variable on the probability that the dependent variable is zero110. This was applied to the relationship between an ortholog’s ratio and number of GO terms. The linear relationship between a chromosome arm’s number of related genes and GO annotations was fit with ordinary least squares regression without an intercept, as an intercept was nonsensical in the given context.
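How the coefficients of a gamma GLM with a log link are read can be shown with a small from-scratch sketch. It does not reproduce the statsmodels code used in the analysis; it fits a single-predictor model by iteratively reweighted least squares, for which the working weights are constant under a gamma family with a log link. The function names and the noise-free synthetic data are hypothetical and were chosen so that the recovered coefficients can be checked.

```python
import math


def ols_line(x, z):
    """Intercept and slope of an ordinary least squares line through (x, z)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor must vary")
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    return mean_z - slope * mean_x, slope


def fit_gamma_log_glm(x, y, tol=1e-10, max_iter=100):
    """Fit log(E[y]) = b0 + b1 * x for a gamma response by IRLS.

    With a gamma variance function and log link the IRLS weights equal 1,
    so each step is an unweighted regression of the working response
    z = eta + (y - mu) / mu on x.
    """
    if any(v <= 0 for v in y):
        raise ValueError("gamma responses must be strictly positive")
    # Standard starting point: mu = y, so the first working response is log(y).
    b0, b1 = ols_line(x, [math.log(v) for v in y])
    for _ in range(max_iter):
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        new_b0, new_b1 = ols_line(x, z)
        converged = abs(new_b0 - b0) < tol and abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


# Synthetic, noise-free example: the outcome starts at 0.5 when x = 0 and
# is multiplied by exp(-0.1) for every one-unit increase in x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.5 * math.exp(-0.1 * xi) for xi in x]
b0, b1 = fit_gamma_log_glm(x, y)
baseline = math.exp(b0)      # predicted outcome at x = 0
rate_ratio = math.exp(b1)    # multiplicative change per unit of x
print(baseline, rate_ratio)
```

Exponentiating the intercept gives the predicted outcome when the predictor is zero, and exponentiating the slope gives the factor by which the prediction changes per unit of the predictor, which is the interpretation described above.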
Normality of distributions was determined using the Shapiro-Wilk test which tests the null hypothesis that the data are normally distributed.
Hierarchical clustering of the chromosome arms based on GO annotation content was performed using the cluster package in R (Version 3.5.3). GO term counts were not scaled before distance calculation due to the homogeneous nature of the variables. The distance was calculated using Euclidean distance. The linkage measure was determined by obtaining the agglomerative coefficient (amount of clustering structure found) for single, complete, and average linkage and Ward’s method using the agnes() function. For our dataset, Ward’s method resulted in the highest agglomerative coefficient and was subsequently used in our clustering analysis. Therefore, multi-node clusters were joined based on minimum increase in within-group variance.
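The merging rule of Ward’s method can be illustrated with a brief sketch. This is not the agnes() implementation; it only computes, for hypothetical two-dimensional points standing in for chromosome arm profiles, the increase in within-group sum of squares caused by joining two clusters and selects the pair with the smallest increase.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def within_ss(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    c = centroid(points)
    return sum(sum((p[k] - c[k]) ** 2 for k in range(len(c))) for p in points)


def ward_cost(a, b):
    """Increase in within-group sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((u - v) ** 2 for u, v in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2


def best_merge(clusters):
    """Return (cost, i, j) for the pair of clusters that is cheapest to join."""
    pairs = [
        (ward_cost(clusters[i], clusters[j]), i, j)
        for i in range(len(clusters))
        for j in range(i + 1, len(clusters))
    ]
    return min(pairs)


# Hypothetical clusters of points.
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(5.0, 0.0)], [(5.0, 1.0)]]
cost, i, j = best_merge(clusters)
print(cost, i, j)
```

The closed-form cost used in ward_cost equals the difference between the within-group sum of squares of the merged cluster and that of the two separate clusters, so choosing the minimum cost corresponds to the minimum increase in within-group variance.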
Acknowledgements
The authors would like to thank Christine Malcom and Neil Miller for help in preparing and reviewing this manuscript. This work was supported in part by Bionexus KC.
Author contributions
J.W., J.M.S. and G.J.W. planned the project. J.W. and J.M.S. performed data collection. J.W. performed data and statistical analysis. J.W. generated all figures. J.W. and G.J.W. wrote the paper. All authors reviewed the manuscript.
Data availability
Requests for compiled data materials pertaining to this manuscript should be addressed to J.W. Inquiries pertaining to the use of raw data from the SPEED database should be submitted to J.M.S. (jstaley2@ksu.edu).
Code availability
Requests for code should be addressed to J.W.
Competing interests
The authors declare no competing financial interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information is available for this paper at 10.1038/s41598-020-58997-2.
References
- 1. Lahn BT, Page DC. Four evolutionary strata on the human X chromosome. Science. 1999;286:964–967. doi: 10.1126/science.286.5441.964.
- 2. Ross MT, et al. The DNA sequence of the human X chromosome. Nature. 2005;434:325–337. doi: 10.1038/nature03440.
- 3. Hughes JF, et al. Strict evolutionary conservation followed rapid gene loss on human and rhesus Y chromosomes. Nature. 2012;483:82–86. doi: 10.1038/nature10843.
- 4. Skaletsky H, et al. The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature. 2003;423:825–837. doi: 10.1038/nature01722.
- 5. Charlesworth B, Charlesworth D. The degeneration of Y chromosomes. Philosophical Transactions of the Royal Society B: Biological Sciences. 2000;355:1563–1572. doi: 10.1098/rstb.2000.0717.
- 6. Lemaitre C, et al. Footprints of inversions at present and past pseudoautosomal boundaries in human sex chromosomes. Genome Biology and Evolution. 2009;1:56–66. doi: 10.1093/gbe/evp006.
- 7. Kirkpatrick M. How and why chromosome inversions evolve. PLoS Biology. 2010;8.
- 8. Graves JAM. Sex chromosome specialization and degeneration in mammals. Cell. 2006;124:901–914. doi: 10.1016/j.cell.2006.02.024.
- 9. Aitken RJ, Graves JAM. Human spermatozoa: The future of sex. Nature. 2002;415:963. doi: 10.1038/415963a.
- 10. Graves JA. The rise and fall of SRY. Trends in Genetics. 2002;18:259–264. doi: 10.1016/S0168-9525(02)02666-5.
- 11. Arakawa Y, Nishida-Umehara C, Matsuda Y, Sutou S, Suzuki H. X-chromosomal localization of mammalian Y-linked genes in two XO species of the Ryukyu spiny rat. Cytogenetic and Genome Research. 2002;99:303–309. doi: 10.1159/000071608.
- 12. Just W, et al. Absence of Sry in species of the vole Ellobius. Nature Genetics. 1995;11:117. doi: 10.1038/ng1095-117.
- 13. Kuroiwa A, Ishiguchi Y, Yamada F, Shintaro A, Matsuda Y. The process of a Y-loss event in an XO/XO mammal, the Ryukyu spiny rat. Chromosoma. 2010;119:519–526. doi: 10.1007/s00412-010-0275-8.
- 14. Hughes JF, et al. Chimpanzee and human Y chromosomes are remarkably divergent in structure and gene content. Nature. 2010;463:536–539. doi: 10.1038/nature08700.
- 15. Hughes JF, et al. Conservation of Y-linked genes during human evolution revealed by comparative sequencing in chimpanzee. Nature. 2005;437:100. doi: 10.1038/nature04101.
- 16. Rozen S, Marszalek JD, Alagappan RK, Skaletsky H, Page DC. Remarkably little variation in proteins encoded by the Y chromosome’s single-copy genes, implying effective purifying selection. American Journal of Human Genetics. 2009;85:923–928. doi: 10.1016/j.ajhg.2009.11.011.
- 17. Bellott DW, et al. Mammalian Y chromosomes retain widely expressed dosage-sensitive regulators. Nature. 2014;508:494–499. doi: 10.1038/nature13206.
- 18. Rozen S, et al. Abundant gene conversion between arms of palindromes in human and ape Y chromosomes. Nature. 2003;423:873–876. doi: 10.1038/nature01723.
- 19. Teitz LS, Pyntikova T, Skaletsky H, Page DC. Selection has countered high mutability to preserve the ancestral copy number of Y chromosome amplicons in diverse human lineages. American Journal of Human Genetics. 2018;103:261–275. doi: 10.1016/j.ajhg.2018.07.007.
- 20. Marais GAB, Campos PRA, Gordo I. Can intra-Y gene conversion oppose the degeneration of the human Y chromosome? A simulation study. Genome Biology and Evolution. 2010;2:347–357. doi: 10.1093/gbe/evq026.
- 21. Connallon T, Clark AG. Gene duplication, gene conversion and the evolution of the Y chromosome. Genetics. 2010;186:277–286. doi: 10.1534/genetics.110.116756.
- 22. Bachtrog D. A dynamic view of sex chromosome evolution. Current Opinion in Genetics & Development. 2006;16:578–585. doi: 10.1016/j.gde.2006.10.007.
- 23. Kaiser VB, Zhou Q, Bachtrog D. Nonrandom gene loss from the Drosophila miranda neo-Y chromosome. Genome Biology and Evolution. 2011;3:1329–1337. doi: 10.1093/gbe/evr103.
- 24. Bachtrog D. Y chromosome evolution: emerging insights into processes of Y chromosome degeneration. Nature Reviews Genetics. 2013;14:113–124. doi: 10.1038/nrg3366.
- 25. Lahn BT, Page DC. Functional coherence of the human Y chromosome. Science. 1997;278:675–680. doi: 10.1126/science.278.5338.675.
- 26. Page DC, Harper ME, Love J, Botstein D. Occurrence of a transposition from the X-chromosome long arm to the Y-chromosome short arm during human evolution. Nature. 1984;311:119–123. doi: 10.1038/311119a0.
- 27. Connallon T, Débarre F, Li X-Y. Linking local adaptation with the evolution of sex differences. Philosophical Transactions of the Royal Society B: Biological Sciences. 2018;373.
- 28. Sacerdot C, Louis A, Bon C, Berthelot C, Roest Crollius H. Chromosome evolution at the origin of the ancestral vertebrate genome. Genome Biology. 2018;19:166. doi: 10.1186/s13059-018-1559-1.
- 29. Dehal P, Boore JL. Two rounds of whole genome duplication in the ancestral vertebrate. PLoS Biology. 2005;3.
- 30. Putnam NH, et al. The amphioxus genome and the evolution of the chordate karyotype. Nature. 2008;453:1064–1071. doi: 10.1038/nature06967.
- 31. Nakatani Y, Takeda H, Kohara Y, Morishita S. Reconstruction of the vertebrate ancestral genome reveals dynamic genome reorganization in early vertebrates. Genome Research. 2007;17:1254–1265. doi: 10.1101/gr.6316407.
- 32. Wolfe KH. Yesterday’s polyploids and the mystery of diploidization. Nature Reviews Genetics. 2001;2:333. doi: 10.1038/35072009.
- 33. Blomme T, et al. The gain and loss of genes during 600 million years of vertebrate evolution. Genome Biology. 2006;7:R43. doi: 10.1186/gb-2006-7-5-r43.
- 34. Innan H, Kondrashov F. The evolution of gene duplications: classifying and distinguishing between models. Nature Reviews Genetics. 2010;11:97–108. doi: 10.1038/nrg2689.
- 35. Lynch M, Conery JS. The evolutionary fate and consequences of duplicate genes. Science. 2000;290:1151–1155. doi: 10.1126/science.290.5494.1151.
- 36. Gout J-F, Kahn D, Duret L, Paramecium Post-Genomics Consortium. The relationship among gene expression, the evolution of gene dosage, and the rate of protein evolution. PLOS Genetics. 2010;6:e1000944. doi: 10.1371/journal.pgen.1000944.
- 37. Birchler JA, Veitia RA. The gene balance hypothesis: From classical genetics to modern genomics. The Plant Cell. 2007;19:395–402. doi: 10.1105/tpc.106.049338.
- 38. Van de Peer Y, Maere S, Meyer A. The evolutionary significance of ancient genome duplications. Nature Reviews Genetics. 2009;10:725–732. doi: 10.1038/nrg2600.
- 39. Singh PP, Arora J, Isambert H. Identification of ohnolog genes originating from whole genome duplication in early vertebrates, based on synteny comparison across multiple genomes. PLOS Computational Biology. 2015;11:e1004394. doi: 10.1371/journal.pcbi.1004394.
- 40. Makino T, McLysaght A. Positionally biased gene loss after whole genome duplication: Evidence from human, yeast, and plant. Genome Research. 2012;22:2427–2435. doi: 10.1101/gr.131953.111.
- 41. Tu Z, et al. Further understanding human disease genes by comparing with housekeeping genes and other genes. BMC Genomics. 2006;7:31. doi: 10.1186/1471-2164-7-31.
- 42. Zhang L, Li W-H. Mammalian housekeeping genes evolve more slowly than tissue-specific genes. Molecular Biology and Evolution. 2004;21:236–239. doi: 10.1093/molbev/msh010.
- 43. Yang J, Gu Z, Li W-H. Rate of protein evolution versus fitness effect of gene deletion. Molecular Biology and Evolution. 2003;20:772–774. doi: 10.1093/molbev/msg078.
- 44. Duret L, Mouchiroud D. Determinants of substitution rates in mammalian genes: Expression pattern affects selection intensity but not mutation rate. Molecular Biology and Evolution. 2000;17:68–74. doi: 10.1093/oxfordjournals.molbev.a026239.
- 45. Torgerson DG, Singh RS. Rapid evolution through gene duplication and subfunctionalization of the testes-specific proteasome subunits in Drosophila. Genetics. 2004;168:1421–1432. doi: 10.1534/genetics.104.027631.
- 46. Lynch M, Force A. The probability of duplicate gene preservation by subfunctionalization. Genetics. 2000;154:459–473. doi: 10.1093/genetics/154.1.459.
- 47. Wagner A. The fate of duplicated genes: loss or new function? BioEssays. 1998;20:785–788. doi: 10.1002/(SICI)1521-1878(199810)20:10<785::AID-BIES2>3.0.CO;2-M.
- 48. Dermitzakis ET, Clark AG. Differential selection after duplication in mammalian developmental genes. Molecular Biology and Evolution. 2001;18:557–562. doi: 10.1093/oxfordjournals.molbev.a003835.
- 49. Li W-H, Yang J, Gu X. Expression divergence between duplicate genes. Trends in Genetics. 2005;21:602–607. doi: 10.1016/j.tig.2005.08.006.
- 50. Conant GC, Wolfe KH. Turning a hobby into a job: How duplicated genes find new functions. Nature Reviews Genetics. 2008;9:938–950. doi: 10.1038/nrg2482.
- 51. Rastogi S, Liberles DA. Subfunctionalization of duplicated genes as a transition state to neofunctionalization. BMC Evolutionary Biology. 2005;5:28. doi: 10.1186/1471-2148-5-28.
- 52. He X, Zhang J. Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution. Genetics. 2005;169:1157–1164. doi: 10.1534/genetics.104.037051.
- 53. Wilson MA, Makova KD. Evolution and survival on eutherian sex chromosomes. PLoS Genetics. 2009;5.
- 54. Zhang P, Gu Z, Li W-H. Different evolutionary patterns between young duplicate genes in the human genome. Genome Biology. 2003;4:R56. doi: 10.1186/gb-2003-4-9-r56.
- 55. Cortez D, et al. Origins and functional evolution of Y chromosomes across mammals. Nature. 2014;508:488–493. doi: 10.1038/nature13151.
- 56. Mueller JL, et al. Independent specialization of the human and mouse X chromosomes for the male germline. Nature Genetics. 2013;45:1083–1087. doi: 10.1038/ng.2705.
- 57. Saifi GM, Chandra HS. An apparent excess of sex- and reproduction-related genes on the human X chromosome. Proceedings of the Royal Society B: Biological Sciences. 1999;266:203–209. doi: 10.1098/rspb.1999.0623.
- 58. Bellott DW, et al. Convergent evolution of chicken Z and human X chromosomes by expansion and gene acquisition. Nature. 2010;466:612–616. doi: 10.1038/nature09172.
- 59. Blackmon H, Brandvain Y. Long-term fragility of Y chromosomes is dominated by short-term resolution of sexual antagonism. Genetics. 2017;207:1621–1629. doi: 10.1534/genetics.117.300382.
- 60. Rice WR. The accumulation of sexually antagonistic genes as a selective agent promoting the evolution of reduced recombination between primitive sex chromosomes. Evolution. 1987;41:911–914. doi: 10.1111/j.1558-5646.1987.tb05864.x.
- 61.Rice WR. Evolution of the y sex chromosome in AnimalsY chromosomes evolve through the degeneration of autosomes. BioScience. 1996;46:331–343. doi: 10.2307/1312947. [DOI] [Google Scholar]
- 62.Charlesworth D, Charlesworth B. Sex differences in fitness and selection for centric fusions between sex-chromosomes and autosomes. Genetical Research. 1980;35:205–214. doi: 10.1017/S0016672300014051. [DOI] [PubMed] [Google Scholar]
- 63.van Doorn GS, Kirkpatrick M. Turnover of sex chromosomes induced by sexual conflict. Nature. 2007;449:909–912. doi: 10.1038/nature06178. [DOI] [PubMed] [Google Scholar]
- 64.Matsumoto T, Kitano J. The intricate relationship between sexually antagonistic selection and the evolution of sex chromosome fusions. Journal of Theoretical Biology. 2016;404:97–108. doi: 10.1016/j.jtbi.2016.05.036. [DOI] [PubMed] [Google Scholar]
- 65.Charlesworth B. The evolution of chromosomal sex determination and dosage compensation. Current Biology. 1996;6:149–162. doi: 10.1016/S0960-9822(02)00448-7. [DOI] [PubMed] [Google Scholar]
- 66.Saxena R, et al. The DAZ gene cluster on the human y chromosome arose from an autosomal gene that was transposed, repeatedly amplified and pruned. Nature Genetics. 1996;14:292. doi: 10.1038/ng1196-292. [DOI] [PubMed] [Google Scholar]
- 67.Lahn BT, Page DC. Retroposition of autosomal mRNA yielded testis-specific gene family on human y chromosome. Nature Genetics. 1999;21:429–433. doi: 10.1038/7771. [DOI] [PubMed] [Google Scholar]
- 68.Bhowmick BK, Satta Y, Takahata N. The origin and evolution of human ampliconic gene families and ampliconic structure. Genome Research. 2007;17:441–450. doi: 10.1101/gr.5734907. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Zhou Q, Bachtrog D. Sex-specific adaptation drives early sex chromosome evolution in drosophila. Science (New York, N.Y.) 2012;337:341–345. doi: 10.1126/science.1225385. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Wyckoff GJ, Li J, Wu C-I. Molecular evolution of functional genes on the mammalian y chromosome. Molecular Biology and Evolution. 2002;19:1633–1636. doi: 10.1093/oxfordjournals.molbev.a004226. [DOI] [PubMed] [Google Scholar]
- 71.Wyckoff GJ, Wang W, Wu C-I. Rapid evolution of male reproductive genes in the descent of man. Nature. 2000;403:304. doi: 10.1038/35002070. [DOI] [PubMed] [Google Scholar]
- 72. Kirkpatrick M, Barton N. Chromosome inversions, local adaptation and speciation. Genetics. 2006;173:419–434. doi: 10.1534/genetics.105.047985.
- 73. Kondrashov FA, Rogozin IB, Wolf YI, Koonin EV. Selection in the evolution of gene duplications. Genome Biology. 2002;3:research0008.1. doi: 10.1186/gb-2002-3-2-research0008.
- 74. Kondrashov FA, Koonin EV. A common framework for understanding the origin of genetic dominance and evolutionary fates of gene duplications. Trends in Genetics. 2004;20:287–290. doi: 10.1016/j.tig.2004.05.001.
- 75. McVicker G, Gordon D, Davis C, Green P. Widespread genomic signatures of natural selection in hominid evolution. PLOS Genetics. 2009;5:e1000471. doi: 10.1371/journal.pgen.1000471.
- 76. Crow KD, Wagner GP. What is the role of genome duplication in the evolution of complexity and diversity? Molecular Biology and Evolution. 2006;23:887–892. doi: 10.1093/molbev/msj083.
- 77. Bush GL, Case SM, Wilson AC, Patton JL. Rapid speciation and chromosomal evolution in mammals. Proceedings of the National Academy of Sciences of the United States of America. 1977;74:3942–3946. doi: 10.1073/pnas.74.9.3942.
- 78. Livingstone K, Rieseberg L. Chromosomal evolution and speciation: a recombination-based approach: Research review. New Phytologist. 2003;161:107–112. doi: 10.1046/j.1469-8137.2003.00942.x.
- 79. Foster JW, Graves JA. An SRY-related sequence on the marsupial X chromosome: implications for the evolution of the mammalian testis-determining gene. Proceedings of the National Academy of Sciences of the United States of America. 1994;91:1927–1931. doi: 10.1073/pnas.91.5.1927.
- 80. Charlesworth D, Charlesworth B, Marais G. Steps in the evolution of heteromorphic sex chromosomes. Heredity. 2005;95:118. doi: 10.1038/sj.hdy.6800697.
- 81. Hughes JF, Rozen S. Genomics and genetics of human and primate Y chromosomes. Annual Review of Genomics and Human Genetics. 2012;13:83–108. doi: 10.1146/annurev-genom-090711-163855.
- 82. de la Chapelle A, Tippett PA, Wetterstrand G, Page D. Genetic evidence of X-Y interchange in a human XX male. Nature. 1984;307:170–171. doi: 10.1038/307170a0.
- 83. Page DC, de la Chapelle A, Weissenbach J. Chromosome Y-specific DNA in related human XX males. Nature. 1985;315:224–226. doi: 10.1038/315224a0.
- 84. Pepene CE, Coman I, Mihu D, Militaru M, Duncea I. Infertility in a new 46, XX male with positive SRY confirmed by fluorescence in situ hybridization: a case report. Clinical and Experimental Obstetrics & Gynecology. 2008;35:299–300.
- 85. Anik A, Çatlı G, Abaci A, Böber E. 46,XX male disorder of sexual development: A case report. Journal of Clinical Research in Pediatric Endocrinology. 2013;5:258–260. doi: 10.4274/Jcrpe.1098.
- 86. Wang T, Liu JH, Yang J, Chen J, Ye ZQ. 46, XX male sex reversal syndrome: a case report and review of the genetic basis. Andrologia. 2009;41:59–62. doi: 10.1111/j.1439-0272.2008.00889.x.
- 87. Wu Q-Y, et al. Clinical, molecular and cytogenetic analysis of 46, XX testicular disorder of sex development with SRY-positive. BMC Urology. 2014;14:70. doi: 10.1186/1471-2490-14-70.
- 88. Bachtrog D. The temporal dynamics of processes underlying Y chromosome degeneration. Genetics. 2008;179:1513–1525. doi: 10.1534/genetics.107.084012.
- 89. Kuroki Y, et al. Comparative analysis of chimpanzee and human Y chromosomes unveils complex evolutionary pathway. Nature Genetics. 2006;38:158. doi: 10.1038/ng1729.
- 90. Bachtrog D. Evidence that positive selection drives Y-chromosome degeneration in Drosophila miranda. Nature Genetics. 2004;36:518. doi: 10.1038/ng1347.
- 91. Rice WR. Genetic hitchhiking and the evolution of reduced genetic activity of the Y sex chromosome. Genetics. 1987;116:161–167. doi: 10.1093/genetics/116.1.161.
- 92. Dorus S, Evans PD, Wyckoff GJ, Choi SS, Lahn BT. Rate of molecular evolution of the seminal protein gene SEMG2 correlates with levels of female promiscuity. Nature Genetics. 2004;36:1326–1329. doi: 10.1038/ng1471.
- 93. Ohno S. Evolution by Gene Duplication. Springer-Verlag; 1970.
- 94. Charlesworth B, Betancourt AJ, Kaiser VB, Gordo I. Genetic recombination and molecular evolution. Cold Spring Harbor Symposia on Quantitative Biology. 2009;74:177–186. doi: 10.1101/sqb.2009.74.015.
- 95. Van Valen L. A new evolutionary law. Evolutionary Theory. 1973;1:1–30.
- 96. Renaud G, et al. Improved de novo genomic assembly for the domestic donkey. Science Advances. 2018;4:eaaq0392. doi: 10.1126/sciadv.aaq0392.
- 97. Bellott DW, et al. Avian W and mammalian Y chromosomes convergently retained dosage-sensitive regulators. Nature Genetics. 2017;49:387–394. doi: 10.1038/ng.3778.
- 98. Smith CA, et al. The avian Z-linked gene DMRT1 is required for male sex determination in the chicken. Nature. 2009;461:267–271. doi: 10.1038/nature08298.
- 99. Marshall Graves JA. Sex determination: Birds do it with a Z gene. Nature. 2009;461:177–178. doi: 10.1038/461177a.
- 100. Vicoso B, Bachtrog D. Progress and prospects toward our understanding of the evolution of dosage compensation. Chromosome Research. 2009;17.
- 101. Wyckoff GJ, Malcom CM, Vallender EJ, Lahn BT. A highly unexpected strong correlation between fixation probability of nonsynonymous mutations and mutation rate. Trends in Genetics. 2005;21:381–385. doi: 10.1016/j.tig.2005.05.005.
- 102. Koonin EV. Orthologs, paralogs, and evolutionary genomics. Annual Review of Genetics. 2005;39:309–338. doi: 10.1146/annurev.genet.39.073003.114725.
- 103. Vallender EJ, Paschall JE, Malcom CM, Lahn BT, Wyckoff GJ. SPEED: a molecular-evolution-based database of mammalian orthologous groups. Bioinformatics. 2006;22:2835–2837. doi: 10.1093/bioinformatics/btl471.
- 104. Ferger WF. The nature and use of the harmonic mean. Journal of the American Statistical Association. 1931;26:36–40. doi: 10.1080/01621459.1931.10503148.
- 105. Huntley RP, et al. The GOA database: gene ontology annotation updates for 2015. Nucleic Acids Research. 2015;43:D1057–1063. doi: 10.1093/nar/gku1113.
- 106. du Plessis L, Škunca N, Dessimoz C. The what, where, how and why of gene ontology–a primer for bioinformaticians. Briefings in Bioinformatics. 2011;12:723–735. doi: 10.1093/bib/bbr002.
- 107. Škunca N, Altenhoff A, Dessimoz C. Quality of computationally inferred gene ontology annotations. PLOS Computational Biology. 2012;8:e1002533. doi: 10.1371/journal.pcbi.1002533.
- 108. Martínez O, Reyes-Valdés MH. Defining diversity, specialization, and gene specificity in transcriptomes through information theory. Proceedings of the National Academy of Sciences. 2008;105:9709–9714. doi: 10.1073/pnas.0803479105.
- 109. Gustavsson S, Fagerberg B, Sallsten G, Andersson EM. Regression models for log-normal data: Comparing different methods for quantifying the association between abdominal adiposity and biomarkers of inflammation and insulin resistance. International Journal of Environmental Research and Public Health. 2014;11:3521–3539. doi: 10.3390/ijerph110403521.
- 110. Tong ENC, Mues C, Thomas L. A zero-adjusted gamma model for mortgage loan loss given default. International Journal of Forecasting. 2013;29:548–562. doi: 10.1016/j.ijforecast.2013.03.003.
Data Availability Statement
Requests for compiled data materials pertaining to this manuscript should be addressed to J.W. Inquiries pertaining to the use of raw data from the SPEED database should be submitted to J.M.S. (email: jstaley2@ksu.edu).
Requests for code should be addressed to J.W.