Summary
Whole genome duplication (WGD) has had profound macroevolutionary impacts on diverse lineages1,2, preceding adaptive radiations in vertebrates3–5, teleost fish6,7, and angiosperms8,9. In contrast to the many known ancient WGDs in animals10,11 and especially plants12–14, we are aware of evidence for only four in fungi15,16. The oldest of these occurred ~100 million years ago (mya) and is shared by ~60 extant Saccharomycetales species17,18, including the baker’s yeast Saccharomyces cerevisiae (Fig. 1). Notably, this is the only known ancient WGD in the yeast subphylum Saccharomycotina. The dearth of ancient WGD events in fungi remains a mystery15. Some studies have suggested that fungal lineages that experience chromosome19 and genome15 duplication quickly go extinct, leaving no trace in the genomic record, while others contend that the lack of known WGD is due to an absence of data15,16. Under the second hypothesis, additional sampling and deeper sequencing of fungal genomes should lead to the discovery of more WGD events. Coupling hundreds of recently published genomes from nearly every described Saccharomycotina species with three additional long-read assemblies, we discovered three novel WGD events. While the functions of retained duplicate genes originating from these events are broad, they bear many similarities to the well-known WGD that occurred in the Saccharomycetales17. Our results suggest that WGD may be a more common evolutionary force in fungi than previously believed.
Keywords: Whole Genome Duplication, Yeasts, Polyploidy, Convergent Evolution
Results and Discussion
Evidence for three whole genome duplication events in the Dipodascales
To detect signatures of ancient WGD we first used a gene/species tree reconciliation algorithm to infer gene duplications across the Saccharomycotina phylogeny. High rates of gene duplication along specific lineages of the species tree can be indicative of WGD events and have been used to identify ancient WGDs in plants20–22, animals23, and fungi24. An initial analysis using a 400-species backbone phylogeny (Fig. S1) successfully recovered the known ancient WGD near the base of Saccharomycetales (labelled WGD1). This lineage possessed the second-highest rate of duplications, over 10x higher than the average internal lineage. However, the lineages with the first and third highest rates of duplication occurred in the Dipodascales, a clade separated from the Saccharomycetales by ~300 million years (my) of evolution.
To improve the resolution of gene family evolution within both clades, we increased our sampling in Saccharomycetales (135 genomes) and Dipodascales (184 genomes) and performed additional gene/species tree reconciliation analysis. The lineage containing WGD1 was again successfully identified by a spike in duplication rate, experiencing 53.1 gene duplications / my (Fig. 1). Two internal spikes were also evident in Dipodascales. The most recent ancestral lineage to the genera Dipodascus and Geotrichum (spanning from 201–132 mya) had a duplication rate of 21.8 duplications / my, indicating that a WGD may have occurred along this lineage (labelled WGD2). Similarly, the most recent ancestral lineage to the species Magnusiomyces tetraspermus and Saprochaete suaveolens (spanning from 62–38 mya) had a duplication rate of 23.62 duplications / my (labelled WGD3) (Fig. 1). Additionally, the highest species-specific duplication rate by far belonged to M. magnusii with 130 gene duplications / my, 2.4x higher than any other Dipodascales species. M. magnusii also possesses the largest known Saccharomycotina genome (43.2 Mb, 3.4x larger than S. cerevisiae), leading us to hypothesize another WGD (labelled WGD4, spanning from 27–0 mya) specific to this species alone.
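The rates above are duplication counts normalized by branch duration. As a rough illustration only, the following Python sketch back-calculates the number of duplications each reported rate would imply, assuming the rate is a simple average over the full dated span of the branch. The function names are hypothetical, and the implied counts are arithmetic consequences of that assumption, not values reported by the reconciliation analysis.

```python
def duplication_rate(n_duplications, start_mya, end_mya):
    """Duplications per million years along a branch dated start_mya -> end_mya."""
    duration = start_mya - end_mya
    if duration <= 0:
        raise ValueError("start_mya must be older (larger) than end_mya")
    return n_duplications / duration


def implied_duplications(rate_per_my, start_mya, end_mya):
    """Inverse of duplication_rate: count implied by a rate and a branch span."""
    return rate_per_my * (start_mya - end_mya)


# Rates and branch spans as stated in the text (assumed to be simple averages).
branches = {
    "WGD2": (21.8, 201, 132),
    "WGD3": (23.62, 62, 38),
    "WGD4": (130, 27, 0),
}

for label, (rate, start, end) in branches.items():
    count = implied_duplications(rate, start, end)
    print(f"{label}: ~{count:.0f} duplications over {start - end} my")
# WGD2: ~1504 duplications over 69 my
# WGD3: ~567 duplications over 24 my
# WGD4: ~3510 duplications over 27 my
```

Because the rate depends on the dated branch length, uncertainty in divergence time estimates carries over directly into the rate.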
Figure 1.
Three additional whole genome duplications in Saccharomycotina yeasts. Rates of gene duplication are mapped onto Saccharomycetales (left) and Dipodascales (right) phylogenies. Whole genome duplications are indicated along the lineages in which they are predicted to have occurred. Trees have been pruned for visualization purposes.
While gene/species tree reconciliation approaches are useful in identifying putative WGDs, they are limited in their ability to distinguish true WGD from bouts of single or segmental gene duplication events25, or in definitively placing WGDs on specific lineages due to gene tree discordance and unbalanced paralog retention24. For example, descendant lineages of WGD2 and WGD3 also have elevated rates of gene duplication (Fig. 1).
A complementary approach to identify WGD involves the detection of colinear segments of paralogs (called multiplicons) within a genome26,27. Such analysis requires highly contiguous assemblies. Assemblies with fewer than 100 contigs were available for one post-WGD2 (G. candidum: 28 contigs) and one post-WGD3 (Sap. suaveolens: 12 contigs) species. To generate additional evidence for hypothesized WGDs and their placements on the yeast phylogeny we re-sequenced three additional species: D. fermentans for WGD2 (19 contigs), M. tetraspermus for WGD3 (33 contigs) and M. magnusii for WGD4 (71 contigs). These 5 genomes, along with 81 other highly contiguous Saccharomycotina genome assemblies (Data S1) were then searched for multiplicons (Fig. 2A).
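To make the idea of a multiplicon concrete, here is a minimal Python sketch of collinearity detection between two genomic segments. It is a toy dynamic-programming chain finder, not the algorithm implemented by any published synteny tool. The function name, the `max_gap` parameter, and the orthogroup labels are hypothetical, and only same-orientation segments are handled.

```python
def longest_collinear_chain(seg_a, seg_b, max_gap=5):
    """Return the longest chain of paralog anchors (i, j) between two segments.

    seg_a, seg_b: lists of orthogroup labels in chromosomal order.
    An anchor is a pair of positions whose genes share an orthogroup.
    A chain requires both positions to increase, with at most `max_gap`
    positions between consecutive anchors on either segment.
    """
    anchors = sorted(
        (i, j)
        for i, ga in enumerate(seg_a)
        for j, gb in enumerate(seg_b)
        if ga == gb
    )
    if not anchors:
        return []

    best_len = [1] * len(anchors)   # longest chain ending at each anchor
    prev = [-1] * len(anchors)      # back-pointers for chain recovery
    for k, (i, j) in enumerate(anchors):
        for p in range(k):
            pi, pj = anchors[p]
            collinear = pi < i and pj < j
            close = (i - pi) <= max_gap and (j - pj) <= max_gap
            if collinear and close and best_len[p] + 1 > best_len[k]:
                best_len[k] = best_len[p] + 1
                prev[k] = p

    # Walk back from the end of the best chain.
    k = max(range(len(anchors)), key=best_len.__getitem__)
    chain = []
    while k != -1:
        chain.append(anchors[k])
        k = prev[k]
    return chain[::-1]


# Hypothetical segments: shared orthogroups (OG*) interrupted by unique genes.
seg_a = ["OG1", "OG2", "x1", "OG3", "OG4"]
seg_b = ["OG1", "y1", "OG2", "OG3", "y2", "OG4"]
print(longest_collinear_chain(seg_a, seg_b))
# [(0, 0), (1, 2), (3, 3), (4, 5)]
```

Fragmented assemblies break such chains at contig boundaries, which is one reason this kind of analysis needs highly contiguous genomes.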
Figure 2.
Paralog collinearity supports WGD events. A) Cumulative multiplicon size of each highly contiguous Saccharomycotina assembly, as measured by the number of unique genes. Genomes predicted to have undergone ancient WGD are highlighted. B) Gene order of select multiplicons. Colors represent unique orthogroups, demonstrating homology both within and between species. Orthogroups that only appear once are depicted in gray.
We found that genomes hypothesized to descend from ancient WGDs each contained numerous large multiplicons across scaffolds. Furthermore, multiplicons between D. fermentans and G. candidum had many orthogroups in common, as did Sap. suaveolens, M. tetraspermus, and M. magnusii (Fig. 2B). However, multiplicons between post-WGD2 and post-WGD3 species (D. fermentans and Sap. suaveolens, for example) contained largely distinct orthogroups (Data S2), consistent with our inference that WGD2 and WGD3 represent two distinct events (Fig. 1). Interestingly, whereas multiplicons for most post-WGD species occurred in duplicate, in M. magnusii they appear in tetraplicate, providing further evidence for WGD4 as an event specific to this species. WGD4 is also supported by examination of the cumulative multiplicon size in M. magnusii, which is 186.5% that of sister species Sap. suaveolens.
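One way to read "cumulative multiplicon size, as measured by the number of unique genes" is as the size of the union of gene sets across all multiplicons in a genome. The Python sketch below assumes that reading and uses a hypothetical data layout (each multiplicon is a list of segments, each segment a list of gene identifiers). The gene names are made up for illustration.

```python
def cumulative_multiplicon_size(multiplicons):
    """Count unique genes across all segments of all multiplicons."""
    genes = set()
    for multiplicon in multiplicons:
        for segment in multiplicon:
            genes.update(segment)
    return len(genes)


def relative_size(size_a, size_b):
    """Size of genome A's multiplicons as a percentage of genome B's."""
    return 100.0 * size_a / size_b


# Hypothetical example: gene g3 sits in two multiplicons but is counted once.
genome_x = [
    [["g1", "g2", "g3"], ["g7", "g8", "g9"]],
    [["g3", "g4"], ["g10", "g11"]],
]
genome_y = [
    [["h1", "h2"], ["h5", "h6"]],
]
size_x = cumulative_multiplicon_size(genome_x)
size_y = cumulative_multiplicon_size(genome_y)
print(size_x, size_y, relative_size(size_x, size_y))
# 9 4 225.0
```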
Another common hallmark of ancient WGD is a large increase in the number of chromosomes across lineages28. However, fungi possess small, loosely packed chromosomes, which make karyotyping difficult29. Furthermore, Saccharomycotina yeasts have extremely diverse telomeres30,31 and centromeres32,33, which make chromosome number estimation challenging even for highly contiguous assemblies. Despite these obstacles, chromosome count estimates exist for several Dipodascales species, including one representative from each proposed WGD. For species not inferred to have undergone WGD, count estimates range from four34 to five35,36 chromosomes. This estimate increases to seven chromosomes for the post-WGD3 species Sap. suaveolens37 and eight38 or nine39 for the post-WGD2 species G. candidum. Finally, 13 chromosomes have been inferred from pulsed field gel electrophoresis for M. magnusii40, which is predicted to have undergone two rounds of WGD (both WGD3 and WGD4). Therefore, while limited chromosome count estimates in Dipodascales preclude a formal analysis, we interpret the available data to be consistent with our conclusions.
WGD can occur either within lineages (autopolyploidization) or between lineages via hybridization24,41 (allopolyploidization). WGD1 was originally hypothesized to be an autopolyploidization event due to the highly conserved gene order within descendant genomes18. However, a gene/species tree reconciliation analysis later recovered a lineage pre-dating WGD1 with a higher rate of gene duplications than the WGD1 lineage itself24. The authors posit that this unusual finding could be explained by gene tree discordance produced through hybridization24, which suggests allopolyploidization. As our reconciliation analysis recovered no such peak in gene duplication rates preceding any of the novel WGD events (Fig. 1), we hypothesize that these events occurred either through autopolyploidization or through allopolyploidization between closely related parent species.
In contrast to more ancient WGD events, WGD4 may have occurred relatively recently and is apparently specific to just M. magnusii. Thus, this WGD may be of similar age to several young hybrids within Saccharomycotina species complexes that are known to vary in ploidy levels42–47. Though M. magnusii is not a known hybrid, further investigation is warranted to better elucidate the timing and mechanism of WGD4. For example, broader sampling within M. magnusii will help determine whether WGD4 is shared by the entire species, or specific to just the taxonomic type strain.
Common consequences of genome duplication
WGD is often referred to as an engine of evolutionary innovation2,4. For example, it has been hypothesized that WGD1 facilitated aerobic glycolysis in several yeast species48,49. Also known as the “Crabtree/Warburg effect”, this is the biochemical process that empowers baking, brewing, and winemaking. However, others have noted that some species that have not undergone WGD1 are still capable of aerobic glycolysis to a lesser extent50. As this example illustrates, identifying cause-effect relationships from rare evolutionary events, such as WGD, is exceedingly difficult51,52. The discovery of additional WGD events in Saccharomycotina affords new opportunities for understanding how WGD events contribute to evolutionary innovation.
To test whether certain functional gene classes were more likely to be retained in duplicate following WGD, and whether these classes were convergently shared across events, we ran enrichment analysis on retained orthologs for each event. Notably, WGD4 ohnologs were significantly enriched (adjusted p<0.05) for only five InterPro Gene Ontology53 categories (Data S3) and for none of the Kyoto Encyclopedia of Genes and Genomes54 (KEGG) pathways (Data S4). As 70% of all M. magnusii genes are ohnologs, we hypothesize that not enough time has passed since WGD4 for genes to become lost and patterns within retained copies to emerge. As discussed above, it is not yet known whether WGD4 represents a truly ancient WGD or a strain-specific allopolyploidization. Therefore, we turn our attention to the three older events, WGD1, WGD2, and WGD3 (WGD1,2,3 for short), for the remainder of this discussion.
The effects of WGD appear widespread. Ohnologs were significantly enriched (adjusted p<0.01) in at least six of the seven main KEGG pathway categories across WGD1,2,3 (Data S4). Despite the diversity of affected pathways, commonalities remain. Enrichment analysis of InterPro Gene Ontology annotations and KEGG pathways revealed three overarching functional themes in post-WGD genomes: metabolism, expression, and signaling (Fig. 3). We discuss each of these in turn below.
Figure 3.
Shared gene ontology categories are preferentially retained following ancient WGD in different clades of Saccharomycotina yeasts. The ten most overrepresented gene ontology categories in ohnologs from WGD1, WGD2, and WGD3 are shown, colored by generic category.
Metabolism
Within the KEGG carbohydrate metabolism subcategory, no specific pathway was shared between events. However, four related pathways were significantly enriched (adjusted p<0.01) in one of the three WGD events (Fig. 4). For example, WGD2 ohnologs were enriched in the propanoate metabolic pathway. Propanoate esters and other related compounds are responsible for the characteristic fruity scent of G. candidum55, a popular yeast in cheesemaking56 as well as one of the major natural flavorants of cocoa57,58. Many other Geotrichum/Dipodascus species are similarly known for their flavor-enhancing properties59.
Figure 4.
Convergent and divergent impacts of whole genome duplication on metabolic and signaling pathways. Two exemplary KEGG pathways significantly enriched (adjusted p<0.01) among ohnologs from WGD1, WGD2, WGD3, and WGD4 events: glycolysis (left) and the MAPK signaling pathway (right). Genes are colored by which event ohnologs were retained from and pathways are colored by which event they were significantly enriched in. Transparent genes are those that were not identified in any genome in any amount.
As mentioned above, the metabolic effects of WGD1 have been studied previously48–50,60. Enzymes and hexose transporters involved in glycolysis have been preferentially retained in post-WGD1 species, in particular those genes whose dosage most greatly increases glycolytic flux60. We find that all ten of these genes have been retained in duplicate from at least one of the three newly reported WGD events, even those not retained from WGD1. Three gene families, HXT, ENO, and PYK, were retained across all four events. HXT is particularly notable, as dosage of these hexose transporter-encoding genes has the single biggest positive impact on glycolytic flux by a wide margin60.
It has also been hypothesized that WGD1 coincided with the rise of angiosperms and the increased availability of simple sugars found in fruit and nectar17. Our discovery of multiple WGDs reveals that these events are not tied to specific time periods. However, we do find that WGD events are associated with metabolic specialism in yeasts. Post-WGD species metabolize significantly fewer substrates than species that have not undergone WGD (phylogenetic ANOVA p<2.2e-16), a result that persists even if WGD1 taxa are excluded (phylogenetic ANOVA p=1e-3). We hypothesized that additional gene copies in preexisting metabolic pathways produced by WGD events facilitate higher-throughput metabolic output on glucose, reducing evolutionary pressure to metabolize alternative substrates. However, we found that post-WGD species have significantly reduced growth rates on the simple sugars mannose (phylogenetic ANOVA p=4.6e-2) and fructose (phylogenetic ANOVA p=1.6e-3), while differences on glucose were not significant (phylogenetic ANOVA p=0.9). Clearly, more work is required to disentangle the effects of WGD on metabolic rate, growth rate, and niche breadth, an effort we hope will be aided by the discovery of these additional events.
Expression
While metabolic pathways were differentially enriched across WGD events, other pathways exhibited stronger signatures of convergence. Between KEGG and Gene Ontology annotations, WGD1,2,3 were all significantly enriched with ribosomal proteins (Data S3, Data S4, Fig. S2). In Saccharomyces species, the ribosomal protein activator Ifh1 and repressor Crf1 are known ohnologs that diverged from an ancestral generalist regulator following WGD161. Ifh1 induces expression under nutrient-rich growth conditions, whereas Crf1 represses expression during periods of stress, providing finer regulatory control during periods of environmental heterogeneity compared to non-WGD1 species61. Though these regulators are not found in Dipodascales, regulation of transcription was the third most-enriched gene ontology term in WGD2 ohnologs and the most-enriched in WGD3 ohnologs (Data S3), suggesting similar dynamics may be at play.
Signaling
Lastly, signaling pathways were also consistent beneficiaries of WGD. The mitogen-activated protein kinase (MAPK), glucagon, and insulin signaling pathways were all significantly enriched (adjusted p<0.01) across WGD1,2,3 (Data S4, Fig. 4, S3). MAPK is an ancient family of kinases that drive phosphorylation cascades, which trigger a variety of cellular mechanisms in S. cerevisiae in response to stressful conditions, such as high osmolarity62, damage to the cell wall63, or starvation64 (Fig. 4). While fungi do not naturally produce glucagon or insulin, these general pathways are deeply conserved across eukaryotes65,66. These pathways contain the protein kinases AMPK and PKA, which serve important nutrient-sensing functions and trigger cellular responses accordingly67,68. Both AMPK and PKA contain duplicate genes from all four WGD events (Fig. S3).
As with yeasts, MAPK proteins in humans exert control over the cell cycle, which makes them promising targets for cancer treatments68,69. In fact, many pathways significantly enriched (adjusted p<0.01) with ohnologs from WGD1,2,3 are related to human disease, including proteoglycans in cancer, insulin resistance, and COVID-19 (Data S4).
Conclusion
We report the presence of three previously unknown whole genome duplication events, denoted WGD2, WGD3, and WGD4, in the Dipodascales clade. WGD2 occurred in the most recent common ancestor of the genera Dipodascus and Geotrichum 201–132 mya. WGD3 occurred in the most recent common ancestor of the species Sap. suaveolens, M. magnusii, and M. tetraspermus 62–38 mya. WGD4 is specific to M. magnusii, occurring 27–0 mya. All these events are strongly supported by both gene/species tree reconciliation and synteny analyses. Despite ~300 my of evolution separating them, WGD2 and WGD3 share many similar outcomes with the previously known WGD1 occurring in Saccharomycetales, as well as other WGDs across eukaryotes. Genes with many protein-protein interactions, such as those involved with transcription or metabolic/signaling networks, are expected to be more sensitive to dosage effects, and therefore more likely to be retained following WGD7,70. Indeed, genes coding for ribosomal proteins and signal transducers have both been preferentially retained following WGD in plants71 and animals72. The observed extent of convergence suggests that the effects of WGD and other major evolutionary events may be predictable, corroborating recent work in this clade73.
Another feature common to WGD-enriched pathways is their role in adaptation under diverse environments. Previous studies in yeasts have shown how these metabolic60, ribosomal61, and signaling64,66 pathways provide heterogeneous responses to hostile conditions and to the quantity and quality of available nutrients. The enrichment of these gene families following WGD may explain why modern polyploid S. cerevisiae strains are more fit in challenging environments, such as non-optimal carbon sources74, human hosts75, or brewing vats76. This result may further help to address the more widely observed trend of polyploid plant77 and animal78 species occurring in extreme, rapidly changing environments.
The nonrandom distribution of retained ohnologs indicates WGD as a potential driver of evolutionary innovation. However, more work is required to identify whether various modes of functional divergence are involved, such as neofunctionalization or escape from adaptive conflict79, or if simply increasing dosage of conserved copies is sufficient to precipitate major evolutionary change60. Although WGD was previously thought to be largely absent from fungi15, these results underscore its importance in all eukaryotic kingdoms. We anticipate more fungal WGDs will be discovered as sampling and sequencing continue to improve, and that these events will yield a fuller portrait of eukaryotic genome evolution.
Methods
Experimental model and study participant details
Dipodascus fermentans PYCC 3480T (NRRL Y-1492) was obtained from the Portuguese Yeast Culture Collection (PYCC) and was routinely grown on solid yeast peptone dextrose (YPD) media at 25°C. M. magnusii (NRRL Y-17563) and M. tetraspermus (NRRL Y-7288) were routinely grown on solid yeast peptone dextrose (YPD) media at room temperature (22°C). Liquid cultures were inoculated from a single colony and grown in 25 ml YPD at room temperature in a 125 ml baffled flask shaking at 225 rpm.
Method details
D. fermentans extraction and assembly
Genomic DNA from overnight-grown cultures of D. fermentans was obtained using the Quick-DNA Fungal/Bacterial Miniprep Kit from Zymo Research (cat no. D6005), following the manufacturer’s protocol. Long-read data were obtained using Oxford Nanopore Technology, with a MinION flowcell. For de novo assembly, Canu v2.280 was used with default parameters, only adjusting the genome size flag to 25 m. The resulting contigs were corrected with two rounds of Racon v1.5.081, one with the Nanopore reads and the other with publicly available Illumina reads82,83 (SRR16988715). Afterwards, several rounds of Pilon v1.2484 were performed using Illumina reads until no changes were reported in the changes file. To further increase the contiguity of the assembly, LINKS v1.8.785 was used.
M. magnusii and M. tetraspermus extraction and assembly
High molecular weight DNA from M. magnusii and M. tetraspermus was extracted using the Zymo Quick-DNA HMW MagBead Kit (cat no. D6060). The manufacturer’s protocol was followed, but lysis was optimized for non-conventional yeast species. The yeast cells were pelleted by centrifugation at 5000 x g and resuspended in 1 ml of 1 M sorbitol with 50 mM dithiothreitol (DTT) and incubated at 30°C for 10 min. The cells were then washed in fresh 1 M sorbitol and resuspended in 200 ul 1 M sorbitol with 5 U of Zymolyase Ultra (Zymo Research, cat no. E1007–2) and incubated at 37°C for 2 hrs. Once cells were spheroplasted, 205 ul phosphate buffered saline, 20 ul of 10% SDS, and 10 ul proteinase K were added, and the spheroplasts were lysed at 55°C for 10 min with occasional inversion. The DNA in the lysate was then bound to the beads following the manufacturer’s protocol. After extraction, the DNA was enriched for high molecular weight DNA using a bead cleanup with a custom buffer (10 mM Tris-HCl, 1 mM EDTA pH 8, 1.6 M NaCl, 11% PEG) as previously described86. Genome sequencing was performed by Plasmidsaurus using Oxford Nanopore Technology. Both genomes were assembled with Flye v2.9.687 using the `nano-hq` option.
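For orientation, the two long-read assembly invocations described in this section might be constructed as in the Python sketch below. The file names, output directories, prefix, and thread count are hypothetical placeholders. Only the genome size setting (Canu) and the `--nano-hq` input mode (Flye) come from the text; everything else is an assumption, and the polishing and scaffolding steps are omitted.

```python
import shlex


def canu_command(reads, prefix, out_dir, genome_size="25m"):
    """Build a Canu command with an adjusted genome size and Nanopore reads."""
    return [
        "canu",
        "-p", prefix,                      # output file prefix (hypothetical)
        "-d", out_dir,                     # output directory (hypothetical)
        f"genomeSize={genome_size}",       # the one non-default setting in the text
        "-nanopore", reads,
    ]


def flye_command(reads, out_dir, threads=8):
    """Build a Flye command using the high-quality Nanopore read mode."""
    return [
        "flye",
        "--nano-hq", reads,
        "--out-dir", out_dir,
        "--threads", str(threads),         # thread count is an assumption
    ]


# Print the commands rather than running them; pass a list to
# subprocess.run(cmd, check=True) to execute on a machine with the tools installed.
print(shlex.join(canu_command("dfermentans.fastq.gz", "dferm", "canu_out")))
print(shlex.join(flye_command("mmagnusii.fastq.gz", "flye_out")))
```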
Gene annotation
To infer gene boundaries for the newly generated genomes of D. fermentans, M. magnusii, and M. tetraspermus, we used funannotate v1.8.1688. Each genome was first masked with tantan v4089 using the funannotate mask function. Next, each genome was annotated using the funannotate predict function, a wrapper that runs multiple gene-calling algorithms and creates a consensus set of gene boundaries. Prediction algorithms implemented include Augustus v3.3.290, SNAP v2006-07-2891, and GlimmerHMM92; each algorithm was trained on gene models predicted using BUSCO v2.093 from the OrthoDB v994 database of near-universally single-copy orthologs from fungi. For Augustus, the `optimize_augustus` argument was used. Additional gene boundaries were predicted by mapping gene annotations from a clustered set of proteins from 332 Saccharomycotina proteomes82; clustering was done using CD-HIT v4.8.195 with default settings. The results from each gene prediction algorithm were used to create a consensus set of gene boundaries using EVidenceModeler v1.1.196 with the “repeats2evm” argument. All approaches were given the same weight, except high-quality gene annotations (defined as >90% exon evidence) predicted by Augustus, which were given twice the weight of other algorithms. Gene models less than 50 amino acids in length and putative transposable elements were subsequently removed. Putative transposable elements were identified using sequence similarity searches conducted using DIAMOND v2.1.897 and the funannotate database of repeat sequences. The resulting gene models were functionally annotated with both KEGG v114.054 and InterPro v106.053 databases. KEGG orthologs were identified through the KofamKOALA98 web server. InterPro Gene Ontology annotations were assigned using InterProScan v5.74_105.099. Each gene model was annotated locally using the `disable-precalc` option.
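The final filtering step (dropping gene models shorter than 50 amino acids and putative transposable elements) can be sketched in a few lines of Python. This is an illustrative stand-in, not the funannotate implementation. The function name, the dictionary layout, and the example identifiers are hypothetical, and trailing stop-codon symbols are assumed not to count toward length.

```python
def filter_gene_models(proteins, te_hits, min_length=50):
    """Keep gene models that are long enough and not flagged as TE-like.

    proteins: dict mapping gene ID -> protein sequence
    te_hits:  set of gene IDs with similarity hits to repeat/TE sequences
    """
    kept = {}
    for gene_id, seq in proteins.items():
        length = len(seq.rstrip("*"))      # ignore a trailing stop symbol
        if length >= min_length and gene_id not in te_hits:
            kept[gene_id] = seq
    return kept


# Hypothetical toy input.
proteins = {
    "gene1": "M" + "A" * 119,      # 120 aa, kept
    "gene2": "M" + "A" * 29,       # 30 aa, removed (too short)
    "gene3": "M" + "A" * 199,      # 200 aa, removed (TE hit)
    "gene4": "M" + "A" * 49 + "*", # exactly 50 aa plus stop, kept
}
te_hits = {"gene3"}
print(sorted(filter_gene_models(proteins, te_hits)))
# ['gene1', 'gene4']
```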
Gene Tree Inference
Three comparative genomic datasets were used in this study. The first was based on the recently published Y1000+ Project dataset83. The full set of 1,154 genomes in this dataset proved computationally intractable and was subsampled down to 400 using the following procedure: a genome with the shortest terminal branch in the species tree was pruned at random, unless that genome had ≤50 contigs. This process was repeated iteratively until 400 genomes remained in the tree. This sampling strategy maximized phylogenetic breadth and depth while retaining highly contiguous genomes that could be used for synteny analysis. Gene trees were inferred with OrthoFinder v3.0.1b1100 using the Y1000+ species tree as a reference. OrthoFinder identified large spikes in duplication rate within the order Dipodascales, warranting further investigation. Therefore, a second dataset was assembled using all 184 available Dipodascales genomes, as well as a third dataset of 135 Saccharomycetales genomes where WGD1 was known to occur. OrthoFinder was run on both of these datasets as before. A full record of all genomes used in this study can be found in Data S1.
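The subsampling procedure can be approximated with the Python sketch below. It makes a simplifying assumption that differs from pruning a real tree: terminal branch lengths are treated as fixed, whereas removing a tip from a phylogeny would lengthen its sister's terminal branch. The function name, data layout, and seed are hypothetical.

```python
import random


def subsample_genomes(genomes, target, protected_max_contigs=50, seed=0):
    """Iteratively drop the genome with the shortest terminal branch.

    genomes: dict mapping name -> (terminal_branch_length, n_contigs)
    Genomes with <= protected_max_contigs contigs are never pruned.
    Ties for the shortest branch are broken at random.
    """
    rng = random.Random(seed)
    kept = dict(genomes)
    while len(kept) > target:
        candidates = {
            name: branch
            for name, (branch, contigs) in kept.items()
            if contigs > protected_max_contigs
        }
        if not candidates:          # only protected genomes remain
            break
        shortest = min(candidates.values())
        tied = sorted(n for n, b in candidates.items() if b == shortest)
        kept.pop(rng.choice(tied))
    return kept


# Hypothetical toy tree tips: (terminal branch length, contig count).
toy = {"A": (0.10, 200), "B": (0.05, 10), "C": (0.20, 300), "D": (0.30, 120)}
print(sorted(subsample_genomes(toy, target=2)))
# ['B', 'D']
```

In the toy example, B has the shortest branch but is protected by its low contig count, so A and then C are pruned instead.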
Quantification and statistical analysis
Synteny analysis was performed using the wgd v2 software package101. First, paralogous gene sets were identified within each of 83 Saccharomycotina genome assemblies with ≤50 contigs, in addition to the 2 new genomes sequenced by this study, using the `wgd dmd` command with default parameters. Next, colinear segments (multiplicons) were called with the `wgd syn` command, also under default parameters.
Enrichment analysis was performed using the clusterProfiler v4.10102 package in R v4.3.2103. The function `enrichKEGG` was used for KEGG pathway enrichment, while the function `enricher` was used for gene ontology enrichment. False discovery rate104 was used to control for multiple testing in both cases. Target genes were those that occurred in duplicate across colinear segments within genomes from species predicted to have experienced a given WGD event, against a background of all genes in the genome(s). As M. magnusii experienced two rounds of WGD, target genes in this species were further filtered to those estimated to have duplicated prior to the M. magnusii + Sap. suaveolens split (for WGD3) and those duplicates specific to M. magnusii (for WGD4).
To test whether niche breadth or growth curves were significantly different in post-WGD species, we performed phylogenetic ANOVA using the `aov.phylo` function as implemented in the geiger v2.0.11105 package. Niche breadth data was taken from David et al. 202573, and growth curves were obtained from Opulente et al. 202483.
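Over-representation analyses of this kind are typically based on a one-sided hypergeometric test followed by a multiple-testing correction. The Python sketch below illustrates that logic with the standard library only. It is not the clusterProfiler code, the Benjamini-Hochberg procedure is assumed as the false discovery rate method, and all function names and example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_targets, n_overlap):
    """P(X >= n_overlap) when drawing n_targets genes from the background.

    n_background: all genes in the genome(s)
    n_annotated:  background genes carrying the term or pathway
    n_targets:    retained duplicates (target set)
    n_overlap:    target genes carrying the term or pathway
    """
    upper = min(n_annotated, n_targets)
    total = comb(n_background, n_targets)
    tail = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_targets - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):            # largest p-value first
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: 5 of 10 background genes carry a term, all 5 targets do.
p = hypergeom_enrichment_p(10, 5, 5, 5)
print(round(p, 6))                          # 0.003968 (that is, 1/252)
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```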
Supplementary Material
Highlights.
Evidence for three whole genome duplications (WGDs) in Dipodascales yeasts
Impacts of WGD are broad but bear many similarities to the known WGD in Saccharomycetales yeasts
Duplicates with many protein-protein interactions are more likely to be retained over long timescales
WGD in fungi is likely underreported
Acknowledgements
This work was performed using resources contained within the Advanced Computing Center for Research and Education at Vanderbilt University in Nashville, TN. This work was supported by the NSF (grants DBI-2305612 to K.T.D., DEB-2110403 to C.T.H., and DEB-2110404 to A.R.) and the NIH (R35GM151348 to M.P.). This work was partially supported by FCT—Fundação para a Ciência e a Tecnologia, I.P. (FCT/MCTES; https://www.fct.pt/) in the scope of projects UIDP/04378/2020, UIDB/04378/2020, LA/P/0140/2020 and grant PTDC/BIA-EVL/0604/2021 (to C.G.). Research in the Hittinger Lab is also supported by the United States Department of Agriculture National Institute of Food and Agriculture (Hatch Project 7005101), in part by the Department of Energy (DOE) Great Lakes Bioenergy Research Center (DOE Biological and Environmental Research Office of Science DE–SC0018409). Research in the Rokas Lab is also supported by the NIH/National Institute of Allergy and Infectious Diseases (R01 AI153356). J.L.S. is a Howard Hughes Medical Institute Awardee of the Life Sciences Research Foundation.
Footnotes
Conflicts of Interest
J.L.S. is an advisor to ForensisGroup Inc. J.L.S. is a scientific consultant to FutureHouse Inc. A.R. is a scientific consultant for LifeMine Therapeutics, Inc. All other authors declare no conflict of interest.
References
- 1.Otto S.P. (2007). The evolutionary consequences of polyploidy. Cell 131, 452–462. [DOI] [PubMed] [Google Scholar]
- 2.Van de Peer Y., Mizrachi E., and Marchal K. (2017). The evolutionary significance of polyploidy. Nature Reviews Genetics 18, 411. [Google Scholar]
- 3.Dehal P., and Boore J.L. (2005). Two rounds of whole genome duplication in the ancestral vertebrate. PLoS biology 3, e314. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Ohno S. (1970). Evolution by gene duplication (Springer Science & Business Media; ). [Google Scholar]
- 5.Yu D., Ren Y., Uesaka M., Beavan A.J.S., Muffato M., Shen J., Li Y., Sato I., Wan W., Clark J.W., et al. (2024). Hagfish genome elucidates vertebrate whole-genome duplication events and their evolutionary consequences. Nat Ecol Evol 8, 519–535. 10.1038/s41559-023-02299-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Brunet F.G., Crollius H.R., Paris M., Aury J.-M., Gibert P., Jaillon O., Laudet V., and Robinson-Rechavi M. (2006). Gene loss and evolutionary rates following whole-genome duplication in teleost fishes. Molecular biology and evolution 23, 1808–1816. [DOI] [PubMed] [Google Scholar]
- 7.Glasauer S.M., and Neuhauss S.C. (2014). Whole-genome duplication in teleost fishes and its evolutionary consequences. Molecular genetics and genomics 289, 1045–1060. [DOI] [PubMed] [Google Scholar]
- 8.Bodt S.D., Maere S., and Peer Y.V. de (2005). Genome duplication and the origin of angiosperms. Trends in Ecology & Evolution 20, 591–597. 10.1016/j.tree.2005.07.008. [DOI] [PubMed] [Google Scholar]
- 9.Tank D.C., Eastman J.M., Pennell M.W., Soltis P.S., Soltis D.E., Hinchliff C.E., Brown J.W., Sessa E.B., and Harmon L.J. (2015). Nested radiations and the pulse of angiosperm diversification: increased diversification rates often follow whole genome duplications. New Phytologist 207, 454–467. [DOI] [PubMed] [Google Scholar]
- 10.Gregory T.R., and Mable B.K. (2005). Polyploidy in animals. In The evolution of the genome (Elsevier; ), pp. 427–517. [Google Scholar]
- 11.Otto S.P., and Whitton J. (2000). Polyploid incidence and evolution. Annual review of genetics 34, 401–437. [Google Scholar]
- 12.Ren R., Wang H., Guo C., Zhang N., Zeng L., Chen Y., Ma H., and Qi J. (2018). Widespread Whole Genome Duplications Contribute to Genome Complexity and Species Diversity in Angiosperms. Molecular Plant 11, 414–428. 10.1016/j.molp.2018.01.002.
- 13.Clark J.W., and Donoghue P.C.J. (2018). Whole-Genome Duplication and Plant Macroevolution. Trends in Plant Science 23, 933–945. 10.1016/j.tplants.2018.07.006.
- 14.Landis J.B., Soltis D.E., Li Z., Marx H.E., Barker M.S., Tank D.C., and Soltis P.S. (2018). Impact of whole-genome duplication events on diversification rates in angiosperms. American Journal of Botany 105, 348–363. 10.1002/ajb2.1060.
- 15.Campbell M.A., Ganley A.R.D., Gabaldón T., and Cox M.P. (2016). The Case of the Missing Ancient Fungal Polyploids. The American Naturalist 188, 602–614. 10.1086/688763.
- 16.Naranjo-Ortiz M.A., and Gabaldón T. (2020). Fungal evolution: cellular, genomic and metabolic complexity. Biological Reviews 95, 1198–1232. 10.1111/brv.12605.
- 17.Wolfe K.H., and Shields D.C. (1997). Molecular evidence for an ancient duplication of the entire yeast genome. Nature 387, 708–713.
- 18.Scannell D.R., Frank A.C., Conant G.C., Byrne K.P., Woolfit M., and Wolfe K.H. (2007). Independent sorting-out of thousands of duplicated gene pairs in two yeast species descended from a whole-genome duplication. Proceedings of the National Academy of Sciences 104, 8397–8402. 10.1073/pnas.0608218104.
- 19.Kohanovski I., Pontz M., Vande Zande P., Selmecki A., Dahan O., Pilpel Y., Yona A.H., and Ram Y. (2024). Aneuploidy can be an evolutionary diversion on the path to adaptation. Molecular Biology and Evolution 41, msae052.
- 20.Yang Y., Moore M.J., Brockington S.F., Mikenas J., Olivieri J., Walker J.F., and Smith S.A. (2018). Improved transcriptome sampling pinpoints 26 ancient and more recent polyploidy events in Caryophyllales, including two allopolyploidy events. New Phytologist 217, 855–870. 10.1111/nph.14812.
- 21.Li Z., Baniaga A.E., Sessa E.B., Scascitelli M., Graham S.W., Rieseberg L.H., and Barker M.S. (2015). Early genome duplications in conifers and other seed plants. Science Advances 1, e1501084.
- 22.Jiao Y., Wickett N.J., Ayyampalayam S., Chanderbali A.S., Landherr L., Ralph P.E., Tomsho L.P., Hu Y., Liang H., and Soltis P.S. (2011). Ancestral polyploidy in seed plants and angiosperms. Nature 473, 97–100.
- 23.Huerta-Cepas J., Dopazo H., Dopazo J., and Gabaldón T. (2007). The human phylome. Genome Biology 8, R109.
- 24.Marcet-Houben M., and Gabaldón T. (2015). Beyond the Whole-Genome Duplication: Phylogenetic Evidence for an Ancient Interspecies Hybridization in the Baker’s Yeast Lineage. PLOS Biology 13, e1002220. 10.1371/journal.pbio.1002220.
- 25.Zwaenepoel A., Li Z., Lohaus R., and Van de Peer Y. (2019). Finding Evidence for Whole Genome Duplications: A Reappraisal. Molecular Plant 12, 133–136. 10.1016/j.molp.2018.12.019.
- 26.Van de Peer Y. (2004). Computational approaches to unveiling ancient genome duplications. Nat Rev Genet 5, 752–763. 10.1038/nrg1449.
- 27.Tang H., Bowers J.E., Wang X., Ming R., Alam M., and Paterson A.H. (2008). Synteny and Collinearity in Plant Genomes. Science 320, 486–488. 10.1126/science.1153917.
- 28.Glick L., and Mayrose I. (2014). ChromEvol: assessing the pattern of chromosome number evolution and the inference of polyploidy along a phylogeny. Molecular Biology and Evolution 31, 1914–1922.
- 29.Deacon J.W. (2005). Fungal biology (John Wiley & Sons).
- 30.Lue N.F. (2010). Plasticity of telomere maintenance mechanisms in yeast. Trends in Biochemical Sciences 35, 8–17. 10.1016/j.tibs.2009.08.006.
- 31.Peska V., Fajkus P., Bubeník M., Brázda V., Bohálová N., Dvořáček V., Fajkus J., and Garcia S. (2021). Extraordinary diversity of telomeres, telomerase RNAs and their template regions in Saccharomycetaceae. Sci Rep 11, 12784. 10.1038/s41598-021-92126-x.
- 32.Helsen J., Ramachandran K., Sherlock G., and Dey G. (2025). Centromeres evolve progressively through selection at the kinetochore interface. Preprint at bioRxiv, https://doi.org/10.1101/2025.01.16.633479.
- 33.Haase M.A.B., Lazar-Stefanita L., Baudry L., Wudzinska A., Zhou X., Rokas A., Hittinger C.T., Musacchio A., and Boeke J.D. (2025). Ancient co-option of LTR retrotransposons as yeast centromeres. Preprint at bioRxiv, https://doi.org/10.1101/2025.04.25.647736.
- 34.Brejová B., Lichancová H., Brázdovič F., Hegedűsová E., Forgáčová Jakúbková M., Hodorová V., Džugasová V., Baláž A., Zeiselová L., Cillingová A., et al. (2019). Genome sequence of the opportunistic human pathogen Magnusiomyces capitatus. Curr Genet 65, 539–560. 10.1007/s00294-018-0904-y.
- 35.Brejová B., Lichancová H., Hodorová V., Neboháčová M., Tomáška Ľ., Vinař T., and Nosek J. (2019). Genome Sequence of an Arthroconidial Yeast, Saprochaete fungicola CBS 625.85. Microbiology Resource Announcements 8. 10.1128/mra.00092-19.
- 36.Hodorová V., Lichancová H., Zubenko S., Sienkiewicz K., Penir S.M.U., Afanasyev P., Boceck D., Bonnin S., Hakobyan S., Smyczynska U., et al. (2019). Genome Sequence of the Yeast Saprochaete ingens CBS 517.90. Microbiology Resource Announcements 8. 10.1128/mra.01366-19.
- 37.Lichancová H., Hodorová V., Sienkiewicz K., Penir S.M.U., Afanasyev P., Boceck D., Bonnin S., Hakobyan S., Krawczyk P.S., Smyczynska U., et al. (2019). Genome Sequence of Flavor-Producing Yeast Saprochaete suaveolens NRRL Y-17571. Microbiol Resour Announc 8, e00094–19. 10.1128/MRA.00094-19.
- 38.Gente S., Desmasures N., Jacopin C., Plessis G., Beliard M., Panoff J.-M., and Guéguen M. (2002). Intra-species chromosome-length polymorphism in Geotrichum candidum revealed by pulsed field gel electrophoresis. International Journal of Food Microbiology 76, 127–134.
- 39.Naumova E.S., Smith M.Th., Boekhout T., de Hoog G.S., and Naumov G.I. (2001). Molecular differentiation of sibling species in the Galactomyces geotrichum complex. Antonie Van Leeuwenhoek 80, 263–273. 10.1023/A:1013038610122.
- 40.Filipp D., Filipp P., Nosek J., and Hladk M. (1995). Electrophoretic karyotype of Dipodascus (Endomyces) magnusii: two main intraspecific chromosomal polymorphisms associated with the difference in total genome size. Curr Genet 29, 81–87. 10.1007/BF00313197.
- 41.Steenwyk J.L., Lind A.L., Ries L.N.A., Reis T.F. dos, Silva L.P., Almeida F., Bastos R.W., Silva T.F. de C.F. da, Bonato V.L.D., Pessoni A.M., et al. (2020). Pathogenic Allodiploid Hybrids of Aspergillus Fungi. Current Biology 30, 2495–2507.e7. 10.1016/j.cub.2020.04.071.
- 42.Barros K.O., Al-Oboudi J., Freitas L.F.D., Sousa F.M.P., Batista T.M., Santos A.R.O., Morais P.B., Sampaio J.P., Lachance M.-A., Hittinger C.T., et al. (2025). Taxogenomic analysis of Pichia senei sp. nov. and new insights into hybridization events in the Pichia cactophila species complex. FEMS Yeast Res 25, foaf037. 10.1093/femsyr/foaf037.
- 43.Gordon J.L., and Wolfe K.H. (2008). Recent allopolyploid origin of Zygosaccharomyces rouxii strain ATCC 42981. Yeast 25, 449–456. 10.1002/yea.1598.
- 44.Mira N.P., Münsterkötter M., Dias-Valada F., Santos J., Palma M., Roque F.C., Guerreiro J.F., Rodrigues F., Sousa M.J., Leão C., et al. (2014). The Genome Sequence of the Highly Acetic Acid-Tolerant Zygosaccharomyces bailii-Derived Interspecies Hybrid Strain ISA1307, Isolated From a Sparkling Wine Plant. DNA Res 21, 299–313. 10.1093/dnares/dst058.
- 45.Pryszcz L.P., Németh T., Gácser A., and Gabaldón T. (2014). Genome Comparison of Candida orthopsilosis Clinical Strains Reveals the Existence of Hybrids between Two Distinct Subspecies. Genome Biol Evol 6, 1069–1078. 10.1093/gbe/evu082.
- 46.Curtin C.D., Borneman A.R., Chambers P.J., and Pretorius I.S. (2012). De-Novo Assembly and Analysis of the Heterozygous Triploid Genome of the Wine Spoilage Yeast Dekkera bruxellensis AWRI1499. PLOS ONE 7, e33840. 10.1371/journal.pone.0033840.
- 47.Langdon Q.K., Peris D., Baker E.P., Opulente D.A., Nguyen H.-V., Bond U., Gonçalves P., Sampaio J.P., Libkind D., and Hittinger C.T. (2019). Fermentation innovation through complex hybridization of wild and domesticated yeasts. Nat Ecol Evol 3, 1576–1586. 10.1038/s41559-019-0998-8.
- 48.Thomson J.M., Gaucher E.A., Burgan M.F., De Kee D.W., Li T., Aris J.P., and Benner S.A. (2005). Resurrecting ancestral alcohol dehydrogenases from yeast. Nat Genet 37, 630–635. 10.1038/ng1553.
- 49.Lin Z., and Li W.-H. (2011). Expansion of Hexose Transporter Genes Was Associated with the Evolution of Aerobic Fermentation in Yeasts. Molecular Biology and Evolution 28, 131–142. 10.1093/molbev/msq184.
- 50.Hagman A., Säll T., Compagno C., and Piskur J. (2013). Yeast “Make-Accumulate-Consume” Life Strategy Evolved as a Multi-Step Process That Predates the Whole Genome Duplication. PLOS ONE 8, e68734. 10.1371/journal.pone.0068734.
- 51.Uyeda J.C., Zenil-Ferguson R., and Pennell M.W. (2018). Rethinking phylogenetic comparative methods. Systematic Biology 67, 1091–1109. 10.1093/sysbio/syy031.
- 52.Maddison W.P., and FitzJohn R.G. (2015). The Unsolved Challenge to Phylogenetic Correlation Tests for Categorical Characters. Systematic Biology 64, 127–136. 10.1093/sysbio/syu070.
- 53.Blum M., Chang H.-Y., Chuguransky S., Grego T., Kandasaamy S., Mitchell A., Nuka G., Paysan-Lafosse T., Qureshi M., Raj S., et al. (2021). The InterPro protein families and domains database: 20 years on. Nucleic Acids Research 49, D344–D354. 10.1093/nar/gkaa977.
- 54.Kanehisa M., and Goto S. (2000). KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Research 28, 27–30.
- 55.Mdaini N., Gargouri M., Hammami M., Monser L., and Hamdi M. (2006). Production of natural fruity aroma by Geotrichum candidum. Appl Biochem Biotechnol 128, 227–235. 10.1385/ABAB:128:3:227.
- 56.Boutrou R., and Guéguen M. (2005). Interests in Geotrichum candidum for cheese technology. International Journal of Food Microbiology 102, 1–20. 10.1016/j.ijfoodmicro.2004.12.028.
- 57.Barros M.C., Quadros J.S., Silva M.A.P. da, and Nunes R.S.C. (2022). Taxonomy and Characterization of Fungi Isolated from Cocoa Beans during Fermentation in Saf’s Agroforestry System in the Amazon. Open Access Library Journal 9, 1–10. 10.4236/oalib.1108825.
- 58.Koné M.K., Guéhi S.T., Durand N., Ban-Koffi L., Berthiot L., Tachon A.F., Brou K., Boulanger R., and Montet D. (2016). Contribution of predominant yeasts to the occurrence of aroma compounds during cocoa bean fermentation. Food Research International 89, 910–917. 10.1016/j.foodres.2016.04.010.
- 59.Grondin E., Shum Cheong Sing A., James S., Nueno-Palop C., François J.M., and Petit T. (2017). Flavour production by Saprochaete and Geotrichum yeasts and their close relatives. Food Chemistry 237, 677–684. 10.1016/j.foodchem.2017.06.009.
- 60.Conant G.C., and Wolfe K.H. (2007). Increased glycolytic flux as an outcome of whole-genome duplication in yeast. Molecular Systems Biology 3. 10.1038/msb4100170.
- 61.Wapinski I., Pfiffner J., French C., Socha A., Thompson D.A., and Regev A. (2010). Gene duplication and the evolution of ribosomal protein gene regulation in yeast. Proceedings of the National Academy of Sciences 107, 5505–5510. 10.1073/pnas.0911905107.
- 62.Saito H., and Posas F. (2012). Response to Hyperosmotic Stress. Genetics 192, 289–318. 10.1534/genetics.112.140863.
- 63.Levin D.E. (2011). Regulation of cell wall biogenesis in Saccharomyces cerevisiae: the cell wall integrity signaling pathway. Genetics 189, 1145–1175.
- 64.Gustin M.C., Albertyn J., Alexander M., and Davenport K. (1998). MAP Kinase Pathways in the Yeast Saccharomyces cerevisiae. Microbiology and Molecular Biology Reviews 62, 1264–1300. 10.1128/mmbr.62.4.1264-1300.1998.
- 65.Barbieri M., Bonafè M., Franceschi C., and Paolisso G. (2003). Insulin/IGF-I-signaling pathway: an evolutionarily conserved mechanism of longevity from yeast to humans. Am J Physiol Endocrinol Metab 285, E1064–1071. 10.1152/ajpendo.00296.2003.
- 66.Zaman S., Lippman S.I., Schneper L., Slonim N., and Broach J.R. (2009). Glucose regulates transcription in yeast through a network of signaling pathways. Mol Syst Biol 5, 245. 10.1038/msb.2009.2.
- 67.Thevelein J.M., and De Winde J.H. (1999). Novel sensing mechanisms and targets for the cAMP–protein kinase A pathway in the yeast Saccharomyces cerevisiae. Molecular Microbiology 33, 904–918. 10.1046/j.1365-2958.1999.01538.x.
- 68.Braicu C., Buse M., Busuioc C., Drula R., Gulei D., Raduly L., Rusu A., Irimie A., Atanasov A.G., Slaby O., et al. (2019). A Comprehensive Review on MAPK: A Promising Therapeutic Target in Cancer. Cancers 11, 1618. 10.3390/cancers11101618.
- 69.Bahar M.E., Kim H.J., and Kim D.R. (2023). Targeting the RAS/RAF/MAPK pathway for cancer therapy: from mechanism to clinical studies. Sig Transduct Target Ther 8, 455. 10.1038/s41392-023-01705-z.
- 70.Hakes L., Pinney J.W., Lovell S.C., Oliver S.G., and Robertson D.L. (2007). All duplicates are not equal: the difference between small-scale and genome duplication. Genome Biol 8, R209. 10.1186/gb-2007-8-10-r209.
- 71.Freeling M., and Thomas B.C. (2006). Gene-balanced duplications, like tetraploidy, provide predictable drive to increase morphological complexity. Genome Res. 16, 805–814. 10.1101/gr.3681406.
- 72.Blomme T., Vandepoele K., De Bodt S., Simillion C., Maere S., and Van de Peer Y. (2006). The gain and loss of genes during 600 million years of vertebrate evolution. Genome Biol 7, R43. 10.1186/gb-2006-7-5-r43.
- 73.David K.T., Schraiber J.G., Crandall J.G., Labella A.L., Opulente D.A., Harrison M.-C., Wolters J.F., Zhou X., Shen X.-X., Groenewald M., et al. (2025). Convergent expansions of keystone gene families drive metabolic innovation in Saccharomycotina yeasts. Proceedings of the National Academy of Sciences 122, e2500165122. 10.1073/pnas.2500165122.
- 74.Scott A.L., Richmond P.A., Dowell R.D., and Selmecki A.M. (2017). The Influence of Polyploidy on the Evolution of Yeast Grown in a Sub-Optimal Carbon Source. Molecular Biology and Evolution 34, 2690–2703. 10.1093/molbev/msx205.
- 75.Zhu Y.O., Sherlock G., and Petrov D.A. (2016). Whole Genome Analysis of 132 Clinical Saccharomyces cerevisiae Strains Reveals Extensive Ploidy Variation. G3 Genes|Genomes|Genetics 6, 2421–2434. 10.1534/g3.116.029397.
- 76.Saada O.A., Tsouris A., Large C., Friedrich A., Dunham M.J., and Schacherer J. (2022). Phased polyploid genomes provide deeper insight into the multiple origins of domesticated Saccharomyces cerevisiae beer yeasts. Current Biology 32, 1350–1361.e3. 10.1016/j.cub.2022.01.068.
- 77.Rice A., Šmarda P., Novosolov M., Drori M., Glick L., Sabath N., Meiri S., Belmaker J., and Mayrose I. (2019). The global biogeography of polyploid plants. Nature Ecology & Evolution 3, 265–273.
- 78.David K.T. (2022). Global Gradients in the Distribution of Animal Polyploids. PNAS.
- 79.Hittinger C.T., and Carroll S.B. (2007). Gene duplication and the adaptive evolution of a classic genetic switch. Nature 449, 677–681. 10.1038/nature06151.
- 80.Koren S., Walenz B.P., Berlin K., Miller J.R., Bergman N.H., and Phillippy A.M. (2017). Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 27, 722–736. 10.1101/gr.215087.116.
- 81.Vaser R., Sović I., Nagarajan N., and Šikić M. (2017). Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res 27, 737–746. 10.1101/gr.214270.116.
- 82.Shen X.-X., Opulente D.A., Kominek J., Zhou X., Steenwyk J.L., Buh K.V., Haase M.A.B., Wisecaver J.H., Wang M., Doering D.T., et al. (2018). Tempo and Mode of Genome Evolution in the Budding Yeast Subphylum. Cell 175, 1533–1545.e20. 10.1016/j.cell.2018.10.023.
- 83.Opulente D.A., LaBella A.L., Harrison M.-C., Wolters J.F., Liu C., Li Y., Kominek J., Steenwyk J.L., Stoneman H.R., VanDenAvond J., et al. (2024). Genomic factors shape carbon and nitrogen metabolic niche breadth across Saccharomycotina yeasts. Science 384, eadj4503. 10.1126/science.adj4503.
- 84.Walker B.J., Abeel T., Shea T., Priest M., Abouelliel A., Sakthikumar S., Cuomo C.A., Zeng Q., Wortman J., Young S.K., et al. (2014). Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9, e112963. 10.1371/journal.pone.0112963.
- 85.Warren R.L., Yang C., Vandervalk B.P., Behsaz B., Lagman A., Jones S.J.M., and Birol I. (2015). LINKS: Scalable, alignment-free scaffolding of draft genomes with long reads. Gigascience 4, 35. 10.1186/s13742-015-0076-3.
- 86.Rojas J., Hose J., Dutcher H.A., Place M., Wolters J.F., Hittinger C.T., and Gasch A.P. (2024). Comparative modeling reveals the molecular determinants of aneuploidy fitness cost in a wild yeast model. Cell Genomics 4. 10.1016/j.xgen.2024.100656.
- 87.Kolmogorov M., Yuan J., Lin Y., and Pevzner P.A. (2019). Assembly of long, error-prone reads using repeat graphs. Nature Biotechnology 37, 540–546.
- 88.Palmer J.M., and Stajich J. (2020). Funannotate v1.8.1: Eukaryotic genome annotation. (Zenodo). https://doi.org/10.5281/zenodo.4054262.
- 89.Frith M.C. (2011). A new repeat-masking method enables specific detection of homologous sequences. Nucleic Acids Research 39, e23.
- 90.Stanke M., Keller O., Gunduz I., Hayes A., Waack S., and Morgenstern B. (2006). AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Research 34, W435–W439. 10.1093/nar/gkl200.
- 91.Korf I. (2004). Gene finding in novel genomes. BMC Bioinformatics 5. 10.1186/1471-2105-5-59.
- 92.Majoros W.H., Pertea M., and Salzberg S.L. (2004). TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879. 10.1093/bioinformatics/bth315.
- 93.Waterhouse R.M., Seppey M., Simão F.A., Manni M., Ioannidis P., Klioutchnikov G., Kriventseva E.V., and Zdobnov E.M. (2018). BUSCO Applications from Quality Assessments to Gene Prediction and Phylogenomics. Mol Biol Evol 35, 543–548. 10.1093/molbev/msx319.
- 94.Zdobnov E.M., Tegenfeldt F., Kuznetsov D., Waterhouse R.M., Simão F.A., Ioannidis P., Seppey M., Loetscher A., and Kriventseva E.V. (2017). OrthoDB v9.1: cataloging evolutionary and functional annotations for animal, fungal, plant, archaeal, bacterial and viral orthologs. Nucleic Acids Research 45, D744–D749. 10.1093/nar/gkw1119.
- 95.Fu L., Niu B., Zhu Z., Wu S., and Li W. (2012). CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152.
- 96.Haas B.J., Salzberg S.L., Zhu W., Pertea M., Allen J.E., Orvis J., White O., Buell C.R., and Wortman J.R. (2008). Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biology 9, R7. 10.1186/gb-2008-9-1-r7.
- 97.Buchfink B., Reuter K., and Drost H.-G. (2021). Sensitive protein alignments at tree-of-life scale using DIAMOND. Nature Methods 18, 366–368.
- 98.Aramaki T., Blanc-Mathieu R., Endo H., Ohkubo K., Kanehisa M., Goto S., and Ogata H. (2020). KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold. Bioinformatics 36, 2251–2252.
- 99.Jones P., Binns D., Chang H.-Y., Fraser M., Li W., McAnulla C., McWilliam H., Maslen J., Mitchell A., and Nuka G. (2014). InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240.
- 100.Emms D.M., and Kelly S. (2019). OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biology 20, 1–14.
- 101.Chen H., Zwaenepoel A., and Van de Peer Y. (2024). wgd v2: a suite of tools to uncover and date ancient polyploidy and whole-genome duplication. Bioinformatics 40, btae272. 10.1093/bioinformatics/btae272.
- 102.Wu T., Hu E., Xu S., Chen M., Guo P., Dai Z., Feng T., Zhou L., Tang W., and Zhan L.I. (2021). clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. The Innovation 2.
- 103.R Core Team (2017). R: A Language and Environment for Statistical Computing.
- 104.Benjamini Y., and Hochberg Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological) 57, 289–300.
- 105.Pennell M.W., Eastman J.M., Slater G.J., Brown J.W., Uyeda J.C., FitzJohn R.G., Alfaro M.E., and Harmon L.J. (2014). geiger v2.0: an expanded suite of methods for fitting macroevolutionary models to phylogenetic trees. Bioinformatics 30, 2216–2218.