Abstract
Triploids are rare in nature because of difficulties in meiotic and gametogenic processes, especially in vertebrates. The Carassius complex of cyprinid teleosts contains sexual tetraploid crucian carp/goldfish (C. auratus) and unisexual hexaploid gibel carp/Prussian carp (C. gibelio) lineages, providing a valuable model for studying the evolution and maintenance mechanism of unisexual polyploids in vertebrates. Here we sequence the genomes of the two species and assemble their haplotypes, which contain two subgenomes (A and B), to the chromosome level. Sequencing coverage analysis reveals that C. gibelio is an amphitriploid (AAABBB) with two triploid sets of chromosomes; each set is derived from a different ancestor. Resequencing data from different strains of C. gibelio show that unisexual reproduction has been maintained for over 0.82 million years. Comparative genomics show intensive expansion and alterations of meiotic cell cycle-related genes and an oocyte-specific histone variant. Cytological assays indicate that C. gibelio produces unreduced oocytes by an alternative ameiotic pathway; however, sporadic homologous recombination and a high rate of gene conversion also exist in C. gibelio. These genomic changes might have facilitated purging deleterious mutations and maintaining genome stability in this unisexual amphitriploid fish. Overall, the current results provide novel insights into the evolutionary mechanisms of the reproductive success in unisexual polyploid vertebrates.
Subject terms: Genetics, Evolution
Genome sequencing and haplotype assembly of two cyprinid teleosts, a sexual tetraploid and a unisexual hexaploid, reveal insights into the evolutionary mechanisms underpinning the reproductive success of unisexual polyploid vertebrates.
Main
Fish of the genus Carassius are important aquaculture species and a rare group of vertebrates with different ploidies, including tetraploids and hexaploids1–3. Previous studies revealed that the chromosomes of C. gibelio have undergone a two-step evolutionary process4. Approximately 10 million years ago (Mya), an ancient hybridization of two distant species in the family Cyprinidae led to the origin of the common ancestor of Carassius, Cyprinus and Sinocyclocheilus. Both ancestral parents had 50 chromosomes (2n = 2× = 50); thus, the allotetraploidy resulted in a doubling of the chromosome number to 100 (2n = 4× = 100) (refs. 3,5,6). Then, C. gibelio experienced subsequent autotriploidy and possessed approximately 150 chromosomes (3n = 6× ≈ 150) (refs. 4,7–9). Therefore, the hexaploid C. gibelio could also be considered a triploid.
Triploids are generally considered an evolutionary ‘dead end’ because they face two major challenges to becoming true ‘species’10. First, triploid organisms usually cannot produce gametes because three homologous chromosomes cannot pair and segregate equally during meiosis and gametogenesis. Second, the ability of recombination to purge deleterious mutations and generate new traits is reduced without sexual reproduction11,12. Unisexual organisms are thought to have high intra-individual genetic diversity (Meselson effect) and to accumulate deleterious mutations (Muller’s ratchet) because of the lack of meiotic recombination11,13–15. However, triploids are commonly found in some polyploid species complexes, including the Loxopholis complex16, Misgurnus complex17, Poecilia complex18 and Carassius complex1,19. Interestingly, triploid C. gibelio overcomes reproductive obstacles via unisexual gynogenesis, in which the eggs are activated by the sperm of sympatric sexual species to initiate embryogenesis, as in kleptospermy in the Amazon molly20,21; it also occupies a wider range of habitats and possesses higher genetic diversity than related sexual species1,19,22,23. However, the evolutionary mechanisms underpinning the unisexual reproduction of C. gibelio remain unknown.
In this study, we sequenced the genomes of the Carassius polyploid complex, including C. gibelio and its close relative C. auratus, and assembled their two high-quality subgenomes (A and B) that were created during the allotetraploidy event. Combined with resequencing data from different strains, we found that the investigated C. gibelio descended from an autotriploidy event hundreds of thousands of years ago. Comparative genome analysis and cytological observations revealed that some meiotic cell cycle-related genes and an oocyte-specific histone variant have intensively expanded and changed, which provided the genomic variation evidence that facilitates gynogenetic oogenesis in C. gibelio. Moreover, unexpected sporadic homologous recombination and a high level of gene conversion among homologues may be the main driver to purge deleterious mutations in C. gibelio. Overall, these novel discoveries provide unprecedented insights into a rare reproductive mode in nature and the underlying genomic evolution mechanism. Additionally, the newly sequenced genomes are valuable resources for precise genetic breeding of Carassius species in aquaculture.
Results
C. gibelio and C. auratus genome sequencing and assembly
PacBio, Illumina and Hi-C sequencing technologies were applied to generate a high-quality genome assembly for C. gibelio and C. auratus (Supplementary Tables 1–4 and Supplementary Fig. 1). The Illumina short reads were first used to investigate the polyploidy through Smudgeplot analysis (Supplementary Note 1)24. In C. auratus, 58% of heterozygous k-mer pairs (with only one nucleotide difference and presented as x and x′) are bivalent (xx′) and 33% of heterozygous k-mer pairs are tetravalent (xxx′x′ and xxxx′) (Extended Data Fig. 1a). This pattern is consistent with amphidiploid (a synonym of allotetraploid25, AABB) characteristics, where two subgenomes are quite divergent but still homologous. In contrast, C. gibelio had mostly heterozygous k-mer pairs with the structure xxx′ (72%), followed by heterozygous k-mer pairs with the structure xxxx′x′x′ (23%) (Extended Data Fig. 1b), which fits the AAABBB genotype. The estimated haplotype genome size of C. gibelio ranged from 1.49 to 1.56 Gb in k-mer analysis, which is approximately one-third of the genome content (4.70–5.38 pg) estimated by flow cytometric analysis26,27 and similar to the estimated haploid genome size of C. auratus (Supplementary Table 5 and Supplementary Note 1). These results indicate that both of the species have the same amphihaploid content (AB).
The haplotype genome of C. gibelio comprised 2,804 contigs, with a length of 1.59 Gb and contig N50 of 1.71 Mb (Supplementary Table 6). In total, 2,063 contigs were anchored into 50 chromosomes with a total length of 1,502.18 Mb using the Hi-C data (Fig. 1a, Supplementary Table 7 and Supplementary Fig. 2). The assembly contained 98.16% of complete benchmarking universal single-copy orthologs (BUSCO) genes, 45,249 protein-coding genes and 728.98 Mb (45.85%) of repeat contents (Supplementary Tables 8–13 and Supplementary Note 2). The C. auratus genome was also assembled with a size of 1.52 Gb and contig N50 of 3.89 Mb, and anchored to 50 chromosomes (Fig. 1a, Supplementary Fig. 3 and Supplementary Tables 6 and 7). The 50 chromosomes of both fish were divided into two subgenomes, each of which included 25 chromosomes (Fig. 1b), based on the annotation of gene and repeat content. Synteny analysis showed that this partition of subgenomes was consistent with previously published domestic goldfish and common carp genomes (Supplementary Figs. 4 and 5).
Because both the k-mer estimated and assembled genome sizes of C. gibelio were approximately one-third of the genome content, it was evident that the genome assembly included only AB subgenomes; this was the same as the genome assembly of C. auratus. To validate this inference, we made the following two comparisons. First, we performed synteny analysis between C. gibelio and C. auratus, and found that each of their chromosomes aligned well without obvious chromosomal fission or fusion events (Fig. 1a). Second, the reads of each species were mapped back to corresponding genome assemblies to evaluate the allele frequencies and read depths. The minor allele frequencies of most chromosomes were found to be ~0.33 in C. gibelio and ~0.50 in C. auratus (Fig. 1c). The read depths across the genome were also approximately three times that of the single haplotype in C. gibelio and two times that of the single haplotype in C. auratus (Fig. 1d).
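The expected values behind these two checks can be sketched with simple arithmetic. The snippet below is a minimal illustration rather than the analysis pipeline used in this study; it assumes that a heterozygous site carries the minor allele on exactly one of the homologous haplotypes, and the function names are hypothetical.

```python
from fractions import Fraction


def expected_minor_allele_frequency(haplotypes: int) -> Fraction:
    """Expected minor allele frequency at a heterozygous site.

    Assumption: exactly one of the `haplotypes` homologous copies
    carries the minor allele.
    """
    if haplotypes < 2:
        raise ValueError("a heterozygous site needs at least two haplotypes")
    return Fraction(1, haplotypes)


def expected_relative_depth(haplotypes: int) -> int:
    """Read depth relative to one haplotype when all reads are mapped
    to an assembly that contains a single haplotype (AB)."""
    return haplotypes


for species, copies in (("C. gibelio", 3), ("C. auratus", 2)):
    maf = float(expected_minor_allele_frequency(copies))
    depth = expected_relative_depth(copies)
    print(f"{species}: expected MAF ~{maf:.2f}, depth ~{depth}x a single haplotype")
```

Under this assumption, three haplotypes give an expected minor allele frequency of 1/3 and two haplotypes give 1/2, in line with the ~0.33 and ~0.50 values reported above.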
Moreover, to provide more evidence at the genomic block and gene levels, we performed an allelic analysis by BAC phasing and polymerase chain reaction (PCR) verification. We found that most of the phased blocks indeed had three homologous alleles for both A and B subgenomes in C. gibelio (Supplementary Fig. 6), and the functionally investigated foxl2 and viperin were also demonstrated to contain three highly identical alleles28,29. These results clearly show that both the genome assemblies of C. gibelio and C. auratus comprise one haplotype of the AB subgenomes, but C. gibelio has three haplotypes for most chromosomes (this will be discussed in a later section) and C. auratus has two haplotypes for all chromosomes (Fig. 1e). Following the nomenclature of amphidiploid, we called C. gibelio an amphitriploid (AAABBB) with two triploid sets of chromosomes, each of which was derived from a different ancestor.
Allotetraploidy and genomic variations of Carassius
The phylogenetic relationship was reconstructed using both concatenated and coalescent methods (Fig. 2 and Supplementary Fig. 7). Consistent with previous studies5,30, subgenome B had a closer relationship to the diploid mud carp (Cirrhinus molitorella) and Yunnan Wenkong Barbinae fish (Poropuntius huangchuchieni) than subgenome A. It could be inferred that: (1) the progenitor-like genomes (ancestors of subgenomes A and B) diverged around 19.50 Mya (T1) (Fig. 2); (2) the allotetraploidy event (the hybridization of subgenomes A and B) occurred between 10.17 and 12.87 Mya (T2), based on the divergence times of common carp (Cyprinus carpio) versus Carassius, and versus P. huangchuchieni; and (3) the divergence of C. gibelio and C. auratus occurred around 0.96 Mya (T3) (Fig. 2). The new estimates of timing were more ancient than previously thought (T1: 13.75 to 15.09 Mya) (ref. 30) partially because we discarded a suspicious time calibration: the divergence time between Cyprininae and Leuciscinae (~20.5 Mya) (refs. 30,31). This widely used time calibration was not from fossil records but from estimation based on several nuclear and mitochondrial genes along with the mutation rate of mammals32. Compared with previous dating, newly estimated divergence times without this calibration have a better fit to the distribution of synonymous mutations (Ks) between species (Supplementary Fig. 8). In addition, we noticed that the phylogenetic position of Cirrhinus molitorella found here and in a previous study30 conflicted with another previous study33, in which a single gene (rag2) tree was constructed and the results showed that C. molitorella was an outgroup of both subgenomes A and B. To determine why this inconsistency occurred, we further examined the proportion of orthologous genes supporting each topology. The results highlighted a high level of phylogenetic heterogeneity (Supplementary Table 14), and the topology with the highest proportion was consistent with the current phylogenetic tree.
The evolution of subgenomes of these carps has been widely studied5,30,31,33–35, and here, the more dominant subgenome B was confirmed (Supplementary Fig. 9 and Supplementary Note 3). Also, we have identified genes that are specifically lost in Carassius species (Supplementary Table 15, Supplementary Fig. 10 and Supplementary Note 4).
Autotriploidy origin and genomic changes of C. gibelio
Overall, six C. gibelio individuals from three strains were used to investigate the origin of this unisexual species, including three individuals for strain A+, two for strain H and one for strain F (Supplementary Table 16). Combined with ten C. auratus individuals and one Cyprinus carpio individual downloaded from public databases (Supplementary Table 17), 48,843,026 single-nucleotide polymorphisms (SNPs) and 8,431,930 insertions and deletions were called within C. gibelio using the C. auratus genome assembly as a reference (Supplementary Table 18). The depth distributions of minor alleles revealed that almost all C. gibelio individuals had three alleles for each chromosome, whereas all C. auratus individuals had two alleles for each chromosome (Extended Data Fig. 2); this further confirmed that C. auratus and C. gibelio are amphidiploid and amphitriploid, respectively.
Principal component (PC) analysis was used to examine the phylogenetic relationships among different strains of C. gibelio and C. auratus. The first component explained 18.62% of the genetic variance and showed a clear split between C. gibelio and C. auratus, whereas the second component explained 13.28% of the genetic variance and showed clear distance among the three strains of C. gibelio that could be associated with the lack of gene flow due to unisexual reproduction (Fig. 3a). The maximum likelihood tree yielded similar results (Fig. 3b). Moreover, 4,400 non-coding elements were found to be shared by all C. gibelio individuals (Supplementary Fig. 11 and Supplementary Note 5) but were absent in C. auratus, Cyprinus carpio and S. grahami, indicating that they are newly evolved elements in C. gibelio. Taken together, these results suggest that the investigated C. gibelio might have a common origin.
The divergence time of the three C. gibelio strains was estimated to be approximately 0.82 Mya (T4) using fourfold degenerate sites (Supplementary Fig. 12). Therefore, all C. gibelio lines probably originated from an amphidiploid ancestor that experienced an autotriploidy event at approximately 0.82–0.96 Mya (Fig. 3c). This also means that the unisexual reproduction of C. gibelio has been maintained for a long time.
We also noticed that some chromosomes in the individuals, including C. gibelio (Cg)-F1, Cg-A1, Cg-A2 and Cg-A3, exhibited unusual alterations of allele frequencies and read depths (Supplementary Fig. 13). Compared with other chromosomes, these unusual chromosomes from different individuals had allele frequencies of approximately 0.50, which is very close to that of C. auratus chromosomes, and had approximately 2/3 or 4/3 the read depths of other C. gibelio chromosomes (Supplementary Fig. 13). These data indicate that these chromosomes have lost or obtained one haplotype. In addition, we estimated the expression ratios of the individual Cg-F for each chromosome compared with the corresponding C. auratus genes. In a global analysis that combined seven tissues to determine the average expression levels of orthologous genes between C. auratus and C. gibelio, the three unusual chromosomes displayed clear decreases in average gene expression ratio (P = 6.86 × 10⁻⁷, 6.24 × 10⁻⁸ and 2.21 × 10⁻⁹, t-test), and were only approximately 2/3 that of other chromosomes (Supplementary Fig. 14).
Expansion of meiosis-related genes in the C. gibelio genome
In triploids, the three homologous chromosomes cannot pair correctly or segregate equally during meiosis I, which causes failure of gametogenesis36. To understand what happens in C. gibelio oogenesis, we first measured the DNA content during oocyte development. The DNA content of C. gibelio oocytes at early prophase was approximately 1.67 times that of corresponding C. auratus oocytes (Fig. 4a), whereas the DNA content of C. gibelio mature oocytes was approximately 3 times that of C. auratus mature oocytes (Fig. 4a); this indicates formation of unreduced eggs in C. gibelio compared with formation of reduced eggs in C. auratus. Additionally, compared with 50 bivalents in C. auratus, an average of more than 130 univalents was counted in germinal vesicle breakdown oocytes of C. gibelio (Fig. 4b); these findings suggest that chiasmata, which physically connect homologous chromosomes, were largely missing. Therefore, meiosis I was suppressed during oogenesis in C. gibelio (Fig. 4c).
To explore the genomic clues concerning the unreduced eggs in C. gibelio, we performed an in-depth comparative genomic analysis and found a total of 13 gene families that have more copies in all C. gibelio individuals compared with C. auratus and Cyprinus carpio (Fig. 4d and Supplementary Table 19). Interestingly, nine of the expanded gene families have important roles in oocyte development, especially in meiosis and spindle organization. The most expanded gene is a histone variant, h2af1al, of which the B homeologue has expanded to 11 copies in the C. gibelio assembly (Fig. 4e). Five of the expanded copies (B1–B5) were found to be specifically expressed in the ovary (Fig. 4e). Further, transcriptomic analyses of the isolated oocytes and embryos indicated that these histone variants are maternal factors with high expression in pre-vitellogenic oocytes (POs) and vitellogenic oocytes (VOs), which correspond to pre- and post-diplotene stages of meiosis prophase I, respectively. Histone variants can replace canonical histones to remodel chromatin and affect histone post-translational modifications37, and H2af1al has the ability to modify nucleosome properties during oogenesis in C. gibelio38.
Importantly, all of the expanded meiosis-related genes, including two cell cycle-related genes (fbxo5 and ccna2), three spindle organization genes (rhoA, incenp and nusap1) and three nuclear envelope-related genes (lem4, lap2 and bmb), were assigned to the common meiosis pathway of oocyte development (Fig. 4f). Most of them (22 of the 26 extra copies of the eight expanded genes) were expressed in the ovary, POs or VOs (RPKM >1) (Supplementary Fig. 15), indicating that they have roles in oocyte development of C. gibelio. We also noticed that most of the new expanded copies were distributed far from the parental copies in genome, with only three exceptions (Extended Data Fig. 3a,b and Supplementary Table 19). In particular, all of the extra copies of h2af1al (11 extra copies) and faap24 (two extra copies) were adjacent to a C. gibelio-specific repeat unit (Extended Data Fig. 3c), indicating that the expansions of these genes might have been mediated by repetitive sequences. The above data suggest that an alternative oogenic pathway to produce chromosome number-unreduced eggs is probably related to intensive expansion of meiosis-related genes in C. gibelio.
Gene conversion and sporadic homologous recombination
It is usually believed that unisexual organisms cannot purge deleterious mutations because no homologous recombination exists during gametogenesis. To study whether deleterious mutations accumulate in C. gibelio, we first compared the genomic heterozygosity between the two Carassius species. The percentage of heterozygous sites is approximately two times higher in C. gibelio than in C. auratus (Fig. 5a). As C. gibelio has three haplotypes per chromosome, this difference is not surprising. We then investigated the number of loss-of-function mutations, non-synonymous substitutions and synonymous substitutions in the two Carassius species using Cyprinus carpio as a reference. Interestingly, there was no notable difference between the two species and all three types of mutations exhibited similar distribution patterns (Fig. 5b and Supplementary Fig. 16). These results indicate that C. gibelio is likely to have the ability to purge mutations, including deleterious mutations, even though it reproduces unisexually.
To evaluate the ability of C. gibelio to purge mutations, we conducted a four-generation breeding experiment for 5 years and tested whether loss of heterozygosity (LOH) occurred in the laboratory environment. LOH is a common form of allelic imbalance by which a heterozygous allele becomes homozygous through deletion of one homologue or through gene conversion, a unidirectional modification of the DNA sequence between similar sequences (Extended Data Fig. 4). Using 11 individuals from the offspring of the gynogenetic line (Supplementary Table 20), we identified 805 LOH regions across 46 chromosomes (Fig. 5c). Most LOH regions were shared by many individuals and thus were probably inherited from ancestors; however, a few were unique, which means they probably arose newly in those individuals (Fig. 5c). PCR and Sanger sequencing validated 97 out of 101 arbitrarily selected LOH loci (Supplementary Fig. 17). The rate of LOH was estimated to be 1.49 × 10⁻⁴ per heterozygous site per generation (Supplementary Table 21), which was much higher than the base-substitution mutation rate of 8.88 × 10⁻⁹ (Methods). The rate of homologous gene conversion was 1.42 × 10⁻⁴ per heterozygous site per generation (Supplementary Table 22), which indicated that gene conversion is responsible for the vast majority of LOH. The gene conversion rate of C. gibelio is two orders of magnitude higher than that reported for other unisexual species39,40 and nearly reaches the reported range of some sexual species41,42, which have an efficient deleterious mutation purging mechanism through recombination in normal meiosis.
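The relationships among these rates can be checked with a short calculation. The sketch below uses only the figures quoted in this paragraph; the variable names are illustrative, and it assumes that the base-substitution rate is expressed per site per generation so that the two rates can be compared directly.

```python
import math

# Values quoted in the text above
loh_rate = 1.49e-4                # LOH per heterozygous site per generation
gene_conversion_rate = 1.42e-4    # gene conversion per heterozygous site per generation
base_substitution_rate = 8.88e-9  # assumed to be per site per generation
validated_loci, tested_loci = 97, 101

# Share of LOH that is attributable to gene conversion
conversion_share = gene_conversion_rate / loh_rate

# How much faster LOH arises than new base substitutions
loh_vs_mutation = loh_rate / base_substitution_rate

# Fraction of tested LOH loci confirmed by PCR and Sanger sequencing
validation_rate = validated_loci / tested_loci

print(f"gene conversion share of LOH: {conversion_share:.1%}")
print(f"LOH rate / base-substitution rate: {loh_vs_mutation:.0f}")
print(f"orders of magnitude: {math.floor(math.log10(loh_vs_mutation))}")
print(f"validated LOH loci: {validation_rate:.1%}")
```

On these numbers, gene conversion accounts for roughly 95% of LOH, and LOH arises about four orders of magnitude faster than base substitutions.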
Gene conversion has been revealed to be able to compensate for the lack of meiotic recombination in diploid asexual/unisexual organisms43. When an LOH event occurs in a genomic region of diploid species, a variant may be cleared or spread, both at a ratio of 50% (Fig. 5d, top). However, there are six possible scenarios of gene conversion in triploid species (Fig. 5d, bottom). In two of the scenarios, the newly occurring mutation was eliminated; in two other scenarios, the proportion of this mutation did not change; and in the last two scenarios, this mutation expanded to more alleles. Therefore, gene conversion can purge mutations and increase diversity among offspring in a more complex manner for triploids.
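The scenario counts described above can be reproduced by enumerating every ordered donor and recipient pair of haplotypes. This is a minimal sketch, assuming that a single haplotype carries the new mutation and that one conversion event copies the donor sequence onto one recipient; the function name and outcome labels are hypothetical, and no claim is made about how likely each scenario is.

```python
from collections import Counter
from itertools import permutations


def conversion_outcomes(ploidy: int, mutant: int = 0) -> Counter:
    """Count the fate of a new mutation over all (donor, recipient) pairs.

    Assumption: only haplotype `mutant` carries the mutation before
    conversion, and the donor sequence overwrites the recipient.
    """
    outcomes = Counter()
    for donor, recipient in permutations(range(ploidy), 2):
        if recipient == mutant:
            outcomes["eliminated"] += 1  # mutant copy overwritten by a wild-type donor
        elif donor == mutant:
            outcomes["expanded"] += 1    # mutation copied onto another haplotype
        else:
            outcomes["unchanged"] += 1   # conversion between two wild-type copies
    return outcomes


for ploidy in (2, 3):
    counts = conversion_outcomes(ploidy)
    print(ploidy, sum(counts.values()), dict(counts))
```

For a diploid the enumeration gives two scenarios (one eliminating and one spreading the mutation), and for a triploid it gives six scenarios, two each for elimination, no change and expansion, matching Fig. 5d as described in the text.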
To understand this from a detailed perspective, we presented two candidate gene conversion regions (Fig. 5e and Extended Data Fig. 5). According to the read coverage of SNP sites between the individuals from the gynogenetic C. gibelio pedigree that did or did not experience gene conversion (Supplementary Fig. 18), the haplotype blocks of gene conversion could be inferred (see the detailed description in Supplementary Note 6). As shown in Fig. 5f, after gene conversion from haplotype 1 to haplotype 2, 12 out of 35 SNP sites (~1/3) became homozygous, which resulted in LOH; the other sites were still heterozygous, among which 14 SNP sites were clearly converted, and nine SNP sites looked unchanged because their haplotypes 1 and 2 had the same bases before conversion. Therefore, high gene conversion might render C. gibelio capable of purging deleterious mutations and may be associated with the alternative ameiotic oogenic mechanism.
Consequently, we compared chromatin behaviour and recombination occurrence during oogenesis of sexual C. auratus and unisexual C. gibelio through co-immunostaining with antibodies against the synaptonemal complex (SC) transverse element (Sycp1), lateral element (Sycp3) and recombinase Rad51 (refs. 44,45). Typical SC formation and homologous recombination were observed in C. auratus, in which 50 synaptonemal bivalents and numerous recombinase Rad51-stained foci were visible, and the highest number of foci was reached (over 200 per cell on average) at zygotene (Fig. 6a). In contrast, only Sycp3-stained univalents appeared in most oocytes of C. gibelio (Fig. 6a), which indicated that SC did not assemble within these oocytes. Homologous recombination indicated by Rad51 signals was also largely suppressed, but sporadic Rad51-stained foci were observed in oocytes of C. gibelio (Fig. 6a). Importantly, the ratio of the Rad51-positive oocytes was found to have an increasing trend along with the progress of oocyte development (Fig. 6b), in which some oocytes (~2.5%) even showed high levels of Rad51-stained foci (over 400) and synaptonemal bivalents (over 20) (Fig. 6c). The different levels of homologous recombination revealed in different oocytes of C. gibelio are consistent with the large variations of gene conversion rates observed among different gynogenetic individuals (Supplementary Table 22), indicating an association between them because non-crossover homologous recombination usually results in gene conversion46.
Discussion
The genomic anatomy of polyploids has been broadly determined in plants and animals, such as in a tetraploid frog (LLSS)47, hexaploid wheat (AABBDD)48 and octoploid strawberry (AABBCCDD)49. However, these dissected polyploid genomes actually represent diploid genomes that contain two or multiple subgenomes. Here, we provide an assembly of an amphitriploid genome (AAABBB), where most genes commonly have two divergent homeologues and each homeologue possesses three highly similar alleles. Although phasing is not complete because of the recent autotriploidy event and the limitation of error-prone long reads, we revealed important genomic changes based on this assembly, including intensive expansion of many meiosis-related genes and a high rate of gene conversion.
Recently, Hojsgaard and Schartl proposed that a genomic assemblage and an alternative reproductive module might be required for the formation of a functioning asexual/unisexual genome50. Intriguingly, the unique amphitriploid genome just represents a non-recombinant genomic assemblage, with intensive expansion and alterations of meiotic cell cycle-related genes and an oocyte-specific histone variant (Fig. 4d,e and Supplementary Fig. 15). These genomic alterations might act as a complementary reproductive module to skip meiosis using an alternative ameiotic pathway to develop into unreduced eggs, and may be essential for the success of unisexual gynogenesis in C. gibelio.
It has been argued that asexual/unisexual lineages should go extinct quickly because they have a reduced ability to purge deleterious mutations and generate high levels of heterozygosity51,52. Similar to C. gibelio, some extant asexual lineages do not exhibit such genomic decays40,53,54. Ameiotic homologous recombination that results in gene conversion has been proposed to be the mechanism to conquer these hindrances for the evolutionary longevity of asexual/unisexual lineages14,40,43. Interestingly, we observed sporadic homologous recombination during oocyte development, and the rate of gene conversion in C. gibelio is two orders of magnitude higher than that of the well-known unisexual Amazon molly40, indicating that C. gibelio might have an efficient way to increase genetic diversity and purge deleterious mutations. Besides the high gene conversion rate, in sharp contrast to other unisexual vertebrates, rare and variable proportions of males (1.2–26.5%) have been found in wild populations of C. gibelio55. Previous studies revealed that the male-specific supernumerary microchromosomes may be the main driving forces for the occurrence of genotypic males56,57 and could result in the creation of beneficial genetic diversity58,59. Therefore, gene conversion and sex might play a key role in fine-tuning the efficiency of gynogenesis60 and contribute to the long evolutionary existence of C. gibelio. However, after initial attempts, we were unfortunately not able to detect substantial mutations around the potential master sex gene amh61 between C. auratus and C. gibelio (Supplementary Fig. 19). Additionally, we failed to obtain any informative male-specific supernumerary sequences from one male individual of C. gibelio (Supplementary Note 7). A high-quality male genome assembly for C. gibelio will be required to uncover the mechanisms underlying male determination62 and gene conversion in the future.
In addition to the genetic importance of our results, the current genomic anatomy in the Carassius complex is also of biological value for genetic breeding to improve aquaculture strains because C. gibelio is one of the most important aquaculture species in China, with approximately 3 million tons of annual production capacity. In the past decades, several new varieties, including allogynogenetic gibel carp63, high dorsal gibel carp64, gibel carp ‘CAS III’ (ref. 65), gibel carp ‘CAS V’ (refs. 66,67) and ‘Changfeng’ gibel carp68,69, have been successfully bred and have made important contributions to Chinese aquaculture70,71. Thus, the genomic data of amphitriploid C. gibelio will provide a valuable resource for accelerating the genetic analysis of economic traits and the precise breeding of new varieties.
Overall, our data and analyses have provided important insight into the genome structure, evolutionary history and genetic maintenance mechanism of the unique amphitriploid C. gibelio. Nevertheless, it is noteworthy that better genome assemblies with all chromosomes phased, which requires very advanced sequencing technology, may be able to provide more comprehensive genetic data to infer the complete picture of the evolution and maintenance of the rare amphitriploid genome of C. gibelio.
Methods
Experimental fish
All individuals were maintained at and sampled from the National Aquatic Biological Resource Center. Animal experiments were approved by the Animal Care and Use Committee of the Institute of Hydrobiology (IHB), Chinese Academy of Sciences (CAS) (approval ID keshuizhuan 0829).
Genome and transcriptome sequencing
Genomic DNA was extracted separately from the blood cells of an adult female from strain F of C. gibelio and of an adult female C. auratus. The short reads were sequenced for the two species using an Illumina HiSeq 2000 with PE 100 bp and PE 49 bp for short (170, 250, 500 and 800 bp) and long (2, 5, 10, 20 and 40 kb) insert size libraries, respectively. BAC libraries with an insert size of 120 kb were constructed only for C. gibelio. A total of 95,492 BAC clones (~6.4×) were randomly selected to extract plasmids. For each clone, a unique index primer and adapter index were linked to the fragment end, and a 500 bp insert size library was constructed and used for Illumina sequencing with PE 100 bp to a coverage depth of ~100×. The single-molecule long reads were sequenced for both species using a Pacific Biosciences Sequel instrument with libraries with a 20-kb average DNA insert size.
For Hi-C sequencing, blood cells were fixed with 2% formaldehyde for each species independently. The cross-linked DNA was digested with MboI, and the sticky ends were biotinylated by incubating with biotin-14-dATP and Klenow enzyme. After DNA purification and removal of biotin from unligated ends, Hi-C products were enriched and physically sheared to fragment sizes of 200–300 bp. The biotin-tagged Hi-C DNA was pulled down and processed into paired-end sequencing libraries that were sequenced PE 100 bp on the Illumina HiSeq 2000 platform. In total, 440 Gb and 231 Gb of Hi-C data were obtained from C. gibelio and C. auratus, respectively.
RNA was extracted from samples of C. gibelio and C. auratus, including eight adult tissues (heart, liver, kidney, muscle, ovary, hypothalamus, pituitary and other brain), POs and VOs72, and embryos at seven developmental stages (four-cell, blastula, gastrula, bud, eight-somite, 1 day post-fertilization (dpf) and 3 dpf). Three biological replicates were analysed per sample. In total, 102 RNA-seq libraries were constructed and sequenced on the Illumina HiSeq 2000 platform.
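The library total follows from the sampling design. The tally below is a minimal sketch, assuming that every sample type was collected from both species; the list names are illustrative.

```python
# Sample types listed in the text above
adult_tissues = [
    "heart", "liver", "kidney", "muscle",
    "ovary", "hypothalamus", "pituitary", "other brain",
]
oocyte_stages = ["PO", "VO"]
embryo_stages = [
    "four-cell", "blastula", "gastrula", "bud",
    "eight-somite", "1 dpf", "3 dpf",
]
species = ["C. gibelio", "C. auratus"]  # assumption: identical design for both species
biological_replicates = 3

samples_per_species = len(adult_tissues) + len(oocyte_stages) + len(embryo_stages)
total_libraries = samples_per_species * biological_replicates * len(species)

print(samples_per_species, total_libraries)
```

With 17 sample types, three replicates and two species, the tally comes to 102 libraries, matching the number reported.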
Genome assembly and chromosome anchoring
PacBio long reads were used for de novo assembly with NextDenovo (https://github.com/Nextomics/NextDenovo) software (v2.3.1). Then the PacBio long reads and all Illumina reads were used to correct the raw de novo assembly with NextPolish software (https://github.com/Nextomics/NextPolish) (v1.3.1, with parameter task=best). Subsequently, Hi-C sequencing data were used to improve the draft genome, and the Hi-C data were mapped to the polished assembly with Juicer (v1.6) (ref. 73). Next, a chromosome-length assembly was generated by the 3D-DNA software (v180922 with default parameters)74. To further improve the chromosome-scale assembly and quality control, manual review and refinement of the candidate assembly were performed with Juicebox Assembly Tools74. The haplotigs and overlapping sequences in the assemblies were removed using Purge_dups (https://github.com/dfguan/purge_dups) software (v1.0.1).
Genome annotation
The repetitive sequences were annotated using both homology-based and de novo predictions. First, the long terminal repeats and tandem repeats were identified using LTR_FINDER (v1.0.5) and TRF (v4.07b)75. Second, the transposable elements (TEs) were identified using RepeatMasker (v4.0.5) (ref. 76) and RepeatProteinMask (v1.36) with the Repbase TE library. Finally, RepeatModeler (v1.0.8) (ref. 77) was used to construct a de novo TE library, which was then used to predict repeats with RepeatMasker (v4.0.5).
To annotate genes comprehensively, we integrated several lines of evidence. For de novo prediction, AUGUSTUS (v3.2.1) (ref. 78) was used to predict coding genes in the repeat-masked genome. For the homology-based approach, protein sequences from three species, Danio rerio (GRCz11), Oryzias latipes (ASM223467v1) and Gasterosteus aculeatus (GAculeatus_UGA_version5), were mapped against the repeat-masked genome using tBLASTN79 with an E-value cut-off of 1e-5. GeneWise (v2.2.0) (ref. 80) was then used to predict gene models from the aligned sequences and the corresponding query proteins. Additionally, Illumina RNA-seq data of C. gibelio and C. auratus were mapped to the genomes of C. gibelio and C. auratus, respectively, using HISAT2 (v2.1.0) (ref. 81) and assembled into transcripts using StringTie (v2.1.4) (ref. 82). We also generated whole-genome alignments to project the Ensembl gene annotation of D. rerio with TOGA (https://github.com/hillerlab/TOGA). Finally, EVM (v1.1.1) (ref. 83) was used to integrate all evidence into the final gene sets.
Gene functions were assigned according to the best alignment match in public databases, including the Swiss-Prot (release-2017_09), TrEMBL (release-2017_09) (ref. 84), KEGG (v84.0) (ref. 85), COG86 and NCBI NR (v20170924) protein databases. The motifs and domains in protein sequences were annotated using InterProScan (v5.16-55.0) (ref. 87) by searching publicly available databases, including Pfam, PRINTS, PANTHER, ProDom, SMART, ProSiteProfiles and ProSitePatterns. The actinopterygii_odb10 lineage dataset was selected to measure the completeness of the gene set using the BUSCO method88.
Subgenome-specific repeats and subgenome distinction
First, we classified the TEs into clusters according to their target sequences in the Repbase or de novo consensus library. We then analysed the distribution of each cluster across the chromosomes. For each homoeologous chromosome pair of subgenomes A and B (LG1 versus LG2, LG3 versus LG4, …), we found some clusters whose abundance differed notably between the two members of the pair. A cluster that is consistently enriched on one member of all 25 homoeologous chromosome pairs can serve as a specific marker to distinguish the two subgenomes, which originated from two distinct progenitor species. In this way, we identified A-subgenome-specific TEs in C. gibelio that targeted two consensus sequences from the de novo library, and B-subgenome-specific TEs that targeted three de novo sequences. The same pattern of subgenome-specific repeats was also found in C. auratus. The subgenome distinction was further validated by synteny alignment against previous studies5,30,31,33–35.
In addition, we used MCScan89 with the parameters -a -e 1e-5 -u 1 -s 5 to identify syntenic blocks between the C. gibelio and C. auratus genomes, between subgenomes A and B of C. auratus, between subgenomes A and B of C. gibelio, and with other published genomes. We first conducted an all-vs-all BLASTP to align the proteins of the two gene sets with an E-value cut-off of 1e-5. The alignments were then subjected to MCScan to determine syntenic blocks, which were visualized using CIRCOS90.
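The marker-selection logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `marker_clusters`, the input layout and the enrichment cut-off `min_fold` are all hypothetical, since the text does not state how a "notable difference" was quantified.

```python
from typing import Dict, List, Tuple


def marker_clusters(
    counts: Dict[str, Dict[str, int]],
    pairs: List[Tuple[str, str]],
    min_fold: float = 5.0,  # hypothetical cut-off, not taken from the paper
) -> Dict[str, List[str]]:
    """Return TE clusters that are enriched on exactly one member of
    every homoeologous chromosome pair.

    counts[cluster][chrom] is the copy number of that cluster on a chromosome.
    The value returned for each marker cluster is the list of enriched
    chromosomes, one per pair; together they outline one subgenome.
    """
    markers: Dict[str, List[str]] = {}
    for cluster, per_chrom in counts.items():
        enriched: List[str] = []
        for first, second in pairs:
            x = per_chrom.get(first, 0)
            y = per_chrom.get(second, 0)
            if x >= min_fold * max(y, 1):
                enriched.append(first)
            elif y >= min_fold * max(x, 1):
                enriched.append(second)
            else:
                break  # no clear asymmetry in this pair: not a marker
        else:
            # the loop finished without a break, so every pair was asymmetric
            markers[cluster] = enriched
    return markers
```

Called with the 25 pairs (LG1, LG2), (LG3, LG4), …, a cluster is kept only if every pair shows the asymmetry; a further check that the enriched chromosomes of different marker clusters agree would be needed before assigning subgenome labels.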
Resequencing-based ploidy analysis
BWA (Version 0.7.12-r1039) (ref. 91) was used to map the Illumina reads of the two C. auratus and six C. gibelio individuals generated in this study (Supplementary Table 16) to their respective genomes, and the alignments were sorted with SAMtools (Version 1.4) (ref. 92) to obtain the bam files. The SNPs were called with FreeBayes (v0.9.10-3-g47a713e)93 and filtered by the following four thresholds: (1) ratio of the depths of the two alleles between 1:9 and 9:1 for C. gibelio (Cg) and between 1:6 and 6:1 for C. auratus (Ca); (2) highest sequencing depth at the SNP position <200× for Cg and <400× for Ca; (3) lowest sequencing depth for each allele ≥5; (4) minimum distance between adjacent SNPs ≥5 bp. Then, the density distribution of the depth of the three allele classes (reference, alternative and both) was computed over all SNPs, and the smallest peak of the distribution was defined as the depth of a single haplotype. The genomic ploidy (n) was evaluated in 1 Mb non-overlapping sliding windows by the following equation:
where k is the number of SNPs in a window.
In addition, the distribution of heterozygosity was estimated using 500 kb non-overlapping sliding windows for each individual. The potential effects of these SNPs were evaluated with SnpEff94 using default parameters.
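As an illustration of this filtering and windowing, the following Python sketch applies the four SNP thresholds and then computes a per-window ploidy estimate. Several points are assumptions rather than details from the text: the `Snp` record and function names are hypothetical, total depth is taken as the sum of the two allele depths, both members of a too-close SNP pair are dropped, and the window estimate is assumed to be the mean over the k SNPs of (total depth / single-haplotype depth), which is one plausible reading of the definitions given here and not a formula quoted from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Snp:
    chrom: str
    pos: int        # 1-based position
    ref_depth: int
    alt_depth: int


def passes_depth_filters(snp: Snp, max_ratio: float, max_depth: int,
                         min_allele_depth: int = 5) -> bool:
    """Thresholds (1)-(3): allele ratio, total depth, per-allele depth."""
    low, high = sorted((snp.ref_depth, snp.alt_depth))
    if low < min_allele_depth:
        return False
    if snp.ref_depth + snp.alt_depth >= max_depth:
        return False
    return high <= max_ratio * low  # e.g. 9 for Cg, 6 for Ca


def filter_snps(snps: List[Snp], max_ratio: float, max_depth: int,
                min_allele_depth: int = 5, min_gap: int = 5) -> List[Snp]:
    """Apply thresholds (1)-(3), then drop SNPs closer than min_gap bp (4)."""
    kept = sorted(
        (s for s in snps
         if passes_depth_filters(s, max_ratio, max_depth, min_allele_depth)),
        key=lambda s: (s.chrom, s.pos),
    )
    result = []
    for i, s in enumerate(kept):
        prev_close = (i > 0 and kept[i - 1].chrom == s.chrom
                      and s.pos - kept[i - 1].pos < min_gap)
        next_close = (i + 1 < len(kept) and kept[i + 1].chrom == s.chrom
                      and kept[i + 1].pos - s.pos < min_gap)
        if not (prev_close or next_close):
            result.append(s)
    return result


def window_ploidy(snps: List[Snp], haplotype_depth: float,
                  window: int = 1_000_000) -> Dict[Tuple[str, int], float]:
    """Assumed estimate: mean over the k SNPs in each window of
    total depth divided by the single-haplotype depth."""
    sums: Dict[Tuple[str, int], List[float]] = defaultdict(lambda: [0.0, 0])
    for s in snps:
        acc = sums[(s.chrom, (s.pos - 1) // window)]
        acc[0] += (s.ref_depth + s.alt_depth) / haplotype_depth
        acc[1] += 1
    return {key: total / k for key, (total, k) in sums.items()}
```

For C. gibelio one would call `filter_snps(snps, max_ratio=9, max_depth=200)` and for C. auratus `filter_snps(snps, max_ratio=6, max_depth=400)`, passing the smallest-peak depth as `haplotype_depth`.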
BAC-based ploidy analysis
We split the data of each BAC library by index sequence, then filtered the reads and assembled each BAC clone with SOAPdenovo2 (r244)95. The haplotype sequences were phased using pairs of adjacent tri- or bi-allelic SNPs that could be spanned by a single Illumina read (SNP pair). The BAC sequences that could be well phased and contained at least four genes were selected for further PCR validation and plotting.
Phylogenetic analysis of C. gibelio and C. auratus
To understand the evolution of subgenomes A and B of C. gibelio and C. auratus, the genomes of six Cyprinidae fishes were retrieved from public databases: Cirrhinus molitorella (GCA_004028445.1), Megalobrama amblycephala (http://gigadb.org/), D. rerio (Ensembl GRCz11), Ctenopharyngodon idellus (http://bioinfo.ihb.ac.cn/gcgd/php/index.php), Poropuntius huangchuchieni (Dryad, 10.5061/dryad.crjdfn32p) and Cyprinus carpio (GCA_018340385.1). The 11 peptide sequence sets from five genomes (C. molitorella, M. amblycephala, D. rerio, C. idellus and P. huangchuchieni) and six subgenomes (subgenome A of C. gibelio, C. auratus and Cyprinus carpio, and subgenome B of C. gibelio, C. auratus and Cyprinus carpio) were subjected to an all-to-all search with DIAMOND96 to identify potential homologous sequences with an E-value <1e-5.
The protein sequences of the 1:1:1 orthologous genes were aligned using MUSCLE (v3.8.425) (ref. 97) with the default parameters. These alignments were subsequently converted into coding sequence alignments by tracing the coding relationship using pal2nal.v14 (ref. 98). Gblocks (v0.91b) (ref. 99) was employed to trim the coding sequence alignments with the parameter ‘-t=c’. The 4d (fourfold degenerate) sites were extracted from the gene sequences retained in the last step. The divergence times between individual species (subgenomes) were estimated with MCMCTree100 using the 4d sites and the species tree from the ASTRAL101 analysis. Time calibration was based on the fossil record: 40.4–48.6 Mya for the most recent common ancestor of D. rerio and C. auratus102–105.
On the basis of the DIAMOND96 results, we selected the reciprocal best-hit gene pairs between each species (subgenome) and C. auratus subgenome B. These pairs were aligned with MUSCLE97 and the Ks values were calculated with KaKs_Calculator2.0 (ref. 106) using the default parameters. The correlation between the divergence times of species pairs from various studies and the peak values of the Ks distributions was assessed by least-squares regression analysis.
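To make the 4d-site extraction step concrete, here is a minimal Python sketch that pulls third codon positions from a codon alignment. It assumes the standard nuclear genetic code and an in-frame alignment, and it keeps a column only when all sequences share the same first two codon bases from a fourfold-degenerate family, which is a conservative choice of ours; the function name and input layout are hypothetical and the authors' own script is not described in the text.

```python
from typing import Dict

# First two bases of the eight fourfold-degenerate codon families in the
# standard genetic code: Leu (CT), Val (GT), Ser (TC), Pro (CC),
# Thr (AC), Ala (GC), Arg (CG), Gly (GG).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_sites(alignment: Dict[str, str]) -> Dict[str, str]:
    """Extract third positions of fourfold-degenerate codons.

    alignment maps a sequence name to an in-frame, gap-padded coding
    sequence; all sequences must have the same length (a multiple of 3).
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if length % 3 or any(len(seq) != length for seq in alignment.values()):
        raise ValueError("sequences must be aligned and in frame")

    extracted = {name: [] for name in names}
    for start in range(0, length, 3):
        codons = [alignment[name][start:start + 3].upper() for name in names]
        prefixes = {codon[:2] for codon in codons}
        third_ok = all(codon[2] in "ACGT" for codon in codons)
        # keep the column only if every sequence carries the same
        # fourfold-degenerate prefix and an unambiguous third base
        if len(prefixes) == 1 and prefixes <= FOURFOLD_PREFIXES and third_ok:
            for name, codon in zip(names, codons):
                extracted[name].append(codon[2])
    return {name: "".join(bases) for name, bases in extracted.items()}
```

The concatenated output strings form the kind of neutral-site alignment that a dating program would take as input.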
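The least-squares fit mentioned above can be written out explicitly. The sketch below is a generic ordinary least-squares regression in plain Python; the function name is ours, and which variable the authors placed on which axis is not stated, so treating the Ks peak as x and the divergence time as y is an assumption.

```python
from typing import Sequence, Tuple


def least_squares(xs: Sequence[float],
                  ys: Sequence[float]) -> Tuple[float, float, float]:
    """Fit y = slope * x + intercept by ordinary least squares.

    Returns (slope, intercept, r_squared). Here xs would hold Ks peak
    values and ys the corresponding divergence times (assumed roles).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared
```

Once fitted, the line converts a Ks peak into an approximate divergence time via `slope * ks_peak + intercept`, with `r_squared` indicating how well the published times track the Ks peaks.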
Phylogenetic analysis of six C. gibelio individuals
BWA (Version 0.7.12-r1039) (ref. 91) was used to map the Illumina reads of the ten C. auratus, six C. gibelio and one Cyprinus carpio individuals (Supplementary Tables 16 and 17) to the C. auratus genome, and the alignments were sorted with SAMtools (Version 1.4) (ref. 92) to obtain the bam files. The SNPs were called with FreeBayes (v0.9.10-3-g47a713e)93 with parameters ‘--gvcf --min-coverage 5 --limit-coverage 200’. Subsequently, PLINK v1.90b6.6 (ref. 107) was used to conduct principal component analysis. Moreover, the 4d sites were extracted on the basis of the ‘GFF’ file of the C. auratus genome and the obtained SNPs. The evolutionary relationships of all resequenced individuals were then reconstructed with RAxML-8.2.12 (ref. 108) under the settings ‘-m GTRGAMMA -x 12345 -N 100 -p 12345’. The divergence times between individuals were estimated with MCMCTree100 along the newly obtained evolutionary tree. The time calibration points were taken from the previously estimated divergence times for Cyprinus carpio–C. auratus (9.216–11.11 Mya) and C. auratus–C. gibelio (0.86–1.051 Mya) (Fig. 2).
Lineage-specific gene expansion in C. gibelio
The Illumina reads of the two C. auratus, six C. gibelio and one Cyprinus carpio individuals (Supplementary Table 16) were mapped to the C. auratus genome using BWA (Version 0.7.12-r1039)91. We first identified, across the whole genome, the homologous sites at which the minimum read depth among all C. gibelio individuals was more than twice the maximum read depth among the individuals of the other species. Then, genes in which more than 60% of the coding sequence consisted of such sites were selected as potentially expanded in C. gibelio. For each such gene, we examined its copy number in the genome assemblies of C. gibelio, C. auratus and Cyprinus carpio by combining the given gene annotation file with manual annotation using GeneWise80 with default settings.
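The depth-based screen can be sketched as follows. This is an illustration under assumptions: the function names and dictionary-based inputs are hypothetical, and raw depths are compared directly, whereas in practice depths would probably need normalizing by each sample's sequencing coverage, a step the text does not describe.

```python
from typing import Dict, List, Sequence


def is_expanded_site(cg_depths: Sequence[float],
                     other_depths: Sequence[float]) -> bool:
    """True if the lowest C. gibelio depth exceeds twice the highest
    depth observed in the other species at this site."""
    return min(cg_depths) > 2 * max(other_depths)


def potentially_expanded_genes(
    cds_sites: Dict[str, List[int]],
    depth_cg: Dict[int, Sequence[float]],
    depth_other: Dict[int, Sequence[float]],
    min_fraction: float = 0.6,
) -> List[str]:
    """Select genes whose coding sequence has more than min_fraction
    of sites passing the depth criterion.

    cds_sites maps a gene to its coding positions; depth_cg and
    depth_other map a position to per-individual read depths.
    """
    selected = []
    for gene, sites in cds_sites.items():
        if not sites:
            continue
        hits = sum(
            is_expanded_site(depth_cg[pos], depth_other[pos]) for pos in sites
        )
        if hits / len(sites) > min_fraction:
            selected.append(gene)
    return selected
```

Genes returned by this screen are only candidates; as described above, their copy numbers were then checked in the assemblies themselves.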
LOH analysis
For LOH analysis, one female individual of the G4 generation of clone F (ref. 66) was selected to establish a C. gibelio clonal line by reproduction through four successive generations of gynogenesis. We sequenced 11 individuals (~48× depth for each sample) from the offspring of the gynogenetic line and called the SNPs of each individual using the method described in ‘Resequencing-based ploidy analysis’. After multi-step filtering, we obtained 64,246 LOH sites, of which 101 were randomly selected for PCR validation. The contiguous tracts of LOH sites were also extracted and classified into two types: those caused by gene deletion and those caused by gene conversion. Finally, the rates of LOH, gene deletion and gene conversion were calculated. The details of the above processes are documented in Supplementary Note 4.
Base-substitution mutation analysis
On the basis of the SNPs obtained in the ‘LOH analysis’ step, we analysed each line for base-substitution mutations and calculated the mutation rate. We analysed mutation sites using the following criteria: (1) Non-triploid chromosomes were filtered out for each line separately. (2) The minimum and maximum average coverage were 20× and 80×, respectively. (3) Sites directly adjacent to small insertion–deletion mutations were filtered out to avoid false-positive inferences created by misalignment. (4) For each SNP site of a line, a minor-allele coverage depth ≥6× was taken to indicate a heterozygous site in that line, and ≤2× a homozygous site. (5) Ambiguous SNPs with a minor-allele coverage depth >2× and <6× were filtered out. Mutation sites were called only when they arose at highly credible ancestrally homozygous sites and generated an unambiguous heterozygous genotype in only one line. We calculated the mutation rate from the mutation sites of G4-4, G4-7, G4-8 and G4-9 using the equation μbs = m/(3nT) (ref. 109), where μbs is the base-substitution rate per site per generation, m is the observed number of base substitutions, 3n is the total number of analysed sites and T is the number of generations. The resulting base-substitution mutation rate of C. gibelio was 8.88 × 10^−9 per site per generation, slightly higher than that of C. auratus.
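The genotype thresholds in criteria (4) and (5) and the rate equation μbs = m/(3nT) translate directly into code. The sketch below is illustrative only: the function names are ours, and n is interpreted as the number of analysed positions, so that 3n counts sites across the three chromosome copies, which is our reading of the definition given here.

```python
def classify_site(minor_allele_depth: int) -> str:
    """Genotype call from minor-allele depth, following criteria (4)-(5):
    >=6x heterozygous, <=2x homozygous, anything in between ambiguous."""
    if minor_allele_depth >= 6:
        return "heterozygous"
    if minor_allele_depth <= 2:
        return "homozygous"
    return "ambiguous"


def base_substitution_rate(m: int, n: int, generations: int) -> float:
    """mu_bs = m / (3 * n * T): substitutions per site per generation.

    m is the observed number of base substitutions, n the number of
    analysed positions (3n sites in a triploid set, by assumption) and
    generations is T.
    """
    if n <= 0 or generations <= 0:
        raise ValueError("n and generations must be positive")
    return m / (3 * n * generations)
```

As a purely hypothetical example, 12 substitutions over 1,000,000 analysed positions and 4 generations would give `base_substitution_rate(12, 1_000_000, 4)`, that is 12 / 12,000,000 per site per generation; these numbers are invented for illustration and are not the study's data.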
Antibody preparation, chromosome spreading and immunofluorescence
A fragment of C. gibelio Sycp3 (amino acids 5–150) was cloned to produce a His-tag fusion protein. A peptide of C. gibelio Sycp1 (amino acids 848–864) was synthesized and coupled to KLH protein. Polyclonal antibodies were raised in rabbits (ABclonal Biotechnology). Oocyte chromosome spreads were performed as described previously110 with minor modifications. In brief, four to six ovaries (80–120 dpf) were dissociated by 15–20 passes through a 20 ml syringe and pipetted up and down for 2 min in DMEM. After filtering through a 120-mesh cell strainer, cells were washed with PBS and suspended in 80–120 μl 0.1 M sucrose (pH ~8). Then, 20–25 μl of cell suspension was dropped vertically onto the centre of slides that had been covered with 100 μl 1% paraformaldehyde. After drying, the slides were rinsed in H2O and in 1:250 Photo-Flo 200 and were then ready for immunofluorescence.
The chromosome spread slides were subjected to antigen retrieval in boiling citrate–EDTA buffer for 20 min, permeabilized with 0.1% Tween 20 and 0.1% Triton X-100 in PBS for 10 min, and blocked for 10 min with 10% ADB (10% goat serum, 3% BSA and 0.05% Triton X-100 in PBS) at room temperature. Then, the slides were incubated overnight at 4 °C with primary antibodies (anti-Sycp3 (1:150); anti-Sycp1 (1:100); anti-hRad51 (1:50; Abcam)). After washing with PBS three times, the slides were incubated for 1 h in the dark at 37 °C with secondary antibodies and counterstain (1:500 Alexa Fluor 546 goat anti-rabbit, Invitrogen; 1:500 Alexa Fluor 488 goat anti-mouse, Invitrogen; and 5 μg ml^−1 DAPI, Sigma). After incubation, the slides were washed for 10 min each in PBS containing 0.04% Photo-Flo 200 and 0.03% Triton X-100. Finally, the samples were mounted with VECTASHIELD Antifade Mounting Medium (Vector Labs) and photographed using a Leica SP8 STED microscope (Analytical & Testing Center, IHB, CAS).
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
We thank J. Luo for providing the genome of goldfish; I. Seim and G. Zhang for helpful discussion; and M. Eckstut (Edanz, www.liwenbianji.cn) for assistance in editing this manuscript. The research was supported by the Analytical & Testing Center and the Supercomputing Centre, CAS, China. This work was supported by the Strategic Priority Research Program of the CAS (XDA024030104, XDB31000000), the Key Program of Frontier Sciences of the CAS (QYZDY-SSW-SMC025), the National Key Research and Development Program of China (2018YFD0900204, 2021YFD1200804), the Earmarked Fund for Modern Agro-industry Technology Research System (NYCYTX-49), the National Natural Science Foundation of China (31772839) and the Autonomous Project of the State Key Laboratory of Freshwater Ecology and Biotechnology (2019FBZ04).
Author contributions
J.-F.G. and L.Z. designed the study. W.W., Y.W., X.-D.F. and L.Z. supervised the study. B.-T.Z., B.W., M.X., Y.-L.Y. and Y.W. performed genomic assemblies, validation and karyotype analysis. B.W., Y.-L.Y., Y.W. and X.-Y.L. assembled and annotated RNA-seq data. W.-J.X., B.W., M.X., Y.W., J.-B.J., C.Z., W.-M.H., J.-G.T. and B.-D.F. performed the comparative analysis of the two genome assemblies. B.-T.Z., M.X., Y.-L.Y., Y.W., W.Y., Z.-C.Z. and Q.-Qia Z. performed gene annotation and TE analysis. W.-J.X., B.W., Y.W., Q.G., J.-Y.X. and M.-Z.B. performed genome evolution and gene family analysis. W.-J.X., C.-L.Z., M.-L.H., J.-M.Z. and C.-G.F. performed comparative genomic and transcriptome analysis. X.-Y.L., Z.-W.W., Z.L., X.-J.Z., W.-J.L., R.-H.G., J.-F.T. and Q.-Qin Z. provided samples for sequencing and analysis. Y.C., Y.W., L.-J.M. and Z.L. performed chromosome spread and DNA content assays. X.-Y.L., L.-T.T., Z.-W.W., Z.L., X.-J.Z., L.-J.M., Y.C., P.Y., M.L., F.P. and M.D. performed other related experimental work. Y.W., K.W., X.-Y.L., X.-D.F., L.Z. and J.-F.G. wrote the manuscript with input from all other authors. J.-F.G., W.W., Z.Y., Y.-P.Z., Q.Q. and H.-M.Y. revised the manuscript.
Peer review
Peer review information
Nature Ecology & Evolution thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Data availability
The whole-genome assembly and the raw resequencing data of C. gibelio have been deposited in GenBank under BioProject ID PRJNA546443. The whole-genome assembly and the raw resequencing data of C. auratus have been deposited in GenBank under BioProject ID PRJNA546444. The transcriptome data of C. gibelio and C. auratus are available in GenBank (PRJNA836313, PRJNA834570, PRJNA833164, PRJNA837728, PRJNA833750 and PRJNA833167). The gene alignments and trees of specific lost and expanded genes are available in the figshare database (10.6084/m9.figshare.19674843.v1). Source data are provided with this paper.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Yang Wang, Xi-Yin Li, Wen-Jie Xu, Kun Wang, Bin Wu, Meng Xu, Yan Chen.
Contributor Information
Xiao-Dong Fang, Email: fangxd@bgi.com.
Wen Wang, Email: wenwang@nwpu.edu.cn.
Li Zhou, Email: zhouli@ihb.ac.cn.
Jian-Fang Gui, Email: jfgui@ihb.ac.cn.
Extended data
Extended data are available for this paper at 10.1038/s41559-022-01813-z.
Supplementary information
The online version contains supplementary material available at 10.1038/s41559-022-01813-z.
References
- 1.Liu XL, et al. Numerous mtDNA haplotypes reveal multiple independent polyploidy origins of hexaploids in Carassius species complex. Ecol. Evol. 2017;7:10604–10615. doi: 10.1002/ece3.3462. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Zhou L, Gui J. Natural and artificial polyploids in aquaculture. Aquacult. Fish. 2017;2:103–111. [Google Scholar]
- 3.Luo J, et al. Tempo and mode of recurrent polyploidization in the Carassius auratus species complex (Cypriniformes, Cyprinidae) Heredity. 2014;112:415–427. doi: 10.1038/hdy.2013.121. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Li XY, et al. Evolutionary history of two divergent Dmrt1 genes reveals two rounds of polyploidy origins in gibel carp. Mol. Phylogenet. Evol. 2014;78:96–104. doi: 10.1016/j.ympev.2014.05.005. [DOI] [PubMed] [Google Scholar]
- 5.Li JT, et al. Parallel subgenome structure and divergent expression evolution of allo-tetraploid common carp and goldfish. Nat. Genet. 2021;53:1493–1503. doi: 10.1038/s41588-021-00933-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Yu P, et al. Upregulation of the PPAR signaling pathway and accumulation of lipids are related to the morphological and structural transformation of the dragon-eye goldfish eye. Sci. China Life Sci. 2021;64:1031–1049. doi: 10.1007/s11427-020-1814-1. [DOI] [PubMed] [Google Scholar]
- 7.Gui JF, Zhou L. Genetic basis and breeding application of clonal diversity and dual reproduction modes in polyploid Carassius auratus gibelio. Sci. China Life Sci. 2010;53:409–415. doi: 10.1007/s11427-010-0092-6. [DOI] [PubMed] [Google Scholar]
- 8.Gui JF, Zhou L, Li XY. Rethinking fish biology and biotechnologies in the challenge era for burgeoning genome resources and strengthening food security. Water Biol. Secur. 2022;1:100002. doi: 10.1016/j.watbs.2021.11.001. [DOI] [Google Scholar]
- 9.Lu M, et al. Regain of sex determination system and sexual reproduction ability in a synthetic octoploid male fish. Sci. China Life Sci. 2021;64:77–87. doi: 10.1007/s11427-020-1694-7. [DOI] [PubMed] [Google Scholar]
- 10.Comai L. The advantages and disadvantages of being polyploid. Nat. Rev. Genet. 2005;6:836–846. doi: 10.1038/nrg1711. [DOI] [PubMed] [Google Scholar]
- 11.Butlin R. The costs and benefits of sex: new insights from old asexual lineages. Nat. Rev. Genet. 2002;3:311–317. doi: 10.1038/nrg749. [DOI] [PubMed] [Google Scholar]
- 12.Avise JC. Evolutionary perspectives on clonal reproduction in vertebrate animals. Proc. Natl Acad. Sci. USA. 2015;112:8867–8873. doi: 10.1073/pnas.1501820112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Birky CW. Heterozygosity, heteromorphy, and phylogenetic trees in asexual eukaryotes. Genetics. 1996;144:427–437. doi: 10.1093/genetics/144.1.427. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Birky CW., Jr. Bdelloid rotifers revisited. Proc. Natl Acad. Sci. USA. 2004;101:2651–2652. doi: 10.1073/pnas.0308453101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Mark Welch DB, Mark Welch JL, Meselson M. Evidence for degenerate tetraploidy in bdelloid rotifers. Proc. Natl Acad. Sci. USA. 2008;105:5145–5149. doi: 10.1073/pnas.0800972105. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Brunes TO, da Silva AJ, Marques-Souza S, Rodrigues MT, Pellegrino KCM. Not always young: the first vertebrate ancient origin of true parthenogenesis found in an Amazon leaf litter lizard with evidence of mitochondrial haplotypes surfing on the wave of a range expansion. Mol. Phylogenet. Evol. 2019;135:105–122. doi: 10.1016/j.ympev.2019.01.023. [DOI] [PubMed] [Google Scholar]
- 17.Arai K. Genetics of the loach, Misgurnus anguillicaudatus: recent progress and perspective. Folia Biol. 2003;51:107–117. [PubMed] [Google Scholar]
- 18.Lamatsch DK, Nanda I, Epplen JT, Schmid M, Schartl M. Unusual triploid males in a microchromosome-carrying clone of the Amazon molly, Poecilia formosa. Cytogenet. Cell Genet. 2000;91:148–156. doi: 10.1159/000056836. [DOI] [PubMed] [Google Scholar]
- 19.Liu XL, et al. Wider geographic distribution and higher diversity of hexaploids than tetraploids in Carassius species complex reveal recurrent polyploidy effects on adaptive evolution. Sci. Rep. 2017;7:5395. doi: 10.1038/s41598-017-05731-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Schlupp I. The evolutionary ecology of gynogenesis. Annu. Rev. Ecol. Evol. Syst. 2005;36:399–417. doi: 10.1146/annurev.ecolsys.36.102003.152629. [DOI] [Google Scholar]
- 21.Lampert KP, Schartl M. The origin and evolution of a unisexual hybrid: Poecilia formosa. Philos. T. R. Soc. B. 2008;363:2901–2909. doi: 10.1098/rstb.2008.0040. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Jakovlic I, Gui JF. Recent invasion and low level of divergence between diploid and triploid forms of Carassius auratus complex in Croatia. Genetica. 2011;139:789–804. doi: 10.1007/s10709-011-9584-y. [DOI] [PubMed] [Google Scholar]
- 23.Jiang FF, et al. High male incidence and evolutionary implications of triploid form in northeast Asia Carassius auratus complex. Mol. Phylogenet. Evol. 2013;66:350–359. doi: 10.1016/j.ympev.2012.10.006. [DOI] [PubMed] [Google Scholar]
- 24.Ranallo-Benavidez TR, Jaron KS, Schatz MC. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat. Commun. 2020;11:1432. doi: 10.1038/s41467-020-14998-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Lawrence RJ, Pikaard CS. Transgene-induced RNA interference: a strategy for overcoming gene redundancy in polyploids to generate loss-of-function mutations. Plant J. 2003;36:114–121. doi: 10.1046/j.1365-313X.2003.01857.x. [DOI] [PubMed] [Google Scholar]
- 26.Ye Y, Zhou J, Wang Z, Zhang J, Wei W. Comparative studies on the DNA content from three strains of crucian carp (Carassius auratus) Acta Hydrobiol. Sin. 2004;28:13–16. [Google Scholar]
- 27.Wei WH, Zhang J, Zhang YB, Zhou L, Gui JF. Genetic heterogeneity and ploidy level analysis among different gynogenetic clones of the polyploid gibel carp. Cytom. A. 2003;56A:46–52. doi: 10.1002/cyto.a.10077. [DOI] [PubMed] [Google Scholar]
- 28.Mou CY, et al. Divergent antiviral mechanisms of two viperin homeologs in a recurrent polyploid fish. Front. Immunol. 2021;12:702971. doi: 10.3389/fimmu.2021.702971. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Gan RH, et al. Functional divergence of multiple duplicated foxl2 homeologs and alleles in a recurrent polyploid fish. Mol. Biol. Evol. 2021;38:1995–2013. doi: 10.1093/molbev/msab002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Luo J, et al. From asymmetrical to balanced genomic diversification during rediploidization: subgenomic evolution in allotetraploid fish. Sci. Adv. 2020;6:eaaz7677. doi: 10.1126/sciadv.aaz7677. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Chen Z, et al. De novo assembly of the goldfish (Carassius auratus) genome and the evolution of genes after whole-genome duplication. Sci. Adv.s. 2019;5:eaav0547. doi: 10.1126/sciadv.aav0547. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Wang X, Li J, He S. Molecular evidence for the monophyly of East Asian groups of Cyprinidae (Teleostei: Cypriniformes) derived from the nuclear recombination activating gene 2 sequences. Mol. Phylogenet. Evol. 2007;42:157–170. doi: 10.1016/j.ympev.2006.06.014. [DOI] [PubMed] [Google Scholar]
- 33.Xu P, et al. The allotetraploid origin and asymmetrical genome evolution of the common carp Cyprinus carpio. Nat. Commun. 2019;10:4625. doi: 10.1038/s41467-019-12644-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Kon T, et al. The genetic basis of morphological diversity in domesticated goldfish. Curr. Biol. 2020;30:1–15. doi: 10.1016/j.cub.2020.04.034. [DOI] [PubMed] [Google Scholar]
- 35.Chen D, et al. The evolutionary origin and domestication history of goldfish (Carassius auratus) Proc. Natl Acad. Sci. USA. 2020;117:29775–29785. doi: 10.1073/pnas.2005545117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Loidl J. Meiotic chromosome pairing in triploid and tetraploid Saccharomyces cerevisiae. Genetics. 1995;139:1511–1520. doi: 10.1093/genetics/139.4.1511. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Weber CM, Henikoff S. Histone variants: dynamic punctuation in transcription. Genes Dev. 2014;28:672–682. doi: 10.1101/gad.238873.114. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Wu N, Yue HM, Chen B, Gui JF. Histone H2A has a novel variant in fish oocytes. Biol. Reprod. 2009;81:275–283. doi: 10.1095/biolreprod.108.074955. [DOI] [PubMed] [Google Scholar]
- 39.Xu S, Omilian AR, Cristescu ME. High rate of large-scale hemizygous deletions in asexually propagating Daphnia: implications for the evolution of sex. Mol. Biol. Evol. 2010;28:335–342. doi: 10.1093/molbev/msq199. [DOI] [PubMed] [Google Scholar]
- 40.Warren WC, et al. Clonal polymorphism and high heterozygosity in the celibate genome of the Amazon molly. Nat. Ecol. Evol. 2018;2:669–679. doi: 10.1038/s41559-018-0473-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Halldorsson BV, et al. The rate of meiotic gene conversion varies by sex and age. Nat. Genet. 2016;48:1377–1384. doi: 10.1038/ng.3669. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Williams AL, et al. Non-crossover gene conversions show strong GC bias and unexpected clustering in humans. eLife. 2015;4:e04637. doi: 10.7554/eLife.04637. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Flot JF, et al. Genomic evidence for ameiotic evolution in the bdelloid rotifer Adineta vaga. Nature. 2013;500:453–457. doi: 10.1038/nature12326. [DOI] [PubMed] [Google Scholar]
- 44.Page SL, Hawley RS. The genetics and molecular biology of the synaptonemal complex. Annu. Rev. Cell Dev. Biol. 2004;20:525–558. doi: 10.1146/annurev.cellbio.19.111301.155141. [DOI] [PubMed] [Google Scholar]
- 45.Inano S, et al. RFWD3-mediated ubiquitination promotes timely removal of both RPA and RAD51 from DNA damage sites to facilitate homologous recombination. Mol. Cell. 2017;66:622–634. doi: 10.1016/j.molcel.2017.04.022. [DOI] [PubMed] [Google Scholar]
- 46.Sanchez A, Reginato G, Cejka P. Crossover or non-crossover outcomes: tailored processing of homologous recombination intermediates. Curr. Opin. Genet. Dev. 2021;71:39–47. doi: 10.1016/j.gde.2021.06.012. [DOI] [PubMed] [Google Scholar]
- 47.Session AM, et al. Genome evolution in the allotetraploid frog Xenopus laevis. Nature. 2016;538:336–343. doi: 10.1038/nature19840. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Appels R, et al. Shifting the limits in wheat research and breeding using a fully annotated reference genome. Science. 2018;361:eaar7191. doi: 10.1126/science.aar7191. [DOI] [PubMed] [Google Scholar]
- 49.Edger PP, et al. Origin and evolution of the octoploid strawberry genome. Nat. Genet. 2019;51:541–547. doi: 10.1038/s41588-019-0356-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Hojsgaard D, Schartl M. Skipping sex: a nonrecombinant genomic assemblage of complementary reproductive modules. BioEssays. 2021;43:2000111. doi: 10.1002/bies.202000111. [DOI] [PubMed] [Google Scholar]
- 51.Omilian AR, Cristescu ME, Dudycha JL, Lynch M. Ameiotic recombination in asexual lineages of Daphnia. Proc. Natl Acad. Sci. USA. 2006;103:18638–18643. doi: 10.1073/pnas.0606435103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Hartfield M. Evolutionary genetic consequences of facultative sex and outcrossing. J Evol. Biol. 2016;29:5–22. doi: 10.1111/jeb.12770. [DOI] [PubMed] [Google Scholar]
- 53.Schaefer I, et al. No evidence for the ‘Meselson effect’ in parthenogenetic oribatid mites (Oribatida, Acari) J. Evol. Biol. 2006;19:184–193. doi: 10.1111/j.1420-9101.2005.00975.x. [DOI] [PubMed] [Google Scholar]
- 54.Schön I, Martens K. No slave to sex. Proc. Biol. Sci. 2003;270:827–833. doi: 10.1098/rspb.2002.2314. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Li XY, et al. Origin and transition of sex determination mechanisms in a gynogenetic hexaploid fish. Heredity. 2018;121:64–74. doi: 10.1038/s41437-017-0049-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Li XY, et al. Extra microchromosomes play male determination role in polyploid gibel carp. Genetics. 2016;203:1415–1424. doi: 10.1534/genetics.115.185843. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Ding M, et al. Genomic anatomy of male-specific microchromosomes in a gynogenetic fish. PLoS Genet. 2021;17:e1009760. doi: 10.1371/journal.pgen.1009760. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Zhao X, et al. Genotypic males play an important role in the creation of genetic diversity in gynogenetic gibel carp. Front. Genet. 2021;12:691923. doi: 10.3389/fgene.2021.691923. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Zhu YJ, et al. Distinct sperm nucleus behaviors between genotypic and temperature-dependent sex determination males are associated with replication and expression-related pathways in a gynogenetic fish. BMC Genomics. 2018;19:437. doi: 10.1186/s12864-018-4823-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Hojsgaard D. Transient activation of apomixis in sexual neotriploids may retain genomically altered states and enhance polyploid establishment. Front. Plant Sci. 2018;9:00230. doi: 10.3389/fpls.2018.00230. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Wen M, et al. Sex chromosome and sex locus characterization in goldfish, Carassius auratus (Linnaeus, 1758) BMC Genomics. 2020;21:552. doi: 10.1186/s12864-020-06959-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Li, X. Y., Mei, J., Ge, C. T., Liu, X. L. & Gui, J. F. Sex determination mechanisms and sex control approaches in aquaculture animals. Sci. China Life Sci. 10.1007/s11427-021-2075-x (2022). [DOI] [PubMed]
- 63.Jiang YG, et al. Biological effect of heterologous sperm on gynogenetic offspring in carassius auratus gibelio. Acta Hydrobiol. Sin. 1983;8:1–13. [Google Scholar]
- 64.Zhu LF, Jiang YG. A comparative study of the biological characters of gynogenetic clones of silver crucian carp (Carassius auratus gibelio) Acta Hydrobiol. Sin. 1993;17:112–120. [Google Scholar]
- 65.Wang ZW, et al. A novel nucleo-cytoplasmic hybrid clone formed via androgenesis in polyploid gibel carp. BMC Res. Notes. 2011;4:82. doi: 10.1186/1756-0500-4-82. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Chen F, et al. Stable genome incorporation of sperm-derived DNA fragments in gynogenetic clone of gibel carp. Mar. Biotechnol. 2020;22:54–66. doi: 10.1007/s10126-019-09930-w. [DOI] [PubMed] [Google Scholar]
- 67.Li Z, et al. Comparative analysis of intermuscular bones between clone A+ and clone F strains of allogynogenetic gibel carp. Acta Hydrobiol. Sin. 2017;41:860–869. [Google Scholar]
- 68.Li Z, Liang HW, Wang ZW, Zou GW, Gui JF. A novel allotetraploid gibel carp strain with maternal body type and growth superiority. Aquaculture. 2016;458:55–63. doi: 10.1016/j.aquaculture.2016.02.030.
- 69.Shao GM, et al. Whole genome incorporation and epigenetic stability in a newly synthetic allopolyploid of gynogenetic gibel carp. Genome Biol. Evol. 2018;10:2394–2407. doi: 10.1093/gbe/evy165.
- 70.Zhou, L., et al. Aquaculture in China: Success stories and modern trends. Ch. 2.4, 149–157 (Oxford: John Wiley & Sons Ltd., 2018).
- 71.Gui JF, Zhu ZY. Molecular basis and genetic improvement of economically important traits in aquaculture animals. Chin. Sci. Bull. 2012;57:1751–1760. doi: 10.1007/s11434-012-5213-0.
- 72.Peng JX, Xie JL, Zhou L, Hong YH, Gui JF. Evolutionary conservation of Dazl genomic organization and its continuous and dynamic distribution throughout germline development in gynogenetic gibel carp. J. Exp. Zool. B. 2009;312B:855–871. doi: 10.1002/jez.b.21301.
- 73.Durand NC, et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 2016;3:95–98. doi: 10.1016/j.cels.2016.07.002.
- 74.Durand NC, et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 2016;3:99–101. doi: 10.1016/j.cels.2015.07.012.
- 75.Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27:573–580. doi: 10.1093/nar/27.2.573.
- 76.Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinform. Chapter 4, Unit 4.10 (2009).
- 77.Saha S, Bridges S, Magbanua ZV, Peterson DG. Empirical comparison of ab initio repeat finding programs. Nucleic Acids Res. 2008;36:2284–2294. doi: 10.1093/nar/gkn064.
- 78.Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics. 2008;24:637–644. doi: 10.1093/bioinformatics/btn013.
- 79.Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J. Mol. Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
- 80.Birney E, Clamp M, Durbin R. GeneWise and genomewise. Genome Res. 2004;14:988–995. doi: 10.1101/gr.1865504.
- 81.Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods. 2015;12:357–360. doi: 10.1038/nmeth.3317.
- 82.Pertea M, et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 2015;33:290–295. doi: 10.1038/nbt.3122.
- 83.Haas BJ, et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the program to assemble spliced alignments. Genome Biol. 2008;9:R7. doi: 10.1186/gb-2008-9-1-r7.
- 84.Bairoch A, Apweiler R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 2000;28:45–48. doi: 10.1093/nar/28.1.45.
- 85.Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27–30. doi: 10.1093/nar/28.1.27.
- 86.Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. Science. 1997;278:631–637. doi: 10.1126/science.278.5338.631.
- 87.Mulder N, Apweiler R. InterPro and InterProScan: tools for protein sequence classification and comparison. Methods Mol. Biol. 2007;396:59–70. doi: 10.1007/978-1-59745-515-2_5.
- 88.Simao FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31:3210–3212. doi: 10.1093/bioinformatics/btv351.
- 89.Tang HB, et al. Synteny and collinearity in plant genomes. Science. 2008;320:486–488. doi: 10.1126/science.1153917.
- 90.Krzywinski M, et al. Circos: an information aesthetic for comparative genomics. Genome Res. 2009;19:1639–1645. doi: 10.1101/gr.092759.109.
- 91.Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv 1303.3997 (2013).
- 92.Li H, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
- 93.Garrison, E. & Marth, G. Haplotype-based variant detection from short-read sequencing. arXiv 1207.3907 (2012).
- 94.Cingolani P, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w(1118); iso-2; iso-3. Fly. 2012;6:80–92. doi: 10.4161/fly.19695.
- 95.Li R, et al. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 2010;20:265–272. doi: 10.1101/gr.097261.109.
- 96.Buchfink B, Reuter K, Drost HG. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat. Methods. 2021;18:366–368. doi: 10.1038/s41592-021-01101-x.
- 97.Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–1797. doi: 10.1093/nar/gkh340.
- 98.Suyama M, Torrents D, Bork P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 2006;34:W609–W612. doi: 10.1093/nar/gkl315.
- 99.Talavera G, Castresana J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst. Biol. 2007;56:564–577. doi: 10.1080/10635150701472164.
- 100.Yang ZH. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 2007;24:1586–1591. doi: 10.1093/molbev/msm088.
- 101.Zhang C, Rabiee M, Sayyari E, Mirarab S. ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinformatics. 2018;19:153. doi: 10.1186/s12859-018-2129-y.
- 102.Cavender, T. M. in Cyprinid Fishes: Systematics, Biology and Exploitation (eds Ian J. Winfield & Joseph S. Nelson) 34–54 (Springer, 1991).
- 103.Sytchevskaya, E. K. Palaeogene freshwater fish fauna of the USSR and Mongolia. Transactions of the Joint Soviet-Mongolian Paleontological Expedition 29, 1–157.
- 104.Tao W, Yang L, Mayden RL, He S. Phylogenetic relationships of Cypriniformes and plasticity of pharyngeal teeth in the adaptive radiation of cyprinids. Sci. China Life Sci. 2019;62:553–565. doi: 10.1007/s11427-019-9480-3.
- 105.Patterson, C. in The Fossil Record 2. (ed. M. J. Benton) 621–656 (Chapman & Hall, 1993).
- 106.Wang DP, Wan HL, Zhang S, Yu J. gamma-MYN: a new algorithm for estimating Ka and Ks with consideration of variable substitution rates. Biol. Direct. 2009;4:20. doi: 10.1186/1745-6150-4-20.
- 107.Purcell S, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007;81:559–575. doi: 10.1086/519795.
- 108.Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–1313. doi: 10.1093/bioinformatics/btu033.
- 109.Lynch M, et al. A genome-wide view of the spectrum of spontaneous mutations in yeast. Proc. Natl Acad. Sci. USA. 2008;105:9272–9277. doi: 10.1073/pnas.0803466105.
- 110.Blokhina YP, Nguyen AD, Draper BW, Burgess SM. The telomere bouquet is a hub where meiotic double-strand breaks, synapsis, and stable homolog juxtaposition are coordinated in the zebrafish, Danio rerio. PLoS Genet. 2019;15:e1007730. doi: 10.1371/journal.pgen.1007730.
Data Availability Statement
The whole-genome assembly and raw resequencing data of C. gibelio have been deposited in GenBank under BioProject ID PRJNA546443. The whole-genome assembly and raw resequencing data of C. auratus have been deposited in GenBank under BioProject ID PRJNA546444. The transcriptome data of C. gibelio and C. auratus are available in GenBank (PRJNA836313, PRJNA834570, PRJNA833164, PRJNA837728, PRJNA833750, and PRJNA833167). The gene alignments and trees of specific lost and expanded genes are available in the figshare database (10.6084/m9.figshare.19674843.v1). Source data are provided with this paper.