Proceedings of the National Academy of Sciences of the United States of America
2025 Jun 12;122(24):e2421883122. doi: 10.1073/pnas.2421883122

Biparental inheritance of germline-specific chromosomes in the sea lamprey and their roles in oocytes

Vladimir A Timoshevskiy a, Nataliya Timoshevskaya a, Kaan I Eşkut a, Kasturi Rajandran a, Jeramiah J Smith a,b,1
PMCID: PMC12184396  PMID: 40504158

Significance

Several species possess germline-restricted chromosomes (GRCs) that are present in their germ cells but eliminated from all other cells during normal embryonic development. Previous analyses of lamprey GRCs have shown that these chromosomes have evolved to carry numerous genes that contribute to the development, maturation, and maintenance of germline, though recent studies have questioned whether females possess and eliminate GRCs. To better define the roles of GRCs in both sexes and resolve the activity of GRCs in females and their maternal transmission, we used RNA sequencing, DNA hybridization of oocytes, and female somatic genome sequencing. These analyses show that GRCs are present in oocytes, are transcriptionally active, and carry genes that perform distinct and overlapping functions in both sexes.

Keywords: lamprey, DNA elimination, germline, oocyte, chromosome

Abstract

Many eukaryotic species undergo programmed elimination of specific chromosomes during embryogenesis, typically retaining these chromosomes only in their germ cells. In some species, programmatic elimination of GRCs, or sex chromosomes, also occurs in a sex-specific manner, with specific chromosomes being transmitted or eliminated by only one sex. As such, these chromosomes provide a unique perspective on the evolution of gene functions that are advantageous to the germline and genetic tradeoffs between somatic vs germline or oocyte vs sperm biology. While GRCs have been extensively characterized in male sea lampreys (Petromyzon marinus), the status of GRCs in females has not yet been resolved, though it has been hypothesized that male-specific expression/transmission of these chromosomes might provide a solution to resolving the long-standing mystery of lamprey sex determining mechanisms. To gain insight into the roles of GRCs in female lampreys, we performed several karyological, transcriptomic, and genomic analyses, which demonstrate that GRCs are present in the female lamprey germline, transmitted by oocytes and somatically eliminated in both sexes. These analyses also show that GRCs play important roles in the maintenance and development of female germline but provide no evidence for sex-specific variation in the elimination and transmission of lamprey GRCs. These findings underscore the diversity of germline functions that are carried out by GRCs in both male and female lampreys and highlight the fact that sex-specific transmission/retention of GRCs likely follows no universal rules across the diverse lineages that have independently evolved to undergo developmentally programmed DNA elimination.


Programmed DNA loss is widely distributed across protist, plant, and animal lineages, and the number of taxa that are known to undergo elimination has increased dramatically over the last two decades: reviewed in refs. 1–4 and also ref. 5. Programmed elimination often targets entire chromosomes, known as germline-restricted chromosomes (GRCs). GRCs have arisen independently in several lineages, likely through more than one underlying mechanism (6–8). As many of these GRCs have resided exclusively within germ cells over the course of tens to hundreds of millions of years, they provide a unique perspective on the evolution of gene functions that are advantageous to the germline, as well as insights into fundamental genetic tradeoffs between somatic vs germline lineages that are often conserved between eliminating and noneliminating species (6, 9, 10).

In some species, programmatically eliminated chromosomes can be variably present during gametogenesis and may vary between sexes. For example, in the insect families Cecidomyiidae and Sciaridae, embryos undergo programmed elimination of GRCs from their soma and also somatically eliminate one or more of their paternal sex chromosomes: reviewed in refs. 3, 11. In both cecidomyiids and sciarids, sex determination is often based on whether or not paternally derived sex chromosomes are eliminated during early embryogenesis (12, 13). Moreover, in cecidomyiids, GRCs are generally only transmitted by females, being lost during male meiosis (14, 15). Similar to cecidomyiids, male songbirds do not typically transmit their GRC, which exists as a single unpaired chromosome that is lost during male gametogenesis (16–18). In songbirds, this meiotic loss occurs through a process that involves condensation of the heterochromatized GRC into an exclusion body during the first meiotic division (16–18), although rare cases of paternal transmission have been observed in zebra finch (19). Thus far, no study has directly observed the loss of chromosomes during songbird embryogenesis, although it is inferred that the GRC must undergo loss during early embryonic development as the chromosome is present only in the germ cells in adults (16, 20). In contrast to males, female zebra finch possess two copies of the GRC in their meiotic germ cells, which undergo synapsis during meiosis and are presumably transmitted in a haploid state (16). These observations suggest that the chromosome undergoes a secondary duplication, either during early embryogenesis or later in the development of female primordial germ cells (16, 21).

Among vertebrates, the embryonic details of programmed DNA loss are best known in sea lamprey (Petromyzon marinus), which is partly due to the accessibility of its embryos and the fact that eliminated chromosomes are retained during spermatogenesis, permitting the efficient isolation of large quantities of pure germline DNA (22–24). In lamprey, DNA loss occurs via the formation of lagging chromatin and micronuclei during the blastula phase of development (22, 24, 25). Expression of GRC-encoded genes has been reported from transcriptomic studies that have targeted testes and multiple stages of early lamprey embryogenesis (10, 26). In contrast, much less is known regarding the presence of GRCs in lamprey female germline or the potential roles of GRCs in oocytes. An analysis of gene expression in immature ovaries (which contain large numbers of somatic cells) found little evidence for the expression of genes that are encoded on known lamprey GRCs (27), which were defined on the basis of sequencing studies performed in males (9, 10). This same study (27) confirmed that GRC-encoded genes are expressed in testes and concluded that GRCs are largely silenced in females and in undifferentiated gonads, implying male-specific roles in gonadal maturation and sex determination. A study of genome-wide DNA methylation patterns also reported that GRC read coverage from oocyte libraries was comparable to that of somatic tissues, which was interpreted as supporting the idea that females may lack or meiotically eliminate GRCs, akin to other taxa that eliminate GRCs or sex chromosomes (28).

Taken together, these studies raise questions as to the extent to which GRCs are present in lamprey oocytes, whether they are potentially transmitted by females, and what role (if any) the GRCs might play in female germline or in the process of sex determination (4, 27). Notably, though, the lack of detection of GRCs thus far in female lampreys might also trace to technical challenges in differentiating signals from somatic vs. germ cells in ovaries. To further address these questions, we sought to more directly assay for the presence of previously defined GRCs in mature oocytes and female soma. Using a combination of microscopic imaging and sequence-based approaches, we were able to detect GRCs in female germline and resolve the behavior of these chromosomes prior to ovulation, in mature oocytes, during meiosis II, and across early stages of fertilization.

Results

Presence, Retention, and Inheritance of Oocyte GRCs.

To test whether female lampreys carry GRCs in their germline, we used two in situ hybridization probes, Germ1 and Germ2, that strongly hybridize to all of the known sea lamprey GRCs and mark distinct regions of these chromosomes (23, 24). Prior to ovulation, during mid-late prophase I, the germline-specific satellite sequence Germ2 (24) localizes to the periphery of the oocyte nucleolus, in a structure similar to that described for the mammalian nucleolus-like body (29). The oocyte nucleolus is the primary site of rRNA synthesis in the growing oocyte and contributes to the production of ribosomes that are essential for early zygote development, particularly before the initiation of zygotic transcription (30). Consistent with this central role in rRNA production, the nucleolus itself shows strong hybridization to probes against the Germ1 repeat (germline-specific rDNAs) (23, 24) (Fig. 1A). Given their relative location on the GRCs, we infer that the Germ2-containing chromatin is held in proximity to the surface of the nucleolus due to the active transcription of linked ribosomal RNAs, suggesting that these GRC-encoded sequences contribute to production of the oocyte ribosomal pool.

Fig. 1.

Evidence for transcriptional activity and transmission of GRCs in female germline. (A) GRCs are partially decondensed and contribute to the oocyte nucleolus in preovulation oocytes (mid-late prophase I of meiosis). The oocyte nucleolus is labeled with a probe for Germ1, which hybridized to germline and somatic rDNAs (magenta). Other portions of germline-specific chromosomes are hybridized with Germ2 (red), and other chromosomes are hybridized with labeled testes genomic DNA (green). In (B–D) Germ1 hybridization is shown in green and Germ2 is in red; DNA (labeled testes genomic DNA) is in blue. GRC probes identify chromosomes in (B) meiosis II metaphase oocytes, (C) maternal and paternal pronuclei at 15 min postfertilization (the actual distance between maternal nucleus and sperm is 65 µm), and (D) maternal and paternal pronuclei during early karyogamy (30 min postfertilization).

In unfertilized (ovulated) oocytes, Germ1 and Germ2 hybridize to both the oocyte nucleus and the first polar body (Fig. 1 A and B and SI Appendix, Fig. S1), indicating that GRCs are present in both daughter cells following meiosis I. Following fertilization, oocytes complete meiosis II, during which GRCs are visible in both the female pronucleus (Fig. 1 C and D) and the second polar nucleus (SI Appendix, Fig. S2). Similar signals are also visible in the paternal pronucleus (Fig. 1 C and D). As parental nuclei approach karyogamy, hybridization patterns are nearly indistinguishable between paternal and maternal pronuclei with the exception of differences in compaction state: The paternal pronucleus shows a higher degree of compaction relative to the maternal pronucleus at early stages postfertilization (Fig. 1 C and D). These observations demonstrate that several GRCs are present in mature oocytes, that GRCs are transmitted to all daughter cells during female meiosis (including the inherited maternal nucleus), and that maternally inherited GRCs contribute to the diploid zygotic nucleus.

Several Germline-specific Genes Show Oocyte-Enriched Expression.

Previous transcriptomic and ontology analyses have suggested that germline-specific genes contribute to the development and maintenance of male germ cells throughout development and particularly in the context of meiotic testes (23, 26, 27). The persistence of GRCs in the oocyte through completion of meiosis II raises the possibility that germline-specific genes might also contribute to the development and maintenance of meiotic female germline. To assess possible contributions of GRC-encoded genes to oocyte biology, we performed RNA sequencing on ovulated but unfertilized oocytes and developed improved annotations of the lamprey reference genome that incorporate information from this and several other recent studies (27, 31, 32) to better resolve the complement of GRC-encoded genes. Ovulated oocytes were expressed from mature females, sorted to remove oocytes with visible adhering follicular tissue, and rinsed thoroughly in deionized water to reduce contamination by other adhering somatic cells. Analyses of these RNA-seq data identified 209 germline-specific genes with detectable expression in oocytes (≥0.5 FPKM, 108 of which have FPKM ≥5: SI Appendix, Table S1), consistent with the observed presence of GRCs in oocytes (Fig. 2). Several GRC-encoded genes showed higher expression in either testes or oocytes, with 66 being more highly expressed (log2 expression ratio ≥1.5) in oocytes and 287 being more highly expressed in meiotic testes (SI Appendix, Table S1). While comparison of these numbers might indicate that germline-specific genes play a greater role in testes, it is notable that meiotic testes contain germ cells at multiple stages of meiosis and spermatogenesis, whereas ovulated (unfertilized) oocytes represent a single stage of meiosis, being arrested at metaphase of meiosis II.
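The thresholds described above (detection at ≥0.5 FPKM and a log2 expression ratio of at least 1.5 in either direction) can be illustrated with a minimal Python sketch. The function name, the pseudocount, and the category labels are illustrative assumptions and are not taken from the study's pipeline.

```python
import math


def classify_expression(oocyte_fpkm, testes_fpkm,
                        min_fpkm=0.5, min_log2_ratio=1.5, pseudocount=0.01):
    """Label one gene by its relative expression in oocytes vs. testes.

    min_fpkm and min_log2_ratio follow the thresholds quoted in the text;
    the pseudocount (an assumption) only guards against division by zero.
    """
    # A gene below the detection threshold in both samples is not classified.
    if oocyte_fpkm < min_fpkm and testes_fpkm < min_fpkm:
        return "undetected"
    ratio = math.log2((oocyte_fpkm + pseudocount) / (testes_fpkm + pseudocount))
    if ratio >= min_log2_ratio:
        return "oocyte"
    if ratio <= -min_log2_ratio:
        return "testes"
    return "shared"


# Hypothetical FPKM values, for illustration only.
for name, oocyte, testes in [("geneA", 10.0, 1.0), ("geneB", 1.0, 10.0), ("geneC", 5.0, 5.0)]:
    print(name, classify_expression(oocyte, testes))
```

Counting the genes that fall into each label gives tallies of the kind reported above, although the exact counts depend on normalization choices that this sketch does not reproduce.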

Fig. 2.

Expression of germline-specific genes in oocytes and testes. (A) Expression of 324 GRC-encoded genes with a minimum expression level of 5 FPKM in oocytes or testes. Note that there are several GRC-encoded genes with similar expression in both cell types. (FPKM: Fragments Per Kilobase of transcript per Million mapped reads.) The dendrogram to the right summarizes relative distances in expression patterns among genes (rows) and was calculated using Ward’s hierarchical clustering. Four long branches are broken to optimize informative use of display space. (B) The distribution of relative expression values for all GRC-encoded genes with expression ≥0.5 FPKM in both testes and oocytes. A green line corresponds to the best-fit normal distribution with the mean value labeled at its peak. (C) Relative expression of several multicopy GRC-encoded genes (blue circles) and their somatic paralogs (red circles) in oocytes vs meiotic testes. Two GRC paralogy groups (CCNBPIP1 and NFASC) do not have distinguishable somatic paralogs, and the RBM46 group has two candidate somatic paralogs.

The observation of GRC gene expression is consistent with the expectation that sampling from ovulated oocytes reduced somatic contamination to a level that permits detection of GRC gene expression; however, it should be recognized that spermatocytes and oocytes are very different cell types that are associated with a variety of somatic cell types in the gonad. To assess the relative purity of testes vs oocyte samples, we considered the expression of a subset of genes with robust expression in both meiotic testes and oocyte (N = 76, FPKM ≥ 5). If one sample was substantially more enriched for germ cells, we expect that expression of these GRC marker genes might generally be biased toward the purer sample. Although it is an imperfect metric of purity, it is notable that the mean of the distribution of relative expression values (following transcriptome-wide normalization) is close to zero and slightly biased toward oocytes (Fig. 2B). Comparison of this subset of GRC marker genes therefore indicates that ovarian and testes germline samples likely contained similarly low proportions of accompanying somatic cells and that our oocyte sample may represent a slightly purer sample of germ cells.
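The purity comparison described above reduces to the mean log2 expression ratio across genes that are robustly expressed in both samples. The following Python sketch shows that calculation under the stated FPKM ≥ 5 filter; the function name and input layout are assumptions made for illustration, and transcriptome-wide normalization is assumed to have been applied beforehand.

```python
import math
from statistics import mean


def purity_bias(expression, min_fpkm=5.0):
    """Return (mean log2 oocyte/testes ratio, number of genes used).

    expression maps gene name -> (oocyte_fpkm, testes_fpkm).
    Only genes at or above min_fpkm in BOTH samples are used.
    """
    ratios = [
        math.log2(oocyte / testes)
        for oocyte, testes in expression.values()
        if oocyte >= min_fpkm and testes >= min_fpkm
    ]
    if not ratios:
        raise ValueError("no genes pass the expression filter")
    return mean(ratios), len(ratios)


# Hypothetical values: a mean near zero suggests comparable germ cell content,
# and a positive mean indicates a bias toward the oocyte sample.
example = {"g1": (8.0, 8.0), "g2": (16.0, 8.0), "g3": (1.0, 100.0)}
print(purity_bias(example))
```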

Many of the GRC-encoded genes that are transcriptionally enriched in oocytes have one or more germline-specific paralogs, and several of these genes also have paralogs that are enriched in testes (Fig. 2C). These duplicated genes therefore appear to contribute to germ cell biology in general (somatic paralogs are also generally expressed in germline), but the presence of paralogs with higher expression in oocytes or testes suggests that individual paralogs might be tuned over evolutionary time to function more specifically in female or male germline. To gain further insights into the contributions of germline-specific genes to male vs female germline, we performed ontology analyses of genes that are upregulated in oocytes (32 nonredundant human paralogs) and testes (52 nonredundant human paralogs). Genes that are upregulated in oocytes are enriched for ontologies associated with cell adhesion/junctions, ephrin signaling, cyclic nucleotide transport, and positive regulation of notch signaling, though in these latter three categories, enrichment was due to the presence of two or more deeply divergent paralogs of Ephrin A, LRRC8 volume-gated ion channel subunits, and Jagged Canonical Notch Ligands (SI Appendix, Table S2). These include a previously unannotated JAG paralog (PM25152, SI Appendix, Table S1). Genes that are upregulated in the testes showed modest enrichment for the fibroblast growth factor receptor signaling pathway and positive regulation of noncanonical Wnt signaling (SI Appendix, Table S3).

Females do not Retain Male-Eliminated Chromosomes.

Given our new understanding that GRCs are carried, expressed, and transmitted by females, we sought to address whether males and females might differentially retain any of the known GRC sequences that were previously detected using data from males. To better resolve the content of the female somatic genome, we generated and assembled a new set of PacBio HiFi long reads from the somatic (liver) genome of a female sea lamprey. An assembly of these HiFi reads spanned 1.6 Gb in 3074 contigs with an N50 of 3.48 Mb and L50 of 116 contigs, which is slightly smaller and more fragmented than the expected somatic genome size of 1.8 Gb and 1N = 84 chromosomes (23, 24). Alignment to the male reference assembly confirmed that assembled female contigs cover the known somatic genome and individually represent large regions of somatic chromosomes, with many chromosomes being spanned by 2 to 4 contigs (SI Appendix, Fig. S3).
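For readers less familiar with the assembly statistics quoted above, N50 is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly size, and L50 is the number of contigs needed to reach that point. A minimal Python sketch of the standard definition follows; the function name and the toy contig lengths are illustrative only.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        # The first contig that brings the running total to half the
        # assembly size defines N50; its rank is L50.
        if running >= half_total:
            return length, count
    raise ValueError("contig_lengths is empty")


# Toy example: total length is 30, so half is 15; the two longest contigs
# (10 + 8 = 18) are enough to reach it.
print(n50_l50([10, 8, 6, 4, 2]))
```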

We performed two complementary sets of analyses that use these data to identify candidate sequences that might be retained in female soma but eliminated from male soma. First, we employed the same strategy that was previously used to identify germline-specific regions of the male germline reference genome. We aligned germline (testes, SRR16928914) and somatic (liver, SRR30991392) reads from the same male individual to the new female liver genome assembly and compared germline vs. somatic read depth across all scaffolds to identify candidate regions that would be considered to be germline-specific in males, but not eliminated from female soma. These analyses identified 5.65 megabases of sequence across 64 female contigs (>10 kb intervals) that was overrepresented in germline reads by a factor of log2 > 2 (Fig. 3 A and B and SI Appendix, Table S4). Closer examination of these intervals revealed that each corresponds to a repetitive sequence that is enriched on GRCs, but is also present in somatically retained chromosomes. These regions include portions of contigs that correspond to known somatic chromosomes from the male reference assembly and scaffolds that consist entirely of high-copy repetitive sequence. These repetitive regions were defined by a relatively small number of sequences: CA/GT dinucleotide repeats, a short satellite repeat (TCCCGGGCGG), and previously identified GRC-enriched sequences GERM4, GERM6 (24), and rDNA (GERM1) (23).
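The read depth comparison described above can be sketched as a per-interval log2 ratio of germline to somatic coverage, with intervals above log2 > 2 flagged as germline-enriched. The Python below is a simplified illustration, not the study's pipeline: the median-based scaling between libraries, the pseudocount, and all function and variable names are assumptions.

```python
import math
from statistics import median


def log2_coverage_ratios(germline_depth, somatic_depth, pseudocount=1.0):
    """Per-interval log2(germline / somatic) after scaling each library.

    Both inputs map interval name -> mean read depth. Each library is scaled
    by its own median depth (an assumed normalization) so that typical
    somatic intervals sit near a ratio of zero.
    """
    g_med = median(germline_depth.values())
    s_med = median(somatic_depth.values())
    if g_med <= 0 or s_med <= 0:
        raise ValueError("median depth must be positive")
    ratios = {}
    for interval, g in germline_depth.items():
        s = somatic_depth[interval]
        ratios[interval] = math.log2(
            ((g + pseudocount) / g_med) / ((s + pseudocount) / s_med)
        )
    return ratios


def enriched_intervals(ratios, threshold=2.0):
    """Names of intervals whose log2 ratio exceeds the threshold."""
    return sorted(name for name, r in ratios.items() if r > threshold)


# Hypothetical depths for five 10 kb intervals; only "c" is enriched.
germline = {"a": 30, "b": 30, "c": 300, "d": 30, "e": 30}
somatic = {"a": 60, "b": 60, "c": 60, "d": 60, "e": 60}
print(enriched_intervals(log2_coverage_ratios(germline, somatic)))
```

As the text notes, intervals flagged this way still need manual inspection, because repeats that are enriched on GRCs but also present on somatic chromosomes produce the same signal.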

Fig. 3.

Computational searches for female retained GRCs identify germline-enriched repeats, but not somatically retained GRCs. (A) Log2 coverage ratios for germline (sperm) and somatic (liver) reads aligned to an assembly of a female lamprey genome. (B) The same plot shown in A, but zoomed to show bins of relative coverage that contain 0-3 megabases of sequence. (C) Modal sequence coverage for female long reads across 10 kb intervals of somatic chromosomes vs GRCs that were previously defined in the reference assembly. Somatic regions with zero coverage include those that cannot be uniquely mapped due to repeat content or low sequence diversity. GRCs highlighted in red are described in more detail in SI Appendix, Table S5.

Second, we aligned female long reads to the male germline genome to ask whether the female genome contained any sequences that aligned to known GRCs from the male reference genome. These analyses identified a total of eight 10 kb intervals within GRC scaffolds that aligned to female reads at a broad range of coverage depths: between 0.1 and 10 times modal read coverage for this dataset (modal coverage was 38x) (Fig. 3C and SI Appendix, Table S5). Closer examination of these intervals revealed that each corresponds to a high copy sequence that is present in both somatic chromosomes and GRCs, consistent with misalignment of somatic reads to GRC scaffolds over these repetitive regions. Thus, sequencing and analysis of the female somatic genome did not identify sex-specific chromosomes, indicating that males and females retain the same sets of somatic chromosomes following programmed DNA elimination.
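The second analysis expresses coverage of GRC intervals relative to the modal coverage of the dataset (38x in this study). A minimal Python sketch of that relative-coverage filter is shown below; the function names, the exclusion of zero-depth intervals when finding the mode, and the example depths are assumptions made for illustration.

```python
from collections import Counter


def modal_coverage(depths):
    """Most common nonzero depth across genomic intervals."""
    counts = Counter(d for d in depths if d > 0)
    if not counts:
        raise ValueError("no covered intervals")
    return counts.most_common(1)[0][0]


def covered_grc_intervals(grc_depths, modal, low=0.1, high=10.0):
    """GRC intervals whose depth falls between low and high times modal.

    Returns a dict of interval name -> depth relative to modal coverage.
    """
    result = {}
    for name, depth in grc_depths.items():
        relative = depth / modal
        if low <= relative <= high:
            result[name] = relative
    return result


# Hypothetical depths: unmappable intervals (zeros) are ignored for the mode.
genome_depths = [38, 38, 38, 40, 36, 0, 0, 0, 0]
modal = modal_coverage(genome_depths)
grc = {"g1": 0, "g2": 7.6, "g3": 190, "g4": 1000, "g5": 38}
print(modal, sorted(covered_grc_intervals(grc, modal)))
```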

Discussion

In comparison to male germline, lamprey oocytes present challenging targets with respect to sequence- and image-based analysis of genome structure due to their high yolk content and the comparatively low DNA content. Here, we perform extensive in situ hybridization experiments, RNAseq analysis of meiotic germline, and resequencing of a female lamprey to define the presence, transmission, and roles of GRCs in female lampreys. These studies show that GRCs are present in female germline and absent from female somatic cells in a pattern that is nearly indistinguishable from males. Additionally, these analyses shed light on the contributions of GRCs to the female germline, enhancing our understanding of the biology of lamprey GRCs which has thus far been based largely on studies of male germline.

Our analyses indicate that the germline-specific chromosomes play important roles in the maintenance and maturation of oocytes. Analysis of prophase I oocytes reveals that the germline-specific rDNAs are localized to the oocyte nucleolus and appear to contribute substantially to the production of oocyte rRNA. Previous analyses have shown that the germline-specific chromosomes contain ~5× more copies of rDNA than the somatically retained chromosomes (24), with somatic rDNAs being largely localized to a single somatic chromosome (23, 24). The utilization of these additional copies of rDNA seemingly serves a similar purpose to rolling circle rDNA amplicons that are generated during amphibian oogenesis in order to generate large pools of rRNAs that permit high levels of translation necessary for provisioning their large cytoplasmic volume and which contribute to early zygotic translation (33, 34).

Transcriptional analysis indicates that several protein-coding genes are also expressed in oocytes and that some GRC genes may function primarily in oocytes. Contrasting meiotic testes vs meiotic oocyte transcription reveals a few notable pathways that could shed light on the processes of sex differentiation in lampreys and the potential contribution of germline-specific genes. It has been previously noted that there are germline-specific FGF genes and receptors (10, 27) and that these genes are substantially more highly expressed in testes. Additionally, the differential transcription of Notch vs Wnt signaling pathways in meiotic oocytes and testes is also interesting given the generally opposing effects of these two pathways on cell fate decisions (35) and their widespread roles in mammalian reproduction and germ cell biology (36). Studies in the mouse have shown that manipulating the balance between WNT (WNT4) and FGF (FGF9) signaling can override the SRY-mediated sex determination mechanism (37). It is tempting to speculate that germline-specific FGF8, WNT5 play similar roles to mammalian FGF9 and WNT4 in the context of lamprey sex determination, and that signaling through germline-specific JAG cell surface ligands might act to tip the balance to a female fate within the gonad. However, it is important to recognize that germ cells engage in signaling pathways that are not directly related to primary sex determination (e.g., pathways mediating proliferation, development, and growth). Nevertheless, it seems likely that perturbing FGF, WNT, and JAG/Notch signaling pathways during late larval or early metamorphic life (when gonads are differentiating) might provide important insights into mechanisms of sex determination and gonadal development in sea lamprey.

Finally, our analyses of the content and structure of a female lamprey genome further our understanding of the potential for variable patterns of DNA loss between the sexes and constrain potential models of sex determination. Little is known regarding the specific mechanisms that mediate sex determination in sea lamprey; however, it has been suggested that the presence of GRCs in the species raises the potential that sex determination might be related to the differential retention of one or more chromosomes (4, 27). Notably, published reduced representation sequencing (38, 39) and preprint (40) whole genome resequencing surveys from somatic tissues have identified no sex-associated sequences in sea lamprey. From the standpoint of these previous studies, elimination of chromosomes inherited from one parent (similar to the Sciaridae and Cecidomyiidae) would be difficult to distinguish from X/Y, X/O, Z/W, or ZO-type determining mechanisms, as both are diagnosed by the presence of sex-specific sequences or differences in copy number (4, 27, 28, 40). Our analyses of female somatic sequence data further indicate that females do not retain any of the germline-specific sequences that were previously identified in males, suggesting that DNA elimination may be essentially identical in males and females, and by extension that sex is likely not determined in lamprey via somatic retention of any of the previously identified male-eliminated chromosomes in females.

However, it is also important to note that the absence of evidence does not necessarily constitute evidence of absence, which may warrant further efforts to understand the potential roles of GRCs or GRC-encoded genes in relation to mechanisms of sex determination in lamprey. In this context, it is important to recognize that analyses performed thus far do not address the potential existence of one or more sex-specific chromosomes in the female germline. Datasets generated by the current study are insufficient to rigorously address this possibility, which may ultimately require analysis of multiple female germline genomes. However, we note that comparison of an oocyte transcriptome assembly to sperm DNAseq and gonadal RNAseq datasets identified only a small number of candidate female-specific transcripts (N = 55) that were expressed at >0.5 TPM in at least one oocyte/ovarian dataset, had no detectable expression in testes datasets, and were underrepresented in testes DNAseq data (Dataset S1). These include 18 transcripts that show sequence similarity to one another and align to reverse transcriptase-like proteins from Callorhinchus milii (XP_042202441) and Lampetra planeri (CAN0420282), and four sequences that align to the 3’ UTR of Petromyzon marinus protein coding genes at 85-92% nucleotide sequence identity. While analysis of our oocyte transcriptome assembly seemingly does not identify strong candidates for lamprey female-specific genes, it does appear to have identified a small class of variable reverse transcriptase-like genes that may be more active in female germline.
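The three criteria used above to nominate candidate female-specific transcripts can be written as a simple filter. The Python sketch below is illustrative only: the text does not state a numerical cutoff for "underrepresented in testes DNAseq data", so the `max_dna_ratio` parameter and its default are assumptions, as are the function name and input layout.

```python
def is_female_candidate(oocyte_tpms, testes_tpms, testes_dna_ratio,
                        min_tpm=0.5, detect_tpm=0.0, max_dna_ratio=0.25):
    """Apply the three filters described in the text to one transcript.

    oocyte_tpms: TPM values across oocyte/ovarian datasets.
    testes_tpms: TPM values across testes datasets.
    testes_dna_ratio: testes DNAseq depth relative to the expected depth
        (hypothetical measure; the cutoff is an assumption).
    """
    expressed_in_female = any(t > min_tpm for t in oocyte_tpms)
    silent_in_testes = all(t <= detect_tpm for t in testes_tpms)
    underrepresented = testes_dna_ratio < max_dna_ratio
    return expressed_in_female and silent_in_testes and underrepresented


# Hypothetical transcript: expressed in one ovarian dataset, absent from
# testes RNA, and at low depth in testes DNA.
print(is_female_candidate([0.2, 0.9], [0.0, 0.0], 0.1))
```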

While our observation of GRCs in female germline is somewhat at odds with previous transcriptome and methylome studies that did not find strong evidence for the presence of female GRCs (27, 28), it is important to note that lamprey oocytes present exceptionally challenging targets for sequence-based modes of inquiry due to their large size (~1 mm³ or ~1E9 times the size of a single somatic cell) and relatively low DNA content (5 pg in an ovulated oocyte: roughly 1E−6 its mass). Without alternate means of verifying oocyte purity or high sequencing depth, even an imperceptibly small number of adhering somatic cells has the potential to swamp germline signals in sequence-based approaches. Given that it is possible to directly visualize GRCs in oocytes, it seems likely that sequencing approaches that record spatial information alongside RNA and DNA sequencing data will have the potential to further resolve the content and expression of female GRCs, particularly in the context of ovarian development (41, 42).

Our analysis of GRCs in female lampreys underscores the fact that programmed DNA loss has evolved in parallel several times over the course of eukaryotic evolution and that in each case the evolution of GRCs has followed a unique evolutionary path that presumably reflects differences in developmental and reproductive biology, mutational history, and population biology that arise between taxa and over time. In sea lamprey, both sexes retain, transmit, and somatically eliminate a highly similar set of GRCs that have overlapping but distinct functions in males and females, contrasting with other species that show patterns of retention or transmission that differ between the sexes. While it is perhaps not surprising that evolution has shaped sex-specific aspects of GRCs in diverse ways across several lineages that have evolved GRCs, our analyses show that GRCs can contribute to the markedly different developmental biologies and physiologies (including those associated with dramatic differences in cell size and provisioning) of female vs. male germ cells, while still being retained and transmitted by both sexes.

Methods

Animal Use.

This study complied with all relevant ethical guidelines and was performed under protocol number 2011-0848 (University of Kentucky Institutional Animal Care and Use Committee). Sexually mature adults were housed in 300 gallon recirculating tanks filled with artificial spring water (10% Holtfreter’s solution) (43) at 18 to 21 °C. A bead-based biofilter, UV sterilization lamps, and regular water changes were used to maintain water quality. Eggs were manually stripped from ovulating females into crystallizing dishes, and sperm from spermiating males was added to perform in vitro fertilization. When fertilized eggs were used, eggs were observed to ensure that they were activated (as evidenced by ejection of the polar body and expansion of the vitelline membrane), after which they were rinsed briefly with deionized water (18 °C) and subsequently held in artificial spring water at 18 °C. Nonovulated ovaries were collected by dissecting immature females following euthanasia (44) via overdose of MS222 anesthesia (1 g/L buffered to pH 7.0 with NaHCO3) and decapitation.

In Situ Hybridization.

To study female germline chromosomes, nonovulated ovaries, unfertilized mature eggs, zygotes immediately postfertilization, and zygotes at 15 and 30 min postfertilization were collected and fixed in MEMFA for 1 h on a shaking nutator. Fixed eggs were then washed four times in PBST prepared with DEPC-treated water (PBST-DEPC) for 15 min each, followed by sequential transfers into methanol with washes in 25%, 50%, and 75% methanol in PBST-DEPC, and finally in three changes of 100% methanol. The samples were stored in 100% methanol at −20 °C until further use. Prior to clearing and hybridization, MEMFA-fixed samples were rehydrated into 1X PBS by sequential washes in 75%, 50%, and 25% methanol in 1X PBS, followed by three washes in 100% 1X PBS for 30 min each. These eggs were then incubated in a 5% acrylamide hydrogel solution containing 0.5% VA-044 polymerization initiator (FUJIFILM Wako Chemicals, USA) overnight at 4 °C with gentle shaking on a nutator. Polymerization was performed the following morning at 37 °C for 3 h, followed by three rinses in 1X PBS. These hydrogel-stabilized samples were then incubated in stripping solution (8% SDS in 1X PBS) for 3 to 5 d at 37 °C with gentle mixing. After stripping, the eggs were washed with several changes of a large volume (5 mL) of 1X PBS over the course of one day and then transferred into 2 mL of staining solution (1X PBS, pH = 7.5, 0.1% Triton X-100, 0.01% sodium azide).

Probes for the Germ1 repeat and for genomic DNA were produced by nick-translation labeling of an isolated BAC clone (Germ1) and of whole genome DNA extracted from sea lamprey testes, following previously published protocols (45, 46). The labeled nucleotides were Cyanine-3-dUTP or Cyanine-5-dUTP (Enzo, Farmingdale, NY, USA) for the Germ1 probe and Fluorescein-12-dUTP (Thermo Fisher Scientific, Waltham, MA, USA) for genomic DNA. The TelG-FAM PNA-probe (TTAGGG repeats, PNA Bio) was also used as a telomere-specific probe to supplement signals for labeled whole genomic DNA. Probes for other repetitive sequences were labeled by conventional PCR containing 0.1 mM each of dATP, dCTP, and dGTP, 0.03 mM dTTP, and 0.02 mM fluorophore-labeled nucleotide: Cy3-dUTP, Cy5-dUTP, or Fluorescein-12-dUTP. Each PCR amplification was performed using GoTaq® DNA polymerase (Promega, Madison, WI, USA, 0.6 units/25 µL reaction), Colorless GoTaq® Reaction Buffer, 1 µg of genomic DNA template, and 100 ng of oligonucleotide primer. PCR cycling conditions included a 5-min initial denaturation step at 95 °C, 32 rounds of three-step thermal cycling consisting of a 30-s denaturation at 95 °C, a 30-s primer annealing step at 55 °C, and a 30-s extension step at 72 °C, and a final extension at 72 °C for 5 min. Labeled DNA was precipitated with 10 µg sheared unlabeled Salmon Sperm DNA (ThermoFisher), 1/10 vol of 3 M sodium acetate, and 2 vol of 100% ethanol and dissolved in 50% formaldehyde.

FISH on whole eggs was performed according to a previously described protocol for lamprey embryos (22, 45, 47). For individual hybridization experiments, about 10 to 20 PACT-cleared eggs from stock staining solution were placed in a 2-mL tube and rinsed in 50% formamide in 2xSSC. For hybridization, the 50% formamide/2xSSC solution was replaced with 30 µL of hybridization mix consisting of 50% formamide, 10% dextran sulfate, and 250 ng of labeled DNA probe. Samples were preincubated overnight at 37 °C for probe penetration, denatured by heating at 85 °C for 10 min, chilled on ice for 1 min, and then hybridized overnight at 37 °C. The following day, samples were washed three times for 30 min each at 45 °C: first in 50% formamide in 2xSSC; then in 0.4xSSC, 0.3% Nonidet-P40; and finally in 2xSSC, 0.1% Nonidet-P40. Eggs were placed on a slide in a drop of ProLong Glass Antifade Mountant with NucBlue (Invitrogen by Thermo Fisher Scientific) and enclosed under a coverslip with slight pressure.

The prepared slides were analyzed with an Olympus BX53 microscope using filter sets for DAPI, FITC, Cy3, and Cy5 with 20×, 60×, and 100× objectives. Images were captured using cellSens software (Olympus). To generate deep-focus images, the extended focal imaging function was used as implemented in cellSens. Images from each filter set were captured separately and deconvolved if necessary. Separately captured color channels were merged using Adobe Photoshop CC 2019.

Supplementation and Improvement of Gene Annotations.

The sea lamprey genome has been subject to various annotation efforts, including a community-driven annotation set and RefSeq gene models that were both published as part of the initial genome assembly release (10), and supplemental annotations that have been produced on older or current assembly drafts by separate groups focusing on nervous system (31, 32) and gonad (27) (bioprojects PRJNA439307, PRJNA749754, and PRJEB48230). To integrate these various efforts and incorporate new data from this study, we aligned sequence data from several independent studies to the assembly using HISAT2 v.2.2.1 (48) and generated provisional gene models using StringTie v.2.2.3 (49). The programs EVM v.1.1.1 (50, 51) and PASA v.2.5.2 (52) were used to assess the quality of independent predictions alongside existing RefSeq gene models, and to develop an optimally inclusive consensus set of predictions that optimized expectations of splice site use and aligned RNAseq data. To ensure consistency with previous annotation efforts, we recovered annotations for 4722 RefSeq gene models and 1456 community-generated gene models that were rejected by EVM but had no overlapping annotations following gene model refinement by PASA (tuning exon boundaries, adding untranslated regions, and modeling alternative splicing). The utility agat_sp_manage_IDs.pl from the AGAT toolkit v1.0.0 (53) was used to unify gene IDs. After incorporation of homologous gene IDs (below), the utilities gff3 and gff3validator from the GenomeTools (gt) suite (54) were used to validate and reformat a final draft of the annotation gff file. The final updated annotation set consists of a total of 25238 gene models with 41975 mRNAs, including 261 new annotations on GRCs (NCBI GEO accession number: GSE280458 (55), SI Appendix, Table S1).

For homolog annotation, we used DIAMOND v.0.9.24 (56) blastp with the options --more-sensitive and --max-target-seqs 1 to align annotated protein sequences to canonical proteome sets of several species: human (Homo sapiens), chicken (Gallus gallus), mouse (Mus musculus), spotted gar (Lepisosteus oculatus), lancelet (Branchiostoma floridae), and tunicate (Ciona intestinalis) provided by the Reference Proteomes group (Release-2022_02) (57). For each protein sequence that yielded hits with bitscores >50, we calculated the ratio of the maximum bitscore among nonhuman proteins to the maximum bitscore to a human protein. If this ratio was <1.25, the corresponding gene was annotated with a human gene name; otherwise it was annotated with a gene name from the species that delivered the maximum bitscore. Proteins that yielded no alignment with a bitscore >50 were labeled as “unknown”. Of the 35152 protein sequences that aligned to reference proteomes, 31197 were named on the basis of their closest human homolog, 1909 from spotted gar, 626 from chicken, 460 from lancelet, 193 from mouse, and 150 from tunicate. We also used CPC2 v1.01 (58) to calculate coding probability scores and annotated sequences as “coding” if they yielded probability scores >0.5, or “noncoding” for scores <0.5.
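The naming rule can be illustrated with a minimal sketch. It assumes that the human name is preferred unless the best nonhuman hit scores at least 1.25 times higher than the best human hit, and that a protein with no human hit above the threshold takes the name of its best nonhuman hit. The function name, species labels, and input layout are hypothetical and are not taken from the authors' scripts.

```python
from typing import Dict

MIN_BITSCORE = 50.0      # hits must exceed this bitscore to be considered
HUMAN_PREFERENCE = 1.25  # nonhuman/human bitscore ratio cutoff


def name_source(best_bitscores: Dict[str, float]) -> str:
    """Return the species whose gene name would be used, or "unknown".

    best_bitscores maps a species label (hypothetical labels such as
    "human" or "gar") to the best bitscore one lamprey protein obtained
    against that species' proteome.
    """
    # Discard hits that do not exceed the minimum bitscore.
    hits = {sp: bs for sp, bs in best_bitscores.items() if bs > MIN_BITSCORE}
    if not hits:
        return "unknown"

    nonhuman = {sp: bs for sp, bs in hits.items() if sp != "human"}
    if "human" not in hits:
        # Assumption: with no usable human hit, use the best nonhuman hit.
        return max(nonhuman, key=nonhuman.get)
    if not nonhuman:
        return "human"

    best_species = max(nonhuman, key=nonhuman.get)
    ratio = nonhuman[best_species] / hits["human"]
    # Prefer the human name unless a nonhuman hit is clearly better.
    return "human" if ratio < HUMAN_PREFERENCE else best_species
```

For example, a protein with bitscores of 100 (human) and 120 (gar) would receive the human name under this rule, whereas bitscores of 100 and 130 would yield the gar name.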

Oocyte Transcription.

RNA was extracted from two pools of approximately 100 snap-frozen oocytes via the standard Trizol extraction protocol, with the lysis reaction supplemented with 10 mM dithiothreitol. Library preparation and sequencing were performed by NovoGene using the company’s “Plant and Animal Eukaryotic mRNA” pipeline. Resulting RNAseq reads from this study and a previous study targeting meiotic testes (26) (SRP009181) were aligned to the reference genome using HISAT2 v.2.2.1 (48), and the resulting alignments were used to quantify the expression of annotated genes using StringTie v.2.2.3 (49). Differential gene expression was assessed using median of ratios normalization in R (59). For comparisons among paralogs, OrthoFinder v2.5.4 (60, 61) was used to compare the somatically retained gene set to the germline-specific gene set (SI Appendix, Table S1), in order to identify groups of paralogous genes that share a common ancestor with groups of one or more somatic genes. Analyses were performed using the DIAMOND aligner (62) and default OrthoFinder parameters. Ontology analyses were performed using Enrichr (63).
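For readers unfamiliar with median of ratios normalization, the following is a minimal Python sketch of the general technique. The analysis itself was performed in R, so this is an illustration rather than the code used in the study; the function names and the gene-by-sample input layout are assumptions.

```python
import math
from statistics import median
from typing import Dict, List


def size_factors(counts: Dict[str, List[float]]) -> List[float]:
    """Compute one size factor per sample by the median of ratios method.

    counts maps a gene ID to its per-sample counts, with samples in the
    same order for every gene.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios: List[List[float]] = [[] for _ in range(n_samples)]
    for values in counts.values():
        # A gene with a zero count in any sample has no finite geometric
        # mean on the log scale, so it is skipped.
        if any(v <= 0 for v in values):
            continue
        log_geo_mean = sum(math.log(v) for v in values) / n_samples
        for i, v in enumerate(values):
            log_ratios[i].append(math.log(v) - log_geo_mean)
    # The size factor is the median ratio of a sample to the per-gene
    # geometric mean (raises StatisticsError if no gene is usable).
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts: Dict[str, List[float]],
              factors: List[float]) -> Dict[str, List[float]]:
    """Divide each count by the size factor of its sample."""
    return {gene: [v / f for v, f in zip(values, factors)]
            for gene, values in counts.items()}
```

Dividing by these size factors makes counts comparable between libraries of different depth, for example between oocyte and testes libraries, before expression differences are assessed.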

Female Long Read Sequencing and Analysis.

DNA was extracted from the liver of a female lamprey using the Monarch HMW DNA Extraction Kit for Tissue (New England Biolabs). Sequencing was performed by the HudsonAlpha Institute for Biotechnology using a PacBio Revio Sequencer. The resulting long reads were aligned to the lamprey reference genome using the pbmm2 v.1.13.1 wrapper for the Minimap2 v.2.26 aligner (64, 65) in order to identify matching GRC sequences, and were separately processed using hifiasm v.0.19.9-r616 (66, 67) to generate a draft female genome assembly. The high-order structure of the female genome assembly was compared to the published male reference by aligning the female assembly to the male reference using Minimap2 v.2.26 (64, 65) tuned to the alignment of moderately divergent genomes via the -x asm20 parameter. Whole genome alignments were visualized as dot plots using D-GENIES (68).

Following initial alignment of PacBio reads to the male reference assembly, mapped reads were filtered to retain only high-quality alignments to aid in the identification of candidate GRC sequences: those with a mapping quality of 60 and a gap-compressed sequence identity of 90% or higher. In total, 95.9% of the somatic portion of the assembly, 2.56% of the germline-specific fraction of the assembly, and 86% of reads yielded alignments at this threshold. It is important to note that not all sequences are mappable at this threshold due to the presence of repeats and long stretches of low-complexity sequence (23, 24). Depth of coverage for female long reads aligned to the male reference assembly was calculated using the genomecov function of bedtools v.2.30.0 (69). Modal coverage for each nonoverlapping 10 kb interval was computed using the bedtools map function with the “-o mode” option enabled.
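The alignment filter and the windowed modal coverage can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used here: gap-compressed identity is computed from a SAM CIGAR string and NM tag by counting each gap as a single event, ties for the mode are resolved to the lowest depth, and all function names are hypothetical.

```python
import re
from collections import Counter
from typing import List

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
MIN_MAPQ = 60        # mapping quality threshold from the text
MIN_IDENTITY = 0.90  # gap-compressed identity threshold from the text


def gap_compressed_identity(cigar: str, nm: int) -> float:
    """Identity in which each gap counts once, regardless of its length.

    nm is the SAM NM tag (mismatches plus inserted and deleted bases).
    """
    aligned = gap_bases = gap_opens = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":      # aligned columns (match or mismatch)
            aligned += n
        elif op in "ID":     # one insertion or deletion event
            gap_bases += n
            gap_opens += 1
    mismatches = nm - gap_bases
    matches = aligned - mismatches
    columns = aligned + gap_opens
    return matches / columns if columns else 0.0


def keep_alignment(mapq: int, cigar: str, nm: int) -> bool:
    """Apply both thresholds to a single alignment."""
    return (mapq >= MIN_MAPQ
            and gap_compressed_identity(cigar, nm) >= MIN_IDENTITY)


def modal_depth_per_window(depths: List[int], window: int = 10_000) -> List[int]:
    """Return the most frequent depth in each nonoverlapping window."""
    modes = []
    for start in range(0, len(depths), window):
        tally = Counter(depths[start:start + window])
        top = max(tally.values())
        # Assumption: ties are broken in favor of the lowest depth.
        modes.append(min(d for d, c in tally.items() if c == top))
    return modes
```

Using the mode rather than the mean makes each window's summary robust to short repeat-derived spikes in depth.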

Candidate GRC sequences were also identified by mapping male germline and somatic short reads (PRJNA779416) to the female assembly with bwa v.0.7.17 (70), removing alignments with mapping quality < 30 (samtools v.1.14), and calculating the degree of germline enrichment using DifCover v.3.0.1 (9, 10). DifCover was used to process all discontiguous 1 kb intervals of low-copy sequence, requiring read depth in at least one of the samples to be > 1/3X its modal coverage, with an upper limit of 3X in both samples.
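The logic of a depth-filtered enrichment score can be sketched as below. This is a simplified stand-in and not DifCover itself: it assumes that the 3X upper limit refers to three times each sample's modal coverage, normalizes each depth by its sample's modal coverage, and adds a small pseudocount (an assumption) so that intervals with zero somatic depth remain defined. All names are hypothetical.

```python
import math
from typing import Optional


def passes_depth_filter(germ: float, soma: float,
                        germ_modal: float, soma_modal: float) -> bool:
    """Depth criteria for one interval, as read from the text."""
    # At least one sample must exceed one third of its modal coverage.
    lower_ok = germ > germ_modal / 3 or soma > soma_modal / 3
    # Assumption: neither sample may exceed three times its modal coverage.
    upper_ok = germ <= 3 * germ_modal and soma <= 3 * soma_modal
    return lower_ok and upper_ok


def log2_enrichment(germ: float, soma: float,
                    germ_modal: float, soma_modal: float,
                    pseudo: float = 0.01) -> Optional[float]:
    """log2 ratio of modal-normalized germline to somatic depth.

    Returns None for intervals that fail the depth filter.
    """
    if not passes_depth_filter(germ, soma, germ_modal, soma_modal):
        return None
    g = germ / germ_modal
    s = soma / soma_modal
    return math.log2((g + pseudo) / (s + pseudo))
```

Intervals with strongly positive scores are those covered in germline reads but depleted in somatic reads, which is the signature expected of germline-specific sequence.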

For analysis of candidate female-specific sequences, paired-end reads sequenced from RNAs of unfertilized oocytes were assembled with Trinity (v.2.13.2) (71), yielding 236734 contigs/transcripts. These transcripts were evaluated based on their match to the male reference genome, testes DNAseq data, testes RNAseq data, and oocyte/ovary RNAseq data. To identify matches to the male genome assembly, transcripts were aligned to the current reference (GCF_010993605.1) using Minimap2 (64, 65), allowing for splicing (-x splice). Transcripts that yielded no genomic alignments, genomic alignments covering less than 75% of their bases, or an approximate per-base sequence divergence (dv) of more than 0.1 were considered candidate female-specific transcripts, contingent on the results of subsequent analyses. To identify matches in male DNAseq data, we computed mean depth of coverage for testes DNA reads (SRR16928914) aligned with bwa mem (v.0.7.17) (70). Transcripts that were represented by <1X mean coverage in male DNAseq reads were considered candidate female-specific transcripts, contingent on the results of the previous analysis. To assess representation in RNAseq reads from male and female samples (PRJNA749754), as well as RNAseq from unfertilized oocytes and from meiotic testes (SRR369904), these datasets were aligned to assembled transcripts using HISAT2 (48) and expression values (TPM: transcripts per million mapped reads) were calculated using StringTie (49). Transcripts that were absent from male RNAseq data and expressed at a level of >0.5 TPM in at least one female study were considered candidate female-specific transcripts, contingent on the results of all other analyses.
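Because a transcript had to satisfy all three lines of evidence to remain a candidate, the combined criteria can be written as a single predicate. The sketch below is illustrative only: the record fields are hypothetical, and "absent from male RNAseq data" is assumed to mean a TPM of zero in every male dataset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TranscriptEvidence:
    # Fraction of transcript bases covered by its genomic alignment;
    # None when the transcript did not align to the male reference.
    aligned_fraction: Optional[float]
    # Approximate per-base divergence (dv); None when unaligned.
    divergence: Optional[float]
    male_dna_mean_depth: float   # mean depth in testes DNAseq reads
    male_rna_tpm: List[float]    # TPM in each male RNAseq dataset
    female_rna_tpm: List[float]  # TPM in each female RNAseq dataset


def poorly_matches_male_genome(ev: TranscriptEvidence) -> bool:
    """No alignment, <75% of bases aligned, or divergence above 0.1."""
    if ev.aligned_fraction is None or ev.divergence is None:
        return True
    return ev.aligned_fraction < 0.75 or ev.divergence > 0.1


def is_candidate_female_specific(ev: TranscriptEvidence) -> bool:
    """Require all three lines of evidence to agree."""
    return (
        poorly_matches_male_genome(ev)
        and ev.male_dna_mean_depth < 1.0
        # Assumption: absence from male RNAseq means zero TPM everywhere.
        and all(tpm == 0 for tpm in ev.male_rna_tpm)
        and any(tpm > 0.5 for tpm in ev.female_rna_tpm)
    )
```

A transcript that fails any one criterion, for example one with 5X coverage in testes DNAseq reads, is excluded even if it is unaligned and expressed only in females.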

Supplementary Material

Appendix 01 (PDF)

Dataset S01 (XLSX)

pnas.2421883122.sd01.xlsx (25.7KB, xlsx)

Acknowledgments

This work was funded by grants from the NIH (R35GM130349) and NSF (MCB1818012) to JJS. We acknowledge the support of the University of Kentucky High-Performance Computing complex.

Author contributions

V.A.T., N.T., and J.J.S. designed research; V.A.T., N.T., K.I.E., K.R., and J.J.S. performed research; V.A.T., N.T., K.I.E., and J.J.S. analyzed data; and V.A.T., N.T., and J.J.S. wrote the paper.

Competing interests

The authors declare no competing interest.

Footnotes

This article is a PNAS Direct Submission.

Data, Materials, and Software Availability

Sequence data have been deposited in GenBank (GSE280458, PRJNA1174724) (55, 72). Previously published data were used for this work (SRP009181).

Supporting Information

References

  • 1. Smith J. J., Timoshevskiy V. A., Saraceno C., Programmed DNA elimination in vertebrates. Annu. Rev. Anim. Biosci. 9, 173–201 (2021).
  • 2. Zagoskin M. V., Wang J., Programmed DNA elimination: Silencing genes and repetitive sequences in somatic cells. Biochem. Soc. Trans. 49, 1891–1903 (2021).
  • 3. Hodson C. N., Ross L., Evolutionary perspectives on germline-restricted chromosomes in flies (Diptera). Genome Biol. Evol. 13, evab072 (2021).
  • 4. Drotos K. H. I., Zagoskin M. V., Kess T., Gregory T. R., Wyngaard G. A., Throwing away DNA: Programmed downsizing in somatic nuclei. Trends Genet. 38, 483–500 (2022).
  • 5. Ruban A., et al., Supernumerary B chromosomes of *Aegilops speltoides* undergo precise elimination in roots early in embryo development. Nat. Commun. 11, 2764 (2020).
  • 6. Kinsella C. M., et al., Programmed DNA elimination of germline development genes in songbirds. Nat. Commun. 10, 5468 (2019).
  • 7. Hodson C. N., Jaron K. S., Gerbi S., Ross L., Gene-rich germline-restricted chromosomes in black-winged fungus gnats evolved through hybridization. PLoS Biol. 20, e3001559 (2022).
  • 8. Vontzou N., et al., Songbird germline-restricted chromosome as a potential arena of genetic conflicts. Curr. Opin. Genet. Dev. 83, 102113 (2023).
  • 9. Smith J. J., et al., The sea lamprey germline genome provides insights into programmed genome rearrangement and vertebrate evolution. Nat. Genet. 50, 270–277 (2018).
  • 10. Timoshevskaya N., et al., An improved germline genome assembly for the sea lamprey Petromyzon marinus illuminates the evolution of germline-specific chromosomes. Cell Rep. 42, 112263 (2023).
  • 11. White M. J. D., Animal Cytology and Evolution (University Press, Cambridge, ed. 3, 1973), pp. viii, 961 p.
  • 12. Du Bois A. M., Chromosome behavior during cleavage in the eggs of *Sciara coprophila* (Diptera) in the relation to the problem of sex determination. Z. Zellforsch. Mikrosk. Anat. 19, 595–614 (1933).
  • 13. Benatti T. R., et al., A neo-sex chromosome that drives postzygotic sex determination in the hessian fly (Mayetiola destructor). Genetics 184, 769–777 (2010).
  • 14. White M. J., The cytology of the Cecidomyidae (Diptera); The chromosome cycle and anomalous spermatogenesis of Miastor. J. Morphol. 79, 323–369 (1946).
  • 15. Geyer-Duszynska I., Chromosome behavior in spermatogenesis of Cecidomyiidae (Diptera). Chromosoma 11, 499–513 (1961).
  • 16. Pigozzi M. I., Solari A. J., The germ-line-restricted chromosome in the zebra finch: Recombination in females and elimination in males. Chromosoma 114, 403–409 (2005).
  • 17. del Priore L., Pigozzi M. I., Histone modifications related to chromosome silencing and elimination during male meiosis in Bengalese finch. Chromosoma 123, 293–302 (2014).
  • 18. Malinovskaya L. P., et al., Germline-restricted chromosome (GRC) in the sand martin and the pale martin (Hirundinidae, Aves): Synapsis, recombination and copy number variation. Sci. Rep. 10, 1058 (2020).
  • 19. Pei Y., et al., Occasional paternal inheritance of the germline-restricted chromosome in songbirds. Proc. Natl. Acad. Sci. U.S.A. 119, e2103960119 (2022).
  • 20. Pigozzi M. I., Solari A. J., Germ cell restriction and regular transmission of an accessory chromosome that mimics a sex body in the zebra finch, *Taeniopygia guttata*. Chromosome Res. 6, 105–113 (1998).
  • 21. Borodin P., et al., Mendelian nightmares: The germline-restricted chromosome of songbirds. Chromosome Res. 30, 255–272 (2022).
  • 22. Timoshevskiy V. A., Herdy J. R., Keinath M. C., Smith J. J., Cellular and molecular features of developmentally programmed genome rearrangement in a vertebrate (sea lamprey: *Petromyzon marinus*). PLoS Genet. 12, e1006103 (2016).
  • 23. Smith J. J., Antonacci F., Eichler E. E., Amemiya C. T., Programmed loss of millions of base pairs from a vertebrate genome. Proc. Natl. Acad. Sci. U.S.A. 106, 11212–11217 (2009).
  • 24. Timoshevskiy V. A., Timoshevskaya N. Y., Smith J. J., Germline-specific repetitive elements in programmatically eliminated chromosomes of the Sea Lamprey (Petromyzon marinus). Genes (Basel) 10, 832 (2019).
  • 25. Saraceno C., Timoshevskiy V. A., Smith J. J., Functional analyses of the polycomb-group genes in sea lamprey embryos undergoing programmed DNA loss. J. Exp. Zool. B Mol. Dev. Evol. 342, 260–270 (2024).
  • 26. Bryant S. A., Herdy J. R., Amemiya C. T., Smith J. J., Characterization of somatically-eliminated genes during development of the Sea Lamprey (Petromyzon marinus). Mol. Biol. Evol. 33, 2337–2344 (2016).
  • 27. Yasmin T., Grayson P., Docker M. F., Good S. V., Pervasive male-biased expression throughout the germline-specific regions of the sea lamprey genome supports key roles in sex differentiation and spermatogenesis. Commun. Biol. 5, 434 (2022).
  • 28. Angeloni A., et al., Extensive DNA methylome rearrangement during early lamprey embryogenesis. Nat. Commun. 15, 1977 (2024).
  • 29. Bonnet-Garnier A., et al., Genome organization and epigenetic marks in mouse germinal vesicle oocytes. Int. J. Dev. Biol. 56, 877–887 (2012).
  • 30. Kresoja-Rakic J., Santoro R., Nucleolus and rRNA gene chromatin in early embryo development. Trends Genet. 35, 868–879 (2019).
  • 31. Lamanna F., et al., A lamprey neural cell type atlas illuminates the origins of the vertebrate brain. Nat. Ecol. Evol. 7, 1714–1728 (2023).
  • 32. Hockman D., et al., A genome-wide assessment of the ancestral neural crest gene regulatory network. Nat. Commun. 10, 4689 (2019).
  • 33. Brown D. D., Dawid I. B., Specific gene amplification in oocytes. Oocyte nuclei contain extrachromosomal replicas of the genes for ribosomal RNA. Science 160, 272–280 (1968).
  • 34. Gall J. G., Differential synthesis of the genes for ribosomal RNA during amphibian oogenesis. Proc. Natl. Acad. Sci. U.S.A. 60, 553–560 (1968).
  • 35. Acar A., et al., Inhibition of Wnt signalling by Notch via two distinct mechanisms. Sci. Rep. 11, 9096 (2021).
  • 36. Moldovan G. E., Miele L., Fazleabas A. T., Notch signaling in reproduction. Trends Endocrinol. Metab. 32, 1044–1057 (2021).
  • 37. Kim Y., et al., Fgf9 and Wnt4 act as antagonistic signals to regulate mammalian sex determination. PLoS Biol. 4, e187 (2006).
  • 38. Sard N. M., et al., Rapture (RAD capture) panel facilitates analyses characterizing sea lamprey reproductive ecology and movement dynamics. Ecol. Evol. 10, 1469–1488 (2020).
  • 39. Baltazar-Soares M., et al., Seascape genomics reveals limited dispersal and suggests spatially varying selection among European populations of sea lamprey (Petromyzon marinus). Evol. Appl. 16, 1169–1183 (2023).
  • 40. Grayson P., Wright A., Garroway C. J., Docker M. F., SexFindR: A computational workflow to identify young and old sex chromosomes. bioRxiv [Preprint] (2022), 10.1101/2022.02.21.481346 (Accessed 19 October 2024).
  • 41. Baysoy A., Bai Z., Satija R., Fan R., The technological landscape and applications of single-cell multi-omics. Nat. Rev. Mol. Cell Biol. 24, 695–713 (2023).
  • 42. Vandereyken K., Sifrim A., Thienpont B., Voet T., Methods and applications for single-cell and spatial multi-omics. Nat. Rev. Genet. 24, 494–515 (2023).
  • 43. Armstrong D. S., Malacinski G. M., “Raising the axolotl in captivity” in Developmental Biology of the Axolotl, Armstrong M. G., Ed. (Oxford University Press, New York, 1989), pp. 220–227.
  • 44. Leary S., et al., AVMA guidelines for the euthanasia of animals: 2020 edition. https://www.avma.org/sites/default/files/2020-02/Guidelines-on-Euthanasia-2020.pdf. Accessed 19 October 2024.
  • 45. Timoshevskiy V. A., Sharma A., Sharakhov I. V., Sharakhova M. V., Fluorescent in situ hybridization on mitotic chromosomes of mosquitoes. J. Vis. Exp. 10, e4215 (2012).
  • 46. Sambrook J., Russell D. W., Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, New York, ed. 3, 2001), vol. 1.
  • 47. Timoshevskiy V. A., Lampman R. T., Hess J. E., Porter L. L., Smith J. J., Deep ancestry of programmed genome rearrangement in lampreys. Dev. Biol. 429, 31–34 (2017).
  • 48. Kim D., Paggi J. M., Park C., Bennett C., Salzberg S. L., Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
  • 49. Pertea M., Kim D., Pertea G. M., Leek J. T., Salzberg S. L., Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat. Protoc. 11, 1650–1667 (2016).
  • 50. Haas B. J., et al., Automated eukaryotic gene structure annotation using EVidenceModeler and the program to assemble spliced alignments. Genome Biol. 9, R7 (2008).
  • 51. Haas B. J., Zeng Q., Pearson M. D., Cuomo C. A., Wortman J. R., Approaches to fungal genome annotation. Mycology 2, 118–141 (2011).
  • 52. Haas B. J., et al., Improving the *Arabidopsis* genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666 (2003).
  • 53. Dainat J., AGAT: Another Gff Analysis Toolkit to handle annotations in any GTF/GFF format (Version v1.0.0). Zenodo. 10.5281/zenodo.3552717 (2022).
  • 54. Gremme G., Steinbiss S., Kurtz S., GenomeTools: A comprehensive software library for efficient processing of structured genome annotations. IEEE/ACM Trans. Comput. Biol. Bioinform. 10, 645–656 (2013).
  • 55. Timoshevskiy V. A., et al., Data from “GSE280458: Expression of GRCs during male and female meiosis in sea lamprey (Petromyzon marinus).” Gene Expression Omnibus. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE280458. Deposited 28 October 2024.
  • 56. Buchfink B., Xie C., Huson D. H., Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
  • 57. Altenhoff A. M., et al., Standardized benchmarking in the quest for orthologs. Nat. Methods 13, 425–430 (2016).
  • 58. Kang Y. J., et al., CPC2: A fast and accurate coding potential calculator based on sequence intrinsic features. Nucleic Acids Res. 45, W12–W16 (2017).
  • 59. R Core Team, R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, Austria, 2024).
  • 60. Emms D. M., Kelly S., OrthoFinder: Solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 16, 157 (2015).
  • 61. Emms D. M., Kelly S., OrthoFinder: Phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
  • 62. Buchfink B., Reuter K., Drost H. G., Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat. Methods 18, 366–368 (2021).
  • 63. Xie Z., et al., Gene set knowledge discovery with Enrichr. Curr. Protoc. 1, e90 (2021).
  • 64. Li H., Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
  • 65. Li H., New strategies to improve minimap2 alignment accuracy. Bioinformatics 37, 4572–4574 (2021).
  • 66. Cheng H., Concepcion G. T., Feng X., Zhang H., Li H., Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
  • 67. Cheng H., et al., Haplotype-resolved assembly of diploid genomes without parental data. Nat. Biotechnol. 40, 1332–1335 (2022).
  • 68. Cabanettes F., Klopp C., D-GENIES: Dot plot large genomes in an interactive, efficient and simple way. PeerJ 6, e4958 (2018).
  • 69. Quinlan A. R., Hall I. M., BEDTools: A flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
  • 70. Li H., Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv [Preprint] (2013), 10.48550/arXiv.1303.3997 (Accessed 19 October 2024).
  • 71. Grabherr M. G., et al., Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).
  • 72. Timoshevskiy V. A., et al., Data from “PRJNA1174724: Assembly of a female somatic genome for sea lamprey” NCBI BioProject. https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1174724/. Deposited 18 October 2024.


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
