Abstract
We present the first chromosome-level genome assembly of the grasshopper, Locusta migratoria, one of the largest insect genomes. We use coverage differences between females (XX) and males (X0) to identify the X Chromosome gene content, and find that the X Chromosome shows both complete dosage compensation in somatic tissues and an underrepresentation of testis-expressed genes. X-linked gene content from L. migratoria is highly conserved across seven insect orders, namely Orthoptera, Odonata, Phasmatodea, Hemiptera, Neuroptera, Coleoptera, and Diptera, and the 800 Mb grasshopper X Chromosome is homologous to the fly ancestral X Chromosome despite 400 million years of divergence, suggesting either repeated origin of sex chromosomes with highly similar gene content, or long-term conservation of the X Chromosome. We use this broad conservation of the X Chromosome to test for temporal dynamics in Fast-X evolution, and find evidence of a recent burst of evolution for new X-linked genes, in contrast to the slow evolution of X-conserved genes.
Grasshoppers (order Orthoptera, suborder Caelifera) represent an important phylogenetic and developmental comparison to many insect model systems. The first grasshoppers likely arose 250 million years ago during the Triassic period (Misof et al. 2014), and species within the group have since become some of the most prevalent herbivores on earth. The suborder, which contains more than 12,000 species, exhibits a worldwide distribution, with the greatest diversity in the tropics.
Grasshoppers normally possess XX/X0 sex chromosomes (Mao et al. 2020). X0 sex determination systems are thought to derive from XY systems with highly differentiated X and Y Chromosomes in species in which sex is determined by X Chromosome dose rather than Y Chromosome gene content (Furman et al. 2020). Because the Y Chromosome is completely lost in X0 systems, they represent the ultimate example of sex chromosome heterogamety. Extreme examples are often useful in revealing evolutionary patterns; however, despite their inherent utility for the study of sex chromosomes, X0 sex chromosomes are relatively rare compared with XY systems (Bachtrog et al. 2014; The Tree of Sex Consortium 2014), and therefore, their dynamics are not well understood. In XX/X0 systems, the entirety of the X Chromosome is haploid in males, and although theory predicts that extreme heterogamety will accelerate Fast-X evolution (Charlesworth et al. 1987) and the evolution of dosage compensation (Charlesworth 1996), empirical tests of these predictions remain rare (Pal and Vicoso 2015).
Grasshoppers were an early genetic model, and Walter Sutton proposed the chromosome theory of heredity based in part on his work on grasshoppers at the start of the twentieth century (Crow and Crow 2002). Sutton's success was partly attributed to the large chromosomes in grasshoppers, which result from extreme genome size, which in turn has hampered effective genome assembly and subsequent molecular studies. As a result, the two currently available grasshopper genome assemblies, Locusta migratoria (Wang et al. 2014) and Schistocerca gregaria (Verlinden et al. 2020), remain fragmented, with contig N50s of 9.3 kb and 12.0 kb, respectively.
In this study, we combined the Pacific Biosciences (PacBio) HiFi reads (Wenger et al. 2019) and Hi-C technology (Belton et al. 2012) to assemble the first high-quality chromosome-level genome of a grasshopper, the migratory locust, L. migratoria. Our genome assembly allows unprecedented insight into the role of extreme heterogamety in sex chromosome evolution, and our results reveal widespread conservation of the X Chromosome gene content across broad swathes of the insect phylogeny, as well as temporal dynamics to the rate of X Chromosome evolution (Meisel et al. 2019; Chauhan et al. 2021).
Results
Genome features
We used PacBio HiFi sequencing to generate genome sequences for a female (XX) migratory locust and then used Hi-C reads to scaffold the contigs into a chromosome-level genome assembly comprising 12 chromosome-level scaffolds (Fig. 1A). The final assembled genome size is 6.3 Gb, the largest to date among published chromosome-level insect genome assemblies, with a contig N50 of 52.8 Mb. To assess the completeness of our assembly, we performed BUSCO analyses against the insect orthologous groups and recovered a score of 96%, a major improvement on the previous migratory locust and desert locust assemblies (Fig. 1B). Using our own and previously published RNA sequencing (RNA-seq) data sets, we identified 26,636 protein-coding genes with a total of 37,981 transcripts and 59,466 UTRs. Among the 26,636 genes, 19,481 were annotated by BLASTP against the RefSeq arthropod proteins, including top hits to Zootermopsis nevadensis.
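Contig N50, the contiguity metric quoted above, is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The following is a minimal Python sketch of that calculation; the function name `n50` and the toy lengths are illustrative assumptions and are not values or code from this study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical toy contig lengths (arbitrary units), not taken from the assembly.
toy_contigs = [80, 70, 50, 30, 20, 10]
print(n50(toy_contigs))  # prints 70
```

The same function applies to scaffold lengths if a scaffold N50 is wanted instead.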
Figure 1.
Chromosome-scale genome assembly. (A) Hi-C contact map comprises 12 chromosome-level scaffolds, designated as linkage groups (LGs). (B) BUSCO assessment of our assembly, the previous migratory locust, and desert locust assemblies. (C) Male read depth along the genome in 250 kb windows.
The proliferation of repetitive elements is the main reason for the large size of the L. migratoria genome, and repetitive elements constituted 76.57% of the assembled genome, of which DNA transposons (19.87%) and LINE retrotransposons (28.13%) were the most abundant elements. The total repetitive content is much higher than previously reported (60%) (Wang et al. 2014), showing the advantage of the PacBio HiFi reads in the assembly of highly repetitive genomes. To investigate the genome quality, we also quantified the satellite DNA distribution along each chromosome (Supplemental Fig. S1). The most dominant satellites are LmiSat02A-176 and LmiSat27A-57. In addition, we identified several centromere- and telomere-specific satellites (i.e., LmiSat01A-185 and LmiSat07A-5-tel), suggesting that centromere and telomere repetitive elements have been successfully incorporated into some chromosome scaffolds, further demonstrating the high quality of our genome assembly. Notably, after removing repetitive sequences, the L. migratoria nonrepetitive genome is ∼1.5 Gb, larger than the genomes of many other insects.
X Chromosome identification and characteristics
To identify the X Chromosome, we sequenced a male (X0) to an average of 30× coverage, mapping the Illumina reads to our genome and calculating read depth in 250 kb windows. Chromosome 3 has a read depth nearly half that of the other chromosomes (Fig. 1C), consistent with an X0 male karyotype and previous cytogenetic work (Cabrero et al. 2009).
We next compared features between the X Chromosome and autosomes (Supplemental Table S1; Supplemental Fig. S2). Compared with the autosomes, the X Chromosome has lower gene density (Wilcoxon test, P < 0.001) (Supplemental Fig. S3) and larger intron length (Wilcoxon test, P < 0.001). The population recombination rate (ρ) is lower on average across the X Chromosome compared with all autosomes (t-test, P < 0.001), except for LG4 and LG12 (t-test, P > 0.05) (Supplemental Fig. S4). The X Chromosome also exhibits some differences in repetitive element distribution (Supplemental Fig. S2), with lower LINE transposon density (Wilcoxon test, P < 0.001) and higher DNA transposon density (Wilcoxon test, P < 0.001) compared with the autosomes. In particular, the Maverick transposon is significantly enriched on the X Chromosome (Wilcoxon test, P < 0.001), where it is nearly twice as dense as on the autosomes.
Next, we calculated the Kimura two-parameter (K2P) (Kimura 1980) distance of all transposons (Supplemental Fig. S5). The profile of the X Chromosome is similar to that of the small chromosomes, with a wave of Helitron proliferation in both chromosome classes.
X-linked gene content conservation across insect orders
We assessed conservation of L. migratoria X-linked gene content across eight insect orders: Odonata, Orthoptera, Phasmatodea, Hemiptera, Neuroptera, Coleoptera, Diptera, and Lepidoptera (Fig. 2A). In each comparison except that with the Lepidoptera, the gene content shared on the X Chromosome was greater than expected by chance based on the relative proportion of protein-coding sites (chi-squared test, 1 d.f., P < 0.05) (Supplemental Tables S4, S5), suggesting either repeated origin of sex chromosomes with highly similar gene content, or long-term conservation of the X Chromosome. Differentiating these alternatives phylogenetically would require X and Y Chromosome orthologs, which our species lacks as it is X0. Notably, the 800 Mb grasshopper X Chromosome shares significant gene content with Muller element F in Drosophila melanogaster (the ancestral fly X Chromosome; 55.6% of fly Muller F genes, chi-squared test, 1 d.f., P = 3.84 × 10^−31) (Vicoso and Bachtrog 2013) despite 400 million years of divergence. Through functional enrichment analysis, we show that these conserved X-linked genes include GO terms such as learning and memory, neuron recognition, and growth hormone synthesis (Supplemental Fig. S6). The Lepidoptera Z Chromosome does not show significant conservation with the L. migratoria X and, instead, is more likely homologous to LG5, indicating that the Z likely represents an independent evolutionary origin.
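A chi-squared test with 1 d.f. of the kind reported above can be illustrated with a short Python sketch. It assumes a goodness-of-fit layout (orthologs found on the X versus elsewhere, compared with an expected X fraction); the function name `chi2_one_df` and all counts are hypothetical and are not the values in Supplemental Tables S4 and S5. For 1 d.f., the upper-tail probability has the closed form erfc(sqrt(χ²/2)), so no external library is needed.

```python
import math


def chi2_one_df(observed_on_x, total_genes, expected_fraction):
    """Goodness-of-fit chi-squared test with two classes (X vs. not X), 1 d.f."""
    expected_x = total_genes * expected_fraction
    expected_other = total_genes - expected_x
    observed_other = total_genes - observed_on_x
    statistic = ((observed_on_x - expected_x) ** 2 / expected_x
                 + (observed_other - expected_other) ** 2 / expected_other)
    # Survival function of the chi-squared distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical example: 60 of 100 orthologs fall on the X when 50% are expected.
statistic, p_value = chi2_one_df(60, 100, 0.5)
print(f"chi2 = {statistic:.2f}, P = {p_value:.4f}")
```

In practice the expected fraction would come from the relative proportion of protein-coding sequence on the X Chromosome, as described in the text.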
Figure 2.
X- and Z-linked gene content conservation across eight insect orders. (A) Bar plot showing the proportion of X/Z genes in other species represented in the L. migratoria genome. The segment in black is the one interpreted as X/Z gene content conservation. The second column shows the gene proportion of each chromosome in L. migratoria. Genome comparisons include the following (Iele) Ischnura elegans (Chauhan et al. 2021), (Spic) Schistocerca piceifrons (NCBI:GCA_021461385.2), (Sgre) Schistocerca gregaria (Verlinden et al. 2020), (Toce) Teleogryllus oceanicus (Pascoal et al. 2020), (Tcri) Timema cristinae (Parker et al. 2022), (Nlug) Nilaparvata lugens (Ye et al. 2021), (Lstr) Laodelphax striatellus (Zhu et al. 2017), (Sfur) Sogatella furcifera (Wang et al. 2017), (Pven) Pachypsylla venusta (Li et al. 2020), (Apis) Acyrthosiphon pisum (Li et al. 2020), (Rmai) Rhopalosiphum maidis (Chen et al. 2019), (Ccar) Chrysoperla carnea (NCBI:GCA_905475395.1), (Pcha) Pogonus chalceus (NCBI:GCA_002278615.1), (Ppyr) Photinus pyralis (Fallon et al. 2018), (Rful) Rhagonycha fulva (NCBI:GCA_905340355.1), (Tcas) Tribolium castaneum (Tribolium Genome Sequencing Consortium 2008), (Pser) Pyrochroa serraticornis (NCBI:GCA_905333025.1), (Csep) Coccinella septempunctata (NCBI:GCA_907165205.1), (Haxy) Harmonia axyridis (Chen et al. 2021a), (Hill) Hermetia illucens (Generalovic et al. 2021), (Dmel) Drosophila melanogaster (Celniker et al. 2002), (Cpom) Cydia pomonella (Wan et al. 2019), (Hmel) Heliconius melpomene (Davey et al. 2016), and (Bmor) Bombyx mori (Lu et al. 2020). For details about statistical analysis of conservation and phylogenetic reconstruction, see Supplemental Tables S3–S5 and Methods. (B) Circos plot showing synteny analysis of X-linked genes across five insect orders: (SPX) S. piceifrons X, (TCX) T. castaneum X, (APX) A. pisum X, (CCX) C. carnea X, and (HIX) H. illucens X.
Next, we asked if the X Chromosome gene order is also conserved across insect orders. Cross-order genome collinearity is usually weak owing to the long evolutionary distances. As expected, we detected strong collinearity among grasshoppers but few syntenic blocks across orders on the X Chromosome. However, when more gaps were allowed (a less stringent and therefore less informative criterion), we identified significant enrichment of gene synteny on the X Chromosome (Fig. 2B).
Variation in the tempo of Fast-X evolution
For the temporal variation study, we classified genes in L. migratoria into five, partially overlapping, categories. X-conserved genes (Supplemental Fig. S6) are X-linked in at least eight species from Figure 2 and represent genes with long-term X-linkage. X-Lmig genes are X-linked only in L. migratoria and are autosomal in all other species from Figure 2 and represent genes with the shortest history of X-linkage. X–X genes are X-linked in L. migratoria and only one other insect species and represent an intermediate period of X-linkage. A–A genes are autosomal in L. migratoria and all other species. A–X genes are autosomal in L. migratoria and are X-linked in other species and also represent an intermediate period of X-linkage. For each of these categories, we calculated average dN/dS (Fig. 3; Supplemental Table S2), comparing each category to X-Lmig via bootstrapping (1000 replicates). Tests between all categories and A–A were also significant, with P-values in all cases <0.005.
Figure 3.
Gene evolution rate between X Chromosome and autosomes. (A) Box plot of dN/dS values from different autosomal and X-linked categories. We classified genes in L. migratoria into five, partially overlapping, categories: X-conserved genes are X-linked in at least eight species from Figure 2, X-Lmig are X-linked only in L. migratoria and autosomal in all other species from Figure 2, X–X genes are X-linked in L. migratoria and only one other species, A–A genes are autosomal in L. migratoria and all other species, and A–X genes are autosomal in L. migratoria and X-linked in other species. Significant differences from the A–A category (autosomal in all species) are shown. (****) P < 0.001. (B) The moving average values of dN/dS along the X Chromosome. (C) The moving average values of recombination rate (Rho) along the X Chromosome.
We observe elevated rates of evolution in X-Lmig genes compared with all the other categories of genes, consistent with Fast-X evolution (Fig. 3; Supplemental Fig. S9; Supplemental Table S2). Notably, we did not observe elevated dN/dS for X-conserved or X–X genes, suggesting that Fast-X primarily results from genes specific to the L. migratoria X Chromosome. Importantly, X-conserved genes showed significantly slower rates of average evolution compared with A–A genes (P = 0.002), suggesting both Fast-X and Slow-X in the same species depending on the age of X-linkage. Fast-X and Slow-X are mainly owing to differences in dN values (Supplemental Table S2). To validate this, we performed the same analysis in true bugs (Hemiptera) and recovered similar results (Supplemental Fig. S7).
We next analyzed dN/dS patterns across the X Chromosome (Fig. 3B), recovering a region (360–670 Mb) of elevated dN/dS compared with both the autosomes (P < 0.0001 based on 1000 bootstrap replicates) and the remainder of the X Chromosome (P < 0.0001 based on 1000 bootstrap replicates). This suggests that Fast-X might be explained by regional variation along the X Chromosome. We observe no clear difference in recombination rate for this region (Fig. 3C; Supplemental Fig. S8), suggesting the effect is independent of recombination. However, two points of very low recombination appeared at ∼450 Mb and ∼550 Mb, which might represent the legacy of an inactivated centromere. The higher dN/dS region is marginally enriched for genes that are X-linked only in L. migratoria (chi-squared test, P = 0.0275). This suggests that temporal dynamics in Fast-X coupled with a nonrandom distribution of genes X-linked only in L. migratoria likely drive this result.
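The moving averages of dN/dS along the X Chromosome referred to above can be sketched as follows. The source does not state the window size or whether windows are defined by gene count or by physical distance, so this Python sketch assumes a fixed number of consecutive genes ordered by position; the function name `moving_average` and the toy values are hypothetical.

```python
def moving_average(values, window):
    """Mean of each run of `window` consecutive values (genes ordered by position)."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window by one gene: add the new value, drop the oldest.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Hypothetical per-gene dN/dS values ordered along a chromosome.
toy_dnds = [0.10, 0.12, 0.30, 0.35, 0.28, 0.11]
print(moving_average(toy_dnds, window=3))
```

A region of elevated dN/dS would appear as a sustained rise in the returned series relative to its flanks.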
X Chromosome dosage compensation
We next tested for the presence of complete dosage compensation in L. migratoria (Fig. 4A). In female (XX) grasshoppers, X-linked genes showed higher expression levels than autosomal genes in both somatic and gonad tissues. In male (X0) grasshoppers, X-linked gene expression was higher than or equal to that of autosomal genes in somatic tissues but was significantly lower in testis (P < 0.001). The overall male X-linked expression was equal to female X-linked expression in somatic tissues but was significantly lower in testis (P < 0.001), consistent with complete X Chromosome dosage compensation in somatic cells. Dosage compensation is overall homogeneous along the X Chromosome in somatic tissues, although it is absent in gonads (Fig. 4B; Supplemental Fig. S10).
Figure 4.
Dosage compensation in L. migratoria. (A) Box plots of gene expression from different tissues, sexes, and chromosomes. P-values were calculated based on 1000 bootstrap replicates. (B) log2 (female expression level:male expression level) along the X Chromosome.
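The log2(female:male) expression ratio plotted in panel B can be sketched in Python as below. Under complete dosage compensation the median ratio for X-linked genes is expected to be near 0, whereas an uncompensated single-copy X in males would give a value near 1. The function name `median_log2_ratio` and the pseudocount of 1 are assumptions made for illustration; the source does not describe how zero expression values were handled.

```python
import math
from statistics import median


def median_log2_ratio(female_expr, male_expr, pseudocount=1.0):
    """Median of log2(female / male) across genes, with a pseudocount (assumed)."""
    if len(female_expr) != len(male_expr):
        raise ValueError("expression vectors must have the same length")
    ratios = [
        math.log2((f + pseudocount) / (m + pseudocount))
        for f, m in zip(female_expr, male_expr)
    ]
    return median(ratios)


# Hypothetical expression values for three X-linked genes in one tissue.
compensated = median_log2_ratio([10.0, 20.0, 5.0], [10.0, 20.0, 5.0])
uncompensated = median_log2_ratio([7.0, 15.0, 3.0], [3.0, 7.0, 1.0])
print(compensated, uncompensated)
```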
Discussion
Hi-C and long-read sequencing resolves a large, complex insect genome into chromosomes
We used a combination of long-read DNA and Hi-C sequencing to successfully resolve and assemble an unusually large and highly repetitive insect genome. To date, this is the largest insect genome, and one of the largest arthropod genomes, assembled to chromosome scale. The assembly of such relatively large and highly repetitive insect genomes into highly contiguous chromosomes was until very recently unattainable, largely owing to the difficulties presented by high amounts of repetitive content. Indeed, the unusually large size of the grasshopper genome is primarily caused by the high proportion of repetitive content, corresponding to 76.57% of the genome. Therefore, we show that using the highly accurate long-read PacBio HiFi sequencing is an effective assembly strategy for highly repetitive genomes. During the repeat annotation, we constructed a de novo repeat library based on our assembly using RepeatModeler (Flynn et al. 2020). Combining our custom repeat library, the Dfam_Consensus (Storer et al. 2021), and the Repbase libraries (Bao et al. 2015), we identified far more repeats than were found in the previous short-read assembly (Wang et al. 2014). Although we cannot claim to have annotated every repeat in the genome, our repeat annotation is the most complete to date and is likely close to complete. Using our new high-quality genome assembly and annotation of L. migratoria, we investigated X Chromosome dynamics.
Conservation of X Chromosome gene content
We observe high conservation of X Chromosome gene content across Insecta (Fig. 2). This is in contrast to previous observations of turnover in sex chromosomes within the Diptera from the ancestral Dipteran X Chromosome, the dot chromosome (Vicoso and Bachtrog 2015). Conservation of X gene content has been observed in a few well-chosen Insecta species (Meisel et al. 2019; Chauhan et al. 2021), and the X may in fact predate the origin of Insecta (Toups and Vicoso 2023). Our work illustrates a broader conservation across the class, with the exception of the Lepidoptera (Pease and Hahn 2012).
Previous work has documented convergent gene content on the mammalian Y, avian W, and snake W Chromosomes (Bellott and Page 2021) and convergent evolution of mammalian X and avian Z Chromosome gene content via gene acquisition (Bellott et al. 2017). It remains unclear whether the conservation in gene content in Insecta reflects conservation of the sex chromosome itself, repeated origin of X Chromosomes from the same underlying syntenic regions, or repeated movement of the same gene content to the sex chromosomes. These are difficult alternatives to differentiate without knowledge of the specific sex-determining locus throughout this group.
Fast-X and Slow-X evolution
The X Chromosome has several properties that distinguish it from the autosomes (Vicoso and Charlesworth 2006; Meisel and Connallon 2013) and that have the potential to influence the rate and pattern of evolution of X-linked genes (Charlesworth et al. 1987). Because males have only one copy of the X Chromosome and therefore only one copy of X-linked genes, recessive mutations on the X Chromosome are directly exposed to selection in males (Charlesworth et al. 1987). This can lead either to rapid fixation of recessive beneficial variation (Fast-X) or to more efficient purging of recessive deleterious mutations (Slow-X) (Xu et al. 2012). Additionally, the reduced effective population size of the X Chromosome compared with the autosomes can lead to elevated rates of genetic drift, a nonadaptive cause of Fast-X evolution (Charlesworth et al. 1987). This is a process distinct from the steady rate of divergence between X and Y orthologs observed in sex chromosome strata, regions where recombination between the sex chromosomes was halted at different times (for review, see Furman et al. 2020).
We observe Fast-X on genes that are X-linked only in L. migratoria and Slow-X for genes that are conserved on the X Chromosome across insects (Fig. 3A). This may suggest that the pool of adaptive recessive variation is quickly depleted following X-linkage, resulting in a limited burst of Fast-X. Over time, this dynamic appears to shift such that recessive deleterious variation is purged more effectively on the X, resulting in Slow-X over greater evolutionary distances. Slower-X has been predicted theoretically in the early stages of X Chromosome evolution (Mrnjavac et al. 2023). Interaction of Fast-X and Slow-X has previously been observed over far shorter timescales (Xu et al. 2012); however, the extraordinary conservation of X Chromosome coding content that we observe here across Insecta makes it possible to discern temporal dynamics in X Chromosome evolution across far larger timespans. Recombination is reduced on the X compared with the autosomes (Supplemental Fig. S4), and we would expect the reduction in recombination both to reduce the amount of standing variation on the X, all other forces being equal, and to reduce the adaptive potential of the X compared with the autosomes. However, we observed no clear relationship between recombination rate and rates of evolution along the X (Fig. 3C; Supplemental Fig. S8). The temporal dynamics of X evolution that we observe, together with the fact that Fast-X in L. migratoria is largely confined to a restricted region (360–670 Mb) (Fig. 3B), suggest that this Fast-X region may represent a recent addition to the X Chromosome.
Complete dosage compensation
Complete sex chromosome dosage compensation has evolved many times (Mank 2009, 2013) through a variety of mechanisms (Gu and Walters 2017) in both invertebrates and vertebrates (Nguyen and Disteche 2006; Rupp et al. 2016; Darolti et al. 2019; Huylmans et al. 2019; Hu et al. 2022; Metzger et al. 2023). Although dosage compensation is nearly always associated with high degrees of sex chromosome heterogamety, not all cases of heterogametic sex chromosomes evolve complete dosage compensation (Arnold et al. 2008; Chauhan et al. 2021), and it has been suggested that the evolution of complete sex chromosome dosage compensation may in fact accelerate the degeneration of the Y or W Chromosome and therefore exaggerate sex chromosome heterogamety (Lenormand and Roze 2022).
In addition to accelerating sex chromosome divergence, complete sex chromosome dosage compensation, which we observe for somatic tissues in L. migratoria (Fig. 4), is expected to exacerbate Fast-X and Slow-X evolution (Vicoso and Charlesworth 2009; Mank et al. 2010). The extreme heterogamety represented in X0 sex chromosome systems may also increase the rate of X evolution (Darolti et al. 2023). Fast-X, accelerated by dosage compensation and extreme heterogamety, may in turn increase the role of the X Chromosome in adaptation and speciation relative to its size and coding content, termed the Large-X effect (Lasne et al. 2017).
Approximately 60 years ago, Susumu Ohno predicted that the conservation of gene content on the X Chromosome across placental mammals evolved to maintain dosage relationships between X-linked genes and their autosomal counterparts (Ohno 2013). Linkage conservation is another notable feature of the mammalian X Chromosome, which may be a result of selection against rearrangements that might disrupt X Chromosome inactivation (Brashear et al. 2021). Our analysis shows that, compared with mammals, the insect X Chromosome follows Ohno's prediction but shows much weaker linkage conservation (Fig. 2B). Currently, Drosophila (Conrad and Akhtar 2012), aphids (Richard et al. 2017), and grasshoppers are all known to exhibit X Chromosome upregulation, which is different from the X Chromosome inactivation in mammals. We hypothesize that, on the one hand, the dosage effect leads to the gene content conservation in the insect X Chromosome, as Ohno predicted. On the other hand, the insect dosage compensation mechanisms allowed some gene reshuffling between the X Chromosome and autosomes. Further studies, especially of dosage compensation mechanisms in more insect species, are needed to provide more evidence for this hypothesis.
Methods
Library construction and sequencing
For PacBio sequencing, genomic DNA of a female migratory locust was isolated and sheared to an average size of 20 kb using a g-TUBE device (Covaris). The sheared DNA was purified and end-repaired using polishing enzymes, followed by blunt end ligation and exonuclease treatment to create a SMRTbell template according to the PacBio 20 kb template preparation protocol. A BluePippin device (Sage Science) was used to size-select the SMRTbell template and enrich large (>10 kb) fragments. SMRTbell libraries were sequenced on a PacBio Sequel II system, and consensus reads (HiFi reads) were generated using ccs software (https://github.com/pacificbiosciences/unanimity).
For Hi-C sequencing, Hi-C libraries were prepared from a male migratory locust at the BioMarker Technologies Company. Briefly, the sample was collected and spun down, and the cell pellet was resuspended and fixed in formaldehyde solution. DNA was isolated, and the fixed chromatin was digested with the restriction enzyme DpnII overnight. The cohesive ends were labeled with biotin-14-dCTP using the Klenow enzyme and then religated with T4 DNA ligase. The resulting DNA was sheared by sonication to a mean size of 350 bp. Hi-C libraries were generated using NEBNext Ultra enzymes and Illumina-compatible adaptors. Biotin-containing fragments were isolated using streptavidin beads. All libraries were quantified by Qubit 2.0, and insert size was checked using an Agilent 2100 and then quantified by quantitative polymerase chain reaction (PCR). Hi-C sequencing was performed on an Illumina HiSeq 2500 platform using 150 bp paired-end reads.
To assist gene prediction and dosage compensation analysis, 24 RNA-seq libraries were generated from brain, hindleg, and gonads with four biological replicates for each sex. Total RNA was extracted from each tissue using a TRIzol kit (Invitrogen). The mRNA fractions were isolated from the total RNA extracts with the MicroPoly(A)Purist kit (Ambion). cDNA libraries were prepared for each tissue with the RNA-seq library kit (Gnomegen) following the manufacturer's instructions. Each paired-end cDNA library was sequenced with a read length of 150 bp using the Illumina HiSeq 2500 sequencing platform. All sequencing was performed by the Biomarker Technologies Company.
Genome assembly
The long (∼12 kb) and highly accurate (>99%) PacBio HiFi reads were assembled into contigs using Hifiasm (Cheng et al. 2021). The Hi-C data were mapped to the Hifiasm contigs with BWA-MEM (Li and Durbin 2009). Uniquely mapped data were used for chromosome-level scaffolding. HiC-Pro version 2.8.1 (Servant et al. 2015) was used for duplicate removal and quality controls, and the filtered Hi-C data were then used to correct misjoins as well as to order and orient contigs. Preassembly was performed for contig correction by splitting contigs into segments with an average length of 300 kb, and then, the segments were preassembled with Hi-C data. Misassembled points were defined and broken when split segments could not be placed to the original position. Then, the corrected contigs were assembled using LACHESIS (Burton et al. 2013) with Hi-C valid pairs and the parameters CLUSTER_MIN_RE_SITES = 225, CLUSTER_MAX_LINK_DENSITY = 2, ORDER_MIN_N_RES_IN_TRUNK = 105, and ORDER_MIN_N_RES_IN_SHREDS = 105. Gaps between ordered contigs were filled with 100 N's.
To evaluate the quality of the genome assembly, we performed BUSCO (Simão et al. 2015) analyses using 1367 core conserved insect genes on the previous L. migratoria assembly (Wang et al. 2014), the recent desert locust assembly (Verlinden et al. 2020), and our assembly.
Repeat annotation and gene prediction
De novo identification of repeats was performed with RepeatModeler (Flynn et al. 2020) under default parameters. We also recovered 107 satellite DNA sequences belonging to 62 families in L. migratoria (Ruiz-Ruano et al. 2016). Using the de novo repeat library, the satellite DNA library, the Dfam_Consensus (Storer et al. 2021), and the Repbase libraries (Bao et al. 2015), we estimated the repeat content of the assembled genome using RepeatMasker (Smit et al. 2013–2015). Ab initio gene prediction was performed using AUGUSTUS (Stanke et al. 2008). GenomeThreader, implemented in BRAKER2 (Brůna et al. 2021), was run for homology-based prediction using protein sequences of D. melanogaster, Anopheles gambiae, Tribolium castaneum, Apis mellifera, Bombyx mori, Acyrthosiphon pisum, and Z. nevadensis. Publicly available NCBI transcriptome data (obtained from the NCBI Sequence Read Archive [SRA; https://www.ncbi.nlm.nih.gov/sra] under accession numbers SRR4041955, SRR6823282, SRR6823283, SRR6823285–SRR6823287, SRR1974300, SRR3315158, SRR3318255, SRR7221659, SRR2051024, SRR3372608–SRR3372610) and our own transcriptome data were aligned with HISAT2 (Kim et al. 2019) and assembled with StringTie (Pertea et al. 2015), and then coding regions were identified with TransDecoder (https://github.com/TransDecoder/TransDecoder). Finally, EVidenceModeler (Haas et al. 2008) was used to integrate the prediction results obtained with the above three methods. PASA v2.4.1 (Haas et al. 2008) was run to further improve the gene structure annotation.
X Chromosome identification via coverage in males
To identify the X Chromosome, a male migratory locust was sequenced to nearly 30× coverage. The Illumina reads were aligned to our genome assembly with BWA-MEM (Li and Durbin 2009), and SAMtools (Li et al. 2009) was used to extract uniquely mapped reads and remove PCR duplicates. mosdepth (Pedersen and Quinlan 2018) was used to calculate read coverage along the genome (parameters: -t 3 -n --fast-mode --by 250000).
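Once windowed depths are available, the logic of picking out a hemizygous X in an X0 male can be sketched in a few lines of Python. The sketch below is illustrative only: the function name `flag_half_depth_scaffolds`, the scaffold names, and the 0.4 to 0.6 ratio thresholds are assumptions and are not taken from the study. It compares each scaffold's median window depth with the median across scaffolds, which approximates the autosomal depth when most scaffolds are autosomes.

```python
from statistics import median


def flag_half_depth_scaffolds(window_depths, low=0.4, high=0.6):
    """Return scaffolds whose median depth is about half the typical depth.

    window_depths maps a scaffold name to its list of per-window mean depths.
    The thresholds are illustrative assumptions, not values from the study.
    """
    scaffold_medians = {name: median(d) for name, d in window_depths.items()}
    # Most scaffolds are autosomal, so the median of medians approximates 2n depth.
    reference = median(scaffold_medians.values())
    return sorted(
        name for name, m in scaffold_medians.items()
        if low <= m / reference <= high
    )


# Hypothetical depths for three scaffolds of a male sequenced to about 30x.
toy_depths = {
    "scaffold_A": [30, 31, 29],
    "scaffold_B": [30, 30, 32],
    "scaffold_C": [15, 16, 14],
}
print(flag_half_depth_scaffolds(toy_depths))
```

Using medians rather than means limits the influence of collapsed repeats, which can produce windows with extreme depth.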
Gene density, GC content, nucleotide diversity
The gene density of each chromosome was calculated as the number of genes divided by chromosome length. When comparing gene or repeat density among chromosomes, we used 5 Mb sliding windows to calculate the number of genes or repeats. GC content along chromosomes was calculated within 50 kb sliding windows. VCFtools v0.1.13 (Danecek et al. 2011) was used to determine nucleotide diversity within 500 kb sliding windows.
Recombination rate estimation
To explore the recombination across the locust genome, we estimated the population recombination rate (ρ) using FastEPRR (Gao et al. 2016). First, the resequencing data of five female grasshoppers were downloaded from NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/) under accession number PRJNA433455. BCFtools (Danecek et al. 2021) mpileup was used to call SNPs. Then Beagle 5.0 (Browning and Browning 2007) was used to phase the SNPs, and phased data were then input into the FastEPRR_VCF_step1 function in FastEPRR to scan each 10 and 50 kb window (with parameters inSNPThreshold = 30 and qualThreshold = 20). Next, FastEPRR_VCF_step2 was used to estimate the recombination rate for each window. Finally, we applied FastEPRR_VCF_step3 to merge the files generated by step 2 for each chromosome.
K2P analysis
RepeatMasker (Smit et al. 2013–2015) was used to construct the TE expansion history in the migratory locust genome by first recalculating the divergence of the identified TE copies in the genome with the corresponding consensus sequence in the TE library using the Kimura distance and then estimating the percentage of TEs in the genome at different divergence levels.
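For reference, the Kimura two-parameter distance between a TE copy and its consensus is K = −(1/2) ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitions and transversions among compared sites (Kimura 1980). The Python sketch below computes this for two pre-aligned sequences. It is a simplified illustration and not RepeatMasker's implementation; the function name `k2p_distance` is hypothetical, and gaps or ambiguous bases are simply skipped.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for the K2P model")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical aligned fragment: one transition in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

Binning such distances across all copies of a TE family gives the kind of divergence landscape used to infer proliferation waves.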
Gene content and synteny in insect orders
The species phylogenetic tree and divergence times were obtained from TimeTree (Kumar et al. 2022). Insect genomes with assembled sex chromosomes were retrieved from InSexBase (Chen et al. 2021b), InsectBase 2.0 (Mei et al. 2022), and the NCBI GenBank database (https://www.ncbi.nlm.nih.gov/genbank/). For each species, the proportion of X-linked genes with L. migratoria homologs was determined using reciprocal best BLAST hits; the species were Ischnura elegans, Schistocerca piceifrons, S. gregaria, Teleogryllus oceanicus, Timema cristinae, Nilaparvata lugens, Laodelphax striatellus, Sogatella furcifera, Pachypsylla venusta, A. pisum, Rhopalosiphum maidis, Chrysoperla carnea, Pogonus chalceus, Photinus pyralis, Rhagonycha fulva, T. castaneum, Pyrochroa serraticornis, Coccinella septempunctata, Harmonia axyridis, Hermetia illucens, D. melanogaster, Cydia pomonella, Heliconius melpomene, and B. mori. OrthoFinder v2.3.3 (Emms and Kelly 2019) together with DIAMOND v0.9.14 (Buchfink et al. 2015), MAFFT v7.305 (Katoh and Standley 2013), and FastTree v2.1.7 (Price et al. 2009) was used to cluster proteins into orthogroups, reconstruct gene trees, and estimate the species tree. Daphnia pulex (Colbourne et al. 2011) was used as an outgroup in phylogenetic tree construction.
To identify genome synteny over large evolutionary distances, we ran MCScanX (Wang et al. 2012) with the parameters “-s 2 -k 100 -m 50”. MCScanX results were visualized with Circos (Krzywinski et al. 2009).
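The reciprocal best BLAST hit step mentioned above can be sketched in Python as follows. The sketch assumes that hits from both search directions have already been parsed into (query, subject, bit score) tuples and that the best hit is the one with the highest bit score; the function names and gene identifiers are hypothetical, and any E-value or coverage filters used in the study are not represented.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits is an iterable of (query, subject, bitscore) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical hits between species A and species B gene sets.
a_vs_b = [("A1", "B1", 200.0), ("A1", "B2", 150.0), ("A2", "B2", 90.0)]
b_vs_a = [("B1", "A1", 210.0), ("B2", "A1", 140.0), ("B2", "A2", 80.0)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Here A2 is excluded because its best hit, B2, prefers A1 in the reverse search, which is the conservative behavior that makes reciprocal best hits a common ortholog proxy.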
Functional enrichment of genes
Gene functions and Gene Ontology (GO) annotations were retrieved with eggNOG-mapper (Cantalapiedra et al. 2021). Because L. migratoria is not a model organism, a local OrgDb database was constructed based on eggNOG-mapper results. The functional enrichment was then determined using clusterProfiler (Yu et al. 2012).
Fast-X analysis
The evolutionary rate of genes (dN/dS) was calculated by comparing L. migratoria sequences with those of the grasshopper Oedaleus asiaticus, which belongs to the same subfamily, Oedipodinae, as L. migratoria. Because no reference genome is available for O. asiaticus, a gene set was constructed from transcriptomic data downloaded from the NCBI SRA database (SRR IDs SRR2051024, SRR3372608, SRR3372609, and SRR3372610). Trinity (Grabherr et al. 2011) was used to assemble a transcriptome representing a nonredundant gene set of this species. These Trinity-based genes and our genome-based genes were then used for the dN/dS analysis. Reciprocal best BLAST hit pairs were used to identify orthologs. KaKs_Calculator (v2.0; https://sourceforge.net/projects/kakscalculator2/) was used to calculate dN/dS values. Orthologous genes with dN/dS > 2 were removed.
To assess temporal variation in Fast-X evolution, we classified genes in L. migratoria into five partially overlapping categories. X-Conserved genes are X-linked in at least eight species and represent genes with the longest period of X-linkage. X-Lmig genes are X-linked only in L. migratoria and autosomal in all other species, and represent genes with the shortest history of X-linkage. X–X genes are X-linked in L. migratoria and only one other species and represent an intermediate period of X-linkage. A–A genes are autosomal in L. migratoria and all other species. A–X genes are autosomal in L. migratoria and X-linked in other species, and also represent an intermediate period of X-linkage. For each of these categories, we calculated the average dN/dS and compared it with that of X-Lmig genes (1000 bootstrap replicates).
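The category definitions above can be expressed as a small rule set. The sketch below is one possible reading and rests on assumptions not stated in the text: the eight-species threshold is taken to include L. migratoria itself, X-Conserved genes are taken to be X-linked in L. migratoria, and A–X is taken to mean X-linked in at least one other species. Function and parameter names are hypothetical, and the labels are written with ASCII hyphens.

```python
from typing import Sequence, Set


def classify_gene(
    x_in_lmig: bool,
    x_in_others: Sequence[bool],
    conserved_min: int = 8,  # assumed to count L. migratoria as one species
) -> Set[str]:
    """Return the category labels that apply to one L. migratoria gene.

    `x_in_others` holds one flag per comparison species: True if the
    ortholog is X-linked there, False if it is autosomal.
    """
    n_other_x = sum(bool(flag) for flag in x_in_others)
    labels: Set[str] = set()

    if x_in_lmig:
        if n_other_x + 1 >= conserved_min:
            labels.add("X-Conserved")  # longest period of X-linkage
        if n_other_x == 0:
            labels.add("X-Lmig")       # shortest history of X-linkage
        if n_other_x == 1:
            labels.add("X-X")          # intermediate period
    else:
        if n_other_x == 0:
            labels.add("A-A")
        else:
            labels.add("A-X")          # intermediate period
    return labels
```

Under this reading, a gene that is X-linked in L. migratoria and in a handful of other species, but fewer than the threshold, receives no label and would simply be left out of the category comparison.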
Dosage compensation analysis
RNA-seq reads from heads, hindlegs, and gonads of four adult females and four adult males were trimmed of adapters and low-quality bases (Q < 20) using fastp (Chen et al. 2018). The trimmed reads were then mapped to the genome using HISAT2 (Kim et al. 2019), and read counts per gene were obtained with featureCounts (Liao et al. 2014). The raw counts were normalized to transcripts per million (TPM). Genes with low expression support (sum of normalized read counts across all samples less than one) were removed from downstream analysis. Dosage compensation was assessed by comparing average expression between female autosomal and female X genes, between male autosomal and male X genes, between female autosomal and male autosomal genes, and between female X and male X genes.
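The normalization and filtering steps can be sketched as follows. This is a minimal illustration of the standard TPM calculation for a single sample, followed by the low-expression filter described above; the function names, the data layout, and the use of gene length in base pairs are assumptions, not details of the custom scripts used in the study.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to transcripts per million.

    Counts are first divided by gene length in kilobases, then scaled so
    that the values in the sample sum to one million.
    """
    rates = [count / (length / 1000) for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    return [rate / total * 1e6 for rate in rates]


def filter_low_expression(tpm_by_gene, min_total=1.0):
    """Keep genes whose summed normalized expression across samples is >= min_total.

    `tpm_by_gene` maps a gene ID to a list of TPM values, one per sample.
    """
    return {
        gene: values
        for gene, values in tpm_by_gene.items()
        if sum(values) >= min_total
    }
```

With expression filtered in this way, the four comparisons listed above reduce to contrasting per-gene averages between chromosome classes within each sex and between sexes within each chromosome class.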
Statistical analysis
All statistical tests were performed using R (v4.1.1) (R Core Team 2021). A Student's t-test was used when the two groups of data followed a normal distribution and exhibited equal variances. If the variances were unequal, a Wilcoxon test (wilcox.test() in R) was used. Bootstrapping was used as an alternative to inference based on a parametric model when the parametric assumption was in doubt or when parametric inference was impossible.
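To illustrate resampling with replacement, the sketch below computes a percentile bootstrap interval for the difference in means between two groups. It is a commonly used variant shown under stated assumptions, not the exact procedure used in the study; the function name, the fixed seed, and the 95% interval are illustrative choices.

```python
import random
from statistics import mean


def bootstrap_mean_diff(group_a, group_b, n_boot=1000, seed=0):
    """Percentile bootstrap for the difference in means (group_a - group_b).

    Returns (observed_difference, lower_bound, upper_bound), where the
    bounds are the 2.5th and 97.5th percentiles of the resampled differences.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    group_a, group_b = list(group_a), list(group_b)
    observed = mean(group_a) - mean(group_b)

    diffs = []
    for _ in range(n_boot):
        # resample each group with replacement, keeping the group size
        resampled_a = rng.choices(group_a, k=len(group_a))
        resampled_b = rng.choices(group_b, k=len(group_b))
        diffs.append(mean(resampled_a) - mean(resampled_b))

    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[min(n_boot - 1, int(0.975 * n_boot))]
    return observed, lower, upper
```

An interval that excludes zero would indicate a difference between the two groups without relying on a normality assumption.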
A chi-squared test was used to test the distribution of X-linked genes across L. migratoria chromosomes. In short, we used a chi-squared test to determine whether any L. migratoria chromosome contained significantly more or fewer genes from a test organism X Chromosome than expected, based on the number of genes on that L. migratoria chromosome as a proportion of the total number of genes in the genome (chi-squared test, 1 d.f.). The expected value in the chi-squared test is calculated as
$$E_i = N_x \times \frac{N_{xLmig}}{N_{Lmig}}$$
where $E_i$ denotes the expected value for the chi-squared test, $N_{xLmig}$ denotes the number of L. migratoria X-linked genes, $N_{Lmig}$ denotes the number of all genes in the L. migratoria genome, and $N_x$ denotes the number of X-linked genes in the test species. The chi-square statistic is calculated as
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
where $\chi^2$ denotes the chi-square statistic; $O_i$ denotes the observed value, namely, the number of X-linked genes in the test species that are also X-linked in L. migratoria; and $E_i$ denotes the expected value calculated by the previous formula. Bootstrapping was used to compare dN/dS and gene expression between groups. Briefly, the mean value was calculated from the raw data. We then used the sample() function in R v4.1.1 to resample the raw data with replace=TRUE. The t.test() function was then used to assess statistical differences based on 1000 or 10,000 replicates.
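The expected value and test statistic defined above can be sketched numerically. The example assumes that the expected count is the number of X-linked genes in the test species multiplied by the fraction of L. migratoria genes that are X-linked, and that the 1 d.f. test contrasts two cells (test-species X-linked genes that are, or are not, X-linked in L. migratoria). Function names are hypothetical, and the p-value uses the closed form for one degree of freedom.

```python
import math


def expected_overlap(n_x_test, n_x_lmig, n_lmig):
    """Expected number of test-species X-linked genes on the L. migratoria X.

    n_x_test : X-linked genes in the test species (N_x)
    n_x_lmig : X-linked genes in L. migratoria (N_xLmig)
    n_lmig   : all genes in the L. migratoria genome (N_Lmig)
    """
    return n_x_test * n_x_lmig / n_lmig


def chi_square_1df(observed, n_x_test, n_x_lmig, n_lmig):
    """Chi-square statistic and p-value for the two-cell layout (1 d.f.)."""
    e_on = expected_overlap(n_x_test, n_x_lmig, n_lmig)
    e_off = n_x_test - e_on      # expected elsewhere in the genome
    o_off = n_x_test - observed  # observed elsewhere in the genome
    chi2 = (observed - e_on) ** 2 / e_on + (o_off - e_off) ** 2 / e_off
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For example, with hypothetical counts of 100 X-linked genes in a test species, 1000 X-linked genes in L. migratoria, and 10,000 genes in total, the expected overlap is 10 genes, so an observed overlap of 30 genes would be a strong excess.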
Data access
All raw and processed sequencing data generated in this study have been submitted to the NCBI Sequence Read Archive (SRA; http://www.ncbi.nlm.nih.gov/sra/) under accession number SRP402730. The genome assembly CAU_Lmig_1.0 from this study has been submitted to the NCBI Assembly database (https://www.ncbi.nlm.nih.gov/assembly/) under accession number GCA_026315105.1. The genome annotations are openly available on Zenodo (https://doi.org/10.5281/zenodo.7538512). The custom code for the data analysis is openly available on Zenodo at https://doi.org/10.5281/zenodo.7538512 and in the Supplemental Scripts.
Acknowledgments
This work was supported by the National Key R&D Program of China (2022YFD1400500) and the Key Research and Development Program of Ningxia (China) (2023BCF01019). We thank Lujiang Qu and Zhonghua Ning for their support and help on this project. We thank Yanqi Liu for help with tissue sampling.
Author contributions: L.B. conceived and designed the research. X.L. performed the experiments, analyzed and interpreted the data, and wrote the manuscript. J.E.M. analyzed the data, interpreted the results, and wrote the manuscript. All authors read and approved the final manuscript.
Footnotes
[Supplemental material is available for this article.]
Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.278794.123.
Competing interest statement
The authors declare no competing interests.
References
- Arnold AP, Itoh Y, Melamed E. 2008. A bird's-eye view of sex chromosome dosage compensation. Annu Rev Genomics Hum Genet 9: 109–127. doi:10.1146/annurev.genom.9.081307.164220
- Bachtrog D, Mank JE, Peichel CL, Kirkpatrick M, Otto SP, Ashman T-L, Hahn MW, Kitano J, Mayrose I, Ming R, et al. 2014. Sex determination: why so many ways of doing it? PLoS Biol 12: e1001899. doi:10.1371/journal.pbio.1001899
- Bao W, Kojima KK, Kohany O. 2015. Repbase update, a database of repetitive elements in eukaryotic genomes. Mob DNA 6: 11. doi:10.1186/s13100-015-0041-9
- Bellott DW, Page DC. 2021. Dosage sensitive functions in embryonic development drove survival of genes on sex-specific chromosomes in snakes, birds and mammals. Genome Res 31: 198–210. doi:10.1101/gr.268516.120
- Bellott DW, Skaletsky H, Cho TJ, Brown L, Locke D, Chen N, Galkina S, Pyntikova T, Koutseva N, Graves T, et al. 2017. Avian W and mammalian Y chromosomes convergently retained dosage-sensitive regulators. Nat Genet 49: 387–394. doi:10.1038/ng.3778
- Belton J-M, McCord RP, Gibcus JH, Naumova N, Zhan Y, Dekker J. 2012. Hi–C: a comprehensive technique to capture the conformation of genomes. Methods 58: 268–276. doi:10.1016/j.ymeth.2012.05.001
- Brashear WA, Bredemeyer KR, Murphy WJ. 2021. Genomic architecture constrained placental mammal X chromosome evolution. Genome Res 31: 1353–1365. doi:10.1101/gr.275274.121
- Browning SR, Browning BL. 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet 81: 1084–1097. doi:10.1086/521987
- Brůna T, Hoff KJ, Lomsadze A, Stanke M, Borodovsky M. 2021. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genom Bioinform 3: lqaa108. doi:10.1093/nargab/lqaa108
- Buchfink B, Xie C, Huson DH. 2015. Fast and sensitive protein alignment using DIAMOND. Nat Methods 12: 59–60. doi:10.1038/nmeth.3176
- Burton JN, Adey A, Patwardhan RP, Qiu R, Kitzman JO, Shendure J. 2013. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat Biotechnol 31: 1119–1125. doi:10.1038/nbt.2727
- Cabrero J, López-León MD, Teruel M, Camacho JPM. 2009. Chromosome mapping of H3 and H4 histone gene clusters in 35 species of acridid grasshoppers. Chromosome Res 17: 397–404. doi:10.1007/s10577-009-9030-5
- Cantalapiedra CP, Hernández-Plaza A, Letunic I, Bork P, Huerta-Cepas J. 2021. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol Biol Evol 38: 5825–5829. doi:10.1093/molbev/msab293
- Celniker SE, Wheeler DA, Kronmiller B, Carlson JW, Halpern A, Patel S, Adams M, Champe M, Dugan SP, Frise E, et al. 2002. Finishing a whole-genome shotgun: release 3 of the Drosophila melanogaster euchromatic genome sequence. Genome Biol 3: RESEARCH0079. doi:10.1186/gb-2002-3-12-research0079
- Charlesworth B. 1996. The evolution of chromosomal sex determination and dosage compensation. Curr Biol 6: 149–162. doi:10.1016/S0960-9822(02)00448-7
- Charlesworth B, Coyne JA, Barton NH. 1987. The relative rates of evolution of sex chromosomes and autosomes. Am Nat 130: 113–146. doi:10.1086/284701
- Chauhan P, Swaegers J, Sánchez-Guillén RA, Svensson EI, Wellenreuther M, Hansson B. 2021. Genome assembly, sex-biased gene expression and dosage compensation in the damselfly Ischnura elegans. Genomics 113: 1828–1837. doi:10.1016/j.ygeno.2021.04.003
- Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34: i884–i890. doi:10.1093/bioinformatics/bty560
- Chen W, Shakir S, Bigham M, Richter A, Fei Z, Jander G. 2019. Genome sequence of the corn leaf aphid (Rhopalosiphum maidis Fitch). GigaScience 8: giz033. doi:10.1093/gigascience/giz033
- Chen M, Mei Y, Chen X, Chen X, Xiao D, He K, Li Q, Wu M, Wang S, Zhang F, et al. 2021a. A chromosome-level assembly of the harlequin ladybird Harmonia axyridis as a genomic resource to study beetle and invasion biology. Mol Ecol Resour 21: 1318–1332. doi:10.1111/1755-0998.13342
- Chen XI, Mei Y, Chen M, Jing D, He Y, Liu F, He K, Li F. 2021b. InSexBase: an annotated genomic resource of sex chromosomes and sex-biased genes in insects. Database 2021: baab001. doi:10.1093/database/baab001
- Cheng H, Concepcion GT, Feng X, Zhang H, Li H. 2021. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods 18: 170–175. doi:10.1038/s41592-020-01056-5
- Colbourne JK, Pfrender ME, Gilbert D, Thomas WK, Tucker A, Oakley TH, Tokishita S, Aerts A, Arnold GJ, Basu MK, et al. 2011. The ecoresponsive genome of Daphnia pulex. Science 331: 555–561. doi:10.1126/science.1197761
- Conrad T, Akhtar A. 2012. Dosage compensation in Drosophila melanogaster: epigenetic fine-tuning of chromosome-wide transcription. Nat Rev Genet 13: 123–134. doi:10.1038/nrg3124
- Crow EW, Crow JF. 2002. 100 years ago: Walter Sutton and the chromosome theory of heredity. Genetics 160: 1–4. doi:10.1093/genetics/160.1.1
- Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, et al. 2011. The variant call format and VCFtools. Bioinformatics 27: 2156–2158. doi:10.1093/bioinformatics/btr330
- Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, et al. 2021. Twelve years of SAMtools and BCFtools. GigaScience 10: giab008. doi:10.1093/gigascience/giab008
- Darolti I, Wright AE, Sandkam BA, Morris J, Bloch NI, Farré M, Fuller RC, Bourne GR, Larkin DM, Breden F, et al. 2019. Extreme heterogeneity in sex chromosome differentiation and dosage compensation in livebearers. Proc Natl Acad Sci 116: 19031–19036. doi:10.1073/pnas.1905298116
- Darolti I, Fong LJM, Sandkam B, Metzger DCH, Mank JE. 2023. Sex chromosome heteromorphism and the fast-X effect in poeciliids. Mol Ecol 32: 4599–4609. doi:10.1111/mec.17048
- Davey JW, Chouteau M, Barker SL, Maroja L, Baxter SW, Simpson F, Merrill RM, Joron M, Mallet J, Dasmahapatra KK, et al. 2016. Major improvements to the Heliconius melpomene genome assembly used to confirm 10 chromosome fusion events in 6 million years of butterfly evolution. G3 (Bethesda) 6: 695–708. doi:10.1534/g3.115.023655
- Emms DM, Kelly S. 2019. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol 20: 238. doi:10.1186/s13059-019-1832-y
- Fallon TR, Lower SE, Chang C-H, Bessho-Uehara M, Martin GJ, Bewick AJ, Behringer M, Debat HJ, Wong I, Day JC, et al. 2018. Firefly genomes illuminate parallel origins of bioluminescence in beetles. eLife 7: e36495. doi:10.7554/eLife.36495
- Flynn JM, Hubley R, Goubert C, Rosen J, Clark AG, Feschotte C, Smit AF. 2020. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci 117: 9451–9457. doi:10.1073/pnas.1921046117
- Furman BLS, Metzger DCH, Darolti I, Wright AE, Sandkam BA, Almeida P, Shu JJ, Mank JE. 2020. Sex chromosome evolution: so many exceptions to the rules. Genome Biol Evol 12: 750–763. doi:10.1093/gbe/evaa081
- Gao F, Ming C, Hu W, Li H. 2016. New software for the fast estimation of population recombination rates (FastEPRR) in the genomic era. G3 (Bethesda) 6: 1563–1571. doi:10.1534/g3.116.028233
- Generalovic TN, McCarthy SA, Warren IA, Wood JMD, Torrance J, Sims Y, Quail M, Howe K, Pipan M, Durbin R, et al. 2021. A high-quality, chromosome-level genome assembly of the black soldier fly (Hermetia illucens L.). G3 (Bethesda) 11: jkab085. doi:10.1093/g3journal/jkab085
- Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, et al. 2011. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat Biotechnol 29: 644–652. doi:10.1038/nbt.1883
- Gu L, Walters JR. 2017. Evolution of sex chromosome dosage compensation in animals: a beautiful theory, undermined by facts and bedeviled by details. Genome Biol Evol 9: 2461–2476. doi:10.1093/gbe/evx154
- Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, White O, Buell CR, Wortman JR. 2008. Automated eukaryotic gene structure annotation using EVidenceModeler and the program to assemble spliced alignments. Genome Biol 9: R7. doi:10.1186/gb-2008-9-1-r7
- Hu Q-L, Ye Y-X, Zhuo J-C, Huang H-J, Li J-M, Zhang C-X. 2022. Chromosome-level assembly, dosage compensation and sex-biased gene expression in the small brown planthopper, Laodelphax striatellus. Genome Biol Evol 14: evac160. doi:10.1093/gbe/evac160
- Huylmans AK, Toups MA, Macon A, Gammerdinger WJ, Vicoso B. 2019. Sex-biased gene expression and dosage compensation on the Artemia franciscana Z-chromosome. Genome Biol Evol 11: 1033–1044. doi:10.1093/gbe/evz053
- Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 30: 772–780. doi:10.1093/molbev/mst010
- Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. 2019. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37: 907–915. doi:10.1038/s41587-019-0201-4
- Kimura M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 16: 111–120. doi:10.1007/BF01731581
- Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA. 2009. Circos: an information aesthetic for comparative genomics. Genome Res 19: 1639–1645. doi:10.1101/gr.092759.109
- Kumar S, Suleski M, Craig JM, Kasprowicz AE, Sanderford M, Li M, Stecher G, Hedges SB. 2022. TimeTree 5: an expanded resource for species divergence times. Mol Biol Evol 39: msac174. doi:10.1093/molbev/msac174
- Lasne C, Sgrò CM, Connallon T. 2017. The relative contributions of the X chromosome and autosomes to local adaptation. Genetics 205: 1285–1304. doi:10.1534/genetics.116.194670
- Lenormand T, Roze D. 2022. Y recombination arrest and degeneration in the absence of sexual dimorphism. Science 375: 663–666. doi:10.1126/science.abj1813
- Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25: 1754–1760. doi:10.1093/bioinformatics/btp324
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25: 2078–2079. doi:10.1093/bioinformatics/btp352
- Li Y, Zhang B, Moran NA. 2020. The aphid X chromosome is a dangerous place for functionally important genes: diverse evolution of hemipteran genomes based on chromosome-level assemblies. Mol Biol Evol 37: 2357–2368. doi:10.1093/molbev/msaa095
- Liao Y, Smyth GK, Shi W. 2014. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930. doi:10.1093/bioinformatics/btt656
- Lu F, Wei Z, Luo Y, Guo H, Zhang G, Xia Q, Wang Y. 2020. SilkDB 3.0: visualizing and exploring multiple levels of data for silkworm. Nucleic Acids Res 48: D749–D755. doi:10.1093/nar/gkz919
- Mank JE. 2009. The W, X, Y and Z of sex-chromosome dosage compensation. Trends Genet 25: 226–233. doi:10.1016/j.tig.2009.03.005
- Mank JE. 2013. Sex chromosome dosage compensation: definitely not for everyone. Trends Genet 29: 677–683. doi:10.1016/j.tig.2013.07.005
- Mank JE, Vicoso B, Berlin S, Charlesworth B. 2010. Effective population size and the faster-X effect: empirical results and their interpretation. Evolution (N Y) 64: 663–674. doi:10.1111/j.1558-5646.2009.00853.x
- Mao Y, Zhang N, Nie Y, Zhang X, Li X, Huang Y. 2020. Genome size of 17 species from Caelifera (Orthoptera) and determination of internal standards with very large genome size in Insecta. Front Physiol 11: 567125. doi:10.3389/fphys.2020.567125
- Mei Y, Jing D, Tang S, Chen X, Chen H, Duanmu H, Cong Y, Chen M, Ye X, Zhou H, et al. 2022. InsectBase 2.0: a comprehensive gene resource for insects. Nucleic Acids Res 50: D1040–D1045. doi:10.1093/nar/gkab1090
- Meisel RP, Connallon T. 2013. The faster-X effect: integrating theory and data. Trends Genet 29: 537–544. doi:10.1016/j.tig.2013.05.009
- Meisel RP, Delclos PJ, Wexler JR. 2019. The X chromosome of the German cockroach, Blattella germanica, is homologous to a fly X chromosome despite 400 million years divergence. BMC Biol 17: 100. doi:10.1186/s12915-019-0721-x
- Metzger DCH, Porter I, Mobley B, Sandkam BA, Fong LJM, Anderson A, Mank JE. 2023. Transposon wave remodeled the epigenomic landscape in the rapid evolution of novel X chromosome dosage compensation mechanism. Genome Res 33: 1917–1931. doi:10.1101/gr.278127.123
- Misof B, Liu S, Meusemann K, Peters RS, Donath A, Mayer C, Frandsen PB, Ware J, Flouri T, Beutel RG, et al. 2014. Phylogenomics resolves the timing and pattern of insect evolution. Science 346: 763–767. doi:10.1126/science.1257570
- Mrnjavac A, Khudiakova KA, Barton NH, Vicoso B. 2023. Slower-X: reduced efficiency of selection in the early stages of X chromosome evolution. Evol Lett 7: 4–12. doi:10.1093/evlett/qrac004
- Nguyen DK, Disteche CM. 2006. Dosage compensation of the active X chromosome in mammals. Nat Genet 38: 47–53. doi:10.1038/ng1705
- Ohno S. 2013. Sex chromosomes and sex-linked genes. Springer-Verlag, New York.
- Pal A, Vicoso B. 2015. The X chromosome of hemipteran insects: conservation, dosage compensation and sex-biased expression. Genome Biol Evol 7: 3259–3268. doi:10.1093/gbe/evv215
- Parker DJ, Jaron KS, Dumas Z, Robinson-Rechavi M, Schwander T. 2022. X chromosomes show relaxed selection and complete somatic dosage compensation across Timema stick insect species. J Evol Biol 35: 1734–1750. doi:10.1111/jeb.14075
- Pascoal S, Risse JE, Zhang X, Blaxter M, Cezard T, Challis RJ, Gharbi K, Hunt J, Kumar S, Langan E, et al. 2020. Field cricket genome reveals the footprint of recent, abrupt adaptation in the wild. Evol Lett 4: 19–33. doi:10.1002/evl3.148
- Pease JB, Hahn MW. 2012. Sex chromosomes evolved from independent ancestral linkage groups in winged insects. Mol Biol Evol 29: 1645–1653. doi:10.1093/molbev/mss010
- Pedersen BS, Quinlan AR. 2018. Mosdepth: quick coverage calculation for genomes and exomes. Bioinformatics 34: 867–868. doi:10.1093/bioinformatics/btx699
- Pertea M, Pertea GM, Antonescu CM, Chang TC, Mendell JT, Salzberg SL. 2015. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol 33: 290–295. doi:10.1038/nbt.3122
- Price MN, Dehal PS, Arkin AP. 2009. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol 26: 1641–1650. doi:10.1093/molbev/msp077
- R Core Team. 2021. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. https://www.R-project.org/.
- Richard G, Legeai F, Prunier-Leterme N, Bretaudeau A, Tagu D, Jaquiéry J, Le Trionnaire G. 2017. Dosage compensation and sex-specific epigenetic landscape of the X chromosome in the pea aphid. Epigenetics Chromatin 10: 30. doi:10.1186/s13072-017-0137-1
- Ruiz-Ruano FJ, López-León MD, Cabrero J, Camacho JPM. 2016. High-throughput analysis of the satellitome illuminates satellite DNA evolution. Sci Rep 6: 28333. doi:10.1038/srep28333
- Rupp SM, Webster TH, Olney KC, Hutchins ED, Kusumi K, Wilson Sayres MA. 2016. Evolution of dosage compensation in Anolis carolinensis, a reptile with XX/XY chromosomal sex determination. Genome Biol Evol 9: 231–240. doi:10.1093/gbe/evw263
- Servant N, Varoquaux N, Lajoie BR, Viara E, Chen CJ, Vert JP, Heard E, Dekker J, Barillot E. 2015. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol 16: 259. doi:10.1186/s13059-015-0831-x
- Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. 2015. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31: 3210–3212. doi:10.1093/bioinformatics/btv351
- Smit AFA, Hubley R, Green P. 2013–2015. RepeatMasker Open-4.0. http://www.repeatmasker.org.
- Stanke M, Diekhans M, Baertsch R, Haussler D. 2008. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24: 637–644. doi:10.1093/bioinformatics/btn013
- Storer J, Hubley R, Rosen J, Wheeler TJ, Smit AF. 2021. The Dfam community resource of transposable element families, sequence models, and genome annotations. Mob DNA 12: 2. doi:10.1186/s13100-020-00230-y
- Toups MA, Vicoso B. 2023. The X chromosome of insects likely predates the origin of class Insecta. Evolution (N Y) 77: 2504–2511. doi:10.1093/evolut/qpad169
- The Tree of Sex Consortium. 2014. Tree of sex: a database of sexual systems. Sci Data 1: 140015. doi:10.1038/sdata.2014.15
- Tribolium Genome Sequencing Consortium. 2008. The genome of the model beetle and pest Tribolium castaneum. Nature 452: 949–955. doi:10.1038/nature06784
- Verlinden H, Sterck L, Li J, Li Z, Yssel A, Gansemans Y, Verdonck R, Holtof M, Song H, Behmer ST, et al. 2020. First draft genome assembly of the desert locust, Schistocerca gregaria. F1000Res 9: 775. doi:10.12688/f1000research.25148.1
- Vicoso B, Bachtrog D. 2013. Reversal of an ancient sex chromosome to an autosome in Drosophila. Nature 499: 332–335. doi:10.1038/nature12235
- Vicoso B, Bachtrog D. 2015. Numerous transitions of sex chromosomes in Diptera. PLoS Biol 13: e1002078. doi:10.1371/journal.pbio.1002078
- Vicoso B, Charlesworth B. 2006. Evolution on the X chromosome: unusual patterns and processes. Nat Rev Genet 7: 645–653. doi:10.1038/nrg1914
- Vicoso B, Charlesworth B. 2009. Effective population size and the faster-X effect: an extended model. Evolution (N Y) 63: 2413–2426. doi:10.1111/j.1558-5646.2009.00719.x
- Wan F, Yin C, Tang R, Chen M, Wu Q, Huang C, Qian W, Rota-Stabelli O, Yang N, Wang S, et al. 2019. A chromosome-level genome assembly of Cydia pomonella provides insights into chemical ecology and insecticide resistance. Nat Commun 10: 4237. doi:10.1038/s41467-019-12175-9
- Wang Y, Tang H, Debarry JD, Tan X, Li J, Wang X, Lee T, Jin H, Marler B, Guo H, et al. 2012. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res 40: e49. doi:10.1093/nar/gkr1293
- Wang X, Fang X, Yang P, Jiang X, Jiang F, Zhao D, Li B, Cui F, Wei J, Ma C, et al. 2014. The locust genome provides insight into swarm formation and long-distance flight. Nat Commun 5: 2957. doi:10.1038/ncomms3957
- Wang L, Tang N, Gao X, Chang Z, Zhang L, Zhou G, Guo D, Zeng Z, Li W, Akinyemi IA, et al. 2017. Genome sequence of a rice pest, the white-backed planthopper (Sogatella furcifera). GigaScience 6: 1–9. doi:10.1093/gigascience/giw004
- Wenger AM, Peluso P, Rowell WJ, Chang P-C, Hall RJ, Concepcion GT, Ebler J, Fungtammasan A, Kolesnikov A, Olson ND, et al. 2019. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol 37: 1155–1162. doi:10.1038/s41587-019-0217-9
- Xu K, Oh S, Park T, Presgraves DC, Yi SV. 2012. Lineage-specific variation in slow- and fast-X evolution in primates. Evolution (N Y) 66: 1751–1761. doi:10.1111/j.1558-5646.2011.01556.x
- Ye Y-X, Zhang H-H, Li D-T, Zhuo J-C, Shen Y, Hu Q-L, Zhang C-X. 2021. Chromosome-level assembly of the brown planthopper genome with a characterized Y chromosome. Mol Ecol Resour 21: 1287–1298. doi:10.1111/1755-0998.13328
- Yu G, Wang L-G, Han Y, He Q-Y. 2012. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16: 284–287. doi:10.1089/omi.2011.0118
- Zhu J, Jiang F, Wang X, Yang P, Bao Y, Zhao W, Wang W, Lu H, Wang Q, Cui N, et al. 2017. Genome sequence of the small brown planthopper, Laodelphax striatellus. GigaScience 6: 1–12. doi:10.1093/gigascience/gix109