Abstract
Among over 2,000 species of mealybugs (Hemiptera: Pseudococcidae), only 13 genomes have been published so far, which seriously limits research on the phylogeny and adaptive evolution of this group. The publication of further mealybug genomes will greatly facilitate exploration of the biological characteristics, pest traits, and control strategies of the Pseudococcidae. The Jack Beardsley mealybug (Pseudococcus jackbeardsleyi) is a hazardous invasive pest that can cause enormous losses to fruit and vegetable industries worldwide. Herein, we combined Nanopore long-read, Illumina short-read and Hi-C sequencing to generate a high-quality chromosome-level genome assembly of P. jackbeardsleyi. The assembled genome is 334.818 Mb in size and was anchored into 5 linkage groups with a scaffold N50 of 67.233 Mb. BUSCO analysis showed that the completeness of the genome assembly and annotation is 95.7% and 92.8%, respectively. This high-quality genome will serve as a resource for investigating the genetic mechanisms underlying the invasiveness of P. jackbeardsleyi, thereby offering a theoretical foundation for the prevention and management of Pseudococcidae pests.
Subject terms: Entomology, Sequencing
Background & Summary
Mealybugs (Hemiptera: Pseudococcidae) are significant pests comprising over 2,000 species, of which 20% are polyphagous, affecting a wide variety of agricultural, horticultural, and ornamental plants worldwide1,2. They feed on plant sap and excrete honeydew, leading to restricted plant growth, reduced yields, or even plant death, and can therefore cause huge crop losses3. Their broad host range, rapid reproduction cycle, global distribution, and ability to transmit important plant viruses contribute to their considerable potential for causing damage4. Among the members of Pseudococcidae, the Jack Beardsley mealybug (Pseudococcus jackbeardsleyi Gimpel and Miller) is a polyphagous species originating from the Neotropical region5. It infests hosts from 88 genera in 38 plant families, including various vegetables, fruits, and ornamental crops, such as Luffa cylindrica, Nephelium lappaceum and Ficus microcarpa, which are of significant economic importance6. As an important invasive species, P. jackbeardsleyi is currently distributed in 46 countries and regions and is still expanding its range rapidly6. In addition to its direct impact on host plants, P. jackbeardsleyi is a potential vector of plant viruses such as Cacao Mild Mosaic Virus (CaMMV)7. Its broad spectrum of economically important hosts and its capacity to extend its geographical range therefore make it a candidate target for pest management in the future8. However, there is limited information concerning control strategies for this species. Here, whole-genome sequencing was performed to construct a high-quality genome assembly for this species, which will help in studying the role of Pseudococcidae in ecosystems, protecting biodiversity and promoting sustainable development.
Among the many species within Pseudococcidae, only 13 genomes have been published, with merely 5 of them assembled to the chromosome level thus far (Table 1), which has severely limited our understanding of their adaptability, phylogeny and invasion strategies. In the present study, we generated a high-quality chromosome-level reference genome of P. jackbeardsleyi by combining Nanopore long-read sequencing, high-throughput chromosome conformation capture (Hi-C), and Illumina paired-end short-read sequencing. The assembled genome is 334.818 Mb in size and was clustered into 5 linkage groups with a scaffold N50 of 67.233 Mb. We identified 10,908 protein-coding genes, and Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis indicated a completeness of 95.7% for the chromosome-level genome and 92.8% for the annotated genes. Additionally, 12 gene families covering detoxification and chemosensory genes were annotated. Overall, the high-quality P. jackbeardsleyi genome not only provides a useful resource for phylogenomic and comparative genomic studies of Pseudococcidae, but also facilitates the development of potential control strategies for these pests.
Table 1.
Assembly features for genomes of Pseudococcus jackbeardsleyi and other scale insects.
Feature | Pseudococcus jackbeardsleyi | Pseudococcus viburni | Pseudococcus longispinus | Phenacoccus solenopsis | Planococcus citri | Paracoccus marginatus | Acanthococcus lagerstroemiae | Balanococcus diminutus | Coronaproctus castanopsis | Ferrisia virgata | Hypogeococcus pungens | Maconellicoccus hirsutus | Trionymus perrisii |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Level | Chr. | Scaf. | Scaf. | Chr. | Chr. | Scaf. | Chr. | Chr. | Chr. | Scaf. | Scaf. | Scaf. | Scaf. |
Size (Mb) | 334.818 | 435.4 | 285 | 292.5 | 403.6 | 191.2 | 658.1 | 313.1 | 700.1 | 304.6 | 238.2 | 163 | 237.6 |
No. Scaf./Chr. | 5 | 2,392 | 66,857 | 588 | 5 | 60,102 | 9 | 5 | 3 | 32,723 | 250,844 | 12,889 | 80,386 |
Scaf. N50 (Mb) | 67.233 | 0.875 | 0.0099 | 49 | 83.7 | 0.0065 | 70.5 | 63.3 | 273.8 | 0.0254 | 0.0019 | 0.0468 | 0.0046 |
No. contig | 96 | 2,465 | 67,377 | 1,500 | 35 | 61,408 | 1,035 | 75 | 143 | 33,491 | 258,686 | 13,288 | 80,611 |
Contig N50 (Mb) | 7.767 | 0.8266 | 0.0097 | 0.4898 | 23.6 | 0.0063 | 5.5 | 6.7 | 12.4 | 0.0243 | 0.0017 | 0.0449 | 0.0046 |
Methods
Sample collection, DNA and RNA preparation
Pseudococcus jackbeardsleyi individuals were collected from mangosteens in Pingxiang, Guangxi Zhuang Autonomous Region of China (22.1178° N, 106.7394° E) for genome sequencing. Genomic DNA for Nanopore and Illumina paired-end library preparation was extracted from 20 females using Blood & Cell Culture DNA Kits (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. The purity and concentration of all DNA extracts were verified with a NanoDrop (NanoDrop Products, Wilmington, DE, USA) and with Qubit™ dsDNA Quantification Assay Kits (Life Technologies Corporation, Eugene, OR, USA) on a Qubit® 3.0 Fluorometer (Life Technologies Corporation, Eugene, OR, USA). The Blue Pippin system (Sage Science, Beverly, MA, USA) was used to recover large DNA fragments by gel cutting. For RNA-seq, total RNA from one female was extracted using TRIzol Reagent (Invitrogen, Carlsbad, CA, USA) and quantified with a NanoDrop ND-2000 spectrophotometer (NanoDrop Products, Wilmington, DE, USA), with three biological replicates prepared.
Genome sequencing and assembly
Genomic DNA was repaired and purified using the methods employed in a previous study9. The Ligation Sequencing Kit (Cat# SQK-LSK109, Oxford Nanopore Technologies, Oxford, UK) was used for adaptor ligation and purification. The DNA library was then constructed and quantified using the Qubit® 3.0 Fluorometer (Life Technologies Corporation, Eugene, OR, USA). Approximately 700 ng of the DNA library was loaded onto an Oxford Nanopore PromethION P48 device running MinKNOW v22.03.4 with an R9.4.1 flow cell (FLO-PRO002) (Oxford Nanopore Technologies, Oxford, UK) at the Genome Center of Grandomics (Wuhan, China) for real-time single-molecule sequencing. The Guppy basecaller (v6.0.7) was used to convert the raw signal into DNA bases with the following parameters: --flowcell FLO-PRO002 --kit SQK-LSK109 --basecaller r9.4.1 --min_length 1000 --compress_fastq. The resulting FASTQ data were used for genome assembly. For short-read sequencing, paired-end library construction and sequencing followed previously described methods10. After quality control, we obtained 15.468 Gb of short reads (coverage: 47.741×) from the Illumina platform and 42.105 Gb of long reads (coverage: 129.954×) from the Nanopore platform for genome assembly (Table 2).
Table 2.
Statistics for sequencing data for Pseudococcus jackbeardsleyi genome assembly.
Method | Insert size (bp) | Data (Gb) | Coverage (×) | Usage |
---|---|---|---|---|
Illumina NovaSeq | 500 | 15.468 | 47.741 | Survey, correction |
Nanopore | 20,000 | 42.105 | 129.954 | De novo assembly |
Hi-C library | 100–500 | 33.114 | 102.204 | Chromosome-level assembly |
Total | / | 90.687 | 279.899 | / |
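The coverage values in Table 2 follow from dividing the data volume by a genome size. As a quick consistency check, the sketch below back-calculates the genome size implied by each row. The names `TABLE2` and `implied_genome_size_mb` are illustrative, and the denominator is inferred from the table rather than stated in the text; each row implies roughly 324.0 Mb, slightly below the 334.818 Mb of the final assembly.

```python
# Data volume (Gb) and reported coverage (x), copied from Table 2.
TABLE2 = {
    "Illumina NovaSeq": (15.468, 47.741),
    "Nanopore": (42.105, 129.954),
    "Hi-C library": (33.114, 102.204),
}


def implied_genome_size_mb(data_gb: float, coverage: float) -> float:
    """Genome size (Mb) implied by the relation coverage = data / genome size."""
    return data_gb * 1000.0 / coverage


# Back-calculate the denominator for every sequencing method.
implied = {
    name: implied_genome_size_mb(data_gb, coverage)
    for name, (data_gb, coverage) in TABLE2.items()
}

for name, size_mb in implied.items():
    print(f"{name}: implied genome size {size_mb:.1f} Mb")
```

Running the same function with a different genome size estimate shows how sensitive the reported coverage is to the chosen denominator.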
Oxford Nanopore long reads were used for de novo genome assembly. Raw reads were corrected and assembled using NextDenovo v2.4.0 (https://github.com/Nextomics/NextDenovo) with default parameters to generate a draft assembly. After assembly, NextPolish v1.3.111 was used to further improve single-base accuracy using standard parameters as described in a previous publication9. This process resulted in 96 contigs with a contig N50 of 7.767 Mb (Table 1). Genome size, heterozygosity, and duplication rate were estimated by the k-mer method. K-mers (k = 21) were counted in the Illumina short reads with Jellyfish v2.2.912 (Fig. 1a), and the parameters were estimated with GenomeScope v1.013. BUSCO v4.1.4 was used to assess the completeness of the assembly based on the insecta_odb10 database (1,367 genes)14, revealing that 95.7% of the genes were complete in the contig-level genome (Table 3).
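To illustrate the idea behind k-mer based genome size estimation, the sketch below uses a much simpler approximation than GenomeScope, which fits a statistical model to the histogram: it divides the total number of k-mers by the depth of the main peak. The histogram is a toy example, and `min_depth` is an assumed cutoff for discarding low-depth (likely erroneous) k-mers; neither comes from this study.

```python
def estimate_genome_size(histogram, min_depth=3):
    """Naive genome size estimate from a k-mer depth histogram.

    histogram maps depth -> number of distinct k-mers seen at that depth.
    Returns (estimated size in k-mers, depth of the main peak).
    """
    # Drop low-depth k-mers, which are dominated by sequencing errors.
    solid = {d: c for d, c in histogram.items() if d >= min_depth}
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    # The main peak approximates the average k-mer sequencing depth.
    peak_depth = max(solid, key=solid.get)
    # Total k-mer occurrences divided by the depth gives the genome size.
    total_kmers = sum(d * c for d, c in solid.items())
    return total_kmers / peak_depth, peak_depth


# Toy histogram: an error peak at depth 1-2 and a genomic peak around depth 20.
toy_histogram = {1: 5000, 2: 800, 18: 100, 19: 300, 20: 600, 21: 300, 22: 100}
size, peak = estimate_genome_size(toy_histogram)
print(f"peak depth {peak}, estimated size {size:.0f} k-mers")
```

This approximation ignores heterozygosity and repeats, which is why model-based tools are preferred for real data.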
Fig. 1. Assembly features of Pseudococcus jackbeardsleyi genome.
(a) K-mer (k = 21) distribution and estimated genome size, heterozygosity and duplication rate; (b) genome-wide all-by-all Hi-C interaction.
Table 3.
Completeness of Pseudococcus jackbeardsleyi genome assembly and annotation evaluated by BUSCO based on insecta_odb10 database (1,367 genes).
Source | Complete (C) | Single copy (S) | Duplicated (D) | Fragmented (F) | Missing (M) |
---|---|---|---|---|---|
Contig-level | 95.7% | 93.0% | 2.7% | 0.7% | 3.6% |
Chromosome-level | 95.7% | 93.4% | 2.3% | 0.8% | 3.5% |
Annotation | 92.8% | 89.5% | 3.3% | 2.3% | 4.9% |
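In BUSCO notation, complete genes are the sum of single-copy and duplicated genes (C = S + D), and complete, fragmented and missing genes together account for all searched genes (C + F + M = 100%). The sketch below checks the rows of Table 3 against these two identities; the tolerance `tol` is an assumption that allows for rounding to one decimal place.

```python
# Percentages copied from Table 3, in the order (C, S, D, F, M).
BUSCO = {
    "Contig-level": (95.7, 93.0, 2.7, 0.7, 3.6),
    "Chromosome-level": (95.7, 93.4, 2.3, 0.8, 3.5),
    "Annotation": (92.8, 89.5, 3.3, 2.3, 4.9),
}


def is_consistent(c, s, d, f, m, tol=0.15):
    """Check C = S + D and C + F + M = 100 within a rounding tolerance."""
    complete_ok = abs(c - (s + d)) <= tol
    total_ok = abs((c + f + m) - 100.0) <= tol
    return complete_ok and total_ok


checks = {source: is_consistent(*values) for source, values in BUSCO.items()}

for source, ok in checks.items():
    print(f"{source}: {'consistent' if ok else 'inconsistent'}")
```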
Hi-C sequencing and chromosome anchoring
The Hi-C library was constructed following a standard protocol described previously, with certain modifications15. Briefly, 20 females of P. jackbeardsleyi were ground in 2% formaldehyde to cross-link cellular proteins. Cross-linking was halted by adding glycine and applying additional vacuum infiltration. Fixed tissue was then ground into a powder and resuspended in nuclei isolation buffer, and the purified nuclei were digested with 100 units of DpnII restriction enzyme and marked by incubation with biotin-14-dATP as described in previous studies9,10. Hi-C libraries were quantified by quantitative real-time PCR with a library quantification kit/Illumina GA Universal (KAPA, Wilmington, MA, USA). The libraries were then sequenced on the Illumina NovaSeq platform, generating 150 bp paired-end reads. In total, 33.114 Gb (coverage: 102.204×) of Hi-C data for P. jackbeardsleyi was generated (Table 2). The Juicer v1.6 and 3D de novo assembly (3D-DNA) pipelines were used to assemble the scaffolds into a chromosome-level genome16,17. Of the sequenced read pairs, 84.12% were normal paired, while the remainder were chimeric paired (12.24%), chimeric ambiguous (2.21%) or unmapped (1.43%), and 26.08% of the read pairs represented Hi-C contacts (Table 4). The assembled contigs were clustered into 5 linkage groups with a scaffold N50 of 67.233 Mb (Fig. 1b, Table 1). BUSCO was also used to evaluate the completeness of the chromosome-level genome, in which 95.7% of the BUSCO genes were identified as complete (Table 3).
Table 4.
Summary of Hi-C data for chromosome-level assembly of Pseudococcus jackbeardsleyi genome.
Parameter | Value |
---|---|
Sequenced Read Pairs | 109,359,863 |
Normal Paired | 91,989,238 (84.12%) |
Chimeric Paired | 13,386,340 (12.24%) |
Chimeric Ambiguous | 2,420,030 (2.21%) |
Unmapped | 1,564,255 (1.43%) |
Ligation Motif Present | 67,777,133 (61.98%) |
Alignable (Normal + Chimeric Paired) | 105,375,578 (96.36%) |
Unique Reads | 45,133,938 (41.27%) |
PCR Duplicates | 59,592,184 (54.49%) |
Optical Duplicates | 649,456 (0.59%) |
Library Complexity Estimate | 52,123,569 |
Intra-fragment Reads | 8,501,126 (7.77%/18.84%) |
Below MAPQ Threshold | 8,110,642 (7.42%/17.97%) |
Hi-C Contacts | 28,522,170 (26.08%/63.19%) |
Ligation Motif Present | 20,951,834 (19.16%/46.42%) |
3’ Bias (Long Range) | 78% - 22% |
Pair Type %(L-I-O-R) | 25% - 25% - 25% - 25% |
Inter-chromosomal | 5,178,347 (4.74%/11.47%) |
Intra-chromosomal | 23,343,823 (21.35%/51.72%) |
Short Range (<20Kb) | 19,074,662 (17.44%/42.26%) |
Long Range (>20Kb) | 4,268,693 (3.90%/9.46%) |
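The read-pair categories in Table 4 are nested, and several rows report two percentages, which in these data correspond to fractions of sequenced read pairs and of unique reads, respectively. The sketch below recomputes a few of these relationships from the counts in the table; the dictionary keys and the helper `pct` are illustrative names.

```python
# Read-pair counts copied from Table 4.
SEQUENCED = 109_359_863
UNIQUE = 45_133_938
counts = {
    "normal_paired": 91_989_238,
    "chimeric_paired": 13_386_340,
    "chimeric_ambiguous": 2_420_030,
    "unmapped": 1_564_255,
    "pcr_duplicates": 59_592_184,
    "optical_duplicates": 649_456,
    "intra_fragment": 8_501_126,
    "below_mapq": 8_110_642,
    "hic_contacts": 28_522_170,
    "inter_chromosomal": 5_178_347,
    "intra_chromosomal": 23_343_823,
}


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Alignable pairs are the normal and chimeric paired categories combined.
alignable = counts["normal_paired"] + counts["chimeric_paired"]

print("alignable:", alignable, f"({pct(alignable, SEQUENCED)}%)")
print(
    "Hi-C contacts:",
    f"{pct(counts['hic_contacts'], SEQUENCED)}% of sequenced,",
    f"{pct(counts['hic_contacts'], UNIQUE)}% of unique",
)
```

The same counts show that unique reads are what remains of the alignable pairs after removing PCR and optical duplicates, and that Hi-C contacts are the unique reads left after excluding intra-fragment and low-MAPQ pairs.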
RNA-seq
The cDNA libraries were constructed with the TruSeq™ RNA sample preparation Kit (Illumina, San Diego, CA, USA) using 1 μg of total RNA. Libraries were size-selected for 300 bp target fragments on 2% low range ultra-agarose, followed by PCR amplification for 15 cycles using Phusion DNA polymerase (NEB, Ipswich, MA, USA). After quantification by TBS380 (Picogreen, Waltham, MA, USA), the paired-end library was sequenced on an Illumina NovaSeq 6000 sequencer (Illumina, San Diego, CA, USA) at Majorbio Bio-pharm Technology Co., Ltd (Shanghai, China). Trinity v2.11.0 was used for de novo assembly of transcripts from the raw RNA-seq data18, with default settings apart from the following parameters (--seqType fq --max_memory 200G --left R1.raw.fastq --right R2.raw.fastq --CPU 60 --trimmomatic --output pj_trinity).
Gene structure and function annotation
Gene structure annotation was performed using the MAKER v3.01.03 genome annotation pipeline, following established protocols10,19. The RNA-seq evidence described above was used during annotation to improve exon accuracy. Gene functions were annotated using eggNOG-mapper v2.1.720. A total of 10,908 protein-coding genes were annotated, and BUSCO analysis showed that 92.8% of the evaluated single-copy genes were complete, with 2.3% fragmented and 4.9% missing (Table 3). In scale insects, species with genomes assembled to the chromosome level tend to have fewer annotated genes than those assembled to the scaffold level, which may be due to redundancy in less contiguous assemblies (Table 5).
Table 5.
Statistics for number of protein-coding genes in the genome of Pseudococcus jackbeardsleyi and other scale insects.
Species | Assembly level | No. of protein-coding genes |
---|---|---|
Pseudococcus jackbeardsleyi | Chr. | 10,908 |
Pseudococcus viburni | Scaf. | 23,629 |
Phenacoccus solenopsis | Chr. | 11,880 |
Planococcus citri | Chr. | 18,954 |
Coronaproctus castanopsis | Chr. | 10,542 |
Ferrisia virgata | Scaf. | 47,978 |
Maconellicoccus hirsutus | Scaf. | 21,623 |
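The comparison in Table 5 can be summarized by taking the median gene count per assembly level, as in the sketch below. The grouping function is illustrative, and with only four chromosome-level and three scaffold-level species the result is a descriptive summary, not a statistical test.

```python
from statistics import median

# (species, assembly level, number of protein-coding genes), copied from Table 5.
GENE_COUNTS = [
    ("Pseudococcus jackbeardsleyi", "Chr.", 10_908),
    ("Pseudococcus viburni", "Scaf.", 23_629),
    ("Phenacoccus solenopsis", "Chr.", 11_880),
    ("Planococcus citri", "Chr.", 18_954),
    ("Coronaproctus castanopsis", "Chr.", 10_542),
    ("Ferrisia virgata", "Scaf.", 47_978),
    ("Maconellicoccus hirsutus", "Scaf.", 21_623),
]


def median_by_level(rows):
    """Median gene count for each assembly level."""
    groups = {}
    for _species, level, n_genes in rows:
        groups.setdefault(level, []).append(n_genes)
    return {level: median(values) for level, values in groups.items()}


medians = median_by_level(GENE_COUNTS)
for level, value in medians.items():
    print(f"{level}: median {value:,.0f} protein-coding genes")
```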
Repeats and non-coding RNA (ncRNA) annotation
RepeatMasker v4.0.7 was used to detect repetitive elements in scaffolds longer than 1,000 bp against the Insecta repeats within RepBase Update21 (http://www.girinst.org). For ab initio prediction, RepeatModeler v2.0.1 (http://www.repeatmasker.org/RepeatModeler.html, RRID: SCR_015027) was first used to construct a de novo candidate database of repetitive elements. Among the repetitive sequences, retroelements and DNA transposons accounted for 5.22% and 5.61% of the whole genome, respectively (Table 6). In total, 4,380 satellites and 63,734 simple repeats were identified as tandem repeats (TRs), accounting for 0.19% and 0.86% of the P. jackbeardsleyi genome, respectively (Table 6). For ncRNA annotation, transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs) were predicted by tRNAscan-SE and RNAmmer with default parameters22,23. MicroRNAs (miRNAs) were predicted by aligning the genomic sequence against the Rfam v14.10 database (http://rfam.xfam.org/) using BLASTN24. A total of 219 tRNAs, 83 rRNAs, and 28 miRNAs were predicted in the P. jackbeardsleyi genome (Table 7). A circular diagram illustrating gene count, repeat density and GC content was generated using Circos25 (Fig. 2).
Table 6.
Statistics for repeat elements in the genome of Pseudococcus jackbeardsleyi.
Types | Number | Length (bp) | Percentage (%) |
---|---|---|---|
Retroelements | 23,414 | 17,479,410 | 5.22 |
SINEs | 61 | 9,813 | 0 |
Penelope | 31 | 29,910 | 0.01 |
LINEs | 6,205 | 1,671,215 | 0.5 |
CRE/SLACS | 0 | 0 | 0 |
L2/CR1/Rex | 927 | 328,377 | 0.1 |
R1/LOA/Jockey | 2,848 | 570,516 | 0.17 |
R2/R4/NeSL | 0 | 0 | 0 |
RTE/Bov-B | 136 | 27,046 | 0.01 |
L1/CIN4 | 0 | 0 | 0 |
LTR elements | 17,148 | 15,798,382 | 4.72 |
BEL/Pao | 590 | 733,282 | 0.22 |
Ty1/Copia | 2,294 | 2,695,829 | 0.81 |
Gypsy/DIRS1 | 7,417 | 9,319,913 | 2.78 |
Retroviral | 1,004 | 109,343 | 0.03 |
DNA transposons | 82,721 | 18,792,160 | 5.61 |
hobo-Activator | 29,616 | 6,473,573 | 1.93 |
Tc1-IS630-Pogo | 16,404 | 3,948,341 | 1.18 |
En-Spm | 0 | 0 | 0 |
MuDR-IS905 | 0 | 0 | 0 |
PiggyBac | 190 | 101,936 | 0.03 |
Tourist/Harbinger | 2,530 | 622,867 | 0.19 |
Other (Mirage, P-element, Transib) | 0 | 0 | 0 |
Rolling-circles | 7,887 | 2,366,428 | 0.71 |
Unclassified | 313,462 | 79,851,041 | 23.85 |
Total interspersed repeats | 419,597 | 116,122,611 | 34.68 |
Small RNA | 248 | 228,707 | 0.07 |
Satellites | 4,380 | 624,937 | 0.19 |
Simple repeats | 63,734 | 2,869,998 | 0.86 |
Low complexity | 14,688 | 1,558,602 | 0.47 |
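As a worked check of Table 6, the lengths of the three top-level classes (retroelements, DNA transposons and unclassified repeats) sum exactly to the reported total for interspersed repeats. The sketch below reproduces this sum and the corresponding genome percentage. The value of `ASSEMBLY_BP` is an assumption derived from the rounded 334.818 Mb assembly size, since the exact assembly length in bp is not given in the text.

```python
# Repeat lengths in bp, copied from Table 6.
REPEATS_BP = {
    "retroelements": 17_479_410,
    "dna_transposons": 18_792_160,
    "unclassified": 79_851_041,
}
TOTAL_INTERSPERSED_BP = 116_122_611

# Assumption: assembly length taken from the rounded 334.818 Mb figure.
ASSEMBLY_BP = 334_818_000


def percent_of_genome(length_bp, genome_bp=ASSEMBLY_BP):
    """Share of the assembly (in %) covered by a repeat class."""
    return 100.0 * length_bp / genome_bp


summed_bp = sum(REPEATS_BP.values())
print("sum of top-level classes:", summed_bp)
for name, length_bp in REPEATS_BP.items():
    print(f"{name}: {percent_of_genome(length_bp):.2f}%")
print(f"total interspersed: {percent_of_genome(summed_bp):.2f}%")
```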
Table 7.
Statistics for noncoding RNA genes in the genome of Pseudococcus jackbeardsleyi.
Category | Type | Number |
---|---|---|
Infernal stats | Candidate tRNAs read | 244 |
Infernal stats | Infernal-confirmed tRNAs | 219 |
Infernal stats | Bases scanned by Infernal | 24,287 |
tRNA count | tRNAs decoding Standard 20 AA | 195 |
tRNA count | Selenocysteine tRNAs (TCA) | 0 |
tRNA count | Possible suppressor tRNAs (CTA,TTA,TCA) | 0 |
tRNA count | tRNAs with undetermined/unknown isotypes | 10 |
tRNA count | Predicted pseudogenes | 14 |
tRNA count | Total tRNAs | 219 |
tRNA count | tRNAs with introns | 17 |
rRNA | 5S rRNA | 26 |
rRNA | 5.8S rRNA | 14 |
rRNA | 18S rRNA | 19 |
rRNA | 28S rRNA | 24 |
miRNA | miRNA | 28 |
Fig. 2. Overview of assembled Pseudococcus jackbeardsleyi genome.
The outer layer of coloured blocks is a circular representation of the 5 linkage groups; gene count (histogram), repeat density (heatmap) and GC content (line) are shown from the outer to the inner circle, respectively.
Gene family analysis
Twelve gene families associated with detoxification and chemosensory functions were manually annotated in P. jackbeardsleyi, including cytochrome P450 monooxygenases (P450s), glutathione S-transferases (GSTs), carboxyl/cholinesterases (CCEs), UDP-glycosyltransferases (UGTs), ATP-binding cassette (ABC) transporters, heat shock proteins (HSPs), odorant binding proteins (OBPs), odorant receptors (ORs), gustatory receptors (GRs), ionotropic receptors (IRs), chemosensory proteins (CSPs), and sensory neuron membrane proteins (SNMPs). The BITACORA pipeline (full mode) was used to run HMMER and BLAST searches26. Genes were annotated with a default cutoff E-value of 10e-5 and manually verified based on gene length and conserved domains sourced from the SMART database27. Among the annotated gene models, we identified 83 P450s, 16 GSTs, 150 CCEs, 38 UGTs, 83 ABC transporters and 47 HSPs in the P. jackbeardsleyi genome (Table 8). Additionally, 81 chemosensory genes were identified among the annotated gene models, including 21 OBPs, 5 ORs, 15 GRs, 19 IRs, 9 CSPs and 12 SNMPs (Table 8).
Table 8.
Statistics for 12 gene families of Pseudococcus jackbeardsleyi.
Gene Family | Number of annotated genes Identified | Number of manually annotated genes | Total number of identified genes | Total number of identified genes clustering identical sequences |
---|---|---|---|---|
P450 | 83 | 0 | 83 | 83 |
GST | 16 | 1 | 17 | 17 |
CCE | 150 | 0 | 150 | 150 |
UGT | 38 | 0 | 38 | 38 |
ABC | 83 | 3 | 86 | 86 |
OBP | 21 | 1 | 22 | 22 |
OR | 5 | 16 | 21 | 21 |
GR | 15 | 8 | 23 | 23 |
IR | 19 | 8 | 27 | 27 |
CSP | 9 | 2 | 11 | 11 |
SNMP | 12 | 0 | 12 | 12 |
HSP | 47 | 17 | 64 | 62 |
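The last column of Table 8 reports gene counts after clustering identical sequences (for example, 64 HSPs reduce to 62). A minimal sketch of such a step is shown below, assuming that exact protein sequence identity is the clustering criterion; the gene identifiers and sequences are hypothetical, and the procedure actually used by BITACORA may differ.

```python
def cluster_identical(records):
    """Group sequence IDs whose sequences are identical.

    records is an iterable of (sequence_id, sequence) pairs. Case and a
    trailing stop symbol ('*') are ignored when comparing sequences.
    Returns a list of clusters, each a list of sequence IDs.
    """
    clusters = {}
    for seq_id, seq in records:
        key = seq.strip().upper().rstrip("*")
        clusters.setdefault(key, []).append(seq_id)
    return list(clusters.values())


# Hypothetical protein records: three copies of one sequence plus one distinct.
toy_records = [
    ("hsp_a", "MKTAYIAK"),
    ("hsp_b", "MKTAYIAK*"),
    ("hsp_c", "MSTNPKPQ"),
    ("hsp_d", "mktayiak"),
]

clusters = cluster_identical(toy_records)
print(f"{len(toy_records)} sequences collapse into {len(clusters)} clusters")
```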
Data Records
The dataset is available at the National Center for Biotechnology Information (NCBI) under the genome accession number JAZDXF00000000028. The NCBI BioProject accession number is PRJNA1070360. RNA-seq, Hi-C and Illumina raw reads have been deposited in the Sequence Read Archive (SRA) repository under the accession number SRP48660429. In addition, the annotation files for the genome, ncRNA and repeat content have been deposited in figshare30–32.
Technical Validation
The integrity of the extracted DNA was assessed by agarose gel electrophoresis, and DNA concentration and purity were determined using a NanoDrop and a Qubit 3.0 Fluorometer, with an A260/A280 absorbance ratio of approximately 2.0. The scaffold N50, the length such that scaffolds of this length or longer contain at least half of the genome assembly, reached 67.233 Mb, surpassing that of many other scale insect genomes (Table 1). We evaluated the genome assembly using the sequence identity method, aligning reads from the small-fragment library to the assembled genome with BWA. The BUSCO analysis demonstrated 95.7% completeness (Table 3), supporting the high quality of the genome assembly. The percentage of duplicated single-copy genes assessed by BUSCO was low at 2.3% (Table 3), indicating that duplication was not a significant issue in the assembly. Furthermore, BlobTools was used to screen the assembly for potential contamination, and no indications of contamination were found. These results indicate that we obtained a high-quality genome of P. jackbeardsleyi.
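The N50 statistic described above can be computed as in the following sketch. The function name and the toy lengths are illustrative and are not the scaffold lengths of this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    The N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example: total length 30, so the N50 is reached once 15 is covered.
toy_lengths = [10, 8, 6, 4, 2]
toy_n50 = n50(toy_lengths)
print("toy N50:", toy_n50)
```

Because the statistic depends only on the sorted lengths, the same function applies to contigs and scaffolds alike.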
Acknowledgements
This research was supported by the National Key Research and Development Programme of China (2021YFF0601901), Beijing Natural Science Foundation (6244049), Hainan Natural Science Foundation (323MS065) and the China Agriculture Research System of MOF and MARA.
Author contributions
Shaokun Guo conceived and designed the study; Shaokun Guo and Bo Liu conducted the molecular work; Guoping Zhan and Qingying Zhao provided the insect pictures; Shaokun Guo, Guoping Zhan and Zhihong Li discussed the results; Shaokun Guo analyzed the data and wrote the manuscript.
Code availability
The data analyses were performed according to the manuals and protocols provided by the developers of the corresponding bioinformatics tools. All software and code used in this work are publicly available, with the corresponding versions indicated in Methods.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Garcia Morales, M. et al. ScaleNet: a literature-based model of scale insect biology and systematics. Database 2016, bav118 (2016).
- 2. Miller, D., Miller, G. & Watson, G. Invasive species of mealybugs (Hemiptera: Pseudococcidae) and their threat to US agriculture. P. Entomol. Soc. Wash. 104, 825–836 (2002).
- 3. Bellotti, A. C. et al. Cassava pests in Latin America, Africa and Asia. (Centro Internacional de Agricultura Tropical (CIAT), 2011).
- 4. Meyer, J. B., Kasdorf, G. G. F., Nel, L. H. & Pietersen, G. Transmission of activated-episomal Banana streak OL (badna) virus (BSOLV) to cv. Williams Banana (Musa sp.) by three mealybug species. Plant Dis. 92, 1158–1163 (2008). 10.1094/PDIS-92-8-1158
- 5. Williams, D. The distribution of the neotropical mealybug Pseudococcus elisae Borchsenius in the Pacific region and Southern Asia (Hem.-Hom., Pseudococcidae). Entomologist’s Monthly Magazine 124, 123–124 (1988).
- 6. CABI. Pseudococcus jackbeardsleyi (Jack Beardsley mealybug). CABI Compendium, https://www.cabi.org/cpc/datasheet/45087 (2021).
- 7. Puig, A. S., Wurzel, S., Suarez, S., Marelli, J. P. & Niogret, J. Mealybug (Hemiptera: Pseudococcidae) species associated with cacao mild mosaic virus and evidence of virus acquisition. Insects 12, 994 (2021). 10.3390/insects12110994
- 8. Williams, D. J. & Watson, G. W. Scale insects of the tropical South Pacific region. Part 2. Mealybugs (Pseudococcidae). (CAB International, 1988).
- 9. Guo, S. et al. Chromosome-level genome assembly of an important wolfberry fruit fly (Neoceratitis asiatica Becker). Sci. Data 10, 675 (2023). 10.1038/s41597-023-02601-5
- 10. Guo, S. et al. Chromosome-level assembly of the melon thrips genome yields insights into evolution of a sap-sucking lifestyle and pesticide resistance. Mol. Ecol. Resour. 20, 1110–1125 (2020). 10.1111/1755-0998.13189
- 11. Hu, J., Fan, J. P., Sun, Z. Y. & Liu, S. L. NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics 36, 2253–2255 (2020). 10.1093/bioinformatics/btz891
- 12. Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011). 10.1093/bioinformatics/btr011
- 13. Vurture, G. W. et al. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics 33, 2202–2204 (2017). 10.1093/bioinformatics/btx153
- 14. Manni, M., Berkeley, M. R., Seppey, M., Simao, F. A. & Zdobnov, E. M. BUSCO update: Novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 38, 4647–4654 (2021). 10.1093/molbev/msab199
- 15. Belaghzal, H., Dekker, J. & Gibcus, J. H. Hi-C 2.0: An optimized Hi-C procedure for high-resolution genome-wide mapping of chromosome conformation. Methods 123, 56–65 (2017). 10.1016/j.ymeth.2017.04.004
- 16. Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017). 10.1126/science.aal3327
- 17. Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016). 10.1016/j.cels.2016.07.002
- 18. Grabherr, M. G. et al. Trinity: reconstructing a full-length transcriptome without a genome from RNA-Seq data. Nat. Biotechnol. 29, 644–652 (2011). 10.1038/nbt.1883
- 19. Cantarel, B. L. et al. MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 18, 188–196 (2008). 10.1101/gr.6743907
- 20. Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021). 10.1093/molbev/msab293
- 21. Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinf. 25, unit 4.10 (2009).
- 22. Lagesen, K. et al. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 35, 3100–3108 (2007). 10.1093/nar/gkm160
- 23. Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964 (1997). 10.1093/nar/25.5.955
- 24. Kalvari, I. et al. Rfam 14: expanded coverage of metagenomic, viral and microRNA families. Nucleic Acids Res. 49, D192–D200 (2020). 10.1093/nar/gkaa1047
- 25. Krzywinski, M. et al. Circos: An information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009). 10.1101/gr.092759.109
- 26. Vizueta, J., Sanchez-Gracia, A. & Rozas, J. BITACORA: A comprehensive tool for the identification and annotation of gene families in genome assemblies. Mol. Ecol. Resour. 20, 1445–1452 (2020). 10.1111/1755-0998.13202
- 27. Letunic, I., Khedkar, S. & Bork, P. SMART: recent updates, new developments and status in 2020. Nucleic Acids Res. 49, D458–D460 (2020). 10.1093/nar/gkaa937
- 28. NCBI Assembly https://identifiers.org/ncbi/insdc.gca:GCA_038380155.1 (2024).
- 29. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP486604 (2024).
- 30. Guo, S. Pseudococcus jackbeardsleyi genome annotation. figshare 10.6084/m9.figshare.25622025.v1 (2024).
- 31. Guo, S. Pseudococcus jackbeardsleyi noncoding RNA prediction. figshare 10.6084/m9.figshare.26268106.v1 (2024).
- 32. Guo, S. Pseudococcus jackbeardsleyi repeat content annotation. figshare 10.6084/m9.figshare.26268229.v1 (2024).