Abstract
The soft tick family Argasidae contains vectors of medical and veterinary importance, but few molecular resources are available compared to hard ticks (Ixodidae). One example is Ornithodoros turicata, a recognized vector of Borrelia turicatae, causal agent of human relapsing fever, and a putative vector of African swine fever virus. To address the current lack of molecular resources for the Argasidae, we generated a chromosome-level genome assembly for O. turicata using PacBio sequencing in conjunction with an Illumina Hi-C library. The resulting reference genome is 1.1 Gb in total length and was assembled into 10 chromosomes and 368 unplaced scaffolds, with a contig N50 of 2.7 Mb and a QV score of 43.58. The assembly is of high quality, with a 97.8% BUSCO completeness score, and contains 36,149 annotated genes. Repeats accounted for 33% of the genome and were masked. The generation of the first soft tick reference genome establishes a basis for future studies to utilize genomic tools to support vector-borne disease management.
Keywords: reference genome, soft tick, relapsing fever tick, genome assembly
Introduction
Argasidae is estimated to consist of approximately 200 species (Nicholson et al. 2019). Recent studies of Argasidae phylogenetics have updated and reorganized Ornithodoros species (Mans et al. 2019), including expansion of Afrotropical Ornithodoros species (Bakkes et al. 2018) and elevation of the Ornithodoros subgenus Alectorobius to genus (Kneubehl et al. 2022). Species within the Ornithodoros genus are generally characterized as long-lived nidicolous rapid feeders taking multiple blood-meals over their lifetime. Ornithodoros turicata (Dugès) typically inhabits burrows, outcroppings, and caves, as well as spaces under buildings and other cavity environments where it has access to a broad range of hosts including mammals, birds, and reptiles (Cooley and Kohls 1944; Balasubramanian et al. 2024; Mays Maestas et al. 2025). These ticks will complete blood-feeding in 15–60 min (Beck et al. 1986; Kim et al. 2017) and adults can survive many years without a blood meal (Beck et al. 1986). O. turicata is a vector of Borrelia turicatae, a pathogen causing human relapsing fever (Brumpt 1933), and a putative vector of African swine fever virus.
Two geographically distinct populations of O. turicata exist in the United States. A western population is distributed across California to Texas and into Mexico, while a southeastern population occurs in Florida (Donaldson et al. 2016; Botero-Canola et al. 2024). A suspected ecological barrier exists between Louisiana, Mississippi, and Alabama where O. turicata has not been identified. Reproductive, genetic, and biological characterizations including vector competence studies of B. turicatae indicate that the Florida population is likely a subspecies and has been designated as O. turicata americanus (Beck et al. 1986; Krishnavajhala et al. 2018; Kneubehl et al. 2022; Botero-Canola et al. 2024).
There is currently no complete genome sequenced from Argasidae. The genome size of O. turicata was expected to be 1,090 Mbp based on flow cytometry of synganglion nuclei (Geraci et al. 2007). Comparisons to Ornithodoros moubata suggest that the haploid chromosome number of O. turicata would be 10 (2n = 20) with an XX/XY sex determination system (Ashman et al. 2014). In our current study, the genome of O. turicata was sequenced as part of the USDA-ARS Ag100Pest Initiative in collaboration with scientists from the Baylor College of Medicine and Texas A&M AgriLife Research. PacBio long reads were combined with Illumina Hi-C data to produce a chromosome-level O. turicata reference genome. The assembly has a size of 1.1 Gb and a BUSCO score of 97.8%. This soft tick reference assembly will be beneficial for vector-borne disease control efforts by providing a molecular resource for future investigations. Comparative vector population genomics and insight into the molecular mechanisms of vector competence are now possible.
Materials and methods
Sample source and sequencing
O. turicata ticks originated from specimens collected in 1992 from an infested cave in Travis County, Texas, and maintained in colony at the Tick Research Laboratory at Texas A&M AgriLife Research. Ticks were periodically blood-fed on chicken under IACUC approved AUP #2020-0301. Between feedings, ticks were kept in screened glass vials placed in paper-covered trays to subdue direct light, at incubator conditions of 80–85% RH, 25 ± 3°C, and a 14:10 light:dark cycle. Ticks used in this project were estimated to be in the 9th to 10th colony generation. High molecular weight DNA was extracted from a single mated female using the MagAttract HMW DNA Kit (Qiagen, Hilden, Germany) following the fresh or frozen tissue protocol. DNA was purified using 2:1 polyethylene glycol and solid-phase reversible immobilization beads. To quantify DNA, a Qubit (Thermo Fisher Scientific, Waltham, Massachusetts, USA) with a dsDNA BR assay kit was used. Additionally, a DS-11 Spectrophotometer and Fluorometer (DeNovix Inc, Wilmington, Delaware, USA) was used for fluorometry and purity assessment. The resulting high molecular weight DNA sample was needle sheared 10× with a 26-gauge needle and prepared for PacBio sequencing using the SMRTbell Express Template Prep Kit 2.0 for Continuous Long Read (CLR) generation. The library was size-selected on a BluePippin (Sage Science Inc, Beverly, Massachusetts, USA) with a 10 kb cutoff. Sequencing was performed on a Sequel II System using Binding Kit v2.0, Sequencing kit v2.0, and SMRT Cell 8 M. To target CLR reads, the library was sequenced using 20 h of movie time on one SMRT Cell.
For error-correction, an aliquot of the same DNA utilized for the CLR library was prepared for Illumina sequencing with the TruSeq Nano DNA Low Throughput Library Prep Kit with IDT for Illumina TruSeq DNA UD Indexes using standard protocols for a library with a 350 bp insert size and 50 bp standard deviation. Paired-end 150 bp sequence reads were generated on a NovaSeq 6000 System (Illumina) at the HudsonAlpha Genome Sequencing Center (Huntsville, AL, USA). Whole tissue from a second single female adult tick was crosslinked using the Arima Hi-C low input protocol, and proximity ligation was performed using the Arima Genomics Hi-C Kit. After proximity ligation, the DNA was sheared using a Diagenode Bioruptor using a 30/90 On/Off cycle for 10 min and then size-selected to enrich for DNA fragments of 200–600 bp. An Illumina library was prepared from the sheared and size-selected DNA using the Swift Biosciences Accel-NGS 2S Plus DNA Library Kit. The final Illumina Hi-C library was sequenced on a NovaSeq 6000 System (Illumina) at the HudsonAlpha Genome Sequencing Center (Huntsville, AL, USA).
RNA sequencing data were generated for annotation. RNA was extracted from whole, individual males and females either unfed or nine days post-feeding (four ticks per sex per fed state, total of 16 ticks) using Zymo's DirectZol kit following the manufacturer's instructions with DNase I digestion. Total RNA was submitted to the Texas A&M Genomics and Bioinformatics Service for sequencing. PolyA RNA was selected using NEXTFLEX polyA beads 2.0 and a PerkinElmer NEXTFLEX rapid directional RNAseq kit 2.0 was used to generate the libraries. Prepared libraries were sequenced on an Illumina NovaSeq 6000 using the SP 2 × 100 v1.5 flow cell.
Sex-specific genomic features
To identify sex-specific genomic features, DNA was extracted from three male and three female ticks from the Travis County colony and prepared for Illumina short read sequencing. Individuals were homogenized in 1.5-mL tubes in liquid nitrogen baths with liquid nitrogen-cooled pestles. DNA was isolated using the E.Z.N.A. Insect DNA Kit (Omega Bio-tek, Inc., Norcross, GA, USA). Following isolation, genomic DNA was evaluated using the NanoDrop 2000 Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). Illumina libraries were prepared using the FS DNA Library Prep Kit and NEBNext Multiplex Oligos for Illumina (New England Biolabs, Ipswich, MA, USA). Size selection and cleanup were performed with 0.8× SPRIselect beads (Beckman Coulter, Inc., Brea, CA, USA). The prepared libraries were sequenced at the USDA-ARS Veterinary Pest Genetics Research Unit in Kerrville, Texas, on a NextSeq 2000 system with a P3 flow cell and a 200 cycle reagent cartridge (Illumina, Inc., San Diego, CA, USA).
Assembly
The genome assembly began with raw CLR subread BAM files, which were converted to FASTQ format using BamTools v2.5.1 (Barnett et al. 2011). These reads were input into the FALCON assembler (Chin et al. 2016) using the pb-assembly conda environment v.0.0.8.1 (Pacific Biosciences; default parameters). Primary and alternate contigs were created with FALCON-Unzip and one round of haplotype-aware polishing by Arrow (Pacific Biosciences). To remove duplicated sequences at the ends of primary contigs, purge_dups v1.2.5 was used and cutoffs were estimated from an automatically generated histogram of k-mers (Guan et al. 2020). Purged primary sequences were added to the alternate haplotype contigs, after which duplicates were purged again. Scaffolding was completed through the Arima Genomics mapping pipeline (Ghurye et al. 2019). In addition, Hi-C data were separately aligned to the scaffolds using the -5SP chimeric read alignment option of bwa-mem v2.2.1 (Li 2013) and put into YaHS for scaffolding (Zhou et al. 2023) using the “no contig error correcting” option. Data were then uploaded to Juicebox v1.11.08 (Durand et al. 2016) and a manually curated contact map was created to relate the primary scaffolds to the aligned Hi-C reads (Fig. 1). With the Hi-C contact map, mis-joins were manually broken, inverted contigs were re-oriented, and contigs were joined into scaffolds. Finally, because the manually curated scaffolds still retained visual signs of duplicated haplotypes, the assembly was converted back to FASTA and a final purge_dups was performed. Arima and bwa mapping were done to confirm mapping results and Hi-C scaffolding.
Fig. 1.
Hi-C contact map. Generated from the final genome assembly in Juicebox.
To assemble and polish the mitochondrial genome, a custom workflow called mitoPolishCLR (Formenti and Stahlke 2022) was used, which was modified from mitoVGP (Formenti et al. 2021) and improved for arthropods. Low-quality reads were reduced by filtering raw CLR reads and removing those longer than 1.5 times the 14,458 bp mitochondrial genome. Filtered reads were retained if they shared homology to the mitochondrial genome according to BLASTn searches (default parameters; Camacho et al. 2009). These were assembled with parameters for organelle assemblies using Canu v.2.2 (Koren et al. 2017). Contig polishing was completed with FreeBayes and evaluated using QV scores from Merqury (Rhie et al. 2020) and the Meryl 31-mer database of Illumina short reads (Miller et al. 2008). The mitochondrial assembly was linearized after a final round of trimming and annotated with MitoFinder (Allio et al. 2020).
After scaffolding, the primary, alternate, and mitochondrial assemblies were merged and polished with one additional round of Arrow (Pacific Biosciences), followed by two rounds of variant identification using FreeBayes v1.0.2 (Garrison and Marth 2012). FreeBayes-identified variants were filtered via Merfin (Formenti et al. 2022) before being incorporated into a new polished consensus. The second round of Arrow-identified variants was also filtered via Merfin, as the PacBio CLR and Illumina reads came from the same individual. Genome assembly quality scores were generated via Merqury before, between, and after each round of polishing to assess quality improvements. The polishCLR pipeline used here is a reproducible Nextflow workflow (Di Tommaso et al. 2017) inspired by the Vertebrate Genome Project assembly pipeline (Rhie et al. 2021).
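As a reading aid for the quality scores mentioned above, the following minimal sketch converts a Phred-scaled consensus quality value (QV) into a per-base error probability, using the standard relation QV = -10 log10(E). The function names are illustrative and are not part of Merqury or the polishCLR pipeline; the value 43.58 is the final QV reported for this assembly.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Convert a per-base error probability back to a Phred-scaled quality value."""
    return -10 * math.log10(error_rate)


if __name__ == "__main__":
    qv = 43.58  # final assembly QV reported in this study
    err = qv_to_error_rate(qv)
    print(f"QV {qv}: about {err:.2e} errors per base "
          f"(roughly one error every {1 / err:,.0f} bp)")
```

Under this relation, a QV above 40 corresponds to fewer than one expected consensus error per 10 kb.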
To screen for biological contamination such as microbial symbionts, contigs were screened using the BlobToolkit using default settings (Challis et al. 2020). Identification of contigs was completed using Diamond (Buchfink et al. 2015). The best-sum hits for each contig were parsed using a python script (https://github.com/sheinasim/FindContaminantsFromBlob) which removes non-arthropod contigs. Final assembly completeness was assessed by BUSCO v.5.2.2 with the arachnida_odb10 gene set of 2,934 orthologs (Simão et al. 2015; Seppey et al. 2019; Manni et al. 2021), and composition evaluated by a 21-mer k-mer spectra of the paired-end Illumina reads and assembly using KAT v.2.4.2 (Mapleson et al. 2017). Genome size, repetitive content, and heterozygosity were estimated from 21-mer histograms of the paired-end Illumina reads generated with jellyfish (Marçais and Kingsford 2011) and modeled in GenomeScope 2.0 (Ranallo-Benavidez et al. 2020).
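To illustrate what a k-mer histogram is, the sketch below counts canonical k-mers in a handful of reads and tabulates how many distinct k-mers occur at each multiplicity. This is a toy, in-memory stand-in for what dedicated counters such as jellyfish do at scale, not the command used in this study; counting canonical k-mers (a k-mer and its reverse complement treated as one) is assumed here because it is a common choice for unstranded genomic reads.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_histogram(reads, k: int = 21) -> Counter:
    """Map multiplicity -> number of distinct canonical k-mers with that count."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return Counter(counts.values())


if __name__ == "__main__":
    toy_reads = ["ACGTACGTTGCA", "TGCAACGTACGT"]  # hypothetical reads
    for multiplicity, n_kmers in sorted(kmer_histogram(toy_reads, k=5).items()):
        print(multiplicity, n_kmers)
```

A histogram of this form (multiplicity versus number of distinct k-mers) is the kind of input that genome profiling models fit to estimate genome size and heterozygosity.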
Annotation
Scaffolds were submitted to the National Center for Biotechnology Information (NCBI) for automated eukaryotic genome annotation using the NCBI Eukaryotic Annotation Pipeline (Thibaud-Nissen et al. 2016). The NCBI Eukaryotic Annotation Pipeline integrates ab initio and evidence-based processes to produce the reference sequence (RefSeq) gene models.
Analysis of sex-specific short read sequencing
Genomic paired-end Illumina reads were mapped to the reference assembly with bwa-mem2 v2.2.1 (Vasimuddin et al. 2019) or minimap2 v2.28-r1209 (Li 2018). Calculation of mapping statistics and collection of unmapped read pairs were done with SAMtools v1.17 (Danecek et al. 2021). Per-base read depth was calculated using SAMtools and variants were called with BCFtools v1.20 (Danecek et al. 2021). Per-scaffold average depth and variant count were calculated with custom bash scripts. Mean normalized scaffold depth and per-base variant frequency were tested for scaffold sex-bias by ANOVA then plotted using R v4.4.1 and the tidyverse package (Wickham et al. 2019). Depth along scaffold position was calculated in 1 Mb windows and plotted using a custom Python script. Scripts used in analysis are available at github.com/dluecke/Ornithodoros_turicata_G3.
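The windowed depth calculation described above can be sketched as follows. This is not the authors' script (which is available in the repository cited above); it is a minimal illustration that assumes three-column, tab-separated per-base depth records (sequence name, 1-based position, depth) of the kind produced by `samtools depth`, ideally with every position reported so that zero-depth sites count toward the mean. Function names are hypothetical.

```python
from collections import defaultdict


def windowed_mean_depth(depth_lines, window: int = 1_000_000) -> dict:
    """Average per-base depth in fixed-size windows.

    Returns a dict keyed by (sequence name, window index), where window
    index 0 covers positions 1..window.
    """
    sums = defaultdict(int)
    counts = defaultdict(int)
    for line in depth_lines:
        if not line.strip():
            continue
        chrom, pos, depth = line.rstrip("\n").split("\t")[:3]
        key = (chrom, (int(pos) - 1) // window)
        sums[key] += int(depth)
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def normalize(window_means: dict, library_mean_depth: float) -> dict:
    """Divide each window mean by the library-wide mean depth."""
    return {key: value / library_mean_depth for key, value in window_means.items()}


if __name__ == "__main__":
    toy = ["chr1\t1\t10", "chr1\t2\t20", "chr1\t1000001\t40"]  # hypothetical records
    means = windowed_mean_depth(toy)
    print(normalize(means, library_mean_depth=20.0))
```

Normalizing by each library's mean depth makes libraries with different sequencing yields directly comparable along a chromosome.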
Unmapped reads were assembled with ABySS v2.3.5 (Jackman et al. 2017). The low mapping rate for library F1 (Table 2) suggests contamination in these reads, so separate assemblies were generated from F1 reads using k-mer size 96 (assembly F1-k96) and from F2 and F3 reads using k-mer size 64 (F2,3-k64). All unmapped male reads were combined and assembled with k-mer size 64 (All Male-k64). Assemblies were screened for contamination and cleaned with FCS-GX (Astashyn et al. 2024) using NCBI:txid34597. Screened All Male-k64 contigs were further parsed to retain those longer than 1 kb with primary taxon assignment of “anml:insect” (the lowest FCS grouping including ticks). Contigs passing this filtering were then searched against the cleaned F2,3-k64 contigs and the full ASM3712646v1 genome reference sequence using blastn (Altschul et al. 1997) from BLAST+ v2.14.1 (Sayers et al. 2022). All Male-k64 contigs without matches to either reference with a bit score above 100 were retained as candidate male-specific sequences and characterized using web-based BLASTn (blast.ncbi.nlm.nih.gov/Blast.cgi). All computational steps except minimap2, Python and R processing/plotting, and web-based BLASTn were carried out on the USDA SCINet Ceres HPC cluster.
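The candidate filtering logic described above (length above 1 kb, primary taxon “anml:insect”, and no BLAST hit with a bit score above 100) can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: contigs are represented as hypothetical dictionaries with `id`, `length`, and `taxon` keys, and BLAST results are assumed to be in the default 12-column tabular format, in which the query ID is the first column and the bit score is the twelfth.

```python
def queries_with_hits(blast_tab_lines, min_bitscore: float = 100.0) -> set:
    """Collect query IDs having at least one hit with bit score above the threshold."""
    hits = set()
    for line in blast_tab_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if float(fields[11]) > min_bitscore:
            hits.add(fields[0])
    return hits


def candidate_male_specific(contigs, blast_tab_lines, min_length: int = 1000,
                            taxon: str = "anml:insect",
                            min_bitscore: float = 100.0) -> list:
    """Return IDs of contigs passing the length, taxon, and no-match filters."""
    matched = queries_with_hits(blast_tab_lines, min_bitscore)
    return [
        c["id"] for c in contigs
        if c["length"] > min_length and c["taxon"] == taxon and c["id"] not in matched
    ]


if __name__ == "__main__":
    # Hypothetical contigs and BLAST rows, for illustration only.
    contigs = [
        {"id": "c1", "length": 1500, "taxon": "anml:insect"},
        {"id": "c2", "length": 900, "taxon": "anml:insect"},
        {"id": "c3", "length": 1200, "taxon": "anml:insect"},
    ]
    blast_rows = ["c3\tref\t99.0\t500\t1\t0\t1\t500\t1\t500\t1e-100\t900.0"]
    print(candidate_male_specific(contigs, blast_rows))
```

In practice the BLAST rows from both searches (female unmapped-read contigs and the reference genome) would be concatenated before filtering, so that a match to either reference excludes a contig.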
Table 1.
Feature counts of genes annotated from Ornithodoros turicata genome assembly.
| Feature | Count | Mean length (bp) | Median length (bp) | Min length (bp) | Max length (bp) |
|---|---|---|---|---|---|
| Genes | 35,518 | 16,636 | 3,053 | 53 | 1,123,122 |
| All transcripts | 51,512 | 2,177 | 1,488 | 53 | 60,786 |
| mRNA | 35,840 | 2,773 | 2,067 | 180 | 60,786 |
| misc_RNA | 1,029 | 2,687 | 1,839 | 100 | 22,327 |
| tRNA | 4,668 | 74 | 73 | 53 | 100 |
| lncRNA | 8,771 | 1,082 | 620 | 89 | 21,363 |
| snoRNA | 32 | 190 | 216 | 75 | 219 |
| snRNA | 1,123 | 135 | 122 | 103 | 200 |
| rRNA | 48 | 189 | 119 | 118 | 1,817 |
| Single exon | 4,299 | 1,571 | 1,146 | 246 | 9,465 |
| Coding | 4,299 | 1,571 | 1,146 | 246 | 9,465 |
| CDSs | 35,853 | 1,807 | 1,326 | 156 | 59,664 |
| Exons | 203,364 | 324 | 146 | 4 | 28,467 |
| In coding | 177,015 | 322 | 147 | 4 | 28,467 |
| In non-coding | 30,159 | 317 | 139 | 10 | 20,237 |
| Introns | 171,507 | 3,876 | 646 | 30 | 587,567 |
| In coding | 153,517 | 3,813 | 686 | 30 | 587,567 |
| In non-coding | 21,506 | 4,216 | 424 | 30 | 497,338 |
Sex chromosome X homology
To evaluate homology between the putative O. turicata X chromosome and X chromosomes in other tick lineages, the positions of single-copy ortholog (SCO) benchmarks were compared with recent assemblies of Ixodes scapularis (Nuss et al. 2023) and Rhipicephalus microplus (Tidwell et al. 2024) that have identified X chromosomes. For each assembly, BUSCO v5.7.1 (Simão et al. 2015) was run in genome mode with the arachnida_odb10 lineage. For each single-copy complete benchmark gene found in the O. turicata genome, the chromosome locations were recorded in all three assemblies, then counts of shared SCOs were tallied across chromosome pairs between O. turicata and each reference. Scripts for BUSCO runs and pairwise overlap counting are available at github.com/dluecke/Ornithodoros_turicata_G3.
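The pairwise tally described above can be sketched as follows. This is not the authors' script (see the repository cited above) but a minimal illustration. It assumes BUSCO's tab-separated full table, with the BUSCO ID, status, and sequence name as the first three columns and single-copy genes marked `Complete`; that layout should be checked against the BUSCO version in use. Function names are hypothetical.

```python
from collections import Counter


def single_copy_locations(full_table_lines) -> dict:
    """Map BUSCO ID -> sequence name for complete single-copy genes."""
    locations = {}
    for line in full_table_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3 and fields[1] == "Complete":
            locations[fields[0]] = fields[2]
    return locations


def shared_sco_counts(focal: dict, reference: dict) -> Counter:
    """Count shared single-copy orthologs for each (focal, reference) chromosome pair."""
    counts = Counter()
    for busco_id, focal_chrom in focal.items():
        reference_chrom = reference.get(busco_id)
        if reference_chrom is not None:
            counts[(focal_chrom, reference_chrom)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical BUSCO rows for two assemblies.
    focal_rows = ["b1\tComplete\tchrX\t1\t100", "b2\tComplete\tchr2\t5\t90"]
    reference_rows = ["b1\tComplete\trefX\t10\t110", "b2\tDuplicated\tref3\t1\t50"]
    focal = single_copy_locations(focal_rows)
    reference = single_copy_locations(reference_rows)
    print(shared_sco_counts(focal, reference))
```

Dividing the count for a chromosome pair by the number of single-copy genes on the focal chromosome gives the proportion of markers shared with that reference chromosome.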
Results and discussion
Genome assembly
The final O. turicata genome assembly resulted in a genome size of 1.1 Gb and 10 chromosomes. The total number of PacBio bases was 125,673,900,101, for an estimated genome-wide average coverage of 104× of the nuclear genome, assuming a genome size of 1.2 Gb from flow cytometry (Geraci et al. 2007). The number of Illumina read pairs was 442,251,458 for polishing and 250,817,500 for Hi-C. The resulting 378 scaffolds had an N50 of 84.4 Mb and contain 1,223 contigs with an N50 of 2.7 Mb and a coverage of 30×. The genome assembly quality QV score from Merqury was 43.58, and comp-kmer (comparison k-mer) plots are included as Supplementary Figs. 1–5. The final BUSCO analysis score, after removing duplicate haplotypes from the contig assembly with purge_dups, was 97.8%, with 92.9% single-copy, 4.8% duplicated, 0.2% fragmented, and 2.0% missing of the expected 2,934 Arachnida odb10 database orthologs. Before removing duplicates, FALCON and FALCON-Unzip yielded a haplotype-resolved primary assembly totaling 1,474.594 Mb in 1,636 contigs, with an N50/L50 of 2.473 Mb/137 contigs. The BUSCO score for this was C:95.1% [S:57.8%, D:37.3%], F:2.3%, M:2.6%. The k-mer based genome size estimate from GenomeScope2 was 1.04 Gb (Fig. 2a). This, combined with the k-mer plot (Fig. 2b), suggests that purge_dups improved the assembly overall, but there were still some duplicated contigs in this assembly. For annotation, 2 billion Illumina RNAseq reads were obtained (BioProject: PRJNA837009). These reads were aligned to the genome after masking with WindowMasker, which masked 33.33% of the genome. The NCBI RefSeq annotation contains 36,149 genes with 22,416 protein coding genes (Table 1). The mitogenome for O. turicata was removed from the genome assembly and published separately in a study by Kneubehl et al. (2022).
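Two of the summary statistics above follow from simple arithmetic and can be sketched as below: mean sequencing coverage (total bases divided by genome size) and N50/L50 (the length of the contig at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half the assembly, and the number of contigs needed to get there). The function names and the toy contig lengths are illustrative only; the PacBio base count and the 1.2 Gb genome size are the values stated in the text.

```python
def mean_coverage(total_bases: int, genome_size: float) -> float:
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size


def n50_l50(lengths) -> tuple:
    """Return (N50, L50) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for rank, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, rank
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    coverage = mean_coverage(125_673_900_101, 1.2e9)
    print(f"PacBio coverage: about {coverage:.1f}x")
    print(n50_l50([10, 8, 6, 4, 2]))  # toy contig lengths, not from this assembly
```

Because N50 depends on total assembly length, purging duplicated haplotigs changes both the numerator and the ranking, which is why N50 values before and after purging are not directly comparable.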
Fig. 2.
Genome size analyses for Ornithodoros turicata. a) GenomeScope Profile results and b) k-mer analysis used to estimate genome size.
Sex-specific short read analysis
Sex-specific short read libraries were used in an attempt to identify the sex chromosome for O. turicata. Genomic reads from each library were mapped separately to the reference assembly to characterize sex differences in genome content. Library and mapping statistics are reported in Table 2. All libraries map at very high rates (>96%) except F1, which had less than 80% of its reads map to the female assembly, although this library still generated high mapping depth due to high total read count. Mean depth is above 100 for all libraries except F2, and generally higher for the male libraries. Average depth across the chromosome-level scaffolds ranged from 23.05 to 32.28 for female libraries and from 38.07 to 41.28 for male libraries. No chromosome shows significant sex bias in average normalized read depth, but NC_088201.1 has the largest excess of female reads (Fig. 3a). Read depth along this chromosome shows two distinct regions: for the first ∼90 Mb, male reads have only around half the depth of female reads, while past this region, libraries from both sexes map at similar rates (Fig. 3b).
Table 2.
Illumina read mapping statistics of Ornithodoros turicata sex-specific reads.
| Sex | Library | Total reads | Mapped reads | Map rate | Mean depth | Mean depth chromosomes |
|---|---|---|---|---|---|---|
| Female | F1 | 540,055,895 | 426,290,784 | 78.93% | 103.17 | 32.28 |
| | F2 | 299,418,327 | 290,842,684 | 97.14% | 65.51 | 23.05 |
| | F3 | 407,691,073 | 395,012,070 | 96.89% | 102.78 | 32.28 |
| Male | M1 | 474,347,128 | 460,265,749 | 97.03% | 125.64 | 39.07 |
| | M2 | 498,927,232 | 486,291,670 | 97.47% | 124.84 | 41.28 |
| | M3 | 456,097,890 | 444,506,836 | 97.46% | 107.17 | 38.07 |
Fig. 3.
Short read analysis. a) Mean chromosomal read depths normalized by mean library depth for the 10 chromosomes in the ASM3712646v1 assembly. Female libraries on the left and male libraries on the right, with sex replicate indicated by shape. Chromosome 1 (NC_088201.1) has the lowest male-to-female depth ratio of all chromosomes. No chromosomes show significant sex-biased depth, by ANOVA. b) Normalized read depth for 1 Mb windows by position along chromosome 1 (NC_088201.1). Each line is a library. From position 0 to ∼90 Mb all male libraries have depth ∼half of the female libraries, indicating a single copy of the region in males vs 2 copies in females. Beyond position ∼90 Mb male and female libraries have similar depth, indicating both sexes have the same copy number. c) Average frequency of variants (SNPs and indels) for ASM3712646v1 chromosomes and chromosome regions. Female libraries on the left and male libraries on the right, with sex replicate indicated by shape. For chromosome 1 the whole-chromosome average is shown at far left, then averages for the two regions split based on read depth in dotted boxes with lines to the chromosome positions in panel b indicating region breakpoints, then averages for chromosomes 2–10. No chromosomes/regions show significant sex bias after correcting for multiple tests, by ANOVA.
Frequency of variant sites per chromosome ranges from 0.46 to 2.48%. Only scaffold NC_088209.1 shows significant sex divergence, with male libraries more diverged from the reference than females (Fig. 3c). Variant density on scaffold NC_088201.1 shows no apparent skew between the sexes when averaged along its entire length. However, when the scaffold was split into regions based on the sex difference in read depth, a difference in variant density was found (boxed regions within Fig. 3c). Males had fewer variant sites in the previously mentioned 90 Mb region (6.6 per kb) while females had more (10.8 per kb). Beyond the first 90 Mb, the pattern switched, with more male variants (14.6 per kb) and fewer female variants (12.8 per kb). However, the difference between sexes in variant frequency in these regions is similar in magnitude to that seen on other chromosomes.
Assembly of unmapped reads
To investigate if a male sex chromosome was present in the male sequencing libraries, sex-specific unmapped read assemblies were generated, and their statistics, before and after foreign contamination screening by FCS-GX, are reported in Table 3. Most assembled sequences are very short, as expected from a short read de novo approach without deep, targeted sequencing. However, some large contigs that either clear taxa-based screening or match known Ornithodoros commensals (Gomard et al. 2021) were produced, indicating the ability to reconstruct DNA inputs to the Illumina libraries that are not reflected in the reference assembly. The assembly produced from male unmapped reads is of highest quality based on number of contigs and length statistics (mean, longest, N50), which is expected given these were the largest set of uncontaminated unmapped reads. FCS-GX screening removed the majority of contigs from all assemblies. Longer contigs were removed at a higher rate, presumably because the program had an easier time assigning them to non-tick lineages.
Table 3.
Statistics for unmapped read assemblies.
| Assembly | Libraries | k-mer size | n sequences (n > 1 kb) | Total length | Max length | N50 |
|---|---|---|---|---|---|---|
| F1-k96 | F1 | 96 | 25,949 (12) | 22,478,843 | 2,079 | 670 |
| F1-k96 after FCS clean | | | 355 (4) | 92,150 | 1,652 | 233 |
| F2,3-k64 | F2, F3 | 64 | 126,920 (4,780) | 397,235,064 | 69,622 | 2,588 |
| F2,3-k64 after FCS clean | | | 21,211 (371) | 7,128,546 | 7,339 | 323 |
| All Male-k64 | M1, M2, M3 | 64 | 264,868 (4,929) | 754,076,078 | 59,181 | 1,769 |
| All Male-k64 after FCS clean | | | 46,240 (790) | 15,559,878 | 10,133 | 325 |
Contaminant contigs in assembly F1-k96 were predominantly classified as Gallus gallus (22,164 of 22,599 flagged sequences), which is expected because colony ticks were fed on this host. The remaining contaminant sequences were classified as prokaryote (426), virus (7), or vertebrate (2). Contaminant contigs in F2,3-k64 were mostly prokaryotic (34,725 of 34,987), along with fungal (171), viral (87), vertebrate (3), or synthetic (1). Gallus gallus contamination was also detected in assembly All Male-k64 (4,993 sequences comprising 660,928 total bp); however, contaminants were most often classified as prokaryote (77,067 of 83,608 flagged sequences), alongside fungus (951), other vertebrate (434), virus (160), or synthetic (1).
Of the 790 All Male-k64 cleaned contigs longer than 1 kb, 40 had a primary taxonomic assignment of “anml:insect” (which includes ticks). Of these, eight returned no quality match when searched against the cleaned F2,3-k64 assembly or the reference genome assembly. Characterization of these eight candidate male-specific O. turicata genome sequences is reported in Table 4. The total amount of sequence passing the male-specific filtering is limited, only 12,146 bp. None of these candidate male-specific sequences match known male-specific genes. Most of them do not match any known arthropod at all, while matches to ticks fall into autosomal regions. Overall, very little male-specific genomic sequence is detectable.
Table 4.
BLAST characterization of top male-specific contigs.
| ID | Length | Top hit accession | Notes |
|---|---|---|---|
| 5864 | 1549 | CP062936.1 | Brevibacterium genome sequence |
| 20307 | 2874 | N/A | No hits |
| 398176 | 1111 | XM_064598778.1 | O. turicata predicted transcript from reference genome assembly chromosome 8 |
| 398377 | 1178 | OZ022208.1 | Partial hits to Prunus dulcis (almond) genome |
| 399697 | 1093 | HG964426.1 | Partial hits to Phaseolus vulgaris (kidney bean) |
| 401637 | 2118 | N/A | No hits |
| 414688 | 1085 | N/A | No hits |
| 439667 | 1138 | XM_040502788.2 | Ixodes scapularis (black-legged tick) predicted transcript |
Sex chromosomes can be identified bioinformatically by different signatures depending on the age and resulting degeneracy of the associated Y chromosome (Palmer et al. 2019). As males carry a single X copy, they will have lower read depth along regions of the X highly differentiated from the Y. If the X and Y are not sufficiently differentiated to reduce male X mapping depth (because reads from the Y still map to the reference X), genetic divergence is still expected to accumulate more rapidly than on autosomes due to suppressed recombination between X and Y. Regions of the Y that are distinct enough to prevent read mapping will also be represented in the unmapped reads from males, seen both in reduced map rates and in de novo contigs assembled from male but not female unmapped reads (Palmer et al. 2019).
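The depth-based expectation described above can be made concrete with a small sketch: a region present in one copy in males and two copies in females should show a male-to-female normalized depth ratio near 0.5, whereas autosomal or undifferentiated regions should show a ratio near 1. The tolerance of 0.15 and the category labels are arbitrary illustrative choices, not thresholds used in this study.

```python
def depth_ratio(male_depths, female_depths) -> float:
    """Ratio of mean normalized depth in male libraries to that in female libraries."""
    male_mean = sum(male_depths) / len(male_depths)
    female_mean = sum(female_depths) / len(female_depths)
    return male_mean / female_mean


def classify_window(ratio: float, tolerance: float = 0.15) -> str:
    """Label a window by how close its male:female depth ratio is to 0.5 or 1.0."""
    if abs(ratio - 0.5) <= tolerance:
        return "hemizygous-like"  # one copy in males, two in females
    if abs(ratio - 1.0) <= tolerance:
        return "diploid-like"     # same copy number in both sexes
    return "ambiguous"


if __name__ == "__main__":
    # Hypothetical normalized depths for one window in three male and three female libraries.
    ratio = depth_ratio([0.52, 0.48, 0.50], [1.01, 0.97, 1.02])
    print(round(ratio, 2), classify_window(ratio))
```

Applying such a classification per window, rather than per chromosome, is what allows a partially hemizygous chromosome to be detected even when its whole-chromosome average is not significantly biased.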
Sex chromosome signals are surprisingly muted in these sequencing data. Chromosome 1 (NC_088201.1) does show lower average read mapping from males than females, but not at the expected 50% reduction (Fig. 3a). The signal only stands out when mapping depth is resolved along the span of the chromosome, at which point a male haploid region spanning the first ∼90 Mb is apparent (Fig. 3b). If chromosome 1 is the X, then the truncated version of chromosome 1 seen only in males is the Y. However, the expected signature of higher diversity in males vs females compared to the autosomes is not observed (Fig. 3c). If the two versions of chromosome 1 are indeed the X/Y pair, they are very recently established and possibly still fully recombining along the shared arm. The lack of evidence for a fully differentiated male-specific chromosome, both in the similar mapping rates of male and female libraries (Table 2) and the general inability to assemble unique contigs from the unmapped male reads (Table 4), further points to the conclusion that chromosome 1 represents a sex chromosome system at the earliest stages of evolutionary emergence. A male assembly of equal quality and karyotyping studies will more conclusively determine the status of sex chromosomes in O. turicata.
An investigation into the homology between Acari X chromosomes found that 2,734 of 2,934 (93.2%) genes in the BUSCO arachnida_odb10 set were present as complete single copies in the O. turicata assembly, of which 342 (12.5%) are on the putative X (NC_088201.1). Of these unambiguous X-candidate markers, 261 (76.3%) were also on the I. scapularis X scaffold (CM063360.1) and 301 (88.0%) on the R. microplus X scaffold (CM091779.1). Counts of shared BUSCO genes for all chromosome pairs are reported in Supplementary Table 1. The majority of BUSCO genes found on the putative O. turicata X chromosome are also found on the X chromosomes in hard-bodied ticks I. scapularis and R. microplus. These chromosomes are likely homologous, with conserved sex determination function dating at least to the divergence of soft- and hard-bodied tick lineages.
Conclusions
This is the first soft tick reference genome and the first genome of the relapsing fever tick O. turicata. This chromosome-level assembly will provide genomic tools and support study areas of tick evolution including resolution of biogeographic species determinations and vector–host–pathogen interactions (Mans 2023). Potential applications include identifying anti-tick technologies and vaccine targets. Additionally, insights from the search for the Y chromosome include markers for genetic sexing and the possible identification of a sex chromosome system at the earliest stages of evolutionary emergence, which could improve our understanding of the evolution of sex chromosomes. Since O. turicata is a putative vector of African swine fever virus and has been shown to feed on wild pigs (Balasubramanian et al. 2024), having this genome available could be critical if African swine fever virus were introduced in the United States or Mexico. Additionally, this genome can be used to study the differential vector competence of the geographic populations of O. turicata and their ability for local adaptation. This genome will aid in pest management tool development and protecting against vector-borne diseases.
Supplementary Material
Acknowledgments
The genome assembly was generated as part of the USDA-ARS Ag100Pest Initiative. Biological materials were provided by Baylor College of Medicine. The authors thank the members of the USDA-ARS Ag100Pest Team for sequencing and analysis support. All opinions expressed in this paper are the authors' and do not necessarily reflect the policies and views of USDA. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture. USDA is an equal opportunity provider and employer.
Contributor Information
Mackenzie Tietjen, USDA Agricultural Research Service, Livestock Arthropod Pest Research Unit, 2700 Fredericksburg Road, Kerrville, TX 78028, USA.
Amanda R Stahlke, Department of Biology, Colorado Mesa University, 1100 North Ave, Grand Junction, CO 81501, USA; USDA Agricultural Research Service, Beltsville Agricultural Research Center, Bee Research Laboratory, 10300 Baltimore Avenue, Beltsville, MD 20705, USA.
David Luecke, USDA Agricultural Research Service, Veterinary Pest Genetics Research Unit, 2700 Fredericksburg Road, Kerrville, TX 78028, USA.
Perot Saelao, USDA Agricultural Research Service, Veterinary Pest Genetics Research Unit, 2700 Fredericksburg Road, Kerrville, TX 78028, USA.
Sheina B Sim, USDA Agricultural Research Service, US Pacific Basin Agricultural Research Center, Tropical Crop and Commodity Protection Research Unit, 64 Nowelo Street, Hilo, HI 96720, USA.
Scott M Geib, USDA Agricultural Research Service, US Pacific Basin Agricultural Research Center, Tropical Crop and Commodity Protection Research Unit, 64 Nowelo Street, Hilo, HI 96720, USA.
Brian E Scheffler, USDA Agricultural Research Service, Jamie Whitten Delta States Research Center, Genomics and Bioinformatics Research Unit, 141 Experiment Station Road, Stoneville, MS 38776, USA.
Anna K Childers, USDA Agricultural Research Service, Beltsville Agricultural Research Center, Bee Research Laboratory, 10300 Baltimore Avenue, Beltsville, MD 20705, USA.
Alexander R Kneubehl, Department of Pediatrics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA.
Pete D Teel, Department of Entomology, Texas A&M AgriLife Research, College Station, TX 77843, USA.
Job E Lopez, Department of Pediatrics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA; Department of Molecular Virology and Microbiology, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA.
Data Availability
All raw reads used to assemble and scaffold this genome assembly were deposited at DDBJ/ENA/GenBank within the BioProject PRJNA832905. This includes the following Sequence Read Archive (SRA) accessions: SRR18960659 (PacBio CLR data used to generate primary and alternate haplotig assemblies), SRR18960658 (Illumina whole-genome sequencing used for contig polishing), and SRR18960657 (Illumina Hi-C library). The primary and alternate haplotype assembly versions discussed in this manuscript are available in GenBank accessions SAMN27925140 and SAMN27925141, respectively. BioProject PRJNA837009 contains RNAseq reads. Scripts used in analyses can be found at https://github.com/dluecke/Ornithodoros_turicata_G3. This project is under the Ag100Pest umbrella project, BioProject PRJNA555319. The primary assembly and annotations are also available at the i5k Workspace@NAL (Poelchau et al. 2015).
Supplemental material available at G3 online.
Funding
This work was supported by the U.S. Department of Agriculture Agricultural Research Service (USDA-ARS). This research used resources provided by the SCINet project of the USDA-ARS, project number 0500-00093-001-00-D. The genome assembly was generated as part of the USDA-ARS Ag100Pest Initiative.
Literature cited
- Allio R, Schomaker-Bastos A, Romiguier J, Prosdocimi F, Nabholz B, Delsuc F. 2020. MitoFinder: efficient automated large-scale extraction of mitogenomic data in target enrichment phylogenomics. Mol Ecol Resour. 20(4):892–905. doi: 10.1111/1755-0998.13160.
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25(17):3389–3402. doi: 10.1093/nar/25.17.3389.
- Ashman TL, Bachtrog D, Blackmon H, Goldberg EE, Hahn MW, Kirkpatrick M, Kitano J, Mank JE, Mayrose I, Ming R, et al. 2014. Tree of sex: a database of sexual systems. Sci Data. 1(1):140015. doi: 10.1038/sdata.2014.15.
- Astashyn A, Tvedte ES, Sweeney D, Sapojnikov V, Bouk N, Joukov V, Mozes E, Strope PK, Sylla PM, Wagner L. 2024. Rapid and sensitive detection of genome contamination at scale with FCS-GX. Genome Biol. 25(1):60.
- Bakkes DK, De Klerk D, Latif AA, Mans BJ. 2018. Integrative taxonomy of Afrotropical Ornithodoros (Ornithodoros) (Acari: Ixodida: Argasidae). Ticks Tick Borne Dis. 9(4):1006–1037. doi: 10.1016/j.ttbdis.2018.03.024.
- Balasubramanian S, Busselman RE, Fernandez-Santos NA, Grunwald AP, Wolff N, Hathaway N, Hillhouse A, Bailey JA, Teel PD, Ferreira FC. 2024. Blood meal metabarcoding of the argasid tick (Ornithodoros turicata Dugès) reveals extensive vector-host associations. Environ DNA. 6(1):e522. doi: 10.1002/edn3.522.
- Barnett DW, Garrison EK, Quinlan AR, Strömberg MP, Marth GT. 2011. BamTools: a C++ API and toolkit for analyzing and managing BAM files. Bioinformatics. 27(12):1691–1692. doi: 10.1093/bioinformatics/btr174.
- Beck AF, Holscher KH, Butler J. 1986. Life cycle of Ornithodoros turicata americanus (Acari: Argasidae) in the laboratory. J Med Entomol. 23(3):313–319. doi: 10.1093/jmedent/23.3.313.
- Botero-Canola S, Torhorst C, Canino N, Beati L, O’Hara KC, James AM, Wisely SM. 2024. Integrating systematic surveys with historical data to model the distribution of Ornithodoros turicata americanus, a vector of epidemiological concern in North America. Ecol Evol. 14(11):e70547. doi: 10.1002/ece3.70547.
- Brumpt E. 1933. Étude du Spirochaeta turicatae, n. sp. agent de la fièvre récurrente sporadique des Etats-Unis transmis par Ornithodorus turicata. Comptes Rendus Séances Société Biol. 113:1369–1372.
- Buchfink B, Xie C, Huson DH. 2015. Fast and sensitive protein alignment using DIAMOND. Nat Methods. 12(1):59–60. doi: 10.1038/nmeth.3176.
- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. 2009. BLAST+: architecture and applications. BMC Bioinf. 10(1):1–9. doi: 10.1186/1471-2105-10-421.
- Challis R, Richards E, Rajan J, Cochrane G, Blaxter M. 2020. BlobToolKit–interactive quality assessment of genome assemblies. G3 (Bethesda). 10(4):1361–1374. doi: 10.1534/g3.119.400908.
- Chin CS, Peluso P, Sedlazeck FJ, Nattestad M, Concepcion GT, Clum A, Dunn C, O'Malley R, Figueroa-Balderas R, Morales-Cruz A, et al. 2016. Phased diploid genome assembly with single-molecule real-time sequencing. Nat Methods. 13(12):1050–1054. doi: 10.1038/nmeth.4035.
- Cooley RA, Kohls GM. 1944. The Argasidae of North America, Central America and Cuba. Am Midl Nat Monogr. (1):1–152.
- Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy S, Davies RM, et al. 2021. Twelve years of SAMtools and BCFtools. GigaScience. 10(2). doi: 10.1093/gigascience/giab008.
- Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. 2017. Nextflow enables reproducible computational workflows. Nat Biotechnol. 35(4):316–319. doi: 10.1038/nbt.3820.
- Donaldson TG, Pérez de León AA, Li AI, Castro-Arellano I, Wozniak E, Boyle WK, Hargrove R, Wilder HK, Kim HJ, Teel PD, et al. 2016. Assessment of the geographic distribution of Ornithodoros turicata (Argasidae): climate variation and host diversity. PLoS Negl Trop Dis. 10(2):e0004383. doi: 10.1371/journal.pntd.0004383.
- Durand NC, Shamim MS, Machol I, Rao SS, Huntley MH, Lander ES, Aiden EL. 2016. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3(1):95–98. doi: 10.1016/j.cels.2016.07.002.
- Formenti G, Rhie A, Balacco J, Haase B, Mountcastle J, Fedrigo O, Brown S, Capodiferro MR, Al-Ajli FO, Ambrosini R. 2021. Complete vertebrate mitogenomes reveal widespread repeats and gene duplications. Genome Biol. 22(1):120. doi: 10.1186/s13059-021-02336-9.
- Formenti G, Rhie A, Walenz BP, Thibaud-Nissen F, Shafin K, Koren S, Myers EW, Jarvis ED, Phillippy AM. 2022. Merfin: improved variant filtering, assembly evaluation and polishing via k-mer validation. Nat Methods. 19(6):696–704. doi: 10.1038/s41592-022-01445-y.
- Formenti G, Stahlke A. 2022. Ag100Pest/mitoPolishCLR: v0.1.0-Alpha-Hzea (v0.1.0-Alpha). Zenodo. doi: 10.5281/zenodo.6235247.
- Garrison E, Marth G. 2012. Haplotype-based variant detection from short-read sequencing. arXiv:1207.3907. doi: 10.48550/arXiv.1207.3907. Preprint: not peer reviewed.
- Geraci NS, Johnston JS, Robinson JP, Wikel SK, Hill CA. 2007. Variation in genome size of argasid and ixodid ticks. Insect Biochem Molec Biol. 37(5):399–408. doi: 10.1016/j.ibmb.2006.12.007.
- Ghurye J, Rhie A, Walenz BP, Schmitt A, Selvaraj S, Pop M, Phillippy AM, Koren S. 2019. Integrating Hi-C links with assembly graphs for chromosome-scale assembly. PLoS Comput Biol. 15(8):e1007273. doi: 10.1371/journal.pcbi.1007273.
- Gomard Y, Flores O, Vittecoq M, Blanchon T, Toty C, Duron O, Mavingui P, Tortosa P, McCoy KD. 2021. Changes in bacterial diversity, composition and interactions during the development of the seabird tick Ornithodoros maritimus (Argasidae). Microb Ecol. 81(3):770–783. doi: 10.1007/s00248-020-01611-9.
- Guan D, McCarthy SA, Wood J, Howe K, Wang Y, Durbin R. 2020. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics. 36(9):2896–2898. doi: 10.1093/bioinformatics/btaa025.
- Jackman SD, Vandervalk BP, Mohamadi H, Chu J, Yeo S, Hammond SA, Jahesh G, Khan H, Coombe L, Warren RL, et al. 2017. ABySS 2.0: resource-efficient assembly of large genomes using a bloom filter. Genome Res. 27(5):768–777. doi: 10.1101/gr.214346.116.
- Kim HJ, Filatov S, Lopez JE, Pérez de León AA, Teel PD. 2017. Blood feeding of Ornithodoros turicata larvae using an artificial membrane system. Med Vet Entomol. 31(2):230–233. doi: 10.1111/mve.12223.
- Kneubehl AR, Muñoz-Leal S, Filatov S, de Klerk DG, Pienaar R, Lohmeyer KH, Bermúdez SE, Suriyamongkol T, Mali I, Kanduma E. 2022. Amplification and sequencing of entire tick mitochondrial genomes for a phylogenomic analysis. Sci Rep. 12(1):19310. doi: 10.1038/s41598-022-23393-5.
- Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. 2017. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27(5):722–736. doi: 10.1101/gr.215087.116.
- Krishnavajhala A, Armstrong BA, Lopez JE. 2018. Vector competence of geographical populations of Ornithodoros turicata for the tick-borne relapsing fever spirochete Borrelia turicatae. Appl Environ Microbiol. 84(21):e01505-18. doi: 10.1128/AEM.01505-18.
- Li H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997. doi: 10.48550/arXiv.1303.3997. Preprint: not peer reviewed.
- Li H. 2018. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 34(18):3094–3100. doi: 10.1093/bioinformatics/bty191.
- Manni M, Berkeley MR, Seppey M, Zdobnov EM. 2021. BUSCO: assessing genomic data quality and beyond. Curr Protoc. 1(12):e323. doi: 10.1002/cpz1.323.
- Mans BJ. 2023. Paradigms in tick evolution. Trends Parasitol. 39(6):475–486. doi: 10.1016/j.pt.2023.03.011.
- Mans BJ, Featherston J, Kvas MK, Pillay K, de Klerk DG, Pienaar R, de Castro MH, Schwan TG, Lopez JE, Teel PD, et al. 2019. Argasid and ixodid systematics: implications for soft tick evolution and systematics, with a new argasid species list. Ticks Tick Borne Dis. 10(1):219–240. doi: 10.1016/j.ttbdis.2018.09.010.
- Mapleson D, Garcia Accinelli G, Kettleborough G, Wright J, Clavijo BJ. 2017. KAT: a K-mer analysis toolkit to quality control NGS datasets and genome assemblies. Bioinformatics. 33(4):574–576. doi: 10.1093/bioinformatics/btw663.
- Marçais G, Kingsford C. 2011. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 27(6):764–770. doi: 10.1093/bioinformatics/btr011.
- Mays Maestas SE, Maestas LP, Kaufman PE. 2025. Pathogen and host associations of soft ticks collected in south Texas. Vector-Borne Zoonotic Dis. 25(1):34–42. doi: 10.1089/vbz.2023.0135.
- Miller JR, Delcher AL, Koren S, Venter E, Walenz BP, Brownley A, Johnson J, Li K, Mobarry C, Sutton G. 2008. Aggressive assembly of pyrosequencing reads with mates. Bioinformatics. 24(24):2818–2824. doi: 10.1093/bioinformatics/btn548.
- Nicholson WL, Sonenshine DE, Noden BH, Brown RN. 2019. Chapter 27 - Ticks (Ixodida). In: Mullen GR, Durden LA, editors. Medical and veterinary entomology. 3rd ed. Academic Press. p. 603–672.
- Nuss AB, Lomas JS, Reyes JB, Garcia-Cruz O, Lei W, Sharma A, Pham MN, Beniwal S, Swain ML, McVicar M, et al. 2023. The highly improved genome of Ixodes scapularis with X and Y pseudochromosomes. Life Sci Alliance. 6(12):e202302109. doi: 10.26508/lsa.202302109.
- Palmer DH, Rogers TF, Dean R, Wright AE. 2019. How to identify sex chromosomes and their turnover. Mol Ecol. 28(21):4709–4724. doi: 10.1111/mec.15245.
- Poelchau M, Childers C, Moore G, Tsavatapalli V, Evans J, Lee C-Y, Lin H, Lin J-W, Hackett K. 2015. The i5k Workspace@NAL: enabling genomic data access, visualization and curation of arthropod genomes. Nucleic Acids Res. 43(D1):D714–D719. doi: 10.1093/nar/gku983.
- Ranallo-Benavidez TR, Jaron KS, Schatz MC. 2020. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat Commun. 11(1):1432. doi: 10.1038/s41467-020-14998-3.
- Rhie A, McCarthy SA, Fedrigo O, Damas J, Formenti G, Koren S, Uliano-Silva M, Chow W, Fungtammasan A, Kim J, et al. 2021. Towards complete and error-free genome assemblies of all vertebrate species. Nature. 592(7856):737–746. doi: 10.1038/s41586-021-03451-0.
- Rhie A, Walenz BP, Koren S, Phillippy AM. 2020. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21(1):1–27. doi: 10.1186/s13059-020-02134-9.
- Sayers EW, Bolton EE, Brister JR, Canese K, Chan J, Comeau DC, Connor R, Funk K, Kelly C, Kim S, et al. 2022. Database resources of the national center for biotechnology information. Nucleic Acids Res. 50(D1):D20–D26. doi: 10.1093/nar/gkab1112.
- Seppey M, Manni M, Zdobnov EM. 2019. BUSCO: assessing genome assembly and annotation completeness. In: Kollmar M, editor. Gene Prediction. New York, NY: Humana. p. 227–245.
- Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. 2015. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 31(19):3210–3212. doi: 10.1093/bioinformatics/btv351.
- Thibaud-Nissen F, DiCuccio M, Hlavina W, Kimchi A, Kitts P, Murphy T, Pruitt K, Souvorov A. 2016. P8008 the NCBI eukaryotic genome annotation pipeline. J Animal Sci. 94(suppl_4):184–184.
- Tidwell JP, Bendele KG, Bodine D, Holmes VR, Johnston JS, Saelao P, Lohmeyer KH, Teel PD, Tarone AM. 2024. Identifying the sex chromosome and sex determination genes in the cattle tick, Rhipicephalus (Boophilus) microplus. G3 (Bethesda). 14(12):jkae234. doi: 10.1093/g3journal/jkae234.
- Vasimuddin M, Misra S, Li H, Aluru S. 2019. Efficient architecture-aware acceleration of BWA-MEM for multicore systems. In: IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil. Institute of Electrical and Electronics Engineers. p. 314–324.
- Wickham H, Averick M, Bryan J, Chang W, McGowan LDA, François R, Grolemund G, Hayes A, Henry L, Hester J. 2019. Welcome to the tidyverse. J Open Source Software. 4(43):1686. doi: 10.21105/joss.01686.
- Zhou C, McCarthy SA, Durbin R. 2023. YaHS: yet another hi-C scaffolding tool. Bioinformatics. 39(1):btac808. doi: 10.1093/bioinformatics/btac808.