Abstract
Despite efforts to minimize the impacts of malaria and reduce the number of primary vectors, malaria has yet to be eliminated in Zambia. Understudied vector species may perpetuate malaria transmission in pre-elimination settings. Anopheles squamosus is one of the most abundantly caught mosquito species in southern Zambia and has previously been found with Plasmodium falciparum sporozoites, a causal agent of human malaria. This species may be a critical vector of malaria transmission; however, there is a lack of genetic information available for An. squamosus. We report the first genome data and the first complete mitogenome (Mt) sequence of An. squamosus. The sequence was obtained from one individual mosquito from the Chidakwa area in Macha, Zambia. The raw reads were obtained using an Illumina NovaSeq 6000 and assembled with NOVOPlasty through alignment with related species. The length of the An. squamosus Mt was 15,351 bp, with 77.9% AT content. The closest match to the whole mitochondrial genome in the phylogenetic tree is the African malaria mosquito, Anopheles gambiae. The genome data are available through the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under accession number SRR22114392. The mitochondrial genome was deposited in NCBI GenBank under accession number OP776919. The ITS2-containing contig sequence was deposited in GenBank under accession number OQ241725. Mitogenome annotation and a phylogenetic tree with related Anopheles mosquito species are provided.
Keywords: Anopheles squamosus, understudied malaria vector, Africa, Zambia
Introduction
Anopheles squamosus (Theobald, 1901; Figure 1) can be found across Africa 1 and is of particular relevance to public health due to its implication in the spread of residual malaria cases. Anopheles squamosus is one of the most abundantly caught anopheline species in malaria vector surveillance studies in southern Zambia. However, it is an understudied species because of its exophilic and zoophilic behaviours. 2 , 3 Although it is predominantly regarded as a zoophilic species, it has been found to show high anthropophily in southern Zambia. 4 Additionally, sporozoites and DNA of Plasmodium falciparum, a causal agent of human malaria, have been detected in An. squamosus. 2 , 5
Figure 1. A: Anopheles squamosus image. B: An. squamosus wing. C: Head image of An. squamosus, one of the features used for species identification.
A and B have been reproduced with permission from Dr. Rebekah Kading (Colorado State University). 21 C has been reproduced with permission from Walter Reed Biosystematics Unit (WRBU). 22
Unfortunately, there are two key barriers to pursuing the rigorous investigation of the epidemiologically important traits of this vector, such as host choice, biting behaviours, and dispersal capacity. First, An. squamosus is morphologically indistinguishable from An. cydippis at the adult stage. Although the two species are morphologically distinct as larvae, larvae are often difficult to locate in abundance. There are numerous examples of sympatric Anopheles sibling species expressing drastic differences in insecticide resistance 6 or host choice. 7 , 8 These differences make species confirmation critical to assessing and mitigating malaria transmission risk. Second, there is limited genetic information for An. squamosus (173 sequences in total in GenBank as of August 2022), and most of these sequences (N=166; 96%) are partial sequences of the mitochondrial cytochrome c oxidase subunit I (COI) gene. ITS2 sequences are better at differentiating species within a complex than COI sequences, 9 but existing ITS2 primers do not typically work on An. squamosus, and the absence of sequence data for this region prohibits the design of functional diagnostic PCR primers.
To overcome these two barriers and advance investigative efforts aimed at this widespread, yet neglected malaria vector, we carried out the first Illumina high-throughput sequencing of this species.
Methods
Data collection
The An. squamosus sample used for the genome sequencing was collected in Chidakwa near Macha, Zambia (utm-x: 0478202, utm-y: 8184394) using a CDC light trap placed outdoors near a goat pen. Samples were frozen at −20°C after collection until DNA extraction. DNA was extracted using a magnetic bead-based protocol optimized for mosquito DNA for next-generation sequencing. 10 The head and thorax were dissected from the sample and hydrated in nuclease-free water for 1 hour at 4°C. Tissues were then removed from the water and homogenised in a mixture of 2 μL Proteinase K (100 mg/mL) and 98 μL PK Buffer in a 1.5 mL Eppendorf microcentrifuge tube, followed by incubation at 56°C for 2 hours. The lysate was transferred to a new 1.5 mL microcentrifuge tube and mixed with a MagAttract Mix consisting of 100 μL isopropanol, 100 μL Buffer AL, and 15 μL MagAttract Suspension G (Qiagen, Hilden, Germany). The mixture was incubated at room temperature for 10 minutes and occasionally vortexed to ensure that the magnetic beads were evenly dispersed. The microcentrifuge tube containing the lysate was then moved to a magnetic bead separator until the liquid appeared clear. After a series of ethanol washes of the magnetic beads, DNA was eluted from the beads with 100 μL AE Buffer and stored at −20°C until library preparation. Library preparation was completed using the QIAseq FX UDI kit (Qiagen, Hilden, Germany) with 20 ng of genomic DNA as input, as previously described. 11
Data analysis
Raw sequencing reads were trimmed using fastp (RRID:SCR_016962) version 0.20.1. 12 The mitogenome (Mt) contig was assembled using NOVOPlasty (RRID:SCR_017335) version 4.3.1. 13 Automatic annotation of the mitogenome was conducted with the MITOS website 14 using the invertebrate genetic code for mitochondria under default settings. Some automatic annotations were not consistent with typical Anopheles mitochondrial gene start and/or end positions. These annotations were adjusted manually by shifting the start and end positions to match existing Anopheles mitochondrial gene annotations found in GenBank. The annotation information was deposited in GenBank together with the genome sequence. The full genomic map is provided in Figure 2.
Figure 2. Mitogenome map of An. squamosus including annotated genes.
Phylogenetic analysis was conducted using the mitogenome sequences of seven Anopheles species and one Aedes species as an outgroup. The Jukes-Cantor model was used to calculate the pairwise genetic distances, and the neighbour-joining method was used to build the phylogenetic tree in Geneious Prime (RRID:SCR_010519) 2022.02 (Biomatters, Auckland, New Zealand) 15 (free alternative, AliView).
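The Jukes-Cantor correction mentioned above can be illustrated with a minimal sketch. It is not the Geneious Prime implementation; the function names are hypothetical, and the choice to skip gapped or ambiguous alignment columns is an assumption made for illustration. The model converts the observed proportion of differing sites p into an estimated number of substitutions per site, d = −(3/4) ln(1 − 4p/3).

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap or an ambiguous base are
    skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = "ACGT"
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor distance d = -(3/4) * ln(1 - 4p/3) for 0 <= p < 0.75."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

For small p the corrected distance is only slightly larger than p itself; the correction grows as sequences diverge, because multiple substitutions at the same site become more likely.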
Draft genome assembly was conducted using MaSuRCA (RRID:SCR_010691) version 4.0.9 16 in order to find a contig containing the internal transcribed spacer 2 (ITS2) sequence. The Basic Local Alignment Search Tool (BLAST) (RRID:SCR_004870) was used to search the resulting contigs for those with the highest similarity to the only An. squamosus ITS2 sequence available in GenBank (accession number MK592071).
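As an illustration of how a best-matching contig could be picked from a BLAST search, the sketch below parses BLAST tabular output (the standard 12-column "outfmt 6" layout) and returns the hit with the highest bit score. The output format, the ranking by bit score, the function names, and the contig names in the usage note are all assumptions; the study does not state which BLAST output format or ranking criterion was used.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    percent_identity: float
    alignment_length: int
    bit_score: float


def parse_blast_tabular(text: str) -> List[Hit]:
    """Parse BLAST tabular output with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append(
            Hit(
                query=fields[0],
                subject=fields[1],
                percent_identity=float(fields[2]),
                alignment_length=int(fields[3]),
                bit_score=float(fields[11]),
            )
        )
    return hits


def best_hit(hits: List[Hit]) -> Hit:
    """Return the hit with the highest bit score."""
    if not hits:
        raise ValueError("no hits to rank")
    return max(hits, key=lambda h: h.bit_score)
```

With the ITS2 reference as the query and the assembled contigs as the database, `best_hit(parse_blast_tabular(text)).subject` would give the name of the top-scoring contig (for example a hypothetical `contig_A`).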
Results
Sequencing a single An. squamosus sample yielded 105 million reads. Of these, 238,740 reads were used to assemble the mitochondrial genome. Draft genome assembly using MaSuRCA produced 58,252 scaffolds with a total size of 350 Mbp. The N50 scaffold length was 21,439 bp. Among these contigs, we identified one contig containing the ITS2 region (GenBank accession number OQ241725), which was 1,223 bp long.
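For readers unfamiliar with the N50 statistic reported above, the following minimal sketch shows one common way to compute it from a list of scaffold lengths. The function name is hypothetical, and this is not the code used by MaSuRCA. N50 is the length of the scaffold at which the cumulative length, counted from the longest scaffold downwards, first reaches half of the total assembly size.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first scaffold that brings the running sum
        # to at least half of the total assembly length.
        if running * 2 >= total:
            return length
    return ordered[-1]  # not reached for positive lengths
```

An N50 of 21,439 bp therefore means that scaffolds of at least that length account for at least half of the assembled bases.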
The length of the An. squamosus Mt (GenBank accession number OP776919 23 ) was 15,351 bp and the A+T content was 77.9% (Figure 2). The average A+T content of eight other anopheline species was 77.7% (±0.61 SD). The length of this mitochondrial genome was similar to that of other anopheline species deposited in GenBank; the average of the eight species compared in this analysis was 15,363 bp. The mitochondrial genome contains two ribosomal RNAs, 22 transfer RNAs, and 13 protein-coding genes. The cytochrome c oxidase I (COI) fragment spanning positions 1,462–2,132 of the An. squamosus sequence had 97.7% (±4.27 SD, N=9) similarity to the COI sequences of An. squamosus deposited in GenBank.
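The A+T percentage reported above is a simple base-composition statistic. A minimal sketch of the calculation is given below; the function name is hypothetical, and excluding ambiguous bases (such as N) from the denominator is an assumption of this sketch rather than a documented detail of the analysis.

```python
def at_content(sequence: str) -> float:
    """Percentage of A and T among the unambiguous bases of a sequence."""
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc  # ambiguous bases and gaps are ignored
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * at / total
```

Applied to the full mitogenome sequence read from a FASTA file, a function of this kind should give a value close to the 77.9% reported here, provided ambiguous bases are handled the same way.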
In the phylogenetic analysis (Figure 3), the closest match to the whole mitogenome sequence of An. squamosus was the African malaria mosquito An. gambiae (GenBank accession number L20934), with 91.5% sequence similarity. This comparison sequence was identified as An. gambiae and published in 1993, before An. gambiae was separated into two species: An. gambiae and An. coluzzii. 17 Nevertheless, previous studies suggest that the mitogenome sequence alone is not sufficient to distinguish An. gambiae s.s. from An. coluzzii. 18 , 19
Figure 3. Phylogenetic tree based on mitogenome sequences of An. squamosus and its related mosquitoes.
Species names are provided next to the GenBank accession numbers. Numbers at nodes indicate bootstrap values out of 100 replicates. Aedes aegypti was considered as an outgroup. The scale bar indicates relative nucleotide difference (0.02=2% nucleotide difference).
This study provides a critical genomic resource for research on this understudied malaria vector. Our short-read sequencing data were not sufficient to assemble a high-quality reference genome, which points to the need for long-read sequencing technology to achieve a high-quality genome assembly. However, we provide key ITS2 region data that researchers can use to develop a low-cost molecular diagnostic assay for species identification. Currently available ITS2 primers for anopheline species identification typically do not produce a PCR amplicon for An. squamosus, which is one of the major roadblocks in carrying out surveillance and research of this species. We identified the ITS2-containing contig (GenBank accession number OQ241725), which could be used to design new primers that amplify the ITS2 fragment more reliably for An. squamosus. Our genome sequence data could be used for further variant identification once a high-quality reference genome becomes available for An. squamosus. The mitogenome sequence could also be used to identify phylogenetic relationships within and between related species and to infer gene flow/dispersal. 9 , 20
Ethical considerations
The study involved collection of mosquito specimens near goat pens within individual households in Chidakwa, Zambia, as part of a project approved by the National Health Research Authority, Zambia (Approval No: NHRA00016/18/08/2021).
Acknowledgements
We thank UF ICBR for providing sequencing services. We appreciate the support from Dr. Edgar Simulundu from Macha Research Trust toward our project.
Funding Statement
This study was supported by University of Florida College of Agricultural and Life Sciences Dean’s Award to VTN, University of Florida Department of Entomology and Nematology Matching Assistantship to SS, University of Florida International Center Global Fellows Program Award to YL, NIH T32 Training Grant (T32 AI-138953) support to MEG, NIH support through the International Centers of Excellence for Malaria Research (2U19AI089680) for DEN, MEG, MMM and LS, and support to DEN and MEG from the Bloomberg Philanthropies and the Johns Hopkins Malaria Research Institute.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
[version 1; peer review: 2 approved]
Data availability
Underlying data
GenBank: Anopheles squamosus mitochondrion, complete genome. Accession number OP776919; https://identifiers.org/ncbi/insdc:OP776919. 23
BioProject: Complete mitogenome sequence of Anopheles squamosus from Macha, Zambia. Accession number PRJNA896235; https://identifiers.org/bioproject:PRJNA896235. 24
SRA: DNA-Seq of mosquito Anopheles squamosus. Accession number SRR22114392; https://identifiers.org/insdc.sra:SRR22114392. 25
BioSample: Anopheles squamosus isolate As22MACHA01. Accession number SAMN31538381; https://identifiers.org/biosample:SAMN31538381. 26
References
- 1. Theobald FV: A monograph of the Culicidae, or mosquitoes. London: British Museum (Nat. Hist.). Department of Zoology;1901;256.
- 2. Stevenson JC, Simubali L, Mbambara S, et al.: Detection of Plasmodium falciparum Infection in Anopheles squamosus (Diptera: Culicidae) in an Area Targeted for Malaria Elimination, Southern Zambia. J. Med. Entomol. 2016 Nov;53(6):1482–1487. 10.1093/jme/tjw091
- 3. Hoffman JE, Ciubotariu SL, Mudenda T, et al.: Phylogenetic Complexity of Morphologically Identified Anopheles squamosus in Southern Zambia. Insects. 2021 Feb 8;12(2). 10.3390/insects12020146
- 4. Fornadel CM, Norris LC, Franco V, et al.: Unexpected anthropophily in the potential secondary malaria vectors Anopheles coustani s.l. and Anopheles squamosus in Macha, Zambia. Vector Borne Zoonotic Dis. 2011 Aug;11(8):1173–1179. 10.1089/vbz.2010.0082
- 5. Stevenson JC, Norris DE: Implicating cryptic and novel anophelines as malaria vectors in Africa. Insects. 2017;8(1):1–18.
- 6. Main BJ, Lee Y, Collier TC, et al.: Complex genome evolution in Anopheles coluzzii associated with increased insecticide usage in Mali. Mol. Ecol. 2015 Oct;24(20):5145–5157. 10.1111/mec.13382
- 7. Kent RJ, Thuma PE, Mharakurwa S, et al.: Seasonality, blood feeding behavior, and transmission of Plasmodium falciparum by Anopheles arabiensis after an extended drought in southern Zambia. Am. J. Trop. Med. Hyg. 2007 Feb;76(2):267–274. 10.4269/ajtmh.2007.76.267
- 8. Main BJ, Lee Y, Ferguson HM, et al.: The genetic basis of host preference and resting behavior in the major African malaria vector, Anopheles arabiensis. PLoS Genet. 2016 Sep;12(9):e1006303. 10.1371/journal.pgen.1006303
- 9. Jones CM, Ciubotariu II, Muleba M, et al.: Multiple novel clades of anopheline mosquitoes caught outdoors in northern Zambia. Front. Trop. Dis. 2021;2:780664. 10.3389/fitd.2021.780664
- 10. Chen T, Vorsino AE, Kosinski KJ, et al.: A magnetic-bead-based mosquito DNA extraction protocol for Next-Generation sequencing. J. Vis. Exp. 2021;170:e62354. 10.3791/62354
- 11. Kelly ET, Mack LK, Campos M, et al.: Evidence of local extinction and reintroduction of Aedes aegypti in Exeter, California. Front. Trop. Dis. 2021 Jul;2(8). 10.3389/fitd.2021.703873
- 12. Chen S, Zhou Y, Chen Y, et al.: fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018 Sep 1;34(17):i884–i890. 10.1093/bioinformatics/bty560
- 13. Dierckxsens N, Mardulyn P, Smits G: NOVOPlasty: de novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. 2017 Feb 28;45(4):e18. 10.1093/nar/gkw955
- 14. Bernt M, Donath A, Juhling F, et al.: MITOS: improved de novo metazoan mitochondrial genome annotation. Mol. Phylogenet. Evol. 2013 Nov;69(2):313–319. 10.1016/j.ympev.2012.08.023
- 15. Kearse M, Moir R, Wilson A, et al.: Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012 Jun 15;28(12):1647–1649. 10.1093/bioinformatics/bts199
- 16. Zimin AV, Marçais G, Puiu D, et al.: The MaSuRCA genome assembler. Bioinformatics. 2013 Nov 1;29(21):2669–2677. 10.1093/bioinformatics/btt476
- 17. Coetzee M, Hunt RH, Wilkerson RC, et al.: Anopheles coluzzii and Anopheles amharicus, new members of the Anopheles gambiae complex. Zootaxa. 2013;3619(3):246–274. 10.11646/zootaxa.3619.3.2
- 18. Hanemaaijer MJ, Houston PD, Collier TC, et al.: Mitochondrial genomes of Anopheles arabiensis, An. gambiae and An. coluzzii show no clear species division. F1000Res. 2018;7:347. 10.12688/f1000research.13807.1
- 19. Fontaine MC, Pease JB, Steele A, et al.: Mosquito genomics. Extensive introgression in a malaria vector species complex revealed by phylogenomics. Science. 2015 Jan 2;347(6217):1258524. 10.1126/science.1258524
- 20. Campos M, Patel N, Marshall C, et al.: Population genetics of Anopheles pretoriensis in Grande Comore Island. Insects. 2023;14(1):14. 10.3390/insects14010014
- 21. Kent RJ: The Mosquitoes of Macha, Zambia. The third International Malaria Research Conference. Baltimore, MD, USA: 2006 [cited 2022 Nov 8].
- 22. Walter Reed Biosystematics Unit (WRBU): Anopheles squamosus Theobald, 1901. [cited 2022 Nov 3].
- 23. Nguyen VT, Gebhardt ME, Mburu MM, et al.: Anopheles squamosus mitochondrion, complete genome. [Dataset]. GenBank. 2023.
- 24. University of Florida: Complete mitogenome sequence of Anopheles squamosus from Macha, Zambia. [Dataset]. BioProject. 2022.
- 25. University of Florida: DNA-Seq of mosquito Anopheles squamosus. [Dataset]. SRA. 2022.
- 26. University of Florida: Anopheles squamosus isolate As22MACHA01. [Dataset]. BioSample. 2022.



