Complete Genome Sequence of wAna, the Wolbachia Endosymbiont of Drosophila ananassae

Mark T Gasser; Matthew Chung; Robin E Bromley; Suvarna Nadendla; Julie C Dunning Hotopp

doi:10.1128/MRA.01136-19

2019 Oct 24;8(43):e01136-19. doi: 10.1128/MRA.01136-19

Editor: Christina A Cuomo

PMCID: PMC6813396 PMID: 31649084

ABSTRACT

Here, we present the complete genome sequence of the Wolbachia endosymbiont wAna, isolated from Drosophila ananassae and derived from Oxford Nanopore and Illumina sequencing. We anticipate that this genome will aid Wolbachia comparative genomics and the assembly of the D. ananassae genome, specifically in regions containing extensive lateral gene transfer events.

ANNOUNCEMENT

Lateral gene transfer (LGT) from the Wolbachia endosymbiont wAna in Drosophila ananassae constitutes >2% of the insect genome, including integrations of multiple wAna genomes in the abnormally large, largely heterochromatic chromosome 4 (1, 2). To aid in studies of this massive LGT, the complete wAna genome was obtained.

To generate an LGT-free line of D. ananassae, Michael Clark and John Werren at the University of Rochester introgressed D. ananassae harboring the wAna Hawaii strain into the LGT-free D. ananassae Florida line for 10 generations to create D. ananassae W2.1, which was obtained from Irene Newton at Indiana University in Bloomington. The line was reared on molasses medium in plugged bottles at 25°C and 70% relative humidity with a 12/12-h light/dark cycle. Whole flies were flash frozen in liquid nitrogen in a 50-ml Falcon tube and vortexed for 3 s, and the headless bodies were then collected with a small brush. High-molecular-weight DNA was isolated from the adult Drosophila ananassae W2.1 bodies using phenol-chloroform extraction with ethanol precipitation with sodium acetate (3). Illumina paired-end (2 × 150-bp) library construction and sequencing were performed using the Nextera XT library prep protocol on an Illumina MiSeq platform, yielding 51.4 Gbp in 340,594,990 sequenced reads. Long-read library preparation (SQK-RAD004) and sequencing (FLO-MIN106 R9 MinION) protocols from Oxford Nanopore Technologies (ONT) were performed with slight modifications using 2 μg of DNA as the input and omitting library-loading beads. Raw ONT read signals were base called using Albacore v2.3.1, which yielded 864.5 Mbp in 87,410 reads without barcoding or multiplexing. Sequencing adapters and possible chimeras were removed from base-called reads with Porechop v0.2.3 (4) using --discard_middle. An initial de novo assembly using only the ONT reads and miniasm v0.2 (5) yielded Drosophila and Wolbachia contigs. From the assembly, a single contig of the complete wAna genome was identified by aligning to the wRi genome (6) using MUMmer v3.0 (7). Illumina and ONT reads mapping to this putative Wolbachia contig were identified using BWA aln/sampe (8) and Minimap2 v2.10 (9) with -ax map-ont, respectively.
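The text does not describe how the reads mapping to the putative Wolbachia contig were pulled out of the alignments. As an illustration only, the following Python sketch shows one plausible way to collect the names of reads aligned to a given contig from SAM-format records, which is the format produced with -ax. The contig names, the example records, and the function name are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Iterable, Set

UNMAPPED_FLAG = 0x4  # SAM FLAG bit that marks an unmapped segment


def reads_mapping_to(sam_lines: Iterable[str], contig: str) -> Set[str]:
    """Return the names of reads with at least one alignment to `contig`.

    Assumption: `sam_lines` are text lines in SAM format (tab-separated,
    header lines starting with '@'). This is an illustrative filter, not
    the procedure used in the announcement.
    """
    names: Set[str] = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        if flag & UNMAPPED_FLAG:
            continue  # read did not align anywhere
        if rname == contig:
            names.add(qname)
    return names


# Hypothetical records: one read on the Wolbachia contig, one unmapped,
# and one on a Drosophila contig.
example_sam = [
    "@SQ\tSN:contig_wolbachia\tLN:1401460",
    "read1\t0\tcontig_wolbachia\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\t*",
    "read3\t16\tcontig_drosophila\t500\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
]

wolbachia_reads = reads_mapping_to(example_sam, "contig_wolbachia")
print(sorted(wolbachia_reads))
```

The resulting set of read names could then be used to subset the original FASTQ files before reassembly.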
The Wolbachia-mapping reads were used to construct a new, hybrid de novo assembly using Unicycler v0.4.4 (10). The assembly was visually inspected for misassemblies by remapping Illumina and ONT reads to the hybrid de novo assembly. The wRi genome has two nearly identical 68-kbp regions that both include a prophage (Fig. 1). We identified a 28-kbp deletion at the end of the first of these duplicated regions in the wAna genome (Fig. 1). This deletion was supported by ONT reads that spanned the deleted region but failed to assemble correctly. Therefore, the correct sequence of the first duplicate region was manually inserted after being derived from the spanning ONT reads that were Illumina corrected with Pilon v1.22 (11) and manually inspected for errors. The final, complete, and corrected assembly of the Wolbachia endosymbiont of Drosophila ananassae, wAna, consists of a circular chromosome of 1,401,460 bp (GC content, 35.2%) with average sequencing depths of ∼1,240× and 12× for the Illumina and ONT reads, respectively. The genome was annotated using the IGS Prokaryotic Annotation Pipeline (12) with Prodigal v2.6.3 set to disallow calling genes that run off the edge of contigs (13). All software was run using default settings unless otherwise noted. The wAna genome contains 1,289 open reading frames (ORFs), 35 tRNA genes, and one copy of each of the 5S, 16S, and 23S rRNA genes.
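The GC content and average sequencing depths reported above follow from simple definitions. The sketch below, which is an illustration and not the authors' code, computes GC content from a sequence string and mean depth as mapped bases divided by genome length. Under that definition, the reported depths imply roughly 1.74 Gbp of mapped Illumina bases and about 16.8 Mbp of mapped ONT bases for a 1,401,460-bp chromosome. These implied totals are back-calculated from the rounded depths and are not stated in the announcement.

```python
GENOME_LENGTH = 1_401_460  # bp, circular chromosome size reported for wAna


def gc_content(sequence: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T) of a sequence."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def mean_depth(mapped_bases: float, genome_length: int) -> float:
    """Average depth, defined here as mapped bases divided by genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return mapped_bases / genome_length


# Mapped-base totals implied by the reported (approximate) depths.
implied_illumina_bases = 1_240 * GENOME_LENGTH
implied_ont_bases = 12 * GENOME_LENGTH

print(f"Implied Illumina bases: {implied_illumina_bases / 1e9:.2f} Gbp")
print(f"Implied ONT bases: {implied_ont_bases / 1e6:.1f} Mbp")
print(f"GC content of a toy sequence: {gc_content('GGCCAATT'):.1f}%")
```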

FIG 1. Synteny between wAna and wRi. A MUMmer plot between the genomes of wAna and the Wolbachia endosymbiont of D. simulans, wRi, was generated using NUCmer to assess synteny. Red and blue line segments indicate conserved regions between the two strains, with blue lines being inverted in wRi relative to wAna. The gray-shaded regions mark the two duplicate regions in the wAna and wRi genomes where there is a deletion in wAna relative to wRi.
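The red and blue coloring in such a plot reflects match orientation. A minimal Python sketch of that classification follows, assuming each aligned segment is given as start and end coordinates in both genomes and that a reverse-strand match is reported with descending coordinates in one of the two sequences. The `Match` type and the example coordinates are hypothetical and are not taken from the actual wAna and wRi alignment.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """One aligned segment; coordinates are 1-based positions (assumed)."""
    ref_start: int
    ref_end: int
    qry_start: int
    qry_end: int


def orientation(match: Match) -> str:
    """Classify a segment as 'forward' or 'inverted'.

    A segment is inverted when the coordinates run in opposite directions
    in the two genomes.
    """
    ref_ascending = match.ref_end >= match.ref_start
    qry_ascending = match.qry_end >= match.qry_start
    return "forward" if ref_ascending == qry_ascending else "inverted"


# Hypothetical segments for illustration.
segments = [
    Match(1_000, 5_000, 2_000, 6_000),    # same direction in both genomes
    Match(10_000, 15_000, 40_000, 35_000) # descending in the query
]

for segment in segments:
    print(segment, orientation(segment))
```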

Data availability.

The complete genome sequence of wAna has been deposited in GenBank under the accession number CP042904. The Oxford Nanopore FASTQ file, Oxford Nanopore FAST5 file, and Illumina sequencing reads are available from the NCBI Sequence Read Archive (SRA) under the accession numbers SRR8306005, SRR9866440, and SRR8278850, respectively.

ACKNOWLEDGMENTS

This project was funded by the NSF (ABI1457957), as well as by an NIH Director’s Transformative Research Award (R01CA206188) and the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under grant number U19AI110820.

We thank John Werren and Michael Clark at the University of Rochester and Irene Newton at Indiana University for producing and sharing the fly line, respectively. Illumina genome sequencing was conducted by the Genome Resource Center at the Institute for Genome Sciences.

REFERENCES

1. Klasson L, Kumar N, Bromley R, Sieber K, Flowers M, Ott SH, Tallon LJ, Andersson SG, Dunning Hotopp JC. 2014. Extensive duplication of the Wolbachia DNA in chromosome four of Drosophila ananassae. BMC Genomics 15:1097. doi: 10.1186/1471-2164-15-1097.
2. Dunning Hotopp JC, Klasson L. 2018. The complexities and nuances of analyzing the genome of Drosophila ananassae and its Wolbachia endosymbiont. G3 (Bethesda) 8:373–374. doi: 10.1534/g3.117.300164.
3. Quick J. 2018. Ultra-long read sequencing protocol for RAD004. https://www.protocols.io/view/ultra-long-read-sequencing-protocol-for-rad004-mrxc57n.
4. Wick R. 2018. Porechop. https://github.com/rrwick/Porechop.
5. Li H. 2016. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics 32:2103–2110. doi: 10.1093/bioinformatics/btw152.
6. Klasson L, Westberg J, Sapountzis P, Naslund K, Lutnaes Y, Darby AC, Veneti Z, Chen L, Braig HR, Garrett R, Bourtzis K, Andersson SG. 2009. The mosaic genome structure of the Wolbachia wRi strain infecting Drosophila simulans. Proc Natl Acad Sci U S A 106:5725–5730. doi: 10.1073/pnas.0810753106.
7. Delcher AL, Salzberg SL, Phillippy AM. 2003. Using MUMmer to identify similar regions in large sequence sets. Curr Protoc Bioinformatics Chapter 10:Unit 10.3. doi: 10.1002/0471250953.bi1003s00.
8. Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760. doi: 10.1093/bioinformatics/btp324.
9. Li H. 2018. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34:3094–3100. doi: 10.1093/bioinformatics/bty191.
10. Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595. doi: 10.1371/journal.pcbi.1005595.
11. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963. doi: 10.1371/journal.pone.0112963.
12. Galens K, Orvis J, Daugherty S, Creasy HH, Angiuoli S, White O, Wortman J, Mahurkar A, Giglio MG. 2011. The IGS standard operating procedure for automated prokaryotic annotation. Stand Genomic Sci 4:244–251. doi: 10.4056/sigs.1223234.
13. Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11:119. doi: 10.1186/1471-2105-11-119.

