Abstract
Spiroplasma turonicum Tab4cT was isolated from a horse fly (Haematopota sp.; probably Haematopota pluvialis) collected at Champchevrier, Indre-et-Loire, Touraine, France, in 1991. Here, we report the complete genome sequence of this bacterium to facilitate investigation of its biology and comparative genomic analyses among Spiroplasma spp.
GENOME ANNOUNCEMENT
Spiroplasma turonicum is a bacterium associated with Haematopota sp. horse flies. The type strain Tab4cT was isolated from a single fly, probably Haematopota pluvialis, collected at Champchevrier (Indre-et-Loire, Touraine, France) in 1991 and was assigned to group XVII within the genus (1). As part of our ongoing effort to investigate Spiroplasma genome evolution (2), we determined the complete genome sequence of S. turonicum Tab4cT.
The DNA sample was prepared from the strain maintained in Gail Gasparich’s laboratory at Towson University, which was acquired from the USDA/ARS Spiroplasma Culture Collection of Robert Whitcomb in 1996. This strain had been lyophilized after 17 passages from the original isolation. Before we completed this project, a complete genome sequence of this bacterium, based on another subculture of the same strain and generated using the Pacific Biosciences platform, was published (3). We therefore used the published sequence (GenBank accession no. CP012328.1) as the reference for a resequencing analysis. We chose the Illumina MiSeq platform to generate 301-bp reads from one paired-end library (~510-bp insert, 1,206,242 reads, ~288-fold coverage). The raw reads were mapped to the reference genome using BWA version 0.7.12 (4), programmatically checked using SAMtools version 1.2 (5), and visually inspected using IGV version 2.3.57 (6).
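The reported fold coverage can be checked from the read count, read length, and genome size. The following is a minimal sketch rather than the calculation actually used in this work: the function name is hypothetical, every read is assumed to be the full 301 bp, and the chromosome size of 1,261,375 bp reported in this announcement is used as the denominator.

```python
def fold_coverage(n_reads: int, read_len: int, genome_size: int) -> float:
    """Nominal fold coverage: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_len / genome_size


# Values taken from the text; assumes all reads are full length (301 bp).
coverage = fold_coverage(n_reads=1_206_242, read_len=301, genome_size=1_261_375)
print(f"{coverage:.1f}-fold")
```

Under these assumptions the estimate rounds to 288-fold, in agreement with the figure given above. Because it ignores read trimming and unmapped reads, it is a nominal upper bound rather than the mapped depth.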
The procedures for genome annotation were based on those described in our previous studies on Spiroplasma genomes (7–15). The programs RNAmmer (16), tRNAscan-SE (17), and Prodigal (18) were used for gene prediction. The gene names and product descriptions were first annotated based on the homologous genes in other Spiroplasma genomes, as identified by OrthoMCL (19). Subsequent manual curation was based on BLASTp (20) searches against the NCBI nonredundant database (21) and the KEGG database (22, 23). Putative clustered regularly interspaced short palindromic repeats (CRISPRs) were identified using CRISPRFinder (24).
Our resequencing analysis identified 13 polymorphic sites, comprising 12 single-nucleotide polymorphisms and one 1-bp indel in a homopolymeric region. It is unclear whether these polymorphisms reflect true genetic variation or are artifacts of the sequencing technologies used. After correcting for these polymorphisms, the S. turonicum Tab4cT chromosome described in this work is 1,261,375 bp in size and has a G+C content of 24.2%. Both S. turonicum genomes have one set of 16S-23S-5S rRNA genes, 29 tRNA genes (covering all 20 amino acids), and one 2,940-bp CRISPR locus (containing 44 spacers). However, the annotation of protein-coding genes differs between the two genomes. In CP012328.1, the annotation includes 1,085 protein-coding genes and no pseudogenes. Several of these predicted protein-coding genes appeared to be fragments of disrupted open reading frames and were merged into pseudogenes in our annotation. In the first version of our annotation, the S. turonicum Tab4cT genome contains 1,066 protein-coding genes and eight pseudogenes. Finally, the annotation of gene names and product descriptions in this newly reported S. turonicum genome is more consistent with the majority of published Spiroplasma genomes (7–15).
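For readers who wish to recompute the G+C content from the deposited sequence, the calculation can be sketched as follows. This is an illustrative helper and not part of the pipeline described here: the function name is hypothetical, ambiguous bases are assumed to be excluded from the denominator, and the short test sequence is invented for demonstration and is not taken from the genome.

```python
def gc_content(seq: str) -> float:
    """Return G+C content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / acgt


# Invented toy sequence (20 bases, of which 4 are G or C).
toy = "ATTTAGCATTAAATCGTTAA"
print(f"{gc_content(toy):.1f}%")
```

Applied to the full chromosome sequence read from a FASTA file, the same function would be expected to give a value close to the 24.2% reported above.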
Accession number(s).
The complete genome sequence of S. turonicum Tab4cT has been deposited at DDBJ/EMBL/GenBank under the accession number CP013860.
ACKNOWLEDGMENTS
The funding for this project was provided by the Institute of Plant and Microbial Biology at Academia Sinica and the Ministry of Science and Technology of Taiwan (NSC 101-2621-B-001-004-MY3 and MOST 104-2311-B-001-019) to C.-H.K. W.-S.L. was supported by the TIGP-MBAS program (Academia Sinica and National Chung Hsing University). The sequencing library preparation service was provided by the DNA Microarray Core Laboratory (Institute of Plant and Microbial Biology, Academia Sinica). The Illumina MiSeq sequencing service was provided by the DNA Sequencing Core Facility (Institute of Molecular Biology, Academia Sinica).
Funding Statement
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Footnotes
Citation Lo W-S, Gasparich GE, Kuo C-H. 2016. Complete genome sequence of Spiroplasma turonicum Tab4cT, a bacterium isolated from horse flies (Haematopota sp.). Genome Announc 4(5):e01010-16. doi:10.1128/genomeA.01010-16.
REFERENCES
- 1. Hélias C, Vazeille-Falcoz M, Le Goff F, Abalain-Colloc M-L, Rodhain F, Carle P, Whitcomb RF, Williamson DL, Tully JG, Bové JM, Chastel C. 1998. Spiroplasma turonicum sp. nov. from Haematopota horse flies (Diptera: Tabanidae) in France. Int J Syst Bacteriol 48:457–461. doi:10.1099/00207713-48-2-457.
- 2. Lo W-S, Huang Y-Y, Kuo C-H. Winding paths to simplicity: genome evolution in facultative insect symbionts. FEMS Microbiol Rev, in press. doi:10.1093/femsre/fuw028.
- 3. Davis RE, Shao J, Zhao Y, Gasparich GE, Gaynor BJ, Donofrio N. 2015. Complete genome sequence of Spiroplasma turonicum strain Tab4cT, a parasite of a horse fly, Haematopota sp. (Diptera: Tabanidae). Genome Announc 3(6):e01367-15. doi:10.1128/genomeA.01367-15.
- 4. Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25:1754–1760. doi:10.1093/bioinformatics/btp324.
- 5. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079. doi:10.1093/bioinformatics/btp352.
- 6. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. 2011. Integrative genomics viewer. Nat Biotechnol 29:24–26. doi:10.1038/nbt.1754.
- 7. Lo W-S, Chen L-L, Chung W-C, Gasparich GE, Kuo C-H. 2013. Comparative genome analysis of Spiroplasma melliferum IPMB4A, a honeybee-associated bacterium. BMC Genomics 14:22. doi:10.1186/1471-2164-14-22.
- 8. Ku C, Lo W-S, Chen L-L, Kuo C-H. 2013. Complete genomes of two dipteran-associated spiroplasmas provided insights into the origin, dynamics, and impacts of viral invasion in Spiroplasma. Genome Biol Evol 5:1151–1164. doi:10.1093/gbe/evt084.
- 9. Lo W-S, Ku C, Chen L-L, Chang T-H, Kuo C-H. 2013. Comparison of metabolic capacities and inference of gene content evolution in mosquito-associated Spiroplasma diminutum and S. taiwanense. Genome Biol Evol 5:1512–1523. doi:10.1093/gbe/evt108.
- 10. Ku C, Lo W-S, Chen L-L, Kuo C-H. 2014. Complete genome sequence of Spiroplasma apis B31T (ATCC 33834), a bacterium associated with May disease of honeybees (Apis mellifera). Genome Announc 2(1):e01151-13. doi:10.1128/genomeA.01151-13.
- 11. Chang T-H, Lo W-S, Ku C, Chen L-L, Kuo C-H. 2014. Molecular evolution of the substrate utilization strategies and putative virulence factors in mosquito-associated Spiroplasma species. Genome Biol Evol 6:500–509. doi:10.1093/gbe/evu033.
- 12. Paredes JC, Herren JK, Schüpfer F, Marin R, Claverol S, Kuo C-H, Lemaitre B, Béven L. 2015. Genome sequence of the Drosophila melanogaster male-killing Spiroplasma strain MSRO endosymbiont. mBio 6:e02437-14. doi:10.1128/mBio.02437-14.
- 13. Lo W-S, Gasparich GE, Kuo C-H. 2015. Found and lost: the fates of horizontally acquired genes in arthropod-symbiotic Spiroplasma. Genome Biol Evol 7:2458–2472. doi:10.1093/gbe/evv160.
- 14. Lo W-S, Lai Y-C, Lien Y-W, Wang T-H, Kuo C-H. 2015. Complete genome sequence of Spiroplasma litorale TN-1T (DSM 21781), a bacterium isolated from a green-eyed horsefly (Tabanus nigrovittatus). Genome Announc 3(5):e01116-15. doi:10.1128/genomeA.01116-15.
- 15. Lo W-S, Liu P-Y, Kuo C-H. 2015. Complete genome sequence of Spiroplasma cantharicola CC-1T (DSM 21588), a bacterium isolated from soldier beetle (Cantharis carolinus). Genome Announc 3(5):e01253-15. doi:10.1128/genomeA.01253-15.
- 16. Lagesen K, Hallin P, Rødland EA, Staerfeldt H-H, Rognes T, Ussery DW. 2007. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 35:3100–3108. doi:10.1093/nar/gkm160.
- 17. Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25:955–964. doi:10.1093/nar/25.5.0955.
- 18. Hyatt D, Chen G-L, LoCascio PF, Land ML, Larimer FW, Hauser LJ. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11:119. doi:10.1186/1471-2105-11-119.
- 19. Li L, Stoeckert CJ, Roos DS. 2003. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res 13:2178–2189. doi:10.1101/gr.1224503.
- 20. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. 2009. BLAST+: architecture and applications. BMC Bioinformatics 10:421. doi:10.1186/1471-2105-10-421.
- 21. Benson DA, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. 2015. GenBank. Nucleic Acids Res 43:D30–D35. doi:10.1093/nar/gku1216.
- 22. Kanehisa M, Goto S. 2000. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res 28:27–30. doi:10.1093/nar/28.1.27.
- 23. Kanehisa M, Goto S, Furumichi M, Tanabe M, Hirakawa M. 2010. KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Res 38:D355–D360. doi:10.1093/nar/gkp896.
- 24. Grissa I, Vergnaud G, Pourcel C. 2007. CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic Acids Res 35:W52–W57. doi:10.1093/nar/gkm360.