Genomics Data. 2015 Feb 2;4:22–23. doi: 10.1016/j.gdata.2015.01.009

Draft genome sequence of the intestinal parasite Blastocystis subtype 4-isolate WR1

Ivan Wawrzyniak a, Damien Courtine a, Marwan Osman b, Christine Hubans-Pierlot c, Amandine Cian b, Céline Nourrisson a, Magali Chabe b, Philippe Poirier a, Aldert Bart d, Valérie Polonais a, Pilar Delgado-Viscogliosi b, Hicham El Alaoui a, Abdel Belkorchia a, Tom van Gool d, Kevin SW Tan e, Stéphanie Ferreira c, Eric Viscogliosi b, Frédéric Delbac a
PMCID: PMC4535960  PMID: 26484170

Abstract

The intestinal protistan parasite Blastocystis is characterized by extensive genetic variability, with 17 subtypes (ST1–ST17) described to date. Until now, the only whole genome sequenced was that of a human ST7 isolate. Here we report the draft genome sequence of Blastocystis ST4-WR1, isolated from a laboratory rodent in Singapore.

Keywords: Blastocystis subtype 4-isolate WR1, Illumina-HiSeq, Whole genome, Annotation using Maker gene annotation pipeline


Specifications
Organism/cell line/tissue: Blastocystis ST4
Strain: WR1
Sequencer or array type: Illumina-HiSeq 2000
Data format: Processed
Experimental factors: Laboratory rodent and cultured axenically
Experimental features: Draft genome sequence of the intestinal parasite Blastocystis ST4-WR1 isolate
Consent: n/a
Sample source location: Clermont-Ferrand, France

Direct link to data

This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JPUL02000000 (http://www.ncbi.nlm.nih.gov/nuccore/JPUL00000000.2).

Experimental design, materials and methods

The stramenopile Blastocystis is a common anaerobic protist living in the digestive tract of several animal groups [1]. Its prevalence in humans often exceeds 5% in industrialized countries [1] and can reach 100% in developing countries [2]. Although the role of Blastocystis as a human pathogen remains unclear, it has been associated with acute or chronic digestive disorders, and some epidemiological surveys have suggested an association with irritable bowel syndrome (IBS) [3], [4]. In patients with IBS, Blastocystis seems to be associated with a decrease in the protective bacteria of the fecal microbiota, Bifidobacterium sp. and Faecalibacterium prausnitzii [5]. The life cycle of the parasite is poorly documented. Among the parasitic forms described in the literature, the vacuolar stage, which is maintained in vitro in axenic culture, is the most easily recognizable and the most frequently observed in stool samples.

Blastocystis exhibits extensive genetic diversity, and seventeen subtypes (ST1–ST17) have been identified based on the gene coding for the small-subunit ribosomal RNA [6], of which the first nine are found in humans. The whole genome of a human Blastocystis ST7 isolate was previously sequenced. Briefly, it consists of an 18.8 Mbp nuclear genome with 6020 predicted genes [7] and a circular genome of 29 kbp [8] located within mitochondria-like organelles (MLO). Other MLO genomes with conserved gene synteny have also been sequenced from Blastocystis ST1, ST3 and ST4 isolates [9], [10].

Here we report the sequencing of the genome of Blastocystis ST4-WR1, an isolate obtained from a laboratory rodent and cultured axenically [11]. Genomic DNA was isolated using a Qiagen DNeasy Blood & Tissue kit, and sequencing was performed on the Illumina HiSeq 2000 system (Genoscreen, Lille, France). A total of 43,855,085 high-quality 100-bp paired-end reads were generated and assembled de novo using the IDBA-UD algorithm [12].
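As a rough sanity check on the sequencing effort, the read count above can be turned into an approximate mean depth with the usual N × L / G estimate. The sketch below is illustrative only: it assumes the reported figure counts individual reads rather than read pairs (the text does not say which), and it uses the 12.91 Mbp draft assembly length reported in this article as a stand-in for the true genome size. The function name is hypothetical.

```python
# Back-of-the-envelope sequencing depth from figures reported in the article.
# Assumption: 43,855,085 is the number of individual reads, not read pairs.
# Assumption: the 12.91 Mbp draft assembly approximates the genome size.

def mean_depth(n_reads: int, read_length_bp: int, genome_size_bp: float) -> float:
    """Expected mean coverage: total sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_reads * read_length_bp / genome_size_bp


reads = 43_855_085        # reported read count
read_len = 100            # reported read length (bp)
assembly_bp = 12.91e6     # reported draft assembly length (bp)

depth = mean_depth(reads, read_len, assembly_bp)
print(f"Approximate mean depth: {depth:.0f}x")
```

If the figure instead counts read pairs, the estimate doubles; either way it is the raw depth before any read filtering or removal of contaminant sequence.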
The output was then scaffolded using SSPACE [13], and gaps were filled with the GapFiller software [14]. In total, 1301 scaffolds ranging from 494 bp to 133,271 bp were obtained, with a scaffold N50 of 29,931 bp. The draft genome sequence of Blastocystis ST4 has a deduced total length of 12.91 Mbp and a G + C content of 39.7%. The assembly also yielded a circular DNA molecule of 27,717 bp with a G + C content of 21.9%, corresponding to the whole MLO genome sequence.

Gene prediction was carried out using the Maker gene annotation pipeline [15]. Maker was supplied with the results of the ab initio gene prediction algorithms Augustus [16] and SNAP [17], the 6020 protein-coding genes of Blastocystis ST7 [7], ESTs of both Blastocystis ST7 [7] and ST1 [18], and 414 manually designed genes of the ST4-WR1 isolate. Basic information about the assembled genome and predicted genes is shown in Table 1. Gene functions were annotated with BLAST2GO [19] and BLAST analyses at NCBI (http://www.ncbi.nlm.nih.gov/). A total of 183 tRNA genes were predicted using tRNAscan-SE 1.21 [20]. The preliminary annotation data revealed that the Blastocystis ST4-WR1 nuclear genome harbors 5713 protein-coding genes. Proteases were identified by BLAST searches against the MEROPS database [21], and secreted proteases were identified using SignalP 3.0 [22] and WoLF PSORT [23].

Finally, OrthoMCL [24] was applied to compare the ST4 and ST7 genomes. This comparative analysis revealed that the ST4 genome contains fewer duplicated genes than ST7 and that more than 30% of ST4 genes have no ortholog in the ST7 genome at an E-value cutoff of 10⁻⁵. It also led to the identification of new candidate genes, in particular some potential virulence factors, including 20 secreted proteases that may be involved in the physiopathology of this parasite. Among these proteases, 7 seem to be specific to ST4, as no ortholog was found in the ST7 genome.
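The two summary statistics quoted for the assembly, scaffold N50 and G + C content, follow standard definitions. The sketch below shows one common way to compute them; it is not the authors' script, the function names are hypothetical, and the inputs are toy values rather than the ST4-WR1 scaffolds.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths given")


def gc_fraction(seq: str) -> float:
    """G + C fraction over unambiguous bases (N and other symbols are ignored)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0


# Toy inputs for illustration only (not data from the article).
toy_lengths = [100, 80, 60, 40, 20]
toy_seq = "ATGCGCNNAT"

print("N50:", n50(toy_lengths))
print("G+C:", f"{gc_fraction(toy_seq):.1%}")
```

Applied to real data, `n50` would take the lengths of all scaffolds in the FASTA file and `gc_fraction` the concatenated scaffold sequence. Excluding N bases matters for scaffolded assemblies, where unfilled gaps are padded with Ns.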
Sequencing and annotation of the genomes of additional subtypes (ST1, ST2, ST3 and ST8) are in progress and should contribute to a better understanding of the genetic diversity, pathogenesis, metabolic potential and genome evolution of this highly prevalent human parasite.

Table 1.

Genome statistics and intron features of Blastocystis ST4 and ST7.

Feature                                  Blastocystis ST4   Blastocystis ST7
Genome assembly size                     12.91 Mbp          18.8 Mbp
G + C content                            39.6%              45.2%
Number of genes                          5713               6021
Average gene size                        1386 bp            1299 bp
Genes with introns                       92.7%              84.6%
Average exon number per gene             5.06               4.58
Average length of introns (nt)           33                 50
Average length of proteins (aa)          416                359
MLO genome size                          27,815 bp          29,270 bp
MLO G + C content                        21.94%             20.03%
Number of MLO genes                      45                 45
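One quick way to read the table is to convert gene counts and assembly sizes into gene densities. The short sketch below does only that arithmetic on the Table 1 values. The density figures are derived here for illustration and are not reported in the article, and the dictionary layout and function name are hypothetical.

```python
# Gene density (genes per Mbp) derived from Table 1.
# These derived values are illustrative arithmetic, not results from the article.

table1 = {
    "ST4": {"assembly_mbp": 12.91, "genes": 5713},
    "ST7": {"assembly_mbp": 18.8, "genes": 6021},
}


def genes_per_mbp(genes: int, assembly_mbp: float) -> float:
    """Number of predicted genes per megabase of assembled sequence."""
    return genes / assembly_mbp


density = {
    subtype: genes_per_mbp(row["genes"], row["assembly_mbp"])
    for subtype, row in table1.items()
}

for subtype, value in density.items():
    print(f"{subtype}: {value:.1f} genes per Mbp")
```

Because the ST4 figure rests on a draft assembly, the comparison is only indicative: unassembled repeats or collapsed duplicated regions would shrink the denominator and inflate the apparent density.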

Conflict of interest

The authors declare no conflict of interest.

Acknowledgments

This work was funded by grants from the French National Center for Scientific Research (CNRS), the INSERM, the Programme Orientations Stratégiques from the University of Lille 2 and the Institut Pasteur of Lille. MO was supported by a PhD fellowship from the Conseil National de la Recherche Scientifique and the Azm & Saade Association from Lebanon and AC by a PhD fellowship from the Pasteur Institute of Lille and the University of Lille 2.

Footnotes

Appendix A

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gdata.2015.01.009.

Contributor Information

Eric Viscogliosi, Email: eric.viscogliosi@pasteur-lille.fr.

Frédéric Delbac, Email: frederic.delbac@univ-bpclermont.fr.


References

1. Tan K.S., Mirza H., Teo J.D., Wu B., Macary P.A. Current views on the clinical relevance of Blastocystis spp. Curr. Infect. Dis. Rep. 2010;12:28–35. doi: 10.1007/s11908-009-0073-8.
2. Wawrzyniak I., Poirier P., Viscogliosi E., Meloni D., Texier C., Delbac F., Alaoui H.E. Blastocystis, an unrecognized parasite: an overview of pathogenesis and diagnosis. Ther. Adv. Infect. Dis. 2013;1:167–178. doi: 10.1177/2049936113504754.
3. Poirier P., Wawrzyniak I., Vivares C.P., Delbac F., El Alaoui H. New insights into Blastocystis spp.: a potential link with irritable bowel syndrome. PLoS Pathog. 2012;8:e1002545. doi: 10.1371/journal.ppat.1002545.
4. El Safadi D., Gaayeb L., Meloni D., Cian A., Poirier P., Wawrzyniak I., Delbac F., Dabboussi F., Delhaes L., Seck M., Hamze M., Riveau G., Viscogliosi E. Children of Senegal River Basin show the highest prevalence of Blastocystis sp. ever observed worldwide. BMC Infect. Dis. 2014;14:164. doi: 10.1186/1471-2334-14-164.
5. Nourrisson C., Scanzi J., Pereira B., NkoudMongo C., Wawrzyniak I., Cian A., Viscogliosi E., Livrelli V., Delbac F., Dapoigny M., Poirier P. Blastocystis is associated with decrease of fecal microbiota protective bacteria: comparative analysis between patients with irritable bowel syndrome and control subjects. PLoS ONE. 2014;9:e111868. doi: 10.1371/journal.pone.0111868.
6. Alfellani M.A., Taner-Mulla D., Jacob A.S., Atim Imeede C., Yoshikawa H., Stensvold C.R., Clark C.G. Genetic diversity of Blastocystis in livestock and zoo animals. Protist. 2013;164:497–509. doi: 10.1016/j.protis.2013.05.003.
7. Denoeud F., Roussel M., Noel B., Wawrzyniak I., Da Silva C., Diogon M., Viscogliosi E., Brochier-Armanet C., Couloux A., Poulain J., Segurens B., Anthouard V., Texier C., Blot N., Poirier P., Ng G.C., Tan K.S., Artiguenave F., Jaillon O., Aury J.M., Delbac F., Wincker P., Vivares C.P., El Alaoui H. Genome sequence of the stramenopile Blastocystis, a human anaerobic parasite. Genome Biol. 2011;12:R29. doi: 10.1186/gb-2011-12-3-r29.
8. Wawrzyniak I., Roussel M., Diogon M., Couloux A., Texier C., Tan K.S., Vivares C.P., Delbac F., Wincker P., El Alaoui H. Complete circular DNA in the mitochondria-like organelles of Blastocystis hominis. Int. J. Parasitol. 2008;38:1377–1382. doi: 10.1016/j.ijpara.2008.06.001.
9. Perez-Brocal V., Clark C.G. Analysis of two genomes from the mitochondrion-like organelle of the intestinal parasite Blastocystis: complete sequences, gene content, and genome organization. Mol. Biol. Evol. 2008;25:2475–2482. doi: 10.1093/molbev/msn193.
10. Stensvold C.R., Alfellani M., Clark C.G. Levels of genetic diversity vary dramatically between Blastocystis subtypes. Infect. Genet. Evol. 2012;12:263–273. doi: 10.1016/j.meegid.2011.11.002.
11. Chen X.Q., Singh M., Ho L.C., Tan S.W., Ng G.C., Moe K.T., Yap E.H. Description of a Blastocystis species from Rattus norvegicus. Parasitol. Res. 1997;83:313–318. doi: 10.1007/s004360050255.
12. Peng Y., Leung H.C., Yiu S.M., Chin F.Y. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics. 2012;28:1420–1428. doi: 10.1093/bioinformatics/bts174.
13. Boetzer M., Henkel C.V., Jansen H.J., Butler D., Pirovano W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics. 2011;27:578–579. doi: 10.1093/bioinformatics/btq683.
14. Boetzer M., Pirovano W. Toward almost closed genomes with GapFiller. Genome Biol. 2012;13:R56. doi: 10.1186/gb-2012-13-6-r56.
15. Holt C., Yandell M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinf. 2011;12:491. doi: 10.1186/1471-2105-12-491.
16. Hoff K.J., Stanke M. WebAUGUSTUS - a web service for training AUGUSTUS and predicting genes in eukaryotes. Nucleic Acids Res. 2013;41:W123–W128. doi: 10.1093/nar/gkt418.
17. Korf I. Gene finding in novel genomes. BMC Bioinf. 2004;5:59. doi: 10.1186/1471-2105-5-59.
18. Stechmann A., Hamblin K., Pérez-Brocal V., Gaston D., Richmond G.S., van der Giezen M., Clark C.G., Roger A.J. Organelles that blur the distinction between mitochondria and hydrogenosomes. Curr. Biol. 2008;18:580–585. doi: 10.1016/j.cub.2008.03.037.
19. Conesa A., Gotz S., Garcia-Gomez J.M., Terol J., Talon M., Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21:3674–3676. doi: 10.1093/bioinformatics/bti610.
20. Lowe T.M., Eddy S.R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25:955–964. doi: 10.1093/nar/25.5.955.
21. Rawlings N.D., Barrett A.J., Bateman A. MEROPS: the database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 2012;40:D343–D350. doi: 10.1093/nar/gkr987.
22. Petersen T.N., Brunak S., von Heijne G., Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat. Methods. 2011;8:785–786. doi: 10.1038/nmeth.1701.
23. Horton P., Park K.J., Obayashi T., Fujita N., Harada H., Adams-Collier C.J., Nakai K. WoLF PSORT: protein localization predictor. Nucleic Acids Res. 2007;35:W585–W587. doi: 10.1093/nar/gkm259.
24. Li L., Stoeckert C.J., Jr., Roos D.S. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 2003;13:2178–2189. doi: 10.1101/gr.1224503.
