Genome Announcements. 2014 Apr 3;2(2):e00237-14. doi: 10.1128/genomeA.00237-14

Improved Hybrid Genome Assemblies of Two Strains of Bacteroides xylanisolvens, SD_CC_1b and SD_CC_2a, Obtained Using Illumina and 454 Sequencing Technologies

Thiruvarangan Ramaraj a, Anitha Sundararajan a, Faye D Schilkey a, Vito G DelVecchio b, Mildred Donlon c, Cherie Ziemer d, Joann Mudge a
PMCID: PMC3974937  PMID: 24699955

Abstract

Bacteroides xylanisolvens strains (SD_CC_1b, SD_CC_2a) isolated from human feces were grown on crystalline cellulose. Cellulolytic properties are not common in Bacteroides species. Here, we report improved genome sequences of both B. xylanisolvens strains.

GENOME ANNOUNCEMENT

Bacteroides xylanisolvens is commonly found in the human gut (1). The type strain, XB1A, has high xylanase activity and extensively ferments xylan (2). We isolated two strains, SD_CC_1b and SD_CC_2a, using a substrate-depleted medium (3) with crystalline cellulose as the only carbohydrate, at the USDA-ARS National Laboratory for Agriculture and the Environment (Ames, IA). Analysis of the postfermentation cellulose structure indicated that SD_CC_1b and SD_CC_2a might have unique degradation properties (data not shown). To investigate the genomic basis for these results, we sequenced the whole genomes of both strains.

DNA was isolated using a DNeasy blood and tissue kit (Qiagen). To determine phylogenetic species placement, 16S rRNA genes were PCR amplified with 27F (5′ AGAGGTTTGATCMTGGCTCAG 3′) and 1492R (5′ TACGGYTACCTTGTTACGACTT 3′) primers (Invitrogen) on a DYAD DNA Engine thermocycler (MJ Research). The PCR mixture (50 µl) contained 1× Qiagen PCR buffer, 1.25 U of Taq polymerase (Qiagen), 0.25 mM of each deoxynucleoside triphosphate (dNTP) (Amresco), 25 pmol of each primer, and 80 ng of template DNA. Amplified products were cleaned using QIAquick 96 PCR purification (Qiagen) and sequenced by the Iowa State University DNA Sequencing and Synthesis Facility (Ames, IA) using an ABI Prism 377 sequencer.
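As a quick check on the reaction setup, the absolute amounts given for the 50-µl PCR mixture can be converted to per-volume concentrations. The sketch below is only a unit-conversion aid; the helper name `per_microliter` is hypothetical and not part of any protocol or library.

```python
# Minimal sketch: convert the absolute amounts stated for the 50-µl PCR
# mixture into per-microliter concentrations.
# The helper name is hypothetical (an illustration, not a published tool).

def per_microliter(amount, volume_ul):
    """Return the amount present per microliter of reaction volume."""
    if volume_ul <= 0:
        raise ValueError("reaction volume must be positive")
    return amount / volume_ul

REACTION_VOLUME_UL = 50.0  # stated reaction volume

# 1 pmol/µl is numerically equal to 1 µmol/l (µM).
primer_um = per_microliter(25.0, REACTION_VOLUME_UL)           # each primer, µM
template_ng_per_ul = per_microliter(80.0, REACTION_VOLUME_UL)  # template DNA, ng/µl
taq_u_per_ul = per_microliter(1.25, REACTION_VOLUME_UL)        # Taq polymerase, U/µl

print(f"primer: {primer_um:.2f} µM each")
print(f"template: {template_ng_per_ul:.2f} ng/µl")
print(f"Taq: {taq_u_per_ul:.3f} U/µl")
```

Under these stated amounts, each primer is present at 0.5 µM and the template at 1.6 ng/µl.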

We built upon an existing assembly based on a total of 337,702 reads (215.6 million bp, ~35-fold coverage) for SD_CC_1b (GenBank accession no. SRX015718) and 245,608 reads (143 million bp, ~23-fold coverage) for SD_CC_2a (SRX015722), generated using 454 GS-FLX and assembled using the Newbler assembler (4). That assembly comprised 6,059,812 bp in 236 contigs (N50, 60,820 bp) for SD_CC_1b (GenBank accession no. ASM17821v1) and 6,050,198 bp in 305 contigs (N50, 40,148 bp) for SD_CC_2a (ASM17829v1). We added a small-insert (~374 bp) library for each strain, prepared using the standard Illumina (HiSeq2000) protocol. After screening for sequencing artifacts and ΦX contamination, 50.6 million (~1,687-fold) and 47.1 million (~1,569-fold) 100-bp paired-end reads were obtained for SD_CC_1b and SD_CC_2a, respectively. Coverage was calculated based on an estimated genome size of ~6 Mbp. Scaffolds were generated from the Illumina reads using the ABySS (v 1.3.4) assembler (5). The Illumina and 454 assemblies were merged using PHRAP (6), an overlap-layout-consensus (OLC)-style assembler. The SD_CC_1b assembly had a total of 6,484,037 bp (60 scaffolds, with a GC content of 42% and an N50 of 230,871 bp), and the SD_CC_2a assembly had a total of 6,228,594 bp (68 scaffolds, with a GC content of 42% and an N50 of 214,376 bp). The assemblies were greatly improved: the number of sequences decreased from 236 to 60 in SD_CC_1b and from 305 to 68 in SD_CC_2a, and the N50 increased from 60,820 bp to 230,871 bp and from 40,148 bp to 214,376 bp, respectively. The quality of both assemblies was assessed by mapping the Illumina reads back to the assembly using BWA (7). For both strains, ~87% of the reads mapped back to the final genome, with ~86% properly paired and ~85% mapping uniquely, which validated the de novo assembly process. Gene prediction and annotation for both strains were performed using the RAST (8) server incorporating GLIMMER (9, 10). A total of 5,521 protein-coding and 83 RNA sequences were predicted for SD_CC_1b, and 5,328 protein-coding and 75 RNA sequences were predicted for SD_CC_2a.
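The two summary statistics used above, fold coverage and N50, can be sketched as follows. This is an illustrative calculation, not the authors' pipeline: the function names are hypothetical, the contig lengths in the N50 example are toy values, and the Illumina figure is reproduced only under the assumption (inferred from the arithmetic, not stated in the text) that each reported count is a read pair contributing 2 × 100 bp. Because the genome size is a rounded ~6 Mbp estimate, the 454 values agree with the reported ones only approximately.

```python
# Minimal sketch of fold coverage and N50 (hypothetical helper names).

def fold_coverage(total_bases, genome_size_bp):
    """Sequenced bases divided by the estimated genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp

def n50(lengths):
    """Length L such that sequences of length >= L contain at least
    half of the total assembled bases."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length

GENOME_SIZE_BP = 6e6  # the ~6 Mbp estimate stated in the text

# 454 data for SD_CC_1b: 215.6 million bp in total.
cov_454_1b = fold_coverage(215.6e6, GENOME_SIZE_BP)

# Illumina data for SD_CC_1b, ASSUMING 50.6 million read pairs of 2 x 100 bp.
cov_illumina_1b = fold_coverage(50.6e6 * 2 * 100, GENOME_SIZE_BP)

print(f"454 coverage, SD_CC_1b: ~{cov_454_1b:.1f}-fold")
print(f"Illumina coverage, SD_CC_1b: ~{cov_illumina_1b:.0f}-fold")

# Toy contig lengths (hypothetical, for illustration only).
toy_contigs = [80, 70, 50, 40, 30, 20]
print("toy N50:", n50(toy_contigs))
```

A higher N50 with fewer sequences, as reported for the merged assemblies, means that half of the assembled bases now sit in longer scaffolds.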

Nucleotide sequence accession numbers.

Draft genome sequences for both strains have been deposited in the European Nucleotide Archive, under accession numbers CBXG000000000 (SD_CC_1b) and CBXH000000000 (SD_CC_2a).

ACKNOWLEDGMENTS

This research was supported by grants from the Defense Advanced Research Projects Agency as part of the Intestinal Fortitude and Crystalline Cellulose Conversion to Glucose programs to C.Z. and V.G.D. and by the National Institute of General Medical Sciences (8P20GM103451-12).

Footnotes

Citation Ramaraj T, Sundararajan A, Schilkey FD, DelVecchio VG, Donlon M, Ziemer C, Mudge J. 2014. Improved hybrid genome assemblies of two strains of Bacteroides xylanisolvens, SD_CC_1b and SD_CC_2a, obtained using Illumina and 454 sequencing technologies. Genome Announc. 2(2):e00237-14. doi:10.1128/genomeA.00237-14.

REFERENCES

  • 1. Qin J, Li R, Raes J, Arumugam M, Burgdorf KS, Manichanh C, Nielsen T, Pons N, Levenez F, Yamada T, Mende DR, Li J, Xu J, Li S, Li D, Cao J, Wang B, Liang H, Zheng H, Xie Y, Tap J, Lepage P, Bertalan M, Batto J-M, Hansen T, Le Paslier D, Linneberg A, Nielsen HB, Pelletier E, Renault P, Sicheritz-Ponten T, Turner K, Zhu H, Yu C, Li S, Jian M, Zhou Y, Li Y, Zhang X, Li S, Qin N, Yang H, Wang J, Brunak S, Doré J, Guarner F, Kristiansen K, Pedersen O, Parkhill J, Weissenbach J, MetaHIT Consortium, Bork P, Ehrlich SD, Wang J. 2010. A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464:59. doi:10.1038/nature08821.
  • 2. Mirande C, Kadlecikova E, Matulova M, Capek P, Bernalier-Donadille A, Forano E, Béra-Maillet C. 2010. Dietary fibre degradation and fermentation by two xylanolytic bacteria Bacteroides xylanisolvens XB1A and Roseburia intestinalis XB6B4 from the human intestine. J. Appl. Microbiol. 109:451–460. doi:10.1111/j.1365-2672.2010.04671.x.
  • 3. Allison MJ, Robinson IM, Bucklin JA, Booth GD. 1979. Comparison of bacterial populations of the pig cecum and colon based upon enumeration with specific energy sources. Appl. Environ. Microbiol. 37:1142–1151.
  • 4. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, Helgesen S, Ho CH, Ho CH, Irzyk GP, Jando SC, Alenquer ML, Jarvie TP, Jirage KB, Kim JB, Knight JR, Lanza JR, Leamon JH, Lefkowitz SM, Lei M, Li J, Lohman KL, Lu H, Makhijani VB, McDade KE, McKenna MP, Myers EW, Nickerson E, Nobile JR, Plant R, Puc BP, Ronan MT, Roth GT, Sarkis GJ, Simons JF, Simpson JW, Srinivasan M, Tartaro KR, Tomasz A, Vogt KA, Volkmer GA, Wang SH, Wang Y, Weiner MP, Yu P, Begley RF, Rothberg JM. 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437:376–380. doi:10.1038/nature03959.
  • 5. Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I. 2009. ABySS: a parallel assembler for short read sequence data. Genome Res. 19:1117–1123. doi:10.1101/gr.089532.108.
  • 6. Ewing B, Hillier L, Wendl MC, Green P. 1998. Base calling of automated sequencer traces using Phred. I. Accuracy assessment. Genome Res. 8:175–185.
  • 7. Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760. doi:10.1093/bioinformatics/btp324.
  • 8. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, Formsma K, Gerdes S, Glass EM, Kubal M, Meyer F, Olsen GJ, Olson R, Osterman AL, Overbeek RA, McNeil LK, Paarmann D, Paczian T, Parrello B, Pusch GD, Reich C, Stevens R, Vassieva O, Vonstein V, Wilke A, Zagnitko O. 2008. The RAST server: rapid annotations using subsystems technology. BMC Genomics 9:75. doi:10.1186/1471-2164-9-75.
  • 9. Delcher AL, Bratke KA, Powers EC, Salzberg SL. 2007. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 23:673–679. doi:10.1093/bioinformatics/btm009.
  • 10. Salzberg SL, Delcher AL, Kasif S, White O. 1998. Microbial gene identification using interpolated Markov models. Nucleic Acids Res. 26:544–548. doi:10.1093/nar/26.2.544.

Articles from Genome Announcements are provided here courtesy of American Society for Microbiology (ASM)
