ABSTRACT
We report the closed genome sequence of a Lactobacillus johnsonii strain (NCK2677) that was isolated from a cefoperazone-treated mouse model designed for the study of Clostridioides difficile infection. Illumina and Nanopore sequencing reads were assembled into a circular 1,951,416-bp chromosome with a G+C content of 34.7%, containing 1,865 genes.
ANNOUNCEMENT
Lactobacilli, including Lactobacillus johnsonii, are considered beneficial bacteria with probiotic and therapeutic applications in both humans and animals. We report the complete genome sequence of Lactobacillus johnsonii NCK2677, which was isolated from a cefoperazone-treated mouse model designed for the study of Clostridioides difficile infection (1, 2).
Fecal samples obtained from wild-type C57BL/6 mice (The Jackson Laboratory, Bar Harbor, ME) after cefoperazone treatment were resuspended in phosphate-buffered saline and plated onto Lactobacillus selection (LBS) agar to select individual colonies. Isolated colonies were then inoculated into MRS broth, grown overnight at 37°C under ambient atmospheric conditions, and restreaked twice onto MRS agar plates to confirm purity. L. johnsonii colonies were confirmed by amplification of the variable region of the 16S rRNA gene using the primer pair plb16 (5′-AGAGTTTGATCCTGGCTCAG-3′) and mlb16 (5′-GGCTGCTGGCACGTAGTTAG-3′), as described by Kullen et al. (3). The resulting amplicon was sequenced by Sanger sequencing (Genewiz, Durham, NC), and DNA sequences were used to confirm the species identity with a BLAST search against publicly available sequences in the NCBI database.
One confirmed L. johnsonii colony was inoculated into MRS broth and grown overnight, and 15 ml of the culture was centrifuged at 1,717 × g for 10 min. The cell culture pellet was stored at −80°C. DNA extraction, library preparation, whole-genome sequencing, and assembly were performed at the Roy J. Carver Biotechnology Center (University of Illinois at Urbana-Champaign, Urbana, IL). Genomic DNA was extracted with the MasterPure DNA purification kit (Lucigen).
For DNA sequencing with the Illumina platform, shotgun genomic libraries were prepared with the KAPA HyperPrep library construction kit (Roche). The libraries were quantitated by quantitative PCR and sequenced in one lane for 251 cycles, from each end of the fragments, on a HiSeq 2500 system using the HiSeq Rapid SBS sequencing kit v2. Fastq files were generated and demultiplexed with bcl2fastq v2.20 conversion software (Illumina). The Illumina sequencing generated 1,274,082 read pairs (2 × 250 bp).
For Nanopore sequencing, the genomic DNA was converted into Nanopore libraries with the NBD114 and 1D (SQK-LSK109) library kits. The libraries were sequenced in two SpotON R9.4.1 FLO-MIN106 flow cells for 72 h, using a GridION X5 sequencer. Base calling was carried out using Guppy v3.0.3, and adapters were removed with Porechop v0.2.3 (Nanopore). The Nanopore sequencing generated 201,347 long reads (974,430,705 bp) with an N50 value of 6.3 kb.
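As a rough illustration of the read statistics reported above, the following Python sketch shows how an N50 value is defined and how nominal sequencing depth can be estimated from the reported totals and the 1,951,416-bp chromosome. The function names are hypothetical, the depth figures are derived here rather than stated in the announcement, and the Illumina estimate assumes that every read is a full 250 bp.

```python
def n50(lengths):
    """Length L such that reads of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no read lengths given")


def mean_depth(total_bases, genome_size):
    """Nominal fold coverage: sequenced bases divided by genome length."""
    return total_bases / genome_size


# Figures reported in the announcement.
GENOME_SIZE = 1_951_416        # bp, closed chromosome
NANOPORE_READS = 201_347
NANOPORE_BASES = 974_430_705   # bp
ILLUMINA_PAIRS = 1_274_082
ILLUMINA_READ_LEN = 250        # bp; assumption: all reads are full length

nanopore_depth = mean_depth(NANOPORE_BASES, GENOME_SIZE)
illumina_depth = mean_depth(ILLUMINA_PAIRS * 2 * ILLUMINA_READ_LEN, GENOME_SIZE)
nanopore_mean_len = NANOPORE_BASES / NANOPORE_READS

print(f"Nanopore: ~{nanopore_depth:.0f}x depth, mean read length ~{nanopore_mean_len:.0f} bp")
print(f"Illumina: ~{illumina_depth:.0f}x depth (raw, before trimming)")
```

Under these assumptions, both data sets cover the chromosome several hundred-fold (roughly 500-fold for Nanopore and roughly 330-fold for Illumina before trimming). The actual N50 of 6.3 kb would be obtained by passing the individual read lengths to `n50`.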
To assemble the genome, Illumina reads and Nanopore long reads were checked for quality prior to and after trimming using FastQC v0.11.8 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). Default parameters were used for all software unless otherwise specified. Illumina reads were trimmed using Trimmomatic v0.36 (4) with the parameters ILLUMINACLIP:TruSeq3-PE-2.fa:2:15:10 LEADING:28 TRAILING:28 MINLEN:30; this retained reads of at least 30 bp and resulted in 1,268,909 read pairs. Long reads were adapter trimmed with Porechop v0.2.3 (https://github.com/rrwick/Porechop) and length filtered to a minimum of 1 kb with seqtk v1.2 (https://github.com/lh3/seqtk). Reads were assembled with Unicycler v0.4.4 (5).
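To make the trimming parameters concrete, the sketch below splits the reported Trimmomatic step string into its components and applies a simplified version of the LEADING, TRAILING, and MINLEN steps to a list of per-base quality scores. This is a toy reimplementation for illustration only, not Trimmomatic's code: the function names are hypothetical, adapter clipping (ILLUMINACLIP) is not modeled, and the example quality values are invented.

```python
# Step string as reported in the announcement.
PARAMS = "ILLUMINACLIP:TruSeq3-PE-2.fa:2:15:10 LEADING:28 TRAILING:28 MINLEN:30"


def parse_steps(spec):
    """Split a Trimmomatic-style step string into (name, [args]) pairs."""
    steps = []
    for token in spec.split():
        name, *args = token.split(":")
        steps.append((name, args))
    return steps


def trim_read(quals, leading, trailing, minlen):
    """Return the (start, end) slice kept after trimming, or None if the read is dropped.

    LEADING/TRAILING remove bases from each end while their quality is below
    the threshold; MINLEN drops reads shorter than `minlen` after trimming.
    """
    start, end = 0, len(quals)
    while start < end and quals[start] < leading:
        start += 1
    while end > start and quals[end - 1] < trailing:
        end -= 1
    if end - start < minlen:
        return None
    return start, end


steps = parse_steps(PARAMS)
# ILLUMINACLIP arguments: adapter FASTA, seed mismatches,
# palindrome clip threshold, simple clip threshold.
for name, args in steps:
    print(name, args)

# Invented example: two low-quality leading bases, one low-quality trailing base.
example_quals = [10, 20] + [35] * 30 + [5]
print(trim_read(example_quals, leading=28, trailing=28, minlen=30))
```

In this sketch a read that is exactly 30 bases long after end trimming is kept, while a 29-base read is dropped, which matches the "at least 30 bp" behavior of MINLEN:30.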
Unicycler assembled the trimmed Illumina reads and uncorrected Nanopore reads in a hybrid assembly using the default normal mode. Within Unicycler, the Illumina reads were assembled with SPAdes v3.11.1 (6), and the resulting long-anchor contigs were assembled together with the Nanopore reads with an optimized version of miniasm (7) and Racon v0.5.0 (8). Pilon v1.22 (9) was used within Unicycler to iteratively polish the assembly with the Illumina reads. The circularized genome was rotated to the default starting gene dnaA, resulting in a 1,951,416-bp chromosome with a G+C content of 34.7%. The NCBI Prokaryotic Genome Annotation Pipeline (PGAP) v4.8 was used for annotation (10, 11) and identified 1,865 genes, including 1,762 protein-encoding genes and 79 tRNAs.
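The two final descriptive steps above, rotating a circular sequence to a chosen start gene and reporting G+C content, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than Unicycler's implementation: the function names and the 12-base toy sequence are invented, the start position is supplied by hand rather than found by searching for dnaA, and strand orientation is ignored.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def rotate_circular(seq, start):
    """Rotate a circular sequence so that 0-based index `start` becomes position 0."""
    start %= len(seq)
    return seq[start:] + seq[:start]


# Invented toy "chromosome"; pretend the start gene begins at the ATG.
toy = "TTGACCATGAAA"
gene_start = toy.index("ATG")
rotated = rotate_circular(toy, gene_start)

print(rotated)
print(f"G+C content: {gc_percent(rotated):.1f}%")
```

Because rotation only changes where the linear representation of a circular molecule begins, the length and G+C content are unchanged by it; applied to the real assembly, `gc_percent` would be expected to return the reported 34.7%.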
Data availability.
This sequencing project has been deposited in GenBank under accession number CP059055. The associated BioProject and BioSample accession numbers are PRJNA645941 and SAMN15520417, respectively. The associated SRA accession numbers are SRR12296539 (Nanopore reads) and SRR12296540 (Illumina reads).
ACKNOWLEDGMENTS
We thank our colleagues from the Barrangou laboratory. We also acknowledge the staff, including Kimberly Walden and Christopher Fields, at the Roy J. Carver Biotechnology Center at the University of Illinois at Urbana-Champaign.
We acknowledge support from DuPont Nutrition and Health. The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
REFERENCES
- 1. Theriot CM, Koumpouras CC, Carlson PE, Bergin II, Aronoff DM, Young VB. 2011. Cefoperazone-treated mice as an experimental platform to assess differential virulence of Clostridium difficile strains. Gut Microbes 2:326–334. doi: 10.4161/gmic.19142.
- 2. Winston JA, Thanissery R, Montgomery SA, Theriot CM. 2016. Cefoperazone-treated mouse model of clinically-relevant Clostridium difficile strain R20291. J Vis Exp (118):54850. doi: 10.3791/54850.
- 3. Kullen MJ, Sanozky-Dawes RB, Crowell DC, Klaenhammer TR. 2000. Use of the DNA sequence of variable regions of the 16S rRNA gene for rapid and accurate identification of bacteria in the Lactobacillus acidophilus complex. J Appl Microbiol 89:511–516. doi: 10.1046/j.1365-2672.2000.01146.x.
- 4. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi: 10.1093/bioinformatics/btu170.
- 5. Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595. doi: 10.1371/journal.pcbi.1005595.
- 6. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi: 10.1089/cmb.2012.0021.
- 7. Li H. 2016. minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics 32:2103–2110. doi: 10.1093/bioinformatics/btw152.
- 8. Vaser R, Sovic I, Nagarajan N, Sikic M. 2017. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res 27:737–746. doi: 10.1101/gr.214270.116.
- 9. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963. doi: 10.1371/journal.pone.0112963.
- 10. Haft DH, DiCuccio M, Badretdin A, Brover V, Chetvernin V, O'Neill K, Li W, Chitsaz F, Derbyshire MK, Gonzales NR, Gwadz M, Lu F, Marchler GH, Song JS, Thanki N, Yamashita RA, Zheng C, Thibaud-Nissen F, Geer LY, Marchler-Bauer A, Pruitt KD. 2018. RefSeq: an update on prokaryotic genome annotation and curation. Nucleic Acids Res 46:D851–D860. doi: 10.1093/nar/gkx1068.
- 11. Tatusova T, DiCuccio M, Badretdin A, Chetvernin V, Nawrocki EP, Zaslavsky L, Lomsadze A, Pruitt KD, Borodovsky M, Ostell J. 2016. NCBI Prokaryotic Genome Annotation Pipeline. Nucleic Acids Res 44:6614–6624. doi: 10.1093/nar/gkw569.
