Abstract
Lactobacillus suebicus is important in the generation of particular flavors and in other ripening processes associated with apple mash. Here, we present the draft genome sequence of the type strain Lactobacillus suebicus KCTC 3549 (2,656,936 bp, with a G+C content of 39.0%), which consists of 143 large contigs (>100 bp).
GENOME ANNOUNCEMENT
The lactic acid bacteria possess a large number of metabolic properties that are responsible for their successful use as starter cultures in the commercial production of fermented dairy, meat, and vegetable products and beverages (10). The genus Lactobacillus represents the largest group of rod-shaped organisms within the lactic acid bacteria. Some members of this group of organisms are important in the generation of particular flavors and in other ripening processes associated with apple mash. Our laboratory received the strain Lactobacillus suebicus KCTC 3549 from the Korean Collection for Type Cultures (KCTC), and it was grown under standard conditions (lactobacillus MRS broth [catalog no. 0881; Difco], 30°C, and 200 rpm). The genomic DNA was extracted from the cultured bacteria using the alkaline lysis method (3). We then sequenced the genome of L. suebicus KCTC 3549; genome sequencing of this organism had not been completed when our sequencing project began, according to the Genomes OnLine Database (GOLD) (8).
Here we report the genome sequence of L. suebicus KCTC 3549, obtained using a whole-genome shotgun strategy (5) with Roche 454 GS (FLX Titanium) pyrosequencing of paired reads (270,716 reads, totaling ∼72.6 Mb; ∼27.3-fold coverage of the genome) at the Genome Resource Center, KRIBB (Korea Research Institute of Bioscience and Biotechnology). The pyrosequencing reads were processed with Roche's software according to the manufacturer's instructions. All of the paired reads were assembled using Newbler Assembler 2.3 (454 Life Sciences), which generated 143 contigs (BACO01000001 to BACO01000143). Annotation was performed by merging the results obtained from the RAST (Rapid Annotation using Subsystem Technology) server (1), the Glimmer 3.02 modeling software package (4), tRNAscan-SE 1.21 (9), and RNAmmer 1.2 (7). In addition, the contigs were searched against the KEGG (6), UniProt (2), and COG (Clusters of Orthologous Groups) (11) databases to annotate gene descriptions. The G+C content (mole percent) was calculated from the genome sequences.
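The fold coverage and G+C content reported here follow from simple arithmetic on the read total and the assembled contigs. The following is a minimal Python sketch of those two calculations, not the software used in this study; the function names (`parse_fasta`, `gc_percent`, `fold_coverage`) are hypothetical, and the sketch assumes that ambiguous bases such as N are excluded from the G+C denominator.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} mapping."""
    records: Dict[str, str] = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if name is not None:
                records[name] = "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        else:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def gc_percent(sequences: Iterable[str]) -> float:
    """G+C percentage over all contigs, counting only A, C, G, and T.

    Assumption: ambiguous bases (e.g., N) are left out of the denominator.
    """
    gc = 0
    total = 0
    for seq in sequences:
        gc += seq.count("G") + seq.count("C")
        total += sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / total if total else 0.0


def fold_coverage(total_read_bases: float, assembly_length: int) -> float:
    """Average sequencing depth: total read bases divided by assembly length."""
    return total_read_bases / assembly_length


# Toy contigs for illustration only (not data from this genome).
toy = parse_fasta([">c1", "ACGT", "GGCC", ">c2", "ATNN"])
toy_gc = gc_percent(toy.values())

# Figures reported in the text: ~72.6 Mb of reads, 2,656,936-bp assembly.
reported_coverage = fold_coverage(72.6e6, 2_656_936)
```

With the figures reported in the text, `fold_coverage(72.6e6, 2_656_936)` evaluates to approximately 27.3, which agrees with the stated coverage.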
The draft genome comprises 2,656,936 bases, with a G+C content of 39.0%, and contains 2,543 predicted coding sequences (CDSs). There are single predicted copies of the 5S, 16S, and 23S rRNA genes and 55 predicted tRNAs. There are 287 subsystems represented in the genome (determined using the RAST server), and we used this information to reconstruct the metabolic network. There are many carbohydrate subsystem features, including genes involved in central carbohydrate, monosaccharide, and fermentation metabolism. There are many protein metabolism features, including protein biosynthesis machinery such as 32 large subunits and 21 small subunits of the bacterial ribosome and universal GTPases. There are also many amino acids and derivatives subsystem features, including those for lysine, threonine, methionine, and cysteine. The CDSs annotated by COG were classified into 7 COG categories (C, G, K, L, N, R, and S) and 21 COGs. There are 6 alcohol dehydrogenase enzymes (EC 1.1.1.1) and a galactose-1-phosphate uridylyltransferase enzyme (EC 2.7.7.10). In addition, there are 29 predicted genes related to fatty acids and to resistance to antibiotics and toxic compounds.
Nucleotide sequence accession numbers.
This Whole Genome Shotgun project has been deposited at GenBank under the accession no. BACO00000000. The version described in this paper is the first version, BACO01000000.
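The individual contig accessions (BACO01000001 to BACO01000143) can be enumerated from the project prefix, the version number, and the contig count. The short Python sketch below illustrates this; the function name `wgs_contig_accessions` is hypothetical, and the layout it assumes (four-letter prefix, two-digit version, six-digit contig number) is inferred from the accessions quoted in this announcement.

```python
from typing import List


def wgs_contig_accessions(prefix: str, version: int, n_contigs: int,
                          width: int = 6) -> List[str]:
    """Build contig accessions as prefix + 2-digit version + zero-padded index.

    Assumption: the layout is inferred from the accessions quoted in the text.
    """
    return [f"{prefix}{version:02d}{i:0{width}d}"
            for i in range(1, n_contigs + 1)]


# 143 contigs in the first version of the BACO project.
accessions = wgs_contig_accessions("BACO", 1, 143)
```

The first and last elements of `accessions` correspond to the contig range given in the text.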
Acknowledgments
This work was supported by grant 2009-0084206 from the Ministry of Education, Science and Technology, Republic of Korea.
We thank Kun-Hyang Park and Min-Young Kim for their work in sequencing and assembling the genome, respectively.
REFERENCES
- 1. Aziz R. K., et al. 2008. The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9:75.
- 2. Bairoch A., et al. 2005. The Universal Protein Resource (UniProt). Nucleic Acids Res. 33:D154–D159.
- 3. Birnboim H. C., Doly J. 1979. A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucleic Acids Res. 7:1513–1523.
- 4. Delcher A. L., Bratke K. A., Powers E. C., Salzberg S. L. 2007. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 23:673–679.
- 5. Fleischmann R. D., et al. 1995. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269:496–512.
- 6. Kanehisa M., Goto S., Kawashima S., Okuno Y., Hattori M. 2004. The KEGG resource for deciphering the genome. Nucleic Acids Res. 32:D277–D280.
- 7. Lagesen K., et al. 2007. RNAmmer: consistent and rapid annotation of rRNA genes. Nucleic Acids Res. 35:3100–3108.
- 8. Liolios K., et al. 2010. The Genomes On Line Database (GOLD) in 2009: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res. 38:D346–D354.
- 9. Lowe T. M., Eddy S. R. 1997. tRNAscan-SE: a program for improved detection of tRNA genes in genomic sequence. Nucleic Acids Res. 25:955–964.
- 10. Stiles M. E., Holzapfel W. H. 1997. Lactic acid bacteria of foods and their current taxonomy. Int. J. Food Microbiol. 36:1–29.
- 11. Tatusov R. L., et al. 2003. The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4:41.
