Claesson et al. 10.1073/pnas.0511060103.

Supporting Information

Files in this Data Supplement:

Supporting Figure 4
Supporting Figure 5
Supporting Figure 6
Supporting Table 1
Supporting Table 2
Supporting Figure 7
Supporting Figure 8
Supporting Figure 9
Supporting Figure 10
Supporting Figure 11
Supporting Figure 12
Supporting Figure 13
Supporting Figure 14
Supporting Table 3
Supporting Table 4
Supporting Table 5
Supporting Table 6
Supporting Table 7
Supporting Text





Supporting Figure 4

Fig. 4. Phylogeny of selected Lactobacillus species based on 16S rRNA gene sequences. The phylogenetic tree was constructed by using the neighbor-joining method. Lactobacillus species with published genome sequences available are highlighted in yellow. Some other species whose genomes are being sequenced are highlighted in green. L. salivarius is highlighted in red. The separate clade to which it belongs is indicated by a dashed red box.





Supporting Figure 5

Fig. 5. ARTEMIS COMPARISON TOOL (ACT) alignment of Lactobacillus salivarius UCC118 against L. plantarum WCFS1, L. sakei 23K, L. johnsonii NCC533, L. acidophilus NCFM, and Enterococcus faecalis V583. Only TBLASTX matches with scores higher than 300 bits are shown. The horizontal gray bars represent forward and reverse strands. The red and blue lines between genomes represent direct and inverted matches, respectively.
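
The score cutoff and the direct/inverted distinction described in this legend can be illustrated with a minimal sketch. It assumes the matches are available in the standard 12-column BLAST tabular layout (query, subject, % identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score). The comparison file format actually used by the authors is not stated, and the sequence names and coordinates below are hypothetical.

```python
import csv
import io

MIN_BITS = 300.0  # threshold quoted in the legend of Fig. 5


def filter_matches(tabular_text, min_bits=MIN_BITS):
    """Keep matches scoring above min_bits and label their orientation."""
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        qstart, qend, sstart, send = (int(x) for x in row[6:10])
        bits = float(row[11])
        if bits <= min_bits:
            continue
        # A match is "direct" when both coordinate pairs run the same way.
        same_way = (qend - qstart) * (send - sstart) > 0
        kept.append((row[0], row[1], bits, "direct" if same_way else "inverted"))
    return kept


# Hypothetical example rows, not taken from the actual comparison.
example = "\n".join([
    "UCC118\tWCFS1\t62.5\t400\t150\t0\t1000\t2199\t5000\t6199\t1e-150\t520.0",
    "UCC118\tWCFS1\t55.0\t300\t135\t0\t9000\t9899\t40899\t40000\t1e-90\t340.0",
    "UCC118\tWCFS1\t40.0\t120\t72\t0\t300\t659\t700\t1059\t1e-20\t95.0",
])

matches = filter_matches(example)
for m in matches:
    print(m)
```

In this sketch the third row is discarded because its bit score is below the cutoff, and the second row is labeled inverted because its subject coordinates run in the opposite direction to its query coordinates.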





Supporting Figure 6

Fig. 6. A shufflon associated with a type I restriction enzyme locus. Black arrows show the five different possible inversions taking place at six inverted repeat sites (blue boxes) within genes for the HsdS subunits. White boxes represent inactive inverted repeats. All nine possible combinations were confirmed by sequence reads: AB, AB', AB'', A'B, A'B', A'B'', A''B, A''B', and A''B''.
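
The nine combinations listed in the legend are consistent with two variable regions, each present in one of three alternative forms. The following minimal sketch enumerates them under that assumption. The variable names are illustrative only, and the labels are written with plain apostrophes.

```python
from itertools import product

# Three alternative forms of each variable region, labeled as in Fig. 6.
A_VARIANTS = ["A", "A'", "A''"]
B_VARIANTS = ["B", "B'", "B''"]


def shufflon_combinations(a_variants, b_variants):
    """Return every pairing of one A form with one B form."""
    return [a + b for a, b in product(a_variants, b_variants)]


combos = shufflon_combinations(A_VARIANTS, B_VARIANTS)
print(len(combos))
print(", ".join(combos))
```

Because `product` varies the second region fastest, the combinations are produced in the same order as they are listed in the legend.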





Supporting Figure 7

Fig. 7. D-Alanine metabolism in L. salivarius UCC118. Green, chromosome-encoded genes. Red, pMP118-encoded genes.





Supporting Figure 8

Fig. 8. Glycine, serine and threonine metabolism in L. salivarius UCC118. Green, chromosome-encoded genes. Red, pMP118-encoded genes.





Supporting Figure 9

Fig. 9. Exopolysaccharide (EPS) biosynthesis gene clusters in L. salivarius UCC118.





Supporting Figure 10

Fig. 10. Nucleotide sugar metabolism in L. salivarius UCC118. Green, chromosome-encoded genes.





Supporting Figure 11

Fig. 11. Glycolysis and gluconeogenesis in L. salivarius UCC118. Green, chromosome-encoded genes. Red, pMP118-encoded genes. Because UCC118 does not have a dedicated glucose-specific phosphotransferase system, we predict that glucose will be taken up by one of the mannose-specific phosphotransferase systems shown in Fig. 13.





Supporting Figure 12

Fig. 12. Pyruvate metabolism in L. salivarius UCC118. Green, chromosome-encoded genes. Red, pMP118-encoded genes. Yellow, enzymes encoded by genes on both the chromosome and pMP118.





Supporting Figure 13

Fig. 13. Fructose and mannose metabolism in L. salivarius UCC118. Green, chromosome-encoded genes. Red, pMP118-encoded genes. Yellow, enzymes encoded by genes on both the chromosome and pMP118. Four putative mannose-specific phosphotransferase systems were identified, three on the chromosome and one on pMP118, as shown around the yellow-colored enzyme EC 2.7.1.69.





Supporting Figure 14

Fig. 14. Aminosugars metabolism in L. salivarius UCC118. Green, chromosome-encoded genes. Red, pMP118-encoded genes.





Supporting Text

Results

General Genome Features.

A single copy of a Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) DNA repeat element was identified at position 120,820–122,525 bp in the chromosome. This comprised 28 direct repeats of 36 bp separated by ≈30-bp-long spacers of unique sequence. Immediately upstream of this repeat locus, genes for three CRISPR-associated proteins were located: Cas1, Cas2, and a protein belonging to the SAG0897 family. The only Lactobacillus species in which CRISPR loci have previously been reported is L. acidophilus NCFM (1). Although no clear biological function of CRISPR loci is known, they are thought to be mobile elements, and it was recently suggested that CRISPR spacers are of extrachromosomal origin (2).

Biosynthetic Capabilities.

Five gene products involved in phenylalanine, tyrosine, and tryptophan biosynthesis are encoded on pMP118. Three of these genes have chromosomal paralogs, but the genes for shikimate dehydrogenase (LSL_1795) and 3-dehydroquinate dehydratase (LSL_1796) are uniquely encoded by adjacent pMP118 genes. These additional genes produce a gene complement for this pathway close to that of L. acidophilus, but they do not appear to constitute a complete pathway.

L. salivarius UCC118 has incomplete pathways for the biosynthesis of the major vitamins and cofactors, including thiamine, riboflavin, vitamin B6, nicotinate and nicotinamide, pantothenate and CoA, biotin, and folate (data not shown). However, L. salivarius UCC118 can probably take up riboflavin by a putative riboflavin transporter (LSL_0937) and convert it sequentially to flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) by a bifunctional riboflavin kinase (LSL_0575). L. salivarius UCC118 is likely to generate NADP+/NAD+ from nicotinamide via nicotinate. CoA can be synthesized from pantothenate and L-cysteine, which is also the case for L. plantarum, L. acidophilus, and L. johnsonii. Genes for the enzymes of the complete folate-dependent one-carbon pool are present in L. salivarius UCC118, as they are in L. plantarum and L. acidophilus.

Peptidoglycan and Exopolysaccharide Biosynthesis.

No gene for an undecaprenol kinase could be identified in L. salivarius UCC118, which therefore appears to lack this enzyme. However, unlike the other four sequenced Lactobacillus genomes, it has a putative D-alanine aminotransferase (LSL_1670) for cross-linking peptidoglycan. It is also distinguished from L. plantarum by possessing a gene (LSL_0016) for UDP-N-acetylmuramoyl-L-alanyl-D-glutamate-L-lysine ligase (MurE synthetase), which is also present in L. acidophilus, L. johnsonii, and L. sakei.

With regard to nucleotide sugar metabolism for EPS precursor biosynthesis, L. salivarius UCC118 does not have a gene for L-2-hydroxyisocaproate dehydrogenase (EC 1.1.1.-), which is present in the L. plantarum and L. johnsonii genomes to convert UDP-D-galactose to UDP-D-galacturonate. However, L. salivarius UCC118 has an alternative mechanism to generate UDP-D-galacturonate. This mechanism is reflected by the presence of UDP-glucose 6-dehydrogenase (EC 1.1.1.22, LSL_0979) and UDP-glucuronate 4-epimerase (EC 5.1.3.6, LSL_0980), which allow the conversion of UDP-D-glucose to UDP-D-glucuronate and subsequently to UDP-D-galacturonate. These two enzymes are found as a pair only in the L. salivarius UCC118 genome and not in the other sequenced Lactobacillus genomes.

Sensing and Regulation.

The L. salivarius genome contains nine genetically coupled two-component regulatory systems (Table 6), including the megaplasmid-encoded AbpK-AbpR system involved in control of expression of the bacteriocin Abp118 (3). The megaplasmid also includes a two-component regulatory system linked to a protein tyrosine phosphatase and an ABC transporter cluster. In addition, two putative orphan sensors were identified. There are 62 recognizable transcriptional regulators encoded by the genome, 44 of which can be assigned to regulatory protein families (Table 7). The most abundant class is the MarR family (nine members). Seven transcriptional regulators are encoded by the megaplasmid, and three by pSF118-44, emphasizing the likely contribution of the variable genome content of L. salivarius to regulation of expression and, ultimately, to phenotype. The number of two-component systems in the L. salivarius UCC118 genome is the same as that reported for L. johnsonii and L. acidophilus (1, 4) but fewer than the 10 systems in L. sakei (5) or the 13 systems present in L. plantarum (6), which also has more transcriptional regulatory proteins. This finding is consistent with the notion that L. salivarius, L. johnsonii, and L. acidophilus are confined to relatively consistent environmental conditions in the GI tract compared with the more free-living lifestyle of L. plantarum.

Methods

Sequencing, Assembly, Annotation, and Analysis.

A genomic library was constructed by shearing genomic DNA and cloning the small (1.5–3 kb) and medium (3–5 kb) inserts into plasmids pGEM-T (Promega) and pWSK29 (7), respectively. Bulk sequencing was performed by MWG Biotech (Ebersberg, Germany). About 30,000 sequence reads were assembled into 110 contigs with an 11-fold coverage and an error rate of <1 per 1.7 × 10^7 by using PHRED and PHRAP (www.phrap.org) within the Staden package (8). Sequence gaps were closed by primer walking on clone templates. Physical gaps were closed by sequencing the amplicons generated by either multiplex PCR or inverse PCR approaches. Automatic gene predictions were obtained by combining GLIMMER (9), ZCURVE (10), ORPHEUS (11), and CRITICA (12) by using the metatool YACOP (13). The gene calls were then further examined and manually adjusted by using FASTA (14) alignments in ARTEMIS (15) and BLASTXTRACT (16). Signal peptide domains and transmembrane spanning regions were detected by SIGNALP (17) and TMHMM (18), respectively. The locations of tRNAs were determined by using TRNASCAN-SE (19).
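
The relationship between the number of reads and the fold coverage quoted above can be sketched as follows. Neither the genome size nor the mean read length is stated in this section, so the genome size used here is a hypothetical placeholder and the function names are illustrative.

```python
def fold_coverage(n_reads, mean_read_len_bp, genome_size_bp):
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * mean_read_len_bp / genome_size_bp


def implied_read_length(n_reads, coverage, genome_size_bp):
    """Mean read length that would give the stated coverage."""
    return coverage * genome_size_bp / n_reads


N_READS = 30_000          # "about 30,000 sequence reads" (from the text)
COVERAGE = 11             # "11-fold coverage" (from the text)
GENOME_SIZE = 2_000_000   # hypothetical genome size in bp, for illustration only

read_len = implied_read_length(N_READS, COVERAGE, GENOME_SIZE)
print(round(read_len, 1))
print(round(fold_coverage(N_READS, read_len, GENOME_SIZE), 1))
```

Substituting the real genome size would give the mean usable read length implied by the reported figures.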

For comparative analyses, orthology was defined as reciprocal best FASTA matches, with at least 30% identity over at least 80% of the lengths of both proteins.
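
The orthology criterion above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline. It assumes that the best match is chosen by alignment score before the thresholds are applied, and that coverage is approximated as aligned length divided by protein length. The `Hit` record, the function names, and the example proteins are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query: str
    subject: str
    identity: float  # percent identity of the alignment
    aln_len: int     # aligned length in residues
    score: float     # alignment score used to rank matches


def best_hit_per_query(hits):
    """Keep the highest-scoring match for each query protein."""
    best = {}
    for h in hits:
        if h.query not in best or h.score > best[h.query].score:
            best[h.query] = h
    return best


def passes(hit, len_query, len_subject, min_identity=30.0, min_cover=0.8):
    """At least 30% identity over at least 80% of both protein lengths."""
    return (hit.identity >= min_identity
            and hit.aln_len / len_query >= min_cover
            and hit.aln_len / len_subject >= min_cover)


def reciprocal_best_hits(a_vs_b, b_vs_a, len_a, len_b):
    """Return (protein in A, protein in B) pairs that are mutual best matches."""
    fwd = best_hit_per_query(a_vs_b)
    rev = best_hit_per_query(b_vs_a)
    pairs = []
    for q, h in fwd.items():
        back = rev.get(h.subject)
        if back is None or back.subject != q:
            continue  # not reciprocal
        if passes(h, len_a[q], len_b[h.subject]):
            pairs.append((q, h.subject))
    return sorted(pairs)


# Hypothetical example data.
len_a = {"a1": 100, "a2": 200}
len_b = {"b1": 100, "b2": 210}
a_vs_b = [Hit("a1", "b1", 45.0, 95, 300.0),
          Hit("a1", "b2", 31.0, 90, 120.0),
          Hit("a2", "b2", 28.0, 190, 150.0)]
b_vs_a = [Hit("b1", "a1", 45.0, 95, 300.0),
          Hit("b2", "a2", 28.0, 190, 150.0)]

orthologs = reciprocal_best_hits(a_vs_b, b_vs_a, len_a, len_b)
print(orthologs)
```

In this example a1 and b1 are mutual best matches that satisfy both thresholds, whereas a2 and b2 are mutual best matches that fail the 30% identity requirement.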

API 50 Carbohydrate Fermentation Tests.

The carbohydrate fermentation profile of L. salivarius UCC118 was determined by using API 50 CH strips consisting of 50 carbohydrates and their derivatives in conjunction with API 50 CHL medium (BioMérieux, Marcy-l'Étoile, France). Freshly grown colonies of strain UCC118 were harvested and resuspended in sterile water to achieve a cell density of 10^10 colony-forming units/ml. Cell suspension (200 µl) was inoculated into 10 ml of API 50 CHL medium and mixed gently by inversion. One hundred twenty microliters of this suspension was inoculated into API 50 CH strips that were then overlaid with paraffin to maintain anaerobic conditions. Incubation was carried out at 37°C for 48 h. Fermentation was indicated by a color change from purple to yellow in the strip cupule.

Detection of Heterofermentative End Products.

Lactic acid, acetic acid, and ethanol were determined by high-performance liquid chromatography (HPLC). Sampling was performed by taking 1 ml of culture supernatant and filtering it through a 0.45-µm sterile syringe filter into an HPLC vial stored on ice. Samples were then analyzed on an LKB Bromma 2150 HPLC system equipped with a Shodex RI-71 refractive index detector and a Highchrom heating block. A Rezex 8 µm 8% organic acid column (300 × 7.8 mm, Phenomenex, Torrance, CA) was used with 0.005 M H2SO4 as the mobile phase at a flow rate of 0.6 ml/min. The temperature of the column was maintained at 65°C. The injection volume was 20 µl.

1. Altermann, E., Russell, W. M., Azcarate-Peril, M. A., Barrangou, R., Buck, B. L., McAuliffe, O., Souther, N., Dobson, A., Duong, T., Callanan, M., et al. (2005) Proc. Natl. Acad. Sci. USA 102, 3906–3912.

2. Bolotin, A., Quinquis, B., Sorokin, A. & Ehrlich, S. D. (2005) Microbiology 151, 2551–2561.

3. Flynn, S., van Sinderen, D., Thornton, G. M., Holo, H., Nes, I. F. & Collins, J. K. (2002) Microbiology 148, 973–984.

4. Pridmore, R. D., Berger, B., Desiere, F., Vilanova, D., Barretto, C., Pittet, A. C., Zwahlen, M. C., Rouvet, M., Altermann, E., Barrangou, R., et al. (2004) Proc. Natl. Acad. Sci. USA 101, 2512–2517.

5. Chaillou, S., Champomier-Verges, M. C., Cornet, M., Crutz-Le Coq, A. M., Dudez, A. M., Martin, V., Beaufils, S., Darbon-Rongere, E., Bossy, R., Loux, V. & Zagorec, M. (2005) Nat. Biotechnol. 23, 1527–1533.

6. Kleerebezem, M., Boekhorst, J., van Kranenburg, R., Molenaar, D., Kuipers, O. P., Leer, R., Tarchini, R., Peters, S. A., Sandbrink, H. M., Fiers, M. W., et al. (2003) Proc. Natl. Acad. Sci. USA 100, 1990–1995.

7. Wang, R. F. & Kushner, S. R. (1991) Gene 100, 195–199.

8. Staden, R., Beal, K. F. & Bonfield, J. K. (2000) Methods Mol. Biol. 132, 115–130.

9. Delcher, A. L., Harmon, D., Kasif, S., White, O. & Salzberg, S. L. (1999) Nucleic Acids Res. 27, 4636–4641.

10. Guo, F. B., Ou, H. Y. & Zhang, C. T. (2003) Nucleic Acids Res. 31, 1780–1789.

11. Frishman, D., Mironov, A., Mewes, H. W. & Gelfand, M. (1998) Nucleic Acids Res. 26, 2941–2947.

12. Badger, J. H. & Olsen, G. J. (1999) Mol. Biol. Evol. 16, 512–524.

13. Tech, M. & Merkl, R. (2003) In Silico Biol. 3, 441–451.

14. Pearson, W. R. & Lipman, D. J. (1988) Proc. Natl. Acad. Sci. USA 85, 2444–2448.

15. Rutherford, K., Parkhill, J., Crook, J., Horsnell, T., Rice, P., Rajandream, M. A. & Barrell, B. (2000) Bioinformatics 16, 944–945.

16. Claesson, M. J. & van Sinderen, D. (2005) Bioinformatics 21, 3667–3668.

17. Bendtsen, J. D., Nielsen, H., von Heijne, G. & Brunak, S. (2004) J. Mol. Biol. 340, 783–795.

18. Krogh, A., Larsson, B., von Heijne, G. & Sonnhammer, E. L. (2001) J. Mol. Biol. 305, 567–580.

19. Fichant, G. A. & Burks, C. (1991) J. Mol. Biol. 220, 659–671.