J Bacteriol. 2011 Jun;193(12):3152–3153. doi: 10.1128/JB.00405-11

Whole-Genome Sequences of Four Mycobacterium bovis BCG Vaccine Strains

Yuanlong Pan 1,2, Xi Yang 1,2, Jia Duan 1,2, Na Lu 1, Andrea S Leung 3, Vanessa Tran 3, Yongfei Hu 1, Na Wu 1,2, Di Liu 1, Zhiming Wang 1,2, Xuping Yu 4, Chen Chen 5, Yuanyuan Zhang 5, Kanglin Wan 5, Jun Liu 3,*, Baoli Zhu 1,*
PMCID: PMC3133212  PMID: 21478353

Abstract

Mycobacterium bovis Bacille Calmette-Guérin (BCG) is the only vaccine available against tuberculosis (TB). A number of BCG strains are in use, and they exhibit biochemical and genetic differences. We report the genome sequences of four BCG strains representing different lineages, which will help to design more effective TB vaccines.

TEXT

Mycobacterium bovis Bacille Calmette-Guérin (BCG) was derived from M. bovis by continuous in vitro passaging from 1908 to 1921 (12). Distribution and widespread use of BCG started around 1924 and were accompanied by further in vitro passaging until the 1960s. This in vitro evolution has produced a number of heterogeneous BCG substrains (12, 17). The protective efficacy of BCG against pulmonary TB varies from 0 to 80%, and the heterogeneity of BCG strains is thought to be one of the contributing factors (1).

A molecular genealogy of BCG based on genomic deletions and duplications has been established (2, 4, 10). More than a dozen BCG strains were placed into four major groups (4). Thus far, the complete genome sequence has been determined for only two BCG strains, BCG-Pasteur (4) and BCG-Tokyo (15). Here we present draft genome sequences of four BCG strains obtained by using a whole-genome shotgun strategy.

BCG-China, BCG-Denmark 1331 (ATCC 35733), BCG-Russia (ATCC 35740), and BCG-Tice (ATCC 35743) were described previously (2, 10). Genomic DNA was sequenced with an Illumina Genome Analyzer. The genome coverages were 63.2-, 65.6-, 59.1-, and 70.7-fold for BCG-China, -Denmark, -Russia, and -Tice, respectively. The paired-end reads were assembled with SOAPdenovo (11). Nearly 500 gaps were filled by multiplex PCR and primer walking, which yielded 29, 32, 36, and 28 contigs for the four strains, respectively.
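Fold coverage is conventionally the total number of sequenced bases divided by the genome length. As a rough illustration only, the sketch below inverts that relation to estimate the sequence yield implied by the reported coverages. It assumes a genome size of about 4.27 Mb (the approximate draft size reported in this announcement); the function and variable names are hypothetical and are not part of the original analysis.

```python
# Assumed genome size in base pairs (approximate draft size, ~4.27 Mb).
GENOME_SIZE_BP = 4_270_000

# Fold coverage reported for each strain in the text.
REPORTED_COVERAGE = {
    "BCG-China": 63.2,
    "BCG-Denmark": 65.6,
    "BCG-Russia": 59.1,
    "BCG-Tice": 70.7,
}


def implied_sequenced_bases(fold_coverage, genome_size_bp=GENOME_SIZE_BP):
    """Invert coverage = total_bases / genome_size to get total bases."""
    return fold_coverage * genome_size_bp


if __name__ == "__main__":
    for strain, coverage in REPORTED_COVERAGE.items():
        megabases = implied_sequenced_bases(coverage) / 1e6
        print(f"{strain}: roughly {megabases:.0f} Mb of sequence")
```

These figures are back-of-the-envelope estimates; the actual read counts and read lengths are not given in the text.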

Annotation was done using MetaGeneAnnotator (14), tRNAscan-SE 1.21 (13), RNAmmer 1.2 (9), and Tandem Repeats Finder 4.04 (3). In addition, the contigs were searched against the KEGG (8), Pfam (6), COGs (16), and NCBI NR protein databases.

The draft genome sequences of the four BCG strains are similar in size, approximately 4.27 Mb, and each has about 4,030 predicted genes and a G+C content of 65%. Each genome has a single copy each of the predicted 5S, 16S, and 23S rRNA genes and 45 predicted tRNA genes. The number of repeat regions per genome ranges from 638 (BCG-Russia) to 679 (BCG-Tice). Genes annotated with the COGs database fall into 21 COG (clusters of orthologous groups) categories.
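For readers unfamiliar with the statistic, G+C content is the fraction of bases in a sequence that are guanine or cytosine. The following is a minimal sketch of that calculation, assuming ambiguous bases such as N are excluded from the denominator (a common convention, but an assumption here); the function name and the toy sequence are illustrative and are not BCG data.

```python
def gc_content(sequence):
    """Return the fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / total


if __name__ == "__main__":
    # Toy sequence for illustration only; the two N bases are ignored.
    example = "GCGCCGATNNGC"
    print(f"G+C content: {gc_content(example):.1%}")
```

For a draft assembly, the same function could be applied to the concatenation of all contigs.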

Comparative genomic analyses were performed using the genome sequences of M. bovis (7) and M. tuberculosis H37Rv (5) as references. More than 350 single-nucleotide polymorphisms (SNPs) between the BCG and M. bovis genomes were identified. About 170 SNPs were specific to BCG strains (not present in M. bovis or M. tuberculosis) and likely contribute to the attenuation of BCG. Each BCG strain also contains dozens of SNPs not shared by the other BCG strains. Consistent with previous studies (4, 10), we confirmed the presence of tandem duplications: DU-I in BCG-Russia, DU-III in BCG-China and BCG-Denmark, and DU-IV in BCG-Tice. Future studies of these genomic polymorphisms will help to identify molecular factors that influence BCG clinical properties, including safety, immunogenicity, and protective efficacy, and provide novel insights for the rational design of the next generation of TB vaccines.
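The two SNP categories described above can be expressed as set operations. The sketch below is illustrative only and does not reproduce the pipeline used in this work: it assumes each genome is summarized as a set of (position, base) calls in a shared coordinate system, and it takes "BCG-specific" to mean present in every BCG strain and absent from both references, which is one possible reading of the text. All names and toy values are hypothetical.

```python
def bcg_specific(bcg_calls, reference_calls):
    """Calls shared by every BCG strain and absent from all references."""
    shared = set.intersection(*bcg_calls.values())
    in_references = set().union(*reference_calls.values())
    return shared - in_references


def strain_private(bcg_calls):
    """For each strain, calls not found in any other BCG strain."""
    private = {}
    for strain, calls in bcg_calls.items():
        others = set().union(
            *(c for name, c in bcg_calls.items() if name != strain)
        )
        private[strain] = calls - others
    return private


if __name__ == "__main__":
    # Toy (position, base) calls; not real BCG data.
    bcg = {
        "strain_A": {(10, "T"), (20, "G"), (30, "C")},
        "strain_B": {(10, "T"), (20, "G"), (40, "A")},
    }
    references = {
        "M. bovis": {(20, "G")},
        "M. tuberculosis H37Rv": set(),
    }
    print("BCG-specific:", bcg_specific(bcg, references))
    print("Strain-private:", strain_private(bcg))
```

In practice the calls would come from aligning each assembly to the reference genomes, a step that is outside the scope of this sketch.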

Nucleotide sequence accession numbers.

The whole-genome shotgun sequences have been deposited at DDBJ/EMBL/GenBank under the accession codes AEZE00000000, AEZF00000000, AEZG00000000, and AEZH00000000 for BCG-China, -Denmark, -Russia, and -Tice, respectively.

Acknowledgments

This work was supported by funding from the National Basic Research Program of China (973 Program: 2009CB522605 and 2007CB513002 to B.Z.), the Scientific Program for Key Infectious Diseases, The Ministry of Science and Technology (2009ZX10601 to B.Z.), and funding from Canadian Institutes of Health Research (CIHR) (MOP-489636 and MOP-482204 to J.L.).

Footnotes

Published ahead of print on 8 April 2011.

REFERENCES

1. Behr M. A. 2002. BCG—different strains, different vaccines? Lancet Infect. Dis. 2:86–92.
2. Behr M. A., et al. 1999. Comparative genomics of BCG vaccines by whole-genome DNA microarray. Science 284:1520–1523.
3. Benson G. 1999. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27:573–580.
4. Brosch R., et al. 2007. Genome plasticity of BCG and impact on vaccine efficacy. Proc. Natl. Acad. Sci. U. S. A. 104:5596–5601.
5. Cole S. T., et al. 1998. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393:537–544.
6. Finn R. D., et al. 2009. The Pfam protein families database. Nucleic Acids Res. 38:D211–D222.
7. Garnier T., et al. 2003. The complete genome sequence of Mycobacterium bovis. Proc. Natl. Acad. Sci. U. S. A. 100:7877–7882.
8. Kanehisa M., Goto S., Kawashima S., Okuno Y., Hattori M. 2004. The KEGG resource for deciphering the genome. Nucleic Acids Res. 32:D277–D280.
9. Lagesen K., et al. 2007. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 35:3100–3108.
10. Leung A. S., et al. 2008. Novel genome polymorphisms in BCG vaccine strains and impact on efficacy. BMC Genomics 9:413.
11. Li R., et al. 2010. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 20:265–272.
12. Liu J., Tran V., Leung A. S., Alexander D. C., Zhu B. L. 2009. BCG vaccines: their mechanisms of attenuation and impact on safety and protective efficacy. Hum. Vaccin. 5:70–78.
13. Lowe T. M., Eddy S. R. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25:955–964.
14. Noguchi H., Taniguchi T., Itoh T. 2008. MetaGeneAnnotator: detecting species-specific patterns of ribosomal binding site for precise gene prediction in anonymous prokaryotic and phage genomes. DNA Res. 15:387–396.
15. Seki M., et al. 2009. Whole genome sequence analysis of Mycobacterium bovis bacillus Calmette-Guérin (BCG) Tokyo 172: a comparative study of BCG vaccine substrains. Vaccine 27:1710–1716.
16. Tatusov R. L., Galperin M. Y., Natale D. A., Koonin E. V. 2000. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 28:33–36.
17. Walker K. B., et al. 2010. The second Geneva Consensus: recommendations for novel live TB vaccines. Vaccine 28:2259–2270.

Articles from Journal of Bacteriology are provided here courtesy of American Society for Microbiology (ASM)
