Abstract
Now that its entire genome sequence has been determined, one of the next major targets of Bacillus subtilis genomics is to clarify the whole gene regulatory network. To this end, the results of systematic experiments should be compared with the rich body of individual experimental results accumulated so far. We therefore constructed a database of the upstream regulatory information of B.subtilis (DBTBS). The current version was constructed by surveying 291 references and contains information on 90 binding factors and 403 promoters. For each promoter, all of its known cis-elements are listed according to their positions; for each transcription factor, its cis-elements are aligned to illustrate their consensus sequence. All probable transcription factors coded in the genome were classified with the Pfam motifs. Using this database, we compared the characteristics of B.subtilis promoters with those of Escherichia coli promoters. Our database is accessible at http://elmo.ims.u-tokyo.ac.jp/dbtbs/.
INTRODUCTION
Although the number of bacteria with fully sequenced genomes is rapidly increasing, ‘classical’ bacteria such as Escherichia coli and Bacillus subtilis remain important because of the wealth of experimental evidence available for them. For example, after the completion of sequencing of the entire B.subtilis genome (1), systematic function analysis projects are beginning to decipher the total picture of gene activity (2). One big challenge of such function analyses is the so-called transcriptome analysis, where expression profiles of all ORFs are studied using microarrays (3). Since it is probable that genes with similar expression profiles are co-regulated by a common transcription factor, examination of their upstream sequences is very important. A compilation of known regulatory cis-elements would clearly be a useful tool for this kind of analysis. Interpretation of transcriptional regulatory information of genes is also an important theme of bioinformatics. For example, our group has attempted to predict the sigma-dependency of ORFs found in B.subtilis (4). Such prediction models were constructed based on the compilation of known promoter sequences for each sigma factor. In E.coli, RegulonDB (5) contains both experimental and predicted cis-elements and is suited to the above-mentioned analyses, whereas no database contains comprehensive information on transcription in B.subtilis. We therefore constructed a database, DBTBS, which will be a useful resource not only for theoretical studies but also for the work of experimental researchers.
OVERVIEW OF THE DATABASE
Compared with RegulonDB (5), which has been constructed as a powerful but slightly complicated relational database, DBTBS employs a simpler structure. It consists of two inter-related tables. One is a compilation of experimentally characterized promoters, in which the positions of known binding sites of transcription factors, including sigma factors, are given with links to PubMed reference information (6). These positions are shown both in table format and as a GIF image. Currently, the table contains 403 promoters, classified according to the function of their gene products following the classification of SubtiList (7). Although still incomplete, information about operon structure and the experimental techniques used to characterize binding sites is given when available. Genes are linked to both the Japanese BSORF database (http://bacillus.genome.ad.jp/) and the transcription profile of early to middle sporulation (8) provided by the Losick laboratory (http://mcb.harvard.edu/losick/).
Furthermore, to facilitate the discovery of uncharacterized cis-elements, we added alignments of the upstream 300 nt of B.subtilis genes with those of orthologous genes of Bacillus halodurans (9) and Bacillus stearothermophilus (sequence data fetched from http://www.genome.ou.edu/bstearo.html). Specifically, a local alignment program [‘LALIGN’ by W. Pearson (University of Virginia, USA); downloaded from ftp://ftp.virginia.edu/pub/fasta] was used to detect locally conserved regions, which were then aligned with the B.subtilis sequence. Orthologous genes of B.halodurans were supplied by I. Uchiyama (National Institute of Basic Biology, Japan) and those of B.stearothermophilus were obtained by finding the best hit counterparts from both genomes. Nucleotides conserved across the three species are highlighted except where they belong to other coding regions. Many known cis-elements appear to be conserved across the three species (G.Terai, T.Takagi and K.Nakai, manuscript in preparation).
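The highlighting of conserved nucleotides can be illustrated with a minimal Python sketch. It is not the procedure used to build DBTBS: it assumes the three upstream sequences have already been brought into a common alignment of equal length (gaps written as `-`), whereas the database relies on pairwise LALIGN local alignments. The function name `conserved_mask` and the `excluded` parameter (column indices that fall in other coding regions) are hypothetical, and the sequences are toy data.

```python
from typing import Iterable, Optional


def conserved_mask(subtilis: str, halodurans: str, stearothermophilus: str,
                   excluded: Optional[Iterable[int]] = None) -> str:
    """Mark alignment columns that carry the same nucleotide in all three species.

    Returns a string of the same length as the alignment, with '*' at
    conserved columns and ' ' elsewhere. Columns listed in `excluded`
    (for example, positions inside another coding region) are never marked.
    """
    seqs = (subtilis.upper(), halodurans.upper(), stearothermophilus.upper())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    skip = set(excluded or ())
    marks = []
    for i, column in enumerate(zip(*seqs)):
        # A column is conserved only if all three symbols agree and are
        # real nucleotides (a shared gap does not count).
        same = len(set(column)) == 1 and column[0] in "ACGT"
        marks.append("*" if same and i not in skip else " ")
    return "".join(marks)


# Toy aligned sequences, not taken from the database.
bsu = "TTGACA-TATAAT"
bha = "TTGACAGTATAAT"
bst = "TTGTCA-TATTAT"
print(bsu)
print(conserved_mask(bsu, bha, bst))
```

Printing the mask under the B.subtilis sequence gives a plain-text counterpart of the highlighted display; passing `excluded` suppresses marks in columns that overlap other coding regions.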
The other is a table of 90 transcription factors including sigma factors. For each factor, the collection of its known binding sites is given with links to the PubMed database (6). Their consensus pattern (if any) is highlighted in red. The main source of our database is a literature survey of 291 references, including compilations such as that of Helmann (10). The entries are also linked to other databases such as SWISS-PROT (11) and SubtiList (7).
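The relation between the two tables, and the idea of a consensus pattern derived from aligned binding sites, can be sketched in Python as follows. This is an illustration under stated assumptions, not the DBTBS schema or its consensus procedure: the class names `BindingSite` and `Promoter`, the functions `sites_by_factor` and `consensus`, the 60% majority threshold, and the gene and factor names are all hypothetical, and the site sequences are toy data assumed to be pre-aligned and of equal length.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BindingSite:
    factor: str                      # transcription or sigma factor name
    start: int                       # position relative to the start site
    end: int
    sequence: str                    # nucleotide sequence of the site
    pubmed_id: Optional[str] = None  # supporting reference, if any


@dataclass
class Promoter:
    gene: str
    sites: List[BindingSite] = field(default_factory=list)


def sites_by_factor(promoters: List[Promoter]) -> Dict[str, List[str]]:
    """Derive the factor-centred table from the promoter-centred one."""
    table: Dict[str, List[str]] = defaultdict(list)
    for promoter in promoters:
        for site in promoter.sites:
            table[site.factor].append(site.sequence.upper())
    return dict(table)


def consensus(sequences: List[str], min_fraction: float = 0.6) -> str:
    """Column-wise majority consensus; 'N' where no base reaches min_fraction."""
    if not sequences:
        return ""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sites must be aligned to equal length")
    out = []
    for column in zip(*sequences):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(sequences) >= min_fraction else "N")
    return "".join(out)


# Toy entries with hypothetical gene and factor names.
promoters = [
    Promoter("geneA", [BindingSite("FactorX", -35, -30, "TTGACA")]),
    Promoter("geneB", [BindingSite("FactorX", -36, -31, "TTGACA")]),
    Promoter("geneC", [BindingSite("FactorX", -35, -30, "TTGATA")]),
    Promoter("geneD", [BindingSite("FactorX", -34, -29, "TAGACA")]),
]
factor_table = sites_by_factor(promoters)
print(consensus(factor_table["FactorX"]))
```

Storing each site once, under its promoter, and deriving the per-factor view on demand keeps the two tables consistent with each other, which is one simple way to realize the inter-related structure described above.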
FEATURES OF THE TRANSCRIPTION SYSTEM OF B.SUBTILIS
For users’ convenience, we systematically classified all potential transcription factors in the B.subtilis genome, mainly on the basis of Pfam motifs (12). Table 1 summarizes the result. The total number of independent transcription factors is estimated to be 275, which is comparable to the 314 estimated for E.coli in a similar study (13).
Table 1. Classification of B.subtilis transcription factors (including sigma factors).
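Family | Subfamily | Annotated | Predicted |
Sigma factors | σ70 | 8 | 1 |
Sigma factors | σ54 | 1 | 0 |
Sigma factors | σ70 ECF | 3 | 5 |
Helix–turn–helix family | MarR | 1 | 22 |
Helix–turn–helix family | LacI | 5 | 6 |
Helix–turn–helix family | GntR | 3 | 18 |
Helix–turn–helix family | LysR (HTH1) | 4 | 16 |
Helix–turn–helix family | ArsR (HTH5) | 1 | 6 |
Helix–turn–helix family | DeoR | 3 | 4 |
Helix–turn–helix family | AraC/XylS (HTH2) | 1 | 11 |
Helix–turn–helix family | GerE | 3 | 7 |
Helix–turn–helix family | CRP | 1 | 1 |
Helix–turn–helix family | Xre (HTH3) | 3 | 14 |
Helix–turn–helix family | MerR | 5 | 5 |
Helix–turn–helix family | TetR | 1 | 18 |
Helix–turn–helix family | AsnC | 4 | 3 |
Helix–turn–helix family | LexA | 1 | 0 |
Helix–turn–helix family | HTH6 | 0 | 2 |
Other families | Trans, Reg. C | 2 | 11 |
Other families | Response Reg. | 5 | 30 |
Other families | Fur | 2 | 1 |
Other families | σ54 binding factors | 4 | 1 |
Other families | Bgl antitermin. | 6 | 2 |
Other families | CSD | 0 | 3 |
Other families | IclR | 1 | 0 |
Other families | GreA | 0 | 1 |
Other families | Fe-dep. Repress. | 0 | 1 |
Other families | HrcA | 1 | 0 |
Other families | Arg repressor | 1 | 0 |
No families assigned | | 21 | 17 |
Total | | 91 | 206 |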
Note that these classifications are based on the presence of functional domains and are not mutually exclusive.
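The column totals of Table 1, and their relation to the estimate of 275 independent factors, can be checked with a short Python sketch. The counts are transcribed from Table 1; the names `TABLE1`, `column_totals` and `INDEPENDENT_FACTORS` are ours. Reading the difference between the summed assignments and 275 as the number of additional family memberships is our interpretation of the note above, not a figure stated in the text.

```python
# (family, subfamily, annotated, predicted), transcribed from Table 1.
TABLE1 = [
    ("Sigma factors", "sigma70", 8, 1),
    ("Sigma factors", "sigma54", 1, 0),
    ("Sigma factors", "sigma70 ECF", 3, 5),
    ("Helix-turn-helix", "MarR", 1, 22),
    ("Helix-turn-helix", "LacI", 5, 6),
    ("Helix-turn-helix", "GntR", 3, 18),
    ("Helix-turn-helix", "LysR (HTH1)", 4, 16),
    ("Helix-turn-helix", "ArsR (HTH5)", 1, 6),
    ("Helix-turn-helix", "DeoR", 3, 4),
    ("Helix-turn-helix", "AraC/XylS (HTH2)", 1, 11),
    ("Helix-turn-helix", "GerE", 3, 7),
    ("Helix-turn-helix", "CRP", 1, 1),
    ("Helix-turn-helix", "Xre (HTH3)", 3, 14),
    ("Helix-turn-helix", "MerR", 5, 5),
    ("Helix-turn-helix", "TetR", 1, 18),
    ("Helix-turn-helix", "AsnC", 4, 3),
    ("Helix-turn-helix", "LexA", 1, 0),
    ("Helix-turn-helix", "HTH6", 0, 2),
    ("Other families", "Trans, Reg. C", 2, 11),
    ("Other families", "Response Reg.", 5, 30),
    ("Other families", "Fur", 2, 1),
    ("Other families", "sigma54 binding factors", 4, 1),
    ("Other families", "Bgl antitermin.", 6, 2),
    ("Other families", "CSD", 0, 3),
    ("Other families", "IclR", 1, 0),
    ("Other families", "GreA", 0, 1),
    ("Other families", "Fe-dep. Repress.", 0, 1),
    ("Other families", "HrcA", 1, 0),
    ("Other families", "Arg repressor", 1, 0),
    ("No families assigned", "", 21, 17),
]

INDEPENDENT_FACTORS = 275  # estimate quoted in the text


def column_totals(rows):
    """Sum the Annotated and Predicted columns over all rows."""
    annotated = sum(row[2] for row in rows)
    predicted = sum(row[3] for row in rows)
    return annotated, predicted


annotated, predicted = column_totals(TABLE1)
assignments = annotated + predicted
print(annotated, predicted, assignments, assignments - INDEPENDENT_FACTORS)
```

The totals reproduce the last row of the table (91 and 206). Their sum, 297, exceeds 275 by 22, which is what one would expect if some proteins are counted in more than one family.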
To see whether B.subtilis promoters have specific features, we also compiled statistics on B.subtilis σA-dependent promoters and compared them with those on E.coli σ70-dependent promoters (14). The ratio of repressible to activatable promoters is about 6:4, which is similar to the value for E.coli. However, promoters with repeated homologous binding sites appear to be considerably less frequent in B.subtilis than in E.coli (Table 2). In addition, we noticed that, unlike E.coli promoters, B.subtilis promoters can have activator binding regions even downstream of the transcriptional initiation site (data not shown).
Table 2. Comparison of 91 B.subtilis σA-regulated promoters and 132 E.coli σ70-regulated promoters [data adopted from Gralla and Collado-Vides (14)].
Category | B.subtilis (%) | E.coli (%) |
Total number of repressible promoters | 61 | 69 |
Total number of activatable promoters | 43 | 49 |
Promoters with repressor and activator sites | 14 | 17 |
Promoters regulated by two or more different activators | 6 | 7 |
Promoters regulated by two or more different repressors | 5 | 3 |
Promoters with repeated homologous activator sites | 3 | 13 |
Promoters with repeated homologous repressor sites | 9 | 36 |
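The two comparisons drawn from Table 2 in the text can be reproduced with a few lines of Python. The percentages are transcribed from Table 2; the names `TABLE2`, `repressible_share` and `fold_difference` are ours. Because only percentages are reported, the sketch does not attempt any statistical test.

```python
# Percentages transcribed from Table 2: (B.subtilis, E.coli).
TABLE2 = {
    "repressible": (61, 69),
    "activatable": (43, 49),
    "repeated activator sites": (3, 13),
    "repeated repressor sites": (9, 36),
}


def repressible_share(repressible: float, activatable: float) -> float:
    """Fraction of the repressible + activatable total that is repressible."""
    return repressible / (repressible + activatable)


def fold_difference(b_subtilis: float, e_coli: float) -> float:
    """How many times more frequent a category is in E.coli than in B.subtilis."""
    return e_coli / b_subtilis


for species, index in (("B.subtilis", 0), ("E.coli", 1)):
    share = repressible_share(TABLE2["repressible"][index],
                              TABLE2["activatable"][index])
    print(f"{species}: repressible share = {share:.2f}")

for category in ("repeated activator sites", "repeated repressor sites"):
    print(f"{category}: {fold_difference(*TABLE2[category]):.1f}-fold")
```

Both species give a repressible share close to 0.6, consistent with the ratio of about 6:4, while promoters with repeated homologous sites are roughly four times more frequent in E.coli in both the activator and repressor categories.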
FUTURE PROSPECTS
Our database is currently accessible at http://elmo.ims.u-tokyo.ac.jp/dbtbs/. Links to other related databases such as RegulonDB and KEGG (15) are planned, and the prediction of as yet uncharacterized cis-element positions should be pursued. Finally, the incorporation of the results of systematic experiments from the B.subtilis projects (2) should be the key to its future development.
ACKNOWLEDGEMENTS
We are grateful to the members of the Japanese B.subtilis genomics project for their help in surveying recent papers on DNA-binding proteins. We also thank John D. Helmann and Hideto Takami for providing us with their sequence data, Ikuo Uchiyama for a table of orthologous genes between B.subtilis and B.halodurans, Yukiko Nakanishi for her assistance and the B.stearothermophilus Genome Sequencing Project, funded by the NSF EPSCoR Program, for releasing unfinished sequence data. This work was supported in part by Special Coordination Funds for Promoting Science and Technology from the Science and Technology Agency. K.Y., Y.F. and K.N. were also supported by a Grant-in-Aid for Scientific Research on Priority Areas from the Ministry of Education, Science and Culture of Japan.
References
- 1. Kunst,F., Ogasawara,N., Moszer,I., Albertini,A.M., Alloni,G., Azevedo,V., Bertero,M.G., Bessieres,P., Bolotin,A., Borchert,S. et al. (1997) The complete genome sequence of the gram-positive bacterium Bacillus subtilis. Nature, 390, 249–256.
- 2. Ogasawara,N. (2000) Systematic function analysis of Bacillus subtilis genes. Res. Microbiol., 151, 129–134.
- 3. Ye,R.W., Tao,W., Bedzyk,L., Young,T., Chen,M. and Li,L. (2000) Global gene expression profiles of Bacillus subtilis grown under anaerobic conditions. J. Bacteriol., 182, 4458–4465.
- 4. Yada,T., Totoki,Y., Ishii,T. and Nakai,K. (1997) Functional prediction of B. subtilis genes from their regulatory sequences. Intell. Syst. Mol. Biol., 5, 354–357.
- 5. Salgado,H., Santos-Zavaleta,A., Gama-Castro,S., Millán-Zárate,D., Blattner,F.R. and Collado-Vides,J. (2000) RegulonDB (version 3.0): transcriptional regulation and operon organization in Escherichia coli K-12. Nucleic Acids Res., 28, 65–67. Updated article in this issue: Nucleic Acids Res. (2001), 29, 72–74.
- 6. Wheeler,D.L., Chappey,C., Lash,A.E., Leipe,D.D., Madden,T.L., Schuler,G.D., Tatusova,T.A. and Rapp,B.A. (2000) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 28, 10–14. Updated article in this issue: Nucleic Acids Res. (2001), 29, 11–16.
- 7. Moszer,I., Glaser,P. and Danchin,A. (1995) SubtiList: a relational database for the Bacillus subtilis genome. Microbiology, 141, 261–268.
- 8. Fawcett,P., Eichenberger,P., Losick,R. and Youngman,P. (2000) The transcriptional profile of early to middle sporulation in Bacillus subtilis. Proc. Natl Acad. Sci. USA, 97, 8063–8068.
- 9. Takami,H., Nakasone,K., Takaki,Y., Maeno,G., Sasaki,R., Masui,N., Fuji,F., Hirama,C., Nakamura,Y., Ogasawara,N., Kuhara,S. and Horikoshi,K. (2000) Complete genome sequence of the alkaliphilic bacterium Bacillus halodurans and genomic sequence comparison with Bacillus subtilis. Nucleic Acids Res., 28, 4317–4331.
- 10. Helmann,J.D. (1995) Compilation and analysis of Bacillus subtilis sigmaA-dependent promoter sequences: evidence for extended contact between RNA polymerase and upstream promoter DNA. Nucleic Acids Res., 23, 2351–2360.
- 11. Bairoch,A. and Apweiler,R. (2000) The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res., 28, 45–48.
- 12. Bateman,A., Birney,E., Durbin,R., Eddy,S.R., Howe,K.L. and Sonnhammer,E.L.L. (2000) The Pfam protein families database. Nucleic Acids Res., 28, 263–266.
- 13. Pérez-Rueda,E. and Collado-Vides,J. (2000) The repertoire of DNA-binding transcriptional regulators in Escherichia coli K-12. Nucleic Acids Res., 28, 1838–1847.
- 14. Gralla,J.D. and Collado-Vides,J. (1996) Organization and function of transcription regulatory elements. In Neidhardt,F.C. and Curtiss,R. (eds), Cellular and Molecular Biology: Escherichia coli and Salmonella, 2nd edn. American Society for Microbiology, Washington, DC, pp. 1232–1245.
- 15. Kanehisa,M. and Goto,S. (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res., 28, 27–30.