Abstract
In the context of the international project to sequence the whole genome of Bacillus subtilis, we have developed a non-redundant, fully annotated database of sequences from this organism. Starting from the B. subtilis sequences available in the EMBL, GenBank and DDBJ collections, we removed all duplications we encountered and then added extra annotations to the sequences (e.g. accession numbers for the genes, locations on the genetic map, codon usage). We have also added cross-references to the EMBL, MEDLINE, SWISS-PROT and ENZYME data banks. The present system results from the merging of the NRSub and SubtiList databases, and the sequence contigs used in the two systems are identical. NRSub is distributed as a flatfile in EMBL format (which is supported by most sequence analysis software packages) and as an ACNUC database, while SubtiList is distributed as a relational database under 4th Dimension. The data can be accessed through two dedicated World Wide Web servers located in France and Japan.
