Abstract
I present a common philosophy for implementing the EMBL and GENBANK (BBN-Los Alamos) nucleic acid sequence databases, as well as the National Biological Foundation (Dayhoff) protein sequence database. The associated FORTRAN 77 fully transportable software package includes: 1) modules for implementing each of these databases from the initial magnetic tape file, 2) modules performing a fast mnemonic access, 3) modules performing key-string access and allowing the definition of user-specific database subsets, 4) a common probe searching module allowing the stacking of multiple combined search requests over the databases. This software is particularly suitable for 32-bit mini/microcomputers but would eventually run on 16-bit computers.
Full text
PDFSelected References
These references are in PubMed. This may not be the complete list of references from this article.
- Brutlag D. L., Clayton J., Friedland P., Kedes L. H. SEQ: a nucleotide sequence analysis and recombination system. Nucleic Acids Res. 1982 Jan 11;10(1):279–294. doi: 10.1093/nar/10.1.279. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Doolittle R. F. Similar amino acid sequences: chance or common ancestry? Science. 1981 Oct 9;214(4517):149–159. doi: 10.1126/science.7280687. [DOI] [PubMed] [Google Scholar]
- Dumas J. P., Ninio J. Efficient algorithms for folding and comparing nucleic acid sequences. Nucleic Acids Res. 1982 Jan 11;10(1):197–206. doi: 10.1093/nar/10.1.197. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kanehisa M. I. Los Alamos sequence analysis package for nucleic acids and proteins. Nucleic Acids Res. 1982 Jan 11;10(1):183–196. doi: 10.1093/nar/10.1.183. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Orcutt B. C., George D. G., Fredrickson J. A., Dayhoff M. O. Nucleic acid sequence database computer system. Nucleic Acids Res. 1982 Jan 11;10(1):157–174. doi: 10.1093/nar/10.1.157. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Staden R. A strategy of DNA sequencing employing computer programs. Nucleic Acids Res. 1979 Jun 11;6(7):2601–2610. doi: 10.1093/nar/6.7.2601. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Staden R. Further procedures for sequence analysis by computer. Nucleic Acids Res. 1978 Mar;5(3):1013–1016. doi: 10.1093/nar/5.3.1013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Staden R., McLachlan A. D. Codon preference and its use in identifying protein coding regions in long DNA sequences. Nucleic Acids Res. 1982 Jan 11;10(1):141–156. doi: 10.1093/nar/10.1.141. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Staden R. Sequence data handling by computer. Nucleic Acids Res. 1977 Nov;4(11):4037–4051. doi: 10.1093/nar/4.11.4037. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wilbur W. J., Lipman D. J. Rapid similarity searches of nucleic acid and protein data banks. Proc Natl Acad Sci U S A. 1983 Feb;80(3):726–730. doi: 10.1073/pnas.80.3.726. [DOI] [PMC free article] [PubMed] [Google Scholar]