Abstract
We have constructed a non-homologous database, termed the Integrated Sequence-Structure Database (ISSD), which comprises the coding sequences of genes, the amino acid sequences of the corresponding proteins, their secondary structure and φ,ψ angle assignments, and polypeptide backbone coordinates. Each protein entry in the database holds the alignment of the nucleotide sequence, the amino acid sequence and the PDB three-dimensional structure data. The nucleotide and amino acid sequences for each entry are selected on the basis of exact matches of the source organism and cell environment. The current version of ISSD, 1.0, is available on the WWW at http://www.protein.bio.msu.su/issd/ and includes 107 non-homologous mammalian proteins, of which 80 are human proteins. We have used the database to analyse synonymous codon usage patterns in mRNA sequences, showing their correlation with three-dimensional structure features of the encoded proteins. Possible ISSD applications include optimisation of protein expression, improvement of protein structure prediction accuracy, and analysis of evolutionary aspects of the nucleotide sequence-protein structure relationship.
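The per-residue alignment of codon, amino acid and structure described above, and the kind of codon usage tally it makes possible, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `AlignedResidue` record, its field names, the one-letter secondary-structure codes and the toy data are hypothetical and do not reproduce the actual ISSD entry format or the analysis method used with it.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedResidue:
    """One aligned position: codon, residue and structure (hypothetical layout)."""
    codon: str        # three nucleotides from the coding sequence
    amino_acid: str   # one-letter residue code
    sec_struct: str   # assumed one-letter class, e.g. "H" helix, "E" strand
    phi: float        # backbone phi angle in degrees
    psi: float        # backbone psi angle in degrees


def codon_usage_by_structure(residues):
    """Count codons separately for each (amino acid, secondary structure) pair."""
    usage = defaultdict(Counter)
    for r in residues:
        usage[(r.amino_acid, r.sec_struct)][r.codon] += 1
    return usage


# Toy entry with made-up values, not taken from ISSD.
entry = [
    AlignedResidue("GCC", "A", "H", -60.0, -45.0),
    AlignedResidue("GCT", "A", "H", -62.0, -41.0),
    AlignedResidue("GCC", "A", "E", -120.0, 130.0),
    AlignedResidue("CTG", "L", "H", -65.0, -40.0),
]

usage = codon_usage_by_structure(entry)
for (amino_acid, sec_struct), counts in sorted(usage.items()):
    print(amino_acid, sec_struct, dict(counts))
```

Grouping by both amino acid and structural class keeps synonymous codons comparable: differences in the counts for one residue type across classes are what a correlation between codon choice and structure would look like in such a table.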