Biophysical Journal. 2002 Nov;83(5):2781–2791. doi: 10.1016/s0006-3495(02)75287-9

Analysis of protein sequence/structure similarity relationships.

Hin Hark Gan, Rebecca A Perlow, Sharmili Roy, Joy Ko, Min Wu, Jing Huang, Shixiang Yan, Angelo Nicoletta, Jonathan Vafai, Ding Sun, Lihua Wang, Joyce E Noah, Samuela Pasquali, Tamar Schlick
PMCID: PMC1302362  PMID: 12414710

Abstract

Current analyses of protein sequence/structure relationships have focused on expected similarity relationships for structurally similar proteins. To survey and explore the basis of these relationships, we present a general sequence/structure map that covers all combinations of similarity/dissimilarity relationships and provide novel energetic analyses of these relationships. To aid our analysis, we divide protein relationships into four categories: expected/unexpected similarity (S and S(?)) and expected/unexpected dissimilarity (D and D(?)) relationships. In the expected similarity region S, we show that trends in the sequence/structure relation can be derived based on the requirement of protein stability and the energetics of sequence and structural changes. Specifically, we derive a formula relating sequence and structural deviations to a parameter characterizing protein stiffness; the formula fits the data reasonably well. We suggest that the absence of data in region S(?) (high structural but low sequence similarity) is due to unfavorable energetics. In contrast to region S, region D(?) (high sequence but low structural similarity) is well-represented by proteins that can accommodate large structural changes. Our analyses indicate that there are several categories of similarity relationships and that protein energetics provide a basis for understanding these relationships.
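
As an illustration only, the four-category map described above can be sketched as a simple classifier over a protein pair. The sketch assumes that region S means both sequence and structure are similar and region D means both are dissimilar (the abstract states the combinations explicitly only for S(?) and D(?)). The function name, the choice of sequence identity and a structural-deviation value as inputs, and the cutoff parameters are hypothetical and do not come from the article; the cutoffs must be supplied by the caller.

```python
from enum import Enum


class Region(Enum):
    S = "expected similarity"
    S_Q = "unexpected similarity, S(?)"
    D = "expected dissimilarity"
    D_Q = "unexpected dissimilarity, D(?)"


def classify_pair(seq_identity, struct_deviation, identity_cutoff, deviation_cutoff):
    """Place a protein pair in one of the four regions of the map.

    Hypothetical helper: the inputs and both cutoffs are assumptions,
    not values or measures taken from the article.
    """
    # High sequence similarity: identity at or above the caller's cutoff.
    high_seq = seq_identity >= identity_cutoff
    # High structural similarity: deviation at or below the caller's cutoff.
    high_struct = struct_deviation <= deviation_cutoff

    if high_seq and high_struct:
        return Region.S      # assumed reading: similar in both
    if high_struct:
        return Region.S_Q    # high structural, low sequence similarity
    if high_seq:
        return Region.D_Q    # high sequence, low structural similarity
    return Region.D          # assumed reading: dissimilar in both


# Usage with purely illustrative numbers and cutoffs.
print(classify_pair(0.95, 8.0, identity_cutoff=0.5, deviation_cutoff=3.0).name)
```

With these illustrative inputs the pair has high sequence identity but a large structural deviation, so it falls in the D(?) region, the kind of case the abstract associates with proteins that can accommodate large structural changes.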


Articles from Biophysical Journal are provided here courtesy of The Biophysical Society
