Proceedings of the National Academy of Sciences of the United States of America. 1992 Dec 15;89(24):12098–12102. doi: 10.1073/pnas.89.24.12098

Sequence-structure matching in globular proteins: application to supersecondary and tertiary structure determination.

A Godzik, J Skolnick
PMCID: PMC50705  PMID: 1465445

Abstract

A methodology designed to address the inverse globular protein-folding problem (the identification of which sequences are compatible with a given three-dimensional structure) is described. By using a library of protein fingerprints, defined by the side-chain interaction pattern, it is possible to match each structure to its own sequence in an exhaustive database search. It is shown that this is a permissive requirement for the validation of the methodology. To pass the more rigorous test of identifying proteins that are not close sequence homologs, but that have similar structure, the method has been extended to include insertions and deletions in the sequence, which is compared to the fingerprint. This allows for the identification of sequences having little or no sequence homology to the fingerprint. Examples include plastocyanin/azurin/pseudoazurin, the globin family, different families of proteases and cytochromes, including cytochromes c' and b-562, actinidin/papain, and lysozyme/alpha-lactalbumin. Turning to supersecondary structure prediction, we find that alpha/beta/alpha fragments possess sufficient specificity to identify their own and related sequences. By threading a beta-hairpin through a sequence, it is possible to predict the location of such hairpins and turns with remarkable fidelity. Thus, the method greatly extends existing techniques for the prediction of both global structural homology and local supersecondary structure.
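
The idea of matching a sequence to a fingerprint while allowing insertions and deletions can be illustrated with a minimal sketch. This is not the authors' method. As simplifying assumptions, the fingerprint is reduced to a hypothetical string of buried (`B`) and exposed (`E`) positions rather than a side-chain interaction pattern, the score is a toy hydrophobic/polar match, the gap penalty is arbitrary, and the alignment is a standard global dynamic-programming alignment. The names `HYDROPHOBIC`, `match_score`, and `thread` are invented for illustration.

```python
# Toy sequence-to-fingerprint threading with gaps (illustrative only).
# Assumption: the fingerprint is a string of 'B' (buried) / 'E' (exposed)
# states, a one-body simplification of a real side-chain contact pattern.

HYDROPHOBIC = set("AVILMFWC")  # hypothetical hydrophobic residue set


def match_score(residue: str, state: str) -> int:
    """+1 if a hydrophobic residue sits at a buried position or a polar
    residue sits at an exposed position; -1 otherwise."""
    buried = state == "B"
    return 1 if (residue in HYDROPHOBIC) == buried else -1


def thread(sequence: str, fingerprint: str, gap: int = -2) -> int:
    """Best global alignment score of a sequence against a fingerprint,
    using a linear gap penalty for insertions and deletions."""
    n, m = len(sequence), len(fingerprint)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                # residue i placed at fingerprint position j
                score[i - 1][j - 1] + match_score(sequence[i - 1], fingerprint[j - 1]),
                # residue i is an insertion relative to the fingerprint
                score[i - 1][j] + gap,
                # fingerprint position j is left unfilled (deletion)
                score[i][j - 1] + gap,
            )
    return score[n][m]


if __name__ == "__main__":
    fingerprint = "BEBEB"
    print(thread("VKVKV", fingerprint))   # 5: every position matches
    print(thread("VKGVKV", fingerprint))  # 3: one inserted residue costs one gap
```

In a database search of the kind the abstract describes, a score like this would be computed for every candidate sequence and the candidates ranked. A realistic potential would depend on pairs of interacting residues, which makes the alignment problem considerably harder than this one-body sketch suggests.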



