Abstract
Few protein structures are known compared with the enormous amount of sequence data produced by genome sequencing, and few protein complexes are deposited in the PDB compared with the large amount of interaction data coming from high-throughput experiments (two-hybrid screens, or affinity purification of protein complexes followed by mass spectrometry). Nevertheless, computational techniques can extract high-quality, information-rich data from the known structures and propagate them across protein sequence space. We describe here the ongoing research projects in our group. First, we analyse the protein complexes stored in the PDB: for each complex involving a domain that belongs to a family of interaction domains for which some interaction data are available, we calculate the probability of its interaction with any protein sequence. Second, we analyse the structures of proteins whose function is described by a PROSITE pattern with relatively low selectivity and specificity, and build extended patterns. To this aim, we consider residues that are well conserved in the structure, even if their conservation cannot easily be recognized in the sequence alignment of the proteins sharing the function. Third, we analyse protein surface regions: starting from the annotation of the solvent-exposed residues, we annotate protein surface patches through a structural comparison that is performed with stringent parameters and is independent of the residue order in the sequence. Local surface comparison may also help to identify new sequence patterns that sequence-based methods alone could not highlight.
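For readers unfamiliar with PROSITE patterns, the following minimal sketch shows how a pattern written in the standard PROSITE syntax can be translated into a regular expression and scanned against a sequence. It illustrates only the plain sequence-based matching that the extended patterns start from, not the structure-based extension described in the abstract. The function names `prosite_to_regex` and `find_matches` and the toy sequence are hypothetical and chosen for illustration.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex string.

    Handles: 'x' (any residue), [ABC] (allowed set), {ABC} (forbidden set),
    (n) / (n,m) repetitions, '<' and '>' terminal anchors, trailing '.'.
    """
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split the element into its core and an optional repetition count.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"
        # Single letters and [ABC] sets are already valid regex fragments.
        rep = ""
        if lo is not None:
            rep = "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(prefix + core + rep + suffix)
    return "".join(parts)

def find_matches(pattern: str, sequence: str):
    """Return (start, matched_substring) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]

if __name__ == "__main__":
    # Illustrative zinc-finger-like pattern and a toy sequence (assumptions).
    pat = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H."
    seq = "AA" + "C" + "AA" + "C" + "AAA" + "L" + "A" * 8 + "H" + "AAA" + "H" + "AA"
    print(prosite_to_regex(pat))
    print(find_matches(pat, seq))
```

A short pattern of this kind matches any sequence that happens to contain the motif, which is one reason such patterns can have limited specificity and why additional, structurally conserved positions may be worth adding.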
