PRINTS--a database of protein motif fingerprints

T K Attwood; M E Beck; A J Bleasby; D J Parry-Smith

1994 Sep;22(17):3590–3596.

PMCID: PMC308327 PMID: 7937065

Abstract

PRINTS is a compendium of protein motif 'fingerprints'. A fingerprint is defined as a group of motifs excised from conserved regions of a sequence alignment, whose diagnostic power, or potency, is refined by iterative database scanning (in this case, of the OWL composite sequence database). Generally, the motifs do not overlap, but are separated along a sequence, though they may be contiguous in 3D space. The use of groups of independent, linearly or spatially distinct motifs allows protein folds and functionalities to be characterised more flexibly and powerfully than conventional single-component patterns or regular expressions. The current version of the database contains 200 entries (encoding 950 motifs), covering a wide range of globular and membrane proteins, modular polypeptides, and so on. The growth of the database is influenced by a number of factors, e.g. the use of multiple motifs, the maximisation of sequence information through iterative database scanning, and the fact that the database searched is a large composite. The information contained within PRINTS is distinct from, but complementary to, the consensus expressions stored in the widely used PROSITE dictionary of patterns.
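To make the idea of a multi-motif fingerprint concrete, the following is a minimal Python sketch, not the method used to build or search PRINTS. It assumes a fingerprint is an ordered list of ungapped motifs, scores each sequence window by simple column-wise residue frequencies (a stand-in scoring scheme chosen for brevity), and matches motifs greedily in order without overlap. The function names, the threshold value, and the toy motifs and sequences are all hypothetical.

```python
from collections import Counter


def frequency_matrix(motif):
    """Column-wise residue frequencies for an ungapped motif.

    `motif` is a list of equal-length strings, one per aligned sequence.
    """
    n = len(motif)
    return [
        {res: count / n for res, count in Counter(column).items()}
        for column in zip(*motif)
    ]


def best_hit(matrix, seq, start=0):
    """Return (score, position) of the best window at or after `start`.

    The score is the mean frequency of the observed residues, in [0, 1].
    Returns None when no full-width window fits.
    """
    width = len(matrix)
    best = None
    for pos in range(start, len(seq) - width + 1):
        window = seq[pos:pos + width]
        score = sum(col.get(res, 0.0) for col, res in zip(matrix, window)) / width
        if best is None or score > best[0]:
            best = (score, pos)
    return best


def scan_fingerprint(fingerprint, seq, threshold=0.6):
    """Greedily match motifs in order, without overlap.

    Returns a list of (motif_index, position, score) for motifs whose best
    window reaches `threshold` (an arbitrary illustrative cut-off).
    """
    hits = []
    start = 0
    for index, motif in enumerate(fingerprint):
        hit = best_hit(frequency_matrix(motif), seq, start)
        if hit is not None and hit[0] >= threshold:
            score, pos = hit
            hits.append((index, pos, score))
            start = pos + len(motif[0])  # the next motif must lie downstream
    return hits


# Toy fingerprint of two motifs (invented data, not taken from PRINTS).
fingerprint = [
    ["GDSL", "GDSV", "GDTL"],
    ["WKRY", "WKRF", "WRRY"],
]

full = scan_fingerprint(fingerprint, "MAAGDSLPPPPWKRYEE")
partial = scan_fingerprint(fingerprint, "MAAGDSLPPPPAAAAEE")

print([(i, p) for i, p, _ in full])     # both motifs found, at positions 3 and 11
print([(i, p) for i, p, _ in partial])  # only the first motif found
```

Reporting how many motifs matched, rather than a single yes or no, is what lets a partial match still carry information; a single regular expression would simply fail on the second sequence.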


