Abstract
Pfam contains multiple alignments and hidden Markov model based profiles (HMM-profiles) of complete protein domains. Domain boundaries, family members and alignments are defined semi-automatically, based on expert knowledge, sequence similarity, other protein family databases and the ability of HMM-profiles to correctly identify and align the members. Release 2.0 of Pfam contains 527 manually verified families, which are available for browsing and on-line searching via the World Wide Web in the UK at http://www.sanger.ac.uk/Pfam/ and in the US at http://genome.wustl.edu/Pfam/. Pfam 2.0 matches one or more domains in 50% of Swissprot-34 sequences and in 25% of a large sample of predicted proteins from the Caenorhabditis elegans genome.