Abstract
An algorithm for multiple sequence alignment is presented that matches words whose length and degree of mismatch are chosen by the user. The resulting alignment maximizes an alignment scoring function. The method is a novel extension of our consensus sequence methods. The algorithm works for both DNA and protein sequences, and earlier work on consensus sequences makes it possible to estimate statistical significance.
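To make the idea of matching words of a chosen length and mismatch level concrete, here is a minimal brute-force sketch in Python. It is not the algorithm of the paper, which is not specified in this abstract. The function names `hamming` and `best_consensus_word`, the parameters `k` (word length) and `d` (maximum mismatches), and the scoring rule (the number of sequences containing a match) are all assumptions made for illustration.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def best_consensus_word(seqs, k, d, alphabet="ACGT"):
    """Brute-force search for the length-k word matched by the most sequences.

    Assumption for this sketch: a sequence "matches" a candidate word if some
    window of length k in it differs from the word in at most d positions.
    Returns (word, count, hits), where hits[i] is the start of the closest
    window in seqs[i], or None if no window is within d mismatches.
    """
    best = ("", -1, [])
    for letters in product(alphabet, repeat=k):
        word = "".join(letters)
        hits = []
        for s in seqs:
            # Distance from the candidate word to every window of length k.
            windows = [(hamming(word, s[i:i + k]), i)
                       for i in range(len(s) - k + 1)]
            dist, pos = min(windows, default=(d + 1, None))
            hits.append(pos if dist <= d else None)
        count = sum(h is not None for h in hits)
        if count > best[1]:  # keep the first word reaching the highest count
            best = (word, count, hits)
    return best


if __name__ == "__main__":
    seqs = ["TTACGTAA", "GGACGAAC", "CCCACGTT"]
    word, count, hits = best_consensus_word(seqs, k=4, d=1)
    print(word, count, hits)
```

The sketch enumerates every word of length `k` over the alphabet, so it is practical only for small `k`. For protein sequences, the 20-letter amino acid alphabet could be passed as `alphabet`.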

Selected References
- Galas D. J., Eggert M., Waterman M. S. Rigorous pattern-recognition methods for DNA sequences. Analysis of promoter sequences from Escherichia coli. J Mol Biol. 1985 Nov 5;186(1):117–128. doi: 10.1016/0022-2836(85)90262-1.
- Martinez H. M. An efficient method for finding repeats in molecular sequences. Nucleic Acids Res. 1983 Jul 11;11(13):4629–4634. doi: 10.1093/nar/11.13.4629.
- Murata M., Richardson J. S., Sussman J. L. Simultaneous comparison of three protein sequences. Proc Natl Acad Sci U S A. 1985 May;82(10):3073–3077. doi: 10.1073/pnas.82.10.3073.
- Sankoff D., Cedergren R. J., Lapalme G. Frequency of insertion-deletion, transversion, and transition in the evolution of 5S ribosomal RNA. J Mol Evol. 1976 Mar 29;7(2):133–149. doi: 10.1007/BF01732471.
- Sobel E., Martinez H. M. A multiple sequence alignment program. Nucleic Acids Res. 1986 Jan 10;14(1):363–374. doi: 10.1093/nar/14.1.363.
- Waterman M. S., Arratia R., Galas D. J. Pattern recognition in several sequences: consensus and alignment. Bull Math Biol. 1984;46(4):515–527. doi: 10.1007/BF02459500.