Abstract
A method for the simultaneous alignment of a very large number of sequences using simulated annealing is presented. The total running time of the algorithm does not depend explicitly on the number of sequences treated. The method has been used for the simultaneous alignment of 1462 human intron sequences upstream of the intron-exon boundary. The consensus sequence of the aligned set, together with a calculation of the Shannon information at each position, shows that several sequence motifs are conserved: (i) a previously undetected guanosine-rich region, (ii) the branch point, and (iii) the polypyrimidine tract. The nucleotide frequencies at each position of the branch point consensus sequence qualitatively reproduce the frequencies of the experimentally determined branch points.
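The abstract refers to a consensus sequence and a per-position Shannon information calculation. As a rough illustration only, the sketch below computes both for a small set of already aligned, equal-length nucleotide sequences. It is not the authors' implementation: the function names `column_information` and `consensus` are hypothetical, the information measure is assumed to be the common form log2(4) minus the column entropy in bits, and no small-sample correction is applied.

```python
import math
from collections import Counter


def column_information(sequences, alphabet="ACGT"):
    """Per-column Shannon information in bits for aligned, equal-length sequences.

    Assumption: information = log2(len(alphabet)) - entropy of the column,
    with no small-sample correction. Symbols outside `alphabet` are ignored.
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("aligned sequences must have equal length")

    max_entropy = math.log2(len(alphabet))
    info = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sequences if s[pos] in alphabet)
        total = sum(counts.values())
        if total == 0:
            # No usable symbols in this column: report zero information.
            info.append(0.0)
            continue
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        info.append(max_entropy - entropy)
    return info


def consensus(sequences, alphabet="ACGT"):
    """Most frequent symbol in each column; 'N' if the column has no usable symbol."""
    if not sequences:
        return ""
    result = []
    for pos in range(len(sequences[0])):
        counts = Counter(s[pos] for s in sequences if s[pos] in alphabet)
        result.append(counts.most_common(1)[0][0] if counts else "N")
    return "".join(result)


if __name__ == "__main__":
    # Toy alignment, invented for illustration; not data from the paper.
    aligned = ["ACGT", "ACGA", "ACCC", "ATTG"]
    print(consensus(aligned))
    print([round(x, 3) for x in column_information(aligned)])
```

A fully conserved column scores 2 bits and a column with all four nucleotides equally frequent scores 0 bits, so conserved motifs appear as runs of high-information positions in the aligned set.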