Nucleic Acids Research. 1994 Nov 25;22(23):5112–5120. doi: 10.1093/nar/22.23.5112

Stochastic context-free grammars for tRNA modeling.

Y Sakakibara, M Brown, R Hughey, I S Mian, K Sjölander, R C Underwood, D Haussler
PMCID: PMC523785  PMID: 7800507

Abstract

Stochastic context-free grammars (SCFGs) are applied to the problems of folding, aligning and modeling families of tRNA sequences. SCFGs capture the sequences' common primary and secondary structure and generalize the hidden Markov models (HMMs) used in related work on protein and DNA. Results show that after having been trained on as few as 20 tRNA sequences from only two tRNA subfamilies (mitochondrial and cytoplasmic), the model can discern general tRNA from similar-length RNA sequences of other kinds, can find secondary structure of new tRNA sequences, and can produce multiple alignments of large sets of tRNA sequences. Our results suggest potential improvements in the alignments of the D- and T-domains in some mitochondrial tRNAs that cannot be fit into the canonical secondary structure.
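
To illustrate how an SCFG can capture nested base pairing, which an HMM cannot express, here is a minimal sketch of most-likely-parse (CYK/Viterbi) folding with a toy one-nonterminal grammar. It is not the grammar, training procedure, or parameter set from the paper: the rule set, the probabilities, and the names `pair_prob` and `viterbi_fold` are all assumptions made for illustration. The toy grammar also has no bifurcation rule, so it can describe a single hairpin but not a full tRNA cloverleaf.

```python
import math

# Toy SCFG with one nonterminal S. All probabilities are hypothetical and were
# chosen only so that the rules for S sum to 1:
#   S -> x S y  (x, y paired)   4*0.15 + 2*0.02 + 10*0.001 = 0.65
#   S -> x S    (x unpaired)    4*0.04                     = 0.16
#   S -> S x    (x unpaired)    4*0.04                     = 0.16
#   S -> x      (last base)     4*0.0075                   = 0.03
WATSON_CRICK = {"AU", "UA", "GC", "CG"}
WOBBLE = {"GU", "UG"}
P_LEFT = 0.04
P_RIGHT = 0.04
P_END = 0.0075


def pair_prob(x, y):
    """Hypothetical probability of the pairing rule S -> x S y."""
    if x + y in WATSON_CRICK:
        return 0.15
    if x + y in WOBBLE:
        return 0.02
    return 0.001


def viterbi_fold(seq):
    """Return (log probability, dot-bracket structure) of the best parse."""
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    # best[i][j] is the log probability of the most likely derivation of
    # seq[i..j] (inclusive) from S; back[i][j] records the rule that won.
    best = [[-math.inf] * n for _ in range(n)]
    back = [[None] * n for _ in range(n)]
    for i in range(n):
        best[i][i] = math.log(P_END)
        back[i][i] = "end"
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            candidates = [
                (math.log(P_LEFT) + best[i + 1][j], "left"),
                (math.log(P_RIGHT) + best[i][j - 1], "right"),
            ]
            if span >= 3:  # a pair must enclose at least one base
                log_pair = math.log(pair_prob(seq[i], seq[j]))
                candidates.append((log_pair + best[i + 1][j - 1], "pair"))
            best[i][j], back[i][j] = max(candidates)
    # Trace back from the full span to recover which positions are paired.
    structure = ["."] * n
    i, j = 0, n - 1
    while i < j:
        rule = back[i][j]
        if rule == "pair":
            structure[i], structure[j] = "(", ")"
            i, j = i + 1, j - 1
        elif rule == "left":
            i += 1
        else:
            j -= 1
    return best[0][n - 1], "".join(structure)


score, structure = viterbi_fold("GGGAAAACCC")
print(structure)  # (((....)))
```

Under these assumed probabilities a Watson-Crick pair (0.15) is far more likely than two unpaired bases (0.04 × 0.04), so the best parse stacks the three G-C pairs around the A-rich loop. The same log probability gives a crude score for ranking sequences; a realistic model would learn its probabilities from aligned training sequences rather than fixing them by hand.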

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
