Nucleic Acids Research. 1985 Jan 11;13(1):185–194. doi: 10.1093/nar/13.1.185

A method to locate protein coding sequences in DNA of prokaryotic systems.

A S Kolaskar, B V Reddy
PMCID: PMC340983  PMID: 3839071

Abstract

cDNA sequence data from E. coli phages, for which complete genome sequences are known, have been analysed. From this analysis, thirteen triplets have been identified as markers to distinguish protein-coding frames from fortuitous open reading frames. The region from -18 to +18 nucleotides around ATG/GTG has been analysed and used to distinguish initiator codons from internal ATG/GTG. With the aid of the criteria defined above, a method has been developed to locate protein coding sequences by a combination of the 'gene search by signal' and 'gene search by content' approaches. Application of this method to prokaryotic systems, including those that were not part of our database, indicates that it is quite accurate and general in nature.
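
The combination of signal and content searches described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the thirteen marker triplets or the scoring rule applied to the region around the initiator, so the `markers` set is a hypothetical parameter supplied by the caller, and the function only collects the raw material (the flanking window and an in-frame marker count) that such criteria would operate on. The names `candidate_frames`, `Candidate`, `FLANK` and `min_codons` are assumptions made for illustration, as is the reading of "-18 to +18" as 18 bases on each side of the start codon. Only the forward strand is scanned.

```python
from typing import Iterator, NamedTuple

START_CODONS = {"ATG", "GTG"}          # initiator candidates named in the abstract
STOP_CODONS = {"TAA", "TAG", "TGA"}    # standard stop codons
FLANK = 18  # assumption: 18 bases upstream and 18 downstream of the start codon


class Candidate(NamedTuple):
    start: int         # 0-based index of the first base of ATG/GTG
    end: int           # index one past the last base of the stop codon
    window: str        # region around the start codon ("signal" input)
    marker_count: int  # in-frame marker triplets seen ("content" input)


def candidate_frames(seq: str,
                     markers: frozenset = frozenset(),
                     min_codons: int = 30) -> Iterator[Candidate]:
    """Yield open reading frames that begin with ATG/GTG on the forward strand.

    `markers` is a hypothetical, caller-supplied set of triplets; the paper's
    thirteen marker triplets are not given in the abstract.
    """
    seq = seq.upper()
    for start in range(len(seq) - 2):
        if seq[start:start + 3] not in START_CODONS:
            continue
        count = 0
        # Walk codon by codon in the frame set by the candidate start.
        for pos in range(start + 3, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if codon in STOP_CODONS:
                # Frame length counts the start codon but not the stop codon.
                if (pos - start) // 3 >= min_codons:
                    window = seq[max(0, start - FLANK):start + 3 + FLANK]
                    yield Candidate(start, pos + 3, window, count)
                break
            if codon in markers:
                count += 1
```

Each `Candidate` could then be passed to whatever scoring scheme is chosen for the window and the marker count. Internal ATG/GTG codons that share a frame and stop codon with an upstream start are reported as separate candidates, which is the ambiguity the signal criterion in the abstract is meant to resolve.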



Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
