Nucleic Acids Research. 1987 Oct 26;15(20):8125–8148. doi: 10.1093/nar/15.20.8125

An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs.

M Kozak
PMCID: PMC306349  PMID: 3313277

Abstract

5'-Noncoding sequences have been compiled from 699 vertebrate mRNAs. (GCC)GCC(A/G)CCATGG emerges as the consensus sequence for initiation of translation in vertebrates. The most highly conserved position in that motif is the purine in position -3 (three nucleotides upstream from the ATG codon); 97% of vertebrate mRNAs have a purine, most often A, in that position. The periodic occurrence of G (in positions -3, -6, -9) is discussed. Upstream ATG codons occur in fewer than 10% of vertebrate mRNAs at large; the notable exception is oncogene transcripts, two-thirds of which have ATG codons preceding the start of the major open reading frame. The leader sequences of most vertebrate mRNAs fall in the size range of 20 to 100 nucleotides. The significance of shorter and longer 5'-noncoding sequences is discussed.
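
The features summarized in the abstract can be checked on a single sequence with a few lines of code. The following is a minimal Python sketch, not the procedure used in the paper: the function names `start_codon_context` and `upstream_atg_count` are hypothetical, the input is assumed to be a cDNA-style string with a known index for the A of the initiator ATG, and position +4 follows the usual convention of numbering that A as +1.

```python
PURINES = {"A", "G"}


def start_codon_context(seq, atg_index):
    """Describe the nucleotides flanking the ATG that starts at atg_index.

    Hypothetical helper for illustration only. Position -3 is three
    nucleotides upstream of the A of ATG; position +4 is the nucleotide
    immediately after the G of ATG.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA spelling
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    minus3 = seq[atg_index - 3] if atg_index >= 3 else None
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else None
    return {
        "minus3": minus3,
        "purine_at_minus3": minus3 in PURINES,
        "g_at_plus4": plus4 == "G",
    }


def upstream_atg_count(seq, atg_index):
    """Count ATG triplets lying entirely within the leader (5' of atg_index)."""
    leader = seq.upper().replace("U", "T")[:atg_index]
    return sum(1 for i in range(len(leader) - 2) if leader[i:i + 3] == "ATG")


# A made-up sequence that spells out one form of the consensus, GCCGCCACCATGG.
example = "GCCGCCACCATGGCG"
print(start_codon_context(example, 9))
print(upstream_atg_count(example, 9))
```

The leader length discussed in the abstract corresponds to `atg_index` itself when the sequence begins at the transcription start site, which this sketch assumes rather than verifies.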



Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
