Abstract
It has long been recognized that the various genome classes are distinguishable on the basis of base composition and nearest-neighbor frequencies. In addition, Grantham et al. (8) have recently presented evidence that these distinctions are preserved at the level of codon usage. As discussed in this report, it is now clear that these and related statistics can uniquely characterize the various functional domains of the genome. In particular, the peptide-coding, intervening-segment, structural-RNA-coding and mitochondrial domains of the vertebrate genome are uniquely characterizable. The statistical measures not only reflect understood functional differences among these domains but also suggest others. It is somewhat surprising that such simple statistics of nucleic acid sequences reflect so much of the encoded complex pattern information and/or the effects of selective constraints. Here, we investigate the statistical measures most distinctive of the various domains and link them to our current understanding insofar as possible.
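For illustration, the three kinds of statistic named in the abstract (base composition, nearest-neighbor frequencies and codon usage) can be sketched in a few lines of Python. This is a minimal sketch and not the procedure used in the report: the function names, the observed/expected normalization of dinucleotide frequencies, and the choice to skip non-ACGT symbols are all assumptions made for the example.

```python
from collections import Counter

BASES = "ACGT"

def base_composition(seq):
    """Return the fraction of each base (A, C, G, T) in seq."""
    counts = Counter(b for b in seq.upper() if b in BASES)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {b: counts[b] / total for b in BASES}

def dinucleotide_odds(seq):
    """Return observed/expected ratios for each nearest-neighbor pair.

    Expected frequency is the product of the two single-base
    frequencies (an assumed normalization, chosen for illustration).
    """
    seq = seq.upper()
    freq = base_composition(seq)
    pairs = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_pairs = sum(pairs.values())
    return {
        p: (c / n_pairs) / (freq[p[0]] * freq[p[1]])
        for p, c in pairs.items()
    }

def codon_usage(seq, frame=0):
    """Count codons read in the given frame, skipping ambiguous ones."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return Counter(c for c in codons if all(b in BASES for b in c))
```

Under these assumptions, a CpG ratio from `dinucleotide_odds` well below 1 would correspond to the CpG depletion discussed in the methylation literature cited in the references, and `codon_usage` only makes sense for a sequence whose reading frame is known.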

Selected References
These references are in PubMed. This may not be the complete list of references from this article.
- Anderson S., Bankier A. T., Barrell B. G., de Bruijn M. H., Coulson A. R., Drouin J., Eperon I. C., Nierlich D. P., Roe B. A., Sanger F. Sequence and organization of the human mitochondrial genome. Nature. 1981 Apr 9;290(5806):457–465. doi: 10.1038/290457a0.
- Benoist C., O'Hare K., Breathnach R., Chambon P. The ovalbumin gene-sequence of putative control regions. Nucleic Acids Res. 1980 Jan 11;8(1):127–142. doi: 10.1093/nar/8.1.127.
- Bird A. P. DNA methylation and the frequency of CpG in animal DNA. Nucleic Acids Res. 1980 Apr 11;8(7):1499–1504. doi: 10.1093/nar/8.7.1499.
- Bourgeois S., Jernigan R. L., Szu S. C., Kabat E. A., Wu T. T. Composite predictions of secondary structures of lac repressor. Biopolymers. 1979 Oct;18(10):2625–2643. doi: 10.1002/bip.1979.360181017.
- Browne M. J., Burdon R. H. The sequence specificity of vertebrate DNA methylation. Nucleic Acids Res. 1977 Apr;4(4):1025–1037. doi: 10.1093/nar/4.4.1025.
- Büchel D. E., Gronenborn B., Müller-Hill B. Sequence of the lactose permease gene. Nature. 1980 Feb 7;283(5747):541–545. doi: 10.1038/283541a0.
- Dickerson R. E., Drew H. R. Structure of a B-DNA dodecamer. II. Influence of base sequence on helix structure. J Mol Biol. 1981 Jul 15;149(4):761–786. doi: 10.1016/0022-2836(81)90357-0.
- Duncan B. K., Miller J. H. Mutagenic deamination of cytosine residues in DNA. Nature. 1980 Oct 9;287(5782):560–561. doi: 10.1038/287560a0.
- Fields S., Winter G., Brownlee G. G. Structure of the neuraminidase gene in human influenza virus A/PR/8/34. Nature. 1981 Mar 19;290(5803):213–217. doi: 10.1038/290213a0.
- Fischhoff D. A., Vovis G. F., Zinder N. D. Organization of chimeras between filamentous bacteriophage f1 and plasmid pSC101. J Mol Biol. 1980 Dec 15;144(3):247–265. doi: 10.1016/0022-2836(80)90089-3.
- Fitch W. M. Estimating the total number of nucleotide substitutions since the common ancestor of a pair of homologous genes: comparison of several methods and three beta hemoglobin messenger RNA's. J Mol Evol. 1980 Dec;16(3-4):153–209. doi: 10.1007/BF01804976.
- Fox T. D. Five TGA "stop" codons occur within the translated sequence of the yeast mitochondrial gene for cytochrome c oxidase subunit II. Proc Natl Acad Sci U S A. 1979 Dec;76(12):6534–6538. doi: 10.1073/pnas.76.12.6534.
- Garoff H., Frischauf A. M., Simons K., Lehrach H., Delius H. Nucleotide sequence of cDNA coding for Semliki Forest virus membrane glycoproteins. Nature. 1980 Nov 20;288(5788):236–241. doi: 10.1038/288236a0.
- Gilbert W. Why genes in pieces? Nature. 1978 Feb 9;271(5645):501–501. doi: 10.1038/271501a0.
- Goeddel D. V., Leung D. W., Dull T. J., Gross M., Lawn R. M., McCandliss R., Seeburg P. H., Ullrich A., Yelverton E., Gray P. W. The structure of eight distinct cloned human leukocyte interferon cDNAs. Nature. 1981 Mar 5;290(5801):20–26. doi: 10.1038/290020a0.
- Grantham R., Gautier C., Gouy M. Codon frequencies in 119 individual genes confirm consistent choices of degenerate bases according to genome type. Nucleic Acids Res. 1980 May 10;8(9):1893–1912. doi: 10.1093/nar/8.9.1893.
- Hartley J. L., Donelson J. E. Nucleotide sequence of the yeast plasmid. Nature. 1980 Aug 28;286(5776):860–865. doi: 10.1038/286860a0.
- Hobart P. M., Shen L. P., Crawford R., Pictet R. L., Rutter W. J. Comparison of the nucleic acid sequence of anglerfish and mammalian insulin mRNA's from cloned cDNA's. Science. 1980 Dec 19;210(4476):1360–1363. doi: 10.1126/science.7001633.
- Lindahl T. DNA methylation and control of gene expression. Nature. 1981 Apr 2;290(5805):363–364. doi: 10.1038/290363b0.
- Lomonossoff G. P., Butler P. J., Klug A. Sequence-dependent variation in the conformation of DNA. J Mol Biol. 1981 Jul 15;149(4):745–760. doi: 10.1016/0022-2836(81)90356-9.
- MacDonald R. J., Crerar M. M., Swain W. F., Pictet R. L., Thomas G., Rutter W. J. Structure of a family of rat amylase genes. Nature. 1980 Sep 11;287(5778):117–122. doi: 10.1038/287117a0.
- Marx J. L. Gene control puzzle begins to yield. Science. 1981 May 8;212(4495):653–655. doi: 10.1126/science.7221551.
- McKay D. B., Steitz T. A. Structure of catabolite gene activator protein at 2.9 A resolution suggests binding to left-handed B-DNA. Nature. 1981 Apr 30;290(5809):744–749. doi: 10.1038/290744a0.
- Moreau J., Marcaud L., Maschat F., Kejzlarova-Lepesant J., Lepesant J. A., Scherrer K. A + T-rich linkers define functional domains in eukaryotic DNA. Nature. 1982 Jan 21;295(5846):260–262. doi: 10.1038/295260a0.
- Porter A. G., Barber C., Carey N. H., Hallewell R. A., Threlfall G., Emtage J. S. Complete nucleotide sequence of an influenza virus haemagglutinin gene from cloned DNA. Nature. 1979 Nov 29;282(5738):471–477. doi: 10.1038/282471a0.
- Proudfoot N. J., Shander M. H., Manley J. L., Gefter M. L., Maniatis T. Structure and in vitro transcription of human globin genes. Science. 1980 Sep 19;209(4463):1329–1336. doi: 10.1126/science.6158093.
- Razin A., Riggs A. D. DNA methylation and gene function. Science. 1980 Nov 7;210(4470):604–610. doi: 10.1126/science.6254144.
- Sanger F., Air G. M., Barrell B. G., Brown N. L., Coulson A. R., Fiddes C. A., Hutchison C. A., Slocombe P. M., Smith M. Nucleotide sequence of bacteriophage phi X174 DNA. Nature. 1977 Feb 24;265(5596):687–695. doi: 10.1038/265687a0.
- Shepherd J. C. Method to determine the reading frame of a protein from the purine/pyrimidine genome sequence and its possible evolutionary justification. Proc Natl Acad Sci U S A. 1981 Mar;78(3):1596–1600. doi: 10.1073/pnas.78.3.1596.
- Slightom J. L., Blechl A. E., Smithies O. Human fetal G gamma- and A gamma-globin genes: complete nucleotide sequences suggest that DNA can be exchanged between these duplicated genes. Cell. 1980 Oct;21(3):627–638. doi: 10.1016/0092-8674(80)90426-2.
- Smith T. F., Waterman M. S., Fitch W. M. Comparative biosequence metrics. J Mol Evol. 1981;18(1):38–46. doi: 10.1007/BF01733210.
- Waalwijk C., Flavell R. A. DNA methylation at a CCGG sequence in the large intron of the rabbit beta-globin gene: tissue-specific variations. Nucleic Acids Res. 1978 Dec;5(12):4631–4634. doi: 10.1093/nar/5.12.4631.
- Wagner M. J., Sharp J. A., Summers W. C. Nucleotide sequence of the thymidine kinase gene of herpes simplex virus type 1. Proc Natl Acad Sci U S A. 1981 Mar;78(3):1441–1445. doi: 10.1073/pnas.78.3.1441.
- Wang A. J., Quigley G. J., Kolpak F. J., van der Marel G., van Boom J. H., Rich A. Left-handed double helical DNA: variations in the backbone conformation. Science. 1981 Jan 9;211(4478):171–176. doi: 10.1126/science.7444458.
