Abstract
Estimation of gene number in mammals is difficult due to the high proportion of noncoding DNA within the nucleus. In this study, we provide a direct measurement of the number of genes in human and mouse. We have taken advantage of the fact that many mammalian genes are associated with CpG islands whose distinctive properties allow their physical separation from bulk DNA. Our results suggest that there are approximately 45,000 CpG islands per haploid genome in humans and 37,000 in the mouse. Sequence comparison confirms that about 20% of the human CpG islands are absent from the homologous mouse genes. Analysis of a selection of genes suggests that both human and mouse are losing CpG islands over evolutionary time due to de novo methylation in the germ line followed by CpG loss through mutation. This process appears to be more rapid in rodents. Combining the number of CpG islands with the proportion of island-associated genes, we estimate that the total number of genes per haploid genome is approximately 80,000 in both organisms.
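The final estimate rests on simple arithmetic: the island count divided by the fraction of genes that carry an island. The abstract does not state that fraction, so the minimal sketch below back-calculates it from the reported figures. It assumes one CpG island per island-associated gene, and the function names are hypothetical.

```python
# Illustrative arithmetic only. The island counts and the ~80,000 gene total
# come from the abstract; the island-associated fractions are NOT stated there
# and are back-calculated here under the assumption of one island per gene.

def estimated_gene_count(islands: float, island_fraction: float) -> float:
    """Total genes = CpG islands / fraction of genes associated with an island."""
    if not 0 < island_fraction <= 1:
        raise ValueError("island_fraction must be in (0, 1]")
    return islands / island_fraction


def implied_island_fraction(islands: float, total_genes: float) -> float:
    """Fraction of genes that must carry an island for the two totals to agree."""
    return islands / total_genes


ISLANDS = {"human": 45_000, "mouse": 37_000}  # per haploid genome (abstract)
TOTAL_GENES = 80_000                          # approximate total (abstract)

for species, islands in ISLANDS.items():
    fraction = implied_island_fraction(islands, TOTAL_GENES)
    # Round trip: dividing the island count by the implied fraction
    # recovers the reported total.
    total = estimated_gene_count(islands, fraction)
    print(f"{species}: implied fraction {fraction:.4f}, total genes {total:,.0f}")
```

Under these assumptions the implied fractions are roughly 0.56 for human and 0.46 for mouse, so a similar gene total in the two species is compatible with the mouse having fewer islands.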