Genetics. 2003 Sep;165(1):427–436. doi: 10.1093/genetics/165.1.427

New explicit expressions for relative frequencies of single-nucleotide polymorphisms with application to statistical inference on population growth.

A Polanski, M Kimmel
PMCID: PMC1462751  PMID: 14504247

Abstract

We present a new methodology for calculating sampling distributions of single-nucleotide polymorphism (SNP) frequencies in populations with time-varying size. Our approach is based on deriving analytical expressions for frequencies of SNPs. Analytical expressions allow for computations that are faster and more accurate than Monte Carlo simulations. In contrast to other articles showing analytical formulas for frequencies of SNPs, we derive expressions that contain coefficients that do not explode when the genealogy size increases. We also provide analytical formulas to describe the way in which the ascertainment procedure modifies SNP distributions. Using our methods, we study the power to test the hypothesis of exponential population expansion vs. the hypothesis of evolution with constant population size. We also analyze some of the available SNP data and compare our estimates of demographic parameters to those obtained in previous studies in population genetics. The analyzed data seem consistent with the hypothesis of past population growth of modern humans. The analysis of the data also shows a very strong sensitivity of the estimated demographic parameters to changes in the model of the ascertainment procedure.
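As a rough illustration of the quantities the abstract refers to, the sketch below computes relative SNP frequencies from expected coalescence times using the standard infinite-sites relation between branch lengths and the frequency spectrum. It is not the new expressions derived in the article, which are not reproduced on this page. The function names are hypothetical, the constant-size expected times E[T_k] = 2/(k(k-1)) are in coalescent units, and the ascertainment step assumes one simple model (a SNP is retained only if it is polymorphic in a random discovery subsample of size m), which may differ from the models analyzed in the article.

```python
from fractions import Fraction
from math import comb


def expected_sfs(n, expected_times):
    """Relative frequencies q_b (b = 1..n-1) of SNPs with b mutant copies
    in a sample of size n, given expected_times[k] = E[T_k] for k = 2..n.

    Uses the standard result that a branch present while there are k
    lineages has b descendants with probability C(n-b-1, k-2) / C(n-1, k-1).
    """
    weights = []
    for b in range(1, n):
        total = Fraction(0)
        for k in range(2, n - b + 2):
            p_kb = Fraction(comb(n - b - 1, k - 2), comb(n - 1, k - 1))
            total += k * p_kb * expected_times[k]
        weights.append(total)
    norm = sum(weights)
    return [w / norm for w in weights]


def constant_size_times(n):
    """E[T_k] under constant population size, in coalescent units (assumption)."""
    return {k: Fraction(2, k * (k - 1)) for k in range(2, n + 1)}


def ascertained_sfs(q, n, m):
    """Reweight q_b by the probability that a site with b mutant copies is
    polymorphic in a random subsample of size m (hypothetical discovery model)."""
    out = []
    for b, qb in enumerate(q, start=1):
        p_monomorphic = Fraction(comb(n - b, m) + comb(b, m), comb(n, m))
        out.append(qb * (1 - p_monomorphic))
    norm = sum(out)
    return [x / norm for x in out]


q = expected_sfs(5, constant_size_times(5))
q_asc = ascertained_sfs(q, 5, 2)
print([str(x) for x in q])      # proportional to 1/b under constant size
print([str(x) for x in q_asc])  # rare variants are under-represented
```

Replacing `constant_size_times` with expected times computed for a growing population would change the spectrum, which is the dependence the article exploits for inference. The exact-fraction arithmetic here is only practical for small n.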


Articles from Genetics are provided here courtesy of Oxford University Press
