The Hardy-Weinberg law1–4 is a mathematical statement of the relationship between gene frequencies and genotype frequencies: at a locus in a randomly interbreeding diploid population, gene frequencies and genotype frequencies remain constant from generation to generation, provided that mating is random and that mutation, selection, and migration do not occur. This fundamental principle of population genetics is approximately true for small populations and holds with increasing exactness for large ones. Should the frequencies be perturbed for any reason, they will reach the expected equilibrium frequencies after one generation of random mating.
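The one-generation property can be illustrated with a short calculation. The following is a minimal Python sketch and not part of the original survey; the function names `allele_freq` and `random_mating`, the allele labels W (wild) and M (mutant), and the starting frequencies are illustrative assumptions.

```python
def allele_freq(ww, wm, mm):
    """Frequency p of the W allele, from genotype counts or frequencies."""
    total = ww + wm + mm
    return (2 * ww + wm) / (2 * total)


def random_mating(p):
    """Genotype frequencies (WW, WM, MM) after one round of random mating."""
    q = 1 - p
    return (p * p, 2 * p * q, q * q)


# Hypothetical perturbed population: no heterozygotes at all.
start = (0.5, 0.0, 0.5)

# One generation of random mating gives the proportions p^2, 2pq, q^2 ...
gen1 = random_mating(allele_freq(*start))

# ... and a further generation leaves them unchanged.
gen2 = random_mating(allele_freq(*gen1))

print(gen1)
print(gen2)
```

Under the stated assumptions (random mating, no mutation, selection, or migration), the allele frequency itself does not change, which is why the genotype frequencies settle after a single generation.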
The Hardy-Weinberg law can be used for analytical purposes. It is suitable for testing the hypotheses of panmixia and evolutionary stasis, and it serves as a null hypothesis in genetic studies. In our experience, however, articles that report genotype distributions of biallelic polymorphisms with Mendelian inheritance do not always report Hardy-Weinberg equilibrium (HWE) calculations for the populations studied. In this retrospective survey, we examined whether this important law was checked in studies of genetic polymorphisms published in Gut between 1998 and April 2003.
We collected genotype distributions published in papers in Gut from 1998 (volume 42) to 2003 (volume 52). Of 2389 publications in total, we found 69 in which genetic polymorphisms were part of the study. From these, we selected the articles that fulfilled the following criteria: investigation of a biallelic genetic polymorphism with Mendelian inheritance; use of a healthy reference population in the study; and availability of genotype distribution data.
We recalculated HWE in each paper and in each study group. For this purpose, we used Arlequin software (http://anthropologie.unige.ch/arlequin/).5,6 The level of statistical significance was set at p<0.05. Deviations from HWE were further confirmed by manual recalculation.
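As an illustration of what such a recalculation involves, the sketch below tests a single biallelic genotype distribution in two ways: a chi-square goodness-of-fit test with one degree of freedom, and an exact test that enumerates every possible heterozygote count for the observed allele counts. This is not Arlequin's implementation, and it is only an assumption that a manual recalculation would resemble the chi-square version. The function names are hypothetical, and p values from these two methods can differ from each other and from those of other software.

```python
import math


def hwe_chi_square(n_ww, n_wm, n_mm):
    """Chi-square goodness-of-fit test (1 df) against HW proportions."""
    n = n_ww + n_wm + n_mm
    p = (2 * n_ww + n_wm) / (2 * n)  # frequency of the W allele
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("locus is monomorphic; HWE cannot be tested")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ww, n_wm, n_mm)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def hwe_exact(n_ww, n_wm, n_mm):
    """Exact test: sum the probabilities of all heterozygote counts that
    are no more likely than the observed one, given the allele counts."""
    n = n_ww + n_wm + n_mm
    n_w = 2 * n_ww + n_wm  # copies of the W allele
    n_m = 2 * n - n_w      # copies of the M allele

    def log_prob(het):
        ww = (n_w - het) // 2
        mm = (n_m - het) // 2
        return (math.lgamma(n + 1) - math.lgamma(ww + 1)
                - math.lgamma(het + 1) - math.lgamma(mm + 1)
                + het * math.log(2)
                + math.lgamma(n_w + 1) + math.lgamma(n_m + 1)
                - math.lgamma(2 * n + 1))

    # The heterozygote count always has the same parity as n_w.
    probs = {het: math.exp(log_prob(het))
             for het in range(n_w % 2, min(n_w, n_m) + 1, 2)}
    p_obs = probs[n_wm]
    # Small relative tolerance guards against floating-point ties.
    total = sum(pr for pr in probs.values() if pr <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Example call with counts in WW/WM/MM order.
print(hwe_chi_square(62, 47, 21))
print(hwe_exact(62, 47, 21))
```

The exact test avoids the large-sample approximation behind the chi-square statistic, which matters most when one genotype class is rare.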
Twenty publications in Gut fulfilled the enrolment criteria; these publications presented data on 166 genotype distributions. However, only four papers (20%) reported that HWE calculations had been performed and that the genotype distributions fulfilled the HW criteria. Genotype distributions in healthy reference populations did not fulfil HWE in two publications (10%).7,8 In two reports, HWE was fulfilled in controls but not in the diseased population, and this fact was not reported.9,10 In one publication, HWE was fulfilled neither in the healthy reference population nor in the diseased population.11 In summary, among the 166 genotype distributions in the 20 papers studied, we found 11 distributions in five publications for which calculation of the HW law (in the control or in the investigated populations) would have provided additional information. We present the results of the HW calculations for these 11 polymorphisms in table 1.
Table 1.
| Publication [reference] | Gene polymorphism | Subjects | Genotype distribution WW/WM/MM | HWE p value |
|---|---|---|---|---|
| 1998;43:187–9 [7] | CTLA-4 exon 1 position 49 | Controls | 62/47/21 | 0.034 |
| | | HLA DR3 negative controls | 54/36/16 | 0.027 |
| 2001;48:836–42 [9] | HFE H63D polymorphism haplotypes | Iron overload patients | 178/85/60 | 0.001 |
| 2002;50:520–4 [11] | Methylene tetrahydrofolate reductase | Controls | 533/560/114 | 0.053 |
| | | Control, 60 y< | 204/229/34 | 0.005 |
| | | Female colorectal cancer patients | 134/101/35 | 0.031 |
| | | Colorectal cancer stage: Dukes’ B | 94/64/28 | 0.003 |
| | | Colorectal cancer and p53 mutation− | 148/116/39 | 0.036 |
| 2003;52:547–51 [8] | CYP1A1 exon 7 | Controls | 122/22/5 | 0.012 |
| | EPHX exon 3 | Controls | 59/58/32 | 0.018 |
| 2003;52:558–62 [10] | NOD2 gene 1007fs variant | Crohn’s disease patients | 248/20/3 | 0.017 |
WW, number of subjects homozygous for the wild allele; WM, number of heterozygous subjects; MM, number of subjects homozygous for the mutant allele.
There are several explanations why the observed genotype frequencies may deviate significantly from those expected under the HW law, and why genotypes may have different likelihoods of being included, even when the determination of genotype was methodologically correct. One or more of the assumptions of the model might not hold: non-random mating (inbreeding or an effect of the allele on mating) or gene flow may have occurred, or selection may have operated. There could also be a sampling error: the studied population may not have been well defined, or the sample size may have been too small.12 (Interestingly, the occurrence of deviation from HWE was independent of the number of patients enrolled; the size of the study populations in the affected papers ranged between 24 and 1207.)
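The direction of a deviation can help when weighing these explanations, since inbreeding produces a deficit of heterozygotes relative to HW expectation. One common summary, added here purely as an illustration and not used in the original survey, is the fixation index F = 1 − H_obs/H_exp, where H_obs is the observed heterozygote frequency and H_exp = 2pq. The function name below is a hypothetical choice.

```python
def fixation_index(n_ww, n_wm, n_mm):
    """F = 1 - H_obs / H_exp for one biallelic locus.

    F > 0 indicates a heterozygote deficit, F < 0 a heterozygote excess,
    and F = 0 exact agreement with Hardy-Weinberg proportions.
    """
    n = n_ww + n_wm + n_mm
    p = (2 * n_ww + n_wm) / (2 * n)  # frequency of the W allele
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("locus is monomorphic; F is undefined")
    h_obs = n_wm / n      # observed heterozygote frequency
    h_exp = 2 * p * q     # expected heterozygote frequency under HWE
    return 1 - h_obs / h_exp


# Example call with counts in WW/WM/MM order.
print(fixation_index(62, 47, 21))
```

A value of F on its own does not identify the cause; genotyping error, selection, and sampling from a poorly defined population can all shift it, so it is best read alongside the significance test.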
If the genotype distribution in the control population deviates from HWE, the results should be treated cautiously because the observed genotype distribution does not represent the genotype distribution in healthy (non-diseased) people and, therefore, conclusions cannot be drawn about the significance of the investigated polymorphism.
If the genotype distribution in the investigated (diseased) population does not fulfil the HW law (while the healthy reference population fulfils it), this might be supporting evidence for a correlation between genotype and disease. Unreported or weak associations can be detected by calculating HWE, even when statistically significant differences between genotype distributions are not present.
In conclusion, we suggest that genotype distribution data, together with detailed results of HWE calculations, should be required when results of population genetic studies are published.
References
- 1. Hardy GH. Mendelian proportions in a mixed population. Science 1908;28:49–50 (section: Discussion and correspondence; reprinted in Peters JH. Classic papers in genetics. Englewood Cliffs, NJ: Prentice-Hall Inc, 1959).
- 2. Weinberg W. Über den Nachweis der Vererbung beim Menschen. Jahresh. Verein f. vaterl. Naturk Württemberg 1908;64:368–82.
- 3. Roughgarden J. Theory of population genetics and evolutionary ecology. NJ: Prentice Hall Inc, 1996.
- 4. Stern C. The Hardy-Weinberg law. Science 1943;97:137–8.
- 5. Schneider S, Roessli D, Excoffier L. Arlequin ver. 2.000. Software for population genetics data analysis. Switzerland: Genetics and Biometry Laboratory, University of Geneva, 2000.
- 6. Guo SW, Thompson EA. Performing the exact test of Hardy-Weinberg proportion for multiple alleles. Biometrics 1992;48:361–72.
- 7. Djilali-Saiah I, Schmitz J, Harfouch-Hammoud E, et al. CTLA-4 gene polymorphism is associated with predisposition to coeliac disease. Gut 1998;43:187–9.
- 8. de Jong DJ, van der Logt EM, van Schaik A, et al. Genetic polymorphisms in biotransformation enzymes in Crohn’s disease: association with microsomal epoxide hydrolase. Gut 2003;52:547–51.
- 9. Aguilar-Martinez P, Bismuth M, Picot MC, et al. Variable phenotypic presentation of iron overload in H63D homozygotes: are genetic modifiers the cause? Gut 2001;48:836–42.
- 10. Helio T, Halme L, Lappalainen M, et al. CARD15/NOD2 gene variants are associated with familially occurring and complicated forms of Crohn’s disease. Gut 2003;52:558–62.
- 11. Shannon B, Gnanasampanthan S, Beilby J, et al. A polymorphism in the methylenetetrahydrofolate reductase gene predisposes to colorectal cancers with microsatellite instability. Gut 2002;50:520–4.
- 12. Hedrick PW. Genetics of populations. New York: Van Nostrand Reinhold Co, 1983.