Diabetologia. 2009 Jun 13;52(9):1846–1851. doi: 10.1007/s00125-009-1419-3

Is the thrifty genotype hypothesis supported by evidence based on confirmed type 2 diabetes- and obesity-susceptibility variants?

L Southam 1,2, N Soranzo 3,4, S B Montgomery 3, T M Frayling 5, M I McCarthy 1,6,7, I Barroso 3, E Zeggini 1,3
PMCID: PMC2723682  PMID: 19526209

Abstract

Aims/hypothesis

According to the thrifty genotype hypothesis, the high prevalence of type 2 diabetes and obesity is a consequence of genetic variants that have undergone positive selection during historical periods of erratic food supply. The recent expansion in the number of validated type 2 diabetes- and obesity-susceptibility loci, coupled with access to empirical data, enables us to look for evidence in support (or otherwise) of the thrifty genotype hypothesis using proven loci.

Methods

We employed a range of tests to obtain complementary views of the evidence for selection: we determined whether the risk allele at associated ‘index’ single-nucleotide polymorphisms is derived or ancestral, calculated the integrated haplotype score (iHS) and assessed the population differentiation statistic fixation index (FST) for 17 type 2 diabetes and 13 obesity loci.

Results

We found no evidence for significant differences for the derived/ancestral allele test. None of the studied loci showed strong evidence for selection based on the iHS score. We found a high FST for rs7901695 at TCF7L2, the locus with the largest type 2 diabetes effect size found to date.

Conclusions/interpretation

Our results provide some evidence for selection at specific loci, but there are no consistent patterns of selection that provide conclusive confirmation of the thrifty genotype hypothesis. Discovery of more signals and more causal variants for type 2 diabetes and obesity is likely to allow more detailed examination of these issues.

Keywords: Genetic association, Haplotype, Obesity, Positive selection, Thrifty genotype hypothesis, Type 2 diabetes

Introduction

Type 2 diabetes and obesity are complex traits, caused by multiple environmental and genetic factors. In recent decades, there has been a dramatic rise in the prevalence of type 2 diabetes and obesity in the Western and developing world. Adaptation to powerful selective forces for genotypes that provide survival advantage has been proposed as an explanation for this observed capacity of a genetic disease to become so prevalent when unmasked by changes in environment. In 1962, James Neel suggested that exposure to periods of famine during human evolutionary history resulted in selection pressures in favour of a thrifty genotype that led to highly efficient fat storage during periods of abundance [1]. In the current climate of food overabundance and sedentary lifestyle, this thrifty genotype is suggested to lead to metabolically disadvantageous phenotypes.

Signals of positive selection resulting in reduced haplotype diversity can be identified by investigating haplotype structure and allelic architecture. For example, if the thrifty genotype hypothesis were true, we would expect to observe some of the following characteristics at disease loci: risk alleles would be derived alleles; there would be substantial differences in allele frequency across different populations; and there would be evidence that relatively recently emerging alleles have been swept to high frequency. These tests offer the possibility of detecting selection signals, operating over different time scales (ranging from recent positive selection identified through extreme integrated haplotype scores [iHSs] to the much older time frame of derived/ancestral allele status), and we would therefore not expect to obtain consistent evidence across the different tests.

The fields of type 2 diabetes and obesity genetics had until recent years met with limited success in identifying replicating loci. The advent of large-scale, well-designed association studies, coupled with large-scale follow-up and stringent criteria for declaring reproducible association, has led to the identification of well-established type 2 diabetes and obesity loci. This enables us for the first time to carry out a systematic examination of these genomic loci for evidence of signatures of selection, and thereby seek to corroborate or refute the thrifty genotype hypothesis.

Methods

For the purposes of this study, we define a confirmed type 2 diabetes or obesity locus as one that has been robustly replicated, reaching a genome-wide significance threshold of p < 5 × 10⁻⁸. This criterion yields 17 loci for type 2 diabetes (in or near the TCF7L2, PPARG, KCNJ11, CDKAL1, SLC30A8, IGF2BP2, NOTCH2, THADA, JAZF1, CDC123/CAMK1D, TSPAN8/LGR5, HHEX/IDE, CDKN2A/B, ADAMTS9, TCF2, WFS1 and KCNQ1 genes) [2] and 13 for obesity (associations with BMI) (in or near the FTO, TMEM18, MC4R, GNPDA2, SH2B1, KCTD15, MTCH2, NEGR1, PCSK1, LGR4/LIN7C/BDNF [two independent single nucleotide polymorphisms {SNPs}], ETV5/SFRS10/DGKG and MAF genes) [3–8] (Tables 1 and 2). We have selected a representative (index) SNP for each of these 30 independently associated loci and have examined several characteristics of the genomic sequence that might indicate evidence for selection.

Table 1.

Type 2 diabetes-associated risk allele characteristics

| SNP | Chr | Position NCBI 36.1 (bp) | No-risk allele | Risk allele | Risk allele frequency^b | Nearest gene(s) | iHS score^c | FST global^e | FST CEU-YRI^f | FST CEU-JPT + CHB^g | FST JPT + CHB-YRI^h |
|---|---|---|---|---|---|---|---|---|---|---|---|
| rs864745 | 7 | 28,147,081 | C | T^a | 0.518 | JAZF1 | −1.562 (11.7) | 0.098 (47.3) | 0.119 (35.7) | 0.160 (19.7) | 0 (93.3) |
| rs12779790 | 10 | 12,368,016 | A^a | G | 0.229 | CDC123/CAMK1D | NA | 0.051 (67.4) | 0.113 (37.1) | 0.028 (58.7) | 0.026 (71.7) |
| rs7961581 | 12 | 69,949,369 | T^a | C | 0.233 | TSPAN8/LGR5 | −0.518 (61.1) | 0 (98.3) | 0 (85.1) | 0 (88.9) | 0 (96.4) |
| rs7578597 | 2 | 43,586,327 | C | T^a | 0.917 | THADA | −0.999 (32.2) | 0.214 (18.8) | 0.126 (33.9) | 0.096 (32.7) | 0.336 (11.7) |
| rs4607103 | 3 | 64,686,944 | T | C^a | 0.808 | ADAMTS9 | 0.541 (59.5) | 0.060 (62.8) | 0.006 (80.1) | 0.103 (31.2) | 0.044 (64.2) |
| rs10923931 | 1 | 120,319,482 | G^a | T | 0.117 | NOTCH2 | 2.249 (2.3) | 0.258 (13.1) | 0.182 (23.4) | 0.069 (40.7) | 0.391 (8.2) |
| rs10946398 | 6 | 20,769,013 | A | C^a | 0.308 | CDKAL1 | −0.161 (87.5) | 0.122 (39.3) | 0.234 (16.6) | 0.009 (72.1) | 0.142 (36.2) |
| rs5015480 | 10 | 94,455,539 | T | C^a | 0.552 | HHEX/IDE | 0.479 (63.8) | 0.181 (24.7) | 0 (98.4) | 0.236 (10.7) | 0.246 (20.1) |
| rs10811661 | 9 | 22,124,094 | C^a | T | 0.792 | CDKN2A/B | 0.328 (74.7) | 0.229 (16.7) | 0.199 (20.1) | 0.088 (34.9) | 0.373 (9.3) |
| rs4402960 | 3 | 186,994,381 | G^a | T | 0.292 | IGF2BP2 | 1.641 (9.9) | 0.098 (47.3) | 0.129 (33.4) | 0 (94.3) | 0.160 (32.8) |
| rs13266634 | 8 | 118,253,964 | T | C^a | 0.75 | SLC30A8 | −1.869 (5.9) | 0.190 (22.9) | 0.123 (34.8) | 0.084 (36.2) | 0.314 (13.3) |
| rs7901695 | 10 | 114,744,078 | T | C^a | 0.28 | TCF7L2 | −0.208 (83.8) | 0.361 (5.2) | 0.111 (37.5) | 0.323 (5.2) | 0.579 (2.1) |
| rs5215 | 11 | 17,365,206 | T^a | C | 0.408 | KCNJ11 | −0.435 (66.9) | 0.191 (22.7) | 0.384 (5.9) | 0.004 (76.4) | 0.278 (16.6) |
| rs1801282 | 3 | 12,368,125 | G | C^a | 0.925 | PPARG | −0.571 (57.4) | 0.025 (80.9) | 0.065 (51.3) | 0.005 (75.9) | 0.026 (71.3) |
| rs4430796 | 17 | 33,172,153 | A | G^a | 0.533 | TCF2 | 0.849 (40.2) | 0.098 (47.2) | 0.003 (82.7) | 0.096 (32.9) | 0.160 (32.7) |
| rs10010131 | 4 | 6,343,816 | A | G^a | 0.733 | WFS1 | 1.461 (14.3) | 0.151 (31.2) | 0 (97.5) | 0.241 (10.3) | 0.246 (20.1) |
| rs2237892^d | 11 | 2,796,327 | T | C^a | 0.611 | KCNQ1 | −0.618 (54.3) | 0.172 (26.5) | 0 (89.8) | 0.209 (13.4) | 0.171 (30.7) |

iHS scores and FST values are reported with their percentile rank in parentheses

^a Ancestral allele

^b Allele frequencies taken from HapMap data release 23a/phase II Mar08, on NCBI B36 assembly, dbSNPb126, CEU population

^c Haplotter (HapMap phase II data)

^d For KCNQ1 the JPT + CHB population iHS score is displayed and the risk allele frequency is from JPT HapMap

^e 95% quantile over 2,911,292 markers is 0.365

^f 95% quantile over 2,859,309 markers is 0.406

^g 95% quantile over 2,454,054 markers is 0.327

^h 95% quantile over 2,817,341 markers is 0.465

NA, iHS score unavailable through Haplotter

Table 2.

Obesity-associated risk allele characteristics

| SNP | Chr | Position NCBI 36.1 (bp) | No-risk allele | Risk allele | Risk allele frequency^b | Nearest gene(s) | iHS score^c | FST global^d | FST CEU-YRI^e | FST CEU-JPT + CHB^f | FST JPT + CHB-YRI^g |
|---|---|---|---|---|---|---|---|---|---|---|---|
| rs9939609 | 16 | 52,378,028 | T | A^a | 0.45 | FTO | 1.991 (4.4) | 0.184 (24.1) | 0.005 (81.7) | 0.208 (13.5) | 0.290 (15.4) |
| rs6548238 | 2 | 624,905 | T | C^a | 0.861 | TMEM18 | 0.162 (87.3) | 0 (96.9) | 0.001 (84.3) | 0.003 (79.6) | 0 (97.2) |
| rs17782313 | 18 | 56,002,077 | T^a | C | 0.283 | MC4R | −1.166 (24.6) | 0.029 (79.3) | 0 (87.7) | 0.022 (62.6) | 0.057 (59.2) |
| rs10938397 | 4 | 44,877,284 | A^a | G | 0.446 | GNPDA2 | −0.077 (94.0) | 0.048 (69.0) | 0.111 (37.6) | 0.032 (56.6) | 0.019 (75.2) |
| rs7498665 | 16 | 28,790,742 | A | G^a | 0.358 | SH2B1 | 0.908 (36.9) | 0.073 (57.4) | 0.081 (46.0) | 0.120 (27.1) | 0 (92.8) |
| rs11084753 | 19 | 39,013,977 | A | G^a | 0.625 | KCTD15 | 0.431 (67.2) | 0.163 (28.6) | 0.021 (70.7) | 0.138 (23.4) | 0.259 (18.6) |
| rs10838738 | 11 | 47,619,625 | A^a | G | 0.408 | MTCH2 | −1.814 (6.8) | 0.166 (27.9) | 0.315 (9.6) | 0 (91.4) | 0.256 (18.9) |
| rs2815752 | 1 | 72,585,028 | G^a | A | 0.65 | NEGR1 | −0.638 (53.0) | 0.185 (23.9) | 0.024 (69.5) | 0.179 (17.0) | 0.317 (13.1) |
| rs6235 | 5 | 95,754,654 | G^a | C | 0.267 | PCSK1 | −0.294 (77.3) | 0.046 (70.2) | 0.089 (43.5) | 0 (98.5) | 0.081 (51.2) |
| rs7647305 | 3 | 187,316,984 | T^a | C | 0.817 | ETV5/SFRS10/DGKG | −0.554 (58.6) | 0.183 (24.2) | 0.072 (48.9) | 0.116 (27.9) | 0.324 (12.6) |
| rs4923461 | 11 | 27,613,486 | G | A^a | 0.8 | LGR4/LIN7C/BDNF | −0.965 (33.9) | 0.123 (39.0) | 0 (90.4) | 0.126 (25.9) | 0.169 (31.2) |
| rs925946 | 11 | 27,623,778 | G^a | T | 0.358 | LGR4/LIN7C/BDNF | 0.542 (59.5) | 0.153 (30.8) | 0.006 (80.9) | 0.266 (8.4) | 0.179 (29.5) |
| rs1424233 | 16 | 78,240,252 | G | A^a | 0.508 | MAF | −0.476 (64.2) | 0.052 (66.6) | 0.028 (66.8) | 0.102 (31.2) | 0.014 (78.6) |

Risk allele is the BMI-increasing allele, no-risk allele is the BMI-decreasing allele. iHS scores and FST values are reported with their percentile rank in parentheses

^a Ancestral allele

^b Allele frequencies taken from HapMap data release 23a/phase II Mar08, on NCBI B36 assembly, dbSNPb126, CEU population

^c Haplotter (HapMap phase II data)

^d 95% quantile over 2,911,292 markers is 0.365

^e 95% quantile over 2,859,309 markers is 0.406

^f 95% quantile over 2,454,054 markers is 0.327

^g 95% quantile over 2,817,341 markers is 0.465

First, we determined whether the risk allele at the index SNPs is the ancestral or derived allele, using information available through dbSNP build 128 (www.ncbi.nlm.nih.gov/SNP/, accessed February 2009), based on chimpanzee/human sequence alignment.

We also calculated population differentiation statistics (fixation index FST) for the 30 loci in the three HapMap phase II populations: Centre d’Etude du Polymorphisme Humain (CEPH) (Utah residents with northern and western European ancestry) (CEU); Yoruba in Ibadan, Nigeria (YRI); and Japanese in Tokyo (JPT) + Han Chinese in Beijing, China (CHB) [9]. FST measures the proportion of total genetic variance that is caused by differences between two or more population samples. Local selection acting on a given locus can result in elevated FST values between two populations. We can identify loci that have unusually high FST values by comparing against the rest of the genome, which provides an empirical null distribution. The use of an empirical FST distribution in this case is advantageous, because it does not require assumptions about the structure of human populations, SNP ascertainment bias (which differs among the three HapMap population samples) and differences in local linkage disequilibrium patterns among different populations. We constructed an empirical FST distribution using over 2.9 million SNPs, namely the subset of all HapMap Phase II SNPs with genotype data available in all three reference samples (HapMap Release 22, April 2007). We compared the observed FST values for the obesity and type 2 diabetes loci with the upper 95% tail of the distribution to obtain a one-tailed test for diversifying selection.
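The comparison against an empirical distribution can be illustrated with a minimal Python sketch. The text does not state which FST estimator was used, so the function below assumes the simple Wright formulation for a single biallelic SNP in two equally weighted populations, without any sample-size correction; the function names and the toy genome-wide values are hypothetical and serve only to show the idea.

```python
from bisect import bisect_left


def wright_fst(p1: float, p2: float) -> float:
    """Wright's FST for one biallelic SNP from allele frequencies in two
    equally weighted populations (assumed estimator, no bias correction)."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # SNP is monomorphic in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_total - h_within) / h_total


def upper_tail_percentile(observed: float, genome_wide: list[float]) -> float:
    """Percentage of genome-wide FST values that are >= the observed value.
    Small percentages indicate unusually high differentiation."""
    ranked = sorted(genome_wide)
    n_at_least = len(ranked) - bisect_left(ranked, observed)
    return 100.0 * n_at_least / len(ranked)


# Toy stand-in for the genome-wide distribution (hypothetical values).
toy_distribution = [0.0, 0.1, 0.2, 0.5, 0.9]
toy_rank = upper_tail_percentile(0.5, toy_distribution)
```

Under this convention a locus falls in the upper 5% tail when its percentile rank is 5 or lower, which matches the way percentile ranks are reported in parentheses in Tables 1 and 2.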

We additionally investigated evidence for natural selection by examining the iHS, a measure of recent positive selection for variants that have not yet reached fixation [10, 11]. This statistic identifies SNPs for which alleles have rapidly changed in frequency by comparing the haplotype background of the ancestral and derived alleles. Negative iHS values indicate that the derived allele resides on a longer haplotype, whereas positive iHS values suggest that the ancestral allele resides on a longer haplotype. For the purposes of this study, we define iHS <−1.5 and iHS >1.5 as suggestive evidence for natural selection, and iHS scores <−2 or >2 as evidence for a powerful selection signal [10]. We determined the iHS score for each locus in HapMap phase II data using Haplotter (http://hg-wen.uchicago.edu/selection/haplotter.htm, accessed February 2009) [10, 11].
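The iHS thresholds defined above can be written as a small Python helper. This is only an illustrative sketch: the function names and category labels are hypothetical, and a missing score (reported as NA when Haplotter returns no value) is represented here by `None`.

```python
from typing import Optional


def classify_ihs(ihs: Optional[float]) -> str:
    """Apply the study's thresholds: |iHS| > 2 is treated as a powerful
    signal, |iHS| > 1.5 as suggestive evidence, anything else as no signal."""
    if ihs is None:
        return "unavailable"
    magnitude = abs(ihs)
    if magnitude > 2.0:
        return "strong"
    if magnitude > 1.5:
        return "suggestive"
    return "none"


def longer_haplotype_allele(ihs: float) -> str:
    """Negative iHS: the derived allele sits on the longer haplotype;
    positive iHS: the ancestral allele does."""
    if ihs < 0:
        return "derived"
    if ihs > 0:
        return "ancestral"
    return "neither"
```

For example, a score of 2.249 would be classified as strong with the ancestral allele on the longer haplotype, whereas a score of −0.208 would fall below both thresholds.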

Results

Evidence that type 2 diabetes- or obesity-associated risk alleles were more often derived than ancestral would be consistent with positive selection. In type 2 diabetes, we found the risk allele to be the derived allele at six of the 17 loci (CDC123/CAMK1D, TSPAN8/LGR5, NOTCH2, CDKN2A/B, IGF2BP2 and KCNJ11) (binomial test one-sided p = 0.93) (Table 1). Similarly, we did not observe a significant overrepresentation of derived status for the obesity-risk alleles (seven [MC4R, GNPDA2, MTCH2, NEGR1, PCSK1, LGR4/LIN7C/BDNF and ETV5/SFRS10/DGKG], p = 0.50) (Table 2). Among the type 2 diabetes loci, ten risk alleles are major and seven minor (binomial test two-sided p = 0.63) (Table 1). Among the obesity-risk alleles, six are major and seven are minor (p = 1.00) (Table 2).
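The binomial p values quoted above can be checked with a short Python sketch using only the standard library. It assumes a null probability of 0.5 for each outcome and, for the two-sided tests, doubles the smaller tail and caps the result at 1; the text does not state how the two-sided p values were obtained, so that convention is an assumption, and the function names are hypothetical.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def binom_two_sided(k: int, n: int) -> float:
    """Two-sided p value under p = 0.5: twice the smaller tail, capped at 1."""
    upper = binom_upper_tail(k, n)          # P(X >= k)
    lower = 1 - binom_upper_tail(k + 1, n)  # P(X <= k)
    return min(1.0, 2 * min(upper, lower))


# Derived risk alleles: 6 of 17 (type 2 diabetes) and 7 of 13 (obesity).
p_derived_t2d = binom_upper_tail(6, 17)
p_derived_obesity = binom_upper_tail(7, 13)

# Major risk alleles: 10 of 17 (type 2 diabetes) and 6 of 13 (obesity).
p_major_t2d = binom_two_sided(10, 17)
p_major_obesity = binom_two_sided(6, 13)
```

Rounded to two decimal places, these four values agree with the p values reported in the text (0.93, 0.50, 0.63 and 1.00).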

Only one locus (rs7901695 at TCF7L2) showed an elevated FST value: 0.579 (2.1 percentile) between the JPT + CHB and YRI samples (previously also noted [12]) and 0.323 (5.2 percentile) between CEU and JPT + CHB (Table 1). SNP rs5215 at KCNJ11 demonstrated an elevated FST value of 0.384 between CEU and YRI (5.9 percentile) (Table 1).
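As a minimal illustration of the one-tailed comparison, the following Python sketch checks the FST values of the two loci mentioned above against the 95% quantiles given in the footnotes to Table 1. The dictionary layout and the function name are hypothetical; the numbers are copied from Table 1.

```python
# 95% quantiles of the empirical FST distributions (Table 1 footnotes).
THRESHOLDS = {
    "global": 0.365,
    "CEU-YRI": 0.406,
    "CEU-JPT+CHB": 0.327,
    "JPT+CHB-YRI": 0.465,
}

# Observed FST values for the two loci discussed in the text (Table 1).
OBSERVED = {
    "TCF7L2 rs7901695": {
        "global": 0.361, "CEU-YRI": 0.111,
        "CEU-JPT+CHB": 0.323, "JPT+CHB-YRI": 0.579,
    },
    "KCNJ11 rs5215": {
        "global": 0.191, "CEU-YRI": 0.384,
        "CEU-JPT+CHB": 0.004, "JPT+CHB-YRI": 0.278,
    },
}


def loci_above_threshold(observed, thresholds):
    """Return sorted (locus, comparison) pairs whose FST exceeds the
    95% quantile of the matching empirical distribution."""
    return sorted(
        (locus, comparison)
        for locus, values in observed.items()
        for comparison, fst in values.items()
        if fst > thresholds[comparison]
    )


exceeding = loci_above_threshold(OBSERVED, THRESHOLDS)
```

With these inputs, only the JPT + CHB versus YRI comparison at TCF7L2 exceeds its threshold; the CEU versus JPT + CHB value at TCF7L2 (0.323) and the CEU versus YRI value at KCNJ11 (0.384) fall just below theirs.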

Among the type 2 diabetes-associated loci, the NOTCH2 rs10923931 index SNP demonstrated an elevated iHS value (2.249, 2.3 percentile) for the protective, ancestral allele (Table 1). Among the BMI-associated SNPs, the strongest signal of positive selection was obtained for the FTO locus, with an iHS value of 1.991 (4.4 percentile) (Table 2). No general enrichment for high FST or long haplotypes was observed for the set of diabetes- or obesity-associated SNPs (using Mann–Whitney significance testing).

Discussion

We have not observed significant evidence for overrepresentation of ancestral/derived status or for minor/major frequency at type 2 diabetes- or obesity-risk alleles. Only one locus (TCF7L2, a type 2 diabetes locus) demonstrates large allele frequency differences across populations. Although this is consistent with chance, we note that TCF7L2 represents the strongest effect size to be identified in type 2 diabetes to date and, as such, may have been more susceptible to selection forces. Notably, we did not find strong evidence for high differentiation of rs2237892 at KCNQ1 between the European and East Asian samples (FST = 0.209, 13.3 percentile of the empirical distribution). The risk allele C at this locus has frequencies close to 90% in the CEU and YRI HapMap samples and close to 60% in the two East Asian samples.

Our analyses indicate the presence of extended haplotypes at the FTO locus, the largest effect size for obesity found to date. However, we have not identified any consistent footprint of selection across the loci that would support the notion of a universal mechanism to explain the high prevalence of type 2 diabetes and obesity. The number of robustly replicating type 2 diabetes and obesity loci identified is poised to grow, offering the promise of an extended established disease locus list. In addition, expansion of association studies to populations of non-European descent is likely to broaden the spectrum of robustly associated allelic variation and may help identify loci with prominent evidence for population differentiation, for example where risk alleles at a SNP have rapidly changed in frequency since population separation. Importantly, the truly causal, functional variants for the majority, if not all, of established type 2 diabetes- and obesity-susceptibility loci have not been determined yet. We have therefore been restricted to studying index SNPs, representative of the replicating associations, which could have an effect on the variant-specific analyses we have carried out, as these may provide only indirect glimpses of the history of the causal mutations.

This study has comprehensively considered all known, well-established type 2 diabetes- and BMI-susceptibility variants. Some loci appear to have more ‘thrifty gene’ characteristics than others, but no clear, globally consistent picture has emerged. Further insights into the genetic aetiology of these complex traits are likely to help us distinguish between apparent and real signals for positive selection.

Acknowledgements

This work was funded by the Wellcome Trust (WT088885/Z/09/Z and WT077016/Z/05/Z), MRC (G0601261), EU FP6 grant LSHM-CT-2006-037197 (INTERACT) and the Oxford NIHR Biomedical Research Centre. E. Zeggini is a Wellcome Trust Research Career Development Fellow. L. Southam is supported by EC Framework 7 Programme Grant 200800 (TREAT-OA).

Duality of interest

The authors declare that there is no duality of interest associated with this manuscript.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Abbreviations

CEU: Centre d’Etude du Polymorphisme Humain (CEPH) (Utah residents with northern and western European ancestry)

CHB: Han Chinese in Beijing, China

FST: Population differentiation statistics (fixation index)

iHS: Integrated haplotype score

JPT: Japanese in Tokyo

SNP: Single nucleotide polymorphism

YRI: Yoruba in Ibadan, Nigeria

Footnotes

L. Southam and N. Soranzo contributed equally to this study.

References

1. Neel JV (1962) Diabetes mellitus: a ‘thrifty’ genotype rendered detrimental by ‘progress’? Am J Hum Genet 14:353–362
2. McCarthy MI, Zeggini E (2009) Genome-wide association studies in type 2 diabetes. Curr Diab Rep 9:164–171
3. Frayling TM, Timpson NJ, Weedon MN et al (2007) A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316:889–894
4. Loos RJ, Lindgren CM, Li S, Wheeler E et al (2008) Common variants near MC4R are associated with fat mass, weight and risk of obesity. Nat Genet 40:768–775
5. Willer CJ, Speliotes EK, Loos RJ et al (2009) Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nat Genet 41:25–34
6. Thorleifsson G, Walters GB, Gudbjartsson DF et al (2009) Genome-wide association yields new sequence variants at seven loci that associate with measures of obesity. Nat Genet 41:18–24
7. Benzinou M, Creemers JW, Choquet H et al (2008) Common nonsynonymous variants in PCSK1 confer risk of obesity. Nat Genet 40:943–945
8. Meyre D, Delplanque J, Chèvre JC et al (2009) Genome-wide association study for early-onset and morbid adult obesity identifies three new risk loci in European populations. Nat Genet 41:157–159
9. International HapMap Consortium (2007) A second generation human haplotype map of over 3.1 million SNPs. Nature 449:851–861
10. Voight BF, Kudaravalli S, Wen X, Pritchard JK (2006) A map of recent positive selection in the human genome. PLoS Biol 4:e72
11. Kudaravalli S, Veyrieras JB, Stranger BE, Dermitzakis ET, Pritchard JK (2009) Gene expression levels are a target of recent natural selection in the human genome. Mol Biol Evol 26:649–658
12. Myles S, Davison D, Barrett J, Stoneking M, Timpson N (2008) Worldwide population differentiation at disease-associated SNPs. BMC Med Genomics 1:22

Articles from Diabetologia are provided here courtesy of Springer
