letter
F1000Res. 2016 Feb 9;4:666. Originally published 2015 Sep 4. [Version 2] doi: 10.12688/f1000research.7023.2

High Frequency Haplotypes are Expected Events, not Historical Figures

Elsa G Guillot 1,2, Murray P Cox 1,a
PMCID: PMC4722684  PMID: 26834987

Version Changes

Revised. Amendments from Version 1

This version of the manuscript has changes in three main areas. First, additional references have been added showing the role of cultural transmission of reproductive success in other settings, including from non-genetic data. Second, the concluding paragraph has been rephrased to clarify the main points. Third, intermediate files from the analysis pipeline have been added to the online resources.

Abstract

The hypothesis of cultural transmission of reproductive success holds that successful men have more children and pass this increased fecundity on to their offspring. Balaresque and colleagues found high frequency haplotypes in a Central Asian Y chromosome dataset, which they attribute to cultural transmission of reproductive success by prominent historical men, including Genghis Khan. Using coalescent simulation, we show that these high frequency haplotypes are consistent with a neutral model, where they commonly appear simply by chance. Hence, explanations invoking cultural transmission of reproductive success are statistically unnecessary.

Keywords: Cultural Transmission of Reproductive Success, Neutrality, Haplotype Frequencies


The hypothesis of cultural transmission of reproductive success holds that successful men have more children and pass this increased fecundity on to their offspring. The process has been observed in modern human populations through genealogies and surname studies [1], and in a genetic setting it should cause particular male lines to dominate on the Y chromosome. Identified from historical records in Quebec [2], it has previously been measured using pedigree data and has also been detected in the genetic record [3]. Balaresque and colleagues [4] examined a Y chromosome dataset from Central Asia to determine whether they could reconstruct historic instances of this behavior. Screening 8 microsatellites on the Y chromosome in 5,321 Central Asian men (distribution in Figure 1), they identified 15 haplotypes that are each carried by more than 20 men (grey bars). The authors described these haplotypes as ‘unusually frequent,’ but did not provide any statistical support for this statement. The authors subsequently connected these lineages to prominent historical figures, including Genghis Khan and Giocangga.

Figure 1. Microsatellite haplotype frequency distribution.

The distribution (black and grey bars) is identical to Figure 2 of Balaresque et al. [4]. Grey bars indicate the 15 haplotypes that Balaresque and colleagues describe as ‘unusually frequent.’ Red shading indicates the 95% confidence intervals of haplotype frequencies from one million simulations under a fitted neutral model. All of the high frequency haplotypes (grey bars) fall within these 95% confidence bounds.

However, in any given haplotype frequency distribution, a number of haplotypes are expected to occur at high frequency simply by chance. In neutrally evolving systems, haplotype frequency distributions follow a Zipfian power law [5]: most lineages are carried by only a few men (Figure 1, left side), while a small number of lineages are carried by many men (Figure 1, right side). The Y chromosome distribution observed by Balaresque and colleagues closely follows such a power law, thus providing strong preliminary evidence that their Y chromosome dataset is consistent with selective neutrality.
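As a rough illustration of how such a distribution can be inspected, the sketch below tabulates a haplotype frequency spectrum and estimates the slope of its log-log relationship by ordinary least squares. The function names are hypothetical and are not taken from the authors' pipeline, and a log-log regression is only a crude diagnostic of power law behavior, not a formal test.

```python
from collections import Counter
import math


def frequency_spectrum(haplotypes):
    """Map 'number of carriers' to 'number of distinct haplotypes with that many carriers'."""
    carriers_per_haplotype = Counter(haplotypes)
    return Counter(carriers_per_haplotype.values())


def loglog_slope(spectrum):
    """Least-squares slope of log(haplotype count) against log(carrier count).

    A roughly straight line with negative slope on log-log axes is what a
    power law would produce; this is a heuristic check only.
    """
    if len(spectrum) < 2:
        raise ValueError("need at least two distinct carrier counts")
    xs = [math.log(carriers) for carriers in spectrum]
    ys = [math.log(spectrum[carriers]) for carriers in spectrum]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy data (invented for illustration): one haplotype carried by 4 men,
# one carried by 2 men, and four singletons.
toy_sample = ["a"] * 4 + ["b"] * 2 + ["c", "d", "e", "f"]
toy_spectrum = frequency_spectrum(toy_sample)
print(dict(toy_spectrum))
print(loglog_slope(toy_spectrum))
```

Applied to real data, each haplotype would be a tuple of microsatellite repeat counts rather than a letter; any hashable label works.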

To more explicitly test whether the observed high frequency haplotypes are actually unusually frequent, we simulated genetic data under the standard coalescent, a neutral model that does not include cultural transmission of reproductive success. We modeled the evolution of 5,321 Y chromosomes, each carrying 8 fully linked microsatellites, to match the observed data. The code for these simulations, including full details of parameter values, is available online (http://elzaguillot.github.io/Allele-Frequency-Spectrum-simulations).
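To make the idea of a neutral coalescent simulation concrete, here is a minimal sketch in Python. It is not the authors' code. It assumes a constant-size Kingman coalescent with time scaled so that each pair of lineages coalesces at rate 1, a symmetric single-step mutation model for the microsatellites, and mutations arising at rate theta / 2 per locus per lineage; the function name and these scaling choices are assumptions made for illustration only.

```python
import random


def simulate_haplotypes(n_samples, n_loci, theta, rng):
    """Simulate fully linked microsatellite haplotypes under a neutral coalescent.

    Assumptions of this sketch (not taken from the published pipeline):
    pairwise coalescence rate 1, symmetric +/-1 stepwise mutation, and a
    per-locus mutation rate of theta / 2 along each lineage (theta > 0).
    """
    # Build the genealogy backwards in time. Leaves are nodes 0..n_samples-1.
    times = [0.0] * n_samples
    parent = [None] * n_samples
    active = list(range(n_samples))
    t = 0.0
    while len(active) > 1:
        k = len(active)
        t += rng.expovariate(k * (k - 1) / 2)  # waiting time to next coalescence
        i, j = sorted(rng.sample(range(k), 2))
        second = active.pop(j)
        first = active.pop(i)
        new_node = len(times)
        times.append(t)
        parent.append(None)
        parent[first] = new_node
        parent[second] = new_node
        active.append(new_node)

    # Drop mutations forwards in time, from the root down to the leaves.
    # Internal nodes always have a higher index than their children, so
    # visiting nodes in descending order handles each parent first.
    root = len(times) - 1
    state = [None] * len(times)
    state[root] = (0,) * n_loci  # ancestral repeat counts, relative to the root
    for node in range(root - 1, -1, -1):
        branch_length = times[parent[node]] - times[node]
        alleles = list(state[parent[node]])
        for locus in range(n_loci):
            # Poisson process of mutations along the branch.
            elapsed = rng.expovariate(theta / 2)
            while elapsed < branch_length:
                alleles[locus] += rng.choice((-1, 1))
                elapsed += rng.expovariate(theta / 2)
        state[node] = tuple(alleles)
    return state[:n_samples]


# Small toy run; the published analysis used 5,321 chromosomes and 8 loci.
sample = simulate_haplotypes(n_samples=200, n_loci=8, theta=5.0, rng=random.Random(42))
print(len(set(sample)), "distinct haplotypes among", len(sample), "chromosomes")
```

Because the loci are fully linked, all 8 microsatellites share one genealogy, which is why a single tree is built per simulated dataset. The theta value used here is arbitrary and is not meant to correspond to the fitted value reported in the text.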

Simulations were first run across a sweep of θ values to find the best match with the power law distribution observed in the Central Asian Y chromosome dataset. The least squares fit between observed and simulated distributions was minimized at θ = 131. In one million simulations run at this value, we found that 27.2% of the simulations contained at least 15 haplotypes carried by more than 20 men, thus illustrating that high frequency haplotypes like those observed among Central Asian Y chromosomes are relatively common, even when cultural transmission of reproductive success is not acting. The Y chromosome haplotype frequency distribution observed by Balaresque and colleagues falls within the 95% confidence intervals of our simulations (Figure 1, red shading) and is therefore indistinguishable from our simulated neutral data.
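The two summary steps described here, choosing θ by least squares and counting how often simulations contain at least 15 haplotypes carried by more than 20 men, can be sketched as follows. The helper names and the toy inputs are hypothetical, and the exact form of the least squares comparison in the published pipeline may differ from the plain sum of squared differences assumed here.

```python
from collections import Counter


def count_high_frequency(haplotypes, min_carriers=20):
    """Number of distinct haplotypes carried by MORE than min_carriers individuals."""
    return sum(1 for n in Counter(haplotypes).values() if n > min_carriers)


def fraction_at_least(simulated_datasets, n_haplotypes=15, min_carriers=20):
    """Fraction of simulated datasets with at least n_haplotypes high frequency haplotypes."""
    hits = sum(
        1
        for haplotypes in simulated_datasets
        if count_high_frequency(haplotypes, min_carriers) >= n_haplotypes
    )
    return hits / len(simulated_datasets)


def sum_squared_error(observed_spectrum, simulated_spectrum):
    """Sum of squared differences between two frequency spectra (dicts: carriers -> count)."""
    carrier_counts = set(observed_spectrum) | set(simulated_spectrum)
    return sum(
        (observed_spectrum.get(c, 0) - simulated_spectrum.get(c, 0)) ** 2
        for c in carrier_counts
    )


def best_theta(observed_spectrum, spectra_by_theta):
    """Return the theta whose simulated spectrum is closest to the observed one."""
    return min(
        spectra_by_theta,
        key=lambda theta: sum_squared_error(observed_spectrum, spectra_by_theta[theta]),
    )


# Invented toy inputs, only to show how the helpers fit together.
observed = {1: 10, 2: 3}
candidates = {50: {1: 4, 2: 1}, 100: {1: 9, 2: 3}, 150: {1: 20, 2: 8}}
print(best_theta(observed, candidates))

toy_simulations = [["h1"] * 21 + ["h2"] * 5, ["h1"] * 20 + ["h2"] * 20]
print(fraction_at_least(toy_simulations, n_haplotypes=1))
```

In a full analysis, each entry of `spectra_by_theta` would presumably be an average over many simulations at that θ, and `fraction_at_least` would be applied to the full set of simulated datasets at the fitted value.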

The most parsimonious explanation is therefore that the high frequency haplotypes observed by Balaresque and colleagues in Central Asia are simply expected chance events. While we strongly encourage further research into cultural transmission of reproductive success, no statistical evidence has yet been presented to show that this process has acted on this particular dataset of Central Asian Y chromosomes. As no additional evidence is presented to support the proposed links to famous historical men, these haplotypes instead most likely reflect the chance proliferation of random male lines, probably from historically unrecorded, but biologically lucky Central Asian men.

Software availability

Latest source code for allele frequency spectrum simulations

http://elzaguillot.github.io/Allele-Frequency-Spectrum-simulations

Archived source code as at the time of publication

http://doi.org/10.5281/zenodo.45254 [6]

License

GNU Lesser General Public License 3.0 https://www.gnu.org/licenses/lgpl.html

Funding Statement

The author(s) declared that no grants were involved in supporting this work.

[version 2; referees: 1 approved

References

  • 1. Kolk M: Multigenerational transmission of family size in contemporary Sweden. Popul Stud (Camb). 2014;68(1):111–129. doi: 10.1080/00324728.2013.819112
  • 2. Austerlitz F, Heyer E: Social transmission of reproductive behavior increases frequency of inherited disorders in a young-expanding population. Proc Natl Acad Sci U S A. 1998;95(25):15140–15144. doi: 10.1073/pnas.95.25.15140
  • 3. Heyer E, Sibert A, Austerlitz F: Cultural transmission of fitness: genes take the fast lane. Trends Genet. 2005;21(4):234–239. doi: 10.1016/j.tig.2005.02.007
  • 4. Balaresque P, Poulet N, Cussat-Blanc S, et al.: Y-chromosome descent clusters and male differential reproductive success: young lineage expansions dominate Asian pastoral nomadic populations. Eur J Hum Genet. 2015;23(10):1413–1422. doi: 10.1038/ejhg.2014.285
  • 5. Berestycki J, Berestycki N, Limic V: Asymptotic sampling formulae for Λ-coalescents. Ann I H Poincaré-Pr. 2014;50(3):715–731. doi: 10.1214/13-AIHP546
  • 6. Guillot EG, Cox MP: Allele Frequency Spectrum simulations: AFS2.0. Zenodo. 2016. doi: 10.5281/zenodo.45254

F1000Res. 2016 Jan 18. doi: 10.5256/f1000research.7561.r11952

Referee response for version 1

Nick Patterson 1

This short note considers the recent paper by Balaresque et al. [1] on the distribution of Y-chromosome haplotypes in Central Asia. Through simulation, the authors show that the haplotype frequency distribution is not very surprising and suggest that the results of [1] are most likely due to chance.

But there is more to the analysis of [1] than just the haplotype frequency. Their analysis groups haplotypes into ‘descent clusters’, estimates the time to the most recent common ancestor (TMRCA) and looks into the spatial distribution of the haplotypes. None of this was simulated. There is no formal test applied in [1], but visually the results look to this reviewer very surprising under a scenario where, to quote Guillot and Cox, the results are

chance proliferation of random male lines... from culturally undistinguished but biologically lucky...men

In autosomal analysis of admixture events [2], overwhelming genetic evidence was found for the Mongol expansion across Eurasia. This by no means shows that the Y-chromosome signal was, at least partially, driven by high status Mongols, but to this reviewer this still seems more likely than not.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

References

  • 1. Balaresque P, Poulet N, Cussat-Blanc S, Gerard P, Quintana-Murci L, Heyer E, Jobling MA: Y-chromosome descent clusters and male differential reproductive success: young lineage expansions dominate Asian pastoral nomadic populations. Eur J Hum Genet. 2015;23(10):1413–1422. doi: 10.1038/ejhg.2014.285
  • 2. Hellenthal G, Busby GB, Band G, Wilson JF, Capelli C, Falush D, Myers S: A genetic atlas of human admixture history. Science. 2014;343(6172):747–751. doi: 10.1126/science.1243518

F1000Res. 2016 Jan 15. doi: 10.5256/f1000research.7561.r11823

Referee response for version 1

Heather Norton 1

Summary

In this manuscript Guillot and Cox test the claim made by Balaresque et al. (2015) that a subset of Y-chromosome haplotypes from Central Asian men occur at “unusually” high frequency, possibly indicating social selection for men carrying these lineages. Using simulations designed to match the data reported by Balaresque et al. the authors demonstrate that the reported distribution of Y chromosome haplotypes can be obtained under neutral conditions. This suggests that it is not necessary to invoke a model that includes cultural transmission of reproductive success to explain the observed distribution.

Comment

While the focus of this correspondence article is on the Balaresque data, can the authors briefly comment on other papers that have also investigated cultural transmission of reproductive success—specifically, have there been other studies that report high frequency Y haplotypes in other populations that are not consistent with neutrality?

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

F1000Res. 2015 Oct 1. doi: 10.5256/f1000research.7561.r10223

Referee response for version 1

Sohini Ramachandran 1

Guillot and Cox present a very interesting criticism of Balaresque et al.'s work in press on high frequency haplotypes in Central Asian Y chromosomes, by showing that distributions like those observed by Balaresque and colleagues can be observed using neutral simulations.

I have three comments I would like to see the authors address:

  1. The "Genghis Khan reproductive success" hypothesis emerged in Zerjal et al.'s work in 2003 and I think it would be helpful for the authors to comment on what analyses in that work support Zerjal et al.'s conclusions; can their simulations reproduce what Zerjal et al. observed under strictly neutral processes without a high number of mergers in the coalescent process?

  2. The phrase "historically unrecorded, culturally undistinguished, but biologically lucky Central Asian men." should be changed to "historically unrecorded but biologically lucky Central Asian men."

  3. The authors should provide sample output files from their simulation pipeline for users to analyze, and should adapt their code so that the number of simulations run is a user-provided argument. Given that the pipeline can take at least hours, and perhaps days, to generate the million simulations they studied, a toy example that runs quickly would be worthwhile; it would let readers generate examples without needing to fiddle with the bash/python/R pipeline on their own.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

