Skip to main content
Elsevier Sponsored Documents logoLink to Elsevier Sponsored Documents
. 2013 Dec;7(6):568–572. doi: 10.1016/j.fsigen.2013.03.014

A comparison of Y-chromosomal lineage dating using either resequencing or Y-SNP plus Y-STR genotyping

Wei Wei 1, Qasim Ayub 1, Yali Xue 1, Chris Tyler-Smith 1,
PMCID: PMC3820021  PMID: 23768990

Abstract

We have compared phylogenies and time estimates for Y-chromosomal lineages based on resequencing ∼9 Mb of DNA and applying the program GENETREE to similar analyses based on the more standard approach of genotyping 26 Y-SNPs plus 21 Y-STRs and applying the programs NETWORK and BATWING. We find that deep phylogenetic structure is not adequately reconstructed after Y-SNP plus Y-STR genotyping, and that times estimated using observed Y-STR mutation rates are several-fold too recent. In contrast, an evolutionary mutation rate gives times that are more similar to the resequencing data. In principle, systematic comparisons of this kind can in future studies be used to identify the combinations of Y-SNP and Y-STR markers, and time estimation methodologies, that correspond best to resequencing data.

Keywords: Human Y chromosome, Male history, Time estimation, Networks, BATWING

1. Introduction

The combination of its male-specific inheritance, small effective population size and geographical specificity make the Y chromosome the locus of choice for investigating many questions about both forensics [1] and human male history and prehistory [2]. In some cases, the distribution [3] or sharing [4,5] of a Y-chromosomal lineage may itself provide the information sought, but often an estimate of a date or time is an integral part of a study: for example, the time when a lineage originated [6–8] or spread [9], or when a population began to expand in numbers [10,11]. Such a date can then be compared with other genetic or non-genetic dates to generate integrated insights.

Two kinds of data are needed in order to obtain a date estimate from present-day Y chromosomes: information about the genetic diversity of the Y chromosomes, and a measure of the mutation rate of the loci used to determine the diversity. Over the last two decades, Y-chromosomal studies have increasingly used Y-STRs to measure genetic diversity [2]. The commonly used Y-STRs are variable in all populations (http://www.yhrd.org/) and mutate quickly enough that their mutation rates can be measured in deep-rooting families [12] or father–son pairs [13], and refined using information from levels of population variation [14]. Thus Y-STR mutation rates are now estimated with some precision. Nevertheless, there are complications in using variants that mutate so fast (about once in 500 transmissions) to estimate evolutionary times: after ∼15 thousand years, a typical Y-STR will have mutated more than once. Since most mutations are increases or decreases of single repeat units [13], about half of double mutations will recreate the original allele and thus not be readily detected by simply examining the haplotype. A number of approaches have been taken to address such issues. The construction of networks representing the evolutionary history of a set of haplotypes can recover some of the non-observed haplotypes [15], and this process is aided by including Y-SNPs (which have much lower mutation rates) in the network, or analysing only single groups of closely related haplotypes defined by Y-SNPs (haplogroups). Times can be estimated from networks of such data using the rho statistic [16]. Other approaches to time estimation include models of the Y-STR mutational process that allow back-and-forth mutations [17]. Nevertheless, it has also been suggested that mutation rates measured in families or father-son pairs should be calibrated to lower values when used for evolutionary purposes [18,19].

While these methods for estimating times have found widespread acceptance and use, it has been difficult to assess their reliability because of the lack of datasets where the true time of interest is known. Indeed, comparisons with simulated datasets have suggested that times estimated from networks using the rho statistic are not always reliable [20]. Improvements in technology now mean that another form of test, comparison with Y-chromosomal variation discovered by large-scale resequencing, is possible [21]. Such comparisons have the advantage that the variation discovered by resequencing and Y-STR genotyping on a lineage share the same history, so testing of reliability is not complicated by, for example, choice of a particular demographic scenario in a simulation. In addition, Y-SNPs, the most abundant variants discovered by resequencing, mutate so slowly that recurrent mutations on the human Y chromosome will generally have negligible effects on resequencing-based time estimates.

We have previously constructed a calibrated phylogenetic tree based on Y-SNPs discovered by resequencing ∼9 Mb of unique Y-chromosomal DNA, or subsets of this DNA [21]. The branch lengths (numbers of SNPs) on this tree between any pair of chromosomes are proportional to the time separating the chromosomes, and since these SNP numbers are large, usually a few hundred, they are determined accurately. However, since substantial effort is required to generate ∼9 Mb of sequence data per individual and the number of Y chromosome sequences available is still small, we wished to compare conclusions from full resequencing with conclusions from more standard Y-STR plus Y-SNP genotyping, where vastly more datasets are available. We have therefore typed this set of resequenced Y chromosomes with 23 common Y-STRs, applied a widely used method for estimating times from traditional Y-SNP plus Y-STR genotypes, and compared the time estimates with those from resequencing.

2. Materials and methods

2.1. DNA samples and genotyping

We analysed 33 of the 36 males for whom sequence data from the Y chromosome are available [21]; the three individuals not included in the current study were three of the sons in the CEU pedigree. 32 of these individuals had been sequenced by Complete Genomics, and one by The Wellcome Trust Sanger Institute, as reported in the previous study [21]. DNA samples were obtained from the Coriell Institute for Medical Research (Camden, NJ, USA).

23 Y-STRs were genotyped using the PowerPlex® Y23 system (Promega). Each sample was amplified in 5 μl volume containing 1 μl of PowerPlex® Y23 5X Master Mix, 0.5 μl of PowerPlex® Y23 10X Primer Pair Mix and at least 0.5 ng of template. Cycling conditions used an initial denaturation of 96 °C for 2 min, 30 cycles of 94 °C for 10 s, 61 °C for 1 min, and 30 s at 72 °C followed by a 20 min hold at 60 °C and a final 4 °C soak using the max ramp rate on a MJ Research DNA Engine Tetrad 2. Separation of amplification products was performed on the Applied Biosystems 3730xl. A 36 cm capillary array was used with POP-7® Polymer (Life Technologies™). Samples were prepared for separation and analysis by adding 1 μl of 1:10 dilution amplified sample or allelic ladder to 10 μl of Hi-Di™ Formamide (Life Technologies™) and 1 μl of CC5 Internal Lane Standard 500 (ILS). Samples were denatured for 3 min at 95 °C followed by a snap cool in an ice bath and subsequently injected for 23 s at 1.2 kV. GeneMapper® ID Software, Version 3.0 (Life Technologies™) was used to determine fragment size and allele calls with a 50 RFU analytical threshold. Genotypes for a set of 29 standard Y-SNPs that define the major Y haplogroups (Supplementary Table 1) were extracted from the sequence data.

Supplementary Table 1

Y-SNP and Y-STR genotypes in the 33 samples analysed. The yellow highlights indicate Y-STRs that show a mutation within the family.

mmc1.xlsx (57.5KB, xlsx)

2.2. Data analysis

Median-joining networks [15] of haplotypes consisting of 21 Y-STRs and 29 Y-SNPs were constructed using Network 4.61.1 (http://www.fluxus-engineering.com/sharenet.htm). The duplicated locus DYS385 was not used in these analyses since the constituent loci are not distinguished in this assay, and DYS389 was treated as DYS389I and DYS389b [DYS389b = DYS389II − DYS389I]. The 29 Y-SNPs were assigned high weights of 99, and the 21 Y-STRs lower weights that ranged from 1 to 5, depending upon the inverse of the variance of each STR (Supplementary Table 1). We manually counted the number of Y-STR mutational steps on the network between each pair of individuals.

Time estimates were made using BATWING (Bayesian Analysis of Trees With Internal Node Generation) [17] based on 26 Y-SNPs and 21 Y-STRs. We excluded three SNPs that were not variable in the 33 individuals, treated DYS385 and DYS389 as above, and used a population model of exponential growth from an initially constant-sized population with the settings, priors and convergence assessments described previously [10]. Five sets of Y-STR mutation rates were used. These included two compilations of “observed” mutation rates (OMR) [11,14], a widely used calibrated “evolutionary” mutation rate (EMR) [19], a recalibrated evolutionary mutation rate (rEMR) [11] and a mutation rate predicted from the logistic model (lmMR) [14] (Supplementary Table 2). We evaluated both the Time to the Most Recent Common Ancestor (TMRCA) of the entire sample, and the times of individual Y-SNPs within the phylogeny. The time estimates for individual Y-SNPs were compared with the times of the same SNPs estimated by GENETREE from sequence data, as reported previously [21]. Pearson's correlation coefficient (R2), Spearman's rank correlation coefficient (rho), and their significance were calculated using the correlation test in R2.15.1 (http://www.r-project.org).

Supplementary Table 2

Five sets of Y-STR mutation rates and the prior distributions used in BATWING analyses.

mmc2.xlsx (12.1KB, xlsx)

3. Results

We generated a standard: Y-SNP plus Y-STR dataset for 33 of the 36 Y chromosomes previously analysed by resequencing. The 29 Y-SNPs were chosen to correspond to those that might have been used if the resequencing data had not been available: they defined common haplogroups (including three that were not present in this sample) and subdivided haplogroups known to be frequent in Africa and Europe, where many of the samples originated. The 23 Y-STRs included those most commonly used. We began by constructing a phylogenetic network from the data (Fig. 1). The resulting network clusters groups of chromosomes from the same haplogroup together and reconstructs most of the expected features of the phylogeny. For example, the diverse E1b1b chromosomes are grouped together, next to their sister group E1b1a, with the two E1b haplogroups linked next to E1a and then D. However, not all deep relationships are reconstructed correctly. R1a chromosomes are placed between R2 and R1b, with the SRY10831.2 SNP mutation that defines this haplogroup being incorrectly assigned as recurrent, mutating from G to A between R2 and R1a, and then back from A to G between R1a and R1b. Similarly, the M89 and M9 mutations deep within the network are placed as recurrent, showing that ancient structure is not reconstructed correctly here.

Fig. 1.

Fig. 1

Median-joining network representing the relationships between 33 Y chromosomes based on 26 variable Y-SNPs and 21 Y-STRs. Each circle represents a haplotype and has an area proportional to its frequency. All haplotypes are present once, except one within E1b1a and one R1b. Lines represent Y-SNP plus Y-STR mutational steps between the haplotypes and have a length proportional to the number of steps. The branches on which the Y-SNP mutations lie are indicated; note that there is no information about location or ordering within the branch on which they lie. Y-SNP names in red represent sites assigned as recurrent on the network but not on the sequence-based phylogenetic tree in Supplementary Figure 1.

We next compared the Y-SNP distance on the previous resequencing-based phylogenetic tree (Supplementary Figure 1) between each pair of chromosomes with the Y-STR distance on the network between the same pair of chromosomes (Fig. 2). Although the absolute numbers will differ, they should be correlated. They are indeed very significantly correlated (p < 2.2 × 10−16), but the value of the Spearman correlation coefficient is only 0.52 and Pearson's R2 only 0.28, reflecting both the large scatter of points seen in Fig. 2 and the striking saturation of Y-STR distances for chromosomes separated by large Y-SNP distances (Fig. 2): on the sequence-based tree, the haplogroup A chromosome is very distinct from all of the others (Supplementary Figure 1), but on the network it does not lie on an exceptionally long branch (Fig. 1). These findings also alert us to the possibility that conclusions about deep relationships based on genotyping Y-SNPs plus Y-STRs may be unreliable.

Fig. 2.

Fig. 2

Comparison of Y-SNP and Y-STR distances between pairs of chromosomes. Y-SNP distances were determined from the phylogenetic tree based on resequencing, while the Y-STR distances were determined from the Y-SNP plus Y-STR network shown in Fig. 1.

Supplementary Figure 1.

Supplementary Figure 1

Phylogenetic tree based on resequencing [21] showing the locations of the 26 Y-SNP mutations used in the current study.

Distances measured in mutational steps can be converted into times measured in years if the mutation rate is known, for example using the program BATWING. While there have been numerous measurements of Y-STR mutation rates, there is, as discussed above, controversy about which rate should be used in evolutionary studies. We therefore compared times estimated from the resequencing data with times estimated using five different assessments of the Y-STR mutation rate: two compilations of observed mutation rates, an observed mutation rate adjusted for population variation, an evolutionary mutation rate and a recalibrated evolutionary mutation rate (Supplementary Table 2). For simplicity, we present here a single time estimate: that of the TMRCA of 33 chromosomes, based on each of the mutation rates. For the resequencing data, five point estimates made using the same dataset and calibration rate but different methods of calculation were available. These ranged from 101 to 115 KYA [21] (Fig. 3, shaded horizontal bar). In contrast, the TMRCAs estimated from the Y-SNP plus Y-STR data using the five different Y-STR mutation rates were all much younger and ranged from 16 to 71 KYA (Fig. 3). In this comparison, the TMRCAs based on the two evolutionary mutation rate estimates are more consistent with resequencing-based TMRCAs than those based on the observed mutation rates: the EMR, rEMR and resequencing confidence intervals overlap. The resequencing-based TMRCAs of course depend on the calibration used, in this case derived from a small number of observed Y-SNP mutations in a deep-rooting pedigree [22]. Nevertheless, the resequencing-based TMRCAs are consistent with well-supported dates such as the migration of modern humans out of Africa ∼60 KYA [23], while the much more recent dates estimated using the observed Y-STR mutation rates are highly implausible.

Fig. 3.

Fig. 3

Comparison of TMRCA estimates based on Y-SNP plus Y-STR genotyping using five different Y-STR mutation rates, with the range of published estimates based on resequencing. Dark horizontal bar: range of five point estimates from resequencing; light horizontal bar: standard deviation of the point estimate with the largest uncertainty [21].

Since the EMR produced the time estimates most consistent with the resequencing-based estimates, we compared the times of additional nodes within the tree between these two methods (Fig. 4). The two sets of time estimates are highly correlated (Spearman's rho = 0.96, R2 = 0.88, p-value = 0.00018).

Fig. 4.

Fig. 4

Comparison of time estimates from Y-SNP plus Y-STR genotyping with published time estimates based on resequencing for eight Y-SNPs within the phylogeny. Colours represent the haplogroups defined by the SNPs, using the same conventions as Fig. 1.

4. Discussion

In this study, we have documented large differences between conclusions about times based on full resequencing compared to standard Y-SNP plus Y-STR genotyping. Here, we discuss three major aspects of these differences, and the more general conclusions that may be drawn from our study.

First, and independent of any choices about calibration rate, Y-STR mutation counts between lineages appear to saturate rapidly (Fig. 2). A line relating the two must pass through the 0,0 point on the graph; if a straight line were drawn through this point based on the Y-SNP distances <100, under-counting of Y-STR steps would be apparent even for Y-SNP distances of 100–200. It is widely appreciated that raw Y-STR mutational differences between haplotypes saturate rapidly, but the use of networks, especially those incorporating Y-SNPs, is expected to recover many of the Y-STR mutational steps obscured by recurrent mutation. We see that this strategy is only partially successful.

Second, the comparison with time estimates based on resequencing provides an opportunity to evaluate the different Y-STR mutation rates that have been proposed. Even when times are estimated using an approach that models recurrent mutation [17], the use of observed mutation rates leads to time estimates that are several-fold too recent. In contrast, an evolutionary mutation rate [19] leads to more plausible time estimates. The Y-SNP mutation rate [22] used for calibration of the resequencing data has itself wide confidence intervals, so could these instead be responsible for the discrepancy? Current debate about the human SNP mutation rate and its implications for the timing of evolutionary events contemplates the possibility of a longer timescale rather than a shorter one [24], so this seems unlikely: older Y-sequence-based times would be even less consistent with the observed Y-STR mutation rates.

Third, if the Y-STR mutation rate that generates a TMRCA that matches the resequencing TMRCA is chosen (i.e. the EMR), the times of additional nodes in the tree also match. While this finding is expected, it is nevertheless reassuring to see the strength of the correlation between the resequencing times calculated by GENETREE and the Y-SNP plus Y-STR times calculated by BATWING.

Despite the limitations identified, we now have a tool to evaluate time estimates based on Y-SNP plus Y-STR genotyping in a systematic way. For example, do some Y-STRs lead to more reliable estimates than others [8], and can a subset of the most useful ones be identified? How does the inclusion of Y-SNPs influence the time estimates? Do alternative methods of estimating time from Y-SNP plus Y-STR data correspond more or less well with resequencing data? Do bigger datasets, especially ones containing groups of more closely related chromosomes, lead to better recovery of recurrent mutations and thus more reliable time estimates?

5. Conclusions

We have compared a laborious and expensive ‘gold standard’ method for estimating Y-chromosomal lineage times – resequencing ∼9 Mb of DNA – with the much easier and more cost-effective standard approach of genotyping sets of Y-SNPs plus Y-STRs. While the times estimated using the two approaches can vary several-fold, we conclude that BATWING time estimates based on an evolutionary mutation rate correlate best with the resequence data.

Acknowledgements

We thank Lutz Roewer and Walther Parson for their invitation to the workshop “DNA in Forensics: Exploring the Phylogenies” where the sequence data were presented and discussed for the first time, and Richard Rance in the Sanger 454 Sequencing and Capillary Loading team for genotyping. This work was supported by grant number 098051 from The Wellcome Trust; the funders had no role in the study design, collection, analysis and interpretation of data, or in the writing of the report or decision to submit the article for publication.

Footnotes

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

References

  • 1.Jobling M.A., Pandya A., Tyler-Smith C. The Y chromosome in forensic analysis and paternity testing. Int. J. Legal Med. 1997;110:118–124. doi: 10.1007/s004140050050. [DOI] [PubMed] [Google Scholar]
  • 2.Jobling M.A., Tyler-Smith C. The human Y chromosome: an evolutionary marker comes of age. Nat. Rev. Genet. 2003;4:598–612. doi: 10.1038/nrg1124. [DOI] [PubMed] [Google Scholar]
  • 3.Rosser Z.H., Zerjal T., Hurles M.E., Adojaan M., Alavantic D., Amorim A., Amos W., Armenteros M., Arroyo E., Barbujani G., Beckman G., Beckman L., Bertranpetit J., Bosch E., Bradley D.G., Brede G., Cooper G., Corte-Real H.B., de Knijff P., Decorte R., Dubrova Y.E., Evgrafov O., Gilissen A., Glisic S., Golge M., Hill E.W., Jeziorowska A., Kalaydjieva L., Kayser M., Kivisild T., Kravchenko S.A., Krumina A., Kucinskas V., Lavinha J., Livshits L.A., Malaspina P., Maria S., McElreavey K., Meitinger T.A., Mikelsaar A.V., Mitchell R.J., Nafa K., Nicholson J., Norby S., Pandya A., Parik J., Patsalis P.C., Pereira L., Peterlin B., Pielberg G., Prata M.J., Previdere C., Roewer L., Rootsi S., Rubinsztein D.C., Saillard J., Santos F.R., Stefanescu G., Sykes B.C., Tolun A., Villems R., Tyler-Smith C., Jobling M.A. Y-chromosomal diversity in Europe is clinal and influenced primarily by geography, rather than by language. Am. J. Hum. Genet. 2000;67:1526–1543. doi: 10.1086/316890. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Foster E.A., Jobling M.A., Taylor P.G., Donnelly P., de Knijff P., Mieremet R., Zerjal T., Tyler-Smith C. Jefferson fathered slave's last child. Nature. 1998;396:27–28. doi: 10.1038/23835. [DOI] [PubMed] [Google Scholar]
  • 5.Firasat S., Khaliq S., Mohyuddin A., Papaioannou M., Tyler-Smith C., Underhill P.A., Ayub Q. Y-chromosomal evidence for a limited Greek contribution to the Pathan population of Pakistan. Eur. J. Hum. Genet. 2007;15:121–126. doi: 10.1038/sj.ejhg.5201726. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Semino O., Passarino G., Oefner P.J., Lin A.A., Arbuzova S., Beckman L.E., De Benedictis G., Francalacci P., Kouvatsi A., Limborska S., Marcikiae M., Mika A., Mika B., Primorac D., Santachiara-Benerecetti A.S., Cavalli-Sforza L.L., Underhill P.A. The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective. Science. 2000;290:1155–1159. doi: 10.1126/science.290.5494.1155. [DOI] [PubMed] [Google Scholar]
  • 7.Balaresque P., Bowden G.R., Adams S.M., Leung H.Y., King T.E., Rosser Z.H., Goodwin J., Moisan J.P., Richard C., Millward A., Demaine A.G., Barbujani G., Previdere C., Wilson I.J., Tyler-Smith C., Jobling M.A. A predominantly neolithic origin for European paternal lineages. PLoS Biol. 2010;8:e1000285. doi: 10.1371/journal.pbio.1000285. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Busby G.B., Brisighelli F., Sanchez-Diz P., Ramos-Luis E., Martinez-Cadenas C., Thomas M.G., Bradley D.G., Gusmao L., Winney B., Bodmer W., Vennemann M., Coia V., Scarnicci F., Tofanelli S., Vona G., Ploski R., Vecchiotti C., Zemunik T., Rudan I., Karachanak S., Toncheva D., Anagnostou P., Ferri G., Rapone C., Hervig T., Moen T., Wilson J.F., Capelli C. The peopling of Europe and the cautionary tale of Y chromosome lineage R-M269. Proc. R. Soc. B. 2012;279:884–892. doi: 10.1098/rspb.2011.1044. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Zalloua P.A., Platt D.E., El Sibai M., Khalife J., Makhoul N., Haber M., Xue Y., Izaabel H., Bosch E., Adams S.M., Arroyo E., Lopez-Parra A.M., Aler M., Picornell A., Ramon M., Jobling M.A., Comas D., Bertranpetit J., Wells R.S., Tyler-Smith C. Identifying genetic traces of historical expansions: phoenician footprints in the Mediterranean. Am. J. Hum. Genet. 2008;83:633–642. doi: 10.1016/j.ajhg.2008.10.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Xue Y., Zerjal T., Bao W., Zhu S., Shu Q., Xu J., Du R., Fu S., Li P., Hurles M.E., Yang H., Tyler-Smith C. Male demography in East Asia: a north-south contrast in human population expansion times. Genetics. 2006;172:2431–2439. doi: 10.1534/genetics.105.054270. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Shi W., Ayub Q., Vermeulen M., Shao R.G., Zuniga S., van der Gaag K., de Knijff P., Kayser M., Xue Y., Tyler-Smith C. A worldwide survey of human male demographic history based on Y-SNP and Y-STR data from the HGDP-CEPH populations. Mol. Biol. Evol. 2010;27:385–393. doi: 10.1093/molbev/msp243. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Heyer E., Puymirat J., Dieltjes P., Bakker E., de Knijff P. Estimating Y chromosome specific microsatellite mutation frequencies using deep rooting pedigrees. Hum. Mol. Genet. 1997;6:799–803. doi: 10.1093/hmg/6.5.799. [DOI] [PubMed] [Google Scholar]
  • 13.Ballantyne K.N., Goedbloed M., Fang R., Schaap O., Lao O., Wollstein A., Choi Y., van Duijn K., Vermeulen M., Brauer S., Decorte R., Poetsch M., von Wurmb-Schwark N., de Knijff P., Labuda D., Vezina H., Knoblauch H., Lessig R., Roewer L., Ploski R., Dobosz T., Henke L., Henke J., Furtado M.R., Kayser M. Mutability of Y-chromosomal microsatellites: rates, characteristics, molecular bases, and forensic implications. Am. J. Hum. Genet. 2010;87:341–353. doi: 10.1016/j.ajhg.2010.08.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Burgarella C., Navascués M. Mutation rate estimates for 110 Y-chromosome STRs combining population and father-son pair data. Eur. J. Hum. Genet. 2011;19:70–75. doi: 10.1038/ejhg.2010.154. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Bandelt H.J., Forster P., Röhl A. Median-joining networks for inferring intraspecific phylogenies. Mol. Biol. Evol. 1999;16:37–48. doi: 10.1093/oxfordjournals.molbev.a026036. [DOI] [PubMed] [Google Scholar]
  • 16.Saillard J., Forster P., Lynnerup N., Bandelt H.J., Norby S. mtDNA variation among Greenland Eskimos: the edge of the Beringian expansion. Am. J. Hum. Genet. 2000;67:718–726. doi: 10.1086/303038. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Wilson I.J., Weale M.E., Balding D.J. Inferences from DNA data: population histories, evolutionary processes and forensic match probabilities. J. R. Stat. Soc. 2003;116:155–188. [Google Scholar]
  • 18.Forster P., Röhl A., Lunnemann P., Brinkmann C., Zerjal T., Tyler-Smith C., Brinkmann B. A short tandem repeat-based phylogeny for the human Y chromosome. Am. J. Hum. Genet. 2000;67:182–196. doi: 10.1086/302953. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Zhivotovsky L.A., Underhill P.A., Cinnioglu C., Kayser M., Morar B., Kivisild T., Scozzari R., Cruciani F., Destro-Bisol G., Spedini G., Chambers G.K., Herrera R.J., Yong K.K., Gresham D., Tournev I., Feldman M.W., Kalaydjieva L. The effective mutation rate at Y chromosome short tandem repeats, with application to human population-divergence time. Am. J. Hum. Genet. 2004;74:50–61. doi: 10.1086/380911. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Cox M.P. Accuracy of molecular dating with the rho statistic: deviations from coalescent expectations under a range of demographic models. Hum. Biol. 2009;81:911–933. doi: 10.3378/027.081.0631. [DOI] [PubMed] [Google Scholar]
  • 21.Wei W., Ayub Q., Chen Y., McCarthy S., Hou Y., Carbone I., Xue Y., Tyler-Smith C. A calibrated human Y-chromosomal phylogeny based on resequencing. Genome Res. 2013;23:388–395. doi: 10.1101/gr.143198.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Xue Y., Wang Q., Long Q., Ng B.L., Swerdlow H., Burton J., Skuce C., Taylor R., Abdellah Z., Zhao Y., MacArthur D.G., Quail M.A., Carter N.P., Yang H., Tyler-Smith C. Human Y chromosome base-substitution mutation rate measured by direct sequencing in a deep-rooting pedigree. Curr. Biol. 2009;19:1453–1457. doi: 10.1016/j.cub.2009.07.032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Jobling M.A., Hurles M.E., Tyler-Smith C. Garland Science; Abingdon, UK: 2004. Human Evolutionary Genetics: Origins, Peoples and Disease. [Google Scholar]
  • 24.Scally A., Durbin R. Revising the human mutation rate: implications for understanding human evolution. Nat. Rev. Genet. 2012;13:745–753. doi: 10.1038/nrg3295. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Table 1

Y-SNP and Y-STR genotypes in the 33 samples analysed. The yellow highlights indicate Y-STRs that show a mutation within the family.

mmc1.xlsx (57.5KB, xlsx)
Supplementary Table 2

Five sets of Y-STR mutation rates and the prior distributions used in BATWING analyses.

mmc2.xlsx (12.1KB, xlsx)

RESOURCES