Author manuscript; available in PMC: 2010 Mar 24.

Published in final edited form as: Nature. 2009 Sep 24;461(7263):489–494. doi: 10.1038/nature08365

Table 2.

Detection and quantification of population mixture along the Indian Cline

Indian Cline group	Samples	Z-score from 3 Population Test for mixture	% ANI ancestry	±1 stand. error	Genetic drift D from the best fitting combination of ANI and ASI ^*	Wright’s fixation index F (estimates inbreeding) ^†	Estimated fraction of recessive diseases due to founder events ^††
Mala	3	-2.5	38.8%	1.2%	0.0023	0	100%
Madiga	4	-2.7	40.6%	1.2%	0.0018	0.0061	23%
Chenchu	6	31.3 (not significant)	40.7%	1.3%	0.0492	0	100%
Bhil	7	-10.6	42.9%	1.1%	0.0024	0	100%
Satnami	3	-5.6	43.0%	1.3%	0.0019	0	100%
Kurumba	6	-12.6	43.2%	1.1%	0.0001	0.0052	2%
Kamsali	3	-6.5	44.5%	1.3%	0.0016	0.0066	19%
Vysya	5	5.4 (not significant)	46.2%	1.2%	0.0083	0.0071	54%
Lodi	5	-8.9	49.9%	1.1%	0.0027	0.0056	32%
Naidu	4	-3.3	50.1%	1.2%	0.0022	0.0435	5%
Tharu	5	-20.6	51.0%	1.2%	0.0000	0	na
Velama	4	-3.2	54.7%	1.3%	0.0044	0.0197	18%
Srivastava	2	-7.5	56.4%	1.5%	0.0023	0	100%
Meghawal	5	-13.3	60.3%	1.2%	0.0035	0	100%
Vaish	4	-22.0	62.6%	1.2%	0.0012	0	100%
Kashmiri Pandit	5	-20.6	70.6%	1.2%	0.0019	0	100%
Sindhi	10	-26.3	73.7%	1.1%	0.0008	0.0043	16%
Pathan	15	-34.3	76.9%	1.1%	0.0001	0.0039	3%

^*

Estimates of genetic drift (the variance in allele frequencies on any lineage) are based on a model in which each group is a simple mixture of ANI and ASI, followed by genetic drift specific to that group (corrected for inbreeding). To fit the model, we use the algorithm described in Note S4, and fit f₂, f₃ and f₄ statistics that are calculated in a way that is unbiased by inbreeding (Appendix).

^†

Wright’s fixation index F is estimated as the excess rate at which the two copies of a chromosome within an individual from a group are identical by state, compared with the rate across individuals from that group (Appendix). We set negative values to 0; standard errors are typically around 0.003. Because of the small sample sizes, these estimates are heavily influenced by the samples that happen to have been included in our analysis, and thus should be considered approximate.

^††

To estimate the proportion of recessive disease cases that are due to founder events, we consider the two alleles that a single individual carries at any locus. With probability F, given by Wright’s fixation index, they coalesce in the last few generations due to consanguinity, and with probability (1-F)D, they coalesce more recently than the ANI-ASI mixture due to founder events specific to that group. The fraction of recessive diseases due to founder events can thus be estimated as D(1-F)/(F+D(1-F)).
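
As an illustration only, the estimate D(1-F)/(F+D(1-F)) can be recomputed from the D and F columns of the table. The sketch below is not the authors' code: the function name `founder_fraction` and the handling of the D = F = 0 case as undefined (matching the "na" entry for Tharu) are assumptions made for this example. Because the published D and F values are rounded, a recomputed percentage may differ from the table by about a percentage point in some rows.

```python
from typing import Optional


def founder_fraction(D: float, F: float) -> Optional[float]:
    """Return D(1-F) / (F + D(1-F)), or None when both terms are zero.

    D: group-specific genetic drift since ANI-ASI mixture (table column D).
    F: Wright's fixation index, with negative estimates already set to 0.
    """
    founder = D * (1.0 - F)   # coalescence due to group-specific founder events
    total = F + founder       # consanguinity plus founder events
    if total == 0.0:
        return None           # 0/0: undefined, assumed to correspond to "na"
    return founder / total


# A few (D, F) pairs copied from the table above.
rows = {
    "Mala": (0.0023, 0.0),
    "Madiga": (0.0018, 0.0061),
    "Vysya": (0.0083, 0.0071),
    "Naidu": (0.0022, 0.0435),
    "Tharu": (0.0, 0.0),
}

for group, (D, F) in rows.items():
    frac = founder_fraction(D, F)
    print(group, "na" if frac is None else f"{frac:.0%}")
```

For these five rows the recomputed values agree with the last column of the table (100%, 23%, 54%, 5% and na). When F is 0 the estimate is 100% for any positive D, which is why every group with F set to 0 and nonzero drift shows 100%.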