Author manuscript; available in PMC: 2016 Dec 6.

Published in final edited form as: Nat Genet. 2016 Jun 6;48(7):817–820. doi: 10.1038/ng.3583

Table 1. Comparison of methods on the UK Biobank dataset.

Sample size	Method	Clustering	New MCMC	Switch Error (%)	Run time (hrs)	Run time scaling	Sample size scaling
1,072	SHAPEIT3	No	Yes	2.6	0.25	1	1
10,072	SHAPEIT2	No	No	1.1	4.2	16.8	9.4
10,072	SHAPEIT3	No	Yes	1.1	3.3	13.2	9.4
10,072	SHAPEIT3	Yes	Yes	1.3	2.5	10.0	9.4
152,112	SHAPEIT3	Yes	Yes	0.4	38.5	154	142

Each row shows the performance on a subset of the full dataset. The Clustering column indicates whether the new method for choosing copying states was used. The New MCMC column indicates whether the new MCMC routine, which uses completely parallel updates and local algorithm termination, was used. Switch error is the median switch error on the trio children. Run time is given in hours. The Run time scaling and Sample size scaling columns give the run time and the sample size, respectively, relative to the SHAPEIT3 run on a sample size of 1,072. All runs used 10 threads.
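The two scaling columns can be recomputed from the sample sizes and run times in Table 1. The following is a minimal Python sketch of that arithmetic, assuming the first row (SHAPEIT3 on 1,072 samples) is the baseline; the names `ROWS` and `scaling_ratios` are hypothetical and introduced only for illustration.

```python
# Rows transcribed from Table 1: (sample size, method, clustering, run time in hours).
ROWS = [
    (1072, "SHAPEIT3", "No", 0.25),
    (10072, "SHAPEIT2", "No", 4.2),
    (10072, "SHAPEIT3", "No", 3.3),
    (10072, "SHAPEIT3", "Yes", 2.5),
    (152112, "SHAPEIT3", "Yes", 38.5),
]


def scaling_ratios(rows, baseline_index=0):
    """Return (run time ratio, sample size ratio) for each row.

    Both ratios are taken relative to the row at baseline_index,
    assumed here to be the SHAPEIT3 run on 1,072 samples.
    """
    base_n, _, _, base_hours = rows[baseline_index]
    return [(hours / base_hours, n / base_n) for n, _, _, hours in rows]


ratios = scaling_ratios(ROWS)

# Print each row next to its two recomputed ratios, rounded to one decimal.
for (n, method, clustering, hours), (rt_ratio, ss_ratio) in zip(ROWS, ratios):
    print(f"{n:>7} {method} clustering={clustering:<3} "
          f"run time x{rt_ratio:.1f}  sample size x{ss_ratio:.1f}")
```

Comparing the two ratios row by row shows how run time grows relative to sample size: for the 152,112-sample run the run time ratio is 154 against a sample size ratio of about 142.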