Table 3. A comparison of Dsuite and a number of other tools in terms of computational efficiency of D statistic estimation.
Dataset | Software | Options | Peak memory | Run time |
---|---|---|---|---|
Malawi scaffold_0 | Dsuite Dtrios | --no-f4-ratio | 92MB | 74m59s |
Malawi scaffold_0 | Admixtools qpDstat | blgsize: 0.01 | 27,212MB | 125m2s |
Malawi scaffold_0 | HyDe run_hyde.py | none | 178MB | 231m38s |
Malawi scaffold_0 | Comp-D† | -d -H -b10 | 8,300MB | 24 hours+ |
Malawi scaffold_0 | PopGenome | do.df=F block.size=1000 | 1,170MB | 24 hours+ |
Simulation small (20 species) | Dsuite Dtrios | --no-f4-ratio | 8MB | 28m18s |
Simulation small (20 species) | Admixtools qpDstat | blgsize: 0.01 | 17,100MB | 13m59s |
Simulation small (20 species) | HyDe run_hyde.py | none | 258MB | 19m38s |
Simulation small (20 species) | Comp-D† | -d -H -b10 | 22,100MB | 24 hours+ |
Simulation small (20 species) | PopGenome | do.df=F block.size=1000 | 440MB | 1m50s |
Simulation large (100 species) | Dsuite Dtrios | --no-f4-ratio | 223MB | 215m52s (×100‡) |
Simulation large (100 species) | Admixtools qpDstat | blgsize: 0.05 | 1,117,314MB | 331m39s (×100‡) |
Simulation large (100 species) | HyDe run_hyde.py | none | 18,716MB | 576m32s (×100‡) |
Simulation large (100 species) | Comp-D† | -d -H -b10 | 1,000,185MB+ | 24 hours+ (×100‡) |
Simulation large (100 species) | PopGenome | do.df=F block.size=1000 | 470MB | 274m53s (×100‡) |
† Comp-D cannot use allele frequencies calculated across multiple individuals, so only one individual per species was included.
‡ Because of the size of the dataset, we divided the analysis into 100 equally sized jobs run in parallel; the run time and memory requirements are given for the first job.
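
As a rough illustration of dividing an analysis into 100 equally sized jobs, the sketch below splits a number of work items into contiguous, near-equal chunks. The unit of work is an assumption (for example, variant sites in the input file); the source does not state how each tool's input was partitioned. The function name `split_into_jobs` is hypothetical and is not part of any of the tools compared.

```python
def split_into_jobs(n_items: int, n_jobs: int = 100) -> list[tuple[int, int]]:
    """Split range(n_items) into n_jobs contiguous chunks of near-equal size.

    Returns a list of (start, length) pairs. Hypothetical helper: the
    unit of work (e.g. variant sites) is an assumption, not taken from
    the source.
    """
    if n_jobs <= 0:
        raise ValueError("n_jobs must be positive")
    if n_items < 0:
        raise ValueError("n_items must be non-negative")

    # Every job gets `base` items; the first `extra` jobs get one more.
    base, extra = divmod(n_items, n_jobs)

    chunks = []
    start = 0
    for job in range(n_jobs):
        length = base + (1 if job < extra else 0)
        chunks.append((start, length))
        start += length
    return chunks


if __name__ == "__main__":
    # Illustrative item count only; not a figure from the table.
    jobs = split_into_jobs(1_000_000, 100)
    print(len(jobs), jobs[0], jobs[-1])
```

Each (start, length) pair could then be handed to a separate job, with the per-job outputs combined afterwards. Because the table reports only the first job, the figures marked ×100 describe a single chunk of this kind rather than the whole analysis.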