TABLE 1.
Platform | RAM (GB) | Runtime (min) | Speedup | GPU speedup | Mobile speedup |
---|---|---|---|---|---|
Original CPU Xeon Gold 6242 | 5.6 | 504 | 1× | | |
CPU Mobile i7-8565U | 8.1 | 28.2 | 18× | | 1× |
CPU Mobile i7-8850H | 8.1 | 18.7 | 27× | | 1.5× |
CPU Xeon Gold 6242 | 8.1 | 4.8 | 105× | 1× | |
GPU Mobile GTX 1050 Max-Q | 6.6 | 3.8 | 170× | 1.3× | 7.4× |
GPU T4 | 7.8 | 1.5 | 340× | 3.2× | |
GPU RTX2080TI | 8.4 | 0.73 | 690× | 6.6× | |
GPU V100 PCIE 32GB | 8.2 | 0.75 | 670× | 6.4× | |
GPU A100 PCIE 40GB | 7.8 | 0.62 | 810× | 7.7× | |
GPU RTX3090 | 8.4 | 0.53 | 950× | 9.0× | |
GPU RTX8000 | 7.8 | 0.48 | 1,050× | 10.0× | |
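Speedup is relative to the runtime on the same data using Striped UniFrac from McDonald et al. (10), shown in the "Original CPU" row. GPU speedup and mobile speedup are relative to the rows marked 1× in those columns. In all cases, all available compute resources for an architecture were utilized. The RAM column reports peak resident memory for each run.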
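
The speedup columns can be approximately reproduced from the runtime column by dividing a baseline runtime by each platform's runtime. The following is a minimal sketch using three rows copied from the table. The function and variable names are illustrative, and the choice of the optimized Xeon Gold 6242 run as the GPU-speedup baseline is an assumption inferred from its 1× entry, not something the caption states.

```python
# Runtimes in minutes, copied from three rows of Table 1.
ORIGINAL_MIN = 504.0  # "Original CPU Xeon Gold 6242" row (Striped UniFrac)

runtimes_min = {
    "CPU Xeon Gold 6242": 4.8,
    "GPU T4": 1.5,
    "GPU RTX8000": 0.48,
}


def speedup(baseline_min: float, runtime_min: float) -> float:
    """Return how many times faster a run is than the baseline run."""
    return baseline_min / runtime_min


# Speedup relative to the original implementation.
speedups = {name: speedup(ORIGINAL_MIN, t) for name, t in runtimes_min.items()}

# Assumed baseline for the "GPU speedup" column: the optimized Xeon CPU run.
xeon_min = runtimes_min["CPU Xeon Gold 6242"]
gpu_speedups = {name: speedup(xeon_min, t) for name, t in runtimes_min.items()}

for name in runtimes_min:
    print(f"{name}: {speedups[name]:.0f}x overall, {gpu_speedups[name]:.1f}x vs. Xeon")
```

The table reports rounded values, so computed ratios may differ slightly from the printed entries (for example, 504 / 1.5 = 336, which the table lists as 340×).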