TABLE IV.
MD performance (μs/step/atom) for water, Cu, and HEA systems. “FP64” denotes double precision and “FP32” single precision; “FP64c” and “FP32c” denote the compressed model (Ref. 109) in double and single precision, respectively. “EPYC” denotes runs on 128 AMD EPYC 7742 cores, “3080 Ti” on an NVIDIA GeForce RTX 3080 Ti card, “V100” on an NVIDIA Tesla V100 card, “A100” on an NVIDIA Tesla A100 card, “MI250” on an AMD Instinct MI250 Graphics Compute Die (GCD), and “VU9P” denotes NVNMD (Ref. 110) runs on a Xilinx Virtex UltraScale+ VU9P FPGA board.
System | Hardware | loc_frame FP64 | loc_frame FP32 | se_e2_a FP64 | se_e2_a FP32 | se_e2_a FP64c | se_e2_a FP32c | se_e2_a+se_e2_r FP64 | se_e2_a+se_e2_r FP32 | se_e2_a+se_e2_r FP64c | se_e2_a+se_e2_r FP32c | se_e2_a+se_e3 FP64 | se_e2_a+se_e3 FP32 | se_e2_a+se_e3 FP64c | se_e2_a+se_e3 FP32c | se_atten FP64 | se_atten FP32 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Water | EPYC | 1.25 | 0.699 | 19.3 | 8.73 | 3.89 | 2.61 | 8.33 | 3.43 | 3.78 | 1.86 | 37.2 | 15.1 | 5.04 | 3.63 | 221 | 83.8 |
Water | 3080 Ti | 12.9 | 8.63 | 29.0 | 4.21 | 9.71 | 1.73 | 20.8 | 3.43 | 9.06 | 1.99 | 69.5 | 10.5 | 18.5 | 2.89 | 294 | 32.3 |
Water | V100 | 16.1 | 16.8 | 8.25 | 4.59 | 1.94 | 1.51 | 6.21 | 3.53 | 2.22 | 1.62 | 22.2 | 11.3 | 3.31 | 2.41 | 91.2 | 37.2 |
Water | A100 | 35.7 | 33.9 | 4.37 | 3.01 | 1.56 | 1.42 | 4.11 | 2.44 | 2.07 | 1.53 | 12.5 | 7.17 | 2.64 | 2.25 | 35.6 | 22.4 |
Water | MI250 | 40.2 | 39.6 | 7.74 | 3.96 | 1.74 | 1.41 | 6.03 | 3.20 | 2.00 | 1.54 | 30.5 | 18.8 | 3.51 | 2.64 | 55.0 | 30.2 |
Water | VU9P | ⋯ | ⋯ | 0.306 | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ |
Cu | EPYC | 1.14 | 0.702 | 22.2 | 9.38 | 3.43 | 2.04 | 11.9 | 5.28 | 3.09 | 1.56 | 47.9 | 19.5 | 4.20 | 2.73 | 200 | 62.1 |
Cu | 3080 Ti | 14.9 | 8.98 | 30.5 | 4.18 | 8.52 | 1.51 | 18.8 | 3.15 | 7.98 | 1.81 | 74.6 | 11.2 | 14.7 | 2.32 | 294 | 33.0 |
Cu | V100 | 15.7 | 15.7 | 8.73 | 4.81 | 1.56 | 1.27 | 5.71 | 3.18 | 1.84 | 1.38 | 24.3 | 12.2 | 2.60 | 1.83 | 91.1 | 37.3 |
Cu | A100 | 36.9 | 36.9 | 4.41 | 2.65 | 1.36 | 1.15 | 3.35 | 2.15 | 1.63 | 1.42 | 13.5 | 7.49 | 2.15 | 1.78 | 36.2 | 21.0 |
Cu | MI250 | 39.0 | 39.1 | 8.27 | 4.13 | 1.37 | 1.21 | 5.62 | 2.98 | 1.59 | 1.35 | 26.9 | 12.6 | 2.56 | 2.00 | 55.4 | 29.5 |
Cu | VU9P | ⋯ | ⋯ | 0.310 | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ |
HEA | EPYC | ⋯ | ⋯ | 32.8 | 13.0 | 7.04 | 4.58 | 15.3 | 7.64 | 6.83 | 3.80 | 81.0 | 33.4 | 8.56 | 5.68 | 156 | 45.9 |
HEA | 3080 Ti | ⋯ | ⋯ | 65.3 | 9.72 | 10.5 | 2.51 | 36.1 | 6.83 | 11.9 | 3.24 | 171 | 24.9 | 29.6 | 5.37 | 290 | 32.8 |
HEA | V100 | ⋯ | ⋯ | 20.1 | 10.9 | 2.88 | 2.39 | 12.3 | 6.86 | 12.3 | 2.85 | 55.2 | 28.4 | 9.42 | 5.47 | 91.2 | 37.4 |
HEA | A100 | ⋯ | ⋯ | 10.4 | 6.09 | 2.13 | 1.83 | 7.25 | 5.48 | 2.98 | 2.83 | 30.1 | 17.1 | 4.21 | 4.22 | 35.0 | 20.0 |
HEA | MI250 | ⋯ | ⋯ | 20.1 | 11.6 | 4.57 | 4.22 | 16.2 | 12.0 | 7.01 | 6.44 | 76.0 | 44.9 | 9.09 | 7.61 | 55.7 | 30.5 |