Table 7. Scaling Across Multiple CPU Instances^a
instances | total vCPUs | ranks × threads | PME ranks | MEM (ns/d) | E_MEM | RIB (ns/d) | E_RIB |
---|---|---|---|---|---|---|---|
1 | 96 | 48 × 2/96 × 1 | 0/24 | 127.5 | 1.00 | 10.81 | 1.00 |
2 | 192 | 48 × 4/192 × 1 | 12/48 | 158.8 | 0.62 | 20.35 | 0.94 |
4 | 384 | 48 × 8/384 × 1 | 12/96 | 201.1 | 0.39 | 37.63 | 0.87 |
8 | 768 | 384 × 2 | 96 | 182.9 | 0.18 | 59.93 | 0.69 |
16 | 1536 | 128 × 12/384 × 4 | 32/96 | 151.9 | 0.07 | 87.21 | 0.50 |
32 | 3072 | 384 × 8/768 × 4 | 96/192 | 144.3 | 0.04 | 115.49 | 0.33 |
^a GROMACS 2020 performance for MEM and RIB over multiple hpc6a instances. The third column lists the optimal decomposition into MPI ranks and OpenMP threads, and the fourth column lists the optimal number of separate PME ranks. Where a cell holds two entries separated by a slash, the left entry is for MEM and the right entry is for RIB; a single entry applies to both.
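The efficiency columns can be reproduced from the throughput columns. The following is a minimal sketch, assuming that efficiency is defined as the performance on n instances divided by n times the single-instance performance; this definition is not stated in the table itself, but it matches the tabulated values. The function and variable names are illustrative only.

```python
# Throughput in ns/day for each instance count, copied from Table 7.
instances = [1, 2, 4, 8, 16, 32]
mem_ns_per_day = [127.5, 158.8, 201.1, 182.9, 151.9, 144.3]
rib_ns_per_day = [10.81, 20.35, 37.63, 59.93, 87.21, 115.49]


def parallel_efficiency(perf, counts):
    """Return E_n = P_n / (n * P_1), rounded to two decimals.

    Assumed definition: the first entry is taken as the single-instance
    baseline, so its efficiency is 1.00 by construction.
    """
    base = perf[0] / counts[0]  # performance per instance at the baseline
    return [round(p / (n * base), 2) for p, n in zip(perf, counts)]


e_mem = parallel_efficiency(mem_ns_per_day, instances)
e_rib = parallel_efficiency(rib_ns_per_day, instances)

for n, em, er in zip(instances, e_mem, e_rib):
    print(f"{n:>2} instances: E_MEM = {em:.2f}, E_RIB = {er:.2f}")
```

Under this assumed definition, the computed values agree with the efficiency columns of the table to two decimals. They also show why the smaller MEM system stops scaling beyond 4 instances, whereas RIB still gains throughput at 32 instances.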