Table 11.
No. of nodes | Processor(s) (Intel) | GPUs, IB | DD grid x | DD grid y | DD grid z | N_PME/node | N_th | ht | DLB | P (ns/d) | E |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | E3‐1270v2 (4 cores) | 770, QDR[a] | 1 | 1 | 1 | – | 8 | (ht) | – | 20.5 | 1 |
2 | | | 2 | 1 | 1 | – | 8 | (ht) | (✓) | 27.2 | 0.66 |
4 | | | 4 | 1 | 1 | – | 8 | (ht) | (✓) | 22.1 | 0.27 |
8 | | | 8 | 1 | 1 | – | 8 | (ht) | (✓) | 68.3 | 0.42 |
16 | | | 16 | 1 | 1 | – | 8 | (ht) | (✓) | 85.7 | 0.26 |
32 | | | 8 | 4 | 1 | – | 8 | (ht) | (✓) | 119 | 0.18 |
1 | E5‐1620 (4 cores) | 680, QDR | 1 | 1 | 1 | – | 8 | ht | – | 21 | 1 |
2 | | | 2 | 1 | 1 | – | 8 | ht | (✓) | 29 | 0.69 |
4 | | | 4 | 1 | 1 | – | 8 | ht | (✓) | 46.9 | 0.56 |
1 | E5‐2670v2 (2×10 cores) | 780Ti×2, QDR | 10 | 1 | 1 | – | 4 | ht | ✓ | 56.9 | 1 |
2 | | | 4 | 5 | 1 | – | 2 | | ✓ | 74.2 | 0.65 |
4 | | | 8 | 1 | 1 | 2 | 5 | | ✗ | 103.4 | 0.45 |
8 | | | 8 | 1 | 2 | 2 | 5 | | ✗ | 119.1 | 0.26 |
16 | | | 8 | 4 | 1 | 2 | 5 | | ✗ | 164.8 | 0.18 |
32 | | | 8 | 8 | 1 | 2 | 5 | | ✗ | 193.1 | 0.11 |
1 | E5‐2670v2 (2×10 cores) | 980×2, QDR | 10 | 1 | 1 | – | 4 | ht | (✓) | 58 | 1 |
2 | | | 4 | 5 | 1 | – | 2 | | (✓) | 75.6 | 0.65 |
4 | | | 8 | 5 | 1 | – | 2 | | ✗ | 96.6 | 0.42 |
1 | E5‐2680v2 (2×10 cores) | –, FDR‐14 | 8 | 2 | 2 | 8 | 1 | ht | ✓ | 26.8 | 1 |
2 | | | 4 | 5 | 3 | 10 | 1 | ht | ✓ | 42 | 0.78 |
4 | | | 8 | 5 | 3 | 10 | 1 | ht | ✓ | 76.3 | 0.71 |
8 | | | 8 | 7 | 2 | 6 | 2 | ht | ✓ | 122 | 0.57 |
16 | | | 8 | 8 | 4 | 4 | 1 | | ✓ | 162 | 0.38 |
32 | | | 8 | 8 | 8 | 4 | 1 | | ✓ | 209 | 0.24 |
64 | | | 10 | 8 | 6 | 2.5 | 2 | | ✓ | 240 | 0.14 |
1 | E5‐2680v2 (2×10 cores) | K20X×2 (732 MHz), FDR‐14 | 8 | 1 | 1 | – | 5 | ht | ✓ | 55.2 | 1 |
2 | | | 4 | 5 | 1 | – | 4 | ht | ✓ | 74.5 | 0.67 |
4 | | | 8 | 1 | 2 | – | 5 | | ✓ | 118 | 0.53 |
8 | | | 8 | 1 | 2 | 2 | 5 | | ✗ | 163 | 0.37 |
16 | | | 8 | 4 | 1 | 2 | 5 | | ✗ | 226 | 0.26 |
32 | | | 8 | 8 | 1 | 2 | 5 | | ✗ | 304 | 0.17 |
A black “ht” symbol indicates that using all hyperthreading cores resulted in the fastest execution; otherwise, using only the physical core count was more advantageous. A gray “(ht)” denotes that this benchmark was run only with the hyperthreading core count (= 2 × physical).
[a] These nodes cannot use the full QDR IB bandwidth due to an insufficient number of PCIe lanes; see the “Strong Scaling” section.
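
The parallel efficiency column E can be cross-checked against the performance column P. The sketch below assumes the usual definition E = P_N / (N × P_1), where P_1 is the single-node performance of the same node type; this definition is not stated in the table itself, but it reproduces the tabulated values. The function and variable names are illustrative only, and the data are the E5‐2680v2 with K20X×2 rows from the table.

```python
def parallel_efficiency(p_single, n_nodes, p_multi):
    """Assumed definition: E = P_N / (N * P_1)."""
    return p_multi / (n_nodes * p_single)


# (number of nodes, P in ns/d) copied from the E5-2680v2 + K20X×2 rows
k20x_rows = [(1, 55.2), (2, 74.5), (4, 118.0), (8, 163.0), (16, 226.0), (32, 304.0)]

p1 = k20x_rows[0][1]  # single-node performance used as the reference
efficiencies = {
    n: round(parallel_efficiency(p1, n, p), 2) for n, p in k20x_rows
}

for n, e in efficiencies.items():
    print(f"{n:>2} nodes: E = {e:.2f}")
```

Rounded to two decimals, the computed values agree with the E column for this node type (1, 0.67, 0.53, 0.37, 0.26, 0.17).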