2015 Aug 4;36(26):1990–2008. doi: 10.1002/jcc.24030

Table 11.

Scaling of the MEM benchmark on different node types with performance P and parallel efficiency E.

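| No. of nodes | Processor(s) Intel | GPUs, IB | DD grid x | DD grid y | DD grid z | N_PME/node | N_th | DLB | P (ns/d) | E |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | E3‐1270v2 (4 cores) | 770, QDR[a] | 1 | 1 | 1 | | 8 (ht) | | 20.5 | 1 |
| 2 | | | 2 | 1 | 1 | | 8 (ht) | (✓) | 27.2 | 0.66 |
| 4 | | | 4 | 1 | 1 | | 8 (ht) | (✓) | 22.1 | 0.27 |
| 8 | | | 8 | 1 | 1 | | 8 (ht) | (✓) | 68.3 | 0.42 |
| 16 | | | 16 | 1 | 1 | | 8 (ht) | (✓) | 85.7 | 0.26 |
| 32 | | | 8 | 4 | 1 | | 8 (ht) | (✓) | 119 | 0.18 |
| 1 | E5‐1620 (4 cores) | 680, QDR | 1 | 1 | 1 | | 8 ht | | 21 | 1 |
| 2 | | | 2 | 1 | 1 | | 8 ht | (✓) | 29 | 0.69 |
| 4 | | | 4 | 1 | 1 | | 8 ht | (✓) | 46.9 | 0.56 |
| 1 | E5‐2670v2 (2×10 cores) | 780Ti×2, QDR | 10 | 1 | 1 | | 4 ht | | 56.9 | 1 |
| 2 | | | 4 | 5 | 1 | | 2 | | 74.2 | 0.65 |
| 4 | | | 8 | 1 | 1 | 2 | 5 | | 103.4 | 0.45 |
| 8 | | | 8 | 1 | 2 | 2 | 5 | | 119.1 | 0.26 |
| 16 | | | 8 | 4 | 1 | 2 | 5 | | 164.8 | 0.18 |
| 32 | | | 8 | 8 | 1 | 2 | 5 | | 193.1 | 0.11 |
| 1 | E5‐2670v2 (2×10 cores) | 980×2, QDR | 10 | 1 | 1 | | 4 ht | (✓) | 58 | 1 |
| 2 | | | 4 | 5 | 1 | | 2 | (✓) | 75.6 | 0.65 |
| 4 | | | 8 | 5 | 1 | | 2 | | 96.6 | 0.42 |
| 1 | E5‐2680v2 (2×10 cores) | FDR‐14 | 8 | 2 | 2 | 8 | 1 ht | | 26.8 | 1 |
| 2 | | | 4 | 5 | 3 | 10 | 1 ht | | 42 | 0.78 |
| 4 | | | 8 | 5 | 3 | 10 | 1 ht | | 76.3 | 0.71 |
| 8 | | | 8 | 7 | 2 | 6 | 2 ht | | 122 | 0.57 |
| 16 | | | 8 | 8 | 4 | 4 | 1 | | 162 | 0.38 |
| 32 | | | 8 | 8 | 8 | 4 | 1 | | 209 | 0.24 |
| 64 | | | 10 | 8 | 6 | 2.5 | 2 | | 240 | 0.14 |
| 1 | E5‐2680v2 (2×10 cores) | K20X×2 (732 MHz), FDR‐14 | 8 | 1 | 1 | | 5 ht | | 55.2 | 1 |
| 2 | | | 4 | 5 | 1 | | 4 ht | | 74.5 | 0.67 |
| 4 | | | 8 | 1 | 2 | | 5 | | 118 | 0.53 |
| 8 | | | 8 | 1 | 2 | 2 | 5 | | 163 | 0.37 |
| 16 | | | 8 | 4 | 1 | 2 | 5 | | 226 | 0.26 |
| 32 | | | 8 | 8 | 1 | 2 | 5 | | 304 | 0.17 |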

A black “ht” symbol indicates that using all hyperthreading cores resulted in the fastest execution; otherwise, using only the physical core count was more advantageous. A gray “(ht)” denotes that this benchmark was run only with the hyperthreading core count (= 2 × physical).

[a] These nodes cannot use the full QDR IB bandwidth because they have too few PCIe lanes; see the “Strong Scaling” section.
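The E column appears to follow the usual strong-scaling definition, E = P_N / (N × P_1), where P_N is the performance on N nodes and P_1 the single-node performance of the same node type. The table does not state this formula, so treat it as an assumption; the helper name `parallel_efficiency` is illustrative only. A minimal sketch that recomputes E from the P values of the K20X×2 rows:

```python
def parallel_efficiency(p_n: float, n_nodes: int, p_1: float) -> float:
    """Assumed definition: E = P_N / (N * P_1)."""
    return p_n / (n_nodes * p_1)


# Performance P (ns/d) per node count, copied from the K20X×2 rows of Table 11.
k20x_performance = {1: 55.2, 2: 74.5, 4: 118.0, 8: 163.0, 16: 226.0, 32: 304.0}

p_single = k20x_performance[1]
efficiencies = {
    n: round(parallel_efficiency(p, n, p_single), 2)
    for n, p in k20x_performance.items()
}

for n, e in efficiencies.items():
    print(f"{n:>2} nodes: E = {e:.2f}")
```

Under this assumed definition, the recomputed values match the E column for these rows to two decimal places.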