Table 2. Single GPU Throughput Timings (ns/day) for AMBER GB Simulations with a Time Step of 2 fs Using the Parallel CPU Version on One Node (12 Intel X5670 Cores or 32 AMD Opteron 6136 Cores) and the Serial GPU Version with the SPDP Precision Model on One Node (One Intel X5670 Core and One GPU)^a
CPU/GPU | TRPCage (304 atoms) | ubiquitin (1231 atoms) | apo-myoglobin (2492 atoms) | nucleosome (25 095 atoms) |
---|---|---|---|---|
GPU version | | | | |
M2090 (6 GB) | 399.9 | 184.2 | 78.1 | 1.42 |
C2070 (6 GB) | 364.1 | 157.2 | 64.3 | 1.09 |
C1060 (4 GB) | 234.6 | 78.3 | 31.5 | 0.40 |
GTX580 (1.5 GB, PNY XLR8) | 471.1 | 215.9 | 88.7 | – |
CPU version | | | | |
32 × Opteron 6136 | 225.0^b | 29.9 | 10.3 | 0.08 |
12 × X5670 | 247.1 | 19.8 | 6.6 | 0.07 |
^a For details on the hardware and software stack, see text. A dash indicates insufficient GPU memory for the simulation.
^b The CPU code requires >10 atoms per core, and thus the TRPCage simulation was run on 24 CPU cores.
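
The relative speedups implied by the table can be read off with a little arithmetic. The following is a minimal sketch, not part of the original work: it copies the ns/day values from Table 2 and divides each GPU rate by the faster of the two CPU rates for the same system. The names `SYSTEMS`, `GPU`, `CPU`, and `speedups` are hypothetical and chosen only for illustration.

```python
# Throughput values (ns/day) copied from Table 2.
# None marks the run that did not fit in GPU memory (the dash in the table).
SYSTEMS = ["TRPCage", "ubiquitin", "apo-myoglobin", "nucleosome"]

GPU = {
    "M2090": [399.9, 184.2, 78.1, 1.42],
    "C2070": [364.1, 157.2, 64.3, 1.09],
    "C1060": [234.6, 78.3, 31.5, 0.40],
    "GTX580": [471.1, 215.9, 88.7, None],
}

CPU = {
    "32 x Opteron 6136": [225.0, 29.9, 10.3, 0.08],
    "12 x X5670": [247.1, 19.8, 6.6, 0.07],
}


def speedups(gpu_rates, cpu_table):
    """Return each GPU rate divided by the fastest CPU rate for the same system."""
    result = []
    for i, rate in enumerate(gpu_rates):
        best_cpu = max(rates[i] for rates in cpu_table.values())
        result.append(None if rate is None else rate / best_cpu)
    return result


if __name__ == "__main__":
    for name, rates in GPU.items():
        cells = ["n/a" if r is None else f"{r:.1f}x" for r in speedups(rates, CPU)]
        print(name, dict(zip(SYSTEMS, cells)))
```

Comparing against the faster CPU node per system is a conservative choice; comparing against a fixed CPU node instead would only change which row of `CPU` is used as the denominator.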