Table 1.
Time performance. A value in parentheses after a time is the speedup relative to CPU (np).
Platform | Forward projection (s) | Backward projection (s) | Iteration (s) | Improvement factor |
---|---|---|---|---|
CPU (np) | 417 | 488 | 905 | 1 |
CPU (MPI-1) | 523 (0.79) | 748 (0.65) | 1,272 | 0.71 |
CPU (MPI-2) | 315 (1.32) | 560 (0.87) | 875 | 1.03 |
CPU (MPI-4) | 214 (1.94) | 357 (1.36) | 571 | 1.58 |
CPU (MPI-8) | 209 (1.99) | 316 (1.54) | 526 | 1.72 |
GPU (Overwrite hazard ignored) | 79 (5.27) | 121 (4.03) | 200 | 4.52 |
GPU (SRMreload) | (58+)97 (2.7) | (57+)87 (3.38) | (115+)184 | 3.03 |
GPU (SRMatomic) | 79 (5.27) | 232 (2.10) | 311 | 2.90 |
GPU (SRMrand-at) | 79 (5.27) | 132 (3.69) | 211 | 4.28 |
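
The speedups and improvement factors in Table 1 can be re-derived from the raw times. The sketch below is illustrative only: it assumes each factor is the CPU (np) time divided by the platform time, and that the parenthesized term in the SRMreload row (for example `(58+)97`) is an overhead added to the projection time. Both assumptions are consistent with the reported values, but neither is stated in the table itself. The names `BASELINE`, `TIMES`, and `speedups` are hypothetical.

```python
# Times in seconds copied from Table 1 as (forward, backward) per platform.
BASELINE = (417, 488)  # CPU (np)

TIMES = {
    "CPU (MPI-1)": (523, 748),
    "CPU (MPI-2)": (315, 560),
    "CPU (MPI-4)": (214, 357),
    "CPU (MPI-8)": (209, 316),
    "GPU (Overwrite hazard ignored)": (79, 121),
    # Assumption: the "(58+)" and "(57+)" terms are overheads added to the time.
    "GPU (SRMreload)": (58 + 97, 57 + 87),
    "GPU (SRMatomic)": (79, 232),
    "GPU (SRMrand-at)": (79, 132),
}


def speedups(forward, backward, baseline=BASELINE):
    """Return (forward, backward, iteration) speedups relative to the baseline.

    The iteration time is taken as forward + backward, which is an assumption
    that matches the Iteration column to within rounding.
    """
    base_fwd, base_bwd = baseline
    return (
        base_fwd / forward,
        base_bwd / backward,
        (base_fwd + base_bwd) / (forward + backward),
    )


if __name__ == "__main__":
    for name, (fwd, bwd) in TIMES.items():
        s_fwd, s_bwd, s_iter = speedups(fwd, bwd)
        print(f"{name}: forward x{s_fwd:.2f}, backward x{s_bwd:.2f}, iteration x{s_iter:.2f}")
```

The recomputed factors agree with the table to within rounding of the last printed digit, which supports reading the Improvement factor column as the ratio of iteration times.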