Table 2.
Task | CPU | GPU | Speed-up (CPU / GPU) |
---|---|---|---|
**Assembly of the system matrices** | | | |
*Compute element contributions* | | | |
Mesh level 1 | n.a.¹ | 0.04 ms | |
Mesh level 2 | n.a.¹ | 0.10 ms | |
Mesh level 3 | n.a.¹ | 0.66 ms | |
*Convert to CRS format* | | | |
Mesh level 1 | n.a.¹ | 0.72 ms | |
Mesh level 2 | n.a.¹ | 2.06 ms | |
Mesh level 3 | n.a.¹ | 13.17 ms | |
*Total* | | | |
Mesh level 1 | 0.63 ms | 0.76 ms | 0.83 |
Mesh level 2 | 6.37 ms | 2.16 ms | 2.95 |
Mesh level 3 | 121.97 ms | 13.83 ms | 8.82 |
**Solution of the linear systems** | | | |
*PBCG without multigrid* | | | |
Forward solution (Vx, Vm) | 10.12 s | 736.64 ms | 13.73 |
Adjoint solution (Wx, Wm) | 8.84 s | 633.92 ms | 13.94 |
*PBCG with multigrid* | | | |
Forward solution (Vx, Vm) | 5.45 s | 461.60 ms | 11.81 |
Adjoint solution (Wx, Wm) | 6.02 s | 508.29 ms | 11.85 |
**Computation of measurements** | | | |
D⊤Vm | 496.40 ms | 6.53 ms | 76.01 |
**Assembly of the sensitivity matrix** | | | |
assembleSensitivity | 16.68 s | 1.03 s | 16.23 |
**Solution of the Gauß-Newton system** | 10.87 s | 331.92 ms | 32.75 |
**Total reconstruction time** | | | |
Without multigrid | 6.39 min | 27.39 s | 14.00 |
With multigrid | 5.41 min | 24.94 s | 13.01 |

¹ On the CPU, the system matrices are assembled in one step, so no separate timings for the two sub-steps are available.
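
The speed-up column is consistent with the ratio of CPU time to GPU time once both entries are converted to a common unit. The following minimal sketch illustrates that arithmetic for a few rows of the table; the helper names `to_seconds` and `speed_up` are hypothetical and introduced here only for illustration, and the ratio definition is an assumption inferred from the tabulated values rather than something stated in the source.

```python
# Conversion factors from the units used in the table to seconds.
UNIT_TO_SECONDS = {"ms": 1e-3, "s": 1.0, "min": 60.0}


def to_seconds(entry: str) -> float:
    """Parse a table entry such as '736.64 ms' or '6.39 min' into seconds."""
    value, unit = entry.split()
    return float(value) * UNIT_TO_SECONDS[unit]


def speed_up(cpu: str, gpu: str) -> float:
    """Assumed definition: CPU time divided by GPU time (> 1 means the GPU is faster)."""
    return to_seconds(cpu) / to_seconds(gpu)


# A few (CPU, GPU) pairs copied from the table above.
rows = {
    "Assembly total, mesh level 1": ("0.63 ms", "0.76 ms"),
    "Assembly total, mesh level 3": ("121.97 ms", "13.83 ms"),
    "Forward solution, PBCG without multigrid": ("10.12 s", "736.64 ms"),
    "Total reconstruction, without multigrid": ("6.39 min", "27.39 s"),
}

for task, (cpu, gpu) in rows.items():
    print(f"{task}: {speed_up(cpu, gpu):.2f}")
```

Because the tabulated times are themselves rounded, a ratio recomputed this way can differ from the printed speed-up in the last digit. A value below 1, as for the assembly on mesh level 1, means the GPU was slower than the CPU for that task.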