Table 1. Execution times of the CUDA implementation versus the C-based sequential implementation.
Image size | CPU/GPU | Phase extraction (ms) | Residue identification (ms) | Branch-cut placement (ms) | Unwrap (ms) | Total (ms)
---|---|---|---|---|---|---
1024 × 1024, 1 frame | CPU | 317.42 | 43.42 | 6.74 | 89.32 | 460.7
1024 × 1024, 1 frame | GPU | 5.05 | 0.58 | 1.125 | 10.014 | 24.55
1024 × 1024, 1 frame | Speedup factor | 62.86 | 74.19 | 5.99 | 8.92 | 18.77
1024 × 1024, 10 frames | CPU | 3174.2 | 434.2 | 67.4 | 893.2 | 4607.4
1024 × 1024, 10 frames | GPU | 40.486 | 5.55 | 1.128 | 45.285 | 111.1
1024 × 1024, 10 frames | Speedup factor | 78.4 | 78.19 | 59.71 | 19.72 | 41.47
512 × 512, 1 frame | CPU | 71 | 11 | 5 | 16 | 105
512 × 512, 1 frame | GPU | 2.18 | 0.2 | 0.02 | 1.87 | 8
512 × 512, 1 frame | Speedup factor | 32.61 | 55.84 | 250 | 8.55 | 13.13
512 × 512, 10 frames | CPU | 710 | 110 | 50 | 160 | 1050
512 × 512, 10 frames | GPU | 11.57 | 1.4 | 0.02 | 6.722 | 26
512 × 512, 10 frames | Speedup factor | 61.37 | 78.57 | 2500 | 23.8 | 40.38
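
The speedup rows appear to be the ratio of CPU time to GPU time for the same stage and image configuration. The table does not state this definition, so it is an assumption here, and the symbols $S$, $T_{\mathrm{CPU}}$ and $T_{\mathrm{GPU}}$ are notation introduced only for illustration. Under that assumption, the phase-extraction entry for a single 1024 × 1024 frame works out as:

$$
S = \frac{T_{\mathrm{CPU}}}{T_{\mathrm{GPU}}} = \frac{317.42\ \text{ms}}{5.05\ \text{ms}} \approx 62.86
$$

The same ratio applied to the Total column of the 10-frame 1024 × 1024 case gives $4607.4 / 111.1 \approx 41.47$, matching the tabulated value.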