2017 Nov 8;2017:8348671. doi: 10.1155/2017/8348671

Table 3.

Performance of cuDNN SGEMM versus that of the 3D WMFA on 3D convolution layers. Performance is measured in effective TFLOPS.

Layer   C × D × H × W × N          K     cuDNN SGEMM (TFLOPS)   3D WMFA (TFLOPS)   Speedup
conv2   32 × 16 × 56 × 56 × 32     64    1.21                   1.28               1.05
conv3   64 × 8 × 28 × 28 × 32      256   2.38                   3.31               1.39
conv4   256 × 4 × 14 × 14 × 32     256   2.4                    4.72               1.96
conv5   256 × 2 × 7 × 7 × 32       256   1.46                   2.1                1.44
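
The Speedup column appears to be the ratio of the 3D WMFA throughput to the cuDNN SGEMM throughput. The short sketch below recomputes that ratio from the TFLOPS values printed in Table 3. It assumes speedup is defined as this simple ratio, which the table does not state explicitly; the names `table3` and `speedup` are illustrative only. Because the published TFLOPS figures are already rounded to two decimals, the recomputed ratios can differ from the reported ones in the last digit.

```python
# Values copied from Table 3: (cuDNN SGEMM TFLOPS, 3D WMFA TFLOPS, reported speedup).
table3 = {
    "conv2": (1.21, 1.28, 1.05),
    "conv3": (2.38, 3.31, 1.39),
    "conv4": (2.40, 4.72, 1.96),
    "conv5": (1.46, 2.10, 1.44),
}


def speedup(baseline_tflops, wmfa_tflops):
    """Assumed definition: WMFA throughput divided by baseline throughput."""
    return wmfa_tflops / baseline_tflops


# Recompute the ratio for every layer from the rounded table entries.
recomputed = {
    layer: speedup(baseline, wmfa)
    for layer, (baseline, wmfa, _reported) in table3.items()
}

for layer, (_baseline, _wmfa, reported) in table3.items():
    print(f"{layer}: recomputed {recomputed[layer]:.3f}, reported {reported:.2f}")
```

Under this assumption, each recomputed ratio agrees with the reported speedup to within 0.01, which is consistent with rounding in the published figures.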