2020 Aug 1;20:100405. doi: 10.1016/j.imu.2020.100405

Table 2.

“Conv” denotes a standard convolution; “t1” and “t2” denote convolution strides of 1 × 1 and 2 × 2, respectively. “dw” stands for depthwise separable convolution, which is made up of two layers: (i) a depthwise convolution, which applies a single filter to each input channel, and (ii) a 1 × 1 pointwise convolution, which creates a linear combination of the outputs of the depthwise layer.
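
The parameter saving behind this two-layer factorization can be illustrated with a minimal sketch. It counts weights only (biases and batch-normalization parameters are ignored, which is an assumption of this example), and the function names are hypothetical helpers rather than part of any library. The 3 × 3 kernel with 64 input and 128 output channels is taken from the "3 × 3 × 64 dw" and "1 × 1 × 64 × 128" rows of the table.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k layer plus a 1 x 1 pointwise layer."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


std = standard_conv_params(3, 64, 128)
sep = depthwise_separable_params(3, 64, 128)
print(std, sep, round(sep / std, 3))  # prints: 73728 8768 0.119
```

For this layer pair, the separable form needs roughly 12% of the weights of a standard 3 × 3 convolution with the same input and output channels.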

| Type | Shape | Input size |
|---|---|---|
| Conv/t2 | 3 × 3 × 3 × 32 | 224 × 224 × 3 |
| Conv/t1 | 3 × 3 × 32 dw | 112 × 112 × 32 |
| Conv/t1 | 1 × 1 × 32 × 64 | 112 × 112 × 32 |
| Conv/t2 | 3 × 3 × 64 dw | 112 × 112 × 64 |
| Conv/t1 | 1 × 1 × 64 × 128 | 56 × 56 × 64 |
| Conv/t1 | 3 × 3 × 128 dw | 56 × 56 × 128 |
| Conv/t1 | 1 × 1 × 128 × 128 | 56 × 56 × 128 |
| Conv/t2 | 3 × 3 × 128 dw | 56 × 56 × 128 |
| Conv/t1 | 1 × 1 × 128 × 256 | 28 × 28 × 128 |
| Conv/t1 | 3 × 3 × 256 dw | 28 × 28 × 256 |
| Conv/t1 | 1 × 1 × 256 × 256 | 28 × 28 × 256 |
| Conv/t2 | 3 × 3 × 256 dw | 28 × 28 × 256 |
| Conv/t1 | 1 × 1 × 256 × 512 | 14 × 14 × 256 |
| 5 × Conv/t1 | 3 × 3 × 512 dw | 14 × 14 × 512 |
| 5 × Conv/t1 | 1 × 1 × 512 × 512 | 14 × 14 × 512 |
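
The "Input size" column can be cross-checked by walking through the layers in order. The sketch below is illustrative only: it assumes "same" padding, so that each layer's output side length is ceil(input / stride), which the table does not state explicitly. The labels `conv`, `dw`, and `pw` and the function `trace_shapes` are names introduced for this example.

```python
import math

# (label, stride, input channels, output channels), transcribed from Table 2.
# "dw" = depthwise layer, "pw" = 1 x 1 pointwise layer (labels are ours).
LAYERS = [
    ("conv", 2, 3, 32),
    ("dw", 1, 32, 32),
    ("pw", 1, 32, 64),
    ("dw", 2, 64, 64),
    ("pw", 1, 64, 128),
    ("dw", 1, 128, 128),
    ("pw", 1, 128, 128),
    ("dw", 2, 128, 128),
    ("pw", 1, 128, 256),
    ("dw", 1, 256, 256),
    ("pw", 1, 256, 256),
    ("dw", 2, 256, 256),
    ("pw", 1, 256, 512),
] + [("dw", 1, 512, 512), ("pw", 1, 512, 512)] * 5  # the repeated 5 x block


def trace_shapes(layers, size=224, channels=3):
    """Return each layer's input shape and the final output shape.

    Assumes "same" padding: output side = ceil(input side / stride).
    """
    input_shapes = []
    for label, stride, c_in, c_out in layers:
        if c_in != channels:
            raise ValueError(f"{label}: expected {channels} channels, got {c_in}")
        input_shapes.append((size, size, channels))
        size = math.ceil(size / stride)
        channels = c_out
    return input_shapes, (size, size, channels)


input_shapes, final_shape = trace_shapes(LAYERS)
for (label, stride, _, _), shape in zip(LAYERS, input_shapes):
    print(f"{label}/t{stride}: input {shape[0]} x {shape[1]} x {shape[2]}")
print("final output:", final_shape)
```

Under this padding assumption the traced input shapes agree with the table, and the listed layers end at a 14 × 14 × 512 feature map.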