2017 Jun 30;17(7):1534. doi: 10.3390/s17071534

Table 2.

Feature map size, number and size of filters, stride, and padding for each layer of our deep residual CNN structure. 3* indicates that 3 pixels of padding are added on the left, right, top, and bottom of the 224 × 224 × 3 input image; 1* indicates that 1 pixel of padding is added on the left, right, top, and bottom of the feature map; 2/1** indicates a stride of 2 in the 1st iteration and 1 from the 2nd iteration onward.

| Group | Layer Name | Size of Feature Map | Number of Filters | Size of Filters | Number of Strides | Amount of Padding | Number of Iterations |
|---|---|---|---|---|---|---|---|
| | Image input layer | 224 (height) × 224 (width) × 3 (channel) | | | | | |
| | Conv1 | 112 × 112 × 64 | 64 | 7 × 7 × 3 | 2 | 3* | 1 |
| | Max pool | 56 × 56 × 64 | 1 | 3 × 3 | 2 | 0 | 1 |
| Conv2 | Conv2-1 | 56 × 56 × 64 | 64 | 1 × 1 × 64 | 1 | 0 | 3 |
| | Conv2-2 | 56 × 56 × 64 | 64 | 3 × 3 × 64 | 1 | 1* | |
| | Conv2-3 | 56 × 56 × 256 | 256 | 1 × 1 × 64 | 1 | 0 | |
| | Conv2-4 (Shortcut) | 56 × 56 × 256 | 256 | 1 × 1 × 64 | 1 | 0 | |
| Conv3 | Conv3-1 | 28 × 28 × 128 | 128 | 1 × 1 × 256 | 2/1** | 0 | 4 |
| | Conv3-2 (Bottleneck) | 28 × 28 × 128 | 128 | 3 × 3 × 128 | 1 | 1* | |
| | Conv3-3 | 28 × 28 × 512 | 512 | 1 × 1 × 128 | 1 | 0 | |
| | Conv3-4 (Shortcut) | 28 × 28 × 512 | 512 | 1 × 1 × 256 | 2 | 0 | |
| Conv4 | Conv4-1 | 14 × 14 × 256 | 256 | 1 × 1 × 512 | 2/1** | 0 | 6 |
| | Conv4-2 (Bottleneck) | 14 × 14 × 256 | 256 | 3 × 3 × 256 | 1 | 1* | |
| | Conv4-3 | 14 × 14 × 1024 | 1024 | 1 × 1 × 256 | 1 | 0 | |
| | Conv4-4 (Shortcut) | 14 × 14 × 1024 | 1024 | 1 × 1 × 512 | 2 | 0 | |
| Conv5 | Conv5-1 | 7 × 7 × 512 | 512 | 1 × 1 × 1024 | 2/1** | 0 | 3 |
| | Conv5-2 (Bottleneck) | 7 × 7 × 512 | 512 | 3 × 3 × 512 | 1 | 1* | |
| | Conv5-3 | 7 × 7 × 2048 | 2048 | 1 × 1 × 512 | 1 | 0 | |
| | Conv5-4 (Shortcut) | 7 × 7 × 2048 | 2048 | 1 × 1 × 1024 | 2 | 0 | |
| | AVG pool | 1 × 1 × 2048 | 1 | 7 × 7 | 1 | 0 | 1 |
| | FC layer | 2 | | | | | 1 |
| | Softmax | 2 | | | | | 1 |
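
The feature map sizes in Table 2 follow from the filter size, stride, and padding columns. As a minimal sketch (not the authors' code), the Python below applies the standard output-size formula, out = floor((in + 2 × padding − kernel) / stride) + 1, along one spatial axis. The names `output_size` and `trace_sizes` are hypothetical and introduced only for illustration. One assumption is worth flagging: with zero padding, the 3 × 3, stride-2 max pool yields 56 only if the division is rounded up, as some frameworks do for pooling, so the sketch uses ceiling rounding for that layer.

```python
import math


def output_size(in_size, kernel, stride, padding, ceil_mode=False):
    """Spatial output size of a convolution or pooling layer along one axis."""
    span = (in_size + 2 * padding - kernel) / stride
    steps = math.ceil(span) if ceil_mode else math.floor(span)
    return steps + 1


def trace_sizes(input_size=224):
    """Follow the height (or width) through the down-sampling layers of Table 2."""
    sizes = {}
    size = output_size(input_size, kernel=7, stride=2, padding=3)
    sizes["Conv1"] = size
    # Assumption: ceiling rounding, needed to reach 56 with zero padding.
    size = output_size(size, kernel=3, stride=2, padding=0, ceil_mode=True)
    sizes["Max pool"] = size
    # Conv2 uses stride 1 with 1x1 (padding 0) and 3x3 (padding 1) filters,
    # so the spatial size is unchanged.
    sizes["Conv2"] = output_size(size, kernel=3, stride=1, padding=1)
    # Conv3 to Conv5: the 1x1 convolution of the first iteration has stride 2.
    for name in ("Conv3", "Conv4", "Conv5"):
        size = output_size(size, kernel=1, stride=2, padding=0)
        sizes[name] = size
    sizes["AVG pool"] = output_size(size, kernel=7, stride=1, padding=0)
    return sizes


# Weighted layers on the main path: Conv1, three convolutions per iteration
# of Conv2 to Conv5 (shortcut convolutions not counted), and the FC layer.
iterations = {"Conv2": 3, "Conv3": 4, "Conv4": 6, "Conv5": 3}
weighted_layers = 1 + 3 * sum(iterations.values()) + 1

if __name__ == "__main__":
    for layer, size in trace_sizes().items():
        print(f"{layer}: {size} x {size}")
    print("weighted layers:", weighted_layers)
```

Under these assumptions the traced sizes agree with the "Size of Feature Map" column, and the layer count on the main path comes to 50.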