Sci Rep. 2019 Apr 23;9:6381. doi: 10.1038/s41598-019-42294-8

Table 1.

Architecture of the original, off-the-shelf, and fine-tuned ResNet-50.

| Layer name | Output size | Original 50-layer | Off-the-shelf | Fine-tuned |
|---|---|---|---|---|
| conv1 | 112 × 112 | 7 × 7, 64-d, stride 2 | same | fine-tuned |
| pooling1 | 56 × 56 | 3 × 3, 64-d, max pool, stride 2 | same | same |
| conv2_x | 56 × 56 | [1 × 1, 64-d, stride 1; 3 × 3, 64-d, stride 1; 1 × 1, 256-d, stride 1] × 3 | same | fine-tuned |
| conv3_0 | 28 × 28 | [1 × 1, 128-d, stride 2; 3 × 3, 128-d, stride 1; 1 × 1, 512-d, stride 1] | same | fine-tuned |
| conv3_x | 28 × 28 | [1 × 1, 128-d, stride 1; 3 × 3, 128-d, stride 1; 1 × 1, 512-d, stride 1] × 3 | same | fine-tuned |
| conv4_0 | 14 × 14 | [1 × 1, 256-d, stride 2; 3 × 3, 256-d, stride 1; 1 × 1, 1024-d, stride 1] | same | fine-tuned |
| conv4_x | 14 × 14 | [1 × 1, 256-d, stride 1; 3 × 3, 256-d, stride 1; 1 × 1, 1024-d, stride 1] × 5 | same | fine-tuned |
| conv5_0 | 7 × 7 | [1 × 1, 512-d, stride 2; 3 × 3, 512-d, stride 1; 1 × 1, 2048-d, stride 1] | same | fine-tuned |
| conv5_x | 7 × 7 | [1 × 1, 512-d, stride 1; 3 × 3, 512-d, stride 1; 1 × 1, 2048-d, stride 1] × 2 | same | fine-tuned |
| pooling2 | 1 × 1 | 7 × 7, 2048-d, average pool, stride 1 | same | same |
| dense | 1 × 1 | 1000-d, dense-layer | 15-d, dense-layer | 15-d, dense-layer |
| loss | 1 × 1 | 1000-d, softmax | 15-d, sigmoid, BCE | 15-d, sigmoid, BCE |

In our experiments, we use the ResNet-50 architecture, and this table shows the differences between the original architecture and our two variants (off-the-shelf and fine-tuned ResNet-50). Where a layer is unchanged from the original network, the word “same” is written in the table. The violet, bold text highlights which parts of the network are changed for our application. All layers employ automatic padding (i.e., padding chosen depending on the kernel size) to keep the spatial size the same. The conv3_0, conv4_0, and conv5_0 layers down-sample the spatial size with a stride of 2.
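As a concrete illustration of the head replacement in Table 1, the following is a minimal PyTorch sketch, assuming a 224 × 224 input and torchvision's stock ResNet-50; it is not the authors' code, and the helper name build_model is our own. It swaps the 1000-d softmax head for a 15-d dense layer trained with a sigmoid/BCE loss, freezing the pretrained backbone for the off-the-shelf variant and leaving it trainable for the fine-tuned one.

```python
# Sketch of the two ResNet-50 variants in Table 1 (illustrative, not the
# authors' implementation).
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 15  # 15-d output as in Table 1


def build_model(fine_tune: bool) -> nn.Module:
    # Load ResNet-50 with ImageNet-pretrained weights.
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    if not fine_tune:
        # "Off-the-shelf": freeze all pretrained layers; only the new head trains.
        for param in model.parameters():
            param.requires_grad = False
    # Replace the 1000-d ImageNet dense layer with a 15-d dense layer.
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
    return model


# Sigmoid + binary cross-entropy for multi-label targets; BCEWithLogitsLoss
# fuses the sigmoid into the loss for numerical stability.
criterion = nn.BCEWithLogitsLoss()

model = build_model(fine_tune=True)
x = torch.randn(2, 3, 224, 224)  # 224 x 224 input -> 112 x 112 after conv1, as in Table 1
logits = model(x)                # shape (2, 15)
targets = torch.randint(0, 2, (2, NUM_CLASSES)).float()
loss = criterion(logits, targets)
```

Fusing the sigmoid into BCEWithLogitsLoss mirrors the "15-d, sigmoid, BCE" loss row of the table while avoiding a separate sigmoid layer at training time; at inference, a sigmoid is applied to the logits to obtain per-class probabilities.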