2021 Mar 4;32(5):2535–2549. doi: 10.1109/TCSVT.2021.3063952

TABLE I. The Configuration Setting for Spatial-Channel-Attention ResNet.

Layer name | Output size | Configuration
input | Inline graphic | -
conv1 | Inline graphic | Inline graphic, 64, stride 2; Inline graphic, max pool, stride 2
conv2 | Inline graphic | Inline graphic
conv3 | Inline graphic | Inline graphic
conv4 | Inline graphic | Inline graphic
conv5 | Inline graphic | Inline graphic
hidden layer | 1024 | fc(1024, ReLU)
hidden layer | 512 | fc(512, ReLU)
hidden layer | 4 | fc(4, ReLU)
output | 4 | softmax

1 fc denotes a fully-connected layer.

2 conv denotes a convolution layer.

3 avg and max denote the average and max pooling operations, respectively.

4 channel Inline graphic and spatial Inline graphic denote the channel-wise and spatial attention, respectively.
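
The last four rows of Table I (three fully-connected hidden layers followed by a softmax output) can be illustrated with a minimal sketch. It is not the authors' implementation: the weights are random placeholders rather than trained parameters, the input feature size FEATURE_DIM is a hypothetical value because the table does not give the size of the vector entering the first hidden layer, and the helper names (make_fc, make_head, head_forward) are introduced here only for illustration. Following the table literally, ReLU is also applied to the 4-unit layer before the softmax.

```python
import math
import random


def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def softmax(v):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def make_fc(in_dim, out_dim, rng):
    """Random (out_dim x in_dim) weights and zero biases.

    Placeholder values only; a trained network would load learned parameters.
    """
    scale = math.sqrt(2.0 / in_dim)
    weights = [[rng.gauss(0.0, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    biases = [0.0] * out_dim
    return weights, biases


def fc(v, layer):
    """Fully-connected layer: one dot product plus bias per output unit."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, biases)]


# Widths of the three hidden layers as listed in Table I.
HEAD_WIDTHS = (1024, 512, 4)


def make_head(in_dim, rng):
    """Build the fc(1024) -> fc(512) -> fc(4) stack for a given input size."""
    layers = []
    for width in HEAD_WIDTHS:
        layers.append(make_fc(in_dim, width, rng))
        in_dim = width
    return layers


def head_forward(features, layers):
    """Apply fc + ReLU for every hidden layer, then softmax over 4 outputs."""
    h = features
    for layer in layers:
        h = relu(fc(h, layer))
    return softmax(h)


# Usage with a hypothetical feature size (an assumption, not from Table I).
FEATURE_DIM = 16
rng = random.Random(0)
layers = make_head(FEATURE_DIM, rng)
features = [rng.random() for _ in range(FEATURE_DIM)]
probs = head_forward(features, layers)
print(len(probs), sum(probs))
```

The output is a 4-element probability vector, matching the output size of 4 in the table. In practice the same head would typically be written with a deep-learning framework; plain Python is used here only to keep the sketch self-contained.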