2022 Jan 26;32(7):4749–4759. doi: 10.1007/s00330-021-08532-2

Table 2.

Layer-by-layer description of the CNNs used in the two ensemble models SEG and noSEG

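| Name | Layer | Filter kernel, main branch (shape, count) | Filter kernel, shortcut (shape, count) | Output size, noSEG | Output size, SEG |
|---|---|---|---|---|---|
| in | Input | - | - | 50 × 50 × 50 × 1 | 50 × 50 × 50 × 2 |
| res1a | 3D convolution | 3 × 3 × 3, 16 | 3 × 3 × 3, 1 | 25 × 25 × 25 × 16 | 25 × 25 × 25 × 16 |
| res1b | 3D convolution | 3 × 3 × 3, 16 | id | 25 × 25 × 25 × 16 | 25 × 25 × 25 × 16 |
| add1 | Add | - | - | 25 × 25 × 25 × 16 | 25 × 25 × 25 × 16 |
| res2a | 3D convolution | 3 × 3 × 3, 32 | 3 × 3 × 3, 1 | 13 × 13 × 13 × 32 | 13 × 13 × 13 × 32 |
| res2b | 3D convolution | 3 × 3 × 3, 32 | id | 13 × 13 × 13 × 32 | 13 × 13 × 13 × 32 |
| add2 | Add | - | - | 13 × 13 × 13 × 32 | 13 × 13 × 13 × 32 |
| res3a | 3D convolution | 3 × 3 × 3, 64 | 3 × 3 × 3, 1 | 7 × 7 × 7 × 64 | 7 × 7 × 7 × 64 |
| res3b | 3D convolution | 3 × 3 × 3, 64 | id | 7 × 7 × 7 × 64 | 7 × 7 × 7 × 64 |
| add3 | Add | - | - | 7 × 7 × 7 × 64 | 7 × 7 × 7 × 64 |
| pool | Global average pooling | - | - | 64 | 64 |
| drop | Dropout | - | - | 64 | 64 |
| out | Fully connected layer | - | - | 1 | 1 |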

The convolutional part of each network (up to layer “add3”) consisted of a main branch, containing three-dimensional convolutions, and a shortcut branch, containing either a single convolution kernel for downscaling or an identity mapping (“id”). At each add layer (“add1”, “add2”, “add3”), the main branch and the shortcut branch were added. After add1 and add2, the images were split up again into main and shortcut branches.
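
The output sizes in Table 2 are consistent with each downscaling convolution (res1a, res2a, res3a) halving the spatial size and rounding up, as a stride-2 convolution with "same" padding would do (50 → 25 → 13 → 7). The stride and padding are not stated in the table, so both are assumptions here, and the function names `same_conv_out` and `trace_shapes` are hypothetical. A minimal Python sketch that traces the layer shapes under those assumptions:

```python
import math


def same_conv_out(size, stride):
    """Spatial output length of a 'same'-padded convolution (assumed padding)."""
    return math.ceil(size / stride)


def trace_shapes(in_size=50, in_channels=1, filters=(16, 32, 64)):
    """Return (layer name, output shape) pairs following the rows of Table 2.

    Assumptions not stated in the table: the resNa layers use stride 2,
    the resNb layers use stride 1, and all convolutions use 'same' padding.
    """
    shapes = [("in", (in_size,) * 3 + (in_channels,))]
    size = in_size
    for i, n_filters in enumerate(filters, start=1):
        size = same_conv_out(size, 2)            # resNa: assumed stride 2
        shape = (size, size, size, n_filters)
        shapes.append((f"res{i}a", shape))
        shapes.append((f"res{i}b", shape))       # assumed stride 1, size unchanged
        shapes.append((f"add{i}", shape))        # main branch + shortcut branch
    shapes.append(("pool", (filters[-1],)))      # global average pooling
    shapes.append(("drop", (filters[-1],)))      # dropout keeps the shape
    shapes.append(("out", (1,)))                 # fully connected layer
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes(in_channels=1):  # noSEG; use 2 for SEG
        print(f"{name:6s} {' x '.join(str(d) for d in shape)}")
```

Calling `trace_shapes(in_channels=2)` gives the SEG variant, which differs from noSEG only in the input row.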