2021 Jan 16;113:107828. doi: 10.1016/j.patcog.2021.107828

Table 1.

Network architecture of the proposed M2UNet. The network has three main components: 1) an encoding module containing five encoding blocks; 2) a classification sub-network containing the embedding-level MIL, the image-level MIL, and a classifier; and 3) a segmentation sub-network consisting of a decoding module with five decoding blocks. MIL: multi-instance learning; Num.: number of layers; K: kernel size; PAD: padding size; STR: stride; #: number of learnable parameters; conv: convolution; GCP: global contrast pooling; concat: concatenation.

| Block Name | Num. | Layers | Parameter Setting | Input | # |
|---|---|---|---|---|---|
| Encoding block 1 | 2 | {conv, batchnorm, ReLU} | K: {3×3×64}, PAD:1, STR:1 | 2D image patches | 37K |
| Pool 1 | 1 | max-pooling | K: {2×2}, STR:2 | Encoding block 1 | - |
| Encoding block 2 | 2 | {conv, batchnorm, ReLU} | K: {3×3×64}, PAD:1, STR:1 | Pool 1 | 72K |
| Pool 2 | 1 | max-pooling | K: {2×2}, STR:2 | Encoding block 2 | - |
| Encoding block 3 | 2 | {conv, batchnorm, ReLU} | K: {3×3×64}, PAD:1, STR:1 | Pool 2 | 72K |
| Pool 3 | 1 | max-pooling | K: {2×2}, STR:2 | Encoding block 3 | - |
| Encoding block 4 | 2 | {conv, batchnorm, ReLU} | K: {3×3×64}, PAD:1, STR:1 | Pool 3 | 72K |
| Pool 4 | 1 | max-pooling | K: {2×2}, STR:2 | Encoding block 4 | - |
| Encoding block 5 | 1 | {conv, batchnorm, ReLU} | K: {3×3×64}, PAD:1, STR:1 | Pool 4 | 2595K |
| | 1 | {conv, batchnorm, ReLU} | K: {3×3×512}, PAD:1, STR:1 | | |
| Embedding-Level MIL | 1 | GCP | Num. Concepts: 256 | Encoding block 5 | 193K |
| | 1 | conv | K: {1×1×256}, PAD:0, STR:1 | | |
| Image-Level MIL | 1 | GCP | Num. Concepts: 128 | Embedding-Level MIL | 48K |
| | 1 | conv | K: {1×1×128}, PAD:0, STR:1 | | |
| Classifier | 1 | conv | K: {1×1×128}, PAD:0, STR:1 | Image-Level MIL | 0.3K |
| Decoding block 5 | 1 | {up-sample, conv, batchnorm, ReLU, concat} | K: {3×3×512}, PAD:1, STR:1 | Encoding block 5 | 397K |
| Decoding block 4 | 1 | {up-sample, conv, batchnorm, ReLU, concat} | K: {3×3×64}, PAD:1, STR:1 | Decoding block 5 | 145K |
| | 2 | {conv, batchnorm, ReLU} | K: {3×3×128}, PAD:1, STR:1 | Encoding block 3 | |
| Decoding block 3 | 1 | {up-sample, conv, batchnorm, ReLU, concat} | K: {3×3×64}, PAD:1, STR:1 | Decoding block 4 | 145K |
| | 2 | {conv, batchnorm, ReLU} | K: {3×3×128}, PAD:1, STR:1 | Encoding block 2 | |
| Decoding block 2 | 1 | {up-sample, conv, batchnorm, ReLU, concat} | K: {3×3×64}, PAD:1, STR:1 | Decoding block 3 | 145K |
| | 2 | {conv, batchnorm, ReLU} | K: {3×3×128}, PAD:1, STR:1 | Encoding block 1 | |
| Decoding block 1 | 1 | conv | K: {1×1×64}, PAD:0, STR:1 | Decoding block 2 | 0.5K |
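
The parameter counts in the # column can be sanity-checked with a short calculation. The sketch below is illustrative only and rests on assumptions that the table does not state: the input patches have a single channel, every convolution has a bias term, each batchnorm layer contributes two learnable parameters per channel, "K" denotes 1024 parameters, Encoding block 5 widens from 64 to 512 channels, and Decoding block 4 applies a 64-to-64 convolution after up-sampling before concatenation raises the width to 128 channels. The helper names `conv_bn_params` and `block_params` are hypothetical. Under these assumptions the computed values round to the figures listed for the four blocks checked.

```python
K = 1024  # assumption: the "K" suffix in the table means 1024 parameters


def conv_bn_params(in_ch, out_ch, kernel=3):
    """Learnable parameters of one {conv, batchnorm, ReLU} layer."""
    conv = kernel * kernel * in_ch * out_ch + out_ch  # weights plus biases
    batchnorm = 2 * out_ch  # one scale and one shift per channel
    return conv + batchnorm  # ReLU has no learnable parameters


def block_params(channels, kernel=3):
    """Sum the layer parameters along a chain of channel widths."""
    return sum(
        conv_bn_params(a, b, kernel) for a, b in zip(channels, channels[1:])
    )


# Channel chains are assumptions chosen to match the table, not stated in it.
counts = {
    "Encoding block 1": block_params([1, 64, 64]),
    "Encoding block 2": block_params([64, 64, 64]),
    "Encoding block 5": block_params([64, 512, 512]),
    # up-sample conv (64 -> 64), then two convs on the concatenated 128 channels
    "Decoding block 4": block_params([64, 64]) + block_params([128, 64, 64]),
}

for name, n in counts.items():
    print(f"{name}: {n} parameters, about {round(n / K)}K")
```

Under the stated assumptions this gives roughly 37K, 72K, 2595K, and 145K for the four blocks, in line with the table. Encoding blocks 3 and 4 and Decoding blocks 3 and 2 follow the same layer patterns as Encoding block 2 and Decoding block 4, respectively.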