TABLE II. Network Architectures.
| CAN1-2D | CNN-3D |
|---|---|
| Input (64x64x3 image) | Input (64x64x20 image) |
| Conv2D (64, 3x3 kernels) | Conv3D (64, 3x3x3 kernels) |
| Conv2D (64, 3x3 kernels) | Conv3D (64, 3x3x3 kernels) |
| Max Pooling (2x2 kernel) | Max Pooling (2x2x2 kernel) |
| Conv2D (128, 3x3 kernels) | Conv3D (128, 3x3x3 kernels) |
| Conv2D (128, 3x3 kernels) | Conv3D (128, 3x3x3 kernels) |
| Max Pooling (2x2 kernel) | Max Pooling (2x2x2 kernel) |
| Conv2D (256, 3x3 kernels) | Conv3D (256, 3x3x3 kernels) |
| Conv2D (256, 3x3 kernels) | Conv3D (256, 3x3x3 kernels) |
| Conv2D (256, 3x3 kernels) | Conv3D (256, 3x3x3 kernels) |
| Max Pooling (2x2 kernel) | Max Pooling (2x2x2 kernel) |
| Conv2D (512, 3x3 kernels) | Conv3D (512, 3x3x3 kernels) |
| Conv2D (512, 3x3 kernels) | Conv3D (512, 3x3x3 kernels) |
| Conv2D (512, 3x3 kernels) | Conv3D (512, 3x3x3 kernels) |
| Max Pooling (2x2 kernel) | Max Pooling (2x2x2 kernel) |
| Conv2D (512, 3x3 kernels) | Conv3D (512, 3x3x3 kernels) |
| Conv2D (512, 3x3 kernels) | Conv3D (512, 3x3x3 kernels) |
| Conv2D (512, 3x3 kernels) | Conv3D (512, 3x3x3 kernels) |
| Max Pooling (2x2 kernel) | Max Pooling (2x2x1 kernel) |
| Slice-wise Attention | Fully-Connected (128) |
| Fully-Connected (128) | Fully-Connected (128) |
| Fully-Connected (2, softmax) | Fully-Connected (2, softmax) |
Comparison of the proposed network (left column) with its closest 3-D equivalent (right column) for single-time-point classification. In the left column, a 2-D network processes each 2-D slice of a 3-D scan, and the attention mechanism combines the resulting per-slice features into a single feature vector while suppressing irrelevant slice information (see Fig. 3 for more details).
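
To make the two columns easier to compare, the following is a minimal sketch, not the authors' implementation. It first traces how the pooled axes shrink through the five max-pooling stages listed in Table II, then illustrates one plausible form of slice-wise attention pooling. Several points are assumptions not stated in the table: convolutions are taken to preserve spatial size ("same" padding), pooling stride is taken to equal the kernel size with floor division, the trailing 3 in the 2-D input is treated as a channel axis, and the attention scoring (a single hypothetical learned vector `w` followed by a softmax over slices) stands in for the mechanism detailed in Fig. 3. The names `trace_shape` and `slice_attention_pool` are illustrative only.

```python
import numpy as np

# Pooling kernels per stage, read from Table II.
POOLS_2D = [(2, 2)] * 5                     # CAN1-2D column: five 2x2 pools
POOLS_3D = [(2, 2, 2)] * 4 + [(2, 2, 1)]    # CNN-3D column: last pool is 2x2x1


def trace_shape(shape, pools):
    """Return the shape of the pooled axes after each pooling stage.

    Assumption: convolutions do not change the spatial size, and each pool
    uses stride == kernel size with floor division.
    """
    shapes = [tuple(shape)]
    for kernel in pools:
        shape = tuple(s // k for s, k in zip(shape, kernel))
        shapes.append(shape)
    return shapes


def slice_attention_pool(features, w):
    """Combine per-slice features (n_slices, dim) into one vector (dim,).

    Assumption: each slice gets a scalar score features @ w, where w is a
    hypothetical learned vector; scores are normalised by a softmax over
    slices and used as weights in a weighted sum.
    """
    scores = features @ w                   # (n_slices,)
    scores = scores - scores.max()          # subtract max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    pooled = weights @ features             # (dim,)
    return pooled, weights


if __name__ == "__main__":
    print(trace_shape((64, 64), POOLS_2D)[-1])       # (2, 2)
    print(trace_shape((64, 64, 20), POOLS_3D)[-1])   # (2, 2, 1)

    rng = np.random.default_rng(0)
    n_slices, dim = 20, 2 * 2 * 512         # flattened 2x2x512 map per slice (assumed)
    feats = rng.standard_normal((n_slices, dim))
    w = rng.standard_normal(dim)
    pooled, weights = slice_attention_pool(feats, w)
    print(pooled.shape, weights.shape)
```

Under these assumptions the slice axis of the 3-D network shrinks as 20, 10, 5, 2, 1, 1, which is consistent with the final pooling stage using a 2x2x1 kernel: the depth is already 1 after the fourth stage. In the 2-D column the slice axis is never pooled; it is reduced only once, by the attention-weighted sum.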