Table 4.
VGG11-based architecture used for both the first and the second neural networks in the proposed algorithm. Each conv2d layer is a 2D convolution with kernel_size = 3 and padding = 1. The max-pooling layers use kernel_size = 2 and stride = 2. Each conv2d layer and each linear layer except the last is followed by batch normalization and ReLU. The network is trained with the binary cross-entropy (BCE) loss using stochastic gradient descent with learning rate 0.001, momentum 0.99, and weight decay $10^{-7}$.
| Feature extraction layers | |
|---|---|
| Layer | Number of filters |
| conv2d | 64 |
| Max-pooling (M-P) | |
| conv2d | 128 |
| M-P | |
| conv2d | 256 |
| conv2d | 256 |
| M-P | |
| conv2d | 512 |
| conv2d | 512 |
| M-P | |
| Classification layers | |
| Layer | Output size |
| Linear | 4096 |
| Linear | 4096 |
| Linear | 1 |
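
For readers who want to reproduce the layout in Table 4, the following is a minimal PyTorch sketch, not the authors' implementation. Several details are assumptions that the table does not state: the number of input channels (`in_channels=1`), a square input of side `input_size=32` (which fixes the input width of the first linear layer), the class name `VGG11Like`, and the use of `BCEWithLogitsLoss`, which applies the sigmoid inside the loss instead of after the final linear layer.

```python
import torch
from torch import nn

# Layer layout from Table 4: integers are conv2d filter counts, "M" is max-pooling.
FEATURE_CFG = [64, "M", 128, "M", 256, 256, "M", 512, 512, "M"]


def make_features(in_channels: int) -> nn.Sequential:
    """Build the feature extractor: conv2d -> batch norm -> ReLU, with max-pooling."""
    layers = []
    channels = in_channels
    for item in FEATURE_CFG:
        if item == "M":
            layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
        else:
            layers += [
                nn.Conv2d(channels, item, kernel_size=3, padding=1),
                nn.BatchNorm2d(item),
                nn.ReLU(inplace=True),
            ]
            channels = item
    return nn.Sequential(*layers)


class VGG11Like(nn.Module):
    """Hypothetical name; in_channels and input_size are assumptions, not from Table 4."""

    def __init__(self, in_channels: int = 1, input_size: int = 32):
        super().__init__()
        self.features = make_features(in_channels)
        # Four 2x2 poolings with stride 2 shrink each spatial side by a factor of 16.
        side = input_size // 16
        flat = 512 * side * side
        self.classifier = nn.Sequential(
            nn.Linear(flat, 4096), nn.BatchNorm1d(4096), nn.ReLU(inplace=True),
            nn.Linear(4096, 4096), nn.BatchNorm1d(4096), nn.ReLU(inplace=True),
            nn.Linear(4096, 1),  # last layer: single logit, no batch norm or ReLU
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(torch.flatten(x, start_dim=1))


model = VGG11Like(in_channels=1, input_size=32)
# Assumption: sigmoid is folded into the loss for numerical stability.
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.001, momentum=0.99, weight_decay=1e-7
)

# One training step on random placeholder data (batch of 4 images, binary labels).
x = torch.randn(4, 1, 32, 32)
y = torch.randint(0, 2, (4, 1)).float()
optimizer.zero_grad()
logits = model(x)
loss = criterion(logits, y)
loss.backward()
optimizer.step()
```

If the final layer is instead followed by an explicit sigmoid, `nn.BCELoss` would be the matching criterion; the table does not say which variant was used.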