2023 Dec 13;21(12):e3002366. doi: 10.1371/journal.pbio.3002366

Table 12. CochCNN9 architecture.

| Layer | Output shape |
| --- | --- |
| Input (cochleagram) | (1, 211, 390) |
| BatchNorm2d_1(1) | (1, 211, 390) |
| Conv2d_1(1, 96, kernel_size = [7, 14], stride = [3, 3], padding = 'same') | (96, 71, 130) |
| ReLU_1 | (96, 71, 130) |
| MaxPool2d_1(kernel_size = [2, 5], stride = [2, 2], padding = 'same') | (96, 36, 65) |
| BatchNorm2d_2(96) | (96, 36, 65) |
| Conv2d_2(96, 256, kernel_size = [4, 8], stride = [2, 2], padding = 'same') | (256, 18, 33) |
| ReLU_2 | (256, 18, 33) |
| MaxPool2d_2(kernel_size = [2, 5], stride = [2, 2], padding = 'same') | (256, 9, 17) |
| BatchNorm2d_3(256) | (256, 9, 17) |
| Conv2d_3(256, 512, kernel_size = [2, 5], stride = [1, 1], padding = 'same') | (512, 9, 17) |
| ReLU_3 | (512, 9, 17) |
| Conv2d_4(512, 1024, kernel_size = [2, 5], stride = [1, 1], padding = 'same') | (1024, 9, 17) |
| ReLU_4 | (1024, 9, 17) |
| Conv2d_5(1024, 512, kernel_size = [2, 5], stride = [1, 1], padding = 'same') | (512, 9, 17) |
| ReLU_5 | (512, 9, 17) |
| AvgPool_1(kernel_size = [2, 5], stride = [2, 2], padding = 'same') | (512, 5, 9) |
| Linear_1 | (4096) |
| ReLU_6 | (4096) |
| Dropout_1(p = 0.5) | (4096) |
| Linear_2 | (num_classes) |
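
The output shapes listed in Table 12 can be checked with a short shape trace. The sketch below is not the authors' code. It assumes that 'same' padding follows the common convention in which a strided layer maps a length n to ceil(n / stride), and that the AvgPool_1 output is flattened before Linear_1. Neither assumption is stated in the table, although the first reproduces every listed shape. The names `same_out`, `LAYERS`, and `trace_shapes` are hypothetical helpers introduced only for this illustration.

```python
import math


def same_out(size, stride):
    """Output length under the assumed 'same' convention: ceil(size / stride)."""
    return math.ceil(size / stride)


# (layer name, output channels or None to keep channels, (stride_h, stride_w)).
# BatchNorm, ReLU, and Dropout are omitted because they do not change the shape.
LAYERS = [
    ("Conv2d_1", 96, (3, 3)),
    ("MaxPool2d_1", None, (2, 2)),
    ("Conv2d_2", 256, (2, 2)),
    ("MaxPool2d_2", None, (2, 2)),
    ("Conv2d_3", 512, (1, 1)),
    ("Conv2d_4", 1024, (1, 1)),
    ("Conv2d_5", 512, (1, 1)),
    ("AvgPool_1", None, (2, 2)),
]


def trace_shapes(input_shape=(1, 211, 390)):
    """Return a dict mapping each layer name to its (channels, height, width)."""
    shapes = {}
    c, h, w = input_shape
    for name, out_c, (sh, sw) in LAYERS:
        c = c if out_c is None else out_c
        h, w = same_out(h, sh), same_out(w, sw)
        shapes[name] = (c, h, w)
    return shapes


shapes = trace_shapes()
for name, shape in shapes.items():
    print(f"{name:12s} {shape}")

# Assumption: the AvgPool_1 output is flattened before Linear_1.
flat_features = math.prod(shapes["AvgPool_1"])
print("features entering Linear_1 (if flattened):", flat_features)
```

Under these assumptions, Linear_1 would receive 512 × 5 × 9 = 23,040 input features. Kernel sizes do not appear in the trace because, with 'same' padding, the output size depends only on the input size and the stride.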