2018 Apr 26;8:6600. doi: 10.1038/s41598-018-25005-7

Figure 3.


Architecture of the ZF model. The input is a 3-channel image of 224×224 pixels. It is convolved with 96 filters of size 7×7, using a stride of 2 in both x and y. The resulting feature maps are then (1) passed through a rectified linear function (omitted here), (2) max pooled over 3×3 regions with a stride of 2, and (3) contrast normalized, yielding 96 feature maps of size 55×55. Layers 2, 3, 4, and 5 perform the same operations. Layers 6 and 7 are fully connected; they take the features from layer 5 as input in the form of a vector. The output layer is a softmax function, and the “1000” in the figure is the number of classes.
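The feature-map sizes in the caption can be checked with the usual output-size formula for a sliding window. The sketch below is only illustrative: the helper `out_size` is a hypothetical name, and the padding of 1 in both steps is an assumption chosen so that the result matches the 55×55 maps stated above, since the caption does not give the padding.

```python
def out_size(n: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling window (floor division)."""
    return (n + 2 * padding - kernel) // stride + 1


# Layer 1 as described in the caption: 224x224 input, 96 filters of 7x7
# with stride 2, followed by 3x3 max pooling with stride 2.
# ASSUMPTION: padding of 1 in both steps; the caption does not specify it.
conv1 = out_size(224, kernel=7, stride=2, padding=1)
pool1 = out_size(conv1, kernel=3, stride=2, padding=1)

# (number of feature maps, height, width) after layer 1
layer1_shape = (96, pool1, pool1)
print(conv1, pool1, layer1_shape)
```

Under these assumptions the convolution produces 110×110 maps and the pooling reduces them to 55×55. Without padding in the pooling step the same formula would give 54×54, so the padding choice matters when reproducing the figure.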