Table 3. Number of units and features for each CNN layer.

The numbers of units and features in the deep neural network architecture were similar to those proposed in (Krizhevsky et al., 2012). All deep neural networks were identical except for the number of nodes in the last (output) layer, which was dictated by the number of training categories: 683 for the deep object network and 216 for the deep scene network. Abbreviations: Conv = convolutional layer; Pool = pooling layer; Norm = normalization layer; FC1–3 = fully connected layers. The 8 layers referred to in the manuscript correspond to the convolution stages for layers 1–5 and the FC1–3 stages for layers 6–8, respectively.

Layer	Conv1	Pool/Norm1	Conv2	Pool/Norm2	Conv3	Conv4	Conv5	Pool5	FC1	FC2	FC3
Units	96	96	256	256	384	384	256	256	4096	4096	683/216
Feature	55×55	27×27	27×27	13×13	13×13	13×13	13×13	6×6	1	1	1
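
The feature-map sizes in the table can be checked with the usual output-size formula for convolution and pooling stages. The sketch below is illustrative only: the kernel sizes, strides, padding, and the 227×227 input are assumptions taken from the commonly used implementation of the Krizhevsky et al. (2012) architecture and are not stated in this table. The names `out_size`, `STAGES`, and `feature_sizes` are hypothetical.

```python
def out_size(n, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling stage on an n x n input."""
    return (n + 2 * pad - kernel) // stride + 1


# (stage name, kernel, stride, padding): assumed values, not given in the table.
STAGES = [
    ("Conv1", 11, 4, 0),
    ("Pool/Norm1", 3, 2, 0),
    ("Conv2", 5, 1, 2),
    ("Pool/Norm2", 3, 2, 0),
    ("Conv3", 3, 1, 1),
    ("Conv4", 3, 1, 1),
    ("Conv5", 3, 1, 1),
    ("Pool5", 3, 2, 0),
]


def feature_sizes(input_size=227):
    """Return the side length of the feature map after each stage."""
    sizes = {}
    n = input_size  # assumed input resolution
    for name, kernel, stride, pad in STAGES:
        n = out_size(n, kernel, stride, pad)
        sizes[name] = n
    return sizes


if __name__ == "__main__":
    for name, side in feature_sizes().items():
        print(f"{name}: {side}x{side}")
```

Under these assumptions the computed side lengths match the Feature row of the table (55, 27, 27, 13, 13, 13, 13, 6); the fully connected layers have no spatial extent and are therefore listed as 1.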