2019 Apr 18;32(4):672–677. doi: 10.1007/s10278-018-0167-7

Table 1.

Architecture of the Inception V3 model. Each row represents a layer of the network, and the input to each layer is the output of the previous layer. For the Inception layers, each row represents a parallel sub-layer; the outputs of these sub-layers are concatenated before being passed to the subsequent layer. The list of parameters in each row represents serial operations within that sub-layer. In the Inception 6 layer, the parameters in parentheses represent additional parallel sub-layers within the serial operations. The number of times each Inception layer is repeated is prepended to the layer name.

Type            Patch size/strides                            Input size
Conv 1          3 × 3/2                                       300 × 300 × 1
Conv 2          3 × 3/1                                       149 × 149 × 32
Conv 3          3 × 3/1                                       147 × 147 × 32
Max pool 1      3 × 3/2                                       147 × 147 × 64
Conv 4          1 × 1/1                                       73 × 73 × 64
Conv 5          3 × 3/1                                       73 × 73 × 80
Max pool 2      3 × 3/2                                       71 × 71 × 192
3× Inception 1  1 × 1/1                                       35 × 35 × 192
                1 × 1/1, 3 × 3/1
                1 × 1/1, 3 × 3/1, 3 × 3/1
                Avg pool 3 × 3/1, 1 × 1/1
Inception 2     3 × 3/2                                       35 × 35 × 288
                Max pool 3 × 3/2
                1 × 1/1, 3 × 3/1, 3 × 3/2
4× Inception 3  1 × 1/1                                       17 × 17 × 768
                1 × 1/1, 1 × 7/1, 7 × 1/1
                1 × 1/1, 1 × 7/1, 7 × 1/1, 1 × 7/1, 7 × 1/1
                Avg pool 3 × 3/1, 1 × 1/1
Auxiliary       Avg pool 5 × 5/3, 1 × 1/1, linear, softmax    17 × 17 × 768
Inception 5     1 × 1/1, 3 × 3/2                              17 × 17 × 768
                1 × 1/1, 1 × 7/1, 7 × 1/1, 3 × 3/2
                Max pool 3 × 3/2
2× Inception 6  1 × 1/1                                       8 × 8 × 1280
                1 × 1/1, (1 × 3/1, 3 × 1/1)
                1 × 1/1, 3 × 3/1, (1 × 3/1, 3 × 1/1)
                Avg pool 3 × 3/1, 1 × 1/1
Avg pool        8 × 8/1                                       8 × 8 × 2048
Output          Dropout, linear, softmax                      1 × 1 × 2048
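
The input sizes listed in Table 1 can be cross-checked with the usual output-size arithmetic for convolution and pooling layers. The sketch below is a minimal illustration, not the authors' code. The names `out_size`, `trace`, `STEM`, and `REDUCTIONS` are hypothetical, and the padding mode of each layer is an assumption inferred from the listed sizes, since the table does not state padding. Only the spatial dimension is traced; channel counts are not computed because the table does not give per-branch filter counts.

```python
def out_size(n, kernel, stride, padding="valid"):
    """Output length along one spatial dimension of a conv or pooling layer."""
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (n - kernel) // stride + 1
    if padding == "same":
        # Zero padding chosen so that the output is ceil(n / stride).
        return -(-n // stride)
    raise ValueError(f"unknown padding mode: {padding!r}")


# (name, kernel, stride, padding). Kernel and stride come from Table 1;
# the padding column is an ASSUMPTION chosen to reproduce the listed sizes.
STEM = [
    ("Conv 1", 3, 2, "valid"),
    ("Conv 2", 3, 1, "valid"),
    ("Conv 3", 3, 1, "same"),
    ("Max pool 1", 3, 2, "valid"),
    ("Conv 4", 1, 1, "valid"),
    ("Conv 5", 3, 1, "valid"),
    ("Max pool 2", 3, 2, "valid"),
]

# Layers after the stem that change the spatial size. The repeated
# Inception 1, 3, and 6 layers keep the size unchanged and are omitted.
REDUCTIONS = [
    ("Inception 2", 3, 2, "valid"),
    ("Inception 5", 3, 2, "valid"),
    ("Avg pool", 8, 1, "valid"),
]


def trace(input_size, layers):
    """Return (layer name, output size) pairs for a chain of layers."""
    sizes = []
    n = input_size
    for name, kernel, stride, padding in layers:
        n = out_size(n, kernel, stride, padding)
        sizes.append((name, n))
    return sizes


if __name__ == "__main__":
    stem_sizes = trace(300, STEM)
    for name, n in stem_sizes + trace(stem_sizes[-1][1], REDUCTIONS):
        print(f"{name}: output {n} x {n}")
```

Under these assumptions, a 300 × 300 input passes through the stem as 149, 147, 147, 73, 73, 71, 35, and the later reductions give 17, 8, and 1, which agrees with the input size listed for each following row of the table.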