Table 3.
| Unit | | Layer | Filter/Stride | Output Size |
|---|---|---|---|---|
| Input | 0 | | | 112 × 112 × 1 |
| Facial Landmark Feature Network | 1 | Conv-BN-ReLU | 7 × 7, 64/2 | 53 × 53 × 64 |
| | | Conv-BN-ReLU | 7 × 7, 128/2 | 24 × 24 × 128 |
| | | Conv-BN-ReLU | 7 × 7, 256/2 | 9 × 9 × 256 |
| Output | 2 | GlobalAvgPool | | 256 |
BN: batch normalization. Although the stride is 2, each convolution reduces the feature map to less than half of its input size because no padding is applied during convolution.
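
The spatial sizes in the table can be checked with the standard output-size formula for an unpadded convolution, floor((n − k) / s) + 1, where n is the input size, k the kernel size, and s the stride. The sketch below is illustrative only: the function names are hypothetical, and it assumes floor rounding, which the table does not state explicitly but which reproduces the sizes 53, 24, and 9.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of a convolution (assumes floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_feature_map_sizes(input_size: int, layers: list[tuple[int, int]]) -> list[int]:
    """Apply each (kernel, stride) pair in turn, with no padding, and record the sizes."""
    sizes = []
    size = input_size
    for kernel, stride in layers:
        size = conv_output_size(size, kernel, stride)
        sizes.append(size)
    return sizes


# Three 7 x 7 convolutions with stride 2, as listed in the table.
LAYERS = [(7, 2), (7, 2), (7, 2)]
sizes = trace_feature_map_sizes(112, LAYERS)
print(sizes)  # [53, 24, 9]
```

Each step lands below half of its input (112 → 53, 53 → 24, 24 → 9) because the 7 × 7 kernel removes six pixels from each dimension before the stride of 2 is applied.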