Table 4.
Detection performance, measured by average precision and average recall, for different SAF configurations on the nuScenes validation set. The “1×1” configuration denotes a convolution layer with kernel size 1×1, stride (1, 1), and padding [0, 0]. The “3×3”, “5×5”, and “7×7” layers use the configurations {3×3, (1, 1), [1, 1]}, {5×5, (1, 1), [2, 2]}, and {7×7, (1, 1), [3, 3]}, respectively.
| SAF | AP (100) | AP50 (100) | AP75 (100) | APs (100) | APm (100) | APl (100) |
|---|---|---|---|---|---|---|
| none | 60.7 | 84.6 | 65.6 | 43.6 | 58.7 | 72.7 |
| ✓ | 58.6 | 84.5 | 62.9 | 39.6 | 57.5 | 70.9 |
| ✓ | 68.8 | 88.7 | 75.3 | 51.7 | 67.2 | 80.1 |
| ✓ | 69.8 | 89.8 | 76.2 | 53.2 | 67.9 | 80.6 |
| ✓ | 68.1 | 88.8 | 74.0 | 49.8 | 66.8 | 79.0 |
| ✓ ✓ | 69.4 | 89.7 | 76.5 | 52.7 | 67.6 | 80.1 |
| ✓ ✓ | 67.0 | 88.3 | 72.8 | 49.0 | 65.2 | 78.5 |
| ✓ ✓ | 54.7 | 82.5 | 58.0 | 33.6 | 53.3 | 68.8 |
| ✓ ✓ ✓ | 70.2 | 89.9 | 76.7 | 54.3 | 68.6 | 80.6 |
| ✓ ✓ ✓ | 69.8 | 89.6 | 75.9 | 53.6 | 67.7 | 80.6 |
| ✓ ✓ ✓ ✓ | 70.1 | 89.9 | 76.7 | 52.4 | 68.8 | 80.3 |

| SAF | AR (1) | AR (10) | AR (100) | ARs (100) | ARm (100) | ARl (100) |
|---|---|---|---|---|---|---|
| none | 12.2 | 56.9 | 70.5 | 55.8 | 70.0 | 79.7 |
| ✓ | 12.0 | 55.5 | 68.4 | 50.6 | 68.1 | 78.5 |
| ✓ | 12.6 | 63.7 | 77.0 | 64.9 | 76.2 | 85.0 |
| ✓ | 12.6 | 64.9 | 77.2 | 66.0 | 76.4 | 84.8 |
| ✓ | 12.6 | 63.0 | 76.2 | 64.5 | 75.5 | 83.8 |
| ✓ ✓ | 12.7 | 63.8 | 76.6 | 66.5 | 75.1 | 84.7 |
| ✓ ✓ | 12.6 | 62.1 | 75.4 | 63.9 | 74.2 | 83.9 |
| ✓ ✓ | 11.7 | 52.8 | 64.7 | 45.7 | 63.8 | 76.8 |
| ✓ ✓ ✓ | 12.7 | 65.2 | 77.4 | 66.8 | 76.3 | 85.0 |
| ✓ ✓ ✓ | 12.6 | 64.3 | 77.0 | 66.8 | 75.8 | 84.7 |
| ✓ ✓ ✓ ✓ | 12.8 | 64.3 | 76.8 | 65.8 | 75.8 | 84.5 |
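
The four layer configurations in the caption of Table 4 share one property that is easy to check numerically: with stride 1 and padding equal to half the kernel size (rounded down), each convolution leaves the spatial resolution of its input unchanged. The sketch below is illustrative only. The helper names, the assumption of dilation 1, and the 450×800 input size are hypothetical and are not taken from the paper.

```python
# Minimal sketch, assuming dilation 1 and square kernels.
# All names and the example input size are hypothetical.

def conv_output_size(n: int, kernel: int, stride: int, padding: int) -> int:
    """Output length along one axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) per axis for each SAF layer named in the caption.
SAF_LAYERS = {
    "1x1": (1, 1, 0),
    "3x3": (3, 1, 1),
    "5x5": (5, 1, 2),
    "7x7": (7, 1, 3),
}


def output_shapes(height: int, width: int) -> dict:
    """Return the (height, width) produced by each layer for a given input."""
    return {
        name: (
            conv_output_size(height, k, s, p),
            conv_output_size(width, k, s, p),
        )
        for name, (k, s, p) in SAF_LAYERS.items()
    }


if __name__ == "__main__":
    # Arbitrary example input size, chosen only for illustration.
    for name, shape in output_shapes(450, 800).items():
        print(name, shape)
```

Because every branch preserves the input size, the outputs of several such layers could be combined element-wise without resizing, which may be why the ablation is able to enable them in arbitrary combinations.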