Input: k-th layer attention map/heatmap from ResNet-50
Output: coordinates (x1, y1, x2, y2) of the bounding box
Scale heatmap intensities to [0, 255] and create a mask matrix with the same dimensions as the heatmap.
for each pixel in the heatmap do
    if pixel > 180 then
        mask[pixel] = 1
    else
        mask[pixel] = 0
Using dynamic programming [24], generate the k maximum-area rectangles as candidate bounding boxes (BB).
Expand each candidate rectangle uniformly along its edges until the ratio of 0s to 1s among the newly added pixels exceeds 1.
Select the rectangle with the maximum average pixel intensity in the heatmap and return its coordinates.
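
The steps above can be sketched in Python with NumPy. This is a minimal illustration under several assumptions, not the authors' implementation: the function names (`largest_rectangles`, `expand`, `bounding_box`) are hypothetical, scaling to [0, 255] is done by min-max normalization, the candidate count defaults to 3, candidates are collected with a histogram-and-stack variant of the dynamic-programming maximal-rectangle search (so they may overlap or nest), and a ring of pixels whose 0s outnumber its 1s is not added to the box.

```python
import numpy as np

def largest_rectangles(mask, k):
    """Return up to k distinct all-ones rectangles (x1, y1, x2, y2), largest first."""
    rows, cols = mask.shape
    heights = np.zeros(cols, dtype=int)
    found = {}  # box -> area
    for y in range(rows):
        # DP step: length of the vertical run of 1s ending at row y, per column.
        heights = np.where(mask[y] == 1, heights + 1, 0)
        stack = []  # column indices whose heights are increasing
        for x in range(cols + 1):
            h = heights[x] if x < cols else 0  # sentinel flushes the stack
            while stack and heights[stack[-1]] >= h:
                top = int(heights[stack.pop()])
                left = stack[-1] + 1 if stack else 0
                if top > 0:
                    found[(left, y - top + 1, x - 1, y)] = top * (x - left)
            stack.append(x)
    return sorted(found, key=found.get, reverse=True)[:k]

def expand(mask, box):
    """Grow the box by one pixel per side while the new ring has no more 0s than 1s."""
    rows, cols = mask.shape
    x1, y1, x2, y2 = box
    while True:
        nx1, ny1 = max(x1 - 1, 0), max(y1 - 1, 0)
        nx2, ny2 = min(x2 + 1, cols - 1), min(y2 + 1, rows - 1)
        if (nx1, ny1, nx2, ny2) == (x1, y1, x2, y2):
            return x1, y1, x2, y2  # image border reached on every side
        outer = mask[ny1:ny2 + 1, nx1:nx2 + 1]
        inner = mask[y1:y2 + 1, x1:x2 + 1]
        ones = int(outer.sum()) - int(inner.sum())
        zeros = (outer.size - inner.size) - ones
        if zeros > ones:  # ratio of 0s to 1s in the new ring exceeds 1
            return x1, y1, x2, y2
        x1, y1, x2, y2 = nx1, ny1, nx2, ny2

def bounding_box(heatmap, k=3, threshold=180):
    """Return (x1, y1, x2, y2) for the heatmap, or None if no pixel passes the threshold."""
    h = heatmap.astype(float)
    span = h.max() - h.min()
    # Assumed scaling: min-max normalization to [0, 255].
    scaled = (h - h.min()) / span * 255 if span > 0 else np.zeros_like(h)
    mask = (scaled > threshold).astype(np.uint8)
    candidates = [expand(mask, b) for b in largest_rectangles(mask, k)]
    if not candidates:
        return None
    # Pick the candidate with the highest mean heatmap intensity.
    return max(candidates, key=lambda b: scaled[b[1]:b[3] + 1, b[0]:b[2] + 1].mean())
```

For example, `bounding_box(heatmap)` takes a 2-D array of attention values and returns inclusive pixel coordinates. Here `k` is the number of candidate rectangles, not the layer index.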