2021 Nov 25;124:108452. doi: 10.1016/j.patcog.2021.108452

Fig. 3.

An illustration of ESM and ASSM. First, the low-resolution feature maps from stage $S_i$ are up-sampled by bilinear interpolation to the size $H \times W$ of the input image. Next, the up-sampled feature maps are reduced to a single feature map by 1×1 convolutions. Finally, the sigmoid function $\sigma(\cdot)$ converts each pixel value of that feature map to a probability, which gives the prediction image of stage $S_i$. (a) ESM: edge supervision compares the edge prediction image $S_i^{edge}$ with the edge ground truth (GT) $G^{edge}$ according to Eq. (1). (b) ASSM: auxiliary semantic supervision compares the coarse segmentation image $S_i^{mask}$ with the segmentation-mask ground truth (GT) $G^{mask}$ according to Eq. (2).
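The three steps shared by both modules (bilinear up-sampling, 1×1 convolution, sigmoid) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function names `bilinear_resize` and `side_output`, the half-pixel sampling convention, and the weight and bias values are all assumptions made for the example. The comparison with the ground truth via Eq. (1) or Eq. (2) is not shown, because those equations are not reproduced in this caption.

```python
import math


def bilinear_resize(fmap, out_h, out_w):
    """Resize one 2-D channel (a list of rows) to out_h x out_w.

    Assumption: half-pixel sampling convention; the caption does not
    say which convention the original model uses.
    """
    in_h, in_w = len(fmap), len(fmap[0])
    out = []
    for y in range(out_h):
        # Map the output pixel centre back into input coordinates.
        src_y = min(max((y + 0.5) * in_h / out_h - 0.5, 0.0), in_h - 1.0)
        y0 = int(math.floor(src_y))
        y1 = min(y0 + 1, in_h - 1)
        wy = src_y - y0
        row = []
        for x in range(out_w):
            src_x = min(max((x + 0.5) * in_w / out_w - 0.5, 0.0), in_w - 1.0)
            x0 = int(math.floor(src_x))
            x1 = min(x0 + 1, in_w - 1)
            wx = src_x - x0
            top = fmap[y0][x0] * (1 - wx) + fmap[y0][x1] * wx
            bottom = fmap[y1][x0] * (1 - wx) + fmap[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def side_output(features, weights, bias, out_h, out_w):
    """Turn C low-resolution channels into one H x W probability map.

    features: list of C channels, each a list of rows.
    weights, bias: hypothetical 1x1 convolution parameters (one weight
    per input channel, a single output channel).
    """
    # Step 1: up-sample every channel to the input image size.
    upsampled = [bilinear_resize(ch, out_h, out_w) for ch in features]
    pred = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            # Step 2: a 1x1 convolution is a per-pixel weighted sum over channels.
            z = bias + sum(w * ch[y][x] for w, ch in zip(weights, upsampled))
            # Step 3: sigmoid converts the response to a probability.
            row.append(sigmoid(z))
        pred.append(row)
    return pred


# Two 2x2 feature channels up-sampled to a 4x4 prediction map.
features = [[[0.0, 1.0], [2.0, 3.0]], [[1.0, 1.0], [1.0, 1.0]]]
prediction = side_output(features, weights=[1.0, -0.5], bias=0.1, out_h=4, out_w=4)
```

In this sketch the same `side_output` routine would serve both ESM and ASSM; only the ground truth it is compared against (edge map or segmentation mask) differs.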