2022 May 8;146:105587. doi: 10.1016/j.compbiomed.2022.105587

Fig. 4.

Human annotations and explanations generated by RISE [10], Grad-CAM [9], Occlusion [13], and LIME [8]. A strong explanation has at least one technique with high precision and recall. Human annotations have different salient regions highlighted for a COVID-19 CT image. The salient regions responsible for the prediction are shown in red for RISE, Grad-CAM, and Occlusion. LIME shows pixels supporting the prediction in green and pixels negating the prediction in red.

The alignment between the ML explanations and the human annotations was measured by the pointing game, as used in previous studies [10,11,35]. For the evaluation, the highest (maximum) saliency points of each explanation are extracted. In Fig. 4, these correspond to the dark red regions of the explanations provided by RISE, Grad-CAM, and Occlusion, and to the green regions for LIME. If a point lies within the human-annotated region, it is counted as a hit; otherwise, it is counted as a miss. For example, in the first RISE explanation for the COVID-19 CT image in Fig. 4, two high-saliency points lie in the region indicated by a human annotation, giving 2 hits and 0 misses. For LIME, however, the result is 2 hits and 5 misses. The pointing precision is then computed as $\frac{\#\text{hit}}{\#\text{hit} + \#\text{miss}}$. In addition, we compute the recall of the explanations with respect to the ground-truth annotations as $\frac{\#\text{hit}}{\#\text{ground-truth}}$. Each ML explanation is compared to at least two human annotations. Further, the pointing game precision and recall are used to compare the alignment between human explanations for COVID-19 and pneumonia CT images. Hence, one expert is treated as the ground truth against which the other two human experts are compared. The assumption is that humans could focus on different regions of the CT images to make a decision.
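The pointing-game scoring described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy annotation mask, and the point coordinates are hypothetical. Two further assumptions are made: precision uses the standard pointing-game denominator of hits plus misses, and `n_ground_truth` is read as the number of ground-truth annotated regions.

```python
from typing import Sequence, Tuple

Point = Tuple[int, int]  # (row, col) of a maximum-saliency point


def count_hits_and_misses(
    points: Sequence[Point], mask: Sequence[Sequence[bool]]
) -> Tuple[int, int]:
    """Count points inside (hits) and outside (misses) the annotated mask."""
    hits = sum(1 for row, col in points if mask[row][col])
    return hits, len(points) - hits


def pointing_precision(hits: int, misses: int) -> float:
    """Precision = #hit / (#hit + #miss); 0.0 if there are no points."""
    total = hits + misses
    return hits / total if total else 0.0


def pointing_recall(hits: int, n_ground_truth: int) -> float:
    """Recall = #hit / #ground-truth; 0.0 if nothing is annotated."""
    return hits / n_ground_truth if n_ground_truth else 0.0


# Hypothetical 4x4 human annotation: True marks the annotated region.
mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, True, False],
    [False, False, False, False],
]

# Two made-up maximum-saliency points, both inside the annotated region,
# mirroring the "2 hits and 0 misses" RISE case from the text.
rise_points = [(1, 1), (2, 2)]
rise_hits, rise_misses = count_hits_and_misses(rise_points, mask)
rise_precision = pointing_precision(rise_hits, rise_misses)

# The LIME case from the text: 2 hits and 5 misses.
lime_precision = pointing_precision(2, 5)

# Recall for a hypothetical count of 4 ground-truth regions.
example_recall = pointing_recall(2, 4)
```

Under these assumptions, the RISE case yields a precision of 1.0 and the LIME case a precision of 2/7 (about 0.29). The same functions could be used to score one expert's annotation against another's by treating one annotation as the mask.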