Radiol Artif Intell. 2021 Oct 6;3(6):e200267. doi: 10.1148/ryai.2021200267

Figure 1:

Visualization of the different questions addressed in this work. Note that the top rows of images and saliency maps demonstrate ideal (and less commonly observed) high-performing examples (“pass”), while the bottom rows demonstrate realistic (and more commonly observed) poor-performing examples (“did not pass”). First, we examined whether saliency maps are good localizers by measuring the extent of their overlap with pixel-level segmentations or ground-truth bounding boxes. Next, we evaluated whether saliency maps changed when trained model weights were randomized, which indicates how closely the maps reflect model training. Then we generated saliency maps from separately trained InceptionV3 models to assess their repeatability. Finally, we assessed reproducibility by calculating the similarity of saliency maps generated from different models (InceptionV3 and DenseNet-121) trained on the same data.
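To make these evaluations concrete, below is a minimal Python sketch of the two kinds of scores the caption describes: an overlap score between a saliency map and a ground-truth region (for the localization check), and a similarity score between two saliency maps (for the repeatability and reproducibility checks). The specific choices here, a 0.5 binarization threshold, intersection-over-union for overlap, and SSIM for map similarity, are illustrative assumptions for this sketch, not necessarily the exact metrics or thresholds used in the study; the function names and toy inputs are hypothetical.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim


def localization_iou(saliency, gt_mask, threshold=0.5):
    """IoU between a thresholded saliency map and a ground-truth mask.

    `saliency` is assumed normalized to [0, 1]; `gt_mask` is a binary
    array of the same shape (a pixel-level segmentation, or a bounding
    box rasterized to a mask). The 0.5 threshold is an assumed choice.
    """
    pred = saliency >= threshold
    gt = gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(pred, gt).sum() / union)


def map_similarity(map_a, map_b):
    """SSIM between two same-shape saliency maps in [0, 1], used here as
    a stand-in similarity score for the repeatability check (maps from
    two separately trained InceptionV3 models) and the reproducibility
    check (InceptionV3 vs. DenseNet-121 trained on the same data)."""
    return ssim(map_a, map_b, data_range=1.0)


# Toy usage: random arrays stand in for real model saliency outputs.
rng = np.random.default_rng(0)
sal_run1 = rng.random((224, 224))  # map from training run 1
sal_run2 = rng.random((224, 224))  # map from training run 2
gt = np.zeros((224, 224), dtype=bool)
gt[80:160, 90:170] = True  # ground-truth bounding box as a mask

print("localization IoU:  ", localization_iou(sal_run1, gt))
print("repeatability SSIM:", map_similarity(sal_run1, sal_run2))
```

In this framing, a map “passes” the localization check when its overlap score is high, and passes the repeatability or reproducibility check when maps from independently trained or architecturally different models score as highly similar; random maps, as in the toy usage above, would score near zero on both.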
