Impact of data augmentations on attention heatmaps and segmentation results. The heatmaps, calculated using Grad‐CAM, highlight each network's attention regions. The four convolutional neural networks (CNNs) shared the same U‐Net structure but were trained on different datasets (A: baseline; B: baseline + noise injection [NI]; C: baseline + random overlay [RO]; D: baseline + NI + RO). DEC4, decoder block next to the bottom block; ENC4, encoder block next to the bottom block; GT, ground truth; pCTV, projected clinical‐target‐volume; XF, X‐ray fluoroscopic image.