2024 Jul 23;7:1427517. doi: 10.3389/frai.2024.1427517

Figure 3.

Labeling methods for computer vision. (A) Image-level labeling without further annotation enables the classification of entire images. (B) Annotating images using bounding boxes enables the training of object detection models. Because the output of an object detection model is a predicted bounding box, such a model is unsuitable for the precise demarcation of anatomical structures. Here, third molar teeth are roughly localized. (C) Pixel-wise masks enable the training of semantic segmentation models, whose outputs are pixel-wise masks as well. Importantly, multiple instances of the same structure are not differentiated. Here, all occurrences of a third molar are marked as a single class. (D) Separately labeled pixel-wise masks enable further distinction between multiple instances of the same structure using instance segmentation models. Here, in addition to all occurrences of a third molar being marked, the two third molars are clearly distinguished from one another.
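The relationship between the four labeling levels can be illustrated with a small sketch. The toy mask, the function names, and the box convention `(x_min, y_min, x_max, y_max)` below are all hypothetical and chosen only for illustration; they are not taken from the article. The sketch starts from an instance-level mask (D) and derives the coarser labels from it, which shows that each coarser level discards information the finer level contains.

```python
# Hypothetical toy "image": 0 = background, 1 and 2 = two separate third molars.
# This corresponds to instance-level labeling (panel D).
instance_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 2, 0],
    [0, 1, 1, 0, 0, 2, 2, 0],
    [0, 0, 1, 0, 0, 2, 2, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]


def to_semantic_mask(inst):
    """Panel C: keep the pixel-wise outline, drop the instance identity."""
    return [[1 if value > 0 else 0 for value in row] for row in inst]


def to_bounding_boxes(inst):
    """Panel B: one box per instance as (x_min, y_min, x_max, y_max), inclusive."""
    boxes = {}
    for y, row in enumerate(inst):
        for x, value in enumerate(row):
            if value == 0:
                continue  # background pixel
            if value not in boxes:
                boxes[value] = [x, y, x, y]
            else:
                box = boxes[value]
                box[0] = min(box[0], x)
                box[1] = min(box[1], y)
                box[2] = max(box[2], x)
                box[3] = max(box[3], y)
    return {instance_id: tuple(box) for instance_id, box in boxes.items()}


def to_image_label(inst):
    """Panel A: a single label for the whole image (structure present or not)."""
    return any(value > 0 for row in inst for value in row)


semantic_mask = to_semantic_mask(instance_mask)
boxes = to_bounding_boxes(instance_mask)
image_label = to_image_label(instance_mask)

print(image_label)  # True
print(boxes)        # {1: (1, 1, 2, 3), 2: (5, 1, 6, 3)}
```

The conversion only works in one direction: a bounding box cannot be turned back into an exact outline, and a semantic mask cannot reliably separate two touching structures, which is why the annotation effort has to match the intended model type.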