Skip to main content
. 2024 May 3;69(10):10TR01. doi: 10.1088/1361-6560/ad387d

Figure 5.

Figure 5.

This diagram illustrates the process of medical image captioning using language models. Based on different captioning methods, it can be broadly categorized into direct generation approaches, retrieval-based approaches, and deep learning-based approaches. Among the deep learning approaches, there is further classification based on the involvement of an attention mechanism, leading to encoder–decoder-based approaches and attention-based approaches. Most medical image captioning methods primarily focus on encoder–decoder approaches.