2024 May 3;69(10):10TR01. doi: 10.1088/1361-6560/ad387d

Table 13.

Comparative assessment of limitations and future perspectives for research in multimodal learning.

| References | Specific limitation | Specific future perspective |
| --- | --- | --- |
| Huemann et al (2023a) | Single dataset; limited multimodal datasets; primary annotator bias; language as input requirement; comparison challenges | Diverse datasets; inter-observer variability; task expansion; reducing language dependency; AI-assisted clinical workflow |
| Huang et al (2021) | Focus on chest radiographs; dependence on report quality; designed for English-language reports | Expand to other modalities and regions; improve robustness to report quality; adapt for different languages |
| Li et al (2023d) | 2D segmentation limitation; manual text annotation requirement | 3D segmentation; automating text annotation generation |
| Khare et al (2021) | Small labeled datasets; model interpretability; incomplete consideration of expert diagnosis | Larger datasets; enhanced interpretability; clinical integration |
| Chen et al (2022) | Complexity and diversity handling in medical image-text data | Sophisticated techniques for diverse medical scenarios |
| Delbrouck et al (2022) | Visio-linguistic reasoning and understanding; overfitting and generalization; transparency in medical AI; NLG evaluation in medical AI | Enhanced interpretability in multimodal scenarios; refined training methodologies; improved documentation and accessibility; advanced evaluation metrics |