Digestion. 2025 Apr 29:1–7. Online ahead of print. doi: 10.1159/000546183

Investigation of Recognition Areas by Explainable Artificial Intelligence for Colonoscopy Images of Irritable Bowel Syndrome

Hiroshi Mihara a,b, Shun Kuraishi c, Haruka Fujinami b,c, Takayuki Ando b, Ichiro Yasuda b,c
PMCID: PMC12148314  PMID: 40300572

Abstract

Introduction

Irritable bowel syndrome (IBS) is a condition in which gastroenterological endoscopists cannot detect abnormalities on colonoscopy, yet an artificial intelligence (AI) model developed from IBS colonoscopy images has been able to distinguish between IBS and healthy individuals with high accuracy. However, it was unclear which areas the AI identified as abnormal. The aim of this study was to elucidate how AI identifies regions typical of IBS by constructing an additional explainable AI (XAI).

Methods

Colonoscopy images of healthy individuals, patients with constipation-predominant IBS, and patients with diarrhea-predominant IBS, which are available in a repository (https://doi.org/10.5061/dryad.9s4mw6mkp), were used. After a Python environment was set up on a local PC, XAI models for the three groups were developed. Images not used in the AI construction were then evaluated using the XAI. The XAI-generated images were independently assessed by two evaluators, H.M. and S.K., who recorded and reconciled the characteristic differences among the three groups.

Results

For images correctly identified by the XAI as those of healthy individuals, the entire image was evaluated as characteristic. By contrast, for IBS, only parts of the images were evaluated as characteristic regions. For diarrhea-predominant IBS, regions characterized by clear vascular boundaries, homogeneous or erythematous tones, or narrow and somewhat dark-appearing sections of the intestinal tract were identified. For constipation-predominant IBS, regions characterized by unclear vascular boundaries, faded tones, or dark sections where the end was not visible were identified.

Conclusion

An XAI for IBS was collaboratively developed by endoscopists and clinical engineers, enabling the visualization of regions characteristic of IBS and healthy individuals. The real-time display of XAI is expected to further advance the elucidation of IBS pathophysiology.

Keywords: Irritable bowel syndrome, Saliency map, Grad-CAM, XRAI, Deep learning, Colonoscopy image analysis

Plain Language Summary

Irritable bowel syndrome (IBS) is a common condition that affects the digestive system and causes symptoms like lower abdominal pain, bloating, constipation, or diarrhea. When doctors examine the colon with a camera (colonoscopy), they usually do not find clear abnormalities in people with IBS. However, artificial intelligence (AI) can detect subtle patterns that human eyes might miss. In this study, researchers used a type of AI called explainable AI (XAI), which highlights the areas that influenced the AI’s decision. They used colonoscopy images from three groups: healthy people, people with constipation-type IBS, and people with diarrhea-type IBS. The AI could accurately classify the images, and XAI showed which areas of the images it focused on. In healthy individuals, the AI tended to evaluate the entire image. In contrast, for IBS cases, the AI identified specific areas. For constipation-type IBS, it focused on dark or unclear regions. For diarrhea-type IBS, it highlighted reddish or mucus-rich areas. This study helps explain how AI recognizes patterns in images where no obvious findings are visible. Real-time use of XAI might also allow for targeted testing or treatment by guiding where to take tissue samples during colonoscopy. By making AI decisions more transparent, this approach could eventually support better diagnosis and treatment for IBS, a condition that currently lacks visible signs in standard medical imaging.

Introduction

The application of artificial intelligence (AI) technology to image classification for the detection and diagnosis of lesions in the medical field has increased significantly, driven by its accuracy and a resulting rise in clinical use [1]. AI models have been developed specifically to detect lesions in the lower gastrointestinal tract in real time, and several have already been integrated into clinical practice [2]. In the realm of functional gastrointestinal diseases, a long-standing challenge has been the absence of training datasets for AI development, as these conditions usually do not present with abnormalities visible on endoscopy [3]. However, by including information such as the presence or absence of symptoms in the training datasets, AI models may detect subtle changes in the colon that are imperceptible to human observers. Indeed, our research has demonstrated that AI can distinguish between images of individuals with irritable bowel syndrome (IBS) and those of healthy individuals [4].

Currently, diagnostic criteria for IBS, such as the Rome IV criteria, are based solely on symptoms and do not include endoscopic findings. Our study does not seek to redefine these criteria. Instead, we propose that explainable AI (XAI) may uncover subtle mucosal features not previously recognized by human observers. If future studies can validate these features through targeted biopsies or microbiome analyses, endoscopic findings may eventually serve as helpful references, or even diagnostic aids, in future updates to IBS criteria.

However, the AI's decision-making process remains a "black box": it is unclear which image regions the AI uses as the basis for its judgments, which hinders targeted interventions such as biopsies or examinations of intestinal bacteria even when real-time judgments are possible. XAI can be instrumental in overcoming this issue [5]. XAI elucidates the regions upon which a model bases its decisions by generating saliency maps, thereby providing more transparent outcomes. Unlike organic diseases with visible lesions, IBS has long been considered to lack endoscopic abnormalities; applying XAI makes it possible to visualize the potential subtle abnormalities used by AI for classification, potentially bridging the gap between functional and organic perspectives on IBS. This study aimed to construct an XAI for endoscopic images of IBS and to reveal how AI identifies regions typical of IBS, thereby enhancing our understanding and potential treatment of this condition.

Material and Methods

The constipation-predominant IBS (IBS-C) and diarrhea-predominant IBS (IBS-D) groups were selected based on insurance diagnostic codes recorded during colonoscopy procedures between 2010 and 2012. The normal group comprised asymptomatic individuals who underwent colonoscopy because of abnormal health checkup results but had no symptoms and no endoscopic abnormalities. Age and sex data were not collected; however, because cases were identified from routine clinical diagnoses, their distribution is expected to reflect real-world epidemiology.

These images were obtained using an endoscopy reporting system with Olympus CF-HQ290Z and PCF-H290Z endoscopes, which were routinely used during the image acquisition period (2010–2012). These models provide high-resolution imaging, and to ensure consistency in image quality, we limited our dataset to images obtained with these scopes. Although these endoscopes are equipped with zoom functionality, it was not specifically used or required for the purposes of this study. In addition, these data were shared in a public repository (https://doi.org/10.5061/dryad.9s4mw6mkp).

A total of 1,551 colonoscopy images were available. Of these, 1,451 images were used for model construction (80% training, 20% validation). The remaining 100 images, which were completely independent of the training and validation sets, were used for performance evaluation of the final model and for all XAI visualizations presented in this study. The classification threshold for assigning a label was set to 0.5; that is, an image was assigned to the class with the highest predicted confidence if that confidence exceeded 50%. Although this threshold is adjustable, the value of 0.5 was used throughout.

To construct the XAI, general-purpose image classification models were used for transfer learning with the target data to create a new model for classifying IBS. Subsequently, two algorithms (Grad-CAM [6] and XRAI [7, 8]) were employed to generate saliency maps, which were then combined with the input images. Ultimately, a system was developed to perform classification prediction and output XAI images. For evaluation, the obtained XAI images were independently assessed by an endoscopist (H.M.) and a clinical engineer (S.K.), who recorded and evaluated the characteristic differences in the images.

The research environment used Python 3.9 as the language and TensorFlow 2.10 as the framework. The local PC was equipped with an NVIDIA GeForce RTX 3070 GPU. The image classification model was EfficientNetV2-S, and the saliency map generation algorithms were Grad-CAM and XRAI.
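To make the pipeline concrete, the following minimal sketch shows how an ImageNet-pretrained EfficientNetV2-S backbone could be fine-tuned for the three-class task in TensorFlow/Keras and how the 0.5 confidence threshold could be applied at prediction time. The directory layout ("data/IBS-C", "data/IBS-D", "data/Normal"), image size, and training hyperparameters are illustrative assumptions and do not reproduce the authors' exact configuration.

```python
# Minimal transfer-learning sketch, assuming a standard Keras setup;
# folder layout, image size, and hyperparameters are illustrative.
import tensorflow as tf

IMG_SIZE = (384, 384)
CLASS_NAMES = ["IBS-C", "IBS-D", "Normal"]  # alphabetical, matching directory inference

train_ds = tf.keras.utils.image_dataset_from_directory(
    "data", validation_split=0.2, subset="training", seed=42,
    image_size=IMG_SIZE, batch_size=16)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "data", validation_split=0.2, subset="validation", seed=42,
    image_size=IMG_SIZE, batch_size=16)

# ImageNet-pretrained EfficientNetV2-S backbone; its built-in preprocessing layers
# expect raw pixel values in [0, 255], as produced by image_dataset_from_directory.
backbone = tf.keras.applications.EfficientNetV2S(
    include_top=False, weights="imagenet",
    input_shape=IMG_SIZE + (3,), pooling="avg")
backbone.trainable = False  # train only the new classification head

outputs = tf.keras.layers.Dense(len(CLASS_NAMES), activation="softmax")(backbone.output)
model = tf.keras.Model(backbone.input, outputs)

model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=10)

def classify(image_batch, threshold=0.5):
    """Assign the highest-confidence class only when it exceeds the 0.5 threshold."""
    probs = model.predict(image_batch)
    return [CLASS_NAMES[p.argmax()] if p.max() > threshold else "unclassified"
            for p in probs]
```

A Grad-CAM saliency map of the kind combined with the input images can then be computed from the gradients of the predicted class score with respect to the last convolutional feature map. The sketch below follows the standard Grad-CAM formulation; the layer name "top_activation" is an assumption about the Keras EfficientNetV2-S graph and should be confirmed with model.summary().

```python
# Minimal Grad-CAM sketch; the final upsampling/overlay step onto the
# endoscopic image is omitted here.
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name="top_activation", class_index=None):
    """image: float32 array of shape (1, H, W, 3) with raw pixel values."""
    grad_model = tf.keras.Model(
        model.inputs, [model.get_layer(last_conv_layer_name).output, model.output])

    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image)
        if class_index is None:
            class_index = tf.argmax(preds[0])  # use the predicted class
        class_score = preds[:, class_index]

    # Spatially averaged gradients give one importance weight per feature channel.
    grads = tape.gradient(class_score, conv_out)
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))

    # Channel-weighted sum of the feature maps, rectified and normalized to [0, 1];
    # the result is a low-resolution heatmap to be resized onto the input image.
    cam = tf.nn.relu(tf.reduce_sum(conv_out[0] * weights, axis=-1))
    cam = cam / (tf.reduce_max(cam) + 1e-8)
    return cam.numpy()
```

XRAI, by contrast, aggregates integrated-gradient attributions over image regions, which accounts for its considerably longer output times reported in the Results.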

Results

A total of 1,551 colonoscopy images were used in this study. Among them, 1,451 were allocated for the development of the IBS classification model and split into training and validation datasets at an 8:2 ratio. The model was trained to classify the images into three groups: IBS-C, IBS-D, and normal. Table 1 presents the performance of the final model on an independent evaluation dataset of 100 images that were not included in the training or validation process; this same dataset was also used for the qualitative visualization of characteristic regions using XAI. Precision and recall exceeded 85% for all three classes, and the overall accuracy was 93% (Table 1). With GPU computation, classification prediction took approximately 0.2 s and Grad-CAM image output took about 1 s, whereas XRAI required 10–15 s, a considerably longer processing time than Grad-CAM (Table 2).

Table 1.

Accuracy of IBS classification model

Classification  Precision  Recall  Accuracy (overall)
IBS-C           0.89       0.91    0.93
IBS-D           0.91       0.88
Normal          1.00       1.00

Table 2.

Execution speed of XAI

Type Time
Classification prediction 0.2 s
XAI image output
 Grad-CAM 1 s
 XRAI 10–15 s

XAI for IBS-C

For IBS-C, the two observers agreed that the characteristic areas visualized by XAI covered only part of the image, including regions with unclear vascular boundaries, dark areas where the oral side was not visible, relatively narrow areas with a faded tone, and areas where the folds appeared wrinkled (Fig. 1).

Fig. 1.

XAI results in IBS-C. Areas with unclear vascular boundaries and darkness where the oral side is not visible (①▶). Relatively narrow areas with a faded tone (②▶). Areas where folds appear wrinkled (③▶).

XAI for IBS-D

Similarly, in IBS-D, the characteristic areas often covered only parts of the image, including regions with clear vascular boundaries, areas with homogeneous or reddish coloration, and areas where the intestinal tract on the oral side was visible but appeared narrow and slightly dark. The two observers also agreed that the images captured by XRAI often featured regions with a high amount of mucus (seen as increased light reflection) as characteristic areas (Fig. 2).

Fig. 2.

XAI results in IBS-D. Areas with clear vascular boundaries, homogeneity, or erythematous tones (①▶). Sections where the oral side of the intestinal tract is visible but appear narrow and slightly dark (②▶). Regions with a high amount of mucus (high light reflection) identified by XRAI (③▶).

XAI in Normal Subjects

For normal subjects, in contrast to IBS subjects, the entire image was evaluated as the characteristic area (Fig. 3). Specifically, with Grad-CAM, the whole image was assessed as the characteristic area, whereas XRAI generally provided a weaker evaluation, with specific areas highlighted only in a minority of images. Two observers reached a consensus on these findings.

Fig. 3.

XAI in normal subjects. For normal subjects, the characteristic areas evaluated by Grad-CAM encompassed the entire image.

Discussion

The construction of the XAI has enabled visualization of the image features that the AI uses as the basis for its judgments. Because the insufflation volume during endoscopy depends on the endoscopist and does not differ specifically for patients with IBS, the narrow or poorly distended regions highlighted by the XAI suggest possible distension abnormalities in the colons of patients with IBS. The evaluation of the XAI images by the two observers allowed a consensus on the distinct characteristics of the IBS-C, IBS-D, and normal groups. The observers agreed on the differences between IBS-C and IBS-D: IBS-C was characterized by unclear vascular boundaries, faded tones, and wrinkled areas, whereas IBS-D was characterized by clear vascular boundaries, homogeneous or erythematous areas, and regions with high light reflection. However, quantifying these observations was challenging, and further quantitative analysis is required to establish their universality.

XRAI, compared with Grad-CAM, allows visualization of more detailed characteristic regions but requires more computational resources and longer image output times. Even so, an output time of 10–15 s was not considered excessively long for a pause during colonoscopy procedures. Future improvements should aim at real-time display and at using the visualized characteristic regions to guide biopsies and the collection of mucosa-adherent bacteria. This approach could help clarify what is histopathologically or bacteriologically normal or different from other mucosae, thereby aiding the understanding of the disease state. Although current diagnostic criteria such as Rome IV do not include endoscopic findings, our study suggests that AI may detect subtle mucosal patterns associated with IBS. If these findings are confirmed through targeted biopsies or microbiome analysis of the visualized regions, future revisions of IBS diagnostic frameworks may consider endoscopic features as supportive or reference information.

Several limitations of this study should be acknowledged. First, the normal group was defined as asymptomatic individuals who underwent colonoscopy following a positive fecal occult blood test during routine health screening and exhibited no endoscopic abnormalities. While practical as a control group, these individuals had abnormal screening results, and the presence of subclinical or unrelated conditions cannot be fully ruled out. In addition, this screening population typically consisted of individuals aged 40 years or older, potentially introducing an age distribution bias relative to the IBS group. Second, the AI model was trained to distinguish between IBS-diagnosed patients without endoscopic abnormalities and this specific "normal" population. As a result, the visual features identified by the model reflect this classification task and are not necessarily uniquely specific to IBS. We did not evaluate the model's ability to differentiate IBS from other gastrointestinal conditions with similarly subtle or absent endoscopic findings, such as eosinophilic colitis, collagenous colitis, or functional diarrhea. To validate the disease specificity of the detected features, future studies should include a wider range of functional and inflammatory gastrointestinal disorders in both model training and evaluation. Further limitations include the retrospective design, the absence of pathological validation of the visualized regions, and potential overfitting to images from specific endoscopy equipment. Multicenter studies are required.

Conclusion

In conclusion, we successfully constructed an XAI for IBS and visualized characteristic regions distinguishing healthy individuals from those with IBS. Real-time display of XAI results is expected to further advance the elucidation of IBS pathophysiology.

Acknowledgments

A summary of this study was presented at the 106th Congress of the Japan Gastroenterological Endoscopy Society.

Statement of Ethics

An opt-out informed consent protocol was used for the use of participant data for research purposes. This consent procedure was reviewed and approved by the Ethics Committee of Toyama University Hospital (approval No. R2021032, date of decision May 11, 2021).

Conflict of Interest Statement

The authors have no conflicts of interest to declare.

Funding Sources

This work was supported by operating funds from the Mathematical, Data Science and AI Education Program of the University of Toyama.

Author Contributions

H.M., S.K., H.F., T.A., and I.Y. contributed to the conception and design of the study. H.M. published the colonoscopy images in the repository. S.K. and H.F. set up the Python environment on a local PC and developed the XAI system. The XAI-generated images were assessed by H.M. and S.K. T.A. and I.Y. contributed to the interpretation of the results and provided critical insights into the study’s conceptual framework. H.M. drafted the manuscript, and all authors reviewed and approved the final version of the manuscript.


Data Availability Statement

All the raw images are available in a repository at https://doi.org/10.5061/dryad.9s4mw6mkp.

References

  1. Kudo SE, Misawa M, Mori Y, Hotta K, Ohtsuka K, Ikematsu H, et al. Artificial intelligence-assisted system improves endoscopic identification of colorectal neoplasms. Clin Gastroenterol Hepatol. 2020;18(8):1874–81.e2.
  2. Nagao S, Tani Y, Shibata J, Tsuji Y, Tada T, Ishihara R, et al. Implementation of artificial intelligence in upper gastrointestinal endoscopy. DEN Open. 2022;2(1):e72.
  3. Lacy BE, Mearin F, Chang L, Chey WD, Lembo AJ, Simren M, et al. Bowel disorders. Gastroenterology. 2016;150(6):1393–407.e5.
  4. Tabata K, Mihara H, Nanjo S, Motoo I, Ando T, Teramoto A, et al. Artificial intelligence model for analyzing colonic endoscopy images to detect changes associated with irritable bowel syndrome. PLoS Digit Health. 2023;2(2):e0000058.
  5. Quellec G, Al Hajj H, Lamard M, Conze PH, Massin P, Cochener B. ExplAIn: explanatory artificial intelligence for diabetic retinopathy diagnosis. Med Image Anal. 2021;72:102118.
  6. Allgaier J, Mulansky L, Draelos RL, Pryss R. How does the model make predictions? A systematic literature review on the explainability power of machine learning in healthcare. Artif Intell Med. 2023;143:102616.
  7. Singh RK, Gorantla R, Allada SGR, Narra P. SkiNet: a deep learning framework for skin lesion diagnosis with uncertainty estimation and explainability. PLoS One. 2022;17(10):e0276836.
  8. Vecchi JT, Mullan S, Lopez JA, Hansen MR, Sonka M, Lee A. NeuriteNet: a convolutional neural network for assessing morphological parameters of neurite growth. J Neurosci Methods. 2021;363:109349.


