Abstract
Automated image analysis methods have shown potential for replicating expert interpretation of histology and endoscopy images, which traditionally require highly specialized and experienced reviewers. Inflammatory bowel disease (IBD) diagnosis, severity assessment, and treatment decision-making require multimodal expert data interpretation and integration, which could be significantly aided by applications of machine learning analyses. This review introduces fundamental concepts of machine learning for imaging analysis and highlights research and development of automated histology and endoscopy interpretation in IBD. Proof-of-concept studies strongly suggest that histologic and endoscopic images can be interpreted with accuracy similar to that of knowledge experts. Encouraging results support the potential of automating existing disease activity scoring instruments with high reproducibility, speed, and accessibility, thereby improving the standardization of IBD assessment. Though challenges surrounding ground truth definitions, technical barriers, and the need for extensive multicenter evaluation must be resolved before clinical implementation, automated image analysis is likely to both improve access to standardized IBD assessment and advance the fundamental concepts of how disease is measured.
Keywords: automation, image analysis, pathology, endoscopy, inflammatory bowel disease
This review discusses automated medical image analysis using machine learning for inflammatory bowel disease (IBD) pathology and endoscopy. Near-term and future applications of computational medical image analysis in the diagnosis and management of patients with IBD are explored.
INTRODUCTION
The assessment of inflammatory bowel disease (IBD) incorporates multimodal data, relying heavily on expert impressions and interpretation of endoscopic activity and, increasingly, on emerging histologic scoring.1, 2 However, grading disease severity using endoscopy and histology fundamentally depends on qualitative assessments, even when performed by experienced experts using ordinal measurement instruments.2 This unavoidable subjectivity of interpretation affects the accuracy and reproducibility of disease severity assessments, not to mention accessibility, when the best available scoring relies on a short supply of experts. Automated image analysis capabilities can help address the clinical challenges that characterize image interpretation, aiding experts in scoring and reviewing endoscopic and histologic images.3, 4 Machine learning approaches for image analysis can identify, reproduce, and quantify both perceived and unperceived image features and patterns, some of which are intuitively used by trained expert reviewers.5–7 Moreover, the reproducibility and speed afforded by automated image analysis pipelines can both better standardize existing scoring systems and interpretation and uncover new insights across multiple medical domains, particularly in IBD.
Accurate phenotyping and disease activity assessment at scale are key gaps in IBD research and population health that can be addressed, in part, by automated image analysis.8, 9 Research in image classification using computational techniques has matured in other imaging-dependent medical fields. Examples include ophthalmology and oncology, where data sets increasingly incorporate enhanced phenotyping, including computational assessments of histology and cross-sectional imaging, for more granular descriptions of disease phenotype and behavior.10–13 In contrast, the last decade of big data analytics in IBD research has been hindered by coarse descriptions of phenotype, leaving investigators and clinicians to infer or estimate IBD severity, character, and behavior from administrative claims and partial laboratory records.9, 14 As a result, models and predictions often fail to incorporate and adjust for key objective differences in IBD between patients.
Automating the extraction of endoscopic disease activity, histologic indices, and disease phenotyping using machine learning image analysis could dramatically enhance the value of existing national and international research efforts. Merging outcomes data from commercial claims data sets and research cooperatives is likely to improve the accuracy and precision of predictive models for therapeutic response, long-term outcomes, cost effectiveness, and patient experience in both Crohn’s disease (CD) and IBD more broadly. Ongoing multi-institutional clinical data collaboratives, including Risk Stratification of Rapid Disease Progression in Children With Crohn’s Disease (RISK), a pediatric prospective inception cohort,8, 15 Predicting Response to Standardized Pediatric Colitis Therapy (PROTECT),16 Study of a Prospective Adult Research Cohort with IBD (SPARC IBD),17 and the Mucosal-Luminal Interface (MLI) cohort at Cedars-Sinai Medical Center,18 are examples of data sets well positioned for augmentation by scaled, population-based clinical image analysis. Linking carefully collected clinical outcomes, treatment covariates, and mechanistic data with more granular and informative clinical phenotyping, powered by new image analytic approaches, has the potential to redefine our expectations of predictive modeling in IBD.9, 14 In this review, we introduce concepts of automated image analysis for both histology and endoscopy and highlight research and development in the IBD arena.
KEY CONCEPTS IN MACHINE LEARNING FOR IMAGING ANALYSIS
Modern automated image analysis is achieved through a variety of machine learning (ML) computational techniques (Table 1). Machine learning is a subset of artificial intelligence (AI), the colloquial concept for the ability of computational programs to understand and solve problems in a manner similar to human learning and decision-making.19–21 The process of learning is achieved by constructing a training set, processing that data set through analytic computational techniques, and deriving an optimized model capturing the patterns in the training data associated with the outcome of interest. Training sets are composed of an information data set, which can be text or numeric data from administrative claims or laboratory records, waveform signals, or digital imaging, paired with labels or classifications. Examples of labels, annotations, and classifications include clinical outcomes, human interpretations of imaging or labs, or expert treatment decisions using elements from the data set for an individual or group of individuals. The resulting ML models, which quantify the relationship between data sources and outcomes, can be used to classify new observations, make predictions, or identify groups with similar characteristics where the relationship is unknown. Machine learning model performance can then be evaluated in a test set, a sample of the annotated data set that was unseen during the learning process. A strength of machine learning approaches over traditional regression-based models is their ability to handle high-dimensional data that lead common statistical techniques to overfit.22
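As a toy illustration of the train/test workflow described above, the sketch below splits a labeled data set, fits a minimal "model" (a nearest-centroid classifier standing in for more complex ML techniques), and evaluates it on held-out data. The 1-dimensional feature values and labels are invented for the example:

```python
import random

def train_test_split(data, labels, test_fraction=0.3, seed=0):
    """Hold out a fraction of the annotated data, unseen during learning."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_fraction))
    train, test = idx[:cut], idx[cut:]
    return ([data[i] for i in train], [labels[i] for i in train],
            [data[i] for i in test], [labels[i] for i in test])

def fit_centroids(X, y):
    """'Learn' a model: the mean feature value per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(model, x):
    """Classify a new observation by its nearest class centroid."""
    return min(model, key=lambda label: abs(model[label] - x))

# Invented 1-D "feature" (eg, an inflammation score) with expert labels.
features = [0.1, 0.2, 0.15, 0.9, 0.8, 0.85, 0.12, 0.88, 0.2, 0.95]
labels = ["normal"] * 3 + ["inflamed"] * 3 + ["normal", "inflamed", "normal", "inflamed"]

X_tr, y_tr, X_te, y_te = train_test_split(features, labels)
model = fit_centroids(X_tr, y_tr)
accuracy = sum(predict(model, x) == y for x, y in zip(X_te, y_te)) / len(y_te)
```

Because performance is measured only on observations the model never saw, this estimate reflects generalization rather than memorization, which is the point of the train/test separation.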
TABLE 1.
Definitions of Terms Used to Describe Different Areas and Approaches for Automated Inflammatory Bowel Disease Image Analysis
| Terms | Definition |
|---|---|
| Artificial Intelligence | A broad, generic term for the ability of computer programs to learn and solve problems, ie, the incorporation of human-like intelligence into machines. |
| Machine Learning | Models that use input data, via computational algorithms, to produce a specific output (a subset of artificial intelligence). |
| Deep Learning | Models that recognize specific patterns in the input to classify various types of information (a subset of machine learning). |
| Image Analysis | The extraction of meaningful information as “outputs” from image data. |
| Deep Neural Networks | |
| Artificial Neural Networks | Deep neural networks that process input data through an interconnected network of specialized nodes, similar to biologic neural networks, to implement decisions and provide output information and classifications of the data. |
| Convolutional Neural Networks | Deep neural networks that process an input image through multiple layers, or filters, each classifying the presence of the image features it assesses. The pattern of outputs from image processing determines the relative importance of the image features used in assigning the image classification. They are similar to artificial neural networks but end in a fully connected final layer. |
| Convolutions | Layers added to convolutional neural networks (CNNs) that process the input image to produce an output; these are where the multiple types of information from the input are combined in a CNN. |
| Supervised Learning | Each input image is paired with a label assigned by an expert reference, establishing which group or classification is associated with the image. |
| Unsupervised Learning | The input images (or other data) are unlabeled; the deep learning model itself identifies labels (or classes) for the input while identifying unique patterns within the images. |
DEEP LEARNING AND ARTIFICIAL NEURAL NETWORKS
Deep learning (DL) is a subset of ML where information is processed through interconnected networks of layers, each designed to detect a specific feature within the input data.23 In DL, the pattern of how the input data relates to each feature is used to encode the predicted output (eg, cat vs dog; adenoma vs hyperplastic polyp), similar to the organization and function of human neural networks. In image analysis, artificial neural networks (ANNs) are often used for complex structured data analysis. Convolutional neural networks (CNNs) are well suited for image processing, as these networks are designed to subsample image data into chunks and then process each chunk through filters (or layers) that quantify image characteristics23, 24 (Fig. 1). Early layers in deep neural networks are conceptually straightforward and can include color intensity and boundary detection, but layers become increasingly complex and unfamiliar deeper in the network, hence the term “deep learning.” Ultimately, the pattern of layer outputs from image processing determines the relative importance of the image features used to classify an image. For example, if a CNN is trained to distinguish biopsy images with features of Crohn’s disease from histologically normal ileal tissue, it will use both the relationship of the image to feature components and the connections between features to predict the expert’s histology label.
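A minimal sketch of the convolution operation at the heart of a CNN layer: a hand-written vertical-edge filter (the kind of simple boundary detector early layers learn automatically) slides over a toy image and responds strongly only at the bright-to-dark boundary. The image and kernel values are invented for illustration:

```python
def convolve2d(image, kernel):
    """Slide a filter over the image; each output pixel measures how
    strongly the local patch matches the filter's pattern (valid padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

# A vertical-edge filter: responds when the left of a patch is brighter
# than the right.
edge_kernel = [[1, 0, -1],
               [1, 0, -1],
               [1, 0, -1]]

# A 4x6 toy "image": bright left half, dark right half.
image = [[9, 9, 9, 0, 0, 0]] * 4

feature_map = convolve2d(image, edge_kernel)
# Strong responses appear only where the patch straddles the boundary.
```

In a trained CNN, many such filters run in parallel per layer, and their weights are learned from the data rather than hand-specified as here.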
FIGURE 1.
Pipeline for inflammatory bowel disease (IBD) image analysis. Supervised learning method is highlighted where the data is split into a training and testing set. The same deep neural network (a representative image is drawn) utilizes labeled input data (histology images for colonic Crohn’s disease and histologically normal colon) to learn from the images and utilizes what it has learned to then label and classify the unlabeled testing set. This is different from unsupervised learning where the training set utilizes and learns from unlabeled input data. Histology images by Nephron, own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=15058551.
Deep neural network training can use either supervised or unsupervised learning strategies (Fig. 1). In supervised learning, images are labeled by an expert reference to establish which group(s) or classification(s) are associated with each image.25, 26 The purpose of supervised learning is to deduce a functional relationship between images and their labels that generalizes beyond the training source image data, allowing the model to classify images it has not previously seen. These relationships are captured as equations and numerical coefficients, also known as weights, which measure the similarity of each filter to the input image. In unsupervised learning, the training imaging data is unlabeled for diagnoses, grades, or outcomes.26 The purpose of unsupervised learning is to discover relationships and reveal hidden patterns or features within the images without a priori bias or concepts influencing the pattern identification.27 Unsupervised learning methods can be invaluable in hypothesis-generating work and offer the potential for new data insights aided by DL, rather than mere replication of our existing ideas and interpretations. Current uses of unsupervised learning include image and data clustering, where content experts can attempt to infer clinical significance. Future unsupervised learning methods will be improved by incorporating other medical knowledge using transfer and reinforcement learning techniques, though these remain early in development.
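The clustering use case above can be illustrated with a minimal 1-dimensional k-means sketch: the algorithm discovers two groups in unlabeled feature values without any expert labels. The feature values are invented for the example:

```python
def kmeans_1d(values, k=2, iters=20):
    """Discover k groups in unlabeled data by alternating assignment
    and centroid update (Lloyd's algorithm)."""
    # Deterministic initialization (this simple start assumes k=2).
    centroids = [min(values), max(values)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Recompute each centroid as the mean of its assigned values.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Unlabeled feature values; no diagnoses or grades are provided.
features = [0.1, 0.15, 0.2, 0.8, 0.85, 0.9]
centroids, clusters = kmeans_1d(features)
# The two recovered groups are then handed to content experts, who can
# attempt to infer their clinical significance.
```

Real applications cluster high-dimensional learned image features rather than scalars, but the logic of grouping without labels is the same.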
PROGRESS FOR AUTOMATING PATHOLOGIC SCORING IN IBD
By adapting proof-of-concept approaches used in cancer biopsy data sets,10, 28 early image analysis work is paving the way for automated pathology assessment and scoring in gastroenterology. A recent study by Syed et al29 utilized a CNN to recognize and distinguish disease patterns in biopsy images from patients with multiple inflammatory enteropathies. The CNN model, which used 4 convolutions, classified duodenal biopsy images as celiac disease, environmental enteropathy, or histologically normal tissue. This ML-based histopathological analysis model demonstrated 93.4% classification accuracy. Additionally, this model used feature recognition to highlight secretory cell characteristics as a principal factor in the model’s ability to differentiate between the histologically similar inflammatory enteropathies. This example of feature recognition for identifying image variable importance demonstrates the ability of machine learning to contribute new knowledge and insights.
With success in other domains, both supervised and unsupervised image analysis models have been used for IBD pathology analysis. Mossotto and colleagues utilized pediatric IBD endoscopic and histologic data from 287 children diagnosed with Crohn’s disease, ulcerative colitis (UC), or unclassified inflammatory bowel disease.30 For unsupervised models, principal component analysis and multidimensional scaling algorithms were used. These unsupervised models did not reveal distinct clusters, even though patients with Crohn’s disease and ulcerative colitis had different distributions across the 3D space. For the supervised models, a linear support vector machine (SVM) was used. The SVM utilizes labeled data to find the optimal boundary separating the Crohn’s disease and ulcerative colitis classes using histopathology alone, endoscopy alone, or the combination of both histologic and endoscopic images. The supervised SVM models exhibited classification accuracies of 76.9%, 71.0%, and 82.7%, respectively. Interestingly, visual findings of ileal inflammation were identified across each model as a key feature for the diagnosis of Crohn’s disease.
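To illustrate how a supervised model finds a boundary separating labeled disease classes, the sketch below trains a perceptron, a simpler relative of the linear SVM used in the study (an SVM additionally chooses the boundary that maximizes the margin between classes). The 2-feature data points and labels are invented for the example:

```python
def train_perceptron(X, y, epochs=50, lr=0.1):
    """Fit a linear boundary w.x + b = 0 from labeled examples by
    nudging the boundary whenever a sample is misclassified."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):          # target is +1 or -1
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if target * score <= 0:          # misclassified: update
                w = [wi + lr * target * xi for wi, xi in zip(w, x)]
                b += lr * target
    return w, b

def classify(w, b, x):
    """Predict the class by which side of the boundary x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Invented 2-feature examples: (histology score, endoscopy score),
# labeled +1 or -1 for two hypothetical disease classes.
X = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.85), (0.2, 0.1), (0.1, 0.3), (0.25, 0.2)]
y = [1, 1, 1, -1, -1, -1]

w, b = train_perceptron(X, y)
preds = [classify(w, b, x) for x in X]
```

As in the study, the features from both modalities enter the classifier jointly; the combined feature vector gives the boundary more information than either modality alone.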
Beyond utilizing image analysis to broadly classify pathology images as Crohn’s disease vs normal tissue, a study by Klein et al utilized colonic biopsies at baseline diagnosis and at 5-year follow-up to predict Crohn’s colitis Montreal phenotypes of inflammatory (B1), stricturing (B2), or penetrating (B3) disease.31 Expert-directed ground-truth quantification and labeling of inflammatory cells, mucosal crypts, and both mature and young collagen were performed, with confirmation by pancytokeratin, Masson trichrome, and reticulin staining, respectively. Using a supervised neural network, each of these measurements was used to predict the phenotype. Using the aforementioned features, the model differentiated between B1 and B2 phenotypes (70.5% accuracy) but not between B2 and B3 phenotypes.
Automated image analysis may not only improve the speed and reproducibility of IBD histologic assessment but may also enable new capabilities in characterizing and quantifying tissue properties that can be difficult to capture. Crypt distortion is recognized as an important but potentially ambiguous characteristic of disease activity that is challenging to measure. Matalka et al developed an automated system to assess mucosal damage via quantification of architectural crypt distortion, with 98.3% accuracy compared with a human expert reference.32, 33 Beyond classifying features, Pradhan et al, utilizing a SegNet architecture, were able to automatically localize and separate mucosal and submucosal histologic regions with an F1 score of 0.75, a key capability considering the importance of spatial relationships.34 Other groups have used automated image analysis to successfully identify and quantify histologic elements, including glandular epithelium and lumen regions, goblet cells, and stroma, in colonic biopsy specimens with high classification performance.35 These results exemplify the potential for new measures and concepts of histologic activity and characteristics, here examining crypt morphology and damage, which may improve IBD chronicity assessments and histologic damage scoring.
LIMITATIONS OF MACHINE LEARNING ANALYTIC MODELS FOR IBD HISTOPATHOLOGY
Though studies utilizing deep neural networks for IBD pathology image analysis have found success, several limitations remain. First, there can be considerable intra- and intersite heterogeneity in the source slides themselves, including variation in staining and biopsy preparation due to different institutional protocols or changes in processing techniques over time. Standardization of either staining procedures or, more likely, digital color and contrast optimization methods is needed to address this limitation.36 Additionally, the development of ML models has spurred growing interest in visualizing the spatial regions and characteristics the models use for classification, but the current IBD literature lacks the necessary details for recognizing disease patterns across biopsy images. Furthermore, initial published IBD ML models have not been validated or tested across large national data sets, meaning much more work is needed before they can be generalized and used in clinical or pathologic workflows. Other considerations merit discussion around careful curation of the biopsy data sets planned for analytic use: (1) assessment of biopsy size, where endoscopic biopsies ideally should be of the same or similar size for uniform assessment; (2) consistency in the technique by which histologic tissue is sectioned; and (3) controls for uniform hematoxylin and eosin (H&E) staining quality across multiple sites. These are very real challenges when real-world clinical data are sourced across multiple sites. Another subtle yet critical computational limitation is whether a model can identify distinguishing histologic features between IBD types (Crohn’s disease and UC) while still appreciating the other “routine” causes of intestinal inflammation encountered in everyday practice, such as intestinal infection, drug reactions, and ischemia.
This may be more difficult than it seems because the body has a limited number of ways to react and respond to the various insults it receives, and this holds true for the GI tract; consequently, there is complicating overlap in the histology of the various insults seen in the bowel. Having a team that includes an experienced gastrointestinal (GI) specialty–trained pathologist working closely with gastroenterology colleagues will be critical to getting the biopsy assessment right. Finally, there are few means of balanced tissue collection, standardization of the biopsies themselves, or histologic time course studies to inform the value and best use cases for future histologic analyses.
PROGRESS IN AUTOMATED ENDOSCOPY ASSESSMENT
Similar to work in pathology, automated image analysis methods have repeatedly been shown to be capable of replicating expert judgment in general endoscopy, with approaches applicable to IBD. Automated polyp detection is becoming increasingly familiar to gastroenterologists, with reported accuracies of over 90% using CNNs.37, 38 Evidence also supports that computer vision can improve polyp detection compared with unassisted endoscopists, though so far most polyps missed by endoscopists were diminutive and potentially of little clinical relevance.39 Although the incremental value of automated polyp detection remains debated, CNN approaches show promise for adding capabilities by predicting underlying tissue histology from visual information. Using narrow band imaging, CNN models have demonstrated an accuracy of 94% (95% confidence interval [CI], 86%–97%) for distinguishing adenomatous from hyperplastic colonic polyps.40 Furthermore, sophisticated real-time endohistologic visualization systems are valuable for providing histologic inference at the time of colonoscopy, but challenges in interpretation by nonexperts present a major barrier to widespread use. Using endocytoscopic imaging providing over 500-fold magnification of the mucosal surface, CNN model predictions of polyp histology were similar to those provided by well-trained experts (96.0% vs 94.6%; P = 0.141) and significantly better than the interpretations of nonexpert general gastroenterologists (96.0% vs 70.4%; P < 0.0001).41
Parallel to applying machine learning image analysis to colon adenoma detection and pathology interpretation, these methods are being explored for automation of endoscopic assessment in IBD. Successful automation of endoscopy assessment and scoring has obvious advantages for standardization, reduced interobserver variation, and dramatically increased speed of disease assessment. Automation efforts have focused on ulcerative colitis, considering the relative homogeneity and consistency of its visual endoscopic features compared with Crohn’s disease. Several groups have demonstrated the capabilities and performance of disease severity grading in UC using still images captured from endoscopy. Using an endoscopic still image training set from 444 unique patients, Ozawa and colleagues reported that CNN-based methods could distinguish Mayo Endoscopic Scores (MES) of 0 to 1 vs 2 to 3 with near perfect replication of expert labeling (AUC 0.98; 95% CI, 0.97–0.98).42 In a separate study of 3082 unique UC patients, authors reported similar CNN performance for distinguishing an MES of 0 to 1 vs 2 to 3, with an area under the curve of 0.97 (95% CI, 0.967–0.972).43 In this study, the exact agreement between automated and expert-adjudicated scores for still images was similar to the agreement between paired experts (κ = 0.84 vs κ = 0.86, respectively). Though overall expert agreement on the MES was very good, agreement was better at the MES extremes (MES 0, 77.0%; MES 3, 85.7%) than at intermediate grades (MES 1, 54.7%; MES 2, 69.3%). Importantly, the upper performance boundary of automated computer vision systems is limited by the reliability and accuracy of the expert reviewers used to construct the training set.
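The κ statistics reported above measure chance-corrected agreement between two raters. A minimal sketch of Cohen's kappa, computed on invented MES gradings of 10 still images (the scores below are illustrative, not from the cited studies):

```python
def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two sets of categorical scores:
    kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Fraction of items the raters scored identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    chance = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                 for c in categories)
    return (observed - chance) / (1 - chance)

# Invented MES gradings (0-3) of 10 images by a model and an expert.
model_scores  = [0, 0, 1, 1, 2, 2, 3, 3, 1, 2]
expert_scores = [0, 0, 1, 2, 2, 2, 3, 3, 1, 1]
kappa = cohens_kappa(model_scores, expert_scores)
```

Here raw agreement is 80%, but kappa is lower (about 0.73) because some agreement would occur by chance alone; this is why kappa, not raw agreement, is reported when comparing automated and expert scoring.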
Automated endoscopic scoring systems can also be trained to replicate other disease activity instruments, such as the ulcerative colitis endoscopic index of severity (UCEIS). Although the MES is familiar and easy to use, the UCEIS provides an increased dynamic range (0–8 vs 0–3) and potentially improved reproducibility compared with the MES.44, 45 Takenaka and colleagues at Tokyo Medical and Dental University recently reported excellent results automatically assigning UCEIS scores to still images from colonoscopies.46 In their prospective study of 875 UC subjects, they reported an intraclass correlation coefficient between automated and expert-adjudicated UCEIS scores of 0.917 (95% CI, 0.911–0.921). Automated evaluation methods classified endoscopic remission (UCEIS = 0) with 90.1% accuracy and a kappa coefficient of κ = 0.80 (95% CI, 0.78–0.81). With evidence supporting the value of histologic remission, there is increasing interest in incorporating both endoscopic and histologic end points in therapeutic trials and clinical care. Similar to work inferring polyp histology from visual features, endoscopic image characteristics were used to predict histologic remission (defined as a Geboes score of ≤3) with 92.9% accuracy and κ = 0.86 (95% CI, 0.84–0.88). In summary, these data suggest that the visual endoscopic features of UC can be readily detected and classified by modern CNN architectures, which are flexible enough to be retrained using multiple endoscopic scoring instruments. Furthermore, the potential for reliably inferring underlying histology from visual endoscopic characteristics would increase the efficiency of comprehensive disease activity assessment.
Although the homogeneous distribution and appearance of UC lend themselves to computer vision methods, the morphologic and anatomic variation typical of Crohn’s disease poses problems for current image classification technologies. As a result, work replicating the common Simple Endoscopic Score for Crohn’s Disease (SES-CD) and Crohn’s Disease Endoscopic Index of Severity (CDEIS) instruments has been limited. However, current CNN-based image classification methods are proving useful for aiding the detection of small bowel ulcerations in Crohn’s disease using video capsule endoscopy (VCE). In a study of 49 patients, Crohn’s disease–related small bowel ulcers were detected on VCE with an accuracy of 95.4% to 96.7% using an expert gastroenterologist as a reference.47 Other studies aiming to detect multiple features on capsule endoscopy, including CD-related ulcerations, have reported similar results, potentially making capsule review more convenient and feasible in the near future. Automated lesion detection methods reduced mean VCE review times from 96.6 ± 22.53 minutes with conventional reading to 5.9 ± 2.23 minutes with CNN-based assisted reading (P < 0.001), with no differences in sensitivity for disease findings.48 Although the heterogeneity of Crohn’s disease characteristics presents challenges that will require further technologic development, current methods may still prove useful in easing the time burden of VCE review while maintaining sensitivity.
LIMITATIONS OF MACHINE LEARNING ANALYTIC MODELS FOR IBD ENDOSCOPY
Applications of image analysis in endoscopy to date are encouraging; however, several limitations need resolution before routine use becomes a reality. First, still image interpretation is insufficient for disease activity assessment. Second, methods are needed to intelligently aggregate all relevant visual data, as a 15-minute endoscopic video contains approximately 27,000 still frames. Additionally, most automated systems are trained and tested on idealized, carefully cleaned, and well-curated image data sets. Fully automated systems will need to intelligently account for noise and ambiguous features such as stool or debris, postinterventional bleeding, or other confounders that could be erroneously mistaken for disease activity. To address these issues, a pilot study examined merging multiple image classification systems for automating endoscopic scoring in 264 videos sourced from an international therapeutic trial in UC.49 The fully automated analysis of unaltered source videos predicted the exact MES in 69.5% of cases compared with central reader scoring; central reviewers agreed with each other on the MES in 83.7% of cases. More work and research are needed to overcome the barriers to fully automating both CD and UC endoscopic assessment beyond the widely available CNN-based image classification approaches.
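The frame count above follows from simple arithmetic (a sketch; the 30 frames-per-second capture rate is our assumption, chosen because it reproduces the ~27,000-frame figure):

```python
# Frame volume of a full-length endoscopic video.
minutes = 15            # video duration cited in the text
fps = 30                # assumed standard video capture rate
frames = minutes * 60 * fps   # total still frames to assess

# Per-frame classification alone therefore yields tens of thousands of
# predictions per procedure, which must be aggregated into one score.
```

This scale is why frame-level CNN outputs must be intelligently pooled (eg, across time and colon segments) rather than reported individually.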
CONCLUSIONS
Efforts to automate and standardize pathology and endoscopy interpretation for IBD continue to make progress, and introduction into routine workflows should be expected in the near term. Early applications of these image classification methods will likely work in tandem with expert interpretation to reduce the time needed for review, provide automated secondary confirmatory assessment, and report preliminary findings. After rigorous validation and testing in real-world environments, full automation of existing endoscopic and pathologic scoring will arrive first for industry therapeutic trials and later for clinical care. Explorations of utilizing computational image analysis to improve existing scoring systems (endoscopy) and develop new ones (pathology) for evaluating UC and CD are underway. In time, the image analysis technologies discussed here may be applied to improve upon, or completely rebuild, our methods for assessing disease activity and predicting future disease behavior (Fig. 2).
FIGURE 2.
Overview of current and future efforts to automate and standardize pathology and endoscopy interpretation for inflammatory bowel diseases.
However, the enthusiasm surrounding the technical achievements of AI in IBD must be tempered by the sobering realities and requirements of clinical implementation. The road to qualification and regulatory approval of AI measures may be long, especially for new computer-generated scoring metrics, which may prove to supersede the capabilities of existing scores like the MES, UCEIS, Geboes score, and Robarts Histopathology Index. Before routine use and acceptance of automated measures, health care providers will need to become fluent in the language of ML and AI. Structured education for medical trainees, physicians, and advanced practice providers will be imperative if we are to become critical and conscientious users of AI technologies. As methods improve and practitioners become comfortable with computer-based interpretations, questions of responsibility and liability for decision-making, where information or recommendations from AI systems are used or perhaps ignored, will need to be addressed. If these implementation issues are addressed with transparency, academic discourse, patience, and creativity, AI may indeed become a truly transformative tool in research and clinical care for the foreseeable future.
ACKNOWLEDGMENTS
The authors would like to note the invaluable input provided by Lubaina Ehsan, Lee A. Denson, Hans Vitzthum von Eckstaedt, and Aman Shrivastava in early drafts of this article. No institutional ethical approval was required for this review.
Supported by: Research reported in this article was supported by National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health under award number K23DK117061 (Syed) and K23DK101687 (Stidham).
Conflicts of Interest: RWS does consultancy for Abbvie, Janssen, Merck, Takeda, and Corrona and has investigator-initiated research support from Abbvie. University of Michigan has filed provisional patents on behalf of RWS for image analysis technologies with licenses to AMI, inc.
REFERENCES
- 1. Escher J, Dias JA, Bochenek K, et al. Inflammatory bowel disease in children and adolescents: recommendations for diagnosis-the Porto criteria. J Pediatr Gastroenterol Nutr. 2005;41:1.
- 2. Bernstein CN, Fried M, Krabshuis JH, et al. World Gastroenterology Organization Practice Guidelines for the diagnosis and management of IBD in 2010. Inflamm Bowel Dis. 2010;16:112–124.
- 3. Min JK, Kwak MS, Cha JM. Overview of deep learning in gastrointestinal endoscopy. Gut Liver. 2019;13:388–393.
- 4. Patel V, Khan MN, Shrivastava A, et al. Artificial intelligence applied to gastrointestinal diagnostics: a review. J Pediatr Gastroenterol Nutr. 2020;70:4–11.
- 5. de Bruijne M. Machine learning approaches in medical image analysis: from detection to diagnosis. Med Image Anal. 2016;33:94–97.
- 6. Komura D, Ishikawa S. Machine learning methods for histopathological image analysis. Comput Struct Biotechnol J. 2018;16:34–42.
- 7. Madabhushi A, Lee G. Image analysis and machine learning in digital pathology: challenges and opportunities. Med Image Anal. 2016;33:170–175.
- 8. Kugathasan S, Denson LA, Walters TD, et al. Prediction of complicated disease course for children newly diagnosed with Crohn’s disease: a multicentre inception cohort study. Lancet. 2017;389:1710–1718.
- 9. Scott FI, Rubin DT, Kugathasan S, et al. Challenges in IBD research: pragmatic clinical research. Inflamm Bowel Dis. 2019;25:S40–S47.
- 10. Beck AH, Sangoi AR, Leung S, et al. Systematic analysis of breast cancer morphology uncovers stromal features associated with survival. Sci Transl Med. 2011;3:108ra113.
- 11. Zhang Z, Xie Y, Xing F, et al., eds. MDNet: a semantically and visually interpretable medical image diagnosis network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Hawaiʻi Convention Center, Honolulu, HI; 2017.
- 12. He J, Shang L, Ji H, et al., eds. Deep learning features for lung adenocarcinoma classification with tissue pathology images. In: International Conference on Neural Information Processing. Berlin, Germany: Springer; 2017.
- 13. Abràmoff MD, Lou Y, Erginay A, et al. Improved automated detection of diabetic retinopathy on a publicly available dataset through integration of deep learning. Invest Ophthalmol Vis Sci. 2016;57:5200–5206.
- 14. Denson LA, Curran M, McGovern DPB, et al. Challenges in IBD research: precision medicine. Inflamm Bowel Dis. 2019;25:S31–S39.
- 15. Pediatric RISK Stratification Study: Crohn’s & Colitis Foundation. https://www.crohnscolitisfoundation.org/research/current-research-initiatives/pediatric-risk-stratification. Accessed August 4, 2020.
- 16. Hyams JS, Davis S, Mack DR, et al. Factors associated with early outcomes following standardised therapy in children with ulcerative colitis (PROTECT): a multicentre inception cohort study. Lancet Gastroenterol Hepatol. 2017;2:855–868.
- 17. SPARC IBD: Crohn’s & Colitis Foundation. https://www.crohnscolitisfoundation.org/research/current-research-initiatives/sparc-ibd. Accessed August 4, 2020.
- 18. Presley LL, Ye J, Li X, et al. Host-microbe relationships in inflammatory bowel disease detected by bacterial and metaproteomic analysis of the mucosal-luminal interface. Inflamm Bowel Dis. 2012;18:409–417.
- 19. Mitchell TM. Machine Learning. New York: McGraw-Hill; 1997.
- 20. Sammut C, Webb GI. Encyclopedia of Machine Learning and Data Mining. New York, NY: Springer Publishing Company, Incorporated; 2017.
- 21. Kok JN, Boers E, Kosters WA, et al. Artificial intelligence: definition, trends, techniques, and cases. Artif Intell. 2009;1:1–20.
- 22. Gao L, Song J, Liu X, et al. Learning in high-dimensional multimedia data: the state of the art. Multimedia Syst. 2017;23:303–313.
- 23. Bengio Y. Learning deep architectures for AI. Found Trends Mach Learn. 2009;2:1–127.
- 24. Deng J, Dong W, Socher R, et al., eds. ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami Beach, FL: IEEE; 2009.
- 25. Alpaydin A. Neural Models of Incremental Supervised and Unsupervised Learning. Lausanne, Switzerland: EPFL; 1990.
- 26. Hastie T, Tibshirani R, Friedman J. Unsupervised learning. In: The Elements of Statistical Learning. Berlin, Germany: Springer; 2009:485–585.
- 27. Wang X, Sontag D, Wang F, eds. Unsupervised learning of disease progression models. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY; 2014.
- 28. Bandi P, Geessink O, Manson Q, et al. From detection of individual metastases to classification of lymph node status at the patient level: the CAMELYON17 challenge. IEEE Trans Med Imaging. 2019;38:550–560.
- 29. Syed S, Al-Boni M, Khan MN, et al. Assessment of machine learning detection of environmental enteropathy and celiac disease in children. JAMA Netw Open. 2019;2:e195822.
- 30. Mossotto E, Ashton JJ, Coelho T, et al. Classification of paediatric inflammatory bowel disease using machine learning. Sci Rep. 2017;7:2427.
- 31. Klein A, Mazor Y, Karban A, et al. Early histological findings may predict the clinical phenotype in Crohn’s colitis. United European Gastroenterol J. 2017;5:694–701.
- 32. Kass M, Witkin A, Terzopoulos D. Snakes: active contour models. Int J Comput Vis. 1988;1:321–331.
- 33. Matalka II, Al-Omari FA, Salama RM, et al. A novel approach for quantitative assessment of mucosal damage in inflammatory bowel disease. Diagn Pathol. 2013;8:156.
- 34. Pradhan P, Meyer T, Vieth M, et al., eds. Semantic segmentation of non-linear multimodal images for disease grading of inflammatory bowel disease: a SegNet-based application. In: International Conference on Pattern Recognition Applications and Methods. Setúbal, Portugal: Science and Technology Publications; 2019.
- 35. Ma Z, Swiderska-Chadaj Z, Ing N, et al., eds. Semantic segmentation of colon glands in inflammatory bowel disease biopsies. In: International Conference on Information Technologies in Biomedicine. Kamień Śląski, Poland: Springer; 2018.
- 36. Vahadane A, Peng T, Sethi A, et al. Structure-preserving color normalization and sparse stain separation for histological images. IEEE Trans Med Imaging. 2016;35:1962–1971.
- 37. Fernández-Esparrach G, Bernal J, López-Cerón M, et al. Exploring the clinical potential of an automatic colonic polyp detection method based on the creation of energy maps. Endoscopy. 2016;48:837–842.
- 38. Urban G, Tripathi P, Alkayali T, et al. Deep learning localizes and identifies polyps in real time with 96% accuracy in screening colonoscopy. Gastroenterology. 2018;155:1069–1078.e8.
- 39. Wang P, Berzin TM, Glissen Brown JR, et al. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut. 2019;68:1813–1819.
- 40. Byrne MF, Chapados N, Soudan F, et al. Real-time differentiation of adenomatous and hyperplastic diminutive colorectal polyps during analysis of unaltered videos of standard colonoscopy using a deep learning model. Gut. 2019;68:94–100.
- 41. Kudo SE, Misawa M, Mori Y, et al. Artificial intelligence-assisted system improves endoscopic identification of colorectal neoplasms. Clin Gastroenterol Hepatol. 2020;18:1874–1881.e2.
- 42. Ozawa T, Ishihara S, Fujishiro M, et al. Novel computer-assisted diagnosis system for endoscopic disease activity in patients with ulcerative colitis. Gastrointest Endosc. 2019;89:416–421.e1.
- 43. Stidham RW, Liu W, Bishu S, et al. Performance of a deep learning model vs human reviewers in grading endoscopic disease severity of patients with ulcerative colitis. JAMA Netw Open. 2019;2:e193963.
- 44. Travis SP, Schnell D, Krzeski P, et al. Developing an instrument to assess the endoscopic severity of ulcerative colitis: the Ulcerative Colitis Endoscopic Index of Severity (UCEIS). Gut. 2012;61:535–542.
- 45. Xie T, Zhang T, Ding C, et al. Ulcerative Colitis Endoscopic Index of Severity (UCEIS) versus Mayo Endoscopic Score (MES) in guiding the need for colectomy in patients with acute severe colitis. Gastroenterol Rep (Oxf). 2018;6:38–44.
- 46. Takenaka K, Ohtsuka K, Fujii T, et al. Development and validation of a deep neural network for accurate evaluation of endoscopic images from patients with ulcerative colitis. Gastroenterology. 2020;158:2150–2157.
- 47. Klang E, Barash Y, Margalit RY, et al. Deep learning algorithms for automated detection of Crohn’s disease ulcers by video capsule endoscopy. Gastrointest Endosc. 2020;91:606–613.e2.
- 48. Ding Z, Shi H, Zhang H, et al. Gastroenterologist-level identification of small-bowel diseases and normal variants by capsule endoscopy using a deep-learning model. Gastroenterology. 2019;157:1044–1054.e5.
- 49. Stidham R, Yao H, Bishu S, et al. P595 Feasibility and performance of a fully automated endoscopic disease severity grading tool for ulcerative colitis using unaltered multisite videos. J Crohns Colitis. 2020;14:S495–S496.


