Abstract
In routine medical practice, obtaining test results is time-consuming and expensive, burdening both doctors and patients. Digital pathology research uses computational technologies to manage data, presenting an opportunity to improve the efficiency of diagnosis and treatment. Artificial intelligence (AI) offers particular advantages in the data-analytics phase. Extensive research has shown that AI algorithms can produce timely and standardized conclusions from whole slide images. In conjunction with the development of high-throughput sequencing technologies, algorithms can integrate and analyze data from multiple modalities to explore the correspondence between morphological features and gene expression. This review examines how the most widely used image data, hematoxylin–eosin stained tissue slide images, can help address the imbalance of healthcare resources. The article focuses on the role of deep learning in assisting doctors' work and discusses the opportunities and challenges of AI.
Keywords: whole slide images, digital pathology, deep learning, multi-modality
Introduction
In the era of precision medicine, doctors' demand for predictive analytics has increased. Multiple types of medical data, such as digital pathology slides, immunohistochemistry (IHC) images, spatial transcriptome sequencing and medical history records, are used for big data analysis. Unleashing the power of medical big data can be succinctly summarized as providing the most accurate treatments to the right patients at the right time [1]. Compared with molecular exams, pathology slides are widely accepted in clinics for their affordability and low technical requirements. Pathologist review of tissue slides remains the gold standard for diagnosing many diseases (such as cancer). Hematoxylin–eosin stained (H&E) tissue slides show multiple cytoplasmic, nuclear and extracellular matrix features, making them an easily accessible and informative source of data [2–4]. However, while pathologists can flexibly adapt to the high morphological and technical variability of histologic slides, their objectivity is limited by cognitive and visual traps. Recently, digital pathology has focused on quantitative management of the information generated from digitized specimen slides [5]. The data analysis procedure mainly involves scanning and stitching traditional glass sections into whole slide images (WSIs), saving the data, and then completing steps such as medical diagnosis or machine learning analysis by viewing the WSIs. This process generates a large amount of digitally accessible data that can easily be shared among researchers worldwide, which not only enriches the learning resources for medical purposes, but also attracts a large number of computer scientists and is creating a revolution in the field of digital pathology.
Artificial intelligence (AI)-based digital pathology is an emerging area that has shown great promise in increasing both the accuracy and availability of high-quality health care in many medical fields. Owing to the rise in the amount of data, the increase in computing power and the emergence of new machine learning algorithms, 'AI' has been widely applied in fields such as face recognition and autonomous driving since 2012 [6–8]. Deep learning algorithms have rapidly become the preferred method for analyzing medical images [9]. AI-based digital pathology not only facilitates a more efficient pathology workflow, but also provides a more comprehensive and personalized view, enabling pathologists to track the progression of complex diseases. In addition, a single stained section reveals little to the naked eye, and preparing multiple sections for staining and examination is costly [10]. These difficulties create opportunities for AI-assisted systems to reduce doctors' workload and improve the accuracy of diagnosis and treatment.
Computational techniques in digital pathology
AI is a branch of computer science that attempts to understand the nature of intelligence and construct complex machines that possess the same characteristics as human intelligence [11]. Humans initially implemented AI through machine learning, designing features by hand to extract patterns from raw data. Traditional algorithms include Logistic Regression [12], Naive Bayes [13], Decision Tree [14] and Support Vector Machines (SVM) [15]. However, it has gradually become clear that the performance of simple machine learning algorithms depends heavily on the selection and representation of features. To overcome the difficulties of feature design, deep learning techniques were developed. These models are inspired by human neurons and use artificial neural networks to simulate the biological functions of the brain. Compared with the simple architecture of traditional machine learning models, deep learning models use a hierarchical structure that allows computers to learn to abstract complex, data-driven logic from simple concepts without significant human intervention in feature design. The first layer of the model is the input layer, and training is completed by fitting the ground truth in the last layer through techniques such as back propagation and gradient descent. The performance of the model relies on test evaluation on a dataset that does not overlap with the training set. Popular models include deep neural networks (DNNs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), etc. [16, 17]. DNNs have the most basic network architecture, in which linear operations are performed between the neurons of adjacent layers, followed by an activation function that provides nonlinearity [18]. CNNs have been used extensively in computer vision. Since the creation of LeNet [19] in 1998, CNNs have become deeper, more refined in their structural design, and gradually more specialized in their research tasks; a series of models, such as ResNet [20], emerged that can surpass human-level recognition (a minimal sketch of such a network is given below). Usually, CNNs are followed by several fully-connected layers that map the feature map generated by the convolutional layers into a fixed-length feature vector. Fully convolutional networks (FCNs) replace the fully-connected layers with convolutional layers to achieve pixel-level classification of images. CNNs have been widely used on radiological images, IHC images and pathology images [21, 22]. Another type of network commonly used to process sequence data is the RNN, which can capture associations between distant samples. RNNs have great advantages in fields such as speech recognition and text prediction and are often applied in the analysis of RNA and DNA sequences [23, 24].
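As a concrete illustration, the sketch below shows a minimal CNN patch classifier with one training step in PyTorch; the layer sizes, the two-class output and the random mini-batch are illustrative assumptions, not a model from the cited literature.

```python
# A minimal CNN for classifying fixed-size image patches (illustrative only).
import torch
import torch.nn as nn

class PatchCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Convolutional layers abstract local visual features hierarchically.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # A fully-connected head maps the pooled feature map to a fixed-length vector of class scores.
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# One training step: fit the ground truth via back propagation and gradient descent.
model = PatchCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
criterion = nn.CrossEntropyLoss()
patches = torch.randn(8, 3, 224, 224)      # a mini-batch of RGB patches (random stand-in data)
labels = torch.randint(0, 2, (8,))         # ground-truth class per patch
loss = criterion(model(patches), labels)
optimizer.zero_grad()
loss.backward()                            # back propagation
optimizer.step()                           # gradient descent update
```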
Recently, the application scenarios of AI in digital pathology have gradually expanded. A generalized workflow (Figure 1) has been widely applied to a variety of topics and organs, introducing the latest classification, segmentation and detection techniques from the field of computer vision. The main research involves auxiliary functions, focused on image quality enhancement, cell identification, tissue typing and data storage, as well as direct applications, such as diagnosis, patient stratification, prognosis, treatment response, survival prediction and biomarker discovery. Several AI-related studies focus on investigating the interpretability of algorithms [25, 26] or building clinical application platforms [27], making it possible to select and stratify patients for treatment. As genetic sequencing databases become more refined, the combination of digital pathology images and molecular datasets has led to breakthroughs in molecular biology [28, 29].
Figure 1.
Workflow diagram for doctors, pathologists and computer scientists. Based on a doctor's request for a pathological examination, the pathologist prepares stained sections of the patient's tissue samples, digitizes the sections into WSIs and annotates them as ground truth. These images are combined with relevant electronic clinical information to serve as a training database for AI models that can be shared worldwide via the Internet. Computer scientists train and test the models based on the ground truth and complete multiple performance-testing experiments to evaluate model value. The entire research process can be explored in categories based on medical tasks, disease types, computer models and computer tasks. Mature models can assist pathologists in clinical diagnosis and treatment [169].
Methods
We investigate the application of AI to pathology images, centered on H&E images. Manual feature extraction requires a priori knowledge from pathologists, usually for specific cancers or tissue types. Deep learning addresses the challenge of extracting image features that cannot be manually defined and of finding image features that generalize. To investigate these techniques, we performed a manual screening of the US National Library of Medicine PubMed database (PubMed) for highly cited peer-reviewed articles published in English between January 2017 and September 2021. Search terms comprised a medical group (WSI, H&E) and an AI group (AI, deep learning, supervised learning, semi-supervised learning, weakly supervised learning, unsupervised learning, multitask). The search returned 6707 articles. The selection process consisted of three steps: 1) only research articles were retained, excluding reviews, comments and similar types; 2) based on titles and abstracts, articles with at least one keyword from each of the medical and AI groups were retained, while purely medical experimental articles and pure computer-technology articles were excluded; 3) papers unrelated to pathology images were excluded, since abbreviations such as 'AI', 'WSI' and 'H&E' are easily confused with other terms (e.g. 'Al' for aluminum). Only AI articles in which H&E images were the main subject of study were retained. In total, 155 articles are discussed in the main body of this paper. Sample articles for each research direction are presented in Table 1 [28–30].
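A search of this kind can also be scripted for reproducibility. The sketch below uses Biopython's Entrez interface with an illustrative paraphrase of the query; it is not the exact script or query string used in our screening.

```python
# Illustrative PubMed query via NCBI Entrez (the query string is a paraphrase, not the exact one used).
from Bio import Entrez

Entrez.email = "researcher@example.org"  # NCBI requires a contact address (placeholder)

term = ('("whole slide image" OR WSI OR "H&E") AND '
        '("deep learning" OR "artificial intelligence" OR "weakly supervised learning")')
handle = Entrez.esearch(db="pubmed", term=term, datetype="pdat",
                        mindate="2017/01/01", maxdate="2021/09/30", retmax=10000)
record = Entrez.read(handle)
print(record["Count"], "articles found")  # manual screening then follows the three steps above
```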
Table 1.
Sample articles for each research direction
Section | Topic | Year | Tissue type | Medical task | Computer task | Data type | Input size | Main dataset size | Backbone | Learning method | External dataset validation | Refs.a
---|---|---|---|---|---|---|---|---|---|---|---|---
Information preparation | Automatic segmentation | 2017 | Seven organs | Nuclear segmentation | Segmentation | H&E | 51 × 51 pixels | 30 WSIs | CNN | Supervised learning | No | [41]
Information preparation | Virtual staining | 2020 | Kidney tissue sections | Data preparation | Generation | Autofluorescence images | 256 × 256 pixels | 12 thin tissue sections | GAN | Unsupervised learning | Yes | [57]
Information preparation | Prognosis (MSI) | 2019 | Gastric (stomach) adenocarcinoma and colorectal cancer | Classify MSI versus microsatellite stability | Classification | H&E | 256 × 256 μm² | 1053 WSIs | CNN | Supervised learning | Yes | [65]
Diagnosis | Diagnosis | 2019 | Prostate cancer, basal cell carcinoma and breast cancer metastases to axillary lymph nodes | Develop diagnostic decision support systems | Classification | H&E | 224 × 224 pixels | 44 732 WSIs | CNN + RNN | Weakly supervised learning | Yes | [91]
Diagnosis | Diagnosis | 2020 | Prostate cancer | Automated Gleason grading of prostate biopsies | Classification | H&E | 1024 × 1024 pixels | 5759 biopsies | CNN | Supervised learning | Yes | [74]
Diagnosis | Diagnosis | 2020 | Diffuse large B-cell lymphoma (DLBCL) | Differentiate DLBCL and non-DLBCL | Classification | H&E | Various sizes | 4530 tissue sections | CNN | Supervised learning | Yes | [92]
Post-treatment | Outcome | 2020 | Colorectal cancer | Develop an automatic prognostic biomarker | Segmentation, classification | H&E | 224 × 224 pixels | 12 226 119 tiles | CNN | Weakly supervised learning | Yes | [102]
Post-treatment | Prognosis | 2021 | Gastric cancer | Improve tumor node metastasis staging system | Segmentation, classification | H&E | 700 × 700 pixels, 768 × 768 pixels | 9366 WSIs | CNN | Supervised learning | Yes | [113]
Images with nucleic acid sequences | Tumor/normal or subtype and gene mutation | 2018 | Lung cancer | Detection of cancer types or gene mutations | Classification | H&E and gene mutations | 512 × 512 pixels | 1634 WSIs | CNN | Supervised learning | No | [133]
Images with nucleic acid sequences | Tumor/normal or subtype and gene mutation | 2020 | 19 cancer types | Reveal conserved spatial behaviors across tumors | Classification | H&E | 512 × 512 pixels | 27 815 WSIs | CNN | Supervised learning | Yes | [134]
Multiple digital pathology images | Patient stratification | 2021 | Esophageal adenocarcinoma | Triage-driven diagnosis of Barrett's esophagus | Classification | H&E and IHC | 400 × 400 pixels | 4662 WSIs | CNN | Supervised learning | Yes | [152]
Multiple digital pathology images | Morphologic assessment | 2021 | Kidney cortex | Segment kidney histologic structures | Segmentation | H&E, PAS, Silver and Trichrome | 256 × 256 pixels | 459 WSIs | CNN | Supervised learning | Yes | [151]
Multiple modal information | Prognosis | 2017 | Lung adenocarcinoma | Predict the survival outcomes | Classification | H&E, pathology reports, RNA-sequencing and proteomics data | — | 831 WSIs | Random forest | Traditional machine learning | Yes | [157]
Multi-task | Simultaneous segmentation and classification | 2021 | Hepatocellular carcinoma | Segmentation of HCC lesions | Segmentation, classification | H&E | 256 × 256 pixels | 423 images | CNN | Supervised learning | Yes | [159]
Multi-task | Differential diagnosis | 2021 | 18 primary origins | Predict origins for cancers of unknown primary | Classification | H&E | 256 × 256 pixels | 32 537 WSIs | Graph, CNN | Weakly supervised learning | Yes | [163]
CNN, convolutional neural network; RNN, recurrent neural network; GAN, generative adversarial network
aThe reference numbers are consistent with the text.
Applications in unimodal digital pathology
Finding a predefined region of interest and making an accurate diagnosis under time constraints is a time-consuming and cumbersome process for pathologists. Besides, inter- and intra-observer variability may occur among experienced pathologists viewing the exact same slide. Many efforts have been made to overcome these challenges of traditional pathology. Technological advances have recently paved the way for digital pathology-based methods of quantitative assessment, namely WSI scanning and AI-based models, which assist in extracting information beyond pathologists' visual perception. Pathology images and the corresponding clinical information (such as cancer grade or cancer type) are mostly used for unimodal research [31]. The labels for modeling come purely from clinical records, without additional complex wet experiments. Unimodal research, with its simple and clear goals, is well suited to AI-based models in clinical applications and translation (Figure 2). However, it is necessary to compare AI-based models' predictions with pathologists' analyses to reveal potential interpretation discrepancies and assess the accuracy of the algorithms. The development and integration of digital pathology and AI-based models provide substantive advantages over traditional methods, enabling spatial analysis while generating highly precise, unbiased and consistent readouts that can be accessed remotely by pathologists.
Figure 2.
Description of a unimodal generalized framework for AI application on digital pathology images. After the raw data collected at the hospital are annotated by pathologists, computer scientists cut the images into patches and augment the data. The processed data are fed into a model pre-trained on ImageNet to achieve transfer learning from natural images to digital pathology images. The computer scientists train the model to fit the annotations and test the generalization of the model on an additional dataset that is pre-processed in the same way. Finally, the model's results are used for statistical analysis as well as visual interpretation to explore their biological significance [169, 204].
Information preparation for digital diagnosis and treatment
H&E images highlight the morphological features of the nucleus and cytoplasm mainly through the pH distinction of the stains. Pathologists need plenty of practice in their daily work to become proficient at slide analysis [32, 33]. When H&E images are visualized on a computer, problems arise such as inconsistent staining shades and poor photographic quality due to manual errors. Problems also arise in the algorithmic computation stage, where the image size is too large for commonly used algorithms. Many studies have been devoted to solving these difficulties associated with image data [34–37]. Selecting the parts of the image that contain morphological information is therefore an important pre-processing step (a minimal sketch follows below).
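A common first step is to discard background regions before any analysis. The sketch below is a minimal version using Otsu thresholding on the saturation channel; the input file name and the 50% tissue-fraction cutoff are illustrative assumptions.

```python
# Minimal tissue selection: keep only patches whose area is mostly tissue (illustrative).
import numpy as np
from PIL import Image
from skimage.color import rgb2hsv
from skimage.filters import threshold_otsu

def tissue_mask(rgb: np.ndarray) -> np.ndarray:
    """Background in H&E slides is near-white (low saturation); Otsu separates it from tissue."""
    saturation = rgb2hsv(rgb)[..., 1]
    return saturation > threshold_otsu(saturation)

def keep_patch(patch: np.ndarray, min_tissue: float = 0.5) -> bool:
    """Accept a patch only if at least min_tissue of its pixels are tissue."""
    return tissue_mask(patch).mean() >= min_tissue

thumbnail = np.array(Image.open("slide_thumbnail.png").convert("RGB"))  # hypothetical file
print("tissue fraction:", tissue_mask(thumbnail).mean())
```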
The use of automated tools to differentiate cells/tissues can reduce manual work [38–43]. In computer vision, the mainstream segmentation network consists of an encoder and a decoder, which take an image as input and output a label mask of the same size. When transferring such models to digital pathology, segmentation can generally be accomplished only for the tissues or glands of a single cancer, owing to the limited annotations [44–47]. As shown in Figure 2, pathology images are usually processed following the cut-off approach used for natural images, where square patches are fed into a neural network. However, cells are not arranged in a square grid; their morphology changes with cell type, organ and developmental stage, so some information is always lost when slicing patches. Cells that are in contact or overlapping make the automatic segmentation task even more difficult [48]. The identification of cell nuclei during analysis can often be centered on a single pixel, whereas extending the focus to tissue requires the idea of superpixels. Each superpixel combines adjacent pixels containing a single tissue type, fitting the edges between tissues as irregular polygons [49, 50] (see the sketch below). The interactive annotation system developed by Lee et al. [51] returns the less confident regions to doctors for secondary review, thus optimizing the model for segmenting tumor-infiltrating lymphocytes (TILs) and saving significant time in clinical applications.
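Superpixels can be computed directly with off-the-shelf tools; a minimal sketch with scikit-image's SLIC, where the input file, segment count and compactness are illustrative parameters:

```python
# Group adjacent pixels of similar appearance into superpixels (illustrative parameters).
import numpy as np
from PIL import Image
from skimage.segmentation import slic, mark_boundaries

patch = np.array(Image.open("he_patch.png").convert("RGB"))  # hypothetical H&E patch
# Each superpixel approximates a small homogeneous tissue region as an irregular polygon.
segments = slic(patch, n_segments=200, compactness=10, start_label=1)
overlay = mark_boundaries(patch, segments)  # visualize the superpixel edges
print("number of superpixels:", segments.max())
```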
When multiple special stains (such as periodic acid–Schiff or Masson's trichrome) are required, the diagnosis and treatment process becomes time-consuming and expensive. The stain transformation methods among computational staining techniques provide doctors with additional diagnostic information beyond H&E images without laboratory operations [52–56]. With regard to region-of-interest analysis, virtual staining can directly transform label-free tissue images into high-quality stained digital pathology images. Zhang et al. [57] demonstrated that a single neural network can generate H&E, Jones' silver and Masson's trichrome staining of micro-structures from a single autofluorescence image. This method avoids destroying the specimens, completes operations that are not feasible in the laboratory and achieves staining normalization.
In cancer grading, lymph node metastasis determines the seriousness of the disease and the next step of treatment [58, 59]. In practice, algorithms can follow the process of first finding the lymph nodes and then looking for metastatic sites [60]. When aiming to outperform human performance, additional attention is paid to the size difference between micro- and macro-metastases during detection, employing dense scans to avoid missing tiny metastases [61].
Translating complex sequencing processes into simple classification problems on H&E images, as opposed to merely improving the ability to understand the images, brings new opportunities for oncology patients [62]. For example, the consensus molecular subtypes (CMSs) of colorectal cancer are difficult to distinguish even after completing RNA analysis, and optimizing the results of molecular typing is important for guiding treatment. Cancer is known to be adept at evading immune destruction, but not all patients need immunotherapy or are in a position to complete genetic testing. In studies of immune checkpoint inhibitors (ICIs), tumor mutational burden (TMB) and microsatellite instability (MSI) have been shown to be important biomarkers. Researchers have assisted in the selection of treatment options by classifying H&E images into high- and low-value groups [63, 64]. Analysis of TMB values alongside clinicopathologic features demonstrated that TILs, a morphologic feature observable on pathology slides, are an independent predictor that correlates tumor morphology with genomic alterations. Most MSI-related studies involve gastrointestinal cancers. Access to research data has become more extensive and has contributed to the design of more complex deep learning models. Across the different models, not only are the performances of the models themselves measured, but also differences in data source, preparation and performance on different subclasses of disease are investigated [65, 66]. Benefiting from advances in deep learning, Ref. [67], building on the idea of Ref. [65], refined the first step of the tissue classification task to select mucin and colorectal adenocarcinoma epithelium tiles that are more relevant to MSI for prediction, thereby improving performance.
AI attempts to replace manual diagnosis
Efficient diagnosis of complex diseases can provide an important basis for patient recovery and clinical development, and many AI tools have been developed to replace manual operations, thus saving time and money [68]. Initially, the diagnostic task was framed as a binary classification between having cancer and being healthy, and high accuracy was achieved in algorithmic experiments with supervised learning [69]. In fact, cancer is a highly heterogeneous disease requiring more detailed subtype diagnosis. Deeper and more structurally complex models were created to meet this need [70, 71]. More questions followed, and people started to explore whether the models could actually be used in a clinical setting and whether valuable knowledge was actually learned in the process of model prediction.
As shown in Figure 2, pathologists first draw tumor/normal/other class regions on the WSIs, which are then sliced into patches that fit the input size of the image model. Researchers train the model to learn the mapping between patches and labels, and in this process the model achieves feature abstraction. In the testing phase, the same predictions are made for the patches of unknown images, and the patch-level results are then integrated. The integration can be implemented with statistical analysis, machine learning or neural networks [72–78] (a minimal sketch follows below). Based on the severity of the disease type, Yang et al. [73] introduced a two-stage, threshold-based, tumor-first aggregation method. The metrics on the public cancer genome atlas (TCGA) data even exceeded those on the internal cohort, demonstrating the portability of the model. Across the cancers covered, Gleason scoring exacerbates the challenges posed by intraclass similarity and interclass variability across multiple classes [79, 80]. Unlike tasks with one label per image, the diagnosis of prostate cancer requires a clear definition of the proportion of each pattern in a single image [81]. Bulten et al. [56] devised a semi-automated labeling method, first training a model on the simple task of pure Gleason scores and then training it on the complete mixed Gleason sample dataset.
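The simplest integration step averages patch-level probabilities into a slide-level score; the sketch below uses mean pooling with an illustrative 0.5 threshold, whereas published methods such as the threshold-based variant of Yang et al. apply more elaborate class-specific rules.

```python
# Aggregate patch predictions into a slide-level call (mean pooling; illustrative threshold).
import numpy as np

patch_probs = np.random.rand(500)          # tumor probability for each patch of one slide (stand-in)
slide_score = patch_probs.mean()           # simple average over all patches
slide_label = "tumor" if slide_score > 0.5 else "normal"
print(f"slide score {slide_score:.3f} -> {slide_label}")
```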
The problem of patch-level annotation also presents an opportunity for multi-instance learning (MIL) to enter the field of medical image analysis. To meet the training requirements of deep learning models, researchers need to prepare a large amount of data and ensure the accuracy of the annotations through review by multiple pathologists. Weakly supervised learning treats the whole image as a 'bag' and each patch as an 'instance', with all instances sharing the label of the bag. If there is a positive instance in the bag, the bag is positive; the bag is negative if and only if all its instances are negative, so in practice the algorithm usually chooses the top-ranked patches with the most discriminative power (the highest probability of being predicted positive) to infer the slide class [82–90]. In specific cases, the performance of such models even exceeds that of supervised learning with pixel-by-pixel annotation. Campanella et al. [91] trained CNN models on an oversized real-world dataset (the MSK dataset) of prostate, skin and breast cancers and used an RNN for slide-level determination. When the MIL algorithm trained on the MSK dataset was tested on the filtered CAMELYON16 dataset, the model's AUC decreased by 7.15%. However, when the supervised learning algorithm was applied to real-world data, the performance dropped by 20.2%. This suggests that traditional supervised learning methods do not adapt to the variability of the data.
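A minimal sketch of this top-instance selection, in its top-1 variant: score every patch, keep the most suspicious one, and backpropagate the bag label through it. The stand-in scorer, patch size and data are illustrative assumptions, not the published pipeline.

```python
# Multi-instance learning, top-1 variant: the bag label supervises its most positive patch.
import torch
import torch.nn as nn

scorer = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 2))  # stand-in patch scorer
optimizer = torch.optim.Adam(scorer.parameters(), lr=1e-4)

bag = torch.randn(100, 3, 64, 64)      # all patches ("instances") from one slide ("bag")
bag_label = torch.tensor([1])          # slide-level label only; no patch annotations

logits = scorer(bag)                                        # score every instance
top_idx = logits.softmax(dim=1)[:, 1].argmax().item()       # most discriminative ("most positive") patch
loss = nn.functional.cross_entropy(logits[top_idx:top_idx + 1], bag_label)
optimizer.zero_grad()
loss.backward()                        # only the top instance drives the update
optimizer.step()
```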
Researchers have tested various models on data from different sources [73, 74], aiming to solve the generalization problem of AI models on heterogeneous data from multiple centers. Deviations in the sample populations, differences in staining patterns and even microscope scanning specifications can have complex effects on pathology images. Typically, models are trained on data from a single hospital, and the embedded features captured by the model cannot be transferred directly to another hospital's datasets, resulting in low diagnostic accuracy that falls short of the universal level required for clinical application. Li et al. [92] adopted a multi-model approach, using an ensemble of 17 globally optimized transfer deep-learning platforms with multiple pretrained CNNs (GOTDP-MP-CNNs) to distinguish between diffuse large B-cell lymphoma (DLBCL) and non-DLBCL on data from three hospitals. The accuracy reached nearly 100%, surpassing individual CNNs and pathologists. In the cross-hospital tests between hospitals A and C, the accuracy increased from 82.09% to 90.50% simply by unifying the image size. This result shows that, once data preprocessing removes the differences arising from sample preparation, AI models can replace manual work even on small datasets.
Despite all the metrics that demonstrate the capabilities of AI, deep learning models have been referred to as black-box algorithms, as it is difficult to explain the working mechanism within the model convincingly enough to establish its reliability. Researchers have relied on medical knowledge to interpret models from different angles, including ablation studies [77], splitting the pre-processing process [74], feature clustering [75] and class activation maps [62] (sketched below). Many AI models have also been designed to increase interpretability by using the behavior of pathologists as a reference during the design process. In practice, pathologists usually zoom in to observe several important cancerous sites in the image and rely on image features to make a diagnosis [87]; this motivated two-stage diagnostic algorithms that first detect the locations in the full image that influence the outcome and then classify those locations. BenTaieb et al. [93] used a pyramid-shaped network to recognize image features at different magnifications.
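Class activation maps are one such interpretability view: for a network ending in global average pooling and a linear layer, the heatmap is the class-weight-weighted sum of the final feature maps. A minimal sketch, assuming a ResNet-style backbone with random weights for illustration:

```python
# Class activation map for a GAP + linear classifier (sketch; ResNet-style backbone assumed).
import torch
import torchvision.models as models

net = models.resnet18(weights=None)
feature_maps = {}
net.layer4.register_forward_hook(lambda m, i, o: feature_maps.update(out=o))

x = torch.randn(1, 3, 224, 224)                  # one input patch (stand-in data)
logits = net(x)
cls = logits.argmax(dim=1).item()
w = net.fc.weight[cls]                           # class weights over the 512 feature channels
# CAM = class-weighted sum of the last convolutional feature maps.
cam = torch.einsum("c,chw->hw", w, feature_maps["out"][0])
cam = torch.relu(cam)
cam = cam / (cam.max() + 1e-8)                   # normalized 7x7 heatmap; upsample to overlay on the patch
```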
AI provides new knowledge for post-treatment analysis
The analysis of patient prognosis, recurrence or survival can guide the choice of treatment options [94–101]. In particular, predicting post-treatment outcomes in the early stages of cancer is key to optimizing clinical interventions. Generic models that directly use post-treatment metrics as labels are similar to diagnostic tasks, and researchers can corroborate model results by analyzing the computed characteristics against other patient information [102, 103]. Studies that can propose new criteria for clinical discrimination therefore deserve particular attention.
Pathology images reflect the morphological features and tissue architecture of tumors and contain a wealth of post-treatment information [104]. Traditional machine learning approaches apply predefined image features to resolve tissue patterns [105, 106]. By contrast, deep learning can automatically identify cellular-level or structural-level patterns at multiple scales for fusion analysis [107]. For example, image features focus researchers' attention on subclasses of a disease or subregions of an image, highlighting the relevance of class differences for post-treatment analysis [108, 109]. Researchers have measured the validity of the identified features using molecular expression information or correlation analysis as a bridge. Based on the idea that different molecules are expressed in different regions of the tumor, Zadeh Shirazi et al. [110] first segmented the pathology images into eight regions, such as infiltrating tumor, and then correlated mutation status with survival rates. Yu et al. [111] first validated the ability of VGGNet, as a backbone network, to extract reasonable image features on simple cancer detection, grading and transcriptome subtype classification tasks. The network was then applied to predict the platinum-free interval (PFI), with comparative proteomic and transcriptomic correlation analyses connecting biological mechanisms with morphological representation. Skrede et al. [102] developed a prognostic marker for colorectal cancer that can complete prognostic classification using H&E images alone. Their method applies a multi-scale MIL algorithm to a large-scale population.
The work above shows that morphological information can provide complementary evidence of different forms for post-treatment analysis, pushing AI models to 'renovate' existing empirical knowledge. Doctors know that different classes of conditions have different tissue compositions, yet in practice this pattern distribution remains difficult to determine. Neural networks can translate the distribution patterns of objectively complex and rich multicellular environments into quantitative clinical discriminators with a single value [98, 112]. Wang et al. [113] proposed the T/MLN score for standardizing the N-staging criteria in gastric cancer. This score was calculated from the tumor classification area within the segmented lymph node region and was able to correct cases diagnosed by doctors. In more complicated studies of immune infiltration, manual scoring remains a strong prognostic biomarker [114–116]. With the aid of a computer, immune cells in the tumor area can be detected clearly, and evaluation indicators of prognostic value can be derived from the number and distribution of cells (such as stromal TIL density statistics) [96, 117, 118]. Such research is necessary for the clinical application of image features and for bringing digital pathology assistance systems into everyday practice.
Fusion approaches in multimodal digital pathology
In daily cancer diagnosis and treatment, even though evaluation of tissue sections by trained pathologists is the gold standard, it still requires the aid of biomarker tests. The advent of high-throughput sequencing technology, the collection of electronic medical records and the improvement of hospital laboratory facilities have given us access to more information. These multiple information resources have been contributing to a new, multimodal direction for research [119]. Multimodal studies refer to processing information from multiple modalities with different formation methods and internal structures, and learning the correlations between the separate modality datasets. This paper focuses on the integration of H&E images with nucleic acid data, additional images and clinical indicators. Nucleic acid data refer to the specific states detected during genome and transcriptome sequencing, which often represent the inherent factors in the development of disease [120–122]. Additional images refer to images generated when the same subject undergoes different imaging methods or stains. The paired images can assist with the high-dimensional feature abstraction that oversized H&E images demand during deep learning feature extraction [123]. Clinical indicators are quantitative indicators contained in the patient's electronic medical record, such as age and tumor stage. All three types of data can be used to improve the effectiveness of deep learning algorithms from different perspectives.
The human brain is a high-level organ with the ability to process complex information, whereas the computer only carries out computational procedures according to instructions. Two ways of addressing the modality problem are shown in Figure 3. Independent unimodal inputs and outputs can help us discover the correspondence between different modalities by running at high speed. Alternatively, multiple modalities are used as the input of AI models, and each modality must be integrated with the others. Research in both directions addresses an important difficulty: aligning the features between different modalities [124].
Figure 3.
Description of multimodal generalized approaches for AI application on digital pathology images. (A) A feature extractor takes H&E images and predicts gene expression, other stained images or clinical information, aligning the content of the different kinds of information. (B) H&E images and multiple other data types are jointly passed through feature extractors to predict the target; the feature dimensions of the different kinds of data need to be aligned during feature extraction [169].
Integration of H&E images with nucleic acid detection
Nucleic acid expression underlies the general patterns of organismal functioning. In unimodal studies, RNA-seq data are commonly used as validation metrics to discriminate whether the model has learned the correct image features [98, 125]. The prediction maps of deep learning extract human-interpretable features that correlate with molecular phenotypes and can demonstrate the effectiveness of deep learning [126].
Unimodal input–output studies that extend models on pathology images alone to the fusion of pathological and molecular features guide clinical systems toward fine-grained decisions [127, 128]. Valieris et al. [129] applied the model design of Campanella et al. [91] to DNA repair deficiency (DRD) detection. The morphological characteristics of pathology images reflect differential gene methylation [130]. Tavolara et al. [131] used deep learning to predict gene expression, and then XGBoost to classify gene expression as 'supersusceptible' or not.
Genetic mutations can cause a variety of cancers [132, 133]. Coudray et al. [134] used a single network to demonstrate that deep learning models can not only classify tumors into lung adenocarcinoma and lung squamous cell carcinoma, but also predict specific gene mutations. The same problem was extended to pan-cancer analysis by Noorbakhsh et al. [135] using the identical basic Inception v3 architecture. Pan-cancer studies focus on discovering similarities and differences between genomic and cellular alterations in different tumor types and finding universal patterns. To address the shortage of available data encountered in the above studies, Bian et al. [136] proposed a semi-supervised approach to accomplish data labeling and screening, and the gene mutation prediction task designed on this basis achieved good results.
Simultaneous multimodal input of image and molecular information can predict survival. The fusion of multimodal data needs to address the validity of the information of each modality. In the field of diffuse glioma survival research, Mobadersany et al. [137] first applied survival convolutional neural networks (SCNN) to regions of interest (ROIs); the model combined a CNN with traditional Cox analysis and demonstrated a median c-index of 0.754. After validation, the SCNN model was upgraded to a genomic survival convolutional neural network (GSCNN) by adding isocitrate dehydrogenase (IDH) mutation status and 1p/19q codeletion as inputs; the data of the two modalities pass jointly through a fully connected network and Cox analysis, and the index rose to 0.801, an improvement over the single-modality model (the core of this fusion is sketched below). New techniques have been introduced to optimize feature alignment across modalities. Chen et al. [138] used a co-attention transformer to achieve survival prediction. The co-attention visualization in their paper allows complementary matching of sequence information with spatial information, and the model's performance surpasses both the unimodal model and the late fusion algorithm.
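Reduced to its core, GSCNN-style fusion concatenates image-derived features with genomic covariates, maps them to a scalar risk, and trains with the negative Cox partial log-likelihood. The sketch below illustrates this pattern; the feature dimensions, network sizes and random data are illustrative assumptions, not the published architecture.

```python
# Fusion of image features and genomic covariates under a Cox partial-likelihood loss (sketch).
import torch
import torch.nn as nn

def cox_loss(risk: torch.Tensor, time: torch.Tensor, event: torch.Tensor) -> torch.Tensor:
    """Negative partial log-likelihood; the risk set of each event is all patients with time >= it."""
    order = torch.argsort(time, descending=True)           # sort so every risk set is a prefix
    risk, event = risk[order], event[order]
    log_cumsum = torch.logcumsumexp(risk, dim=0)           # log of summed hazards over each risk set
    return -((risk - log_cumsum) * event).sum() / event.sum().clamp(min=1)

fusion = nn.Sequential(nn.Linear(512 + 2, 64), nn.ReLU(), nn.Linear(64, 1))

img_feat = torch.randn(32, 512)        # CNN features from H&E patches (stand-in)
genomics = torch.randn(32, 2)          # e.g. IDH mutation status, 1p/19q codeletion (stand-in)
time = torch.rand(32) * 60             # follow-up time in months
event = torch.randint(0, 2, (32,)).float()  # 1 = event observed, 0 = censored

risk = fusion(torch.cat([img_feat, genomics], dim=1)).squeeze(1)
loss = cox_loss(risk, time, event)
loss.backward()
```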
Combination of multiple digital pathology images
Combining magnetic resonance imaging (MRI) with WSIs allows simultaneous analysis of patient prognosis from both macroscopic and microscopic perspectives. Zhang et al. [139] extracted features from the two scales separately to form a multiscale nomogram and verified that the features were associated with genetic variation. When the expression of biomarkers needs to be understood, other specific stains are needed to assist [44, 100, 115, 140–142]. Stain transformation techniques were mentioned in the 'Information preparation' section; this section discusses the role of different stains in accomplishing the task.
IHC combines the specificity of the immune response and the visibility of histochemistry with qualitative, localization and quantitative determination of the corresponding antigen. Manual analysis of the immune tissue environment requires cytokeratin and IHC staining to identify each tissue and cell. Especially in complex tumor models, accurately counting each type of cell in the stroma is insurmountably difficult for humans [96]. Martino et al. [143] used IHC-stained images as ground truth and extracted nuclear features using QuPath to predict the expression of the protein Ki67. This type of image is also commonly used to measure a model's new biomarkers [99, 117, 118] and how well the model's scoring results match objective facts [144–146].
To investigate the performance differences of different staining schemes on the same model structure and task [147–151], Jayapandian et al. [152] achieved segmentation of six kidney tissue structures on four stains: H&E, periodic acid–Schiff (PAS), trichrome (TRI) and silver (SIL). One practical application of the different staining methods is to achieve patient stratification and apply multiple images to the diagnostic workup, reducing doctors' workloads. Gehrung et al. [153] developed a triage aid system that screens only H&E images with columnar epithelium into the second step of TFF3 staining assessment, selecting equivocal patients for pathologist review.
H&E images incorporating multiple modal information
In clinical settings, data modalities are abundant. When patients are admitted to hospital, the first step is to complete the patient demographics in the electronic medical record system. As a relatively common and easily obtained data modality, patients' electronic clinical records can be integrated with pathology images to enhance the reliability of image features and improve model performance [154, 155]. Brinker et al. [156] integrated inputs from three perspectives, clinical features, cellular features and image features, which were passed through a neural network feature extractor to jointly predict sentinel lymph node status. Although the results were not good enough, they provide new ideas for feature extraction. In [97], Cox regression analysis of the multinucleation index (MuNI) together with clinical characteristics showed a significant improvement in hazard ratio (HR).
Vast amounts of omics data, which enable detailed classification, early diagnosis and prognosis of disease, can be combined with digital pathology images to extract useful patterns from these diverse modalities. Multi-omics data can interpret image features [145]. Features extracted by deep learning models from pathology images alone, combined with implicit information mining based on various types of omics data, can help clinical translation by creating new clinical prognostic or grouping metrics [157]. Alternatively, sequencing data for a few important multi-omics markers, selected based on pathology image information, can support the performance of metrics in practical clinical applications. In [98], Cox analysis was carried out together with UICC stage, gender and age, verifying that the deep stroma score is an independent prognostic factor of high value. The application value of the index is reflected in the comparison with gold standards, including manual annotation estimates and gene expression, thus confirming a universally available clinical application of H&E images. When copy-number alterations (CNA) are matched to transcriptome data with an unsupervised deep learning network (named CNx), the correlation between its middle-layer activation values and the features can be found.
How to simultaneously input multiple types of omics data into a single network and how to align the feature dimensions across data types are the primary challenges in multimodal studies. Manual filtering of features is the general approach. Yu et al. [158] combined histopathological, genomic, transcriptomic, proteomic and clinical information. First, quantitative features were extracted from the images. The images were then graded based on clinical information, and the relationships between the filtered genomic, proteomic and transcriptomic features and the images were explored. A strong correlation was confirmed between the enriched genes or proteins and tumor differentiation. When the multi-omics information was put into the Cox analysis collectively, it showed excellent performance that was not available in the separate analyses.
Multi-task learning with H&E images
Multi-task learning means that model features are embedded and shared across different tasks, uncovering more biological insights in different dimensions. WSIs are a domain of medical imaging that contains tissue and cell samples, and several researchers have applied deep learning to WSIs for various clinical tasks, including cell detection, segmentation and classification. Furthermore, while the similarity between different tasks may be difficult to define formally, some transfer learning methods between source and target tasks have been shown to help convergence and reduce overfitting.
When extracting image features, cell segmentation and classification can be performed at the same time. One approach is to share feature extraction and then use different branches for different tasks (a minimal sketch follows below). Graham et al. [159] completed the segmentation and classification of cell nuclei using Hover-Net, which is based on feature maps computed from horizontal and vertical distances. The segmentation task is accomplished by a parallel nucleus-versus-background branch and an individual nucleus delineation branch. Hover-Net demonstrates superb performance across diverse datasets. Wang et al. [160] focused on the pixel-level depiction of tissue edges and used two branches to segment tumor tissue and tumor versus normal tissue, respectively; another auxiliary branch was designed to complete the classification task. Based on the deep learning results, graph features are manually extracted to find new biological meanings. Wang et al. [161] completed single-cell tasks based on features extracted by Mask R-CNN, and the calculated quantitative features established a new subtyping criterion for hepatocellular carcinoma images. When cell nucleus type labels are available, the classification task can be implemented simultaneously as an information supplement. Another approach is to process the segmentation and classification tasks sequentially. The segmentation task requires pixel-level annotation, which is often time-consuming and error-prone, while classification requires only image-level annotation. Because the segmentation mask contains classification information to some extent, Ciga et al. [162] first trained the segmentation network using a limited quantity of pixel-labeled images, then attached a classification layer to the network and used the image-level labels to fine-tune the segmentation network.
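The shared-encoder, multi-branch pattern behind models such as Hover-Net can be sketched in a few lines; the layer sizes and the two heads below are illustrative assumptions, not the published architecture.

```python
# Shared encoder with parallel segmentation and classification heads (illustrative sizes).
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(                      # features shared by both tasks
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Conv2d(64, 2, 1)                # per-pixel nucleus/background mask
        self.cls_head = nn.Sequential(                     # image-level class prediction
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes)
        )

    def forward(self, x):
        h = self.encoder(x)
        return self.seg_head(h), self.cls_head(h)

net = MultiTaskNet()
seg_logits, cls_logits = net(torch.randn(2, 3, 256, 256))  # stand-in patches
# Joint objective: the two losses are summed so gradients from both tasks update the shared encoder.
seg_loss = nn.functional.cross_entropy(seg_logits, torch.randint(0, 2, (2, 256, 256)))
cls_loss = nn.functional.cross_entropy(cls_logits, torch.randint(0, 4, (2,)))
(seg_loss + cls_loss).backward()
```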
Clinical tasks with digital pathology are adaptive to different circumstances. Inferring various properties of biological samples through multi-task or transfer learning is a big challenge. From the biological viewpoint, the human body can be viewed as a cell-tissue-organ level integration, which can be matched to clinical tasks of various resolutions, including image-only and image-omics-fusion tasks. In modern clinical medicine, especially the diagnosis and treatment of complex diseases such as cancer, multiple medical examinations are required, and they produce different data modalities. Using the different examination indicators as outputs, the genotype can be visualized spatially on pathology images [163]. Schmauch et al. [144] implemented the HE2RNA model, based on the idea of a multilayer perceptron, to study the accuracy of RNA-Seq prediction across 28 cancers. They found that, with appropriate data selection, the model's gene expression predictions were consistent with the objective facts of IHC staining in areas such as immunity. They then transferred the transcriptomic representations learned by the model to MSI prediction, and with a 3/4 versus 1/4 data division the model's performance improved substantially over previous work, demonstrating the validity of the representations. The multiclassification generic framework of [82] was used for cancer of unknown primary to predict whether the cancer is metastatic and where it originated. Convolutional features concatenated with gender features performed well for each point of origin, each cancer, its subclasses and individuals. Top-k calculation results successfully narrowed down the IHC staining tests needed to assist doctors in diagnosis [164].
Opportunities and challenges in AI-based pathology
With the rise of big data analytics enabling the structuring and intelligent indexing of data, the current trend in dataset development is very favorable for research, as deep learning becomes more ubiquitous and practical in digital pathology [165–168]. Table 2 lists publicly available datasets of H&E images [169–192]. AI research motivates computers to simulate human learning habits, similar to adjusting the solution process based on the answers to exercises, which allows a model to capture the connection between data features and various labels and, after a series of adjustments and refinements, gain the ability to correctly discriminate known patterns. For pathology images, features and labels are presented in different ways. Features can refer to whole-image, tissue- and cellular-level image information extracted by the model, as well as the patient's clinical and genetic testing information. The labels, on the other hand, are set according to the target task: discrete numbers for classification tasks, and pathologists' annotations on the images for detection and segmentation tasks.
Table 2.
Digital pathology public datasets summary
Dataset name | Type | Task | Tissue | Staining method | Image count | Year | Refs.a | Link
---|---|---|---|---|---|---|---|---|
TCGA | Dataset | — | — | — | — | 2008 | [167] | https://gdc.cancer.gov/ |
ICPR 2012 | Challenge | Automated detection of mitotic cells | Breast | H&E | 5 | 2012 | [170] | http://ludo17.free.fr/mitos_2012/ |
MITOS & ATYPIA 2014 | Challenge | Detection of mitosis and evaluation of nuclear atypia score | Breast | H&E | 32 | 2014 | [171] | https://mitos-atypia-14.grand-challenge.org/ |
GlaS | Challenge | Segmentation of glands | Colon | H&E | 165 | 2015 | [172] | https://warwick.ac.uk/fac/cross_fac/tia/data/glascontest |
CAMELYON16 | Challenge | Detection of micro- and macro-metastases in lymph node digitized images | Breast | H&E | 400 | 2016 | [173] | https://camelyon16.grand-challenge.org/Home/ |
TUPAC | Challenge | Predicting tumor proliferation fraction | Breast | H&E | 500 and auxiliary dataset | 2016 | [174] | https://tupac.grand-challenge.org/ |
CAMELYON17 | Challenge | Detection and classification of breast cancer metastases in lymph nodes | Breast | H&E | 1399 | 2017 | [175] | https://camelyon17.grand-challenge.org/Home/ |
LSVM_CTXT | Dataset | Subtype classification of ovarian cancer | Ovarian | H&E | 133 | 2017 | [93] | http://www.sfu.ca/~abentaie/LSVM_CTXT/LSVM_CTXT.html |
MCSU | Dataset | Quantification of tumor hypoxia | Eight tumors | H&E and (immuno-)fluorescence | 178 | 2017 | [151] | https://cs.adelaide.edu.au/~carneiro/humboldt.html |
nucleisegmentation | Dataset | Segmentation of nuclei | Seven organs | H&E | 30 | 2017 | [41] | https://nucleisegmentationbenchmark.weebly.com/ |
BACH | Challenge | Classification of the four types of breast cancer | Breast | H&E | 400 | 2018 | [176] | https://iciar2018-challenge.grand-challenge.org/Home/ |
Pcam | Dataset | Differentiation of the presence for metastatic tissue | Lymph node | H&E | 327 680 | 2018 | [177] | https://github.com/basveeling/pcam |
ANHIR | Challenge | Registration of stained images by different biomarkers | Lung, kidney, breast etc. | H&E, Cc10, proSPC, Ki67 etc. | 355 | 2019 | [178] | https://anhir.grand-challenge.org/Intro/ |
BCSS | Dataset | Semantic segmentation of breast cancer | Breast | H&E | 151 | 2019 | [179] | https://github.com/PathologyDataScience/BCSS |
DigestPath2019 | Challenge | Detection of signet ring cells & segmentation and classification of colonoscopy tissue | Gastric mucosa, intestine and colon | H&E | 687 & 872 | 2019 | [180] | https://digestpath2019.grand-challenge.org/Home/ |
Gleason 2019 | Challenge | Automated Gleason grading | Prostate | H&E | 333 | 2019 | [81] | https://gleason2019.grand-challenge.org/Home/ |
MITOS_WSI_CCMCT | Dataset | Detection of mitosis | Skin (dog) | H&E (manually expert labeled dataset) | 32 | 2019 | [181] | https://github.com/DeepPathology/MITOS_WSI_CCMCT
NCT-CRC-HE-100K | Dataset | Automated tissue decomposition | Colon | H&E | 100 000 | 2019 | [98] | http://dx.doi.org/10.5281/zenodo.1214456
PAIP 2019 | Challenge | Detection of liver cancer | Liver | H&E | 100 | 2019 | [182] | https://paip2019.grand-challenge.org/Home/ |
HEROHE | Challenge | Classification of her2-positive and negative | Breast | H&E | 360 | 2020 | [183] | https://ecdp2020.grand-challenge.org/Home/ |
MITOS_WSI_CMC | Dataset | Detection of mitosis | Breast (dog) | H&E (manually expert labeled dataset) | 21 | 2020 | [184] | https://github.com/DeepPathology/MITOS_WSI_CMC/
MoNuSAC | Challenge | Segmentation of nuclei of multiple cell-types | Lung, prostate, kidney, and breast | H&E | 71 | 2020 | [185] | https://monusac-2020.grand-challenge.org/Home/ |
PANDA | Challenge | Automated Gleason grading | Prostate | H&E | 11 000 | 2020 | [186] | https://www.kaggle.com/c/prostate-cancer-grade-assessment/overview |
BCNB | Dataset | Prediction of the metastatic status of ALN, histological grading and molecular subtype etc. | Breast | H&E and clinical characteristics | 1058 | 2021 | [187] | https://bcnb.grand-challenge.org/Home/ |
MIDOG 2021 | Challenge | Identification of mitotic figure | Breast | H&E | 150 | 2021 | [188] | https://imig.science/midog/ |
NuCLS | Dataset | Classification, localization and segmentation of the cell nucleus | Breast | H&E | 125 | 2021 | [189] | https://nucls.grand-challenge.org/NuCLS/ |
PAIP 2021 | Challenge | Detection of perineural invasion in multiple organ cancer | Colon, prostate and pancreas | H&E | 240 | 2021 | [190] | https://paip2021.grand-challenge.org/Home/ |
WSSS4LUAD | Challenge | Prediction of three tissue types | Lung | H&E | 87 | 2021 | [191] | https://wsss4luad.grand-challenge.org/WSSS4LUAD/ |
CoNIC | Challenge | Segmentation and classification of nuclear and prediction of cellular composition | Colon | H&E | 4981 | 2022 | [192] | https://conic-challenge.grand-challenge.org/Home/ |
aThe reference numbers are consistent with the text.
More effective treatments in modern medicine require accurate tests tailored to individual patients rather than one-size-fits-all tests. In conventional treatment, pathological sections are simply viewed under a microscope to reach initial conclusions. Results depend heavily on material quality and doctors' skills, both of which tend to be lacking in areas with poor medical resources.
As discussed in the applications of unimodal digital pathology, computers are attempting to replace manual operation. AI has different research priorities in different phases of medical work: the information preparation stage can simplify data processing steps, the diagnosis stage can take over doctors' differential diagnosis tasks, and the post-treatment stage proposes new criteria to refine treatment goals. AI lets doctors feed digitized sections directly into the computer. The computer then automatically optimizes image quality and allows algorithms to assist less experienced doctors in making a qualitative diagnosis. More accurate results help patients gain early access to medical treatment, preventing further deterioration in undiagnosed cases. Unimodal digital pathology makes traditional medical work faster and more accurate.
Unimodal pathology image processing and information extraction form the basis for multimodal applications. Multimodal pathology no longer requires the computer to simulate the behavior of physicians; rather, it relies on complementary information for precision medicine. Genetic sequencing and specially stained images are necessary to confirm the diagnosis of diseases such as cancer, yet beyond resource-poor regions, the full set of tests is inaccessible to patients with limited physical and financial means. We discuss two modality fusion approaches in the context of multimodal digital pathology applications. One is to input pathology images and output another modality (sequences, or images in other formats). This approach no longer simply summarizes the information in pathology images but transforms it, meaning that patients without access to certain tests can still receive equivalent indicators. In the other, multiple modalities are fed into the computer model at the same time to achieve a correspondence between different data; for example, the mutation status of each lesion can be read from the images. This allows precision medicine to be precise not just at the level of the individual but at the level of the lesion. Both approaches try to uncover associations between the changes the disease brings about in the body in different dimensions. In multimodal or multi-task research, AI addresses both the mismatch between multiple information sources and the inadequate mining of single information sources.
It is evident that advances in the computer vision field contributed to the new diagnostic methods that emerged from the fusion of pathology images and AI [193]. Researchers have been extending the applicability of models from several perspectives. Most approaches transfer models pre-trained on natural images to their specific research problems (a minimal sketch of this transfer step follows below), but there are many differences between the two kinds of images. The objects of interest in natural images are usually large subjects such as people, animals and cars, while pathology images are concerned with tiny objects, such as cell nuclei and tissues. The generality of the abstract representations of both image types in the hidden space remains problematic. Furthermore, it is not clear how many cells a single image must contain to show biological significance; directly slicing pathology images to the commonly used 224 × 224 pixel size may cause information loss. Many studies have used class activation mapping to visualize model results and show that valuable information was learned; however, some heatmaps are difficult for pathologists to interpret. It is therefore more reliable to pre-train on public datasets such as TCGA, or to cut patches along structures such as cellular connectivity maps and superpixels.
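In practice, the transfer step usually amounts to loading ImageNet weights and replacing the final classification layer; a minimal sketch with torchvision, where the two-class head and the frozen backbone are illustrative choices:

```python
# Transfer learning: reuse ImageNet features, retrain only the classification head (sketch).
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False                     # freeze the natural-image features
model.fc = nn.Linear(model.fc.in_features, 2)  # new head, e.g. for tumor vs. normal patches
# Only model.fc is trained; whether frozen natural-image features transfer well to
# pathology images is precisely the open question discussed above.
```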
The sheer volume of data from multiple sources also raises the issue of standards for data processing. Digital pathology slide databases have attracted wide participation from researchers in terms of data quantity, but not in terms of quality (class imbalance, artificial contamination, staining differences, etc.). To improve model generalization, validation experiments on data from multiple sources are necessary. Meanwhile, batch effects in the data [194] place high demands on preprocessing [195]. To address data noise, standardized preprocessing systems [196, 197] or uniform collection standards should be established. In addition, attention must be paid to overfitting and to developing highly accurate AI algorithms on small datasets. Another source of noise is inconsistency in the annotation process: secondary review among pathologists can reduce errors, but different countries and hospitals follow different classification standards. Multimodal research faces a further challenge: image data are ultra-high-dimensional and sparse in AI models, whereas other clinical information can easily be overwhelmed by them, making it difficult to exploit all of the useful information [198].
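As one example of the standardized preprocessing called for above, the sketch below performs a simple Reinhard-style color normalization, matching each patch's LAB-channel statistics to those of a reference patch. Production pipelines [196, 197] are typically more sophisticated (e.g. stain deconvolution); this only illustrates the idea of suppressing inter-laboratory staining differences.

```python
# Reinhard-style color normalization sketch (illustrative, not a full pipeline).
import numpy as np
from skimage import color

def reinhard_normalize(img_rgb, ref_rgb):
    """Shift/scale each LAB channel of img to match the reference statistics."""
    img_lab = color.rgb2lab(img_rgb)
    ref_lab = color.rgb2lab(ref_rgb)
    for c in range(3):
        mu_i, sd_i = img_lab[..., c].mean(), img_lab[..., c].std() + 1e-8
        mu_r, sd_r = ref_lab[..., c].mean(), ref_lab[..., c].std()
        img_lab[..., c] = (img_lab[..., c] - mu_i) / sd_i * sd_r + mu_r
    return np.clip(color.lab2rgb(img_lab), 0.0, 1.0)

# Toy usage with synthetic float-RGB patches in [0, 1].
rng = np.random.default_rng(0)
patch = rng.random((224, 224, 3))
reference = rng.random((224, 224, 3))
normalized = reinhard_normalize(patch, reference)
print(normalized.shape)  # (224, 224, 3)
```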
A few issues may also arise during algorithm deployment. Attending to the robustness of the model during testing, and to false-negative predictions in particular, will help convince doctors of the computer's effectiveness and promote its clinical application. In addition, not all hospitals have large GPU servers to compute the results [199], so scaling down the hardware requirements of the model is a necessary task.
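One hedged sketch of such a robustness check: rather than reporting a single pooled accuracy, the false-negative rate is computed separately for each external cohort, since a model that looks accurate overall can still miss cases systematically at one site. The cohort names and labels below are synthetic.

```python
# Per-cohort false-negative-rate report (synthetic cohorts and labels).
import numpy as np
from sklearn.metrics import confusion_matrix

def false_negative_rate(y_true, y_pred):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return fn / (fn + tp) if (fn + tp) else float("nan")

cohorts = {  # hypothetical external validation sites
    "hospital_A": (np.array([1, 1, 0, 1, 0]), np.array([1, 0, 0, 1, 0])),
    "hospital_B": (np.array([1, 0, 1, 1, 1]), np.array([1, 0, 1, 0, 1])),
}
for site, (y_true, y_pred) in cohorts.items():
    print(f"{site}: FNR = {false_negative_rate(y_true, y_pred):.2f}")
```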
In recent years, the research hotspot of digital pathology has shifted from unimodal binary classification to multimodal multitasking, which has brought many benefits to doctors' daily work. Automatic identification at the cellular and tissue levels not only saves the time spent on molecular-marker labeling experiments, but also reduces the probability that physicians misidentify small regions. Applied directly to diagnosis, models semi- or fully automate the process of deriving diagnostic results from images, relieving doctors of repetitive work. Diagnosis-related tasks shorten the time needed to generate pathology reports, and prognosis-related tasks create more opportunities to treat patients correctly and in good time. The introduction of multimodal data taps into the association between external phenotypes and internal expression, broadening our understanding of life. Many studies have demonstrated the improvements that AI can bring by comparing the accuracy of doctors and computers, or the efficiency of doctors' diagnoses with and without computer assistance [79, 200–203]. Building a universal model for all problems is not realistic; what AI can do is reduce the burden on physicians and prevent patients from going untreated for lack of resources.
Conclusion
Digital pathology combined with artificial intelligence is one of the most promising routes to delivering precision medicine. The research problem is gradually shifting from understanding unimodal data to coordinating multimodal data. Criteria for integrating and evaluating digital pathology alongside electronic medical records, CT and other clinical data urgently need to be established for different diseases, as they are essential to the integrated use of medical information. Effective supporting systems can be built around each step of the collaboration between doctors and researchers (scientific problem identification, clinical application criteria setting). Many people fear that this will erode doctors' skills, but when computers take over basic repetitive tasks, doctors gain more time to focus on difficult problems. Hospitals currently lack the supporting infrastructure for automated pathology diagnosis and treatment systems; nevertheless, research on digital pathology must continue in order to advance precision medicine. AI-based digital pathology delivers detection, quantification, classification (such as tumor subtyping), prognosis (combining clinical and genomic information) and prediction. With multi-modality fusion analysis, artificial intelligence approaches will improve quality and efficiency, transform pathology data into clinically actionable knowledge, and make all of this information accessible to precision medicine.
Key Points
The availability of digitized whole-slide images across various diseases has driven the adoption of artificial intelligence (AI) and machine learning tools in digital pathology.
In this review, we critically summarized a broad range of AI-based computational approaches for digital pathology.
The review showed the role that the development of deep learning technology plays in medical assistance and explored the opportunities and challenges of AI.
Acknowledgments
Thanks to Yinghua Li and Shuanlong Che of Guangzhou Kingmed Diagnostics Group Co., Ltd. and Jialiang Yang of Geneis (Beijing) Co., Ltd. for their support.
Author Biographies
Yixuan Qiao is a PhD candidate of the Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences and the University of Chinese Academy of Sciences. Her research interests include multi-modal deep learning.
Lianhe Zhao is a post-doc at the Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences. Her research interests include bioinformatics and deep learning.
Chunlong Luo is a PhD candidate of the Institute of Computing Technology, Chinese Academy of Sciences and the University of Chinese Academy of Sciences. His research interests include medical image analysis.
Yufan Luo is a PhD candidate of the Institute of Computing Technology, Chinese Academy of Sciences and the University of Chinese Academy of Sciences. His research interests include deep learning.
Yang Wu is an assistant investigator at the Institute of Computing Technology, Chinese Academy of Sciences. Her research interests include bioinformatics.
Shengtong Li is an undergraduate student at Massachusetts Institute of Technology. Her research interests include bioinformatics and artificial intelligence.
Dechao Bu is an associate researcher at the Institute of Computing Technology, Chinese Academy of Sciences. His research interests include bioinformatics and cancer immunology.
Yi Zhao is a professor at the Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences and Shandong First Medical University & Shandong Academy of Medical Sciences. His research interests include bioinformatics and deep learning.
Contributor Information
Yixuan Qiao, Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China.
Lianhe Zhao, Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China.
Chunlong Luo, Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China.
Yufan Luo, Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China.
Yang Wu, Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China.
Shengtong Li, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
Dechao Bu, Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China.
Yi Zhao, Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan 250117, Shandong, China.
Authors’ contributions
Yixuan Qiao: Data Curation, Methodology, Writing - Original Draft, Writing - Review & Editing; Lianhe Zhao: Methodology, Writing - Original Draft, Writing - Review & Editing; Chunlong Luo: Investigation; Yufan Luo: Investigation; Yang Wu: Investigation; Shengtong Li: Writing - Review & Editing; Dechao Bu: Writing; Yi Zhao: Conceptualization, Methodology, Supervision, Project administration.
Funding
The National Key R&D Program of China (2021YFC2500203); The Strategic Priority Research Program of the Chinese Academy of Sciences (XDA16021400); Zhejiang Provincial Natural Science Foundation of China (LY20C060001); Innovation Fund of the Institute of Computing Technology, CAS (E161080); Zhejiang Provincial Research Center for Cancer Intelligent Diagnosis and Molecular Technology (JBZX-202003); Talent funding of Shandong First Medical University & Shandong Academy of Medical Sciences (922-001003130 185RC).
Competing interests
All the authors declare no competing financial or non-financial interests.
Data availability
The high-definition charts have been made publicly available here: http://bioinfo.org/mdp/. The data used to generate the figures and tables are publicly available to researchers through the National Library of Medicine. Additional inquiries may be directed to the corresponding author.
References
- 1. Konig IR, Fuchs O, Hansen G, et al. What is precision medicine? Eur Respir J 2017;50(4):1700391.
- 2. Echle A, Rindtorff NT, Brinker TJ, et al. Deep learning in cancer pathology: a new generation of clinical biomarkers. Br J Cancer 2021;124(4):686–96.
- 3. Cardiff RD, Miller CH, Munn RJ. Manual hematoxylin and eosin staining of mouse tissue sections. Cold Spring Harb Protoc 2014;2014(6):655–8.
- 4. Fischer AH, Jacobson KA, Rose J, et al. Hematoxylin and eosin staining of tissue and cell sections. CSH Protoc 2008;2008:pdb.prot4986.
- 5. Pantanowitz L, Sharma A, Carter AB, et al. Twenty years of digital pathology: an overview of the road travelled, what is on the horizon, and the emergence of vendor-neutral archives. J Pathol Inform 2018;9:40.
- 6. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–44.
- 7. Chang J, Lan Z, Cheng C, et al. Data uncertainty learning in face recognition. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5709–18. IEEE, 2020.
- 8. Yurtsever E, Lambert J, Carballo A, et al. A survey of autonomous driving: common practices and emerging technologies. IEEE Access 2020;8:58443–69.
- 9. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal 2017;42:60–88.
- 10. Jiang Y, Xie J, Huang W, et al. Tumor immune microenvironment and chemosensitivity signature for predicting response to chemotherapy in gastric cancer. Cancer Immunol Res 2019;7(12):2065–73.
- 11. Ongsulee P. Artificial intelligence, machine learning and deep learning. In: 2017 15th International Conference on ICT and Knowledge Engineering (ICT&KE), pp. 1–6. IEEE, 2017.
- 12. Bishop CM, Nasrabadi NM. Pattern Recognition and Machine Learning, Vol. 4, No. 4, p. 738. New York: Springer, 2006.
- 13. Friedman N, Geiger D, Goldszmidt M. Bayesian network classifiers. Mach Learn 1997;29(2):131–63.
- 14. Quinlan JR. Induction of decision trees. Mach Learn 1986;1(1):81–106.
- 15. Cortes C, Vapnik V. Support-vector networks. Mach Learn 1995;20(3):273–97.
- 16. Eraslan G, Avsec Ž, Gagneur J, et al. Deep learning: new computational modelling techniques for genomics. Nat Rev Genet 2019;20(7):389–403.
- 17. Dias R, Torkamani A. Artificial intelligence in clinical and genomic diagnostics. Genome Med 2019;11(1):70.
- 18. Angenent-Mari NM, Garruss AS, Soenksen LR, et al. A deep learning approach to programmable RNA switches. Nat Commun 2020;11(1):5057.
- 19. LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition. Proc IEEE 1998;86(11):2278–324.
- 20. He KM, Zhang X, Ren S, et al. Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–8. IEEE, 2016.
- 21. Hosny A, Parmar C, Quackenbush J, et al. Artificial intelligence in radiology. Nat Rev Cancer 2018;18(8):500–10.
- 22. Lundervold AS, Lundervold A. An overview of deep learning in medical imaging focusing on MRI. Z Med Phys 2019;29(2):102–27.
- 23. Liu Q, Fang L, Yu G, et al. Detection of DNA base modifications by deep recurrent neural network on Oxford Nanopore sequencing data. Nat Commun 2019;10(1):2449.
- 24. Singh J, Hanson J, Paliwal K, et al. RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning. Nat Commun 2019;10(1):5407.
- 25. Zhang Z, Chen P, McGough M, et al. Pathologist-level interpretable whole-slide cancer diagnosis with deep learning. Nat Mach Intell 2019;1(5):236–45.
- 26. Amgad M, Atteya LA, Hussein H, et al. Explainable nucleus classification using decision tree approximation of learned embeddings. Bioinformatics 2022;38:513–9.
- 27. Holmstrom O, Linder N, Kaingu H, et al. Point-of-care digital cytology with artificial intelligence for cervical cancer screening in a resource-limited setting. JAMA Netw Open 2021;4(3):e211740.
- 28. Schneider L, Laiouar-Pedari S, Kuntz S, et al. Integration of deep learning-based image analysis and genomic data in cancer pathology: a systematic review. Eur J Cancer 2022;160:80–91.
- 29. Kuntz S, Krieghoff-Henning E, Kather JN, et al. Gastrointestinal cancer classification and prognostication from histology using deep learning: systematic review. Eur J Cancer 2021;155:200–15.
- 30. Calderaro J, Kather JN. Artificial intelligence-based pathology for gastrointestinal and hepatobiliary cancers. Gut 2021;70(6):1183–93.
- 31. Couture HD, Williams LA, Geradts J, et al. Image analysis with deep learning to predict breast cancer grade, ER status, histologic subtype, and intrinsic subtype. NPJ Breast Cancer 2018;4:30.
- 32. Yang P, Zhai Y, Li L, et al. A deep metric learning approach for histopathological image retrieval. Methods 2020;179:14–25.
- 33. Shi X, Sapkota M, Xing F, et al. Pairwise based deep ranking hashing for histopathology image classification and retrieval. Pattern Recogn 2018;81:14–22.
- 34. Vahadane A, Peng T, Sethi A, et al. Structure-preserving color normalization and sparse stain separation for histological images. IEEE Trans Med Imaging 2016;35(8):1962–71.
- 35. Vasiljević J, Feuerhake F, Wemmert C, et al. Towards histopathological stain invariance by unsupervised domain augmentation using generative adversarial networks. Neurocomputing 2021;460:277–91.
- 36. Tellez D, Litjens G, van der Laak J, et al. Neural image compression for gigapixel histopathology image analysis. IEEE Trans Pattern Anal Mach Intell 2019;43(2):567–78.
- 37. Yang P, Hong Z, Yin X, et al. Self-supervised visual representation learning for histopathological images. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 47–57. Springer, Cham, 2021.
- 38. Sirinukunwattana K, Raza SEA, Tsang YW, et al. Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images. IEEE Trans Med Imaging 2016;35(5):1196–206.
- 39. Qaiser T, Tsang YW, Taniyama D, et al. Fast and accurate tumor segmentation of histology images using persistent homology and deep convolutional features. Med Image Anal 2019;55:1–14.
- 40. Xie Y, Xing F, Shi X, et al. Efficient and robust cell detection: a structured regression approach. Med Image Anal 2018;44:245–54.
- 41. Kumar N, Verma R, Sharma S, et al. A dataset and a technique for generalized nuclear segmentation for computational pathology. IEEE Trans Med Imaging 2017;36(7):1550–60.
- 42. Marsh JN, Matlock MK, Kudose S, et al. Deep learning global glomerulosclerosis in transplant kidney frozen sections. IEEE Trans Med Imaging 2018;37(12):2718–28.
- 43. D'Alfonso TM, Ho DJ, Hanna MG, et al. Multi-magnification-based machine learning as an ancillary tool for the pathologic assessment of shaved margins for breast carcinoma lumpectomy specimens. Mod Pathol 2021;34(8):1487–94.
- 44. Bouteldja N, Klinkhammer BM, Bülow RD, et al. Deep learning-based segmentation and quantification in experimental kidney histopathology. J Am Soc Nephrol 2021;32(1):52–68.
- 45. Lu Z, Zhan X, Wu Y, et al. BrcaSeg: a deep learning approach for tissue quantification and genomic correlations of histopathological images. Genom Proteom Bioinformatics 2021;19(6):1032–42.
- 46. Graham S, Chen H, Gamper J, et al. MILD-Net: minimal information loss dilated network for gland instance segmentation in colon histology images. Med Image Anal 2019;52:199–211.
- 47. Jang HJ, Song IH, Lee SH. Deep learning for automatic subclassification of gastric carcinoma using whole-slide histopathology images. Cancers 2021;13(15):3811. 10.3390/cancers13153811.
- 48. Naylor P, Lae M, Reyal F, et al. Segmentation of nuclei in histopathology images by deep regression of the distance map. IEEE Trans Med Imaging 2019;38(2):448–59.
- 49. Zormpas-Petridis K, Noguera R, Ivankovic DK, et al. SuperHistopath: a deep learning pipeline for mapping tumor heterogeneity on low-resolution whole-slide digital histopathology images. Front Oncol 2021;10:586292.
- 50. Zhang J, Hua Z, Yan K, et al. Joint fully convolutional and graph convolutional networks for weakly-supervised segmentation of pathology images. Med Image Anal 2021;73:102183.
- 51. Lee S, Amgad M, Mobadersany P, et al. Interactive classification of whole-slide imaging data for cancer researchers. Cancer Res 2021;81(4):1171–7.
- 52. Rivenson Y, Liu T, Wei Z, et al. PhaseStain: the digital staining of label-free quantitative phase microscopy images using deep learning. Light Sci Appl 2019;8:23.
- 53. de Haan K, Zhang Y, Zuckerman JE, et al. Deep learning-based transformation of H&E stained tissues into special stains. Nat Commun 2021;12(1):4884.
- 54. Fujitani M, Mochizuki Y, Iizuka S, et al. Re-staining pathology images by FCNN. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6. IEEE, 2019.
- 55. Levy JJ, Azizgolshani N, Andersen MJ Jr, et al. A large-scale internal validation study of unsupervised virtual trichrome staining technologies on nonalcoholic steatohepatitis liver biopsies. Mod Pathol 2021;34(4):808–22.
- 56. Gadermayr M, Gupta L, Appel V, et al. Generative adversarial networks for facilitating stain-independent supervised and unsupervised segmentation: a study on kidney histology. IEEE Trans Med Imaging 2019;38(10):2293–302.
- 57. Zhang Y, de Haan K, Rivenson Y, et al. Digital synthesis of histological stains using micro-structured and multiplexed virtual staining of label-free tissue. Light Sci Appl 2020;9:78.
- 58. Kwak MS, Lee HH, Yang JM, et al. Deep convolutional neural network-based lymph node metastasis prediction for colon cancer using histopathological images. Front Oncol 2021;10:619803.
- 59. Klimov S, Xue Y, Gertych A, et al. Predicting metastasis risk in pancreatic neuroendocrine tumors using deep learning image analysis. Front Oncol 2021;10:593211.
- 60. Hu Y, Su F, Dong K, et al. Deep learning system for lymph node quantification and metastatic cancer identification from whole-slide pathology images. Gastric Cancer 2021;24(4):868–77.
- 61. Lin H, Chen H, Graham S, et al. Fast ScanNet: fast and dense analysis of multi-gigapixel whole-slide images for cancer metastasis detection. IEEE Trans Med Imaging 2019;38(8):1948–58.
- 62. Woerl AC, Eckstein M, Geiger J, et al. Deep learning predicts molecular subtype of muscle-invasive bladder cancer from conventional histopathological slides. Eur Urol 2020;78(2):256–64.
- 63. Klein S, Quaas A, Quantius J, et al. Deep learning predicts HPV association in oropharyngeal squamous cell carcinomas and identifies patients with a favorable prognosis using regular H&E stains. Clin Cancer Res 2021;27(4):1131–8.
- 64. Cao R, Yang F, Ma SC, et al. Development and interpretation of a pathomics-based model for the prediction of microsatellite instability in colorectal cancer. Theranostics 2020;10(24):11080–91.
- 65. Kather JN, Pearson AT, Halama N, et al. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat Med 2019;25(7):1054–6.
- 66. Echle A, Grabsch HI, Quirke P, et al. Clinical-grade detection of microsatellite instability in colorectal tumors by deep learning. Gastroenterology 2020;159(4):1406–1416.e11.
- 67. Yamashita R, Long J, Longacre T, et al. Deep learning model for the prediction of microsatellite instability in colorectal cancer: a diagnostic study. Lancet Oncol 2021;22(1):132–41.
- 68. Wang Y, Coudray N, Zhao Y, et al. HEAL: an automated deep learning framework for cancer histopathology image analysis. Bioinformatics 2021;37(22):4291–95. 10.1093/bioinformatics/btab380.
- 69. Fu H, Mi W, Pan B, et al. Automatic pancreatic ductal adenocarcinoma detection in whole slide images using deep convolutional neural networks. Front Oncol 2021;11:665929.
- 70. Yao H, Zhang X, Zhou X, et al. Parallel structure deep neural network using CNN and RNN with an attention mechanism for breast cancer histology image classification. Cancers 2019;11(12):1901. 10.3390/cancers11121901.
- 71. Syrykh C, Abreu A, Amara N, et al. Accurate diagnosis of lymphoma on whole-slide histopathology images using deep learning. NPJ Digit Med 2020;3:63.
- 72. Park J, Jang BG, Kim YW, et al. A prospective validation and observer performance study of a deep learning algorithm for pathologic diagnosis of gastric tumors in endoscopic biopsies. Clin Cancer Res 2021;27(3):719–28.
- 73. Yang H, Chen L, Cheng Z, et al. Deep learning-based six-type classifier for lung cancer and mimics from histopathological whole slide images: a retrospective study. BMC Med 2021;19(1):80.
- 74. Bulten W, Pinckaers H, van Boven H, et al. Automated deep-learning system for Gleason grading of prostate cancer using biopsies: a diagnostic study. Lancet Oncol 2020;21(2):233–41.
- 75. Ström P, Kartasalo K, Olsson H, et al. Artificial intelligence for diagnosis and grading of prostate cancer in biopsies: a population-based, diagnostic study. Lancet Oncol 2020;21(2):222–32.
- 76. Chuang WY, Chang SH, Yu WH, et al. Successful identification of nasopharyngeal carcinoma in nasopharyngeal biopsies using deep learning. Cancers 2020;12(2):507. 10.3390/cancers12020507.
- 77. Sheikh TS, Lee Y, Cho M. Histopathological classification of breast cancer images using a multi-scale input and multi-feature network. Cancers 2020;12(8):2031. 10.3390/cancers12082031.
- 78. Steinbuss G, Kriegsmann M, Zgorzelski C, et al. Deep learning for the classification of non-Hodgkin lymphoma on histopathological images. Cancers 2021;13(10):2419. 10.3390/cancers13102419.
- 79. Bulten W, Balkenhol M, Belinga JJA, et al. Artificial intelligence assistance significantly improves Gleason grading of prostate biopsies by pathologists. Mod Pathol 2021;34(3):660–71.
- 80. Perincheri S, Levi AW, Celli R, et al. An independent assessment of an artificial intelligence system for prostate cancer detection shows strong diagnostic accuracy. Mod Pathol 2021;34(8):1588–95.
- 81. Nir G, Hor S, Karimi D, et al. Automatic grading of prostate cancer in digitized histopathology images: learning from multiple experts. Med Image Anal 2018;50:167–80.
- 82. Lu MY, Williamson DFK, Chen TY, et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat Biomed Eng 2021;5(6):555–70.
- 83. Wang S, Zhu Y, Yu L, et al. RMDL: recalibrated multi-instance deep learning for whole slide gastric image classification. Med Image Anal 2019;58:101549.
- 84. Dov D, Kovalsky SZ, Assaad S, et al. Weakly supervised instance learning for thyroid malignancy prediction from whole slide cytopathology images. Med Image Anal 2021;67:101814.
- 85. Wang KS, Yu G, Xu C, et al. Accurate diagnosis of colorectal cancer based on histopathology images using artificial intelligence. BMC Med 2021;19(1):76.
- 86. Yoshida H, Shimazu T, Kiyuna T, et al. Automated histological classification of whole-slide images of gastric biopsy specimens. Gastric Cancer 2018;21(2):249–57.
- 87. Mercan C, Aksoy S, Mercan E, et al. Multi-instance multi-label learning for multi-class classification of whole slide breast histopathology images. IEEE Trans Med Imaging 2018;37(1):316–25.
- 88. Wang X, Chen H, Gan C, et al. Weakly supervised deep learning for whole slide lung cancer image analysis. IEEE Trans Cybern 2020;50(9):3950–62.
- 89. Pirovano A, Almeida LG, Ladjal S, et al. Computer-aided diagnosis tool for cervical cancer screening with weakly supervised localization and detection of abnormalities using adaptable and explainable classifier. Med Image Anal 2021;73:102167.
- 90. Pinckaers H, Bulten W, van der Laak J, et al. Detection of prostate cancer in whole-slide images through end-to-end training with image-level labels. IEEE Trans Med Imaging 2021;40(7):1817–26.
- 91. Campanella G, Hanna MG, Geneslaw L, et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat Med 2019;25(8):1301–9.
- 92. Li D, Bledsoe JR, Zeng Y, et al. A deep learning diagnostic platform for diffuse large B-cell lymphoma with high accuracy across multiple hospitals. Nat Commun 2020;11(1):6004.
- 93. BenTaieb A, Li-Chang H, Huntsman D, et al. A structured latent model for ovarian carcinoma subtyping from histopathology slides. Med Image Anal 2017;39:194–205.
- 94. Yu KH, Zhang C, Berry GJ, et al. Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features. Nat Commun 2016;7:12474.
- 95. Kulkarni PM, Robinson EJ, Sarin Pradhan J, et al. Deep learning based on standard H&E images of primary melanoma tumors identifies patients at risk for visceral recurrence and death. Clin Cancer Res 2020;26(5):1126–34.
- 96. Thagaard J, Stovgaard ES, Vognsen LG, et al. Automated quantification of sTIL density with H&E-based digital image analysis has prognostic potential in triple-negative breast cancers. Cancers 2021;13(12):3050. 10.3390/cancers13123050.
- 97. Koyuncu CF, Lu C, Bera K, et al. Computerized tumor multinucleation index (MuNI) is prognostic in p16+ oropharyngeal carcinoma. J Clin Invest 2021;131(8). 10.1172/JCI145488.
- 98. Kather JN, Krisam J, Charoentong P, et al. Predicting survival from colorectal cancer histology slides using deep learning: a retrospective multicenter study. PLoS Med 2019;16(1):e1002730.
- 99. Courtiol P, Maussion C, Moarii M, et al. Deep learning-based classification of mesothelioma improves prediction of patient outcome. Nat Med 2019;25(10):1519–25.
- 100. Sun M, Zhou W, Qi X, et al. Prediction of BAP1 expression in uveal melanoma using densely-connected deep classification networks. Cancers 2019;11(10):1579. 10.3390/cancers11101579.
- 101. Yao J, Zhu X, Jonnagaddala J, et al. Whole slide images based cancer survival prediction using attention guided deep multiple instance learning networks. Med Image Anal 2020;65:101789.
- 102. Skrede O-J, de Raedt S, Kleppe A, et al. Deep learning for prediction of colorectal cancer outcome: a discovery and validation study. Lancet 2020;395(10221):350–60.
- 103. Shi JY, Wang X, Ding GY, et al. Exploring prognostic indicators in the pathological images of hepatocellular carcinoma based on deep learning. Gut 2021;70(5):951–61.
- 104. Ghosh A, Sirinukunwattana K, Khalid Alham N, et al. The potential of artificial intelligence to detect lymphovascular invasion in testicular cancer. Cancers 2021;13(6):1325. 10.3390/cancers13061325.
- 105. Corredor G, Wang X, Zhou Y, et al. Spatial architecture and arrangement of tumor-infiltrating lymphocytes for predicting likelihood of recurrence in early-stage non-small cell lung cancer. Clin Cancer Res 2019;25(5):1526–34.
- 106. Saito A, Toyoda H, Kobayashi M, et al. Prediction of early recurrence of hepatocellular carcinoma after resection using digital pathology images assessed by machine learning. Mod Pathol 2021;34(2):417–25.
- 107. Shim WS, Yim K, Kim TJ, et al. DeepRePath: identifying the prognostic features of early-stage lung adenocarcinoma using multi-scale pathology images and deep convolutional neural networks. Cancers 2021;13(13):3308. 10.3390/cancers13133308.
- 108. Foersch S, Eckstein M, Wagner DC, et al. Deep learning for diagnosis and survival prediction in soft tissue sarcoma. Ann Oncol 2021;32(9):1178–87.
- 109. Wang Y, Acs B, Robertson S, et al. Improved breast cancer histological grading using deep learning. Ann Oncol 2022;33(1):89–98.
- 110. Zadeh Shirazi A, McDonnell MD, Fornaciari E, et al. A deep convolutional neural network for segmentation of whole-slide pathology images identifies novel tumor cell-perivascular niche interactions that are associated with poor survival in glioblastoma. Br J Cancer 2021;125(3):337–50.
- 111. Yu KH, Hu V, Wang F, et al. Deciphering serous ovarian carcinoma histopathology and platinum response by convolutional neural networks. BMC Med 2020;18(1):236.
- 112. Nearchou IP, Ueno H, Kajiwara Y, et al. Automated detection and classification of desmoplastic reaction at the colorectal tumour front using deep learning. Cancers 2021;13(7):1615. 10.3390/cancers13071615.
- 113. Wang X, Chen Y, Gao Y, et al. Predicting gastric cancer outcome from resected lymph node histopathology images using deep learning. Nat Commun 2021;12(1):1637.
- 114. Chou M, Illa-Bochaca I, Minxi B, et al. Optimization of an automated tumor-infiltrating lymphocyte algorithm for improved prognostication in primary melanoma. Mod Pathol 2021;34(3):562–71.
- 115. Narayanan PL, Raza SEA, Hall AH, et al. Unmasking the immune microecology of ductal carcinoma in situ with deep learning. NPJ Breast Cancer 2021;7(1):19.
- 116. Acs B, Ahmed FS, Gupta S, et al. An open source automated tumor infiltrating lymphocyte algorithm for prognosis in melanoma. Nat Commun 2019;10(1):5440.
- 117. Heindl A, Sestak I, Naidoo K, et al. Relevance of spatial heterogeneity of immune infiltration for predicting risk of recurrence after endocrine therapy of ER+ breast cancer. J Natl Cancer Inst 2018;110(2):166–75.
- 118. Bhargava HK, Leo P, Elliott R, et al. Computationally derived image signature of stromal morphology is prognostic of prostate cancer recurrence following prostatectomy in African American patients. Clin Cancer Res 2020;26(8):1915–23.
- 119. Zhao L, Dong Q, Luo C, et al. DeepOmix: a scalable and interpretable multi-omics deep learning framework and application in cancer survival analysis. Comput Struct Biotechnol J 2021;19:2719–25.
- 120. Chen X, Chen S, Song S, et al. Cell type annotation of single-cell chromatin accessibility data via supervised Bayesian embedding. Nat Mach Intell 2022;4(2):116–26.
- 121. Liu Q, Chen S, Jiang R, et al. Simultaneous deep generative modeling and clustering of single cell genomic data. Nat Mach Intell 2021;3(6):536–44.
- 122. Chen SQ, Yan G, Zhang W, et al. RA3 is a reference-guided approach for epigenetic characterization of single cells. Nat Commun 2021;12(1):2177. 10.1038/s41467-021-22495-4.
- 123. Liu Q, Xu J, Jiang R, et al. Density estimation using deep generative neural networks. Proc Natl Acad Sci U S A 2021;118(15):e2101344118.
- 124. Chelebian E, Avenel C, Kartasalo K, et al. Morphological features extracted by AI associated with spatial transcriptomics in prostate cancer. Cancers 2021;13(19):4837. 10.3390/cancers13194837.
- 125. Sirinukunwattana K, Domingo E, Richman SD, et al. Image-based consensus molecular subtype (imCMS) classification of colorectal cancer using deep learning. Gut 2021;70(3):544–54.
- 126. Diao JA, Wang JK, Chui WF, et al. Human-interpretable image features derived from densely mapped cancer pathology slides predict diverse molecular phenotypes. Nat Commun 2021;12(1):1613.
- 127. Jin L, Shi F, Chun Q, et al. Artificial intelligence neuropathologist for glioma classification using deep learning on hematoxylin and eosin stained slide images and molecular markers. Neuro Oncol 2021;23(1):44–52.
- 128. Qu H, Zhou M, Yan Z, et al. Genetic mutation and biological pathway prediction based on whole slide images in breast carcinoma using deep learning. NPJ Precis Oncol 2021;5(1):87.
- 129. Valieris R, Amaro L, Osório CABT, et al. Deep learning predicts underlying features on pathology images with therapeutic relevance for breast and gastric cancer. Cancers 2020;12(12):3687. 10.3390/cancers12123687.
- 130. Zheng H, Momeni A, Cedoz PL, et al. Whole slide images reflect DNA methylation patterns of human tumors. NPJ Genom Med 2020;5:11.
- 131. Tavolara TE, Niazi MKK, Gower AC, et al. Deep learning predicts gene expression as an intermediate data modality to identify susceptibility patterns in Mycobacterium tuberculosis infected diversity outbred mice. EBioMedicine 2021;67:103388.
- 132. Chen M, Zhang B, Topatana W, et al. Classification and mutation prediction based on histopathology H&E images in liver cancer using deep learning. NPJ Precis Oncol 2020;4:14.
- 133. Huang K, Mo Z, Zhu W, et al. Prediction of target-drug therapy by identifying gene mutations in lung cancer with histopathological stained image and deep learning techniques. Front Oncol 2021;11:642945.
- 134. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat Med 2018;24(10):1559–67.
- 135. Noorbakhsh J, Farahmand S, Foroughi pour A, et al. Deep learning-based cross-classifications reveal conserved spatial behaviors within tumor histological images. Nat Commun 2020;11(1):6367.
- 136. Bian C, Wang Y, Lu Z, et al. ImmunoAIzer: a deep learning-based computational framework to characterize cell distribution and gene mutation in tumor microenvironment. Cancers 2021;13(7):1659. 10.3390/cancers13071659.
- 137. Mobadersany P, Yousefi S, Amgad M, et al. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc Natl Acad Sci U S A 2018;115(13):E2970–9.
- 138. Chen RJ, Lu MY, Weng WH, et al. Multimodal co-attention transformer for survival prediction in gigapixel whole slide images. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4015–25. IEEE, 2021.
- 139. Zhang F, Zhong L-Z, Zhao X, et al. A deep-learning-based prognostic nomogram integrating microscopic digital pathology and macroscopic magnetic resonance images in nasopharyngeal carcinoma: a multi-cohort study. Ther Adv Med Oncol 2020;12:1758835920971416.
- 140. Naik N, Madani A, Esteva A, et al. Deep learning-enabled breast cancer hormonal receptor status determination from base-level H&E stains. Nat Commun 2020;11(1):5727.
- 141. Lepine C, Klein P, Voron T, et al. Histological severity risk factors identification in juvenile-onset recurrent respiratory papillomatosis: how immunohistochemistry and AI algorithms can help? Front Oncol 2021;11:596499.
- 142. Chatrian A, Colling RT, Browning L, et al. Artificial intelligence for advance requesting of immunohistochemistry in diagnostically uncertain prostate biopsies. Mod Pathol 2021;34(9):1780–94.
- 143. Martino F, Varricchio S, Russo D, et al. A machine-learning approach for the assessment of the proliferative compartment of solid tumors on hematoxylin-eosin-stained sections. Cancers 2020;12(5):1344.
- 144. Schmauch B, Romagnoni A, Pronier E, et al. A deep learning model to predict RNA-Seq expression of tumours from whole slide images. Nat Commun 2020;11(1):3877.
- 145. AbdulJabbar K, Ahmed Raza SE, Rosenthal R, et al. Geospatial immune variability illuminates differential evolution of lung adenocarcinoma. Nat Med 2020;26(7):1054–62.
- 146. Rawat RR, Ruderman D, Macklin P, et al. Correlating nuclear morphometric patterns with estrogen receptor status in breast cancer pathologic specimens. NPJ Breast Cancer 2018;4:32.
- 147. Van Eycke YR, Balsat C, Verset L, et al. Segmentation of glandular epithelium in colorectal tumours to automatically compartmentalise IHC biomarker quantification: a deep learning approach. Med Image Anal 2018;49:35–45.
- 148. Su F, Sun Y, Hu Y, et al. Development and validation of a deep learning system for ascites cytopathology interpretation. Gastric Cancer 2020;23(6):1041–50.
- 149. Bao G, Wang X, Xu R, et al. PathoFusion: an open-source AI framework for recognition of pathomorphological features and mapping of immunohistochemical data. Cancers 2021;13(4):617. 10.3390/cancers13040617.
- 150. Race AM, Sutton D, Hamm G, et al. Deep learning-based annotation transfer between molecular imaging modalities: an automated workflow for multimodal data integration. Anal Chem 2021;93(6):3061–71.
- 151. Carneiro G, Peng T, Bayer C, et al. Automatic quantification of tumour hypoxia from multi-modal microscopy images using weakly-supervised learning methods. IEEE Trans Med Imaging 2017;36(7):1405–17.
- 152. Jayapandian CP, Chen Y, Janowczyk AR, et al. Development and evaluation of deep learning-based segmentation of histologic structures in the kidney cortex with multiple histologic stains. Kidney Int 2021;99(1):86–101.
- 153. Gehrung M, Crispin-Ortuzar M, Berman AG, et al. Triage-driven diagnosis of Barrett's esophagus for early detection of esophageal adenocarcinoma using deep learning. Nat Med 2021;27(5):833–41.
- 154. Rathore S, Niazi T, Iftikhar MA, et al. Glioma grading via analysis of digital pathology images using machine learning. Cancers 2020;12(3):578. 10.3390/cancers12030578.
- 155. Chen S, Zhang N, Jiang L, et al. Clinical use of a machine learning histopathological image signature in diagnosis and survival prediction of clear cell renal cell carcinoma. Int J Cancer 2021;148(3):780–90.
- 156. Brinker TJ, Kiehl L, Schmitt M, et al. Deep learning approach to predict sentinel lymph node status directly from routine histology of primary melanoma tumours. Eur J Cancer 2021;154:227–34.
- 157. Pei L, Jones KA, Shboul ZA, et al. Deep neural network analysis of pathology images with integrated molecular data for enhanced glioma classification and grading. Front Oncol 2021;11:668694.
- 158. Yu KH, Berry GJ, Rubin DL, et al. Association of omics features with histopathology patterns in lung adenocarcinoma. Cell Syst 2017;5(6):620–627.e3.
- 159. Graham S, Vu QD, Raza SEA, et al. Hover-Net: simultaneous segmentation and classification of nuclei in multi-tissue histology images. Med Image Anal 2019;58:101563.
- 160. Wang X, Fang Y, Yang S, et al. A hybrid network for automatic hepatocellular carcinoma segmentation in H&E-stained whole slide images. Med Image Anal 2021;68:101914.
- 161. Wang H, Jiang Y, Li B, et al. Single-cell spatial analysis of tumor and immune microenvironment on whole-slide image reveals hepatocellular carcinoma subtypes. Cancers 2020;12(12):3562. 10.3390/cancers12123562.
- 162. Ciga O, Martel AL. Learning to segment images with classification labels. Med Image Anal 2021;68:101912.
- 163. Schrammen PL, Ghaffari Laleh N, Echle A, et al. Weakly supervised annotation-free cancer detection and prediction of genotype in routine histopathology. J Pathol 2022;256(1):50–60.
- 164. Lu MY, Chen TY, Williamson DFK, et al. AI-based pathology predicts origins for cancers of unknown primary. Nature 2021;594(7861):106–10.
- 165. Riasatian A, Babaie M, Maleki D, et al. Fine-tuning and training of DenseNet for histopathology image representation using TCGA diagnostic slides. Med Image Anal 2021;70:102032.
- 166. Kalra S, Tizhoosh HR, Choi C, et al. Yottixel - an image search engine for large archives of histopathology whole slide images. Med Image Anal 2020;65:101757.
- 167. Sobhani F, Robinson R, Hamidinekoo A, et al. Artificial intelligence and digital pathology: opportunities and implications for immuno-oncology. Biochim Biophys Acta Rev Cancer 2021;1875(2):188520.
- 168. Duggento A, Conti A, Mauriello A, et al. Deep computational pathology in breast cancer. Semin Cancer Biol 2021;72:226–37.
- 169. Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 2008;455(7216):1061–8.
- 170. Roux L, Racoceanu D, Loménie N, et al. Mitosis detection in breast cancer histological images: an ICPR 2012 contest. J Pathol Inform 2013;4:8.
- 171. Roux L, Racoceanu D, Capron F, et al. Mitos & Atypia. Image Pervasive Access Lab (IPAL), Agency Sci., Technol. & Res. Inst. Infocom Res., Singapore, Tech. Rep. 2014;11–8.
- 172. Sirinukunwattana K, Pluim JPW, Chen H, et al. Gland segmentation in colon histology images: the GlaS challenge contest. Med Image Anal 2017;35:489–502.
- 173. Ehteshami Bejnordi B, Veta M, Johannes van Diest P, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 2017;318(22):2199–210.
- 174. Veta M, Heng YJ, Stathonikos N, et al. Predicting breast tumor proliferation from whole-slide images: the TUPAC16 challenge. Med Image Anal 2019;54:111–21.
- 175. Litjens G, Bandi P, Ehteshami Bejnordi B, et al. 1399 H&E-stained sentinel lymph node sections of breast cancer patients: the CAMELYON dataset. Gigascience 2018;7(6):giy065. 10.1093/gigascience/giy065.
- 176. Aresta G, Araújo T, Kwok S, et al. BACH: grand challenge on breast cancer histology images. Med Image Anal 2019;56:122–39.
- 177. Veeling BS, Linmans J, Winkens J, et al. Rotation equivariant CNNs for digital pathology. In: Medical Image Computing and Computer Assisted Intervention – MICCAI 2018, Part II, Vol. 11071, pp. 210–8. Springer, Cham, 2018.
- 178. Borovec J, Kybic J, Arganda-Carreras I, et al. ANHIR: automatic non-rigid histological image registration challenge. IEEE Trans Med Imaging 2020;39(10):3042–52.
- 179. Amgad M, Elfandy H, Hussein H, et al. Structured crowdsourcing enables convolutional segmentation of histology images. Bioinformatics 2019;35(18):3461–7.
- 180. Li JH, Yang S, Huang X, et al. Signet ring cell detection with a semi-supervised learning framework. In: Information Processing in Medical Imaging (IPMI 2019), Vol. 11492, pp. 842–54. Springer, Cham, 2019.
- 181. Bertram CA, Aubreville M, Marzahl C, et al. A large-scale dataset for mitotic figure assessment on whole slide images of canine cutaneous mast cell tumor. Sci Data 2019;6:274.
- 182. Kim YJ, Jang H, Lee K, et al. PAIP 2019: liver cancer segmentation challenge. Med Image Anal 2021;67:101854.
- 183. Conde-Sousa E, Vale J, Feng M, et al. HEROHE challenge: assessing HER2 status in breast cancer without immunohistochemistry or in situ hybridization. arXiv preprint arXiv:2111.04738, 2021.
- 184. Aubreville M, Bertram CA, Donovan TA, et al. A completely annotated whole slide image dataset of canine breast cancer to aid human breast cancer research. Sci Data 2020;7(1):417.
- 185. Verma R, Kumar N, Patil A, et al. MoNuSAC2020: a multi-organ nuclei segmentation and classification challenge. IEEE Trans Med Imaging 2021;40(12):3413–23.
- 186. Bulten W, Kartasalo K, Chen PHC, et al. Artificial intelligence for diagnosis and Gleason grading of prostate cancer: the PANDA challenge. Nat Med 2022;28(1):154–63.
- 187. Xu F, Zhu C, Tang W, et al. Predicting axillary lymph node metastasis in early breast cancer using deep learning on primary tumor biopsy slides. Front Oncol 2021;11:759007.
- 188. Aubreville M, Bertram C, Veta M, et al. Quantifying the scanner-induced domain gap in mitosis detection. arXiv preprint arXiv:2103.16515, 2021.
- 189. Amgad M, Atteya LA, Hussein H, et al. NuCLS: a scalable crowdsourcing, deep learning approach and dataset for nucleus classification, localization and segmentation. arXiv preprint arXiv:2102.09099, 2021.
- 190. Cuadrado-Godia E, Dwivedi P, Sharma S, et al. Cerebral small vessel disease: a review focusing on pathophysiology, biomarkers, and machine learning strategies. J Stroke 2018;20(3):302–20.
- 191. Han C, Lin J, Mai J, et al. Multi-layer pseudo-supervision for histopathology tissue semantic segmentation using patch-level classification labels. Med Image Anal 2021;102487.
- 192. Graham S, Jahanifar M, Dang VQ, et al. CoNIC: colon nuclei identification and counting challenge 2022. arXiv preprint arXiv:2111.14485, 2021.
- 193. Bera K, Schalper KA, Rimm DL, et al. Artificial intelligence in digital pathology - new tools for diagnosis and precision oncology. Nat Rev Clin Oncol 2019;16(11):703–15.
- 194. Howard FM, Dolezal J, Kochanny S, et al. The impact of site-specific digital histology signatures on deep learning model accuracy and bias. Nat Commun 2021;12(1):4423.
- 195. Boschman J, Farahani H, Darbandsari A, et al. The utility of color normalization for AI-based diagnosis of hematoxylin and eosin-stained pathology images. J Pathol 2022;256(1):15–24.
- 196. Tellez D, Litjens G, Bándi P, et al. Quantifying the effects of data augmentation and stain color normalization in convolutional neural networks for computational pathology. Med Image Anal 2019;58:101544.
- 197. McCombe KD, Craig SG, Viratham Pulsawatdi A, et al. HistoClean: open-source software for histological image pre-processing and augmentation to improve development of robust convolutional neural networks. Comput Struct Biotechnol J 2021;19:4840–53.
- 198. Hohn J, Krieghoff-Henning E, Jutzi TB, et al. Combining CNN-based histologic whole slide image analysis and patient data to improve skin cancer classification. Eur J Cancer 2021;149:94–101.
- 199. Chen CL, Chen CC, Yu WH, et al. An annotation-free whole-slide training approach to pathological classification of lung cancer types using deep learning. Nat Commun 2021;12(1):1193.
- 200. Raciti P, Sue J, Ceballos R, et al. Novel artificial intelligence system increases the detection of prostate cancer in whole slide images of core needle biopsies. Mod Pathol 2020;33(10):2058–66.
- 201. Hekler A, Utikal JS, Enk AH, et al. Deep learning outperformed 11 pathologists in the classification of histopathological melanoma images. Eur J Cancer 2019;118:91–6.
- 202. Hanna MG, Reuter VE, Ardon O, et al. Validation of a digital pathology system including remote review during the COVID-19 pandemic. Mod Pathol 2020;33(11):2115–27.
- 203. Silva LM, Pereira EM, Salles PGO, et al. Independent real-world application of a clinical-grade automated prostate cancer detection system. J Pathol 2021;254(2):147–58.
- 204. Deng J, Dong W, Socher R, et al. ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 248–55. IEEE, 2009.