Abstract
Background
Images obtained from stimulated Raman scattering can be used to identify histomorphologically relevant information intraoperatively. When deep learning algorithms are used to distinguish tumoral from non-tumoral tissue, data preprocessing remains a crucial step and may affect classification performance. To date, the effect of different preprocessing techniques on the performance of deep learning algorithms is unclear. This study aims to contribute to closing this knowledge gap.
Methods
To investigate the influence of different preprocessing techniques on images obtained from stimulated Raman scattering, six deep learning architectures (VGG19, ResNet50, InceptionResNetV2, Xception, ConvNeXt and Vision Transformer) and five different preprocessing procedures were compared. For this, annotated datasets comprising 542 images of tissue samples obtained from patients with oral squamous cell carcinoma and non-small cell lung carcinoma were used for network training. Each network was trained five times for 40 epochs. The performance metrics balanced accuracy, precision, recall and F1-score were recorded. Class activation and attention maps were used to highlight the input pixels on which a prediction is based.
Results
Scaling the original pixel values of stimulated Raman scattering images to the range [0, 1] yielded a higher and more stable overall classification performance across the neural networks when compared to more sophisticated and computationally expensive methods [mean F1-score =0.8327; standard deviation (SD) =0.0622 on the scaled dataset vs. mean F1-score =0.7213 (SD =0.2315) on the complex preprocessed dataset; P≤0.05]. Absolute performance was best on stimulated Raman histology images (mean F1-score =0.8478; SD =0.1487).
Conclusions
This study shows that preprocessing of pixel values of stimulated Raman scattering images can have a great impact on the performance and the stability of deep learning algorithms when applied for classification of cancer tissue.
Keywords: Stimulated Raman scattering microscopy (SRS microscopy), image processing, deep learning (DL), stimulated Raman histology (SRH)
Introduction
Hematoxylin and eosin staining (H&E) represents the gold standard in histopathological analyses. Using H&E for tissue staining makes cellular and subcellular morphology information visible and assessable for trained pathologists (1). In order to generate H&E-stained slides, an elaborate and time-intensive workflow comprising the macroscopic cutting of the specimen, formalin fixation, paraffin embedding, cutting on the microtome, deparaffination and staining of the resulting slice is required. Stimulated Raman histology (SRH) (2,3) represents an approach enabling intraoperative pathological assessment of tissue specimens, bypassing the entire traditional workflow. SRH utilizes stimulated Raman scattering (SRS) microscopy to generate images resembling H&E-stained tissue sections. With the NIO Laser Imaging System (Invenio Imaging Inc., Santa Clara, CA, USA), this can be achieved intraoperatively from fresh tissue specimens without preprocessing. Based on two Raman shifts at wavenumbers k1=2,845 cm−1 and k2=2,930 cm−1, representing the inelastic scattering of photons at CH2 and CH3 groups, the spatial distribution of cellular molecules such as lipids, proteins and DNA can be depicted. A proprietary coloring algorithm within the NIO system (software version 1.6.0) transforms the SRS images into SRH images, enabling visual assessment by a pathologist. This visualization is derived from H&E morphology and mimics traditional H&E staining.
SRH has been successfully deployed in various clinical disciplines, including neurosurgery (4-6) and urology (7). With the advent of the digitization of tissue slides as whole slide images (WSI), deep learning (DL) has emerged as a promising tool for automation and diagnostic guidance to support pathologists in their decision-making process. Although only a few DL applications have matured to implementation in clinical practice (8), many studies show the potential of DL in histopathology. Applications range from diagnosis and prediction of survival of colorectal cancer (9,10) to the prediction of gene expression and classification from breast cancer histopathology images (11,12). A current overview of DL for histopathologic tasks can be found in (13).
The combination of SRS or SRH with DL has led to drastic reductions in diagnosis time, as demonstrated for gastric cancer (14), glioma recurrence (15) and Gleason scoring of prostate biopsies (16). Other works leveraged SRS images to count cells with a U-Net-like neural network (17), to quantify the uptake of drug molecules in skin (18) or to perform DL-based histological diagnosis of breast core-needle biopsies (19). When considering whole Raman spectra instead of two discrete Raman shifts, many methods for spectral preprocessing exist (20). Wahl et al. (21) showed that the spectral appearance of Raman features from brain tumors depends on the chosen preprocessing method. However, for SRS images acquired at specific Raman shifts, a comparison of algorithmic performance under different preprocessing techniques is lacking. In general, explicit rationales for how raw pixel values of SRS images are altered before being fed into a DL algorithm are rarely reported.
The objective of this study was to compare the performance of several convolutional neural networks (CNNs) and a vision transformer trained on SRS and SRH images, in order to investigate how the performance of the neural networks depends on different preprocessing of SRS images. Furthermore, class activation maps (CAMs) and attention maps are used to highlight relevant input features for a given prediction of the neural networks. We present this article in accordance with the CLEAR reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-2024-2608/rc).
Methods
Dataset
The dataset comprises 542 SRS and SRH images with a pixel size of 467 nm obtained from eight patients (83 images, collected at the Department of Oral and Maxillofacial Surgery, Medical Center, University of Freiburg, Germany) with oral squamous cell carcinoma (OSCC) and 136 patients (459 images, collected at the Department for Thoracic Surgery, Medical Center, University of Freiburg, Germany) with non-small cell lung carcinoma (NSCLC). Regions of interest on SRH images were annotated as tumoral or non-tumoral tissue by an assistant pathologist, supervised and reviewed by a second, board-certified pathologist, using QuPath v.4.3 (22). Subsequently, all annotations were transferred from the SRH images to the SRS images. The images were divided into tiles measuring 250×250 pixels. If a tile spatially overlapped an annotation by at least 99%, it was labeled with the respective annotation, resulting in 31,788 tiles labeled as tumoral and 16,387 tiles labeled as non-tumoral. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the local ethics committee of Ethik-Kommission der Albert-Ludwigs-Universität Freiburg (No. 22-1037 and No. 22-1322_2-S1) and written informed consent was taken from all individual participants.
Preprocessing of SRH images
Pixel intensities of SRH images were scaled to the range [0, 1] and saved as 16-bit floating point numbers. Due to the absence of stain or intensity variations, color normalization was not necessary.
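As a minimal sketch, this scaling step could look as follows (assuming the SRH images are available as NumPy arrays; function and variable names are illustrative):

```python
import numpy as np

def scale_to_unit_range(image: np.ndarray) -> np.ndarray:
    """Min-max scale pixel intensities to [0, 1] and store them as 16-bit floats."""
    image = image.astype(np.float32)
    lo, hi = image.min(), image.max()
    if hi > lo:
        image = (image - lo) / (hi - lo)
    else:
        image = np.zeros_like(image)
    return image.astype(np.float16)  # 16-bit floating point storage
```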
Preprocessing of SRS images
SRS images were acquired in successive line scans with a width of 1,000 pixels and an overlap of 50 pixels between the individual line scans. The laser intensity is highest at the center of each line scan and drops off to each side. Overlap pixels were removed and the line scans were stitched together to generate the resulting SRS image. Since each pixel of an SRS image consists of two values corresponding to the Raman shifts k1 and k2, but DL algorithms for image classification typically require three channels, a third image channel was populated with the spectral difference CH3-CH2 for each pixel, highlighting cell nuclei and protein-rich structures, as described in Hollon et al. (23).
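The construction of the three-channel input could be sketched as follows (the channel ordering and data types are assumptions for illustration, not taken from the original pipeline):

```python
import numpy as np

def build_three_channel_srs(ch2: np.ndarray, ch3: np.ndarray) -> np.ndarray:
    """Stack the two SRS channels and their difference CH3 - CH2 into a
    three-channel image, as expected by standard image classifiers."""
    ch2 = ch2.astype(np.float32)
    ch3 = ch3.astype(np.float32)
    diff = ch3 - ch2  # highlights cell nuclei and protein-rich structures
    return np.stack([ch2, ch3, diff], axis=-1)  # shape (height, width, 3)
```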
We compared five different ways of preprocessing the original SRS pixel values:
Complex preprocessing according to a combined approach from (23,24): The bottom and top 3% of pixel intensities were clipped and the image was mean zero centered. To correct intensity variations, for example caused by laser intensity fluctuations, field flattening was applied: a Gaussian filter smooths the image, the smoothed image is subtracted from the original, and the mean intensity of the original image is added back to maintain brightness. Since low signal intensities might still be underrepresented, histogram equalization was applied to enhance contrast, improving the visibility of low-contrast structures by stretching the pixel intensity distribution. This was followed by a registration of the CH3 to the CH2 channel with the functions phaseCorrelate and warpAffine from OpenCV (25). Resulting pixel intensities were scaled to the interval [0, 1].
Histogram matching: The intensity distribution of the first line scan of an image served as the reference for the respective image, and the histograms of all other line scans were altered to match the histogram of the reference line scan using exposure.match_histograms from scikit-image (26). Pixel intensities were scaled to the interval [0, 1]. With histogram matching, the intensity variation pattern of the first line scan is transferred to all other line scans. A neural network might then treat these periodic intensity variation patterns as noise and assign no predictive value to them.
Histogram equalization: The pixel intensity distribution for each line scan was equalized using exposure.equalize_adapthist from scikit-image (26) with a clip limit of 0.01. Pixel intensities were scaled to the interval [0, 1]. Histogram equalization ensures that underrepresented structures, which may hold predictive potential, are not overshadowed by overrepresented structures, and its application therefore might improve the overall prediction performance of the neural networks.
Blurring: Each line scan was blurred with a Gaussian filter using ndimage.gaussian_filter from SciPy (27) with sigma =5. Pixel intensities were scaled to the interval [0, 1]. The choice of a rather high standard deviation (SD) of sigma =5 was driven by the rationale that large line scan intensity variations need to be addressed with a strong smoothing of pixel intensities. Additionally, strong blurring does not necessarily compromise predictive power for a neural network, although it surely reduces diagnostic information for a human observer. It therefore remains to be seen to what extent a strong blurring reduces the performance of the neural networks.
Scaling: Original pixel intensities were scaled to the interval [0, 1] without any preceding manipulation. Here, the neural networks must identify predictive pixel patterns entirely on their own; the hypothesis is that they learn to justifiably ignore the pixel intensity variations.
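The following sketch outlines how the five preprocessing variants could be implemented with the libraries named above (OpenCV, scikit-image, SciPy). The percentile cut-offs, the field-flattening filter width, the per-line-scan looping and all function names are simplifying assumptions for illustration, not the exact pipeline used in this study:

```python
import cv2
import numpy as np
from scipy import ndimage
from skimage import exposure

def scale_unit(x: np.ndarray) -> np.ndarray:
    """Scale pixel intensities to [0, 1]."""
    x = x.astype(np.float32)
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo) if hi > lo else np.zeros_like(x)

def complex_preprocess(ch2: np.ndarray, ch3: np.ndarray, sigma: float = 50.0):
    """Percentile clipping, mean centering, field flattening, histogram
    equalization and CH3-to-CH2 registration (simplified sketch)."""
    channels = []
    for ch in (ch2, ch3):
        ch = ch.astype(np.float32)
        lo, hi = np.percentile(ch, (3, 97))               # clip bottom/top 3%
        ch = np.clip(ch, lo, hi)
        mean = ch.mean()
        ch = ch - mean                                    # mean zero centering
        flat = ch - ndimage.gaussian_filter(ch, sigma) + mean  # field flattening
        ch = exposure.equalize_hist(scale_unit(flat))     # histogram equalization
        channels.append(ch.astype(np.float32))
    ch2_p, ch3_p = channels
    (dx, dy), _ = cv2.phaseCorrelate(ch2_p, ch3_p)        # estimate CH3 shift
    shift = np.float32([[1, 0, -dx], [0, 1, -dy]])
    ch3_p = cv2.warpAffine(ch3_p, shift, (ch3_p.shape[1], ch3_p.shape[0]))
    return scale_unit(ch2_p), scale_unit(ch3_p)

def histogram_matching(line_scans):
    """Match every line scan to the intensity histogram of the first one."""
    reference = line_scans[0]
    matched = [reference] + [exposure.match_histograms(ls, reference)
                             for ls in line_scans[1:]]
    return [scale_unit(ls) for ls in matched]

def histogram_equalization(line_scans, clip_limit: float = 0.01):
    """Adaptive histogram equalization of each line scan."""
    return [scale_unit(exposure.equalize_adapthist(scale_unit(ls), clip_limit=clip_limit))
            for ls in line_scans]

def blurring(line_scans, sigma: float = 5.0):
    """Gaussian blurring of each line scan."""
    return [scale_unit(ndimage.gaussian_filter(ls.astype(np.float32), sigma))
            for ls in line_scans]

def scaling(image: np.ndarray) -> np.ndarray:
    """Plain scaling of the raw pixel values to [0, 1]."""
    return scale_unit(image).astype(np.float16)  # stored as 16-bit floats
```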
All pixel intensities were stored as 16-bit floating point numbers. Figure 1 shows an exemplary SRH image together with the corresponding SRS images after the different preprocessing procedures. Figure 2 shows SRH and SRS tumoral and non-tumoral tiles.
Figure 1.
Exemplary SRH (A) and corresponding SRS images (B-F) of a slide with a sample of oral squamous cell carcinoma. SRS images are colored with the viridis color map, depict the CH2 channel and were preprocessed with complex preprocessing (B), histogram matching (C), histogram equalization (D), blurring (E) and scaling (F). SRH, stimulated Raman histology; SRS, stimulated Raman scattering.
Figure 2.
Exemplary SRH tiles of non-tumoral (A1 and B1) and tumoral (C1 and D1) tissue with corresponding SRS tiles (A2-A6, B2-B6, C2-C6, D2-D6) of a slide containing samples of oral squamous cell carcinoma. Shown SRS tiles are colored with the viridis color map, depict the CH2 channel and tiles were preprocessed with complex preprocessing (A2, B2, C2, D2), histogram matching (A3, B3, C3, D3), histogram equalization (A4, B4, C4, D4), blurring (A5, B5, C5, D5) and scaling (A6, B6, C6, D6). SRH, stimulated Raman histology; SRS, stimulated Raman scattering.
Data split and class distributions
All neural networks were trained on approximately 75% and tested on approximately 25% of the dataset. The dataset was split into training, validation, and test set at the image level, with the split designed to approximate the desired number of tiles for each class within each subset (Figure 3). The desired number of tiles of each subset was based on the overall class distribution. However, since the distribution of tiles is uneven across images, the target number of tiles for each subset could only be approximated. The original class distribution consists of 65.98% non-tumoral and 34.02% tumoral tiles and the class distributions of the training, validation and test set remain close to the original (training set: 64.64% non-tumoral and 35.36% tumoral; validation set: 67.22% non-tumoral and 32.78% tumoral; test set: 68.67% non-tumoral and 31.33% tumoral).
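The image-level assignment could be sketched as a greedy heuristic of the following kind (illustrative only; the actual assignment procedure and the handling of the per-class tile counts may differ):

```python
import random

def greedy_image_split(tile_counts, targets, seed=0):
    """Assign whole images to subsets so that the resulting tile counts
    approximate the desired per-subset targets.

    tile_counts: {image_id: number_of_tiles}
    targets:     {"train": n_train, "val": n_val, "test": n_test} desired tile counts
    """
    rng = random.Random(seed)
    images = list(tile_counts)
    rng.shuffle(images)
    assigned = {subset: [] for subset in targets}
    filled = {subset: 0 for subset in targets}
    for image_id in images:
        # place the image where the remaining tile deficit is largest
        subset = max(targets, key=lambda s: targets[s] - filled[s])
        assigned[subset].append(image_id)
        filled[subset] += tile_counts[image_id]
    return assigned
```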
Figure 3.
Class distribution for training, validation and test sets.
Algorithms
VGG19 (28), ResNet50 (29), InceptionResNetV2 (30), Xception (31), ConvNeXt-L (32) and a Vision Transformer (ViT) (33) were compared. This selection covers a wide range of algorithms, from standard deep CNNs (VGG19 and ResNet50) over more complex architectures built on Inception modules and depthwise separable convolutions (InceptionResNetV2 and Xception) to designs reflecting the shift towards transformer-inspired architectures and attention mechanisms (ConvNeXt-L and ViT). A grid search in the hyperparameter space for learning rate and number of epochs yielded the best performance on the validation set for a learning rate of 0.0001 and 40 epochs. All computations were performed using Python 3.9.16 and TensorFlow 2.6.0 (34) on an NVIDIA GeForce RTX 4090 with a batch size of 30. Table 1 provides an overview of all hyperparameter settings. Training procedures for each dataset and neural network were repeated five times from scratch in order to estimate the intra-dataset variation of the performance for each neural network.
Table 1. Overview of hyperparameters and settings.
Hyperparameter | Setting
--- | ---
Learning rate | 0.0001
Number of epochs | 40
Batch size | 30
Optimizer | Adam
Momentum | beta_1 =0.9; beta_2 =0.999
Data augmentation | Random horizontal and vertical flipping
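A condensed TensorFlow/Keras training sketch under the settings of Table 1 is given below; the backbone choice, the classification head and the data pipelines (train_ds, val_ds) are placeholders and not the authors' exact implementation:

```python
import tensorflow as tf

def build_classifier(backbone: tf.keras.Model) -> tf.keras.Model:
    """Wrap a backbone with flip augmentation and a two-class softmax head."""
    inputs = tf.keras.Input(shape=(250, 250, 3))
    x = tf.keras.layers.RandomFlip("horizontal_and_vertical")(inputs)  # augmentation
    x = backbone(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(2, activation="softmax")(x)  # tumoral vs. non-tumoral
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4, beta_1=0.9, beta_2=0.999),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Example with one of the compared backbones; train_ds and val_ds are assumed
# to be tf.data pipelines of 250x250 tiles batched with a batch size of 30.
backbone = tf.keras.applications.ResNet50(include_top=False, weights=None,
                                          input_shape=(250, 250, 3))
model = build_classifier(backbone)
# model.fit(train_ds, validation_data=val_ds, epochs=40)
```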
Performance evaluation
To assess the neural network performance on the test set, the metrics precision, recall, F1-score (35), balanced accuracy and receiver operating characteristic (ROC) curves (36), together with the area under the curve (AUC), were recorded.
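A sketch of how these metrics could be computed for a binary tumoral/non-tumoral prediction is shown below (scikit-learn is assumed here; the paper does not name the metrics implementation):

```python
import numpy as np
from sklearn.metrics import (auc, balanced_accuracy_score, f1_score,
                             precision_score, recall_score, roc_curve)

def evaluate(y_true, y_score, threshold=0.5):
    """Compute the reported metrics from true labels (1 = tumoral, 0 = non-tumoral)
    and predicted tumor probabilities."""
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    fpr, tpr, _ = roc_curve(y_true, y_score)
    return {
        "balanced_accuracy": balanced_accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "auc": auc(fpr, tpr),
    }
```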
CAMs and attention maps
For CNNs, CAMs (37,38) utilize global average pooling (GAP) to convert the spatial information in the final convolutional layer into a single value per feature map. This aggregated information represents the relevance of each feature map in making the final prediction. The CAM is then obtained by taking a weighted sum of the feature maps in the final convolutional layer, with the weights determined by the importance assigned to each feature map during the classification process. In the case of ViT, attention maps are used. ViT employs a self-attention mechanism which captures relationships between different spatial locations and allows the model to learn global context. The ViT architecture deployed here uses multi-head attention, where the model learns multiple sets of attention weights. Each set provides a different perspective on the relationships within the image. The attention maps shown here were derived by combining the maximum attention weights across the individual heads.
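For a CNN whose final convolutional block is followed by GAP and a dense softmax layer, the CAM computation can be sketched as follows (layer indexing and names are assumptions; the attention-map extraction for the ViT is architecture-specific and omitted here):

```python
import numpy as np
import tensorflow as tf

def class_activation_map(model, image, last_conv_layer_name, class_idx):
    """Classic CAM: weight the last convolutional feature maps with the dense
    layer weights of the target class and sum them up."""
    last_conv = model.get_layer(last_conv_layer_name)
    feature_extractor = tf.keras.Model(model.inputs, last_conv.output)
    feature_maps = feature_extractor(image[np.newaxis, ...])[0].numpy()  # (h, w, channels)
    # kernel of the final dense layer: shape (channels, num_classes), assuming
    # the head layout GAP -> Dense(softmax)
    dense_weights = model.layers[-1].get_weights()[0]
    cam = feature_maps @ dense_weights[:, class_idx]  # weighted sum -> (h, w)
    cam = np.maximum(cam, 0)
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam  # upsample to the tile size for overlaying on the input
```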
Results
Performances of neural networks
The overall highest mean balanced accuracy of 0.8643 (SD =0.0925) was achieved on the SRH dataset. Within the SRS datasets, the highest mean balanced accuracy of 0.8453 (SD =0.0250) was achieved with a scaling of the raw pixel intensities. The lowest mean balanced accuracy of 0.7273 (SD =0.0942) was achieved on the SRS data with histogram matching. The balanced accuracies of the neural networks on the scaled SRS dataset showed the smallest variance (var =0.0006; SD =0.0250).
The highest mean precision of 0.8542 was achieved on the SRH dataset. Within the SRS datasets, the highest mean precision of 0.8287 was achieved on the scaled dataset. The precision values showed the smallest variance and SD on the blurred SRS dataset (var =0.0082 and SD =0.0906). For recall and F1-score, the best mean performance was achieved on the SRH dataset (0.8643 for recall and 0.8478 for F1-score) and the smallest variance and SD were achieved on the scaled SRS dataset. Since the performance metric distributions were neither normally distributed nor homogeneous in variance, a Kruskal-Wallis test was performed to assess the statistical significance of the distribution differences across all datasets. All performance metrics, together with violin plots and mean values, are depicted in Figure 4, and P values for each combination of datasets and for all performance metrics are shown in Figure 5. The distribution differences of the performance metrics are statistically significant (P<0.05) between SRH and all SRS datasets for all performance metrics. For complex preprocessing and scaling, the distribution differences of balanced accuracy and F1-score are statistically significant (P<0.05), whereas for the individual metrics precision and recall the distribution differences might be due to random effects (P>0.05). All performance metric values can be found in Table 2. Detailed summarizing statistics are shown in Table 3. ROC curves for each architecture together with AUC values are shown in Figures S1-S6. Confusion matrices are shown in Figures S7-S12.
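As an illustration, a pairwise comparison can be reproduced with scipy.stats.kruskal; the values below are the VGG19 balanced accuracies from Table 2 and serve only as example inputs, whereas the study pooled the metric values across all networks, runs and classes:

```python
from scipy.stats import kruskal

# VGG19 balanced accuracies per run (from Table 2), used here as example input
srh        = [0.89, 0.83, 0.91, 0.87, 0.88]
scaling    = [0.86, 0.87, 0.86, 0.89, 0.85]
complex_pp = [0.81, 0.78, 0.79, 0.57, 0.61]

for name, samples in [("SRH vs. scaling", (srh, scaling)),
                      ("scaling vs. complex", (scaling, complex_pp))]:
    statistic, p_value = kruskal(*samples)
    print(f"{name}: H = {statistic:.3f}, P = {p_value:.4f}")
```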
Figure 4.
Performance metrics of the neural networks on each dataset. Balanced accuracy, precision, recall and F1-Score are depicted for each class separately. Violin plots are shown in the background of each subplot for an estimation of the overall distribution of the metrics. Mean values across all performance metrics in a single subplot are shown as blue horizontal lines. SRH, stimulated Raman histology.
Figure 5.
Matrices showing the P values of a Kruskal-Wallis test measuring the statistical significance of the performance metrics distribution differences for each combination of preprocessing method for the metrics balanced accuracy, precision, recall and F1-score across all neural networks. P values <0.05 are depicted in green (considered as statistically significant) and P values ≥0.05 are depicted in blue (considered as statistically not significant). SRH, stimulated Raman histology.
Table 2. Performance metrics balanced ACC, PRE, REC and F1-score for each run of each neural network on each dataset. For each run, the tumoral (T) row lists the balanced ACC, PRE, REC and F1 for the six datasets in the order SRH, Complex, Histogram matching, Histogram equalization, Blurring and Scaling; the non-tumoral (nT) row lists only PRE, REC and F1 per dataset, since the balanced ACC is shared between both classes of a run.
Architecture | Run | Class | SRH ACC | PRE | REC | F1 | Complex ACC | PRE | REC | F1 | Hist. matching ACC | PRE | REC | F1 | Hist. equalization ACC | PRE | REC | F1 | Blurring ACC | PRE | REC | F1 | Scaling ACC | PRE | REC | F1
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
VGG19 | 1 | T | 0.89 | 0.93 | 0.81 | 0.87 | 0.81 | 0.86 | 0.68 | 0.76 | 0.82 | 0.70 | 0.80 | 0.74 | 0.88 | 0.77 | 0.87 | 0.82 | 0.85 | 0.74 | 0.82 | 0.78 | 0.86 | 0.74 | 0.85 | 0.79 | |||||
nT | 0.92 | 0.97 | 0.94 | 0.87 | 0.95 | 0.91 | 0.90 | 0.84 | 0.87 | 0.94 | 0.88 | 0.91 | 0.91 | 0.87 | 0.89 | 0.93 | 0.86 | 0.89 | |||||||||||||
2 | T | 0.83 | 0.96 | 0.67 | 0.79 | 0.78 | 0.60 | 0.82 | 0.69 | 0.67 | 0.92 | 0.35 | 0.51 | 0.84 | 0.66 | 0.88 | 0.75 | 0.80 | 0.74 | 0.72 | 0.73 | 0.87 | 0.75 | 0.88 | 0.81 | ||||||
nT | 0.87 | 0.99 | 0.92 | 0.90 | 0.75 | 0.82 | 0.77 | 0.99 | 0.86 | 0.94 | 0.79 | 0.86 | 0.88 | 0.89 | 0.88 | 0.94 | 0.87 | 0.90 | |||||||||||||
3 | T | 0.91 | 0.79 | 0.94 | 0.86 | 0.79 | 0.54 | 0.95 | 0.69 | 0.79 | 0.77 | 0.68 | 0.72 | 0.61 | 0.85 | 0.24 | 0.37 | 0.83 | 0.73 | 0.81 | 0.77 | 0.86 | 0.76 | 0.83 | 0.79 | ||||||
nT | 0.97 | 0.89 | 0.93 | 0.96 | 0.64 | 0.77 | 0.86 | 0.91 | 0.88 | 0.74 | 0.98 | 0.84 | 0.91 | 0.86 | 0.88 | 0.92 | 0.88 | 0.90 | |||||||||||||
4 | T | 0.87 | 0.75 | 0.87 | 0.81 | 0.57 | 0.42 | 0.39 | 0.41 | 0.81 | 0.62 | 0.85 | 0.72 | 0.72 | 0.88 | 0.48 | 0.62 | 0.80 | 0.78 | 0.70 | 0.73 | 0.89 | 0.78 | 0.89 | 0.83 | ||||||
nT | 0.94 | 0.87 | 0.90 | 0.73 | 0.75 | 0.74 | 0.92 | 0.76 | 0.83 | 0.80 | 0.97 | 0.88 | 0.87 | 0.91 | 0.89 | 0.95 | 0.89 | 0.92 | |||||||||||||
5 | T | 0.88 | 0.86 | 0.83 | 0.84 | 0.61 | 0.62 | 0.30 | 0.40 | 0.75 | 0.52 | 0.85 | 0.65 | 0.80 | 0.64 | 0.81 | 0.71 | 0.78 | 0.71 | 0.68 | 0.70 | 0.85 | 0.76 | 0.83 | 0.79 | ||||||
nT | 0.92 | 0.94 | 0.93 | 0.74 | 0.92 | 0.82 | 0.91 | 0.64 | 0.75 | 0.90 | 0.79 | 0.84 | 0.86 | 0.88 | 0.87 | 0.92 | 0.88 | 0.90 | |||||||||||||
ResNet50 | 1 | T | 0.84 | 0.67 | 0.86 | 0.76 | 0.64 | 0.96 | 0.28 | 0.43 | 0.54 | 0.65 | 0.10 | 0.18 | 0.51 | 0.71 | 0.04 | 0.07 | 0.84 | 0.82 | 0.77 | 0.79 | 0.85 | 0.83 | 0.78 | 0.80 | |||||
nT | 0.93 | 0.81 | 0.86 | 0.75 | 1.00 | 0.86 | 0.70 | 0.97 | 0.82 | 0.69 | 0.99 | 0.82 | 0.90 | 0.92 | 0.91 | 0.90 | 0.92 | 0.91 | |||||||||||||
2 | T | 0.74 | 0.92 | 0.50 | 0.65 | 0.78 | 0.72 | 0.67 | 0.70 | 0.51 | 0.55 | 0.03 | 0.07 | 0.87 | 0.71 | 0.90 | 0.80 | 0.86 | 0.74 | 0.86 | 0.80 | 0.88 | 0.76 | 0.89 | 0.82 | ||||||
nT | 0.81 | 0.98 | 0.89 | 0.86 | 0.88 | 0.87 | 0.69 | 0.99 | 0.81 | 0.95 | 0.83 | 0.89 | 0.93 | 0.87 | 0.90 | 0.95 | 0.87 | 0.91 | |||||||||||||
3 | T | 0.80 | 0.83 | 0.67 | 0.74 | 0.76 | 0.88 | 0.55 | 0.68 | 0.62 | 0.88 | 0.25 | 0.39 | 0.86 | 0.66 | 0.93 | 0.77 | 0.78 | 0.75 | 0.67 | 0.70 | 0.85 | 0.82 | 0.77 | 0.80 | ||||||
nT | 0.86 | 0.94 | 0.90 | 0.83 | 0.96 | 0.89 | 0.74 | 0.98 | 0.85 | 0.96 | 0.78 | 0.86 | 0.85 | 0.90 | 0.88 | 0.90 | 0.93 | 0.91 | |||||||||||||
4 | T | 0.62 | 0.99 | 0.24 | 0.38 | 0.74 | 0.64 | 0.66 | 0.65 | 0.55 | 0.84 | 0.12 | 0.21 | 0.82 | 0.73 | 0.77 | 0.75 | 0.83 | 0.73 | 0.79 | 0.76 | 0.83 | 0.71 | 0.80 | 0.76 | ||||||
nT | 0.74 | 1.00 | 0.85 | 0.84 | 0.83 | 0.84 | 0.71 | 0.99 | 0.83 | 0.89 | 0.87 | 0.88 | 0.90 | 0.87 | 0.88 | 0.91 | 0.85 | 0.88 | |||||||||||||
5 | T | 0.88 | 0.73 | 0.90 | 0.81 | 0.83 | 0.76 | 0.78 | 0.77 | 0.61 | 0.75 | 0.27 | 0.40 | 0.71 | 0.83 | 0.47 | 0.60 | 0.81 | 0.77 | 0.72 | 0.74 | 0.80 | 0.81 | 0.68 | 0.74 | ||||||
nT | 0.95 | 0.85 | 0.90 | 0.90 | 0.88 | 0.89 | 0.74 | 0.96 | 0.84 | 0.80 | 0.95 | 0.87 | 0.88 | 0.90 | 0.89 | 0.86 | 0.93 | 0.89 | |||||||||||||
Xception | 1 | T | 0.89 | 0.85 | 0.85 | 0.85 | 0.88 | 0.75 | 0.89 | 0.82 | 0.85 | 0.73 | 0.84 | 0.78 | 0.87 | 0.78 | 0.85 | 0.81 | 0.87 | 0.80 | 0.84 | 0.82 | 0.86 | 0.84 | 0.79 | 0.82 | |||||
nT | 0.93 | 0.93 | 0.93 | 0.95 | 0.87 | 0.90 | 0.92 | 0.86 | 0.89 | 0.93 | 0.89 | 0.91 | 0.92 | 0.91 | 0.91 | 0.91 | 0.93 | 0.92 | |||||||||||||
2 | T | 0.90 | 0.77 | 0.93 | 0.84 | 0.87 | 0.80 | 0.83 | 0.82 | 0.84 | 0.75 | 0.79 | 0.77 | 0.88 | 0.81 | 0.85 | 0.83 | 0.87 | 0.83 | 0.81 | 0.82 | 0.80 | 0.83 | 0.66 | 0.73 | ||||||
nT | 0.97 | 0.87 | 0.92 | 0.92 | 0.90 | 0.91 | 0.90 | 0.88 | 0.89 | 0.93 | 0.91 | 0.92 | 0.91 | 0.93 | 0.92 | 0.86 | 0.94 | 0.90 | |||||||||||||
3 | T | 0.91 | 0.87 | 0.88 | 0.88 | 0.87 | 0.70 | 0.93 | 0.80 | 0.86 | 0.80 | 0.81 | 0.80 | 0.85 | 0.73 | 0.84 | 0.78 | 0.87 | 0.81 | 0.82 | 0.82 | 0.87 | 0.85 | 0.80 | 0.82 | ||||||
nT | 0.95 | 0.94 | 0.94 | 0.96 | 0.82 | 0.89 | 0.91 | 0.91 | 0.91 | 0.92 | 0.86 | 0.89 | 0.92 | 0.91 | 0.92 | 0.91 | 0.93 | 0.92 | |||||||||||||
4 | T | 0.91 | 0.79 | 0.93 | 0.86 | 0.87 | 0.82 | 0.82 | 0.82 | 0.82 | 0.63 | 0.88 | 0.73 | 0.85 | 0.88 | 0.74 | 0.80 | 0.88 | 0.72 | 0.93 | 0.81 | 0.88 | 0.74 | 0.90 | 0.81 | ||||||
nT | 0.97 | 0.89 | 0.93 | 0.92 | 0.92 | 0.92 | 0.93 | 0.76 | 0.84 | 0.89 | 0.95 | 0.92 | 0.96 | 0.83 | 0.89 | 0.95 | 0.86 | 0.90 | |||||||||||||
5 | T | 0.90 | 0.80 | 0.90 | 0.85 | 0.86 | 0.77 | 0.82 | 0.80 | 0.82 | 0.75 | 0.74 | 0.75 | 0.87 | 0.84 | 0.82 | 0.83 | 0.81 | 0.85 | 0.68 | 0.75 | 0.89 | 0.77 | 0.89 | 0.83 | ||||||
nT | 0.95 | 0.90 | 0.92 | 0.92 | 0.89 | 0.90 | 0.88 | 0.89 | 0.88 | 0.92 | 0.93 | 0.92 | 0.86 | 0.95 | 0.90 | 0.95 | 0.88 | 0.91 | |||||||||||||
InceptionResNetV2 | 1 | T | 0.91 | 0.83 | 0.92 | 0.87 | 0.83 | 0.67 | 0.85 | 0.75 | 0.80 | 0.75 | 0.70 | 0.72 | 0.88 | 0.72 | 0.92 | 0.81 | 0.82 | 0.77 | 0.73 | 0.75 | 0.85 | 0.72 | 0.84 | 0.77 | |||||
nT | 0.96 | 0.91 | 0.94 | 0.92 | 0.81 | 0.86 | 0.87 | 0.89 | 0.88 | 0.96 | 0.83 | 0.89 | 0.88 | 0.90 | 0.89 | 0.92 | 0.85 | 0.88 | |||||||||||||
2 | T | 0.89 | 0.92 | 0.81 | 0.86 | 0.84 | 0.72 | 0.83 | 0.77 | 0.78 | 0.70 | 0.69 | 0.70 | 0.89 | 0.77 | 0.90 | 0.83 | 0.85 | 0.77 | 0.81 | 0.79 | 0.83 | 0.70 | 0.82 | 0.75 | ||||||
nT | 0.92 | 0.97 | 0.94 | 0.92 | 0.85 | 0.88 | 0.86 | 0.87 | 0.86 | 0.95 | 0.88 | 0.91 | 0.91 | 0.89 | 0.90 | 0.91 | 0.84 | 0.87 | |||||||||||||
3 | T | 0.92 | 0.86 | 0.91 | 0.88 | 0.87 | 0.72 | 0.90 | 0.80 | 0.77 | 0.77 | 0.63 | 0.70 | 0.87 | 0.73 | 0.89 | 0.81 | 0.79 | 0.77 | 0.68 | 0.72 | 0.84 | 0.77 | 0.79 | 0.78 | ||||||
nT | 0.96 | 0.93 | 0.94 | 0.95 | 0.84 | 0.89 | 0.85 | 0.91 | 0.88 | 0.95 | 0.85 | 0.90 | 0.86 | 0.91 | 0.88 | 0.90 | 0.89 | 0.90 | |||||||||||||
4 | T | 0.91 | 0.76 | 0.96 | 0.85 | 0.86 | 0.69 | 0.90 | 0.78 | 0.75 | 0.69 | 0.62 | 0.66 | 0.87 | 0.85 | 0.80 | 0.82 | 0.84 | 0.76 | 0.79 | 0.78 | 0.84 | 0.69 | 0.85 | 0.76 | ||||||
nT | 0.98 | 0.86 | 0.92 | 0.95 | 0.81 | 0.87 | 0.84 | 0.87 | 0.85 | 0.91 | 0.93 | 0.92 | 0.90 | 0.89 | 0.90 | 0.92 | 0.83 | 0.87 | |||||||||||||
5 | T | 0.92 | 0.88 | 0.90 | 0.89 | 0.87 | 0.80 | 0.84 | 0.82 | 0.75 | 0.70 | 0.61 | 0.65 | 0.80 | 0.63 | 0.82 | 0.71 | 0.82 | 0.76 | 0.75 | 0.75 | 0.86 | 0.71 | 0.87 | 0.78 | ||||||
nT | 0.95 | 0.94 | 0.95 | 0.93 | 0.91 | 0.92 | 0.83 | 0.88 | 0.85 | 0.90 | 0.78 | 0.84 | 0.89 | 0.89 | 0.89 | 0.94 | 0.84 | 0.88 | |||||||||||||
ConvNeXt | 1 | T | 0.90 | 0.80 | 0.91 | 0.85 | 0.78 | 0.72 | 0.68 | 0.70 | 0.71 | 0.66 | 0.55 | 0.60 | 0.87 | 0.77 | 0.86 | 0.81 | 0.78 | 0.71 | 0.67 | 0.69 | 0.80 | 0.68 | 0.75 | 0.71 | |||||
nT | 0.96 | 0.90 | 0.93 | 0.86 | 0.88 | 0.87 | 0.81 | 0.87 | 0.84 | 0.93 | 0.88 | 0.91 | 0.85 | 0.88 | 0.87 | 0.88 | 0.84 | 0.86 | |||||||||||||
2 | T | 0.90 | 0.74 | 0.95 | 0.83 | 0.78 | 0.69 | 0.70 | 0.69 | 0.71 | 0.70 | 0.53 | 0.60 | 0.86 | 0.71 | 0.89 | 0.79 | 0.78 | 0.63 | 0.77 | 0.69 | 0.83 | 0.70 | 0.83 | 0.76 | ||||||
nT | 0.97 | 0.85 | 0.91 | 0.86 | 0.85 | 0.86 | 0.81 | 0.89 | 0.85 | 0.94 | 0.84 | 0.89 | 0.88 | 0.79 | 0.84 | 0.92 | 0.84 | 0.87 | |||||||||||||
3 | T | 0.89 | 0.73 | 0.95 | 0.82 | 0.83 | 0.68 | 0.85 | 0.76 | 0.76 | 0.70 | 0.64 | 0.67 | 0.86 | 0.79 | 0.81 | 0.80 | 0.82 | 0.71 | 0.78 | 0.74 | 0.83 | 0.68 | 0.83 | 0.75 | ||||||
nT | 0.97 | 0.84 | 0.90 | 0.92 | 0.82 | 0.87 | 0.84 | 0.88 | 0.86 | 0.91 | 0.90 | 0.91 | 0.89 | 0.85 | 0.87 | 0.91 | 0.82 | 0.87 | |||||||||||||
4 | T | 0.90 | 0.84 | 0.87 | 0.85 | 0.82 | 0.71 | 0.79 | 0.74 | 0.79 | 0.67 | 0.74 | 0.70 | 0.84 | 0.86 | 0.73 | 0.79 | 0.80 | 0.74 | 0.71 | 0.72 | 0.84 | 0.71 | 0.83 | 0.76 | ||||||
nT | 0.94 | 0.92 | 0.93 | 0.90 | 0.85 | 0.87 | 0.87 | 0.83 | 0.85 | 0.89 | 0.94 | 0.91 | 0.87 | 0.89 | 0.88 | 0.91 | 0.85 | 0.88 | |||||||||||||
5 | T | 0.90 | 0.78 | 0.91 | 0.84 | 0.80 | 0.66 | 0.77 | 0.71 | 0.76 | 0.66 | 0.67 | 0.66 | 0.85 | 0.76 | 0.82 | 0.79 | 0.81 | 0.70 | 0.76 | 0.73 | 0.86 | 0.71 | 0.89 | 0.79 | ||||||
nT | 0.96 | 0.88 | 0.92 | 0.89 | 0.82 | 0.85 | 0.85 | 0.84 | 0.84 | 0.91 | 0.88 | 0.90 | 0.89 | 0.85 | 0.87 | 0.94 | 0.83 | 0.88 | |||||||||||||
VisionTransformer | 1 | T | 0.91 | 0.82 | 0.90 | 0.86 | 0.51 | 0.32 | 1.00 | 0.48 | 0.67 | 0.59 | 0.49 | 0.53 | 0.83 | 0.63 | 0.91 | 0.75 | 0.74 | 0.66 | 0.62 | 0.64 | 0.82 | 0.68 | 0.82 | 0.75 | |||||
nT | 0.95 | 0.91 | 0.93 | 1.00 | 0.01 | 0.02 | 0.78 | 0.85 | 0.81 | 0.95 | 0.76 | 0.84 | 0.83 | 0.86 | 0.84 | 0.91 | 0.82 | 0.87 | |||||||||||||
2 | T | 0.90 | 0.76 | 0.92 | 0.84 | 0.51 | 0.32 | 1.00 | 0.48 | 0.66 | 0.60 | 0.47 | 0.53 | 0.81 | 0.58 | 0.93 | 0.72 | 0.71 | 0.63 | 0.57 | 0.60 | 0.83 | 0.69 | 0.82 | 0.75 | ||||||
nT | 0.96 | 0.87 | 0.91 | 1.00 | 0.01 | 0.02 | 0.78 | 0.85 | 0.82 | 0.96 | 0.70 | 0.81 | 0.81 | 0.85 | 0.83 | 0.91 | 0.84 | 0.87 | |||||||||||||
3 | T | 0.50 | 0.31 | 1.00 | 0.48 | 0.80 | 0.66 | 0.78 | 0.71 | 0.66 | 0.57 | 0.48 | 0.52 | 0.50 | 0.00 | 0.00 | 0.00 | 0.70 | 0.58 | 0.61 | 0.59 | 0.85 | 0.68 | 0.89 | 0.77 | ||||||
nT | 0.00 | 0.00 | 0.00 | 0.89 | 0.82 | 0.85 | 0.78 | 0.83 | 0.81 | 0.69 | 1.00 | 0.81 | 0.82 | 0.80 | 0.81 | 0.94 | 0.81 | 0.87 | |||||||||||||
4 | T | 0.91 | 0.79 | 0.93 | 0.86 | 0.50 | 0.00 | 0.00 | 0.00 | 0.68 | 0.62 | 0.49 | 0.55 | 0.81 | 0.57 | 0.93 | 0.71 | 0.73 | 0.62 | 0.65 | 0.63 | 0.82 | 0.68 | 0.82 | 0.74 | ||||||
nT | 0.97 | 0.89 | 0.93 | 0.69 | 1.00 | 0.81 | 0.79 | 0.87 | 0.83 | 0.96 | 0.68 | 0.80 | 0.84 | 0.82 | 0.83 | 0.91 | 0.82 | 0.86 | |||||||||||||
5 | T | 0.90 | 0.84 | 0.88 | 0.86 | 0.50 | 0.31 | 1.00 | 0.48 | 0.71 | 0.67 | 0.53 | 0.59 | 0.78 | 0.54 | 0.92 | 0.68 | 0.76 | 0.66 | 0.68 | 0.67 | 0.82 | 0.69 | 0.80 | 0.74 | ||||||
nT | 0.95 | 0.92 | 0.93 | 0.00 | 0.00 | 0.00 | 0.80 | 0.88 | 0.84 | 0.95 | 0.65 | 0.77 | 0.85 | 0.84 | 0.85 | 0.90 | 0.83 | 0.87 |
ACC, accuracy; F1, F1-score; nT, non-tumoral; PRE, precision; REC, recall; SRH, stimulated Raman histology; T, tumoral.
Table 3. Summarizing statistics.
Dataset | Mean ACC | Mean PRE | Mean REC | Mean F1 | Var ACC | Var PRE | Var REC | Var F1 | SD ACC | SD PRE | SD REC | SD F1
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
SRH | 0.8643 | 0.8542 | 0.8643 | 0.8478 | 0.0086 | 0.0250 | 0.0263 | 0.0221 | 0.0925 | 0.1580 | 0.1622 | 0.1487
Complex | 0.7587 | 0.7540 | 0.7583 | 0.7213 | 0.0159 | 0.0463 | 0.0628 | 0.0536 | 0.1261 | 0.2152 | 0.2505 | 0.2315
Histogram matching | 0.7273 | 0.7625 | 0.7257 | 0.7203 | 0.0089 | 0.0110 | 0.0534 | 0.0333 | 0.0942 | 0.1046 | 0.2311 | 0.1826
Histogram equalization | 0.8087 | 0.8058 | 0.8082 | 0.7892 | 0.0104 | 0.0248 | 0.0382 | 0.0288 | 0.1021 | 0.1576 | 0.1955 | 0.1696
Blurring | 0.8077 | 0.8070 | 0.8085 | 0.8062 | 0.0022 | 0.0082 | 0.0087 | 0.0077 | 0.0471 | 0.0906 | 0.0934 | 0.0875
Scaling | 0.8453 | 0.8287 | 0.8443 | 0.8327 | 0.0006 | 0.0095 | 0.0028 | 0.0039 | 0.0250 | 0.0975 | 0.0534 | 0.0622
For each dataset and performance metric (balanced ACC, PRE, REC and F1), the mean, variance and standard deviation are computed across all networks, runs and classes. ACC, accuracy; F1, F1-score; PRE, precision; REC, recall; SD, standard deviation; SRH, stimulated Raman histology; Var, variance.
CAMs and attention maps
For plotting the class activation and attention maps, the run with the best balanced accuracy among the five runs was selected for each neural network. Figure 6 shows the class activation and attention maps as well as overlays with the original tile for the correct classification of a tumoral tile. For the correct classification of a non-tumoral tile, the respective CAMs and overlays are shown in Figure 7. For the correct classification of a tumoral tile, CAMs differ greatly depending on the image modality (SRH or SRS) as well as between differently preprocessed SRS images. Xception and ConvNeXt show similar CAMs for the SRH tile. The ViT architecture based its prediction of the SRH tile mainly on cytoplasm, and the darker cell nuclei appeared not to be important for the classification. For the correct classification of a non-tumoral tile, the CAMs of the CNNs show great variation, making it challenging to identify important structures in the input image. The ViT architecture showed a tendency to use cell membranes as an important input structure when classifying SRS images preprocessed with histogram equalization.
Figure 6.
Class activation maps for the correct classification of a tumoral tile. For each network, the first row shows the original SRH image together with the corresponding SRS images (CH2 and CH3 channel separately, colored with viridis color map). The second row shows the class activation maps where increasing brightness of a pixel means increasing importance for the respective classification. The third row shows the overlays of the tiles and the activation maps. SRH, stimulated Raman histology; SRS, stimulated Raman scattering.
Figure 7.
Class activation maps for the correct classification of a non-tumoral tile. For each network, the first row shows the original SRH image together with the corresponding SRS images (CH2 and CH3 channel separately, colored with viridis color map). The second row shows the class activation maps where increasing brightness of a pixel means increasing importance for the respective classification. The third row shows the overlays of the tiles and the activation maps. SRH, stimulated Raman histology; SRS, stimulated Raman scattering.
The ViT’s focus on cytoplasmic regions, cell membranes and related structures mirrors histopathological practice. In histopathological diagnostics, nuclear morphology is evaluated for tumor diagnosis and grading. Additionally, the cytoplasmic content and intercellular architecture help pathologists diagnose both OSCC and NSCLC. Specifically, the nuclear-to-cytoplasmic (NC) ratio is a key histopathological criterion for malignancy. An increased NC ratio, caused by nuclear enlargement resulting from increased transcriptional and replicative activity and atypical mitoses together with a reduction in cytoplasm, is a hallmark of cellular atypia and malignancy. Therefore, changes in both nuclear and cytoplasmic compartments are important for diagnosis, and the ViT’s focus on these structures is consistent with recognised pathological criteria.
The ViT’s focus on membrane structures could potentially be linked to features such as loss of polarity, alterations in cell cohesion or other architectural patterns that are sometimes considered in histopathological assessments. This observation suggests that DL models can capture spatially distributed tissue characteristics, although the exact nature and diagnostic relevance of these patterns need further investigation.
Discussion
Image preprocessing remains a crucial aspect for the successful application of DL algorithms to image classification tasks. The preprocessing techniques for SRS images applied in this study comprise complex preprocessing based on (23,24), histogram matching of the individual line scans to the histogram of the first line scan, histogram equalization of all individual line scans, Gaussian blurring of the whole SRS image, and scaling of the original SRS pixel values to the interval [0, 1] without any preceding manipulation.
CAMs highlight the specific regions within the tiles that contribute most significantly to the output of a specific neuron of the network. This localization is crucial in histopathology, as it helps pathologists identify and focus on areas to which the network assigns high importance when predicting a certain class. Furthermore, CAMs enhance the interpretability and explainability of CNN predictions. By visualizing which parts of the image contribute to a certain prediction, CAMs make the decision process more transparent. This is particularly important in medical contexts, where understanding the rationale behind a prediction is crucial for gaining trust in the model. To unmask clever Hans predictors (39), CAMs can be used to examine the activated regions and to ensure that the network focuses on histologically relevant features rather than making predictions based on artifacts or irrelevant features.
This study has shown that a scaling of raw pixel values of SRS images to the range [0, 1] yielded better results across all neural networks than more sophisticated preprocessing methods aiming at reducing artefacts. Within the SRS datasets, the scaled dataset showed higher mean performance metrics and a significantly lower variance of the performance. Absolute performance was highest on the SRH dataset. Class activation and attention maps gave limited insights into the identification of important input structures for a given prediction. In most cases, the activation patterns could not be matched to biologically meaningful structures in the input image. For correct classifications of the ViT on tumoral SRH tiles, attention maps showed that mainly cytoplasm was important for the classification. For correct classifications of the ViT on non-tumoral SRS tiles preprocessed with histogram equalization, attention maps suggested that, among other structures, cell membranes were important. The focus on cytoplasmic and membranous regions complements the diagnostic relevance of the NC ratio, and the observed attention patterns may reflect recognised histopathological criteria. Although the precise significance of membrane-related features is yet to be determined, these findings suggest that DL models can identify diagnostically relevant, spatially distributed tissue characteristics. CAMs for CNNs remained obscure, suggesting that even correct predictions are not necessarily based on biologically meaningful structures.
The findings of this study are restricted to the experimental setup and the data produced by the device of Invenio Imaging Inc., and no direct statement about data collected by other means and devices can be made. Additionally, the results are based on, but may not be restricted to, the strategy of taking the spectral difference CH3-CH2 to populate the third color channel of SRS images. Factors that further limit the applicability of the presented results are the relatively small size of the dataset and the imbalance between the number of patients and images for OSCC and NSCLC, which was not addressed with sampling or weighting strategies. Therefore, this study does not allow any direct conclusions to be drawn about the performance of the models on the individual pathologies, but only on the combined, imbalanced dataset. The significance of the presented results might be shifted towards the overrepresented class NSCLC, and especially for the underrepresented entity OSCC the performance and stability might differ. The consideration of only two particular tumor entities, and therefore specific SRS patterns, further reduces the generalizability. However, since Invenio Imaging Inc. received Food and Drug Administration (FDA) breakthrough device designation for their AI-based image analysis module for evaluating lung biopsies in October 2024, the application of SRS to NSCLC and the optimization of deep learning-based outcomes are highly relevant.
The results of our study suggest that the preprocessing of pixel values of SRS images has a great impact on the overall performance and statistical variance of DL algorithms and that a simple preprocessing may yield better performance than more sophisticated techniques.
Conclusions
This study investigated the performances of various neural networks for distinguishing tumoral from non-tumoral tissue tiles for OSCC and NSCLC on SRH and differently preprocessed SRS images. It has been shown that preprocessing of pixel values of SRS images can have a great impact on the performance and the stability of DL algorithms when applied for classification of cancer tissue. The identification of unique biological structures in the input images which are relevant for a prediction remains challenging across all neural network architectures.
Acknowledgments
The authors would like to acknowledge Florian Khalid for his valuable technical assistance.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the local ethics committee of Ethik-Kommission der Albert-Ludwigs-Universität Freiburg (No. 22-1037 and No. 22-1322_2-S1) and written informed consent was taken from all individual participants.
Footnotes
Reporting Checklist: The authors have completed the CLEAR reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-2024-2608/rc
Funding: This research was funded by the German Federal Ministry of Education and Research (Bundesministerium fur Bildung und Forschung, BMBF) (grant No. 13GW0571D).
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-2024-2608/coif). A.W. and P.B. received funding by the German Federal Ministry of Education and Research Grant (ID: 13GW0571D). L.S.B. received the following payment or honoraria for lectures, presentations, speakers bureaus, manuscript writing or educational events: KLS Martin 04/2023; Bundeswehr Zentralkrankenhaus 03/2024. L.S.B. has the following patents planned, issued or pending: PCT patent EP2022/088045 pending; DPMA patent DE 10 2022 100 438 B3 issued. J.S. received financial support from Invenio Imaging Inc. for travel to an on-site visit of the company headquarters in May 2022. The other authors have no conflicts of interest to declare.
Data Sharing Statement
Available at https://qims.amegroups.com/article/view/10.21037/qims-2024-2608/dss
References
- 1.Djuric U, Zadeh G, Aldape K, Diamandis P. Precision histology: how deep learning is poised to revitalize histomorphology for personalized cancer care. NPJ Precis Oncol 2017;1:22. 10.1038/s41698-017-0022-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Freudiger CW, Min W, Saar BG, Lu S, Holtom GR, He C, Tsai JC, Kang JX, Xie XS. Label-free biomedical imaging with high sensitivity by stimulated Raman scattering microscopy. Science 2008;322:1857-61. 10.1126/science.1165758 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Orringer DA, Pandian B, Niknafs YS, Hollon TC, Boyle J, Lewis S, et al. Rapid intraoperative histology of unprocessed surgical specimens via fibre-laser-based stimulated Raman scattering microscopy. Nat Biomed Eng 2017;1:0027. [DOI] [PMC free article] [PubMed]
- 4.Di L, Eichberg DG, Huang K, Shah AH, Jamshidi AM, Luther EM, Lu VM, Komotar RJ, Ivan ME, Gultekin SH. Stimulated Raman Histology for Rapid Intraoperative Diagnosis of Gliomas. World Neurosurg 2021;150:e135-43. 10.1016/j.wneu.2021.02.122 [DOI] [PubMed] [Google Scholar]
- 5.Di L, Eichberg DG, Park YJ, Shah AH, Jamshidi AM, Luther EM, Lu VM, Komotar RJ, Ivan ME, Gultekin SH. Rapid Intraoperative Diagnosis of Meningiomas using Stimulated Raman Histology. World Neurosurg 2021;150:e108-16. 10.1016/j.wneu.2021.02.097 [DOI] [PubMed] [Google Scholar]
- 6.Neidert N, Straehle J, Erny D, Sacalean V, El Rahal A, Steybe D, Schmelzeisen R, Vlachos A, Reinacher PC, Coenen VA, Mizaikoff B, Heiland DH, Prinz M, Beck J, Schnell O. Stimulated Raman histology in the neurosurgical workflow of a major European neurosurgical center - part A. Neurosurg Rev 2022;45:1731-9. 10.1007/s10143-021-01712-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Mannas MP, Jones D, Deng FM, Hoskoppal D, Melamed J, Orringer DA, Taneja SS. Stimulated Raman histology, a novel method to allow for rapid pathologic examination of unprocessed, fresh prostate biopsies. Prostate 2023;83:1060-7. 10.1002/pros.24547 [DOI] [PubMed] [Google Scholar]
- 8.van der Laak J, Litjens G, Ciompi F. Deep learning in histopathology: the path to the clinic. Nat Med 2021;27:775-84. 10.1038/s41591-021-01343-4 [DOI] [PubMed] [Google Scholar]
- 9.Kather JN, Krisam J, Charoentong P, Luedde T, Herpel E, Weis CA, Gaiser T, Marx A, Valous NA, Ferber D, Jansen L, Reyes-Aldasoro CC, Zörnig I, Jäger D, Brenner H, Chang-Claude J, Hoffmeister M, Halama N. Predicting survival from colorectal cancer histology slides using deep learning: A retrospective multicenter study. PLoS Med 2019;16:e1002730. 10.1371/journal.pmed.1002730 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Chlorogiannis DD, Verras GI, Tzelepi V, Chlorogiannis A, Apostolos A, Kotis K, Anagnostopoulos CN, Antzoulas A, Davakis S, Vailas M, Schizas D, Mulita F. Tissue classification and diagnosis of colorectal cancer histopathology images using deep learning algorithms. Is the time ripe for clinical practice implementation? Prz Gastroenterol 2023;18:353-67. 10.5114/pg.2023.130337 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Rawat RR, Ortega I, Roy P, Sha F, Shibata D, Ruderman D, Agus DB. Deep learned tissue "fingerprints" classify breast cancers by ER/PR/Her2 status from H&E images. Sci Rep 2020;10:7275. 10.1038/s41598-020-64156-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Mondol RK, Millar EKA, Graham PH, Browne L, Sowmya A, Meijering E. hist2RNA: An Efficient Deep Learning Architecture to Predict Gene Expression from Breast Cancer Histopathology Images. Cancers (Basel) 2023;15:2569. 10.3390/cancers15092569 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Unger M, Kather JN. Deep learning in cancer genomics and histopathology. Genome Med 2024;16:44. 10.1186/s13073-024-01315-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Liu Z, Su W, Ao J, Wang M, Jiang Q, He J, Gao H, Lei S, Nie J, Yan X, Guo X, Zhou P, Hu H, Ji M. Instant diagnosis of gastroscopic biopsy via deep-learned single-shot femtosecond stimulated Raman histology. Nat Commun 2022;13:4050. 10.1038/s41467-022-31339-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Hollon TC, Pandian B, Urias E, Save AV, Adapa AR, Srinivasan S, et al. Rapid, label-free detection of diffuse glioma recurrence using intraoperative stimulated Raman histology and deep neural networks. Neuro Oncol 2021;23:144-55. 10.1093/neuonc/noaa162 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Ao J, Shao X, Liu Z, Liu Q, Xia J, Shi Y, Qi L, Pan J, Ji M. Stimulated Raman Scattering Microscopy Enables Gleason Scoring of Prostate Core Needle Biopsy by a Convolutional Neural Network. Cancer Res 2023;83:641-51. 10.1158/0008-5472.CAN-22-2146 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Zhang Q, Yun KK, Wang H, Yoon SW, Lu F, Won D. Automatic cell counting from stimulated Raman imaging using deep learning. PLoS One 2021;16:e0254586. 10.1371/journal.pone.0254586 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Feizpour A, Marstrand T, Bastholm L, Eirefelt S, Evans CL. Label-Free Quantification of Pharmacokinetics in Skin with Stimulated Raman Scattering Microscopy and Deep Learning. J Invest Dermatol 2021;141:395-403. 10.1016/j.jid.2020.06.027 [DOI] [PubMed] [Google Scholar]
- 19.Yang Y, Liu Z, Huang J, Sun X, Ao J, Zheng B, Chen W, Shao Z, Hu H, Yang Y, Ji M. Histological diagnosis of unprocessed breast core-needle biopsy via stimulated Raman scattering microscopy and multi-instance learning. Theranostics 2023;13:1342-54. 10.7150/thno.81784 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Armani A, Chalyan T, Sampson DD, editors. Biophotonics and Biosensing: From Fundamental Research to Clinical Trials Through Advances of Signal and Image Processing. SPIE; 2024. [Google Scholar]
- 21.Wahl J, Klint E, Hallbeck M, Hillman J, Wårdell K, Ramser K. Impact of preprocessing methods on the Raman spectra of brain tissue. Biomed Opt Express 2022;13:6763-77. 10.1364/BOE.476507 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Bankhead P, Loughrey MB, Fernández JA, Dombrowski Y, McArt DG, Dunne PD, McQuaid S, Gray RT, Murray LJ, Coleman HG, James JA, Salto-Tellez M, Hamilton PW. QuPath: Open source software for digital pathology image analysis. Sci Rep 2017;7:16878. 10.1038/s41598-017-17204-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Hollon T, Jiang C, Chowdury A, Nasir-Moin M, Kondepudi A, Aabedi A, et al. Artificial-intelligence-based molecular classification of diffuse gliomas using rapid, label-free optical imaging. Nat Med 2023;29:828-32. 10.1038/s41591-023-02252-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Hollon TC, Pandian B, Adapa AR, Urias E, Save AV, Khalsa SSS, et al. Near real-time intraoperative brain tumor diagnosis using stimulated Raman histology and deep neural networks. Nat Med 2020;26:52-8. 10.1038/s41591-019-0715-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Bradski G. The OpenCV Library. Dr. Dobb’s Journal of Software Tools 2000;120;122-5. [Google Scholar]
- 26.van der Walt S, Schönberger JL, Nunez-Iglesias J, Boulogne F, Warner JD, Yager N, Gouillart E, Yu T; scikit-image contributors. scikit-image: image processing in Python. PeerJ 2014;2:e453. 10.7717/peerj.453 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods 2020;17:261-72. 10.1038/s41592-019-0686-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015. arXiv: 1409.1556.
- 29.He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA; 2016:770-8. [Google Scholar]
- 30.Szegedy C, Ioffe S, Vanhoucke V, Alemi A. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. Proceedings of the AAAI Conference on Artificial Intelligence; 2017. doi: 10.1609/aaai.v31i1.11231. [DOI] [Google Scholar]
- 31.Chollet F. Xception: Deep Learning with Depthwise Separable Convolutions. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI; 2017:1800-7. [Google Scholar]
- 32.Liu Z, Mao H, Wu CY, Feichtenhofer C, Darrell T, Xie S. A ConvNet for the 2020s. arXiv 2022. arXiv: 2201.03545.
- 33.Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2021. arXiv: 2010.11929.
- 34.Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. arXiv 2016. arXiv: 1603.04467.
- 35.Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Inf Process Manag 2009;45:427-37. [Google Scholar]
- 36.Fawcett T. An introduction to ROC analysis. Pattern Recognit Lett 2006;27:861-74. [Google Scholar]
- 37.Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A. Learning Deep Features for Discriminative Localization. arXiv 2015. arXiv: 1512.04150.
- 38.Huff DT, Weisman AJ, Jeraj R. Interpretation and visualization techniques for deep learning models in medical imaging. Phys Med Biol 2021;66:04TR01. 10.1088/1361-6560/abcd17 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Lapuschkin S, Wäldchen S, Binder A, Montavon G, Samek W, Müller KR. Unmasking Clever Hans predictors and assessing what machines really learn. Nat Commun 2019;10:1096. 10.1038/s41467-019-08987-4 [DOI] [PMC free article] [PubMed] [Google Scholar]