Genomics Proteomics Bioinformatics. 2022 Dec 1;20(5):850–866. doi: 10.1016/j.gpb.2022.11.003

Table 1.

Publications relevant to ML on early detection and diagnosis using imaging data

| Publication | Feature extraction | Classification model | Sample size | Imaging data type | Performance | Validation method | Feature selection/input | Highlight/advantage | Shortcoming |
|---|---|---|---|---|---|---|---|---|---|
| McWilliams et al. [31] | NA | LR | 2961 | CT images | AUC (0.907–0.960) | Hold-out | Clinical risk factors + nodule characteristics on CT images | Using the extracted features as input, the classifier achieves high AUC on small nodules (< 10 mm) | The selection of nodule characteristics affects the predictive performance of the model |
| Riel et al. [32] | NA | LR | 300 | CT images | AUC (0.706–0.932) | Hold-out | Clinical factors + nodule characteristics on CT images | The classifier performs on par with human observers in classifying nodules as malignant or benign | Performance relies heavily on nodule size as the discriminator and is not robust for small nodules |
| Kriegsmann et al. [34] | NA | LDA | 326 | MALDI | Accuracy (0.991) | Hold-out | Mass spectra from ROIs of MALDI images | The model maintains high accuracy on FFPE biopsies | Performance relies on the quality of the MALDI stratification |
| Buty et al. [37] | Spherical harmonics [44]; DCNN [41] | RF | 1018 | CT images | Accuracy (0.793–0.824) | 10-fold cross-validation | CT imaging patches + radiologists' binary nodule segmentations | The model reaches higher predictive accuracy by integrating shape and appearance features of nodules | No benchmarking comparisons were used in the study |
| Hussein et al. [38] | 3D CNN-based multi-task model | 3D CNN-based multi-task model | 1018 | CT images | Accuracy (0.9126) | 10-fold cross-validation | 3D CT volume features | The model achieves higher accuracy than other benchmarked models | The ground-truth scores defined by radiologists for the benchmark might be arbitrary |
| Khosravan et al. [39] | 3D CNN-based multi-task model | 3D CNN-based multi-task model | 6960 | CT images | Segmentation DSC (0.91); classification accuracy (0.97) | 10-fold cross-validation | 3D CT volume features | The integration of clustering and sparsification algorithms helps the model accurately extract potential attentional regions | Segmentation might fail if the ROIs are outside the lung regions |
| Ciompi et al. [40] | OverFeat [42] | SVM; RF | 1729 | CT images | AUC (0.868) | 10-fold cross-validation | 3D CT volume features, nodule position coordinates, and maximum diameter | The first study attempting to classify whether a diagnosed nodule is benign or malignant | The model requires the position and diameter of the nodule as input, but many nodules cannot be located on the CT images |
| Venkadesh et al. [44] | 2D-ResNet50-based [45]; 3D-Inception-V1 [46] | An ensemble model based on two CNN models | 16,429 | CT images | AUC (0.86–0.96) | 10-fold cross-validation | 3D CT volume features and nodule coordinates | The model achieves higher AUC than other benchmarked models | The model requires the position of the nodule as input, but many nodules cannot be located on the CT images |
| Ardila et al. [47] | Mask-RCNN [48]; RetinaNet [49]; 3D-inflated Inception-V1 [50], [51] | Mask-RCNN [48]; RetinaNet [49]; 3D-inflated Inception-V1 [50], [51] | 14,851 | CT images | AUC (0.944) | Hold-out | Patient's current and prior (if available) 3D CT volume features | The model achieves higher AUC than radiologists when samples have no prior CT images | The training cohort comes from only one dataset, although the sample size is large |
| AbdulJabbar et al. [52] | Micro-Net [53]; SC-CNN [54] | An ensemble model based on SC-CNN [54] | 100 | Histological images | Accuracy (0.913) | Hold-out | Image features of H&E-stained tumor section histological slides | The model can annotate cell types at the single-cell level using histological images only | The annotation accuracy is affected by the reference dataset used |
| Coudray et al. [55] | Multi-task CNN model based on Inception-V3 [51] | Multi-task CNN model based on Inception-V3 [51] | 1634 | Histological images | AUC (0.733–0.856) | Hold-out | Transformed 512 × 512-pixel tiles from nonoverlapping "patches" of the whole-slide images | The model can predict whether a given tissue carries somatic mutations in the genes STK11, EGFR, FAT1, SETBP1, KRAS, and TP53 | The accuracy of the gene mutation predictions is modest |
| Lin et al. [59] | DCGAN [58] + AlexNet [41] | DCGAN [58] + AlexNet [41] | 22,489 | CT images | Accuracy (0.9986) | Hold-out | Initial + synthetic CT images | The model uses a GAN to generate synthetic lung cancer images, reducing overfitting | No benchmarking comparisons were used |
| Ren et al. [60] | DCGAN [58] + VGG-DF | DCGAN [58] + VGG-DF | 15,000 | Histopathological images | Accuracy (0.9984); F1-score (0.9984) | Hold-out | Initial + synthetic histopathological images | The model uses a GAN to generate synthetic lung cancer images and a regularization-enhanced model to reduce overfitting | The image dimension produced by the generator (64 × 64) is insufficient for the biomedical domain |

Note: ML, machine learning; NA, not applicable; LR, logistic regression; AUC, area under the curve; CT, computed tomography; LDA, linear discriminant analysis; MALDI, matrix-assisted laser desorption/ionization; ROI, region of interest; FFPE, formalin-fixed paraffin-embedded; CNN, convolutional neural network; DSC, Dice similarity coefficient; SVM, support vector machine; RF, random forest; DCNN, deep convolutional neural network; SC-CNN, spatially constrained convolutional neural network; DCGAN, deep convolutional generative adversarial network; RCNN, region-based CNN; H&E, hematoxylin and eosin; 2D, two-dimensional; 3D, three-dimensional. Compared with hold-out validation, cross-validation is usually more robust because it accounts for the variance between possible splits into training, validation, and test data; however, it is correspondingly more time-consuming than a single hold-out split.
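
To make that distinction concrete, the sketch below contrasts the two validation methods that appear in the table. It is a minimal illustration assuming scikit-learn; the synthetic feature matrix and logistic-regression classifier are placeholders, not the cohorts or models of the cited studies.

```python
# Minimal sketch contrasting hold-out validation with 10-fold
# cross-validation. The synthetic features and LR classifier are
# placeholders (assumptions), not the cited studies' data or models.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# Stand-in for extracted imaging features (e.g., nodule characteristics).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
model = LogisticRegression(max_iter=1000)

# Hold-out: a single fixed split. Fast, but the score depends on which
# samples happen to fall into the test set.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)
holdout_auc = roc_auc_score(y_te, model.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])

# 10-fold cross-validation: every sample is tested exactly once, so the
# averaged AUC is less sensitive to one unlucky split, at roughly 10x
# the training cost of a single hold-out run.
cv_auc = cross_val_score(model, X, y, cv=10, scoring="roc_auc")
print(f"hold-out AUC:   {holdout_auc:.3f}")
print(f"10-fold CV AUC: {cv_auc.mean():.3f} +/- {cv_auc.std():.3f}")
```

Stratified splitting (the default for classifiers in `cross_val_score`, and requested explicitly via `stratify=y` above) keeps class balance comparable across folds, which matters for the imbalanced cohorts typical of screening data.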
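The last two rows of the table (Lin et al. [59]; Ren et al. [60]) share one idea: a DCGAN generator synthesizes extra images that are pooled with the real training set to curb overfitting. The sketch below shows that pooling step with a toy DCGAN-style generator in PyTorch; the layer sizes, the 64 × 64 output (matching the limitation noted for Ren et al.), and the labels are illustrative assumptions, not the published architectures.

```python
# Illustrative sketch of GAN-based augmentation in the spirit of [59,60]:
# samples from a (pretrained) DCGAN generator are pooled with real images
# to enlarge the training set. The generator below is a toy DCGAN-style
# network; its layer sizes are assumptions, not the published models.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a 100-dim noise vector to a 1 x 64 x 64 image (DCGAN-style)."""
    def __init__(self, z_dim: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),  # 4x4
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),    # 8x8
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),      # 16x16
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(True),       # 32x32
            nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Tanh(),                                 # 64x64
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z.view(z.size(0), -1, 1, 1))

# Assume `generator` has already been trained adversarially on real patches.
generator = Generator().eval()

real_images = torch.rand(128, 1, 64, 64)        # placeholder for real CT/histology patches
real_labels = torch.randint(0, 2, (128,))

with torch.no_grad():
    synthetic = generator(torch.randn(64, 100))  # 64 synthetic 64x64 patches
synthetic_labels = torch.ones(64, dtype=torch.long)  # e.g., augment a scarce class (assumption)

# Pool real + synthetic samples; the enlarged set then trains the
# downstream classifier (AlexNet in [59], VGG-DF in [60]).
X_aug = torch.cat([real_images, (synthetic + 1) / 2])  # map Tanh output to [0, 1]
y_aug = torch.cat([real_labels, synthetic_labels])
```

In the published pipelines the generator is first trained adversarially on the real images; only then do its samples carry enough signal to benefit the downstream classifier rather than merely inflating the sample count.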