Cancers. 2022 Jul 10;14(14):3357. doi: 10.3390/cancers14143357

Table 1. Machine learning approaches for the evaluation of thyroid nodule sonographic images.

| Reference | Approach | Source Data | Method Details | Performance |
|---|---|---|---|---|
| Zhu et al., 2021 [14] | Brief Efficient Thyroid Network (BETNET; a CSS model) | Gray-scale US images of 592 patients with 600 TNs (internal dataset); 187 patients with 200 TNs (external validation dataset) | CNN approach with 24 layers: 13 convolutional layers, 5 pooling layers, and 3 fully connected layers with dropouts in between | AUC 0.970 (95% CI 0.958–0.980) in the independent validation cohort; similar to two highly skilled radiologists (0.940 and 0.953) |
| Peng et al., 2021 [15] | Deep-learning AI model (ThyNet) | 18,049 US images of 8339 patients (training set); 4305 images of 2775 patients (total test set) | Combined architecture of three networks: ResNet, ResNeXt, and DenseNet | ThyNet AUC 0.922 (95% CI 0.910–0.934), higher than that of the radiologists (0.839; 95% CI 0.834–0.844; p < 0.0001) |
| Bai et al., 2021 [16] | RS-Net evaluation AI model | 13,984 thyroid US images | CNN approach with GoogLeNet as the backbone network | Accuracy, sensitivity, specificity, PPV, and NPV of 88.0%, 98.1%, 79.1%, 80.5%, and 97.9%, respectively; comparable to a senior radiologist |
| Yoon et al., 2021 [17] | Texture analysis; least absolute shrinkage and selection operator (LASSO) logistic regression model including clinical variables | 155 US images of indeterminate thyroid nodules in 154 patients | Texture extraction using MATLAB 2019b; the LASSO model selected the most useful predictive features; univariable and multivariable logistic regression analyses built the malignancy prediction models | Integrated model AUC 0.839 vs. 0.583 (clinical variables only) |
| Liu et al., 2021 [18] | Information fusion-based joint convolutional neural network (IF-JCNN) | 163 pairs of US images and raw radiofrequency (RF) signals of thyroid nodules | IF-JCNN contains two branched CNNs for deep feature extraction: one for US images (14 convolutional and 3 fully connected layers) and one for RF signals (12 convolutional and 3 fully connected layers) | The information carried by raw RF signals and US images of thyroid nodules is complementary; IF-JCNN (images plus RF signals): AUC 0.956 (95% CI 0.926–0.987) |
| Gomes Ataide et al., 2020 [19] | Feature extraction and Random Forest classifier | 99 original US images | Feature extraction using MATLAB 2018b; Random Forest classifier (400 decision trees; criterion: entropy; with bootstrap) | RFC accuracy 99.3%, sensitivity 99.4%, specificity 99.2% |
| Ye et al., 2020 [20] | Deep convolutional neural network (VGG-16) | US images of 1601 nodules (training set) and 209 nodules (test set) | CNN approach based on VGG-16 (16 layers with learnable weights: 13 convolutional and 3 fully connected) | AUC 0.9157, comparable to the experienced radiologist (0.8879; p > 0.1) |
| Wei et al., 2020 [21] | Ensemble deep learning model (EDLC-TN) | 25,509 thyroid US images | CNN model based on DenseNet, adopted as a multistep cascade pathway for an ensemble learning model with a voting system | AUC 0.941 (0.936–0.946) |
| Zhou et al., 2020 [23] | CNN-based transfer learning method named DLRT (deep-learning radiomics of thyroid) | US images of 1750 thyroid nodules (from 1734 patients) | CNN-based architecture with a transfer learning strategy: 4 hidden layers (3 transferred and 1 fine-tuned) plus a fully connected layer | AUC 0.97 (0.95–0.99) in the external cohort; both a senior and a junior US radiologist had lower sensitivity and specificity than DLRT |
| Nguyen et al., 2020 [24] | Combination of multiple CNN models (ResNet-based and InceptionNet-based) | 450 US thyroid nodule images (from 298 patients) | Combination of ResNet50-based (50 layers) and Inception-based (4 layers) networks followed by global average pooling, batch normalization, dropout, and a dense layer | Accuracy: 92.05% |
| Wang et al., 2020 [25] | Three CNN networks (feature extraction network; attention-based feature aggregation network; classification network) | 7803 US thyroid nodule images from 1046 examinations | CNN approach based on Inception-ResNet-v2 (164 layers) | Method AUC 0.9006; both accuracy and sensitivity significantly higher than those of sonographers |
| Thomas et al., 2020 [26] | AIBx, AI model to risk stratify thyroid nodules | 2025 US images of 482 thyroid nodules (internal dataset) and 103 nodules (external dataset) | CNN approach based on ResNet-34 (34 layers) | NPV, sensitivity, specificity, PPV, and accuracy of the image similarity model were greater than those of other cancer risk stratification systems |
| Galimzianova et al., 2020 [27] | Feature extraction and regularized logistic regression model | 92 US images of 92 biopsy-confirmed thyroid nodules | Feature extraction (219 features per nodule) and elastic net regression analysis | Method AUC 0.828 (95% CI 0.715–0.942), greater than or comparable to that of the expert classifiers |
| Nguyen et al., 2019 [28] | AI-based thyroid nodule classification using information from the spatial and frequency domains | US thyroid images of 237 patients (training dataset) and 61 patients (test dataset) | CNN models compared (ResNet18, ResNet34, and ResNet50); spatial-domain information from deep learning combined with frequency-domain information from the fast Fourier transform (FFT) | The combined spatial/frequency-domain system outperforms state-of-the-art methods (especially CAD systems) |
| Buda et al., 2019 [29] | CNN | 1377 US images of thyroid nodules in 1230 patients (training dataset) and 99 nodules (internal test dataset) | Custom CNN: six blocks of 3 × 3 convolutional filters, each followed by a rectified linear unit activation and a 2 × 2 max pooling layer (see the sketch below) | Method AUC 0.87 (95% CI 0.76–0.95); three ACR TI-RADS readers: 0.91 |
| Koh et al., 2020 [30] | Two individual CNNs compared with experienced radiologists | 15,375 US images of thyroid nodules (training set), 634 (internal test set), and 1181 (external test set) | Four CNNs: two individual CNNs, ResNet50 (50 layers) and InceptionResNetV2 (164 layers), and two classification ensembles, AlexNet-GoogLeNet-SqueezeNet and AlexNet-GoogLeNet-SqueezeNet-InceptionResNetV2 | CNN AUCs similar to the experienced radiologists' AUC (0.87) |
| Wang et al., 2019 [31] | CNN compared with experienced radiologist | 351 US images with nodules and 213 images without nodules from 276 patients | CNN system integrating the ResNet-v2-50 network (50 layers) and YOLOv2 | CAD AUC 0.902, significantly higher than the radiologist's AUC of 0.859 (p = 0.0434) |
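Many of the models above share the same basic ingredients: stacked convolution/pooling blocks followed by a small classification head. As a concrete illustration, the custom CNN of Buda et al. [29] is described as six blocks of 3 × 3 convolutions, each followed by a ReLU activation and 2 × 2 max pooling. Below is a minimal PyTorch sketch of that block pattern; the input resolution, channel widths, and classification head are not reported in the table and are illustrative assumptions.

```python
# Minimal sketch of the six-block conv/ReLU/max-pool pattern described for
# the custom CNN of Buda et al. [29]. Channel widths, input size, and the
# linear head are NOT specified in the table; values here are assumptions.
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """One block: 3x3 convolution -> ReLU -> 2x2 max pooling."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=2),
    )

class ToyThyroidCNN(nn.Module):
    """Six conv blocks plus an assumed global-pool + linear head."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        widths = [1, 16, 32, 64, 128, 256, 512]  # grayscale input; widths assumed
        self.features = nn.Sequential(
            *(conv_block(widths[i], widths[i + 1]) for i in range(6))
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # collapse the remaining spatial grid
            nn.Flatten(),
            nn.Linear(widths[-1], num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))

model = ToyThyroidCNN()
logits = model(torch.randn(1, 1, 256, 256))  # one 256x256 grayscale patch
print(logits.shape)  # torch.Size([1, 2])
```

After six halvings, a 256 × 256 patch is reduced to a 4 × 4 grid before pooling, which is why the head uses global average pooling rather than a fixed-size flatten.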
CAD systems

| Reference | Approach | Source Data | Method Details | Performance |
|---|---|---|---|---|
| Sun et al., 2020 [22] | Fused features combining CNN-based features (VGG-F-based) with hand-crafted features | 1037 US images of thyroid nodules (internal dataset) and 550 images (test dataset) | A support vector machine (SVM) classifies fused features that combine deep features extracted by a CNN with hand-crafted features such as the histogram of oriented gradients (HOG), local binary patterns (LBP), and the scale-invariant feature transform (SIFT) (see the sketch below) | AUC of the attending radiologist lower than that of the system (0.819 vs. 0.881, p = 0.0003) |
| Han et al., 2021 [32] | S-Detect for Thyroid | US images of 454 thyroid nodules from 372 consecutive patients | S-Detect for Thyroid is an AI-based CAD software integrated into US equipment (Samsung Medison Co., Seoul, South Korea) | The sensitivities of the CAD system did not differ significantly from those of the radiologists (all p > 0.05); the specificities and accuracies were significantly lower than those of the radiologists (all p < 0.001) |
| Zhang et al., 2020 [33] | AI-SONIC (Demetics Medical Technology Co., Zhejiang, China) | US images of 365 thyroid nodules | AI-SONIC is a deep-learning-based CAD (a cascade of two CNN architectures: one with 15 convolutional layers/2 pooling layers for segmentation, the other with 4 convolutional layers/4 pooling layers for detection), developed by Demetics Medical Technology Co., China | CAD AUC 0.788 vs. senior radiologist 0.906 (p < 0.001); the CAD system improved the diagnostic sensitivities of both the senior and the junior radiologists |
| Fresilli et al., 2020 [4] | S-Detect for Thyroid compared with an expert radiologist, a senior resident, and a medical student | US images of 107 thyroid nodules | S-Detect for Thyroid is an AI-based CAD software integrated into US equipment (Samsung Medison Co., Seoul, South Korea) | The CAD system and the expert achieved similar sensitivity and specificity (about 70–87.5%); the specificity achieved by the student was significantly lower (76.25%) |
| Jin et al., 2020 [34] | CAD system based on a modified, CNN-based TI-RADS | US images of 789 thyroid nodules from 695 patients | CAD system based on the ACR TI-RADS, with automatic scoring using a CNN (no details provided) | CAD AUC 0.87; junior radiologist AUC 0.73 (junior + CAD: 0.83); senior radiologist AUC 0.91 |
| Xia et al., 2019 [35] | S-Detect for Thyroid | US images of 180 thyroid nodules in 171 consecutive patients | S-Detect for Thyroid is an AI-based CAD software integrated into US equipment (Samsung Medison Co., Seoul, South Korea) | CAD AUC 0.659 (0.577–0.740); radiologist AUC 0.823 (0.758–0.887) |
| Jin et al., 2019 [36] | AmCad (AmCad BioMed, Taipei City, Taiwan) | 33 images from 33 patients read by 81 radiologists | Commercial standalone CAD software: AmCad (version: Shanghai Sixth People's Hospital; AmCad BioMed, Taipei City, Taiwan) | CAD AUC 0.985 (0.881–1.00); 177 contestants AUC 0.659 (0.645–0.673) (p < 0.01) |
| Kim et al., 2019 [37] | S-Detect for Thyroid 1 and 2 | US images of 218 thyroid nodules from 106 consecutive patients | S-Detect for Thyroid is an AI-based CAD software integrated into US equipment (Samsung Medison Co., Seoul, South Korea) | AUC: radiologist 0.905 (95% CI 0.859–0.941); S-Detect 1-assisted radiologist 0.865 (0.812–0.907); S-Detect 1 alone 0.814 (0.756–0.863); S-Detect 2-assisted radiologist 0.802 (0.743–0.853); S-Detect 2 alone 0.748 (0.685–0.804) |
| Chi et al., 2017 [38] | CAD system for thyroid nodules | Database 1 includes 428 images in total; database 2 includes 164 images in total | CAD based on fine-tuning of the GoogLeNet CNN (22 layers, including 9 inception modules) | CAD AUC 0.9920; experienced radiologist AUC 0.9135 |
| Zhao et al., 2019 [39] | Systematic review and meta-analysis of CAD systems for thyroid nodules | Meta-analysis of 5 studies with 723 thyroid nodules from 536 patients | 4 studies used S-Detect; 1 study used an internal CNN-based CAD | CAD AUC 0.90 (95% CI 0.87–0.92); experienced radiologist AUC 0.96 (95% CI 0.94–0.97) |
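Not every system in this group is end-to-end deep learning: Sun et al. [22] classify fused feature vectors, concatenating CNN embeddings with hand-crafted descriptors (HOG, LBP, SIFT) and feeding them to an SVM. The sketch below illustrates this fusion pattern with scikit-image and scikit-learn; the images, labels, and CNN embeddings are synthetic stand-ins, only HOG and LBP are computed for brevity, and the descriptor and SVM settings are assumptions rather than the published configuration.

```python
# Illustrative sketch of the fused-feature idea in Sun et al. [22]:
# hand-crafted descriptors concatenated with deep CNN features, then
# classified with an SVM. Everything below is synthetic/assumed.
import numpy as np
from skimage.feature import hog, local_binary_pattern
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def handcrafted_features(img: np.ndarray) -> np.ndarray:
    """Concatenate a HOG descriptor with a uniform-LBP histogram."""
    h = hog(img, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    img8 = (img * 255).astype(np.uint8)          # LBP prefers integer images
    lbp = local_binary_pattern(img8, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.concatenate([h, lbp_hist])

# Synthetic stand-ins for nodule patches and CNN embeddings.
images = rng.random((40, 64, 64))              # 40 fake 64x64 patches
deep_feats = rng.standard_normal((40, 128))    # mocked CNN features
labels = rng.integers(0, 2, size=40)           # benign = 0 / malignant = 1

# Fuse hand-crafted and deep features into one vector per nodule.
fused = np.hstack([
    np.stack([handcrafted_features(im) for im in images]),
    deep_feats,
])

clf = SVC(kernel="rbf", probability=True).fit(fused, labels)
print(clf.predict(fused[:5]))
```

The design point is that the SVM sees a single concatenated vector per nodule, so descriptors with very different scales would normally be standardized first; that step is omitted here to keep the sketch short.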
AI-modified TIRADS

| Reference | Approach | Source Data | Method Details | Performance |
|---|---|---|---|---|
| Watkins et al., 2021 [40] | AI-TIRADS | US images of 218 nodules from 212 patients | AI-TIRADS is an optimization of ACR TI-RADS generated by "genetic algorithms", a subgroup of AI methods that focus on algorithms inspired by "natural selection" (see the sketch at the end of this section) | Sensitivity 93.44%; specificity 45.71%; BTA, ACR TI-RADS, and AI-TIRADS have comparable diagnostic performance |
| Wang et al., 2020 [41] | Google AutoML for automated nodule identification and risk stratification | US images of 252 nodules from 249 patients | Google AutoML algorithm (AutoML Vision; Google LLC), with cloud computing and transfer learning | Accuracy of the AI-integrated TIRADS: 68.7 ± 7.4% |
| Wildman-Tobriner et al., 2019 [42] | AI-TIRADS | US images of 1425 biopsy-proven thyroid nodules from 1264 consecutive patients (training set); 100 nodules (test set) | AI-TIRADS is an optimization of ACR TI-RADS generated by "genetic algorithms", a subgroup of AI methods that focus on algorithms inspired by "natural selection" | ACR TI-RADS AUC 0.91; AI TI-RADS AUC 0.93 (with a slight improvement in specificity and ease of use) |

Abbreviations: ACR: American College of Radiology; AI: artificial intelligence; AIBx: AI model to risk stratify thyroid nodules; AUC: area under the curve; AutoML: automated machine learning; BETNET: brief efficient thyroid network; BTA: British Thyroid Association; CAD: computer-aided diagnosis; CI: confidence interval; CNN: convolutional neural network; CSS: cascading style sheets; DLRT: deep-learning radiomics of thyroid; EDLC-TN: ensemble deep-learning classification model for thyroid nodules; FFT: fast Fourier transform; IF-JCNN: information fusion-based joint convolutional neural network; LASSO: least absolute shrinkage and selection operator; NPV: negative predictive value; PPV: positive predictive value; RF: radiofrequency; RFC: Random Forest classifier; RS-Net: regression–segmentation network; TN: thyroid nodule; US: ultrasound; VGG: Visual Geometry Group.
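The AI-TIRADS entries [40,42] rely on genetic algorithms: candidate point tables for sonographic features are evolved over generations through selection, crossover, and mutation toward better benign/malignant separation. The toy sketch below shows that loop on synthetic data; the features, fitness function, and GA parameters are invented for illustration and do not reproduce the published optimization.

```python
# Toy genetic algorithm in the spirit of AI TI-RADS [40,42]: evolve integer
# point values for sonographic features so the summed score separates benign
# from malignant nodules. Data and settings are synthetic assumptions.
import numpy as np

rng = np.random.default_rng(42)

N_FEAT = 10                                        # assumed feature count
X = rng.integers(0, 2, size=(200, N_FEAT))         # feature present/absent
true_pts = rng.integers(0, 4, size=N_FEAT)         # hidden "ideal" point table
y = ((X @ true_pts) + rng.normal(0, 1, 200) > true_pts.sum() / 2).astype(int)

def fitness(points: np.ndarray) -> float:
    """AUC-like score: fraction of malignant/benign pairs ranked correctly."""
    s = X @ points
    pos, neg = s[y == 1], s[y == 0]
    return float((pos[:, None] > neg[None, :]).mean())

pop = rng.integers(0, 4, size=(50, N_FEAT))        # candidate point tables
for _ in range(100):
    scores = np.array([fitness(p) for p in pop])
    parents = pop[np.argsort(scores)[-25:]]        # selection: keep best half
    pairs = rng.integers(0, 25, size=(25, 2))      # crossover: splice 2 parents
    cuts = rng.integers(1, N_FEAT, size=25)
    kids = np.array([np.concatenate([parents[i][:c], parents[j][c:]])
                     for (i, j), c in zip(pairs, cuts)])
    mut = rng.random(kids.shape) < 0.1             # mutation: random reassign
    kids[mut] = rng.integers(0, 4, size=int(mut.sum()))
    pop = np.vstack([parents, kids])

best = max(pop, key=fitness)
print("evolved point table:", best, "fitness:", round(fitness(best), 3))
```

In the published AI TI-RADS work the optimization additionally constrained the point tables to remain clinically usable (small integer points, easy to apply at the bedside), which is why the sketch restricts candidate values to a small integer range.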