Table 1.
Reference | Approach | Source Data | Method Details | Performance |
---|---|---|---|---|
Zhu, et al., 2021 [14] | Brief Efficient Thyroid Network (BETNET) | Gray-scale US images of 592 patients with 600 TNs (internal dataset); 187 patients with 200 TNs (external validation dataset) | CNN approach with 24 layers: 13 convolutional layers, 5 pooling layers, and 3 fully connected layers with dropouts in between | AUC 0.970 (95% CI 0.958–0.980) in the independent validation cohort, similar to two highly skilled radiologists (0.940 and 0.953)
Peng, et al., 2021 [15] | Deep-learning AI model (ThyNet) | 18,049 US images of 8339 patients (training set); 4305 images of 2775 patients (total test set) | Combined architecture of three networks: ResNet, ResNeXt, and DenseNet | ThyNet AUC 0.922 (95% CI 0.910–0.934), higher than that of the radiologists (0.839; 95% CI 0.834–0.844; p < 0.0001)
Bai, et al., 2021 [16] | RS-Net evaluation AI model | 13,984 thyroid US images | CNN approach in which GoogLeNet is used as the backbone network | Accuracy, sensitivity, specificity, PPV, and NPV were 88.0%, 98.1%, 79.1%, 80.5%, and 97.9%, respectively, comparable to those of a senior radiologist
Yoon, et al., 2021 [17] | Texture analysis; least absolute shrinkage and selection operator (LASSO) logistic regression model including clinical variables | 155 US images of indeterminate thyroid nodules in 154 patients | Texture features extracted using MATLAB 2019b; the LASSO model was used to choose the most useful predictive features; univariable and multivariable logistic regression analyses were performed to build malignancy prediction models | Integrated model AUC 0.839 vs. 0.583 (clinical variables only)
Liu, et al., 2021 [18] | Information fusion-based joint convolutional neural network (IF-JCNN) | 163 pairs of US images and raw radiofrequency signals of thyroid nodules | IF-JCNN contains two branched CNNs for deep feature extraction: one for US images (14 convolutional layers and 3 fully connected layers) and one for RF signals (12 convolutional layers and 3 fully connected layers) | The information carried by raw RF signals and US images of thyroid nodules is complementary; IF-JCNN (images + RF signals): AUC 0.956 (95% CI 0.926–0.987)
Gomes Ataide, et al., 2020 [19] | Feature extraction and Random Forest classifier | 99 original US images | Feature extraction using MATLAB 2018b; Random Forest classifier (400 decision trees; entropy criterion; with bootstrap) | RFC accuracy 99.3%, sensitivity 99.4%, specificity 99.2%
Ye, et al., 2020 [20] | Deep convolutional neural network (VGG-16) | US images of 1601 nodules (training set) and 209 nodules (test set) | CNN approach based on VGG-16 (16 layers with learnable weights: 13 convolutional and 3 fully connected) | AUC 0.9157, comparable to the experienced radiologist (0.8879; p > 0.1)
Wei, et al., 2020 [21] | Ensemble deep learning model (EDLC-TN) | 25,509 thyroid US images | CNN model based on DenseNet, deployed as a multistep cascade within an ensemble learning model with a voting system | AUC 0.941 (0.936–0.946)
Zhou, et al., 2020 [23] | CNN-based transfer learning method named DLRT (deep-learning radiomics of thyroid) | US images of 1750 thyroid nodules (from 1734 patients) | CNN-based architecture with a transfer learning strategy, with 4 hidden layers (3 transferred and 1 fine-tuned) and a fully connected layer | AUC in the external cohort 0.97 (0.95–0.99); both a senior and a junior US radiologist had lower sensitivity and specificity than DLRT
Nguyen, et al., 2020 [24] | Combination of multiple CNN models (ResNet-based and InceptionNet-based) | 450 US thyroid nodule images (from 298 patients) | Combination of ResNet50-based (50 layers) and Inception-based (4 layers) networks followed by global average pooling, batch normalization, dropout, and a dense layer | Accuracy: 92.05%
Wang, et al., 2020 [25] | Three CNN networks (feature extraction network; attention-based feature aggregation network; classification network) | 7803 US thyroid nodule images from 1046 examinations | CNN approach based on Inception-ResNet-v2 (164 layers) | Method AUC 0.9006; both accuracy and sensitivity significantly higher than those of sonographers
Thomas, et al., 2020 [26] | AIBx, AI model to risk stratify thyroid nodules | 2025 US images of 482 thyroid nodules (internal dataset) and 103 nodules (external dataset) | CNN approach based on ResNet-34 (34 layers) | NPV, sensitivity, specificity, PPV, and accuracy of the image similarity model were greater than those of other cancer risk stratification systems
Galimzianova, et al., 2020 [27] | Feature extraction and regularized logistic regression model | 92 US images of 92 biopsy-confirmed thyroid nodules | Feature extraction (219 features per nodule) and elastic net regression analysis | Method AUC 0.828 (95% CI 0.715–0.942), greater than or comparable to that of the expert classifiers
Nguyen, et al., 2019 [28] | AI-based thyroid nodule classification using information from spatial and frequency domains | US thyroid images of 237 patients (training dataset) and 61 patients (test dataset) | CNN models (ResNet18, ResNet34, and ResNet50 were compared) | The AI system, combining a spatial domain based on deep learning with a frequency domain based on the Fast Fourier transform (FFT), outperforms state-of-the-art methods (especially CAD systems)
Buda, et al., 2019 [29] | CNN | 1377 US images of thyroid nodules in 1230 patients (training dataset) and 99 nodules (internal test dataset) | Custom CNN (six blocks with 3 × 3 convolutional filters, each followed by a Rectified Linear Unit activation function and a max pooling layer with 2 × 2 kernels) | Method AUC 0.87 (CI 0.76–0.95) vs. 0.91 for three ACR TI-RADS readers
Koh, et al., 2020 [30] | Two individual CNNs compared with an experienced radiologist | 15,375 US images of thyroid nodules (training set), 634 (internal test set), 1181 (external test set) | Four CNNs, including two individual CNNs, ResNet50 (50 layers) and InceptionResNetV2 (164 layers), and two classification ensembles: AlexNet-GoogLeNet-SqueezeNet and AlexNet-GoogLeNet-SqueezeNet-InceptionResNetV2 | CNN AUCs similar to the experienced radiologist's AUC (0.87)
Wang, et al., 2019 [31] | CNN compared with experienced radiologist | 351 US images with nodules and 213 images without nodules of 276 patients | CNN system integrating the ResNet-v2-50 (50 layers) network with YOLOv2 | CAD AUC 0.902, significantly higher than the radiologist's AUC of 0.859 (p = 0.0434)
CAD systems | | | | |
Sun, et al., 2020 [22] | Fused features combining CNN-based features (VGG-F-based) with hand-crafted features | 1037 US images of thyroid nodules (internal dataset) and 550 images (test dataset) | A support vector machine (SVM) is used for classification of fused features, which combine the deep features extracted by a CNN with hand-crafted features such as the histogram of oriented gradients (HOG), local binary patterns (LBP), and the scale-invariant feature transform (SIFT) | AUC of the attending radiologist lower than that of the system (0.819 vs. 0.881, p = 0.0003)
Han, et al., 2021 [32] | S-Detect for Thyroid | US images of 454 thyroid nodules from 372 consecutive patients | S-Detect for Thyroid is an AI-based CAD software integrated in US equipment (Samsung Medison Co., Seoul, South Korea) | The sensitivities of the CAD system did not differ significantly from those of the radiologist (all p > 0.05); the specificities and accuracies were significantly lower than those of the radiologist (all p < 0.001). |
Zhang, et al., 2020 [33] | AI-SONIC; Demetics Medical Technology Co., Zhejiang, China | US images of 365 thyroid nodules | AI-SONIC is a deep learning-based CAD (a cascade of two CNN architectures: one with 15 convolutional layers/2 pooling layers for segmentation, the other with 4 convolutional layers/4 pooling layers for detection), developed by Demetics Medical Technology Co., China | CAD AUC 0.788 vs. senior radiologist 0.906 (p < 0.001); the use of the CAD system improved the diagnostic sensitivities of both the senior and the junior radiologists
Fresilli, et al., 2020 [4] | S-Detect for Thyroid compared with an expert radiologist, a senior resident, and a medical student | US images of 107 thyroid nodules | S-Detect for Thyroid is an AI-based CAD software integrated in US equipment (Samsung Medison Co., Seoul, South Korea) | The CAD system and the expert achieved similar sensitivity and specificity values (about 70–87.5%); the specificity achieved by the student was significantly lower (76.25%)
Jin, et al., 2020 [34] | CAD system based on a modified, CNN-based TIRADS, evaluated by junior and senior radiologists | US images of 789 thyroid nodules from 695 patients | CAD system based on ACR TI-RADS automatic scoring using a CNN (no details provided) | AUC: CAD 0.87; junior radiologist 0.73; junior + CAD 0.83; senior radiologist 0.91
Xia, et al., 2019 [35] | S-Detect for Thyroid | US images of 180 thyroid nodules in 171 consecutive patients | S-Detect for Thyroid is an AI-based CAD software integrated in US equipment (Samsung Medison Co., Seoul, South Korea) | AUC: CAD 0.659 (0.577–0.740); radiologist 0.823 (0.758–0.887)
Jin, et al., 2019 [36] | AmCad; AmCad BioMed, Taipei City, Taiwan | 33 images from 33 patients read by 81 radiologists | Commercial standalone CAD software: AmCad (version: Shanghai Sixth People’s Hospital; AmCad BioMed, Taipei City, Taiwan) | CAD AUC 0.985 (0.881–1.00); 177 contestants: AUC 0.659 (0.645–0.673); p < 0.01
Kim, et al., 2019 [37] | S-Detect for Thyroid 1 and 2 | US images of 218 thyroid nodules from 106 consecutive patients | S-Detect for Thyroid is an AI-based CAD software integrated in US equipment (Samsung Medison Co., Seoul, South Korea) | AUC: radiologist 0.905 (95% CI 0.859–0.941); S-Detect 1-assisted radiologist 0.865 (0.812–0.907); S-Detect 1 0.814 (0.756–0.863); S-Detect 2-assisted radiologist 0.802 (0.743–0.853); S-Detect 2 0.748 (0.685–0.804)
Chi, et al., 2017 [38] | CAD system for thyroid nodules | Two databases: database 1 with 428 images and database 2 with 164 images | CAD based on fine-tuning of the GoogLeNet CNN (22 layers, including 9 inception modules) | CAD AUC 0.9920; experienced radiologist AUC 0.9135
Zhao, et al., 2019 [39] | CAD systems for thyroid nodules: systematic review and meta-analysis | Meta-analysis of 5 studies with 723 thyroid nodules from 536 patients | 4 studies with S-Detect; 1 study with an in-house CNN-based CAD | CAD AUC 0.90 (95% CI 0.87–0.92); experienced radiologist AUC 0.96 (95% CI 0.94–0.97)
AI-modified TIRADS | | | | |
Watkins, et al., 2021 [40] | AI-TIRADS | US images of 218 nodules from 212 patients | The AI-TIRADS is an optimization of ACR TIRADS generated by “genetic algorithms”, a subgroup of AI methods that focus on algorithms inspired by “natural selection” | Sensitivity 93.44%; specificity 45.71%; BTA, ACR-TIRADS, and AI-TIRADS have comparable diagnostic performance
Wang, et al., 2020 [41] | Google AutoML for automated nodule identification and risk stratification | US images of 252 nodules from 249 patients | Google AutoML algorithm (AutoML Vision; Google LLC), with cloud computing and transfer learning | Accuracy of the AI-integrated TIRADS: 68.7 ± 7.4%
Wildman-Tobriner, et al., 2019 [42] | AI-TIRADS | US images of 1425 biopsy-proven thyroid nodules from 1264 consecutive patients (training set); 100 nodules (test set) | The AI-TIRADS is an optimization of ACR TIRADS generated by “genetic algorithms”, a subgroup of AI methods that focus on algorithms inspired by “natural selection” | ACR TI-RADS AUC 0.91; AI TI-RADS AUC 0.93 (with slightly improved specificity and ease of use)
Abbreviations: ACR: American College of Radiology; AI: artificial intelligence; AIBx: AI model to risk stratify thyroid nodules; AUC: area under the curve; AutoML: automated machine learning; BETNET: brief efficient thyroid network; BTA: British Thyroid Association; CAD: computer-aided diagnosis; CI: confidence interval; CNN: convolutional neural network; DLRT: deep-learning radiomics of thyroid; EDLC-TN: ensemble deep-learning classification model for thyroid nodules; FFT: Fast Fourier transform; IF-JCNN: information fusion-based joint convolutional neural network; LASSO: least absolute shrinkage and selection operator; NPV: negative predictive value; PPV: positive predictive value; RF: radiofrequency; RFC: Random Forest classifier; RS-Net: regression–segmentation network; SVM: support vector machine; TI-RADS: Thyroid Imaging Reporting and Data System; TN: thyroid nodule; US: ultrasound; VGG: Visual Geometry Group.
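
Many of the CNN classifiers in Table 1 (e.g., the ResNet-based models [24,26,28,30,31] and the transfer-learning approach of DLRT [23]) share the same underlying pattern: an ImageNet-pretrained backbone whose final classification layer is replaced with a benign/malignant head. The sketch below illustrates that pattern in PyTorch with a ResNet50 backbone; the dropout rate, freezing policy, and input size are illustrative assumptions, not any study's published configuration.

```python
# Minimal sketch of the transfer-learning pattern common to the CNN studies
# above; hyperparameters here are assumptions for illustration only.
import torch
import torch.nn as nn
from torchvision import models

def build_nodule_classifier(freeze_backbone: bool = True) -> nn.Module:
    # Load an ImageNet-pretrained ResNet50 backbone (50 layers).
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    if freeze_backbone:
        for param in model.parameters():
            param.requires_grad = False  # fine-tune only the new head
    # Replace the 1000-class ImageNet head with a 2-class head
    # (benign vs. malignant); dropout rate is an assumed value.
    model.fc = nn.Sequential(
        nn.Dropout(p=0.5),
        nn.Linear(model.fc.in_features, 2),
    )
    return model

if __name__ == "__main__":
    net = build_nodule_classifier()
    dummy = torch.randn(1, 3, 224, 224)  # stand-in for a grayscale US image
    print(net(dummy).shape)              # torch.Size([1, 2])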
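The ensemble entries (EDLC-TN with its voting system [21] and the AlexNet/GoogLeNet/SqueezeNet ensembles of Koh et al. [30]) rest on soft voting: averaging the class probabilities of several independently trained CNNs. A minimal sketch follows; the choice of member architectures and untrained weights are assumptions for illustration.

```python
# Minimal sketch of soft voting across several CNNs (assumed members,
# untrained weights); not the published ensemble configurations.
import torch
from torchvision import models

def ensemble_predict(nets, images: torch.Tensor) -> torch.Tensor:
    # Average softmax probabilities across ensemble members (soft voting).
    with torch.no_grad():
        probs = [torch.softmax(net(images), dim=1) for net in nets]
    return torch.stack(probs).mean(dim=0)

if __name__ == "__main__":
    members = [
        models.alexnet(num_classes=2),
        models.squeezenet1_1(num_classes=2),
        models.googlenet(num_classes=2, aux_logits=False, init_weights=True),
    ]
    for m in members:
        m.eval()
    batch = torch.randn(4, 3, 224, 224)
    print(ensemble_predict(members, batch).shape)  # torch.Size([4, 2])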
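Nguyen et al. [28] pair the spatial image with a frequency-domain representation obtained via the Fast Fourier transform. A minimal sketch of the frequency-domain input is shown below; the log-compression and image size are assumptions, as the paper's exact preprocessing is not reproduced here.

```python
# Minimal sketch of an FFT magnitude spectrum as a frequency-domain input
# channel; scaling choices are assumptions for illustration.
import numpy as np

def fft_magnitude(image: np.ndarray) -> np.ndarray:
    # Center the zero-frequency component, then log-compress the magnitude
    # spectrum so it has a dynamic range usable as a CNN input.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    return np.log1p(np.abs(spectrum))

us_image = np.random.rand(224, 224)   # stand-in for a grayscale US image
print(fft_magnitude(us_image).shape)  # (224, 224)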
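Yoon et al. [17] used LASSO-penalized logistic regression to select predictive texture features. The sketch below shows that selection mechanism with scikit-learn on synthetic data; the feature matrix and regularization strength C are assumptions (the study extracted its texture features in MATLAB).

```python
# Minimal sketch of LASSO (L1-penalized) logistic regression for feature
# selection; data and C are synthetic/assumed.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(155, 50))    # 155 nodules x 50 texture features
y = rng.integers(0, 2, size=155)  # benign (0) vs. malignant (1)

# The L1 penalty drives uninformative coefficients to exactly zero,
# which is how LASSO performs feature selection.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
)
model.fit(X, y)

coefs = model.named_steps["logisticregression"].coef_.ravel()
print(f"{np.flatnonzero(coefs).size} of {coefs.size} features retained")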
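Finally, the Random Forest configuration reported by Gomes Ataide et al. [19] (400 decision trees, entropy criterion, bootstrap sampling) maps directly onto scikit-learn, as sketched below; the synthetic data stands in for their MATLAB-derived image features.

```python
# Minimal sketch of the Random Forest configuration reported in [19];
# the feature matrix is synthetic stand-in data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(99, 20))    # 99 images x 20 extracted features
y = rng.integers(0, 2, size=99)

rfc = RandomForestClassifier(
    n_estimators=400,     # 400 decision trees, as in [19]
    criterion="entropy",  # information-gain splits, as in [19]
    bootstrap=True,       # bootstrap resampling per tree, as in [19]
    random_state=0,
)
print(cross_val_score(rfc, X, y, cv=5).mean())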