Table 2.
Summary of Data Extracted From Selected Studies^a
S.No | Author, Year | Country | Age Range, y | Sample Size | CVM Method Used | Inputs | Reference Standards/Comparisons | Outcome |
1 | Kim et al., 2021^26 | Korea | 6–18 | Training: 600 images; testing: 120 images | Baccetti et al. | Images of lateral cephalograms equally distributed across stages | Two specialists | The combination of the CNN with a region-of-interest detector and segmentor module was significantly more accurate (62.5%) than without them. |
2 | Seo et al., 2021^27 | Korea | 6–19 | 600 lateral cephalograms | Baccetti and Franchi | Cropped images of lateral cephalograms equally distributed across stages displaying the inferior border of C2 to C4 | One radiologist | A pretrained network, Inception-ResNet-v2, had a relatively high accuracy of 0.941 ± 0.018 when adapted. It also had the highest recall and precision scores among all pretrained models tested. |
3 | Amasya et al., 2020^28 | Turkey | 10–30 | 72 images | Baccetti and Franchi | Manually labeled image data set with 54 features and ratios with equal distribution across stages | Three dentomaxillofacial radiologists and an orthodontist | Interobserver agreement between researchers and the ANN model was substantial to almost perfect (wκ = 0.76–0.92). Percentage agreements between the ANN model and each researcher were 59.7%, 50%, 62.5%, and 61.1%. |
4 | Amasya et al., 2020^29 | Turkey | 10–30 | 647 images (498 for training and 149 for testing) | Baccetti and Franchi | Manually labeled image data set with 54 features and results of the evaluation by a clinical decision support system | Expert visual evaluation | Percentage agreement between the model and the visual analysis of the researcher was 86.93%, which was the highest among all models tested. |
5 | Kök et al., 2020^30 | Turkey | 8–17 | 419 individuals | Hassel and Farman | Measurements used in different combinations for seven neural networks | Human observer's classification | Highest classification accuracy was obtained from the model that used all 32 measurements and age as inputs. The overall accuracy was 94.2% for this model on the test data set. |
6 | Kök et al., 2020^31 | Turkey | 8–17 | 360 individuals | — | Measurements on the second, third, fourth, and fifth vertebrae used in different combinations as inputs for four models | Human observer's classification | Highest accuracy obtained with one of the neural networks was 0.95 when the training and test data were split into a ratio of 70%:30%. |
7 | Kök et al., 2019^32 | Turkey | 8–17 | 300 individuals | Hassel and Farman | Linear measurements performed on second, third, and fourth cervical vertebrae | Orthodontist | The neural network model had the second-highest accuracy values for determining individual stages, except the fifth stage (93%, 89.7%, 68.8%, 55.6%, 47.4%, and 78%), but was the most stable among all algorithms tested. |
8 | Makaremi et al., 2019^33 | France | Not mentioned | 1870 cephalograms | Baccetti and Franchi | Cropped images without filters and cropped images processed with mean, median, and entropy filter | Human observer | The pretrained models were not as effective as the neural network made by the researchers. The accuracy of the neural network did not exceed 90% on test images. The accuracy improved with more images and preprocessing with the entropy filter. |
^a ANN indicates artificial neural network; CNN, convolutional neural network; CVM, cervical vertebral maturation.
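
The agreement measures reported in the Outcome column (percentage agreement and weighted kappa, wκ) can be illustrated with a short calculation. The sketch below is not taken from any of the included studies: the function names, the coding of stages as integers 1 to 6, the linear weighting scheme, and the example ratings are all assumptions made for illustration, and the studies may have used a different weighting (for example, quadratic).

```python
from typing import Sequence


def percent_agreement(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Share of cases (in %) in which both raters assigned the same stage."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def linear_weighted_kappa(
    rater_a: Sequence[int], rater_b: Sequence[int], n_stages: int = 6
) -> float:
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1).

    Stages are assumed to be coded 1..n_stages (an illustrative convention).
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    if any(not 1 <= s <= n_stages for s in list(rater_a) + list(rater_b)):
        raise ValueError("stage out of range")

    n = len(rater_a)
    k = n_stages

    # Cross-tabulate the two sets of ratings.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a - 1][b - 1] += 1

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0  # disagreement actually seen
    weighted_expected = 0.0  # disagreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use a single stage")
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Made-up ratings for illustration only (not data from any study).
    model = [1, 2, 2, 3, 4, 4, 5, 6, 6, 3]
    expert = [1, 2, 3, 3, 4, 5, 5, 6, 5, 2]
    print(f"percentage agreement: {percent_agreement(model, expert):.1f}%")
    print(f"linear weighted kappa: {linear_weighted_kappa(model, expert):.2f}")
```

Because weighted kappa gives partial credit when two raters differ by only one stage, it can remain high even when exact percentage agreement is modest, which may help in reading results such as those of study 3.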