2023 Oct 18;11(20):2760. doi: 10.3390/healthcare11202760

Table 1.

The application of AI in cephalometric analysis in the past 5 years.

Author (Year) | Data Type | Dataset Size (Training/Test) | No. of Landmarks/Measurements | Algorithm | Performance
Payer et al. (2019) [28] Lateral cephalograms 150/250 19/0 CNN Error radii: 26.67% (2 mm), 21.24% (2.5 mm), 16.76% (3 mm), and 10.25% (4 mm).
Nishimoto et al. (2019) [29] Lateral cephalograms 153/66 10/12 CNN Average prediction errors: 17.02 pixels.
Median prediction errors: 16.22 pixels.
Zhong et al. (2019) [30] Lateral cephalograms 150/100
(additional 150 images as validation set).
19/0 U-Net Test 1:
MRE: 1.12 ± 0.88 mm.
SDR within 2, 2.5, 3, and 4 mm: 86.91%, 91.82%, 94.88%, and 97.90%, respectively.
Test 2:
MRE: 1.42 ± 0.84 mm.
SDR within 2, 2.5, 3, and 4 mm: 76.00%, 82.90%, 88.74%, and 94.32%, respectively.
Park et al. (2019) [31] Lateral cephalograms 1028/283 80/0 YOLOv3, SSD YOLOv3 demonstrated overall superiority over SSD in terms of accuracy and computational performance.
For YOLOv3, SDR within 2, 2.5, 3, and 4 mm: 80.40%, 87.4%, 92.00%, and 96.2%, respectively.
Moon et al. (2020) [32] Lateral cephalograms Training: 50, 100, 200, 400, 800, 1200, 1600, 2000.
Test: 200.
19, 40, 80 CNN (YOLOv3) The accuracy of AI is positively correlated with the size of the training dataset and negatively correlated with the number of detection targets.
Hwang et al. (2020) [33] Lateral cephalograms 1028/283 A total of 80 CNN (YOLOv3) Mean detection error: 1.46 ± 2.97 mm.
Oh et al. (2020) [34] Lateral cephalograms 150/100
(additional 150 images as validation set).
19/8 CNN (DACFL) MRE: 14.55 ± 8.22 pixels.
SDR within 2, 2.5, 3, and 4 mm: 75.9%, 83.4%, 89.3%, and 94.7%, respectively.
Classification accuracy: 83.94%.
Kim et al. (2020) [35] Lateral cephalograms 1675/400 23/8 Stacked hourglass deep learning model. Point-to-point error: 1.37 ± 1.79 mm.
SCR: 88.43%.
Kunz et al. (2020) [36] Lateral cephalograms 1792/50 18/12 CNN The CNN models showed almost no statistically significant differences from the human gold standard.
Alqahtani et al. (2020) [37] Lateral cephalograms -/30 16/16 Commercially available web-based platform (CephX, https://www.orca-ai.com/, accessed on 23 August 2023) The results obtained from CephX and manual landmarking did not exhibit clinically significant differences.
Lee et al. (2020) [38] Lateral cephalograms 150/250 19/8 Bayesian CNN Mean landmark error: 1.53 ± 1.74 mm.
SDR within 2, 3, and 4 mm: 82.11%, 92.28%, and 95.95%, respectively.
Classification accuracy: 72.69~84.74%.
Yu et al. (2020) [39] Lateral cephalograms A total of 5890 Four skeletal classification indicators. Multimodal CNN Sensitivity, specificity, and accuracy for vertical and sagittal skeletal classification: >90%.
Li et al. (2020) [40] Lateral cephalograms 150/100
(additional 150 images as validation set).
19/0 GCN MRE: 1.43 mm.
SDR within 2, 2.5, 3, and 4 mm: 76.57%, 83.68%, 88.21%, and 94.31%, respectively.
Tanikawa et al. (2021) [41] Lateral cephalograms 1755/30 for each subgroup 26/0 CNN Mean success rate: 85~91%.
Mean identification error: 1.32~1.50 mm.
Zeng et al. (2021) [42] Lateral cephalograms 150/100
(additional 150 images as validation set).
19/8 CNN MRE: 1.64 ± 0.91 mm.
SDR within 2, 2.5, 3, and 4 mm: 70.58%, 79.53%, 86.05%, and 93.32%, respectively.
SCR: 79.27%.
Kim et al. (2021) [24] Lateral cephalograms 2610/100
(additional 440 images as validation set).
20/0 Cascade CNN Overall detection error: 1.36 ± 0.98 mm.
Hwang et al. (2021) [43] Lateral cephalograms 1983/200 19/8 CNN (YOLOv3) SDR within 2, 2.5, 3, and 4 mm: 75.45%, 83.66%, 88.92%, and 94.24%, respectively.
SCR: 81.53%.
Bulatova et al. (2021) [44] Lateral cephalograms -/110 16/0 CNN (YOLOv3) (Ceppro software) Twelve of the 16 points showed no statistically significant difference in absolute differences between AI and manual landmarking.
Jeon et al. (2021) [45] Lateral cephalograms -/35 16/26 CNN None of the measurements showed statistically significant differences except the saddle angle and the linear measurements of maxillary incisor to NA line and mandibular incisor to NB line.
Hong et al. (2022) [46] Lateral cephalograms 3004/184 20/ Cascade CNN Total mean error was 1.17 mm.
Accuracy percentage: 74.2%.
Le et al. (2022) [47] Lateral cephalograms 1193/100 41/8 CNN (DACFL) MRE of 1.87 ± 2.04 mm.
SDR within 2, 2.5, 3, and 4 mm: 73.32%, 80.39%, 85.61%, and 91.68%, respectively.
Average SCR: 83.75%.
Mahto et al. (2022) [48] Lateral cephalograms -/30 18/12 Commercially available web-based platform (WebCeph, https://webceph.com, accessed on 23 August 2023) Intraclass correlation coefficient:
7 parameters: >0.9 (excellent agreement); 5 parameters: 0.75~0.9 (good agreement).
Uğurlu et al. (2022) [49] Lateral cephalograms 1360/180
(additional 140 images as validation set).
21/0 CNN (FARNet) MRE: 3.4 ± 1.57 mm.
SDR within 2, 2.5, 3, 4 mm: 76.2%, 83.5%, 88.2%, 93.4%, respectively.
Yao et al. (2022) [50] Lateral cephalograms 312/100 (additional 100 images as validation set) 37/0 CNN MRE: 1.038 ± 0.893 mm.
SDR within 1, 1.5, 2, 2.5, 3, 3.5, 4 mm: 54.05%, 91.89%, 97.30%, 100%, 100%, 100%, respectively.
Lu et al. (2022) [51] Lateral cephalograms 150/250 19/0 GCN MRE: 1.19 mm.
SDR within 2, 2.5, 3, and 4 mm: 83.20%, 88.93%, 92.88%, and 97.07%, respectively.
Tsolakis et al. (2022) [52] Lateral cephalograms -/100 16/18 CNN (commercially available software: CS imaging V8). Differences between the AI software (CS imaging V8) and manual landmarking were not clinically significant.
Duran et al. (2023) [53] Lateral cephalograms -/50 32/18 Commercially available web-based platform (OrthoDx, https://orthodx.phimentum.com; WebCeph, https://webceph.com, accessed on 23 August 2023) Consistency between AI software and manual landmarking:
Statistically significant good agreement for angular measurements; weak agreement for linear measurements and soft tissue parameters.
Ye et al. (2023) [54] Lateral cephalograms -/43 32/0 Commercially available software (MyOrthoX, Angelalign, and Digident) MRE:
MyOrthoX: 0.97 ± 0.51 mm.
Angelalign: 0.80 ± 0.26 mm.
Digident: 1.11 ± 0.48 mm.
SDR (%) (within 1/1.5/2 mm):
MyOrthoX: 67.02 ± 10.23/82.80 ± 7.36/89.99 ± 5.17.
Angelalign: 78.08 ± 14.23/89.29 ± 14.02/93.09 ± 13.64.
Digident: 59.13 ± 10.36/78.72 ± 5.97/87.53 ± 4.84.
Ueda et al. (2023) [55] Lateral cephalometric data A total of 220 0/8 RF Overall accuracy: 0.823 ± 0.060.
Bao et al. (2023) [56] Reconstructed lateral cephalograms from CBCT -/85 19/23 Commercially available software (Planmeca Romexis 6.2) For landmarks:
MRE: 2.07 ± 1.35 mm
SDR within 1, 2, 2.5, 3, and 4 mm: 18.82%, 58.58%, 71.70%, 82.04%, and 91.39%, respectively.
For measurements:
The rates of consistency within the 95% limits of agreement: 91.76~98.82%.
Kim et al. (2021) [57] Reconstructed posteroanterior cephalograms from CBCT
345/85 23/0 Multi-stage CNN MRE: 2.23 ± 2.02 mm
SDR within 2 mm: 60.88%.
Takeda et al. (2021) [58] Posteroanterior cephalograms 320/80 4/1 CNN, RF The CNN showed a higher coefficient of determination and a lower mean absolute error than RF for the distance from the vertical reference line to menton.
CNN with a stochastic gradient descent optimizer had the best performance.
Lee et al. (2019) [59] CBCT 20/7 7 Deep learning Average point-to-point error: 1.5 mm.
Torosdagli et al. (2019) [60] CBCT A total of 50 9/0 Deep geodesic learning Errors in the pixel space: <3 pixels for all landmarks.
Yun et al. (2020) [61] CBCT 230/25 93/0 CNN Average point-to-point error: 3.63 mm.
Kang et al. (2021) [62] CT 20/8 16/0 Multi-stage DRL Mean detection error: 1.96 ± 0.78 mm.
SDR within 2, 2.5, 3, and 4 mm: 58.99%, 75.39%, 86.52%, and 95.70%, respectively.
Ghowsi et al. (2022) [63] CBCT -/100 53/0 Commercially available software (Stratovan Corporation) Mean absolute error: 1.57 mm.
Mean error distance: 3.19 ± 2.6 mm.
SDR within 2, 2.5, 3, and 4 mm: 35%, 48%, 59%, and 75%, respectively.
Dot et al. (2022) [64] CT 128/38
(additional 32 images as validation set).
33/15 SCN For landmarks:
MRE: 1.0 ± 1.3 mm.
SDR within 2, 2.5, and 3 mm: 90.4%, 93.6%, and 95.4%, respectively.
For measurements:
Mean errors: −0.3 ± 1.3° (angular), −0.1 ± 0.7 mm (linear).
Blum et al. (2023) [65] CBCT 931/114 35/0 CNN Mean error: 2.73 mm.

MRE, mean radial error; SDR, success detection rate; YOLOv3, You-Only-Look-Once version 3; SSD, Single-Shot Multibox Detector; SCR, success classification rates; DACFL, deep anatomical context feature learning; CBCT, cone-beam computed tomography; GCN, graph convolutional networks; FARNet, feature aggregation and refinement network; DRL, deep reinforcement learning; CT, computerized tomography; SCN, SpatialConfiguration-Net.
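
Most rows in Table 1 report MRE and SDR, so a small illustration of how these two metrics are commonly computed may help when comparing studies. The sketch below is not taken from any of the cited papers. It assumes that the radial error is the Euclidean distance between a predicted and a reference landmark, that a detection counts as successful when its error is at or below the threshold, and that the standard deviation is the population form. The function names and the mm_per_pixel parameter are hypothetical, and individual studies may differ on each of these choices.

```python
import math

def radial_errors(predicted, reference, mm_per_pixel=1.0):
    """Euclidean distance between each predicted and reference landmark.

    mm_per_pixel is a hypothetical scale factor; leave it at 1.0 when the
    coordinates are already expressed in millimetres.
    """
    return [math.dist(p, r) * mm_per_pixel for p, r in zip(predicted, reference)]

def mean_radial_error(errors):
    """Return (mean, population standard deviation) of the radial errors."""
    n = len(errors)
    mean = sum(errors) / n
    sd = math.sqrt(sum((e - mean) ** 2 for e in errors) / n)
    return mean, sd

def success_detection_rate(errors, thresholds=(2.0, 2.5, 3.0, 4.0)):
    """Percentage of landmarks whose error is at or below each threshold (mm)."""
    n = len(errors)
    return {t: 100.0 * sum(e <= t for e in errors) / n for t in thresholds}

# Toy coordinates in mm, invented purely for illustration.
reference = [(0, 0), (10, 10), (20, 5), (30, 30)]
predicted = [(0, 1), (13, 14), (20, 5), (30, 32.5)]

errors = radial_errors(predicted, reference)
mre, sd = mean_radial_error(errors)
sdr = success_detection_rate(errors)
print(f"MRE: {mre:.3f} ± {sd:.3f} mm")
print(sdr)
```

With these toy points the radial errors are 1.0, 5.0, 0.0, and 2.5 mm, giving an MRE of 2.125 mm and an SDR of 50% within 2 mm and 75% within 2.5, 3, and 4 mm. Rows that report errors in pixels cannot be compared with mm-based rows unless the image resolution is known.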