Table 1.
Characteristics of included studies.
S.No | References | Country | Aim | Age group | Sample size | Mode of AI used | Inputs employed | Method | Outcome |
---|---|---|---|---|---|---|---|---|---|
I. RISK OF DEVELOPING CLEFT LIP AND/OR PALATE | |||||||||
1. | Shafi et al. (26) | Pakistan | Predict the risk of developing cleft lip and palate (CLP) | N/A | 1,000 (500 in each group) | Multilayer perceptron (MLP), k-nearest neighbor (kNN), decision tree and random forest (RF) | Answers to questionnaires | A questionnaire was used to elicit data on 36 input features from mothers, half of whom had cleft babies while half were controls. The data was prepared and different predictive models were applied. The accuracy of each model was evaluated. | The MLP model with three hidden layers and 28 perceptrons in each gave the highest classification accuracy (92.6%) on test data. |
2. | Machado et al. (27) | Brazil | Predict the genetic risk of development of non-syndromic cleft lip with or without palate (NSCL ± P) | N/A | 722 participants with NSCL ± P and 866 without NSCL ± P | RF and Neural Networks (NN) | Single Nucleotide Polymorphisms (SNP) | A model for genetic risk assessment of NSCL ± P in the Brazilian population was built by subjecting 72 known SNPs to RF, which was used to determine the significant SNPs. A NN was used to confirm the predictive model. Interactions among the SNPs were evaluated using multiple regression. | 13 SNPs were highly predictive in identifying individuals with NSCL ± P in the Brazilian population after RF and NN analysis. The combination of these was able to separate NSCL ± P subjects from controls with an accuracy of 94.5%. SNP–SNP interaction analysis showed 13 significant pairs. |
3. | Liu et al. (28) | China/USA | Assess gene interactions leading to development of NSCL ± P | N/A | 806 patient–parent trios | Logic regression | SNPs | 173 SNPs in and around eight genes were analyzed. Two-way and multi-way genotype–genotype interactions of SNPs from different genes involved in cell adhesion were analyzed using a machine learning algorithm. | Two-way and multi-way interactions between three genes (ACTN1, CTNNB1 and CDH1) contributed toward the risk of NSCL ± P. |
4. | Baker et al. (29) | U.S.A | Understand the mechanism of development of cleft palate via toxicology | N/A | 500 chemicals | Data mining | High-throughput screening data and chemical structural descriptors | A database of cleft-active and cleft-inactive chemicals with identifiers such as gene scores and chemotypes was made. This was used to model a dataset for machine learning. A data mining software was used to extract indicators suggesting molecular initiating events leading to the adverse outcome of cleft palate. | Six molecular initiating events, each associated with a cluster of chemicals, were identified that could lead to the adverse outcome of cleft palate. The pathways involved were also identified. |
5. | Zhang et al. (30) | China | Assessment of genetic risk of developing NSCL ± P using SNPs | Infants | 587 (Han and Uyghur populations) | Support Vector Machines (SVM), logistic regression (LR), naïve Bayes, decision trees, RF, k-NN, artificial neural network (ANN) | SNPs | 43 SNP candidates were examined and their diagnostic ability in genetic risk assessment in the Chinese populations was validated using machine learning methods. From these, a panel of 24 SNPs was evaluated further after manual selection for risk assessment efficiency. This was done by sequential removal or addition of an SNP each time the LR based model was trained. | The LR model gave the best results for genetic risk assessment in the Han population, while the Uyghur population obtained better results using the SVM. In the Uyghur population, the best results were obtained using the relative risk scoring model. Assessment efficiency showed that SNPs in three genes involved in folic acid and vitamin A synthesis play an important role in the incidence of NSCL ± P. |
6. | Li et al. (31) | USA | Identify gene interactions that increase the risk of developing NSCL ± P | N/A | 895 (Asian) and 681 (European) patient–parent trios | RF and Logic regression | SNPs | RF was used to identify plausible SNPs in Asian and European populations. Potential genotype–genotype interactions were studied by machine learning and applied to NSCL ± P case–parent trios, focusing on the WNT pathway and loci identified by genome wide association studies. | The study found evidence of interaction between SNPs in the MAFB and WNT5B genes in both populations. WNT5B may also interact with markers in the 8q24 region in Europeans, and WNT5A may interact with markers in IRF6 and the nearby open-reading frame C1orf107 in Asians. |
II. DIAGNOSIS OF CLEFT LIP AND/OR PALATE | |||||||||
7. | Jurek et al. (32) | Poland | Identify cleft palate in a prenatal ultrasound | Fetuses (11–13 weeks of gestation) | 49 (36 non-cleft participants and 13 with CP) | Syntactic Pattern Recognition | Processed image of the fetal palate | Ultrasound images were used as input and subjected to processing to detect the contour of the palate. Echogenicity histograms were generated by sequential image analysis. These were used to classify fetal palates as physiologic or pathologic. | The proposed method was able to correctly identify 81.6% of the images. |
8. | Zhang et al. (33) | China | Estimate alveolar cleft defect volume prior to secondary alveolar bone grafting | 11 ± 2.8 years | 21 CBCTs | Deep Neural Network (DNN) | CBCT images | A partial non-rigid registration-based framework was used to determine the volume of bone missing in the alveolar cleft. The system was compared to other mainstream non-rigid registration methods. The consistency between the estimate made by the proposed system and ground truth was evaluated using the Dice similarity coefficient (DSC). | A completely automated system that can estimate bone volume in an alveolar cleft was created. The DSC of the proposed method was 0.88 for the maxilla and 0.83 for the cleft. The relative volume error of the system (8%) was lower than that of any of the mainstream systems. |
9. | Alam and Alfawzan (34) | Saudi Arabia | Assess radiographic characteristics in participants with cleft lip and palate | 13.29 ± 3.52 to 14.32 ± 4.46 years | 123 (31 non-cleft and 92 participants with clefts) | AI based software (Webceph, Korea) was used | Lateral Cephalometric Radiographs | Radiographic characteristics of the sella turcica in all the groups: unilateral CLP (UCLP), bilateral CLP (BCLP), unilateral cleft lip and alveolus (UCLA) and unilateral cleft lip (UCL), were compared, and skeletal malocclusion was identified in subjects with bridging of the sella turcica using an AI based software. | Sella turcica bridging and skeletal Class III malocclusion were more common in cleft subjects. Measurements of the sella turcica were lowest in participants with bilateral cleft lip and palate, followed by non-cleft individuals. |
10. | Alam and Alfawzan (35) | Saudi Arabia | Assess dental characteristics in participants with cleft lip and palate | 13.29 ± 3.52 to 14.32 ± 4.46 years | 123 (31 non-cleft and 92 participants with cleft) | AI based software (Webceph, Korea) was used | Lateral Cephalometric Radiographs | An AI based software was used to assess dental characteristics among groups (non-cleft, BCLP, UCLP, UCLA, UCL). The results were statistically analyzed. | Of the 14 dental characteristics evaluated, eight were significantly altered in non-syndromic cleft individuals. |
11. | Agarwal et al. (36) | U.S.A | Detect cleft lip on images | Newborns/children | 1,451 images | Convolutional Neural Network (CNN) and SVM | Images annotated with landmarks around the nose and mouth | A pre-trained convolutional neural network (AlexNet) was trained to extract features from facial images. These were used as inputs for an SVM classifier. Four models were tested to classify images as unilateral or bilateral cleft or normal. | The proposed model with augmented images using high level features extracted from the CNN that served as input for the SVM classifier showed the best validation accuracy (92.22%) and testing accuracy (84.12%). |
12. | Wu et al. (37) | USA | Detect the midfacial plane in individuals with cleft lip | 3.2–6.7 months | 50 subjects | AI based software was used to detect landmarks | 3D facial images | Five 3D images of each subject were used to detect a mid-facial plane using five methods–direct placement, manual landmark, mirror method, deformation method and the machine learning method. The planes were rated for accuracy by three cleft surgeons, two craniofacial pediatricians and one craniofacial morphology researcher. | The manual based methods were rated best of all. Of the computer based methods, the deformation method received the best scores. The machine learning method performed slightly better than the mirror method. |
III. PRE-SURGICAL ORTHOPEDICS | |||||||||
13. | Schiebl et al. (38) | Germany | Use sequential NAM plates generated by AI software to reduce the size of the alveolar cleft | Neonates | 17 infants with BCLP for development and 6 sets for validation | Development of an algorithm for automated processing | Digitized maxillary impressions | An algorithm was designed to segment the defective structures from a scanned maxillary model and create a mesh with virtual bridging of the defect. NAM plates were generated from the mesh. The plates were reshaped and resized to fit the maxilla as it grows. | The algorithm generated plates in 16 cases for 16 weeks of treatment. They were anatomically correct with minor deviations in structure. On validation, 5 of the 6 plates were made by the algorithm. On assessment by a healthcare professional, 3 of the 5 were evaluated as useful. |
IV. SPEECH ASSESSMENT | |||||||||
14. | Mathad et al. (39) | India | Evaluate misarticulated stops in participants with cleft palate (CP) | 6–12 years | 61 (31 participants with repaired cleft palate and 30 without cleft palate) | SVM classifier | Vowel onset points | A method was proposed to detect vowel onset points (VOPs) in order to segment consonant-vowel transitions and analyze misarticulated stops. These were used to train an SVM classifier that identifies misarticulated stops. | Vowel onset point detection by the proposed method showed 90% accuracy for normal speech and 75% for misarticulated speech. The SVM classifier obtained an accuracy of 90.57% based on VOPs detected by the proposed method. Manual detection of VOPs increased the accuracy to 91.92%. |
15. | Dubey et al. (40) | India | Detect hypernasality in individuals with CP | 7–12 years | 60 (30 individuals without CP and 30 with CP) | SVM | Normalized harmonic amplitude, harmonic amplitude ratio and prominent harmonics frequency | Speech recordings from participants were analyzed based on baseline features and the proposed features. SVMs were used to classify speech based on the proposed features, individually and in combination. The results obtained were compared to those obtained with the baseline features. | The highest accuracy was seen when the three proposed features–normalized harmonic amplitude, harmonic amplitude ratio and prominent harmonics frequency–were used in combination (up to 87.89%). This was better than the individual features and the baseline features. |
16. | Wang et al. (41) | China | Detect hypernasality in individuals with CP | 5–12 years | 144 (72 children in each group) | Long Short Term Memory–Deep Recurrent Neural Network (LSTM–DRNN) | Linear Prediction Coefficients (LPC), Vocal tract area, reflection coefficients of vocal tract, formants + bandwidths, Vowel Space Area, GD spectrum, Voice Low tone to High tone Ratio, Power Spectrum Density, LPC spectrum, Long-term Spectral Flatness Measure, and Spectrum of Vocal Tract Impulse Response, Removing Glottal Excitation, Linear Prediction Coefficients Cepstrum (LPCC), and LPCC distance, Mel-Frequency Cepstral Coefficients (MFCCs), Mel spectrum, and 1/3 octave spectra | An automatic hypernasal speech detection system was proposed based on four categories of features–vocal shape based features, formant based features, vocal tract based cepstral features and vocal tract features. The results were compared to shallow classifiers using both the features mined by the LSTM–DRNN and features without the mining process as input. | The LSTM–DRNN classifier achieved an accuracy of 92.67% when all four vocal tract based feature categories were combined, outperforming every shallow classifier, whether the shallow classifiers used the mined features or the features not obtained by the mining process as input. |
17. | Wutiwiwatchai (42) | Thailand | Detect nasal and oral phonemes in individuals with CP | 7–12 years | 60 (30 individuals with CP and 30 without CP) | Gaussian Mixture Model (GMM) and DNN | Speech samples in the Thai language | A novel instrument was tested to assist the speech therapist in assessment. DNN and GMM models were used to correctly identify oral and nasal phonemes from mixed oro-nasal and oral-only speech input. The output sequence was compared with the correct phoneme sequence to compute identification accuracy. | The DNN showed better recognition than the GMM. Identification of phonemes was better in normal speech than in cleft palate speech. The oral-only input provided a better outcome than the oro-nasal input. |
18. | Golabbakhsh et al. (43) | Iran | Detect hypernasality in participants with CLP | 4–28 years | 30 (15 individuals without CLP and 15 with CLP) | SVM | Jitter, shimmer, mel frequency cepstral coefficient (MFCC), bionic wavelet transform energy and bionic wavelet transform entropy | Recorded sentences from children were used to extract time and frequency features–jitter, shimmer, mel frequency cepstral coefficient (MFCC), bionic wavelet transform energy and bionic wavelet transform entropy. Using combinations of different features, hypernasal voice was classified from normal using an SVM. The accuracy, sensitivity and specificity were evaluated against manual classification. | The use of the mel frequency cepstral coefficient with bionic wavelet transform energy showed the highest accuracy and sensitivity, though the values varied for every sentence. |
19. | Liu et al. (44) | China | Classify the level of hypernasality in individuals with CP | N/A | 64 (48 individuals with unrepaired CP and 16 without CP) | Back propagation neural network | Homomorphic spectrum sequence | Speech data containing four basic vowels was classified into levels of hypernasality. The extracted feature was the homomorphic spectrum feature, and a back propagation neural network was applied to classify the speech. | The classification accuracy for different levels of hypernasality reached up to 80.7%, and the accuracy for detection of normal or hypernasal speech reached 94.5%. |
20. | He et al. (45) | China, Australia | Detect resonance and consonant articulation disorders in individuals with CP | 5–12 years | 120 individuals with CP | GMM classifier | Short-time Shannon energy, non-linear Teager energy operator, mel frequency cepstral coefficient | Four acoustic features were extracted from the data and a GMM was used to classify speech based on the individual features. Identification of consonant omission and consonant replacement was also performed. | The system had the highest accuracy for classification of hypernasality based on the MFCC features (80.40%). The accuracy of evaluating consonant misarticulation differed for each consonant. |
21. | He et al. (46) | China | Classify speech hypernasality and consonant articulation in individuals with CP | 5–12 years | 120 individuals with CP | GMM classifier | Teager energy operator, first formant, number of formants | Participant recordings were classified as hypernasal or normal based on the energy distribution ratio. A GMM classifier further detected one of three levels of hypernasal speech using acoustic features. A method to detect initial consonant omission was also proposed based on the difference in energy amplified frequency bands. | The accuracy of the system in detecting levels of hypernasality was 80.74%. The classification accuracy for detecting consonant omission was 94.87%. |
22. | Bocklet et al. (47) | Germany | Use phonemes to detect intelligibility in participants with cleft lip and palate (CLP) | Avg 7 years (1–17 years) | 630 (380 individuals without CLP, 250 individuals with CLP) | SVM | Mel frequency cepstral coefficient, transformation matrices of Maximum Likelihood Linear Regression | Two approaches–a GMM and Maximum Likelihood Linear Regression–were used to model the articulatory space of a speaker based on speech data annotated by therapists. A support vector regression model was used to predict the level of intelligibility of the speech of children with CLP based on vectors identified by the two approaches. Speech therapist labels were considered ground truth scores. | The maximum likelihood linear regression model was more effective in automatic phoneme evaluation, which was in the same range as the perceptual labels given by speech therapists. |
23. | He et al. (48) | China | Evaluate intelligibility and hypernasality in individuals with CP | N/A | 240 words (uttered by individuals with CP) | GMM classifier | Shannon energy, mel frequency cepstral coefficient | Two acoustic features were extracted. These were used to classify hypernasality and to detect speech intelligibility using a GMM. The data was split into training and testing sets (80%–20%). Speech intelligibility was assessed by word recognition. | A combination of the two acoustic features was a better classifier for hypernasality than either feature alone. The accuracy of hypernasality detection was up to 85.42%. The classification accuracy for speech intelligibility was 75% for normal speech but dropped to 16.5% as the level of hypernasality increased. |
V. SURGERY | |||||||||
24. | Lin et al. (49) | Korea | Predict the need for orthognathic surgery using radiographs taken in childhood | 5–7 years | 56 cephalograms of individuals with UCLP | Random Forest and XGBoost algorithm | Cephalometric variables measured by an operator | Cephalometric parameters of participants with UCLP were assessed at the age of 5–7 years and again at 15 years. The data was processed by the Boruta method to determine cephalometric predictors for orthognathic surgery at an earlier age. | Significant cephalometric differences were found between participants needing surgery and those who did not. Four predictors–inclination of the palatal plane to FH, ANB angle, combination factor and FCA–were selected as predictors of the future need for orthognathic surgery. The model had an accuracy of 87.4%, sensitivity of 97.83% and specificity of 90%. |
25. | Li et al. (50) | China | Assist in placement of surgical markers in individuals | Neonates | 2,568 images | DNN | Frontal facial images annotated with surgical markers | A dataset of pictures annotated with surgical markers was created. This was divided into training, validation and test sets. After training the model, the test pictures were used to identify markers with the proposed model, and the results were compared to other methods of image feature extraction. | The proposed model showed better localization of surgical markers than the other models used for image feature extraction. |
26. | Park et al. (51) | Korea | Predict the need for orthognathic surgery using radiographs taken in childhood | 9.3 years | 131 Korean males (CLA: n = 35, UCLP: n = 56, BCLP: n = 40) | SVM | Cephalometric variables measured by an operator | A prediction model was made using forest wrapping based on cephalometric parameters. Accuracy was verified by a 10-fold cross validation test. | A total of 10 cephalometric variables were selected as predictors, with an accuracy of 77.3%. The sensitivity of the model was 99% and the specificity was 74.1%. |
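Several of the risk-prediction studies above (e.g., Shafi et al. (26) and Zhang et al. (30)) compare standard classifiers such as kNN, decision trees and MLPs on tabular risk-factor data. As a minimal, purely illustrative sketch of the simplest of these, the following implements a k-nearest-neighbor majority vote over binary questionnaire-style answers; the data, feature count and labels are entirely synthetic and do not correspond to any study's dataset.

```python
# Minimal k-nearest-neighbor sketch for questionnaire-style risk
# classification. All data below is synthetic and illustrative only;
# it does not reproduce any model or dataset from the reviewed studies.
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training rows,
    using Hamming distance (suited to binary yes/no answers)."""
    dist = lambda a, b: sum(ai != bi for ai, bi in zip(a, b))
    neighbours = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical 4-feature binary answers: 1 = risk factor reported present.
train_X = [(1, 1, 0, 1), (1, 0, 1, 1), (0, 0, 0, 1),
           (0, 1, 0, 0), (0, 0, 1, 0), (1, 1, 1, 0)]
train_y = ["cleft", "cleft", "control", "control", "control", "cleft"]

print(knn_predict(train_X, train_y, (1, 1, 0, 0)))  # prints "cleft"
```

In practice the studies above used far richer inputs (36 questionnaire features, SNP panels) and stronger models; this sketch only shows the shape of the classification step they all share.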