Abstract
Preterm birth (PTB) is one of the most common and serious complications of pregnancy, leading to mortality and severe morbidities that can impact lifelong health. PTB may be associated with various maternal medical conditions and with dental status, including periodontitis. The purpose of this study was to identify major predictors of PTB among clinical and dental variables using machine learning methods. Prospective cohort data were obtained from 60 women who delivered singleton births via cesarean section (30 PTB, 30 full-term birth [FTB]). The dependent variables were PTB and spontaneous PTB (SPTB). Fifteen independent variables (10 clinical and 5 dental factors) were selected for inclusion in the machine learning analysis. Random forest (RF) variable importance was used to identify the major predictors of PTB and SPTB. Shapley additive explanation (SHAP) values were calculated to analyze the directions of the associations between the predictors and PTB/SPTB. The top five predictors of PTB identified by RF variable importance were pre-pregnancy body mass index (BMI), modified gingival index (MGI), preeclampsia, decayed missing filled teeth (DMFT) index, and maternal age. SHAP values revealed positive correlations between PTB/SPTB and major predictors such as premature rupture of the membranes, pre-pregnancy BMI, maternal age, and MGI. The positive correlations between these predictors and PTB emphasize the need for integrated medical and dental care during pregnancy. Future research should focus on validating these predictors in larger populations and exploring interventions to mitigate these risk factors.
Keywords: Artificial intelligence, Gingival index, Periodontitis, Preterm birth, Preterm labor
Subject terms: Comorbidities, Oral manifestations, Reproductive signs and symptoms
Introduction
Preterm birth (PTB), defined as a birth prior to 37 weeks of gestation, is one of the most common and serious pregnancy complications that can lead to significant morbidity and mortality among neonates, infants, and children aged < 5 years1. Infant prematurity caused by PTB leads to several serious perinatal illnesses, including intraventricular hemorrhage, retinopathy of prematurity, necrotizing enterocolitis, and bronchopulmonary dysplasia, which could result in long-term neurodevelopmental impairment2–5. Some of these are lifelong complications manifesting not only in the neonatal period but also throughout life.
PTBs can be classified largely into two subsets: (1) intended PTBs, in which labor is induced or the infant is delivered by cesarean section without prelabor due to maternal or fetal indications, such as preeclampsia or nonreassuring fetal conditions; and (2) spontaneous PTBs (SPTBs), in which preterm labor occurs with or without the prelabor rupture of membranes (PROM)6. While all PTBs are threatening, SPTBs are responsible for two-thirds of all PTBs and are difficult to predict7.
Risk factors often discussed for PTB or SPTB can be largely classified into two categories: (1) maternal/environmental factors, including advanced age, low socioeconomic status (SES), smoking, alcohol consumption, drug use, mood disorders or psychological stress, nutritional factors, and pre-pregnancy body mass index (BMI); and (2) clinical/obstetric factors, including previous PTB, multiple gestations, short cervical length (CL), urinary tract infections, diabetes, hypertension, thyroid disorders, PROM, and fetal anomalies8. Despite the numerous risk factors identified through decades of research, prediction rates for PTBs remain low, and the overall PTB rate has remained stable, if not risen9. The prediction strategy for PTB commonly used today includes risk evaluation, serial measurement of CL, and assessment of biochemical biomarkers, including fetal fibronectin and inflammatory cytokines. Although CL measurement has become a routine obstetric procedure from mid-pregnancy onward, its clinical utility remains controversial, especially in low-risk populations10,11. Recommendations for fibronectin screening are discrepant, and its low cost-effectiveness and the burden on patients should also be considered12. More importantly, there is a lack of interventions during early or mid-pregnancy that lower the risk of PTB. This highlights the need to develop more reliable strategies for predicting and preventing PTBs.
Periodontitis is a chronic inflammatory disease that occurs in the oral cavity and is widely associated with PTBs and pregnancy complications, including preeclampsia and low birth weight13,14. Periodontitis and PTBs share risk factors such as low SES, high BMI, smoking, alcohol consumption, and clinical factors including diabetes and hypertension. In theory, periodontitis can elevate the inflammatory state of the placental tissue through either direct or indirect pathways, contributing to improper activation of the labor cascade that leads to PTB15. In this context, association between the two disease entities has been investigated in numerous studies.
In a recent nationwide population-based cohort study conducted in Taiwan, the presence of periodontal disease in expectant mothers was positively correlated with an increased risk of preterm delivery16. Another cohort study conducted in Brazil involving 2474 participants revealed that preexisting maternal periodontitis was associated with nearly twice the risk of PTB17. In a Brazilian case–control study, the combination of periodontitis and hypertension elevated the risks of PTB and low birth weight fourfold18. However, such possible associations between the two have not yet been confirmed because many intervention studies have shown contradictory results, and the majority of observational studies published thus far are retrospective14,19.
This study aimed to determine the significance of periodontitis-related parameters as predictors of PTB using prospectively collected data and machine learning (ML) methods. ML is a subfield of artificial intelligence (AI) that uses algorithms trained on data to produce adaptable models that can perform various tasks. In the medical field in particular, ML can be used to build disease risk prediction models, along with many other tasks such as the analysis of health care utilization, chronic disease surveillance, and the comparison of disease prevalence and drug outcomes20. ML-based data analysis has been reported to be superior to the conventional approach in risk factor assessment because ML methods are relatively free from data assumptions and, unlike traditional linear regression models, can model complex nonlinear relationships between predictors and the outcome21. Therefore, using ML methods in risk factor analysis makes it feasible to identify additional screening tools for the disease by filling the gaps left by conventional approaches22.
In this context, this study used ML and prospective cohort data to test the association between the clinical and dental predictors of PTB.
Results
Descriptive statistics
Descriptive statistics are presented in Table 1. Among the included clinical variables, the incidences of preeclampsia, PROM, preterm labor, and labor at delivery were significantly higher in the PTB group (P < 0.05). Maternal age, in vitro fertilization, gestational diabetes mellitus (GDM), and smoking are known risk factors for PTB but showed no significant differences in our study population. There was also no significant difference in white blood cell (WBC) count or C-reactive protein (CRP) level before delivery. While periodontitis stage, decayed missing filled teeth (DMFT) index, and plaque index (PI) showed no differences between the groups, modified gingival index (MGI) was significantly higher in the PTB group (P < 0.05). Of the 60 participants, 3 were periodontally healthy, showing no clinical attachment level (CAL) loss and bleeding on probing (BOP) < 10%. Among the participants diagnosed with periodontitis, 14/57 (24.6%) had generalized periodontitis while 44/57 (77.2%) had the localized form.
Table 1.
Variables | Preterm birth (N = 30) | Full-term birth (N = 30) | P-value |
---|---|---|---|
Maternal age | 34 (31, 37) | 34 (32, 36) | 0.973 |
≥ 35 years | 12 (40.0) | 12 (40.0) | > 0.999 |
Pre-pregnancy BMI | 23.2 (21.8, 26.4) | 21 (19.5, 22.9) | 0.057 |
BMI changes during pregnancy | 3.7 (2.7, 5.3) | 4.6 (3.6, 6.1) | 0.118 |
In vitro fertilization | 3 (10.0) | 5 (16.7) | 0.448 |
Prior preterm birth | 3 (10.0) | 3 (10.0) | > 0.999 |
Preeclampsia | 14 (46.7) | 2 (6.7) | 0.001 |
Chronic hypertension | 0 (0.0) | 2 (6.7) | 0.150 |
Gestational diabetes mellitus | 4 (13.3) | 3 (10.0) | 0.688 |
Premature rupture of the membranes | 9 (30.0) | 1 (3.3) | 0.006 |
Histologic chorioamnionitis | | | 0.110 |
Stage 1 | 2 (6.7) | 0 (0.0) | |
Stage 2 | 1 (3.3) | 0 (0.0) | |
Stage 3 | 3 (10.0) | 0 (0.0) | |
Funisitis | 2/29 (6.9) | 0/27 (0.0) | 0.492 |
Preterm labor | 11 (36.7) | 4 (13.3) | 0.037 |
Labor at delivery | 10 (33.3) | 0 (0.0) | 0.001 |
WBC before delivery (10³/μL) | 9.5 (8.1, 12.7) | 8.8 (7.0, 10.6) | 0.117 |
CRP before delivery (mg/L) | 4.8 (2.0, 8.5) | 2.5 (1.5, 5.2) | 0.159 |
Smoker | 0 (0.0) | 0 (0.0) | - |
Periodontal health | 1 (3.3) | 2 (6.7) | 0.665 |
Periodontitis | | | |
Stage 1 | 17 (56.7) | 20 (66.7) | |
Stage 2 | 10 (33.3) | 6 (20.0) | |
Stage 3 | 2 (6.7) | 2 (6.7) | |
Stage 4 | 0 (0.0) | 0 (0.0) | |
Periodontitis Stage ≥ 2 | 12 (40.0) | 8 (26.7) | 0.273 |
MGI | 2.3 (1.5, 2.8) | 1.6 (1, 2.4) | 0.032 |
MGI ≥ 2 | 17 (56.7) | 9 (30.0) | 0.037 |
Decayed, missing, filled teeth index | 5 (3, 9) | 7 (5, 10) | 0.413 |
Plaque index ≥ 2 | 16 (53.3) | 10 (33.3) | 0.118 |
Maternal age, pre-pregnancy BMI, BMI changes during pregnancy, and modified gingival index are presented by mean values (25th and 75th percentile). Other variables are stated in numbers (percent).
BMI, body mass index; CRP, C-reactive protein; MGI, modified gingival index; WBC, white blood cells.
Machine learning analysis
The model performance is detailed in Table 2. Random forest (RF) registered a similar performance compared to logistic regression (LR), that is, 65% vs. 68% (accuracy) and 73% vs. 72% (area under the receiver operating characteristic curve [AUC]), 61% vs. 64% (specificity) and 72% vs. 74% (sensitivity) for PTB; 83% vs. 81% (accuracy) and 86% vs. 86% (AUC), 96% vs. 92% (specificity) and 66% vs. 64% (sensitivity) for SPTB. Based on RF variable importance in Table 3, pre-pregnancy BMI, MGI, preeclampsia, DMFT index, and maternal age were ranked among the top five for PTB. The RF variable importance rankings of PROM, pre-pregnancy BMI, maternal age, DMFT index, and chorioamnionitis (CAM) stage were among the top five for SPTB.
Table 2.
Model | PTB vs. FTB: Accuracy | PTB vs. FTB: AUC | PTB vs. FTB: Specificity | PTB vs. FTB: Sensitivity | SPTB vs. FTB: Accuracy | SPTB vs. FTB: AUC | SPTB vs. FTB: Specificity | SPTB vs. FTB: Sensitivity |
---|---|---|---|---|---|---|---|---|
LR | 0.68 | 0.72 | 0.64 | 0.74 | 0.81 | 0.86 | 0.92 | 0.64 |
DT | 0.64 | 0.64 | 0.64 | 0.63 | 0.76 | 0.78 | 0.80 | 0.75 |
NB | 0.59 | 0.78 | 0.92 | 0.23 | 0.78 | 0.87 | 0.78 | 0.76 |
RF | 0.65 | 0.73 | 0.61 | 0.72 | 0.83 | 0.86 | 0.96 | 0.66 |
SVM | 0.47 | 0.40 | 0.05 | 0.96 | 0.66 | 0.62 | 1.00 | 0.00 |
ANN | 0.58 | 0.63 | 0.58 | 0.60 | 0.70 | 0.71 | 0.83 | 0.45 |
Results from random forest analysis are highlighted in Bold.
LR, logistic regression; DT, decision tree; NB, naïve Bayes; RF, random forest; SVM, support vector machine; ANN, artificial neural network; AUC, area under the curve; PTB, preterm birth; FTB, full-term birth; SPTB, spontaneous preterm birth.
Table 3.
Variables | Importance ranking (PTB vs. FTB) | Importance ranking (SPTB vs. FTB) |
---|---|---|
Maternal age | 5 | 3 |
Pre-pregnancy BMI | 1 | 2 |
In vitro fertilization | 12 | 9 |
Prior preterm birth | 14 | 13 |
Preeclampsia | 3 | 14 |
Chronic hypertension | 15 | 15 |
Gestational diabetes mellitus | 10 | 8 |
PROM | 6 | 1 |
CAM | 11 | 7 |
CAM stage | 13 | 5 |
Periodontitis stage | 7 | 10 |
MGI | 2 | 6 |
MGI ≥ 2 | 8 | 11 |
Plaque index ≥ 2 | 9 | 12 |
DMFT-index | 4 | 4 |
The ranking of a top-5 predictor is highlighted in Bold.
BMI, body mass index; PROM, prelabor rupture of the membranes; CAM, chorioamnionitis; MGI, modified gingival index; DMFT, decayed, missing, filled teeth; PTB, preterm birth; FTB, full-term birth; SPTB, spontaneous preterm birth.
The Shapley additive explanation (SHAP) values for each independent variable are given in Table 4. The SHAP value measures the difference between the probability of PTB/SPTB predicted by the ML model with and without the predictor; it therefore indicates how much the predicted probability of the dependent variable (PTB or SPTB) decreases or increases when a certain independent variable is included in the ML analysis. Overall, an absolute value of max SHAP (positive) greater than that of min SHAP (negative) indicates a positive relationship between the predictor and the dependent variable. Some positive associations can be observed in Table 4, particularly among the variables in the top rankings of RF permutation importance. SHAP values revealed positive correlations between PTB/SPTB and major predictors such as PROM, pre-pregnancy BMI, maternal age, and MGI. For example, the maximum SHAP value of MGI for PTB was 0.1274 (Table 4), meaning that the inclusion of MGI in the RF increased the predicted probability of PTB by up to approximately 13% for an individual participant.
The SHAP values for each participant are shown as individual dots in Fig. 1. In this figure, blue (or red) denotes low (or high) values of the predictor for a participant. For instance, in the case of MGI, blue dots with low MGI values are located on the left side with low SHAP values, whereas red dots with high MGI values are located on the right side with high SHAP values, indicating that the MGI and SHAP values are positively correlated.
SHAP dependence plots can be used to visualize the correlations between predictors and outcomes, enabling an intuitive understanding of each predictor's contribution to the model. The plots also show correlations among the predictors if they are sufficiently strong. Figure 2a and b show the SHAP dependence plots for MGI and DMFT index, respectively. In these plots, the relationships between MGI/DMFT index and pre-pregnancy BMI are also shown in blue (low BMI) or red (high BMI).
Table 4.
Variables | PTB vs. FTB: Min | PTB vs. FTB: Max | PTB vs. FTB: Mean | SPTB vs. FTB: Min | SPTB vs. FTB: Max | SPTB vs. FTB: Mean |
---|---|---|---|---|---|---|
Maternal age | − 0.0805 | 0.0990 | − 0.0036 | − 0.0551 | 0.0756 | 0.0023 |
Pre-pregnancy BMI | − 0.1640 | 0.1997 | 0.0071 | − 0.0572 | 0.1038 | 0.0042 |
In vitro fertilization | − 0.0445 | 0.0191 | 0.0005 | − 0.0609 | 0.0272 | 0.0023 |
Prior preterm birth | − 0.0173 | 0.0132 | − 0.0006 | − 0.0054 | 0.0350 | − 0.0006 |
Preeclampsia | − 0.0578 | 0.1312 | 0.0000 | − 0.0214 | 0.0053 | 0.0012 |
Chronic hypertension | − 0.0800 | 0.0091 | 0.0028 | − 0.0248 | 0.0077 | 0.0017 |
Gestational diabetes mellitus | − 0.0350 | 0.0352 | − 0.0001 | − 0.0080 | 0.0020 | 0.0006 |
PROM | − 0.0577 | 0.2256 | − 0.0029 | − 0.0989 | 0.3005 | − 0.0041 |
CAM | − 0.0184 | 0.1057 | − 0.0030 | − 0.0266 | 0.1637 | − 0.0044 |
CAM stage | − 0.0192 | 0.1176 | − 0.0041 | − 0.0300 | 0.1617 | − 0.0085 |
Periodontitis stage | − 0.0268 | 0.0323 | 0.0010 | − 0.0189 | 0.0161 | 0.0025 |
MGI | − 0.1190 | 0.1274 | − 0.0005 | − 0.0681 | 0.1306 | 0.0023 |
MGI ≥ 2 | − 0.0752 | 0.0889 | − 0.0032 | − 0.0192 | 0.0283 | − 0.0015 |
Plaque index ≥ 2 | − 0.0637 | 0.0620 | 0.0001 | − 0.0073 | 0.0119 | − 0.0007 |
DMFT-index | − 0.0909 | 0.0837 | − 0.0004 | − 0.0839 | 0.1171 | 0.0024 |
The SHAP values of top-5 predictors are highlighted in Bold.
BMI, body mass index; PROM, prelabor rupture of the membranes; CAM, chorioamnionitis; MGI, modified gingival index; DMFT, decayed, missing, filled teeth; PTB, preterm birth; FTB, full-term birth; SPTB, spontaneous preterm birth.
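The direction rule stated for Table 4 (a predictor is read as positively related when the absolute value of its maximum SHAP exceeds that of its minimum SHAP) can be applied mechanically. The following is a minimal sketch using the PTB vs. FTB minimum and maximum values of the top five PTB predictors copied from Table 4; the function name and the "neutral" label for exact ties are assumptions made for illustration and do not come from the study.

```python
# (min SHAP, max SHAP) for the PTB vs. FTB model, copied from Table 4.
ptb_shap_range = {
    "Pre-pregnancy BMI": (-0.1640, 0.1997),
    "MGI": (-0.1190, 0.1274),
    "Preeclampsia": (-0.0578, 0.1312),
    "DMFT index": (-0.0909, 0.0837),
    "Maternal age": (-0.0805, 0.0990),
}


def shap_direction(min_shap, max_shap):
    """Apply the |max| vs. |min| rule described for Table 4.

    The 'neutral' label for ties is an assumption added for completeness.
    """
    if abs(max_shap) > abs(min_shap):
        return "positive"
    if abs(max_shap) < abs(min_shap):
        return "negative"
    return "neutral"


directions = {
    name: shap_direction(lo, hi) for name, (lo, hi) in ptb_shap_range.items()
}
for name, direction in directions.items():
    print(f"{name}: {direction}")
```

Under this rule, the DMFT index is the only top-five PTB predictor whose minimum SHAP is larger in magnitude than its maximum.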
Discussion
In this study, we found that dental factors such as MGI and DMFT index could function as major predictors in the PTB prediction model constructed using the ML method. What differentiates our study from previous ones is that we added dental factors to the well-known clinical risk factors, including various clinical backgrounds and obstetric histories. In previous prediction models using either ML or multivariate regression analysis, model performance measured by AUC ranged from 0.61 to 0.71 (refs. 23–28). The types of birth used as the outcome variables were SPTBs and PTBs. In this study, among the five ML methods tested (decision tree, naïve Bayes, RF, support vector machine, and artificial neural network), RF showed the highest accuracy (0.65 in the PTB model and 0.83 in the SPTB model), with AUCs of 0.73 and 0.86, respectively (Table 2). RF creates many training sets, trains many decision trees, and makes a prediction with a majority vote (bagging). RF included 1000 decision trees in this study. A majority vote from 1000 doctors would be more robust than a vote from 1 doctor. Similarly, a majority vote from 1000 decision trees would be more robust than a vote from a single model.
This study confirmed the effectiveness of introducing ML for the development of predictive models for PTB. This can help to establish guidelines for PTB prediction models and serve as critical evidence for early screening and personalized preventive interventions based on these risk factors. In particular, identifying maternal dental health, such as the MGI, as an important predictor of PTB underscores the significance of generalizing dental examinations for pregnant women. Furthermore, integrating medical and dental evaluations into prenatal care protocols can facilitate early interventions, which may improve outcomes for both mothers and newborns.
This model is significant because it is the first attempt to adopt dental factors as independent variables for PTB prediction. Surprisingly, the AUC values for prediction performance were comparable to or even superior to those of previous publications. The rankings of RF variable importance verified the significance of the dental factors in this model (Table 3). MGI ranked second in the PTB model and sixth in the SPTB model. In the PTB model, MGI thus outranked risk factors that are well known from a medical perspective, such as preeclampsia (3rd), maternal age (5th), GDM (10th), prior PTB (14th), and chronic hypertension (15th)8. MGI indicates the level of gingival inflammation during the examination period and represents susceptibility to chronic inflammation29. In addition, the periodontal status of our patient population had two specific features: (1) periodontitis was generally mild to moderate (stages 1 and 2); and (2) there were no observable differences between the two groups in periodontitis severity or the amount of plaque biofilm. Higher MGI scores meaningfully reflect the gingivitis susceptibility of the host, which correlates strongly with future periodontitis, especially in young adults30. In this context, a high MGI score, especially MGI ≥ 2 on average (Fig. 2a), at prenatal screening or regular check-ups during pregnancy should indicate either high maternal susceptibility to infections that could occur during pregnancy or a tendency toward progressive periodontitis, a disease whose causative pathogens are known to influence the activation of preterm labor15.
Numerous publications including systematic reviews and meta-analyses have addressed the association between periodontitis and PTB or adverse pregnancy outcomes14,16–18,31–38. Although the overall results indicate that maternal periodontitis is significantly associated with the PTB rate, the results are often inconsistent, and causal relationships are still far from being determined. Many of the studies included in the meta-analyses are case–control studies; therefore, inevitable biases and limitations are inherent14. Importantly, the definitions of periodontitis vary in different studies, and the parameters considered in periodontal diagnosis are often confined to probing pocket depth or CAL.
In this prospective study, we collected data on multiple dental parameters, including periodontitis stage, MGI, PI, and DMFT index. Periodontitis staging represents the severity of the disease and is based on the newly established periodontitis classification39. The periodontal status of our participants evaluated using the staging system did not differ between the two groups. In addition, there were very few severe periodontitis cases corresponding to stages 3 or 4, and their prevalence did not differ between the groups. This result differs somewhat from that of a previous publication from our group, in which the incidence of periodontitis was generally higher in preterm mothers40. It is possible that our study population did not fully reflect real-world situations because the number of participants was relatively small.
Considering the high AUC (0.86) of our SPTB model and the associated RF variable importance rankings, preterm labor with or without rupture of membranes (SPTB) should be considered the main cause of PTBs. SPTB is regarded as a syndrome with multiple causes, including infections, vascular disorders, decidual senescence, breakdown of maternal–fetal tolerance, a decline in progesterone action, and cervical disease41. A delicate balance in maternal immunology is a prerequisite for healthy pregnancy and labor. The shift from a quiescent to a proinflammatory state activates labor, and inflammatory cytokines such as interleukin-1, interleukin-6, and tumor necrosis factor-α are involved in this process42. In this context, when a pregnant mother becomes vulnerable to infection owing to a multitude of factors such as BMI, stress, behavioral factors, genetics, and nutritional deficiency, the labor process can be abnormally brought forward43. According to our results, a high MGI score may serve as a screening tool for the mother’s susceptibility to infection. It may be a simple, noninvasive, and cost-effective test that can be included in prenatal screening. Moreover, because mothers with high MGI scores are likely to have more pathogenic bacteria in their dental plaque, periodontal therapy should be carried out in the prenatal or mid-term period to prevent hematogenous dissemination of oral pathogens.
Further, the dependence plot showed negative correlation within the range of DMFT index ≤ 10 (Fig. 2b), implying that dental caries-susceptible patients had a low probability of SPTB. Dental caries and periodontitis are two contrasting infections occurring in the oral cavity, because the bacterial species associated primarily with each disease entity have nearly opposite characteristics44. In this context, periodontitis-susceptible patients have a low tendency to be caries-prone, therefore presenting a low DMFT index and high SHAP value.
The strength of this study lies in the inclusion of dental risk factors such as MGI and DMFT index in predicting PTB. Several previous studies have established PTB prediction models based on LR or ML analysis45–47. However, all of these studies included only clinical data based on electronic health records. Unlike previous models, this study incorporates dental factors in addition to maternal clinical data, based on the correlation between periodontal disease and PTB. This is a novel approach and introduces an important new insight to existing research on PTB prediction. The other strength of this study is the use of cutting-edge ML approaches, such as RF variable importance and SHAP summary/dependence plots, to identify the major predictors and explain the directions of their associations. Conventional statistical approaches such as linear regression or LR analysis rest on an unrealistic assumption, namely ceteris paribus, “all the other variables staying constant.” By contrast, SHAP considers all realistic scenarios. Assume that there are three predictors of SPTB: PROM, pre-pregnancy BMI, and MGI. Here, the SHAP value of MGI for a participant is the average of the contribution of MGI across the following four scenarios: (1) PROM excluded, pre-pregnancy BMI excluded; (2) PROM included, pre-pregnancy BMI excluded; (3) PROM excluded, pre-pregnancy BMI included; and (4) PROM included, pre-pregnancy BMI included. In other words, the SHAP value combines the results of all possible subgroup analyses that are ignored in conventional methods.
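The four-scenario description above can be made concrete with an exact Shapley calculation for three predictors. The sketch below is illustrative only: the predicted probabilities in `toy_prediction` are hypothetical numbers for a single imaginary participant, not results from this study, and the function name is an assumption. Note that in the Shapley formulation the average over scenarios is weighted by coalition size; with three predictors the four scenarios receive weights 1/3, 1/6, 1/6, and 1/3.

```python
from itertools import combinations
from math import factorial


def shapley_value(target, players, value):
    """Exact Shapley value of `target` given a coalition value function."""
    others = [p for p in players if p != target]
    n = len(players)
    total = 0.0
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            # Shapley weight for a coalition S: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            coalition = frozenset(subset)
            gain = value(coalition | {target}) - value(coalition)
            total += weight * gain
    return total


players = ["PROM", "BMI", "MGI"]

# Hypothetical predicted SPTB probability for one participant when the model
# is given only the listed predictors (illustrative values, not study data).
toy_prediction = {
    frozenset(): 0.20,
    frozenset({"PROM"}): 0.45,
    frozenset({"BMI"}): 0.25,
    frozenset({"MGI"}): 0.30,
    frozenset({"PROM", "BMI"}): 0.50,
    frozenset({"PROM", "MGI"}): 0.60,
    frozenset({"BMI", "MGI"}): 0.35,
    frozenset({"PROM", "BMI", "MGI"}): 0.70,
}

shap = {p: shapley_value(p, players, lambda s: toy_prediction[frozenset(s)])
        for p in players}
for name, val in shap.items():
    print(f"{name}: {val:.4f}")
```

With these toy numbers, the contribution of MGI in the four scenarios is 0.10, 0.15, 0.10, and 0.20, and the three Shapley values add up to the difference between the full prediction (0.70) and the baseline (0.20).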
This study had several limitations. First, the sample size was small. In general, ML techniques can structure data without explicit programming, and prediction performance should increase and become more reliable when a large dataset is used. Nevertheless, in this study we obtained solid performance using the RF method, that is, 73% (AUC) for PTB and 86% (AUC) for SPTB, from a relatively small dataset. Unlike previous studies in which population-based retrospective data were used in ML-based modeling, our dataset was prospectively collected and contained multifaceted parameters, including dental, clinical, and obstetric factors. It appears that the high-quality data enabled the construction of a robust model. However, it will be necessary to confirm our results using a larger dataset. The small sample size might also have resulted in a study population with more favorable periodontal conditions than those found in the real world. Second, social factors, including parental SES (educational level, occupation, and income) and level of physical activity, were not included in this analysis. SES and physical activity may be risk factors for PTB48,49, but research findings on this matter have not yielded consistent conclusions50,51. Additional research may be necessary to evaluate the impact of these variables. Third, dental records were collected after delivery. Although dental examinations were conducted within several days after delivery, the hormonal changes that accompany delivery can affect the periodontal status of the mothers. Ideally, serial records of dental examinations during antenatal visits would provide more valuable information for clarifying the significance of periodontal status in PTB prediction. This should be taken into consideration in further studies.
Fourthly, there were potential biases in this study, including selection bias arising from the focus on women who delivered via cesarean section, which may not represent the broader population of expectant mothers. Additionally, reliance on self-reported medical histories could introduce reporting bias, affecting the accuracy of the predictors identified. Consequently, while the findings provide valuable insights, caution should be exercised when generalizing these results to all pregnant women, and further research is needed to validate the predictive model across diverse populations and settings.
Conclusion
In our ML-based prediction model, major predictors of PTB include pre-pregnancy BMI, maternal age, preeclampsia, and dental factors such as the MGI. The positive correlations between these predictors and PTB emphasize the need for early screening and integrated medical and dental care during pregnancy. Future research should focus on validating these predictors in larger populations and exploring interventions to mitigate these risk factors.
Methods
Study population
This prospective cohort study was approved by the Institutional Review Board (IRB) of Korea University Anam Hospital (IRB no. 2020AN0217) and conformed to the STROBE guidelines. All participants signed an informed consent form before enrollment in the study. This study was performed as part of the Maternal Oral Health Effects for Preterm Infants (MOHEPI) study conducted by our group and was performed in accordance with the Declaration of Helsinki.
Participants were recruited from pregnant women who were admitted to the Department of Obstetrics, Korea University Anam Hospital for cesarean delivery of a singleton baby and comprised two groups: 30 patients who delivered prior to 37 weeks of gestation (PTB) and 30 with term delivery (full-term birth). Only mothers with singleton births were included; cases involving multiple gestations or congenital deformities were excluded. Clinical and obstetric data were collected before and after delivery. These included maternal age, pre-pregnancy BMI, BMI changes during pregnancy, history of in vitro fertilization, prior PTB, preeclampsia, chronic hypertension, GDM, PROM, histological CAM, funisitis, the presence of preterm labor, WBC count and CRP level just before delivery, and smoking. Preterm labor was defined as cervical dilation of ≥ 1 cm and regular uterine contractions.
Sample size calculation
Based on a previous study examining the outcomes of PTB according to maternal periodontal status52, the sample size was calculated to be 13 per group to test the difference between two independent population proportions with a significance level of 5%, a power of 80%, and a two-tailed hypothesis. The effect size considered was 0.473 (= 0.506 − 0.033), which is the difference in proportions between the two groups. When determining a sample size, it is essential to consider the confounding variables that may affect the outcome variables; however, no previous studies have reported on such confounding variables. Because the calculated sample size may therefore have been smaller than the number actually required, the total sample size was set at 60 participants (30 per group), allowing for a 10% dropout rate. Statistical power analysis was conducted using G*Power version 3.1.9.2 by medical statistics of the Korea University College of Medicine.
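As a rough cross-check of the reported figure, the sample size for comparing two independent proportions can be approximated with the standard normal-approximation formula. This is only a sketch: G*Power offers several test procedures, and the one used in the study is not stated, so the formula and the function name `n_per_group` are assumptions chosen for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided test of two proportions.

    Uses the pooled-variance normal approximation (an assumption; the exact
    G*Power procedure used in the study is not specified).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    term_null = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    term_alt = z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return ceil((term_null + term_alt) ** 2 / (p1 - p2) ** 2)


# Proportions taken from the text: 0.506 vs. 0.033 (difference 0.473).
n = n_per_group(0.506, 0.033)
print(n)
```

Under this approximation the two proportions quoted above give 13 participants per group, in line with the value reported in the text.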
Dental examination
Participants were brought to the clinic during the admission period after delivery. Panoramic radiographs were obtained before full-mouth periodontal examination and DMFT index scoring. Measurements included periodontal probing depth (PPD), CAL, MGI53, and PI. The severity of periodontitis was graded as stage 1–4 based on the 2017 periodontitis classification39. All measurements were performed by one periodontist (JSP). The periodontitis stage was determined for each patient based on radiologic screening and CAL measurements. For cases with no detectable bone loss or CAL loss, a diagnosis of either periodontal health or gingivitis was given based on the percentage of sites with BOP: if BOP was present at < 10% of sites, the diagnosis was periodontal health; if at ≥ 10%, the diagnosis was gingivitis. In our study, all cases with no CAL loss showed BOP < 10% and were therefore classified as “periodontal health”. Cases with sites of PPD > 5 mm or furcation involvement were classified as either stage 3 or 4. If more than four teeth had been lost due to periodontal causes, or if complex rehabilitation was required due to bite collapse, the case was diagnosed as stage 4. Stage 1 or 2 was determined based on CAL and radiographic bone loss at the site of greatest loss. The extent of the disease was determined by assessing whether CAL or bone loss affected < 30% of the dentition (localized) or more (generalized)54.
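The diagnostic rules described above can be summarized as a small decision function. This is a simplified sketch of one reading of the text, not the examiner's actual protocol: the function and parameter names are hypothetical, the ordering of the stage 3 and stage 4 checks is an assumption, and the distinction between stage 1 and stage 2 is left unresolved because the CAL and bone-loss thresholds are not given in the text.

```python
def periodontal_diagnosis(
    has_cal_or_bone_loss,
    bop_percent,
    deep_pocket_or_furcation=False,
    teeth_lost_to_periodontitis=0,
    needs_complex_rehabilitation=False,
):
    """Return a diagnosis label following the rules described in the text."""
    if not has_cal_or_bone_loss:
        # No attachment or bone loss: health vs. gingivitis by BOP percentage.
        return "periodontal health" if bop_percent < 10 else "gingivitis"
    if teeth_lost_to_periodontitis > 4 or needs_complex_rehabilitation:
        return "stage 4"
    if deep_pocket_or_furcation:  # PPD > 5 mm sites or furcation involvement
        return "stage 3"
    # Stage 1 vs. 2 depends on CAL and radiographic bone loss at the worst
    # site; the thresholds are not stated in the text, so they are not coded.
    return "stage 1 or 2"


def disease_extent(affected_percent):
    """Localized if < 30% of the dentition is affected, else generalized."""
    return "localized" if affected_percent < 30 else "generalized"


print(periodontal_diagnosis(False, bop_percent=5))
print(periodontal_diagnosis(True, bop_percent=25, deep_pocket_or_furcation=True))
print(disease_extent(20))
```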
Machine learning analysis
Prospective cohort data were obtained from 60 women who gave singleton births. The dependent variable was PTB/SPTB. Fifteen independent variables were included in the analysis. LR, decision tree, naïve Bayes, RF, support vector machine, and artificial neural network55–57 were used to predict PTB/SPTB. RF is a group of decision trees that make majority votes on the dependent variable (“bootstrap aggregation”). For example, an RF with 1000 decision trees is processed as follows. Assuming that the original data include 42 participants, training and testing of the RF involve two steps. The first step is to create a decision tree from a random sample drawn from the 42 participants. This process is called “bootstrap sampling”, and it samples with replacement. The data left over from each sampling process are called out-of-bag data. In the second step, the 1000 decision trees created from 1000 repetitions of this process predict the dependent variable for every participant in the out-of-bag data. The majority vote is then taken as the final prediction for that participant. The proportion of wrong votes is calculated as the out-of-bag error55–57. There were no missing data. The data from the 60 cases with full information in this study were split into training and validation sets in a 75:25 ratio (42 vs. 14 cases). The random split and analysis were repeated 50 times and averaged for cross validation. The criteria for validation of the trained models were accuracy (the ratio of correct predictions among 14 cases) and AUC (area under the plot of sensitivity vs. 1 – specificity).
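The procedure described above (bootstrap-aggregated trees, out-of-bag error, and repeated random 75:25 splits scored by accuracy and AUC) might be sketched with scikit-learn as follows. This is not the study's code: the data are synthetic, the function name `repeated_holdout_rf` is hypothetical, stratified splitting is an assumption, and the demonstration call uses fewer repeats and trees than the 50 repeats and 1000 trees reported in the text so that it runs quickly.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split


def repeated_holdout_rf(X, y, n_repeats=50, n_trees=1000, test_size=0.25, seed=0):
    """Average accuracy, AUC, and out-of-bag error over repeated random splits."""
    accs, aucs, oob_errors = [], [], []
    for r in range(n_repeats):
        # Stratified split (an assumption) keeps both classes in each part.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, stratify=y, random_state=seed + r
        )
        rf = RandomForestClassifier(
            n_estimators=n_trees, oob_score=True, random_state=seed + r
        )
        rf.fit(X_tr, y_tr)
        accs.append(accuracy_score(y_te, rf.predict(X_te)))
        aucs.append(roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]))
        oob_errors.append(1.0 - rf.oob_score_)  # share of wrong out-of-bag votes
    return {
        "accuracy": float(np.mean(accs)),
        "auc": float(np.mean(aucs)),
        "oob_error": float(np.mean(oob_errors)),
    }


# Synthetic stand-in for a 60-participant, 15-predictor dataset.
X, y = make_classification(n_samples=60, n_features=15, n_informative=5,
                           random_state=0)
summary = repeated_holdout_rf(X, y, n_repeats=5, n_trees=200)
print(summary)
```

Averaging over repeated splits reduces the dependence of the reported metrics on any single random partition, which matters with a validation set of only about a dozen cases.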
Variable importance analysis
RF permutation importance was used to identify the major predictors of PTB/SPTB, and SHAP values were calculated to analyze the directions of the associations between the predictors and PTB/SPTB. These analytic approaches are called explainable AI, defined as “AI to identify major predictors of the dependent variable.” Explainable AI approaches currently in use include RF impurity importance, RF permutation importance, ML accuracy importance, and SHAP57. The RF permutation importance used in this study measures the overall decrease in accuracy when the data for a predictor are randomly shuffled; this decrease indicates the extent to which the model depends on that predictor. The SHAP value of a predictor measures the difference between the probabilities of PTB/SPTB predicted by ML with and without that predictor.
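Both quantities can be illustrated on a toy model with a standard-library sketch. This is a hedged illustration, not the study's analysis code: the predictor names, the risk score, and the data are hypothetical, and the model is a fixed formula rather than a trained RF. Shapley values are computed exactly by enumerating predictor orderings, with an “absent” predictor replaced by a background value; SHAP software approximates this kind of computation for real models.

```python
import random
from itertools import permutations

# Hypothetical binary predictors, for illustration only.
FEATURES = ("bmi_high", "mgi_high", "unrelated")


def risk_score(x):
    """Toy predicted probability; a stand-in for a trained model."""
    return 0.2 + 0.3 * x[0] + 0.1 * x[1]


def model(x):
    return int(risk_score(x) >= 0.45)


def accuracy(predict, rows, labels):
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, feature, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one predictor's column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        shuffled = [
            r[:feature] + (v,) + r[feature + 1:] for r, v in zip(rows, column)
        ]
        drops.append(baseline - accuracy(predict, shuffled, labels))
    return sum(drops) / n_repeats


def shapley_values(score, x, background):
    """Exact Shapley values for one case, averaged over all predictor orderings."""
    phi = [0.0] * len(x)
    orders = list(permutations(range(len(x))))
    for order in orders:
        current = list(background)  # start with every predictor "absent"
        previous = score(current)
        for j in order:
            current[j] = x[j]  # add predictor j to the coalition
            now = score(current)
            phi[j] += now - previous  # marginal contribution of predictor j
            previous = now
    return [p / len(orders) for p in phi]


# Balanced toy data: every combination of the three predictors, five times.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)] * 5
labels = [a for a, _, _ in rows]  # outcome driven by the first predictor

importances = [permutation_importance(model, rows, labels, j) for j in range(3)]
phi = shapley_values(risk_score, x=(1, 1, 1), background=(0, 0, 0))
for name, imp, p in zip(FEATURES, importances, phi):
    print(f"{name}: permutation importance = {imp:.3f}, Shapley value = {p:+.3f}")
```

In this toy setting the permutation importance ranks the first predictor highest, while the sign of each Shapley value shows the direction in which a predictor moves the predicted probability, which mirrors the division of labor between the two measures described above.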
In this study, RF permutation importance was used to derive the rankings of the predictors. SHAP plots were then created to evaluate the direction of the association between each predictor and the dependent variable, PTB/SPTB. This two-step approach is common practice in the field of artificial intelligence; linear regression or LR played the role of evaluating these directions before the SHAP approach took it over. R-Studio 1.3.959 (R-Studio Inc.: Boston, United States) and Python 3.8.8 (CreateSpace: Scotts Valley, United States), together with numpy 1.22.4, pandas 1.2.4, and sklearn 1.3.2, were employed for the analysis between May 1, 2023 and June 30, 2023.
Acknowledgements
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea, funded by the Ministry of Education (2020R1I1A1A01073305, 2020R1I1A1A01073697, and 2018R1D1A1B07046332), Korea University Medicine (K2305451, K2300851, K2227731, and K2022151), a grant from Korea University Anam Hospital (O2207711), and a Korea University grant. The funding sources did not affect the results of this study. The data analysis was technically supported by 4P Lab Co., Ltd.
Author contributions
J.P. contributed to the concept/design and data collection and drafted and critically revised the manuscript. K.L. contributed to the data analysis and interpretation and drafted the manuscript. J.H. contributed to the concept/design, data collection, and critical revision of the manuscript. K.A. contributed to the concept/design, data collection, and critical revision of the manuscript.
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Ju Sun Heo and Ki Hoon Ahn.
Contributor Information
Ju Sun Heo, Email: jesus82@snu.ac.kr.
Ki Hoon Ahn, Email: akh1220@hanmail.net.
References
- 1. Sananes, N. et al. Prediction of spontaneous preterm delivery in singleton pregnancies: Where are we and where are we going? A review of literature. J. Obstet. Gynaecol. 34, 457–461. 10.3109/01443615.2014.896325 (2014).
- 2. Pande, G. S. & Vagha, J. D. A review of the occurrence of intraventricular hemorrhage in preterm newborns and its future neurodevelopmental consequences. Cureus 15, e48968. 10.7759/cureus.48968 (2023).
- 3. Diggikar, S. et al. Retinopathy of prematurity and neurodevelopmental outcomes in preterm infants: A systematic review and meta-analysis. Front. Pediatr. 11, 1055813. 10.3389/fped.2023.1055813 (2023).
- 4. Wang, Y., Liu, S., Lu, M., Huang, T. & Huang, L. Neurodevelopmental outcomes of preterm with necrotizing enterocolitis: A systematic review and meta-analysis. Eur. J. Pediatr. 183, 3147–3158. 10.1007/s00431-024-05569-5 (2024).
- 5. Cheong, J. L. Y. & Doyle, L. W. An update on pulmonary and neurodevelopmental outcomes of bronchopulmonary dysplasia. Semin. Perinatol. 42, 478–484. 10.1053/j.semperi.2018.09.013 (2018).
- 6. Meis, P. J. et al. Factors associated with preterm birth in Cardiff, Wales. II. Indicated and spontaneous preterm birth. Am. J. Obstet. Gynecol. 173, 597–602. 10.1016/0002-9378(95)90288-0 (1995).
- 7. Iams, J. D., Romero, R., Culhane, J. F. & Goldenberg, R. L. Primary, secondary, and tertiary interventions to reduce the morbidity and mortality of preterm birth. Lancet 371, 164–175. 10.1016/S0140-6736(08)60108-7 (2008).
- 8. Giouleka, S. et al. Preterm labor: A comprehensive review of guidelines on diagnosis, management, prediction and prevention. Obstet. Gynecol. Surv. 77, 302–317. 10.1097/OGX.0000000000001023 (2022).
- 9. Suff, N., Story, L. & Shennan, A. The prediction of preterm delivery: What is new? Semin. Fetal. Neonatal. Med. 24, 27–32. 10.1016/j.siny.2018.09.006 (2019).
- 10. Wulff, C. B. et al. Transvaginal sonographic cervical length in first and second trimesters in a low-risk population: A prospective study. Ultrasound Obstet. Gynecol. 51, 604–613. 10.1002/uog.17556 (2018).
- 11. Einerson, B. D., Grobman, W. A. & Miller, E. S. Cost-effectiveness of risk-based screening for cervical length to prevent preterm birth. Am. J. Obstet. Gynecol. 215, 100.e1–100.e7. 10.1016/j.ajog.2016.01.192 (2016).
- 12. Medley, N., Poljak, B., Mammarella, S. & Alfirevic, Z. Clinical guidelines for prevention and management of preterm birth: A systematic review. BJOG 125, 1361–1369. 10.1111/1471-0528.15173 (2018).
- 13. Zhang, Y., Feng, W., Li, J., Cui, L. & Chen, Z. J. Periodontal disease and adverse neonatal outcomes: A systematic review and meta-analysis. Front. Pediatr. 10, 799740. 10.3389/fped.2022.799740 (2022).
- 14. Manrique-Corredor, E. J. et al. Maternal periodontitis and preterm birth: Systematic review and meta-analysis. Commun. Dent. Oral Epidemiol. 47, 243–251. 10.1111/cdoe.12450 (2019).
- 15. Figuero, E., Han, Y. W. & Furuichi, Y. Periodontal diseases and adverse pregnancy outcomes: Mechanisms. Periodontol. 2000 83, 175–188. 10.1111/prd.12295 (2020).
- 16. Lee, Y. L. et al. Periodontal disease and preterm delivery: A nationwide population-based cohort study of Taiwan. Sci. Rep. 12, 3297. 10.1038/s41598-022-07425-8 (2022).
- 17. de Oliveira, L. J. C. et al. Periodontal disease and preterm birth: Findings from the 2015 Pelotas birth cohort study. Oral Dis. 27, 1519–1527. 10.1111/odi.13670 (2021).
- 18. Calixto, N. R. et al. Detection of periodontal pathogens in mothers of preterm birth and/or low weight. Med. Oral. Patol. Oral. Cir. Bucal. 24, e776–e781. 10.4317/medoral.23135 (2019).
- 19. Lopez, N. J., Uribe, S. & Martinez, B. Effect of periodontal treatment on preterm birth rate: A systematic review of meta-analyses. Periodontol. 2000 67, 87–130. 10.1111/prd.12073 (2015).
- 20. Uddin, S., Khan, A., Hossain, M. E. & Moni, M. A. Comparing different supervised machine learning algorithms for disease prediction. BMC Med. Inform Decis. Mak. 19, 281. 10.1186/s12911-019-1004-8 (2019).
- 21. Song, X., Mitnitski, A., Cox, J. & Rockwood, K. Comparison of machine learning techniques with classical statistical models in predicting health outcomes. Stud. Health Technol. Inform 107, 736–740 (2004).
- 22. Sharifi-Heris, Z., Laitala, J., Airola, A., Rahmani, A. M. & Bender, M. Machine learning approach for preterm birth prediction using health records: Systematic review. JMIR Med. Inform 10, e33875. 10.2196/33875 (2022).
- 23. Alleman, B. W. et al. A proposed method to predict preterm birth using clinical data, standard maternal serum screening, and cholesterol. Am. J. Obstet. Gynecol. 208, 472.e1–472.e11. 10.1016/j.ajog.2013.03.005 (2013).
- 24. Beta, J., Akolekar, R., Ventura, W., Syngelaki, A. & Nicolaides, K. H. Prediction of spontaneous preterm delivery from maternal factors, obstetric history and placental perfusion and function at 11–13 weeks. Prenat. Diagn. 31, 75–83. 10.1002/pd.2662 (2011).
- 25. Koivu, A. & Sairanen, M. Predicting risk of stillbirth and preterm pregnancies with machine learning. Health Inf. Sci. Syst. 8, 14. 10.1007/s13755-020-00105-9 (2020).
- 26. Lee, K. S. & Ahn, K. H. Artificial neural network analysis of spontaneous preterm labor and birth and its major determinants. J. Korean Med. Sci. 34, e128. 10.3346/jkms.2019.34.e128 (2019).
- 27. Sananes, N. et al. Prediction of spontaneous preterm delivery in the first trimester of pregnancy. Eur. J. Obstet. Gynecol. Reprod. Biol. 171, 18–22. 10.1016/j.ejogrb.2013.07.042 (2013).
- 28. Weber, A. et al. Application of machine-learning to predict early spontaneous preterm birth among nulliparous non-Hispanic black and white women. Ann. Epidemiol. 28, 783–789. 10.1016/j.annepidem.2018.08.008 (2018).
- 29. Nasef, N. A., Mehta, S. & Ferguson, L. R. Susceptibility to chronic inflammation: An update. Arch. Toxicol. 91, 1131–1141. 10.1007/s00204-016-1914-5 (2017).
- 30. Dietrich, T., Kaye, E. K., Nunn, M. E., Van Dyke, T. & Garcia, R. I. Gingivitis susceptibility and its relation to periodontitis in men. J. Dent. Res. 85, 1134–1137. 10.1177/154405910608501213 (2006).
- 31. Khader, Y. S. & Ta’ani, Q. Periodontal diseases and the risk of preterm birth and low birth weight: A meta-analysis. J. Periodontol. 76, 161–165. 10.1902/jop.2005.76.2.161 (2005).
- 32. Corbella, S., Taschieri, S., Francetti, L., De Siena, F. & Del Fabbro, M. Periodontal disease as a risk factor for adverse pregnancy outcomes: A systematic review and meta-analysis of case-control studies. Odontology 100, 232–240. 10.1007/s10266-011-0036-z (2012).
- 33. Ide, M. & Papapanou, P. N. Epidemiology of association between maternal periodontal disease and adverse pregnancy outcomes–systematic review. J. Clin. Periodontol. 40(Suppl 14), S181–S194. 10.1111/jcpe.12063 (2013).
- 34. Konopka, T. & Paradowska-Stolarz, A. Periodontitis and risk of preterm birth and low birthweight–a meta-analysis. Ginekol. Pol. 83, 446–453 (2012).
- 35. Matevosyan, N. R. Periodontal disease and perinatal outcomes. Arch. Gynecol. Obstet. 283, 675–686. 10.1007/s00404-010-1774-9 (2011).
- 36. Teshome, A. & Yitayeh, A. Relationship between periodontal disease and preterm low birth weight: Systematic review. Pan. Afr. Med. J. 24, 215. 10.11604/pamj.2016.24.215.8727 (2016).
- 37. Vergnes, J. N. & Sixou, M. Preterm low birth weight and maternal periodontal status: A meta-analysis. Am. J. Obstet. Gynecol. 196, 135.e1–135.e7. 10.1016/j.ajog.2006.09.028 (2007).
- 38. Chambrone, L., Guglielmetti, M. R., Pannuti, C. M. & Chambrone, L. A. Evidence grade associating periodontitis to preterm birth and/or low birth weight: I. A systematic review of prospective cohort studies. J. Clin. Periodontol. 38, 795–808. 10.1111/j.1600-051X.2011.01755.x (2011).
- 39. Tonetti, M. S., Greenwell, H. & Kornman, K. S. Staging and grading of periodontitis: Framework and proposal of a new classification and case definition. J. Clin. Periodontol. 45(Suppl 20), S149–S161. 10.1111/jcpe.12945 (2018).
- 40. Lee, J. H. et al. Performance of a deep learning algorithm compared with radiologic interpretation for lung cancer detection on chest radiographs in a health screening population. Radiology 297, 687–696. 10.1148/radiol.2020201240 (2020).
- 41. Romero, R., Dey, S. K. & Fisher, S. J. Preterm labor: One syndrome, many causes. Science 345, 760–765. 10.1126/science.1251816 (2014).
- 42. Couceiro, J. et al. Inflammatory factors, genetic variants, and predisposition for preterm birth. Clin. Genet. 100, 357–367. 10.1111/cge.14001 (2021).
- 43. Menon, R., Dunlop, A. L., Kramer, M. R., Fortunato, S. J. & Hogue, C. J. An overview of racial disparities in preterm birth rates: Caused by infection or inflammatory response? Acta Obstet. Gynecol. Scand. 90, 1325–1331. 10.1111/j.1600-0412.2011.01135.x (2011).
- 44. Loesche, W. Dental caries and periodontitis: contrasting two infections that have medical implications. Infect Dis. Clin. North Am. 21, 471–502. 10.1016/j.idc.2007.03.006 (2007).
- 45. Li, Y. et al. Maternal preterm birth prediction in the United States: A case-control database study. BMC Pediatr. 22, 547. 10.1186/s12887-022-03591-w (2022).
- 46. Zhang, Y. et al. Establishment of a model for predicting preterm birth based on the machine learning algorithm. BMC Pregn. Childb. 23, 779. 10.1186/s12884-023-06058-7 (2023).
- 47. AlSaad, R., Malluhi, Q. & Boughorbel, S. PredictPTB: An interpretable preterm birth prediction model using attention-based recurrent neural networks. BioData Min. 15, 6. 10.1186/s13040-022-00289-8 (2022).
- 48. Tian, Y. et al. Maternal socioeconomic mobility and preterm delivery: A latent class analysis. Matern. Child Health J. 22, 1647–1658. 10.1007/s10995-018-2562-6 (2018).
- 49. Cai, C. et al. The impact of occupational shift work and working hours during pregnancy on health outcomes: A systematic review and meta-analysis. Am. J. Obstet. Gynecol. 221, 563–576. 10.1016/j.ajog.2019.06.051 (2019).
- 50. Bonzini, M., Coggon, D. & Palmer, K. T. Risk of prematurity, low birthweight and pre-eclampsia in relation to working hours and physical activities: A systematic review. Occup. Environ. Med. 64, 228–243. 10.1136/oem.2006.026872 (2007).
- 51. van Melick, M. J., van Beukering, M. D., Mol, B. W., Frings-Dresen, M. H. & Hulshof, C. T. Shift work, long working hours and preterm birth: A systematic review and meta-analysis. Int. Arch. Occup. Environ. Health 87, 835–849. 10.1007/s00420-014-0934-9 (2014).
- 52. Montenegro, D. A. et al. Oral and uro-vaginal intra-amniotic infection in women with preterm delivery: A case-control study. J. Investig. Clin. Dent. 10, e12396. 10.1111/jicd.12396 (2019).
- 53. Lobene, R. R., Weatherford, T., Ross, N. M., Lamm, R. A. & Menaker, L. A modified gingival index for use in clinical trials. Clin. Prev. Dent. 8, 3–6 (1986).
- 54. Tonetti, M. S. & Sanz, M. Implementation of the new classification of periodontal diseases: Decision-making algorithms for clinical practice and education. J. Clin. Periodontol. 46, 398–405. 10.1111/jcpe.13104 (2019).
- 55. Lee, K. S. & Kim, E. S. Explainable artificial intelligence in the early diagnosis of gastrointestinal disease. Diagnostics (Basel) 10.3390/diagnostics12112740 (2022).
- 56. Lee, K. S. & Ham, B. J. Machine learning on early diagnosis of depression. Psychiatry Investig. 19, 597–605. 10.30773/pi.2022.0075 (2022).
- 57. Lee, K. S. & Park, H. Machine learning on thyroid disease: A review. Front. Biosci. (Landmark Ed) 27, 101. 10.31083/j.fbl2703101 (2022).