Abstract
Gestational hypertension (GH), a prevalent pregnancy complication, requires early risk identification for timely intervention. This study assesses and compares traditional and placental function factors using multivariable logistic regression, random forest, and support vector machine (SVM) models to predict GH risk. We first compared the baseline information and pregnancy-related characteristics between normal pregnant women and those with GH. Then, we modeled the risk of GH based on traditional factors and placental function factors using multivariable logistic regression, random forest, and SVM combined with SHapley Additive exPlanations values. The predictive performance of each model was assessed using receiver operating characteristic curves. Among the models compared, the multivariable logistic regression model based on traditional factors achieved the highest area under the curve (AUC), demonstrating the best predictive performance. The AUC values for random forest and SVM using traditional factors were 0.730 and 0.732, respectively, but their performance was weaker when using placental function factors, with random forest having the lowest AUC (0.612). Feature importance analysis indicated that baseline systolic blood pressure, diastolic blood pressure, high-risk pregnancy, and family history were key predictive factors among traditional factors, while fasting plasma glucose, triglycerides, and C-reactive protein were the most important among placental function factors. Traditional factors best predicted GH, with logistic regression outperforming machine learning methods. While SVM and random forest showed moderate performance with traditional factors, they were less effective with placental function factors. Logistic regression should remain primary, supplemented by other methods for comprehensive prediction.
Keywords: gestational hypertension, multivariable logistic regression, random forest, risk prediction, support vector machine
1. Introduction
Gestational hypertension (GH) is a common and serious complication during pregnancy, typically emerging after 20 weeks of gestation.[1,2] It is characterized by elevated blood pressure and may be accompanied by symptoms such as proteinuria.[3] GH not only affects maternal health but can also lead to adverse pregnancy outcomes for the fetus, such as intrauterine growth restriction and preterm birth, potentially threatening the lives of both mother and child.[4] Early identification and intervention of risk factors for GH can effectively reduce the incidence of pregnancy complications and improve maternal and neonatal outcomes.[5]
In recent years, identifying risk factors and developing predictive models have become critical directions in research on GH.[6,7] Traditional risk factors, such as prepregnancy body mass index (BMI),[8] smoking,[9] and family history of hypertension,[10] have been confirmed to be closely related to the occurrence of GH. Additionally, placental function plays a crucial role during pregnancy, and abnormal placental positioning or inadequate perfusion may increase the risk of GH. However, identifying single factors often falls short in comprehensively assessing risk, and the use of multifactorial combinations and multiple analytical methods holds promise for more accurate prediction of GH.
Multivariable logistic regression, random forest, and support vector machine are 3 commonly used methods for constructing medical predictive models.[11,12] Logistic regression is suitable for evaluating linear relationships between variables, while random forest and support vector machine excel at capturing nonlinear relationships among features.[13] These methods quantify feature importance, clarifying how traditional and placental factors predict GH. While logistic regression shows strong accuracy, machine learning may better handle complex feature relationships.
This study aims to systematically assess the predictive value of traditional factors and placental function factors for GH by comparing the baseline information and pregnancy characteristics between normal pregnant women and those with GH. We constructed predictive models using multivariable logistic regression, random forest, and support vector machine, and analyzed feature importance to explore the role of different factors in predicting GH. Finally, we compared the predictive performance of each model using area under the curve (AUC) and other metrics to provide scientific evidence for the early screening and risk assessment of GH.
2. Materials and methods
2.1. Study subjects
This is a single-center study that retrospectively enrolled eligible normotensive pregnant women and those with GH who attended our hospital (a tertiary general hospital) between January 2021 and January 2023. All participants were singleton pregnancies without other serious complications or history of severe complications in previous pregnancies. Baseline information and pregnancy-related assessments were conducted in early pregnancy (12 weeks of gestation), and grouping was performed based on diagnostic criteria for GH: the GH group met the diagnostic standard of systolic blood pressure ≥ 140 mm Hg and/or diastolic blood pressure ≥ 90 mm Hg after 20 weeks of gestation, excluding other causes of hypertension. Those not meeting this diagnostic criterion were assigned to the normal pregnant group. Inclusion criteria were singleton pregnancy, first prenatal visit within 12 weeks of gestation, meeting the diagnostic criteria for GH, and signed informed consent. Exclusion criteria included preexisting chronic hypertension or antihypertensive medication use, severe complications (e.g., diabetes, cardiovascular disease, kidney disease), history of severe complications such as GH or preeclampsia in previous pregnancies, and other conditions that may influence the occurrence of GH.
2.2. Data collection
Age, smoking, alcohol consumption, family medical history, first pregnancy, and history of miscarriage were recorded through questionnaires. Prepregnancy BMI was calculated by dividing prepregnancy weight (kg) by height squared (m²). Weight and height data were measured using standard scales and stadiometers. Placental position and blood perfusion were assessed via ultrasound, categorized as “normal” or “abnormal.” Baseline systolic and diastolic blood pressure were measured with a standard electronic blood pressure monitor in early pregnancy, using the average of 3 measurements. Urinary protein levels were measured via 24-hour urine collection to reflect renal function changes in pregnant women with GH, expressed as g/24 hours. Serum uric acid levels were measured using a venous blood sample analyzed with chemical methods in the laboratory, expressed as μmol/L. C-reactive protein (CRP) levels in venous blood samples were measured using an immunoturbidimetric method in mg/L, assessing the body’s inflammatory status. Serum creatinine levels were determined by enzymatic assay in mg/dL, evaluating renal filtration function. After 8 hours of fasting, fasting plasma glucose concentration was measured with a glucose oxidase method in mg/dL, reflecting glucose metabolism. Serum triglyceride levels were measured using an enzymatic method in the laboratory, expressed in mg/dL, to evaluate lipid metabolism in pregnant women. All biochemical measurements were conducted at 12 weeks of gestation.
2.3. Statistical analysis
In this study, we employed the random forest algorithm to build a predictive model for GH. Random forest is a nonparametric machine learning algorithm based on ensemble learning, which improves the overall accuracy and robustness of the model by constructing multiple decision trees and combining their predictions.[14] The data were divided into a training set (70%), used for model training and parameter tuning, and a testing set (30%), used to evaluate predictive performance and estimate the model’s generalization ability. The number of trees was set to ntree = 500, and mtry was selected using the out-of-bag (OOB) error.[15] A for loop iterated over candidate mtry values, training a random forest model for each and recording its OOB error rate; the mtry value with the lowest OOB error rate was selected as the optimal parameter. The Gini index (mean decrease in Gini impurity) was used as the feature importance indicator.
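As a rough illustration of the Gini-based importance measure described above, the following Python sketch computes the Gini impurity of a node and the impurity decrease produced by one candidate split. A random forest accumulates such decreases per feature over all splits and trees. The function names and the toy labels are hypothetical and are not taken from the study's code.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(parent, left, right):
    """Impurity decrease when `parent` is split into `left` and `right` children."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children

# Toy node: 1 = GH, 0 = normal (made-up labels, not study data).
parent = [0, 0, 0, 1, 1, 1]
left, right = [0, 0, 0, 1], [1, 1]
decrease = gini_decrease(parent, left, right)
print(round(gini_impurity(parent), 3), round(decrease, 3))
```

A feature whose splits produce large decreases across many trees receives a high importance score, which is how rankings of the kind shown in Figure 3C and D are typically obtained.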
In this study, we used a support vector machine (SVM) model combined with SHAP (SHapley Additive exPlanations) to analyze the impact of different features on GH prediction. The primary purpose of this combination method was to leverage SVM’s strong classification ability while explaining feature importance through SHAP values, providing insights into each feature’s contribution to GH prediction.[16] The tune function with cross-validation was used for parameter tuning to obtain the best model. Once the SVM model was trained, SHAP was applied to calculate each feature’s contribution to the prediction.
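To make the idea of a per-feature contribution concrete, the sketch below computes exact Shapley values for a single instance by enumerating all feature coalitions. It is only an illustration under stated assumptions: the scoring function is a hypothetical linear stand-in for a fitted classifier's decision value, features absent from a coalition are replaced by a made-up baseline, and SHAP software generally relies on approximations rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance `x`.

    Features outside a coalition take their `baseline` value.
    The cost grows as 2^n, so this is practical only for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without_i = [x[j] if j in coalition else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

# Hypothetical scoring function (not the study's SVM).
def toy_score(features):
    sbp, dbp, high_risk = features
    return 0.02 * sbp + 0.05 * dbp + 1.9 * high_risk

x = [125.0, 85.0, 1.0]         # instance to explain (made-up values)
baseline = [110.0, 80.0, 0.0]  # reference point, e.g. a cohort average (made-up)
phi = shapley_values(toy_score, x, baseline)
print([round(v, 3) for v in phi])
```

The contributions sum to the difference between the score of the explained instance and the score of the baseline, which is the additivity property that makes SHAP values comparable across features.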
The DeLong nonparametric test was employed to compare AUC differences among prediction models, effectively evaluating the statistical significance of diagnostic performance variations between different models.
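A minimal Python sketch of a paired DeLong comparison is given here for orientation. It assumes both models are scored on the same cases, which the text does not state explicitly, and all function names and toy scores are hypothetical rather than the study's implementation.

```python
from math import erfc, sqrt

def _placements(pos, neg):
    """Return the AUC and the placement values (V10 for positives, V01 for negatives)."""
    def psi(p, q):
        return 1.0 if p > q else 0.5 if p == q else 0.0
    v10 = [sum(psi(p, q) for q in neg) / len(neg) for p in pos]
    v01 = [sum(psi(p, q) for p in pos) / len(pos) for q in neg]
    return sum(v10) / len(pos), v10, v01

def _cov(a, b):
    """Sample covariance with an n - 1 denominator."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)

def delong_test(labels, scores_a, scores_b):
    """Two-sided DeLong test for two models scored on the same cases."""
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    m, n = len(pos_idx), len(neg_idx)
    auc_a, v10_a, v01_a = _placements(
        [scores_a[i] for i in pos_idx], [scores_a[i] for i in neg_idx])
    auc_b, v10_b, v01_b = _placements(
        [scores_b[i] for i in pos_idx], [scores_b[i] for i in neg_idx])
    # Variance of the AUC difference from the two placement-value components.
    var_pos = _cov(v10_a, v10_a) + _cov(v10_b, v10_b) - 2 * _cov(v10_a, v10_b)
    var_neg = _cov(v01_a, v01_a) + _cov(v01_b, v01_b) - 2 * _cov(v01_a, v01_b)
    var = var_pos / m + var_neg / n
    if var <= 0:
        z = 0.0 if auc_a == auc_b else float("inf")
    else:
        z = (auc_a - auc_b) / sqrt(var)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return auc_a, auc_b, z, p_value

# Toy data: 1 = GH, 0 = normal; scores are made up.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
scores_b = [0.6, 0.4, 0.7, 0.3, 0.5, 0.8, 0.2, 0.1]
auc_a, auc_b, z, p_value = delong_test(labels, scores_a, scores_b)
print(round(auc_a, 4), round(auc_b, 4), round(z, 3), round(p_value, 3))
```

Because the two AUCs are estimated on the same cases, the covariance terms matter: ignoring them would usually overstate the variance of the difference and make the test conservative.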
The Mann–Whitney U test was used to compare baseline information, pregnancy-related characteristics, and biochemical indicators between the normal pregnant and GH groups. For categorical variables, chi-square tests were used for intergroup comparisons. Continuous data were expressed as median (minimum–maximum), and categorical data were presented as frequency (percentage). Using a significance level of 0.05, statistical power of 0.8, and a moderate effect size as criteria, our sample size calculation indicated that a minimum of 64 cases per group was required for continuous data, while at least 88 total cases were needed for categorical data. The sample size enrolled in this study far exceeded these minimum requirements, ensuring robust statistical power.
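The reported minimum sample sizes are consistent with standard normal-approximation formulas, as the following Python sketch suggests. It assumes that "moderate effect size" refers to Cohen's conventional values (d = 0.5 for a comparison of two means, w = 0.3 for a chi-square test with 1 degree of freedom) and that a common small-sample correction is added to the two-means formula. The text names neither the formula nor the software, so these choices and the function names are assumptions.

```python
from math import ceil
from statistics import NormalDist

def z_sum_squared(alpha=0.05, power=0.80):
    """(z_{1-alpha/2} + z_{power})^2 for a two-sided test."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2

def n_per_group_two_means(d, alpha=0.05, power=0.80, small_sample_correction=True):
    """Cases per group to detect a standardized mean difference d."""
    n = 2 * z_sum_squared(alpha, power) / d ** 2
    if small_sample_correction:
        # Adds z_{1-alpha/2}^2 / 4 to approximate the t-based result.
        n += NormalDist().inv_cdf(1 - alpha / 2) ** 2 / 4
    return ceil(n)

def n_total_chi_square_df1(w, alpha=0.05, power=0.80):
    """Total cases for a chi-square test with 1 degree of freedom and effect size w."""
    return ceil(z_sum_squared(alpha, power) / w ** 2)

continuous_n = n_per_group_two_means(0.5)   # assumed Cohen's d = 0.5
categorical_n = n_total_chi_square_df1(0.3)  # assumed Cohen's w = 0.3
print(continuous_n, categorical_n)
```

Under these assumptions the sketch gives 64 cases per group and 88 cases in total, matching the figures quoted above; the enrolled 225 women per group comfortably exceed both.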
3. Results
3.1. Differences in baseline information, pregnancy-related characteristics, and biochemical indicators between normal pregnant women and women with GH
The results showed significant differences between the 2 groups in prepregnancy BMI (P = 7.17E‐06), smoking history (P = .020133), family history of hypertension (P = 3.86E‐09), family history of cardiovascular disease (P = 1.53E‐12), high-risk pregnancy (P = 3.42E‐16), first pregnancy (P = 8.07E‐17), and history of miscarriage (P = 7.16E‐12). Placental position (P = .037) and placental perfusion (P = .000525) also differed significantly between the groups. These findings suggest that the above factors may be associated with the occurrence of GH. In contrast, there were no significant differences between the groups in age (P = .225), alcohol consumption (P = .140547), or family history of diabetes (P = .108566). Prepregnancy BMI and the proportions of women with a smoking history, family history of hypertension, family history of cardiovascular disease, high-risk pregnancy, first pregnancy, and history of miscarriage were significantly higher in the GH group than in the normal pregnancy group. The rates of abnormal placental position and insufficient placental perfusion were also significantly higher in the GH group (Table 1).
Table 1.
Baseline information and pregnancy characteristics of normal pregnant women and pregnant women with gestational hypertension.
Characteristic | All patients (n = 450) | Normal (n = 225) | Hypertensive disorders of pregnancy (n = 225) | P-value |
---|---|---|---|---|
Age (years) | 30 (21–40) | 30 (21–39) | 31 (21–40) | .225 |
Prepregnancy BMI (kg/m²) | 25.46 (18.52–31.46) | 24.28 (18.52–31.10) | 26.31 (18.58–31.46) | 7.17E‐06 |
Smoking | | | | .0201326 |
Yes | 10 (2.22%) | 1 (0.44%) | 9 (4.00%) | |
No | 440 (97.78%) | 224 (99.56%) | 216 (96.00%) | |
Drinking | | | | .1405473 |
Yes | 24 (5.33%) | 16 (7.11%) | 8 (3.56%) | |
No | 426 (94.67%) | 209 (92.89%) | 217 (96.44%) | |
Family history of hypertension | | | | 3.86E‐09 |
Yes | 100 (22.22%) | 24 (10.67%) | 76 (33.78%) | |
No | 350 (77.78%) | 201 (89.33%) | 149 (66.22%) | |
Family history of diabetes | | | | .1085659 |
Yes | 80 (17.78%) | 33 (14.67%) | 47 (20.89%) | |
No | 370 (82.22%) | 192 (85.33%) | 178 (79.11%) | |
Family history of cardiovascular disease | | | | 1.53E‐12 |
Yes | 65 (14.44%) | 7 (3.11%) | 58 (25.78%) | |
No | 385 (85.56%) | 218 (96.89%) | 167 (74.22%) | |
High-risk pregnancy | | | | 3.42E‐16 |
Yes | 114 (25.33%) | 20 (8.89%) | 94 (41.78%) | |
No | 336 (74.67%) | 205 (91.11%) | 131 (58.22%) | |
First pregnancy | | | | 8.07E‐17 |
Yes | 240 (53.33%) | 76 (33.78%) | 164 (72.89%) | |
No | 210 (46.67%) | 149 (66.22%) | 61 (27.11%) | |
History of miscarriage | | | | 7.16E‐12 |
Yes | 215 (47.78%) | 71 (31.56%) | 144 (64.00%) | |
No | 235 (52.22%) | 154 (68.44%) | 81 (36.00%) | |
Placental position | | | | 3.70E‐02 |
Normal | 430 (95.56%) | 220 (97.78%) | 210 (93.33%) | |
Abnormal | 20 (4.44%) | 5 (2.22%) | 15 (6.67%) | |
Placental perfusion | | | | .000524568 |
Normal perfusion | 408 (90.67%) | 215 (95.56%) | 193 (85.78%) | |
Insufficient perfusion | 42 (9.33%) | 10 (4.44%) | 32 (14.22%) | |
Among women with GH, several biochemical and clinical indicators were significantly higher in early pregnancy compared to the normal pregnancy group. Specifically, baseline systolic blood pressure (P = .000824), baseline diastolic blood pressure (P = .000161), 24-hour urinary protein (P = .000848), uric acid (P = 7.78E‐06), C-reactive protein (P = .000424), fasting plasma glucose (P = 7.89E‐05), and triglycerides (P = .000694) were all significantly elevated in the GH group. These significant differences in biochemical and clinical indicators suggest that they may play an important role in the development of GH. There was no significant difference in creatinine levels between the 2 groups (P = .47) (Table 2).
Table 2.
Biochemical and clinical markers in early pregnancy of normal pregnant women and pregnant women with gestational hypertension.
Marker | All patients (n = 450) | Normal (n = 225) | Hypertensive disorders of pregnancy (n = 225) | P-value |
---|---|---|---|---|
Baseline systolic blood pressure (mm Hg) | 111.47 (90.04–134.93) | 108.73 (90.04–134.93) | 116.10 (90.30–134.90) | .000824 |
Baseline diastolic blood pressure (mm Hg) | 80.50 (70.03–89.95) | 78.76 (70.16–89.95) | 82.41 (70.03–89.78) | .000161 |
Proteinuria (g/24 hours) | 0.13 (0.00–0.27) | 0.1 (0.00–0.17) | 0.16 (0.00–0.27) | .000848 |
Uric acid | 3.80 (2.15–5.31) | 3.49 (2.15–4.51) | 4.06 (2.17–5.30) | 7.78E‐06 |
C-reactive protein (mg/L) | 8.58 (3.04–14.98) | 6.91 (3.04–10.98) | 10.26 (5.06–14.98) | .000424 |
Creatinine (mg/dL) | 0.72 (0.47–0.96) | 0.72 (0.47–0.86) | 0.72 (0.47–0.96) | .47 |
Fasting plasma glucose (mg/dL) | 80.31 (68.61–93.39) | 79.31 (68.61–93.39) | 81.97 (68.61–93.38) | 7.89E‐05 |
Triglycerides (mg/dL) | 127.39 (52.76–204.19) | 112.22 (52.76–203.64) | 143.52 (52.97–204.19) | .000694 |
These differing indicators were divided into 2 categories: traditional factors and placental function factors. Traditional factors included prepregnancy BMI, smoking history, family history of hypertension, family history of cardiovascular disease, high-risk pregnancy, baseline systolic blood pressure, baseline diastolic blood pressure, first pregnancy, and history of miscarriage. These factors are static variables related to the individual’s long-term health status and reflect the health foundation and potential pregnancy complications risk before conception. Placental function factors included placental position, placental perfusion, urinary protein, uric acid, C-reactive protein, fasting plasma glucose, and triglycerides. These factors are related to the structure and function of the placenta, which serves as a critical link between mother and fetus, facilitating nutrient exchange, gas exchange, and waste elimination. The health of the placenta directly impacts pregnancy progression.
3.2. Multivariable logistic regression analysis of the impact of traditional and placental function factors on GH
Overall, the AUC value of the traditional factors group (0.831) was higher than that of the placental function group (0.733) (Table 4) (Fig. 1A and B), indicating that traditional factors are more accurate in predicting GH. Among the traditional factors, prepregnancy BMI (P < .001, odds ratio [OR] = 1.136), family history of hypertension (P < .001, OR = 3.751), family history of cardiovascular disease (P < .001, OR = 8.296), high-risk pregnancy (P < .001, OR = 6.943), baseline systolic blood pressure (P = .044, OR = 1.018), and baseline diastolic blood pressure (P = .014, OR = 1.051) were all significantly associated with an increased risk of GH, whereas smoking did not reach significance (P = .054). Among the placental function factors, abnormal placental perfusion (P = .009, OR = 2.873), proteinuria (P = .002, OR = 68.506), uric acid (P < .001, OR = 1.556), C-reactive protein (P = .005, OR = 1.108), fasting plasma glucose (P < .001, OR = 1.057), and triglycerides (P = .001, OR = 1.008) also showed significant positive associations with GH, whereas placental position did not (P = .126) (Table 3).
Table 4.
ROC analysis of the multivariable logistic regression, random forest, and support vector machine models.
Model | AUC | AUC_CI_Lower | AUC_CI_Upper | Best_Threshold | Youden | Sensitivity | Specificity |
---|---|---|---|---|---|---|---|
Traditional factor-logistic | 0.831 | 0.793 | 0.868 | 0.431 | 0.529 | 0.760 | 0.769 |
Traditional factor-RF | 0.730 | 0.655 | 0.805 | 1.500 | 0.460 | 0.633 | 0.827 |
Traditional factor-SVM | 0.732 | 0.671 | 0.820 | 1.500 | 0.492 | 0.721 | 0.770 |
Placental function-logistic | 0.733 | 0.687 | 0.779 | 0.484 | 0.364 | 0.684 | 0.680 |
Placental function-RF | 0.612 | 0.529 | 0.694 | 1.500 | 0.223 | 0.656 | 0.568 |
Placental function-SVM | 0.643 | 0.565 | 0.722 | 1.500 | 0.287 | 0.507 | 0.779 |
AUC = area under the curve, CI = confidence interval, RF = random forest, ROC = receiver operating characteristic, SVM = support vector machine.
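The Youden column in Table 4 follows from the definition J = sensitivity + specificity − 1, and the best threshold is the cutoff that maximizes J. The Python sketch below checks the reported rows against that definition and shows how a best threshold could be located from raw scores. The helper names and the toy scores are hypothetical, and the rule "positive when score ≥ threshold" is an assumption about the convention used.

```python
def roc_points(labels, scores):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    A case is called positive when its score is >= the threshold.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        points.append((t, tp / pos, tn / neg))
    return points

def best_youden(labels, scores):
    """Threshold maximizing J = sensitivity + specificity - 1."""
    return max(roc_points(labels, scores), key=lambda p: p[1] + p[2] - 1)

# (sensitivity, specificity, reported Youden index) copied from Table 4.
reported = {
    "Traditional factor-logistic": (0.760, 0.769, 0.529),
    "Traditional factor-RF": (0.633, 0.827, 0.460),
    "Traditional factor-SVM": (0.721, 0.770, 0.492),
    "Placental function-logistic": (0.684, 0.680, 0.364),
    "Placental function-RF": (0.656, 0.568, 0.223),
    "Placental function-SVM": (0.507, 0.779, 0.287),
}
recomputed = {name: sens + spec - 1 for name, (sens, spec, _) in reported.items()}
for name, j in recomputed.items():
    print(name, round(j, 3), reported[name][2])

# Toy example with made-up scores: 1 = GH, 0 = normal.
toy_labels = [0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8]
threshold, sensitivity, specificity = best_youden(toy_labels, toy_scores)
print(threshold, sensitivity, specificity)
```

The recomputed values agree with the reported Youden indices to within rounding of the third decimal place.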
Figure 1.
(A) ROC curve of multivariable logistic regression for traditional factors. (B) ROC curve of multivariable logistic regression for placental function factors. ROC = receiver operating characteristic.
Table 3.
Multivariable logistic regression analysis of traditional and placental function factors associated with gestational hypertension.
Traditional factor-logistic | Estimate | Std. error | Statistic | P value | OR | CI_lower | CI_upper |
---|---|---|---|---|---|---|---|
PRE-BMI | 0.127 | 0.033 | 3.877 | <.001 | 1.136 | 1.065 | 1.211 |
Smoking | 2.153 | 1.120 | 1.923 | .054 | 8.614 | 0.960 | 77.325 |
Family history of hypertension | 1.322 | 0.296 | 4.468 | <.001 | 3.751 | 2.100 | 6.698 |
Family history of cardiovascular disease | 2.116 | 0.445 | 4.753 | <.001 | 8.296 | 3.467 | 19.853 |
High risk pregnancy | 1.938 | 0.296 | 6.536 | <.001 | 6.943 | 3.883 | 12.413 |
PRE-SBP | 0.018 | 0.009 | 2.010 | .044 | 1.018 | 1.000 | 1.036 |
PRE-DBP | 0.050 | 0.020 | 2.461 | .014 | 1.051 | 1.010 | 1.093 |
Placental function-logistic | |||||||
Placental position | 0.892 | 0.583 | 1.532 | .126 | 2.441 | 0.779 | 7.647 |
Placental perfusion | 1.055 | 0.405 | 2.606 | .009 | 2.873 | 1.299 | 6.352 |
Proteinuria | 4.227 | 1.352 | 3.127 | .002 | 68.506 | 4.844 | 968.864 |
Uric acid | 0.442 | 0.115 | 3.844 | <.001 | 1.556 | 1.242 | 1.949 |
CRP | 0.102 | 0.036 | 2.811 | .005 | 1.108 | 1.031 | 1.190 |
FPG | 0.055 | 0.015 | 3.672 | <.001 | 1.057 | 1.026 | 1.089 |
Triglycerides | 0.008 | 0.002 | 3.257 | .001 | 1.008 | 1.003 | 1.012 |
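BMI = body mass index, CI = confidence interval, CRP = C-reactive protein, DBP = diastolic blood pressure, FPG = fasting plasma glucose, OR = odds ratio, PRE = baseline/prepregnancy, SBP = systolic blood pressure.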
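The OR and confidence limit columns of Table 3 can be reproduced from the Estimate and Std. error columns by exponentiating the coefficient and its interval. The Python sketch below assumes 95% Wald intervals, which the table does not state explicitly but which match the reported values to within rounding; the function name is hypothetical.

```python
from math import exp
from statistics import NormalDist

def odds_ratio_with_ci(estimate, std_error, level=0.95):
    """Odds ratio and Wald confidence limits from a logistic regression coefficient."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (
        exp(estimate),
        exp(estimate - z * std_error),
        exp(estimate + z * std_error),
    )

# PRE-BMI row of Table 3: Estimate = 0.127, Std. error = 0.033.
bmi_or, bmi_lower, bmi_upper = odds_ratio_with_ci(0.127, 0.033)
print(round(bmi_or, 3), round(bmi_lower, 3), round(bmi_upper, 3))

# Proteinuria row: a large coefficient per 1 g/24 hours gives a very large OR.
prot_or, prot_lower, prot_upper = odds_ratio_with_ci(4.227, 1.352)
print(round(prot_or, 1), round(prot_lower, 1), round(prot_upper, 1))
```

Each OR is expressed per one-unit increase of the predictor, which is why proteinuria (measured in g/24 hours, with an observed range of only 0.00 to 0.27) has an OR of about 68 and a very wide interval, while triglycerides (mg/dL) have an OR close to 1.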
3.3. Random forest analysis of the impact of traditional and placental function factors on GH
We then conducted random forest analyses on these 2 types of factors. In parameter selection, the mtry value for the traditional factors group was set to 2, and for the placental function group, it was set to 3 (Fig. 2A and B). The AUC value for the traditional factors group (0.730) was higher than that of the placental function group (0.612) (Table 4) (Fig. 3A and B), indicating that traditional factors provide a more accurate prediction for GH. Among traditional factors, baseline diastolic blood pressure, prepregnancy BMI, and baseline systolic blood pressure contributed the most to predicting GH, while high-risk pregnancy and family history of cardiovascular disease also showed high importance. In contrast, family history of hypertension and smoking had relatively less impact (Fig. 3C). On the other hand, for placental function factors, fasting plasma glucose, C-reactive protein, and triglycerides were the most important predictive indicators, with uric acid and proteinuria also showing significant influence. Placental perfusion and placental position had relatively lower importance (Fig. 3D). These results indicate that different traditional and placental function factors play varying roles in the risk prediction of GH, providing a basis for comprehensive risk assessment.
Figure 2.
(A) Selection of the optimal mtry using the OOB error method in the traditional factors group. (B) Selection of the optimal mtry using the OOB error method in the placental function group. OOB = out-of-bag.
Figure 3.
(A) ROC curve of random forest for traditional factors. (B) ROC curve of random forest for placental function factors. (C) Gini index for traditional factors. (D) Gini index for placental function factors. ROC = receiver operating characteristic.
3.4. Analysis of the impact of traditional and placental function factors on GH using support vector machine combined with SHAP
Similarly, the AUC for traditional factors (0.732) was higher than that for placental function factors (0.643) (Table 4) (Fig. 4A and B). In detail, among traditional factors, high-risk pregnancy, family history of cardiovascular disease and hypertension, baseline systolic blood pressure, and baseline diastolic blood pressure were the most influential predictive factors, while smoking and prepregnancy BMI had relatively lower importance (Fig. 4C). Conversely, among placental function factors, triglycerides, fasting plasma glucose, proteinuria, and C-reactive protein were more important for predicting the risk of GH, whereas placental position, placental perfusion, and uric acid had lesser influence (Fig. 4D). This indicates that different types of features contribute differently to the risk prediction of GH, suggesting that multiple factors should be considered for a comprehensive assessment.
Figure 4.
(A) ROC curve of support vector machine for the traditional factors group. (B) ROC curve of support vector machine for the placental function group. (C) SHAP values for the traditional factors group. (D) SHAP values for the placental function group. ROC = receiver operating characteristic, SHAP = SHapley Additive exPlanations.
We then compared the AUC of the multivariable logistic regression model with that of the 2 machine learning models for both traditional risk factors and placental function factors. The DeLong test showed that the multivariable logistic regression model yielded significantly higher AUC values than both machine learning approaches for traditional as well as placental function factors (P < .01).
4. Discussion
This study compared baseline and pregnancy characteristics between normal and hypertensive pregnancies, evaluating traditional (prepregnancy BMI, family history, high-risk pregnancy) and placental function factors (fasting glucose, CRP, triglycerides) for GH prediction using machine learning. Significant group differences emerged across all factors, with varying predictive contributions to hypertension development.
The multivariable logistic regression analysis revealed superior predictive performance of traditional factors compared to placental function factors for GH, as evidenced by higher AUC values. Specifically, prepregnancy BMI, family history of hypertension and cardiovascular disease, and high-risk pregnancy status emerged as significant risk determinants, potentially indicating etiological involvement in disease development. While demonstrating comparatively lower predictive capacity, placental function parameters (including abnormal placental perfusion, proteinuria, uric acid levels, and C-reactive protein) showed consistent positive associations with GH, substantiating their pathophysiological relevance.
In random forest and SVM analyses, the predictive performance of traditional factors was again superior to that of placental function factors. The AUC values in the traditional factors group for both random forest and SVM were higher than those in the placental function group. This difference may be due to the stability of traditional factors as static variables, while placental function factors may change as pregnancy progresses, thus having a lesser impact on predicting GH. However, there were variations in feature importance across different models. Among traditional factors, baseline diastolic blood pressure, prepregnancy BMI, and baseline systolic blood pressure contributed the most to predicting GH, with high-risk pregnancy and family history of cardiovascular disease also showing high importance. In comparison, among placental function factors, fasting glucose, C-reactive protein, and triglycerides showed high feature importance in both random forest and SVM, suggesting that certain indicators among traditional factors provide stable predictive power, while placental function factors may depend more on dynamic changes during pregnancy.
In the traditional factors group, baseline diastolic blood pressure, baseline systolic blood pressure, high-risk pregnancy, and family history of cardiovascular disease showed high importance in both random forest and SVM models, whereas prepregnancy BMI ranked highly only in the random forest model. This indicates that these shared features have strong independence and stability in predicting GH risk, effectively distinguishing hypertensive risk across different algorithms. Pregnant women with higher prepregnancy BMI are more likely to develop hypertension during pregnancy, as high BMI represents increased fat accumulation, adding cardiovascular strain, potentially leading to hypertension. Baseline systolic and diastolic blood pressure are direct reflections of blood pressure levels. Family history of hypertension and high-risk pregnancy further amplify the influence on blood pressure, suggesting that individuals with preexisting blood pressure or cardiovascular issues are at greater risk of GH. High-risk pregnancy and family history of cardiovascular disease represent individual baseline health status and genetic risk. Particularly among pregnant women with a family history of hypertension, the likelihood of developing GH is significantly higher, underscoring the importance of family background in predicting GH.
In the placental function factors group, fasting glucose, C-reactive protein, and triglycerides consistently showed high importance across both models, suggesting that placental function and metabolic status also have a significant impact on the occurrence of GH. Fasting glucose levels reflect the metabolic and glucose control conditions of pregnant women, and glucose abnormalities may lead to inadequate placental perfusion and placental dysfunction, thereby increasing the risk of GH.[17,18] CRP is an inflammation marker, and inflammation during pregnancy may affect vascular and placental function.[19] Elevated CRP indicates a potential inflammatory state in the body, which may increase vascular resistance and blood pressure, potentially triggering GH.[20] Triglycerides, as a lipid indicator, reflect the lipid metabolic status of pregnant women.[21] Abnormal lipid metabolism can affect placental blood flow and oxygen supply, leading to placental ischemia, which further increases the risk of GH.[22] The importance of these placental function factors underscores the biological significance of the placenta in the occurrence of GH, suggesting that monitoring metabolic and inflammatory status during pregnancy may be valuable.
In terms of overall model comparison, the multivariable logistic regression model performed best for both traditional and placental function factors, achieving the highest AUC (0.831 with traditional factors) and the highest Youden index (0.529), that is, the best balance of sensitivity and specificity. In contrast, SVM and random forest showed significantly lower AUC values than logistic regression with traditional factors (0.732 and 0.730, respectively; DeLong test, P < .01), and their performance with placental function factors was more limited. This finding suggests that the logistic regression model, due to its stability and adaptability, provides higher predictive accuracy for different types of risk factors. Random forest performed worst with placental function factors (AUC = 0.612), possibly due to the complexity of the biological significance of these factors and their dynamic changes throughout pregnancy.
The findings of this study provide data support for the clinical assessment of GH risk. Traditional static factors, such as prepregnancy BMI and high-risk pregnancy, are more valuable as early predictors of GH and should therefore be prioritized in prenatal screening (a conclusion well-established by previous research). The innovative aspect of our study lies in identifying biochemical markers related to placental function (such as fasting blood glucose, C-reactive protein, and triglycerides) that are also closely associated with the onset of GH, offering new perspectives and potential intervention targets for disease prediction. These findings not only confirm the critical role of metabolic and inflammatory factors in the pathogenesis of GH but, more importantly, provide clinically actionable screening indicator combinations. Particularly in resource-limited primary healthcare settings, early pregnancy screening incorporating these easily obtainable biochemical markers alongside basic ultrasound assessments could enable more accurate risk stratification, highlighting significant potential for broader clinical application.
This study has several limitations that warrant acknowledgment. First, as a single-center retrospective investigation, it carries inherent risks of selection bias in data acquisition. Second, the lack of external validation cohorts restricts the generalizability of our findings across diverse clinical settings. Furthermore, by utilizing only first-trimester data points, we were unable to incorporate the dynamic temporal changes of placental functional markers into our predictive models. Future studies should consider implementing longitudinal data collection through multicenter prospective randomized controlled trials coupled with external validation cohorts to enhance the robustness of these conclusions.
5. Conclusion
Overall, this study shows that traditional factors have greater stability and accuracy in predicting GH. The logistic regression model performs best in multivariable prediction of GH and is recommended for early screening and risk assessment based on traditional factors. At the same time, incorporating placental function factors to enrich monitoring in early pregnancy management will help further improve the predictive level for GH.
Author contributions
Conceptualization: Xiaoyuan Sun.
Data curation: Xiaoyuan Sun.
Formal analysis: Xiaoyuan Sun, Fengmei Wang.
Investigation: Fengmei Wang, Chen Huang.
Methodology: Fengmei Wang, Chen Huang, Na Li.
Project administration: Chen Huang, Na Li.
Resources: Na Li.
Software: Na Li.
Supervision: Yue Yang.
Validation: Yue Yang.
Visualization: Yue Yang.
Writing – original draft: Xiaoyuan Sun, Fengmei Wang, Chen Huang, Na Li.
Writing – review & editing: Yue Yang.
Abbreviations:
- AUC = area under the curve
- BMI = body mass index
- CRP = C-reactive protein
- GH = gestational hypertension
- OOB = out-of-bag
- OR = odds ratio
- SHAP = SHapley Additive exPlanations
- SVM = support vector machine
This paper has been reviewed by relevant departments of our hospital, such as the Science and Education Department, Medical Department and Ethics Committee of Civil Aviation General Hospital. The research content involved in this research meets the requirements of medical ethics and academic morality of our hospital, and the research content is reasonable, the risks are controllable, and there are no violations. The relevant research carried out is in line with the safe, standardized and true scientific research guiding principles, and in line with the requirements of the clinical research ethics code.
The authors have no funding and conflicts of interest to disclose.
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
How to cite this article: Sun X, Wang F, Huang C, Li N, Yang Y. Analysis of risk factors and early prediction model construction for gestational hypertension. Medicine 2025;104:33(e43869).
Contributor Information
Xiaoyuan Sun, Email: 18500616020@163.com.
Fengmei Wang, Email: wfm1982619@163.com.
Chen Huang, Email: kaoyan_0911@163.com.
Na Li, Email: lina0319@126.com.