Author manuscript; available in PMC: 2025 Dec 1.
Published in final edited form as: Am J Obstet Gynecol. 2024 Mar 26;231(6):649.e1–649.e19. doi: 10.1016/j.ajog.2024.03.031

Prediction of metabolic syndrome following a first pregnancy

Tetsuya KAWAKITA 1, Philip GREENLAND 2, Victoria L PEMBERTON 3, William A GROBMAN 4, Robert M SILVER 5, C Noel Bairey MERZ 6, Rebecca B MCNEIL 7, David M HAAS 8, Uma M REDDY 9, Hyagriv SIMHAN 10, George R SAADE 1
PMCID: PMC11424779  NIHMSID: NIHMS1988866  PMID: 38527600

Abstract

Background:

The prevalence of metabolic syndrome is rapidly increasing in the United States. We hypothesized that prediction models using data obtained during pregnancy can accurately predict the future development of metabolic syndrome.

Objective:

To develop machine-learning models to predict the development of metabolic syndrome using factors ascertained in nulliparous pregnant individuals.

Study Design:

This was a secondary analysis of a prospective cohort study (Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be Heart Health Study [nuMoM2b HHS]). Data were collected from October 2010 to October 2020, and analyzed from July 2023 to October 2023. Participants had in-person visits 2–7 years after their first delivery. The primary outcome was metabolic syndrome, defined by the National Cholesterol Education Program Adult Treatment Panel III criteria, which was measured within 2–7 years after delivery. A total of 127 variables that were obtained during pregnancy were evaluated. The dataset was randomly split into a training set (70%) and a test set (30%). We developed a random forest model and a lasso regression model using variables obtained during pregnancy. We compared the area under the receiver operating characteristic curves (AUROC) for both models. Using the model with the better AUROC, we developed models that included fewer variables based on SHapley Additive exPlanations values and compared them with the original model. The final model chosen would have fewer variables and non-inferior AUROC.

Results:

A total of 4225 individuals met inclusion criteria; the mean (SD) age was 27.0 (5.6) years. Of these, 754 (17.8%) developed metabolic syndrome. The AUROC of the random forest model was 0.878 (95%CI 0.846–0.909), which was higher than that of the lasso model of 0.850 (95%CI 0.811–0.888; P <0.001). Therefore, random forest models using fewer variables were developed. The random forest model with the top 3 variables (high-density lipoprotein, insulin, and high-sensitivity C-reactive protein) was chosen as the final model as it had the AUROC of 0.867 (95%CI 0.839–0.895) which was not inferior to the original model (P=0.08). The AUROC of the final model in the test set was 0.847 (95%CI 0.821–0.873). An online application of the final model was developed (https://kawakita.shinyapps.io/metabolic/).

Conclusions:

We developed a model that can accurately predict the development of metabolic syndrome within 2–7 years after delivery.

Keywords: High-density lipoprotein, High-sensitivity C-reactive protein, Insulin, Machine learning

Introduction:

Cardiovascular disease (CVD), including coronary heart disease, heart failure, stroke, and hypertension, is the leading cause of death in the United States, accounting for 22% of deaths in females.1,2 Even in young adults aged 25–44, CVD accounts for 9.8% of deaths. Metabolic syndrome is a constellation of CVD risk factors, including abdominal obesity, hyperglycemia, dyslipidemia, and hypertension.3,4 The prevalence of metabolic syndrome is rapidly increasing in the United States.5 From 2011–2012 to 2015–2016, the prevalence increased from 16.2% to 21.3% in people aged 20–39 years old and is higher in females than males among non-Hispanic Black and Hispanic individuals.5,6

The treatment of metabolic syndrome, such as lifestyle modification, can reduce the chance of developing type 2 diabetes and microvascular complications.7 To lessen the risk of CVD, prevention of metabolic syndrome is essential. The American Heart Association (AHA) includes a history of hypertensive disorders of pregnancy or gestational diabetes as risk factors for CVD.8 However, other adverse pregnancy outcomes such as preterm delivery or stillbirth are not included in their risk stratification.8 Although previous studies showed an association between adverse pregnancy outcomes (e.g., gestational diabetes, hypertensive disorders of pregnancy, preterm birth) and future development of metabolic syndrome, machine learning models that consider a wide variety of variables are not available.9–11

We hypothesized that machine learning models using variables ascertained during pregnancy could accurately predict the future development of metabolic syndrome. Accurate identification of individuals at risk for metabolic syndrome could provide opportunities for primary prevention. We sought to develop machine-learning models to predict the future development of metabolic syndrome using factors ascertained solely during pregnancy in nulliparous individuals.

Materials and Methods:

This study was a secondary analysis of data from the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-Be (nuMoM2b) from 2010 to 2013 and nuMoM2b-HHS (Heart Health Study) from 2014 to 2020 in the United States.12 Eight sites participated in these studies (Case Western Reserve University, Cleveland, Ohio; Columbia University, New York, New York; Indiana University, Indianapolis, Indiana; University of Pittsburgh, Pittsburgh, Pennsylvania; Northwestern University, Chicago, Illinois; University of California at Irvine, Irvine, California; University of Pennsylvania, Philadelphia, Pennsylvania; and University of Utah, Salt Lake City, Utah). All participating institutions obtained Institutional Review Board (IRB) approval and participants gave written informed consent.

The nuMoM2b study was a prospective cohort study in which 10,038 nulliparous individuals with singleton pregnancies were enrolled. Individuals were recruited between 6 0/7 weeks gestation and 13 6/7 weeks gestation (1st visit). Subsequently, all individuals who completed the original nuMoM2b study were invited for nuMoM2b-HHS if they had obstetrical delivery information available, agreed to be contacted for future studies, and were at least 18 years old.12 Participants were eligible for an in-person nuMoM2b-HHS visit if 2–7 years had elapsed since the index pregnancy, they were not currently pregnant, they were at least 6 months postpartum from any subsequent pregnancy, and they provided informed consent. In-person visits were conducted at the clinical site or another private location, such as a home. A total of 4508 individuals attended the nuMoM2b HHS in-person visit.

The primary outcome for this analysis was metabolic syndrome, which was defined by nuMoM2b-HHS as the presence of three or more of the following criteria: waist circumference ≥88 cm (or ≥80 cm in Asians), fasting glucose ≥100 mg/dL (or medication prescribed for diabetes), HDL cholesterol <50 mg/dL (or medication for low HDL), triglycerides ≥150 mg/dL (or medication prescribed for high triglycerides), and blood pressure >130/85 mmHg (or medication prescribed for hypertension).13 We excluded individuals with missing information regarding the five criteria for metabolic syndrome, as well as those with chronic hypertension or diabetes mellitus at the time of the index pregnancy.
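The outcome definition above amounts to counting criteria. The following is a minimal Python sketch of that rule, not the study's code. The class and field names are hypothetical, and it assumes that ">130/85 mmHg" means either the systolic or the diastolic value exceeds its cut-off.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """Follow-up measurements; field names are illustrative, not the study's."""
    waist_cm: float
    fasting_glucose_mg_dl: float
    hdl_mg_dl: float
    triglycerides_mg_dl: float
    systolic_mmhg: float
    diastolic_mmhg: float
    asian: bool = False
    on_diabetes_med: bool = False
    on_hdl_med: bool = False
    on_triglyceride_med: bool = False
    on_bp_med: bool = False


def criteria_met(v: Visit) -> int:
    """Count how many of the five criteria are present."""
    waist_cutoff = 80 if v.asian else 88
    flags = [
        v.waist_cm >= waist_cutoff,
        v.fasting_glucose_mg_dl >= 100 or v.on_diabetes_med,
        v.hdl_mg_dl < 50 or v.on_hdl_med,
        v.triglycerides_mg_dl >= 150 or v.on_triglyceride_med,
        # Assumption: either blood pressure component above its cut-off counts.
        v.systolic_mmhg > 130 or v.diastolic_mmhg > 85 or v.on_bp_med,
    ]
    return sum(flags)


def has_metabolic_syndrome(v: Visit) -> bool:
    """Three or more criteria define the outcome."""
    return criteria_met(v) >= 3
```

Medication use is treated as equivalent to meeting the corresponding laboratory or blood pressure criterion, as in the definition above.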

We examined a series of risk factors that can be ascertained during pregnancy. Risk factors were categorized into demographic, antepartum/intrapartum, social determinants of health (SDoH), and serum analytes including cholesterol, triglycerides, high-sensitivity C-reactive protein (hs-CRP), high-density lipoprotein (HDL), low-density lipoprotein (LDL), hemoglobin A1c (HbA1c), creatinine, albumin, and N-terminal pro b-type natriuretic peptide (NT-proBNP).9–11,14–16 Race and ethnicity were not included because these are socially derived labels.17 Variables with more than 40% missing observations were not considered. We used the k-nearest neighbor imputation method because this method can handle high levels of missingness up to 40%.18 A total of 127 variables were considered as risk factors. Detailed information regarding variables is presented in Supplemental Method 1. A summary of variables is presented in Supplemental Table 1.
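To illustrate the idea behind k-nearest neighbor imputation, here is a simplified pure-Python sketch. It is not the study's implementation: the function name, the choice of k, and the distance (root mean squared difference over the columns both rows have observed) are all assumptions made for illustration.

```python
import math


def knn_impute(rows, k=5):
    """Fill None entries with the mean of that column among the k nearest rows."""

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        # Average over shared columns so rows with few shared columns are comparable.
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which column j is observed.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors.sort(key=lambda d: d[0])
            nearest = [val for _, val in donors[:k]]
            if nearest:
                completed[i][j] = sum(nearest) / len(nearest)
    return completed
```

In practice the features would usually be put on a common scale before distances are computed, so that no single variable dominates the neighbor search.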

After variable selection, we conducted an exploratory factor analysis to further select variables. This analysis is an unsupervised machine-learning method that reduces a large number of variables into a smaller number by extracting the maximum common variance from all variables and putting them into a common score.19 We used the elbow method to determine the ideal number of factors. We computed a heterogeneous correlation matrix of all variables. We then used varimax rotation to calculate factor loadings, which indicate how much a factor explains a variable. We excluded variables with factor loadings of less than 0.55 in order to limit the variables included in the model to those that explain the most variance in the sample and avoid overfitting.20

Data cleaning and preprocessing were conducted. Variables that were null or recorded as “unknown” were considered missing. For the features that remained, missing values were imputed using the k-nearest neighbor imputation method. For categorical features, categories with proportions of less than 3% were combined with other categories. Features that had high absolute correlations (>0.8) with other features were removed. During the preprocessing, all categorical variables were one-hot encoded, and numerical variables were standardized using mean and standard deviation transforming them to z-scores. Features with near-zero variance were excluded.
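Several of the preprocessing steps above can be sketched with the Python standard library. This is an illustrative sketch only: the function names are hypothetical, the sample standard deviation is assumed for the z-scores, and the rule for which member of a highly correlated pair to drop (the later one) is an assumption, since the text does not specify it.

```python
import math
import statistics


def z_scores(values):
    """Standardize a numeric feature to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation
    return [(v - mean) / sd for v in values]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(columns, threshold=0.8):
    """Keep a column unless it correlates above the threshold with one already kept."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) <= threshold for other in kept.values()):
            kept[name] = values
    return kept


def one_hot(values):
    """Encode a categorical feature as one 0/1 indicator list per level."""
    levels = sorted(set(values))
    return {level: [1 if v == level else 0 for v in values] for level in levels}
```

To avoid information leaking from the test set, the means and standard deviations used for standardization are typically estimated on the training data only and then applied to the test data.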

We compared all originally specified variables between individuals who developed metabolic syndrome and those who did not in the entire dataset, using the Student t-test and Mann-Whitney U test for continuous variables, and Chi-square test for categorical variables. The dataset was then randomly split into a training set (70%) and a test set (30%).

We developed machine-learning models using random forest, an ensemble method that aggregates multiple de-correlated, unpruned decision trees generated by bootstrap sampling.21,22 To compare the performance of machine-learning models with more conventional models, we also developed models using the least absolute shrinkage and selection operator (lasso) method.23 Internal validation and hyperparameter tuning were conducted via five-fold cross-validation to measure the bias and variance of each model.24 This process was repeated 20 times to account for the uncertainty of the cross-validation procedure itself. Repeated cross-validation was performed to accomplish internal validation and avoid overfitting and underfitting.25 We selected the hyperparameters that maximized the area under the receiver operating characteristic curve (AUROC).
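The resampling scheme described above (a 70/30 split followed by five-fold cross-validation repeated 20 times on the training portion) can be sketched as follows. This is a minimal illustration with hypothetical function names and seeds. It does not stratify by outcome, and whether the study did so is not stated.

```python
import random


def split_train_test(ids, train_fraction=0.7, seed=0):
    """Randomly partition identifiers into training and test sets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def repeated_k_fold(ids, n_folds=5, n_repeats=20, seed=0):
    """Yield (repeat, fold, fit_ids, validation_ids) for every resample."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        shuffled = list(ids)
        rng.shuffle(shuffled)  # a fresh partition for each repeat
        folds = [shuffled[i::n_folds] for i in range(n_folds)]
        for fold, validation_ids in enumerate(folds):
            fit_ids = [x for other in folds if other is not validation_ids for x in other]
            yield repeat, fold, fit_ids, validation_ids
```

Each candidate hyperparameter setting would be fitted on `fit_ids` and scored by AUROC on `validation_ids`. Averaging the 100 resulting AUROCs gives the cross-validated estimate used to choose among settings.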

Although a known limitation of machine learning is its black-box nature (models do not disclose which factors are protective or harmful or the degree of effect that risk factors have on the outcome), we examined the model using post-hoc explanation tools such as SHapley Additive exPlanations (SHAP).26,27 For variable importance, SHAP values were used to visualize the contribution of each predictor in the machine learning models. We then compared AUROCs between the random forest model and the lasso model using the DeLong nonparametric test.28 We developed models using the fewest but most important variables according to SHAP values (top 10, 5, 4, 3, and 2 variables) and compared them with the original model.26,27 The final model would be the one with the fewest variables whose AUROC was not inferior to that of the original model.
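The SHAP-based selection of nested variable sets can be sketched as below. This is an illustration under stated assumptions, not the authors' code. It assumes features are ranked by mean absolute SHAP value across observations, and it takes a precomputed matrix of SHAP values as input rather than computing them. The function names are hypothetical.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Order features by mean absolute SHAP value, largest first."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance, key=importance.get, reverse=True)


def nested_feature_sets(ranked, sizes=(10, 5, 4, 3, 2)):
    """Return the top-k feature lists that would each be used to refit a model."""
    return {k: ranked[:k] for k in sizes if k <= len(ranked)}
```

Each reduced feature set would then be used to refit the model, and its AUROC compared against that of the full model.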

Following reporting recommendations, model performance was assessed using AUROC, calibration curves, and decision curves.29 AUROCs were obtained from cross-validation and from the test dataset. AUROCs were compared using DeLong nonparametric test.28 We mathematically calculated the best cut-off point for the final model using Liu’s method and calculated sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio.30 Calibration curves were plotted after creating ten groups of individuals based on the deciles of the probability of metabolic syndrome. The scatter plots of the predicted and observed metabolic syndrome were connected to form a curve. The ideal curve would be a 45-degree straight line. Decision curve analysis assessed the value of information from the model and considered the likely range of risk and benefit preferences without measuring these from patients.31 Unlike AUROCs that only measure test discrimination, decision curve analysis integrates the preferences of the patients or physicians. The net benefit was calculated as [true-positive rate - (false-positive rate*weighting factor)] with the weighting factor defined as [threshold probability/(1 - threshold probability)]. The threshold probability was defined as the minimum probability above which the patient or physician would choose to intervene (for example, ordering a screening test for metabolic syndrome or providing intervention such as diet counseling or lifestyle modification). A lower threshold probability implies that a patient or physician is more concerned about the metabolic syndrome, while a higher threshold implies that a patient or physician is concerned about the test or intervention (due to, for example, cost, inconvenience, or invasiveness). In this definition, the threshold probability may vary from patient to patient and physician to physician. Detailed information regarding decision curve analysis is available in Supplemental Method 2.
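The net benefit calculation described above can be written out in a few lines. The following sketch assumes, as is conventional in decision curve analysis, that the "true-positive rate" and "false-positive rate" in the formula are counts divided by the total number of individuals. The function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on everyone whose predicted risk meets the threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1 - threshold)  # weighting factor from the text
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of the 'treat all individuals' strategy."""
    weight = threshold / (1 - threshold)
    return prevalence - (1 - prevalence) * weight
```

Under this formulation, the net benefit of treating everyone falls to zero when the threshold probability equals the outcome prevalence, which is why a model is compared against both the "treat all" and "treat none" (net benefit of zero) strategies across a range of thresholds.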

Results:

A total of 4225 individuals (the mean [SD] age was 27.0 [5.6] years) had information regarding metabolic syndrome and had no history of chronic hypertension or pregestational diabetes (Figure 1); 754 (17.8%) developed metabolic syndrome and 3,471 (82.2%) did not. Comparison of variables and missingness are presented in Supplemental Table 1. Of 127 originally specified variables, 70 had sufficient variance in the sample and were selected by the exploratory factor analysis.

Figure 1.

Cohort diagram.

Beeswarm plots describing the top 10 most important prediction variables for metabolic syndrome were obtained from the models using all 70 variables (Figure 2). In the random forest model, variables from the first trimester of pregnancy (unless otherwise specified) that were important in the model included, in order of importance, HDL level, insulin level, hs-CRP, hip circumference, neck circumference, systolic and diastolic blood pressure in the third trimester, years lived in the United States, diastolic blood pressure in the second trimester, and systolic blood pressure. The lasso model retained similar variables as the random forest model, as well as maternal age, perceived stress scale 0–13 in the first trimester, family income 10K to 14K, and family income 150K to 199K. Unlike the random forest model, neck circumference, years lived in the United States, and systolic blood pressure in the first trimester were not in the top 10 variables.

Figure 2.

Beeswarm plots.

Each dot represents a single observation in the testing set. The x-axis represents the SHAP value, which quantifies the contribution of features to the prediction model. Mean SHAP values are presented as a bar plot. The color of the dots represents a higher or a lower feature value when compared to other observations.

Abbreviations: Dia BP (diastolic blood pressure); HDL (high-density lipoprotein); HSCRP (high-sensitivity C-reactive protein); SHAP (SHapley Additive exPlanations); Sys BP (systolic blood pressure); Years US (years lived in the United States)

AUROCs are presented in Table 1. When considering all 70 variables, the AUROC by cross-validation of the random forest model was 0.878 (95%CI 0.846–0.909), which was higher than that of the lasso model of 0.850 (95%CI 0.811–0.888; P <0.001). Therefore, random forest models using the top 10, 5, 4, 3, and 2 variables were developed. The random forest model using the top 2 variables was inferior to the original model (P =0.03); other models using the top 10, 5, 4, and 3 variables were not inferior to the original model. The random forest model with the top 3 variables (high-density lipoprotein, insulin, and high-sensitivity C-reactive protein) was chosen as the final model with an AUROC of 0.867 (95%CI 0.839–0.895). The AUROC of the final model in the test set was 0.847 (95%CI 0.821–0.873). Receiver operating characteristic curves and area under the curves are presented in Supplemental Figure 1. The sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio of the final random forest model are presented in Supplemental Table 2. The best cut-off point was 18% with sensitivity of 0.78 (95%CI 0.72–0.84), specificity of 0.76 (95%CI 0.73–0.78), positive likelihood ratio of 3.23 (95%CI 2.84–3.67), and negative likelihood ratio of 0.29 (95%CI 0.22–0.37). The calibration plot of the random forest model with the top 3 variables is presented in Supplemental Figure 2. The final model showed good fit on the calibration plot.
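The likelihood ratios above follow directly from sensitivity and specificity. The sketch below shows that relationship. It also shows a cut-off search that assumes Liu's method selects the threshold maximizing the product of sensitivity and specificity, which is how the method is commonly described. The function names are hypothetical, and this is not the study's code.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive, negative) likelihood ratios."""
    positive = sensitivity / (1 - specificity)
    negative = (1 - sensitivity) / specificity
    return positive, negative


def liu_cutoff(y_true, y_prob):
    """Return the threshold that maximizes sensitivity * specificity."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best_threshold, best_product = None, -1.0
    for threshold in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
        tn = sum(1 for y, p in zip(y_true, y_prob) if p < threshold and y == 0)
        product = (tp / positives) * (tn / negatives)
        if product > best_product:
            best_threshold, best_product = threshold, product
    return best_threshold
```

With the rounded values reported above (sensitivity 0.78, specificity 0.76), these formulas give a positive likelihood ratio of about 3.25 and a negative likelihood ratio of about 0.29. The small difference from the reported 3.23 presumably reflects rounding of the inputs.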

Table 1.

Area under the receiver operating curves.

Model | Included variables | Cross-validation AUROC (95% CI) | P-value | Test set AUROC (95% CI) | P-value
a) Random forest | All 70 variables | 0.878 (0.846–0.909) | Referent | 0.880 (0.856–0.903) | Referent
b) Lasso | All 70 variables | 0.850 (0.811–0.888) | <0.001 | 0.841 (0.811–0.870) | <0.001
c) Random forest | Top 10 | 0.879 (0.850–0.907) | 0.74 | 0.872 (0.846–0.896) | 0.62
d) Random forest | Top 5 | 0.874 (0.847–0.901) | 0.35 | 0.862 (0.836–0.887) | 0.28
e) Random forest | Top 4 | 0.875 (0.848–0.901) | 0.34 | 0.860 (0.833–0.887) | 0.25
f) Random forest | Top 3 | 0.867 (0.839–0.895) | 0.08 | 0.847 (0.821–0.873) | 0.06
g) Random forest | Top 2 | 0.851 (0.820–0.881) | 0.03 | 0.838 (0.810–0.864) | 0.02

Numbers are shown as area under the receiver operating curves and 95% confidence intervals.

Lastly, the results of the decision curve analysis are presented in Figure 3. Both random forest (all variables and top 3 variables) and lasso models (all variables) provided superior net benefit compared to treating all individuals when the threshold probability was between 0% and 80%. The 95% confidence intervals of each model overlapped with the others, suggesting no significant differences among models. Finally, we have developed an online application, to predict the future development of metabolic syndrome, using the random forest model with the top 3 variables (https://kawakita.shinyapps.io/metabolic/).

Figure 3.

Decision curve analysis.

The x-axis represents the threshold probability for metabolic syndrome. The y-axis represents the net benefit. The decision curves include the net benefit of each model as well as two clinical alternatives (not treating anyone and treating all individuals) over a specified range of threshold probabilities of outcome.

Figure 3A presents decision curves without 95% confidence intervals. Figure 3B presents decision curves with 95% confidence intervals.

Abbreviation: RF (random forest)

Comment:

Principal findings

In this secondary analysis of the nuMoM2b HHS study, we found that both random forest and lasso models using data ascertained during a first pregnancy provided an excellent prediction of metabolic syndrome 2–7 years later, with AUROCs ranging from 0.841 to 0.880. The random forest model performed better than the lasso model. Using HDL, insulin, and hs-CRP, the random forest model had a similar AUROC compared to the AUROC of the random forest model using 70 variables. Decision curve analysis suggested that the models provided superior net benefit compared to treating all individuals.

Results in the context of what is known

Previous studies showed that adverse pregnancy outcomes such as gestational diabetes, pregnancy-associated hypertension, and preterm birth were associated with future development of metabolic syndrome.9–11 For example, a secondary analysis of the nuMoM2b HHS study showed that individuals with gestational diabetes, pregnancy-associated hypertension, and preterm birth had an increased risk of metabolic syndrome compared to those without complications (75%, 49%, and 78% increase in risk, respectively).11 Our study expanded the literature by developing prediction models that consider a wide variety of variables. The final model included only three variables and still achieved a high level of test discrimination. Finally, our study had similar AUROCs compared to previous retrospective studies using machine learning models without pregnancy information that were performed outside of the United States (AUROCs 0.69–0.97).16,32–35 However, we believe our models are unique because of the prospective design of the nuMoM2b HHS dataset, and the rigorous ascertainment of the data by trained and certified research coordinators, both during pregnancy as well as the follow up 2–7 years later. In addition, previous studies did not evaluate their models using calibration plots or decision analysis curves.

Clinical implications

It is well known that adverse pregnancy outcomes (APOs) including preeclampsia, preterm birth, small for gestational age, and gestational diabetes are associated with the future development of adverse outcomes including CVD and mortality.15,36,37 Although we validated our result by splitting the HHS dataset and our random forest model had excellent prediction and calibration, our model needs to be validated externally. The integration of machine learning models into the electronic medical record (EMR) is possible and has been shown to improve clinical workflow.38 Once our model is validated, automated real-time risk prediction of metabolic syndrome could facilitate the referral to primary care providers or intervention programs including dietary counseling and lifestyle modification. Identifying high-risk individuals could provide an opportunity to reduce the incidence of metabolic syndrome. For example, in middle-aged individuals, a primary prevention program decreased the prevalence of metabolic syndrome, chronic hypertension, and abdominal obesity.39

Research implications

Some variables that we included such as HDL, insulin, or high-sensitivity CRP are not routinely obtained during pregnancy. It is well known that lipoprotein levels increase significantly during pregnancy.40–42 Median CRP levels are marginally higher during pregnancy.43 Despite the physiologic change in pregnancy, these markers were predictive of the development of metabolic syndrome, which could suggest underlying dyslipidemia or inflammatory status in individuals who developed metabolic syndrome. Pregnancy is a rare opportunity for healthy women to access the healthcare system on a regular basis.44 Therefore, if our model is validated and its use were to be shown to be of benefit, these tests could be obtained during pregnancy to help with risk prediction. In addition, our model should be validated in other datasets to ensure the prediction is accurate in other populations. For example, because our dataset only included nulliparous individuals, validation in multiparous individuals would be useful. Further, randomized controlled trials that examine whether the integration of our model into EMR would improve earlier detection and prevention of metabolic syndrome are needed. Finally, trials should be performed that examine whether interventions such as dietary counseling and lifestyle modifications, with high-risk postpartum individuals, decrease the risk of metabolic syndrome.

Strengths and Limitations:

Our study has many strengths. First, we developed a machine learning model (random forest) and a traditional statistical model (lasso model). Machine learning algorithms can address many variables, do not require expert selection, and can estimate complex interactions between variables with accuracy. In contrast to conventional statistics that are dependent on p-values (which dichotomize associations into significant and non-significant), machine learning methods are free from p-value-dependent limitations.45 In our study, the random forest model performed better than the lasso model based on AUROCs. Second, the nuMoM2b-HHS dataset was the ideal dataset to evaluate pregnancy-associated risk factors for metabolic syndrome as it is a large, diverse, US cohort that collected pregnancy outcomes, SDoH data, biological specimens, and cardiac data longitudinally during and after the pregnancy. Third, even though machine learning is known to be a “black box”, we used SHAP values to visualize the variable importance. Finally, we have developed an online application to demonstrate our random forest model using the top 3 variables (https://kawakita.shinyapps.io/metabolic/).

Our study has limitations. The original nuMoM2b cohort included 10,038 individuals, but only 4,508 attended an HHS in-person visit. However, this sample size was planned and was not the result of difficult recruitment. It is possible that individuals who agreed to participate in the nuMoM2b-HHS have different characteristics compared to those who did not. However, prior comparisons of the nuMoM2b participants who were included in HHS and those who were not showed that these differences were not large and were unlikely to cause significant biases.46 Generally, random misclassification biases the results towards the null whereas non-random misclassification can bias results in either direction. Finally, given the black-box nature of machine-learning algorithms, we cannot know why certain variables were included or excluded from the model.

Conclusion:

In this large nuMoM2b HHS cohort, we developed and internally validated models that accurately predict the development of metabolic syndrome within 2–7 years after delivery. Models that considered the top 3 variables performed as well as those that considered all variables. If validated prospectively in other settings, our models may be of use in identifying postpartum individuals at risk for metabolic syndrome to determine benefit from additional screening and medical interventions to mitigate their risk and improve outcomes.

Supplementary Material

Supplemental Figure 1. Receiver operating curves.

Supplemental Figure 2. Calibration plot of the final model

The ideal curve would be a 45 degree straight line (dashed line). The x-axis shows the predicted probability of the development of metabolic syndrome. The y-axis shows the observed proportion of individuals who developed metabolic syndrome within 2–7 years after delivery. Vertical bars present corresponding 95% confidence intervals.

Tweetable statement:

Pregnancy outcomes are useful to predict metabolic syndrome within 2–7 years after pregnancy.

AJOG at a Glance:

A: Why was this study conducted?

The prevalence of metabolic syndrome is rapidly increasing in the United States. Pregnancy is a rare opportunity for healthy women to access the healthcare system regularly. Models that use information gathered during pregnancy and accurately predict the future development of metabolic syndrome in 2–7 years would be useful.

B: What are the key findings?

Using the 3 most important variables detected by ML (high-density lipoprotein, insulin, and high-sensitivity C-reactive protein), the model had good predictive capacity. We developed an online application using the random forest model (https://kawakita.shinyapps.io/metabolic/).

C: What does this study add to what is already known?

ML models, if externally validated, can be prospectively used to identify postpartum individuals at risk for metabolic syndrome who may benefit from additional screening and interventions to mitigate their risk and improve outcomes.

Funding information

nuMoM2b specimen and data collection were supported by grant funding from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD): U10 HD063036; U10 HD063072; U10 HD063047; U10 HD063037; U10 HD063041; U10 HD063020; U10 HD063046; U10 HD063048; and U10 HD063053. In addition, support was provided by Clinical and Translational Science Institutes: UL1TR001108 and UL1TR000153. The nuMoM2b Heart Health Study was supported by cooperative agreement funding from the National Heart, Lung, and Blood Institute and the Eunice Kennedy Shriver National Institute of Child Health and Human Development: U10-HL119991; U10-HL119989; U10-HL120034; U10-HL119990; U10-HL120006; U10-HL119992; U10-HL120019; U10-HL119993; U10-HL120018, and U01HL145358; with supplemental support from the Office of Research on Women’s Health Office of Disease Prevention. Additional support was provided by the National Center for Advancing Translational Sciences through UL-1-TR000124, UL-1-TR000153, UL-1-TR000439, and UL-1-TR001108; and the Barbra Streisand Women’s Cardiovascular Research and Education Program, and the Erika J. Glazer Women’s Heart Research Initiative, Cedars-Sinai Medical Center, Los Angeles.

The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of the National Heart, Lung, and Blood Institute, the National Institutes of Health, or the US Department of Health and Human Services.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Disclosure statement of any potential conflict of interest

The authors report no conflict of interest.

Reference:

  • 1.National Center for Health Statistics. National Vital Statistics Report ; v. 70, No. 9. National Center for Health Statistics; 2021. doi: 10.15620/cdc:107021 [DOI] [Google Scholar]
  • 2.Benjamin EJ, Muntner P, Alonso A, et al. Heart Disease and Stroke Statistics—2019 Update: A Report From the American Heart Association. Circulation. 2019;139(10). doi: 10.1161/CIR.0000000000000659 [DOI] [PubMed] [Google Scholar]
  • 3.Eckel RH, Grundy SM, Zimmet PZ. The metabolic syndrome. Lancet Lond Engl. 2005;365(9468):1415–1428. doi: 10.1016/S0140-6736(05)66378-7 [DOI] [PubMed] [Google Scholar]
  • 4.Alberti KGMM, Eckel RH, Grundy SM, et al. Harmonizing the metabolic syndrome: a joint interim statement of the International Diabetes Federation Task Force on Epidemiology and Prevention; National Heart, Lung, and Blood Institute; American Heart Association; World Heart Federation; International Atherosclerosis Society; and International Association for the Study of Obesity. Circulation. 2009;120(16):1640–1645. doi: 10.1161/CIRCULATIONAHA.109.192644 [DOI] [PubMed] [Google Scholar]
  • 5.Ford ES, Giles WH, Dietz WH. Prevalence of the metabolic syndrome among US adults: findings from the third National Health and Nutrition Examination Survey. JAMA. 2002;287(3):356–359. doi: 10.1001/jama.287.3.356 [DOI] [PubMed] [Google Scholar]
  • 6.Hirode G, Wong RJ. Trends in the Prevalence of Metabolic Syndrome in the United States, 2011–2016. JAMA. 2020;323(24):2526–2528. doi: 10.1001/jama.2020.4501 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Diabetes Prevention Program Research Group. Long-term effects of lifestyle intervention or metformin on diabetes development and microvascular complications over 15-year follow-up: the Diabetes Prevention Program Outcomes Study. Lancet Diabetes Endocrinol. 2015;3(11):866–875. doi: 10.1016/S2213-8587(15)00291-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Mosca L, Benjamin EJ, Berra K, et al. Effectiveness-based guidelines for the prevention of cardiovascular disease in women--2011 update: a guideline from the American Heart Association. Circulation. 2011;123(11):1243–1262. doi: 10.1161/CIR.0b013e31820faaf8
  • 9. Shen Y, Li W, Leng J, et al. High risk of metabolic syndrome after delivery in pregnancies complicated by gestational diabetes. Diabetes Res Clin Pract. 2019;150:219–226. doi: 10.1016/j.diabres.2019.03.030
  • 10. Smith GN, Pudwell J, Walker M, Wen SW. Risk Estimation of Metabolic Syndrome at One and Three Years After a Pregnancy Complicated by Preeclampsia. J Obstet Gynaecol Can. 2012;34(9):836–841. doi: 10.1016/S1701-2163(16)35382-8
  • 11. Ehrenthal DB, McNeil RB, Crenshaw EG, et al. Adverse Pregnancy Outcomes and Future Metabolic Syndrome. J Womens Health. 2023;32(9):932–941. doi: 10.1089/jwh.2023.0026
  • 12. Haas DM, Ehrenthal DB, Koch MA, et al. Pregnancy as a Window to Future Cardiovascular Health: Design and Implementation of the nuMoM2b Heart Health Study. Am J Epidemiol. 2016;183(6):519–530. doi: 10.1093/aje/kwv309
  • 13. Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults. Executive Summary of The Third Report of The National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, And Treatment of High Blood Cholesterol In Adults (Adult Treatment Panel III). JAMA. 2001;285(19):2486–2497. doi: 10.1001/jama.285.19.2486
  • 14. Park YW, Zhu S, Palaniappan L, Heshka S, Carnethon MR, Heymsfield SB. The metabolic syndrome: prevalence and associated risk factor findings in the US population from the Third National Health and Nutrition Examination Survey, 1988–1994. Arch Intern Med. 2003;163(4):427–436. doi: 10.1001/archinte.163.4.427
  • 15. Wilson PW, Kannel WB, Silbershatz H, D’Agostino RB. Clustering of metabolic factors and coronary heart disease. Arch Intern Med. 1999;159(10):1104–1109. doi: 10.1001/archinte.159.10.1104
  • 16. Sghaireen MG, Al-Smadi Y, Al-Qerem A, et al. Machine Learning Approach for Metabolic Syndrome Diagnosis Using Explainable Data-Augmentation-Based Classification. Diagnostics. 2022;12(12):3117. doi: 10.3390/diagnostics12123117
  • 17. Vyas DA, Eisenstein LG, Jones DS. Hidden in Plain Sight - Reconsidering the Use of Race Correction in Clinical Algorithms. Malina D, ed. N Engl J Med. 2020;383(9):874–882. doi: 10.1056/NEJMms2004740
  • 18. Liao SG, Lin Y, Kang DD, et al. Missing value imputation in high-dimensional phenomic data: imputable or not, and how? BMC Bioinformatics. 2014;15(1):346. doi: 10.1186/s12859-014-0346-6
  • 19. Cudeck R. Exploratory Factor Analysis. In: Handbook of Applied Multivariate Statistics and Mathematical Modeling. Elsevier; 2000:265–296. doi: 10.1016/B978-012691360-6/50011-2
  • 20. Tabachnick BG, Fidell LS. Using Multivariate Statistics. 5th ed., Pearson international ed. [reprint]. Pearson Allyn and Bacon; 20.
  • 21. Cutler A, Cutler DR, Stevens JR. Random Forests. In: Zhang C, Ma Y, eds. Ensemble Machine Learning. Springer US; 2012:157–175. doi: 10.1007/978-1-4419-9326-7_5
  • 22. Breiman L. Random Forests. Mach Learn. 2001;45(1):5–32. doi: 10.1023/A:1010933404324
  • 23. Tibshirani R, Bien J, Friedman J, et al. Strong Rules for Discarding Predictors in Lasso-Type Problems. J R Stat Soc Ser B Stat Methodol. 2012;74(2):245–266. doi: 10.1111/j.1467-9868.2011.01004.x
  • 24. Steyerberg EW. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. Second edition. Springer; 2019.
  • 25. Rodriguez JD, Perez A, Lozano JA. Sensitivity Analysis of k-Fold Cross Validation in Prediction Error Estimation. IEEE Trans Pattern Anal Mach Intell. 2010;32(3):569–575. doi: 10.1109/TPAMI.2009.187
  • 26. Sundararajan M, Najmi A. The many Shapley values for model explanation. Published online 2019. doi: 10.48550/ARXIV.1908.08474
  • 27. Lundberg S, Lee SI. A Unified Approach to Interpreting Model Predictions. Published online 2017. doi: 10.48550/ARXIV.1705.07874
  • 28. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837–845.
  • 29. Moons KGM, Altman DG, Reitsma JB, et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162(1):W1–73. doi: 10.7326/M14-0698
  • 30. Liu A, Wu C, Schisterman EF. Nonparametric sequential evaluation of diagnostic biomarkers. Stat Med. 2008;27(10):1667–1678. doi: 10.1002/sim.3203
  • 31. Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Mak Int J Soc Med Decis Mak. 2006;26(6):565–574. doi: 10.1177/0272989X06295361
  • 32. Choe EK, Rhee H, Lee S, et al. Metabolic Syndrome Prediction Using Machine Learning Models with Genetic and Clinical Information from a Nonobese Healthy Population. Genomics Inform. 2018;16(4):e31. doi: 10.5808/GI.2018.16.4.e31
  • 33. Yang H, Yu B, Ouyang P, et al. Machine learning-aided risk prediction for metabolic syndrome based on 3 years study. Sci Rep. 2022;12(1):2248. doi: 10.1038/s41598-022-06235-2
  • 34. Daniel Tavares L, Manoel A, Henrique Rizzi Donato T, et al. Prediction of metabolic syndrome: A machine learning approach to help primary prevention. Diabetes Res Clin Pract. 2022;191:110047. doi: 10.1016/j.diabres.2022.110047
  • 35. Shimoda A, Ichikawa D, Oyama H. Prediction models to identify individuals at risk of metabolic syndrome who are unlikely to participate in a health intervention program. Int J Med Inf. 2018;111:90–99. doi: 10.1016/j.ijmedinf.2017.12.009
  • 36. Heida KY, Velthuis BK, Oudijk MA, et al. Cardiovascular disease risk in women with a history of spontaneous preterm delivery: A systematic review and meta-analysis. Eur J Prev Cardiol. 2016;23(3):253–263. doi: 10.1177/2047487314566758
  • 37. Dall’Asta A, D’Antonio F, Saccone G, et al. Cardiovascular events following pregnancy complicated by pre-eclampsia with emphasis on comparison between early- and late-onset forms: systematic review and meta-analysis. Ultrasound Obstet Gynecol. 2021;57(5):698–709. doi: 10.1002/uog.22107
  • 38. Sendak M, Gao M, Nichols M, Lin A, Balu S. Machine Learning in Health Care: A Critical Appraisal of Challenges and Opportunities. EGEMs Gener Evid Methods Improve Patient Outcomes. 2019;7(1):1. doi: 10.5334/egems.287
  • 39. Laucevičius A, Rinkūnienė E, Skujaitė A, et al. Prevalence of cardiovascular risk factors in Lithuanian middle-aged subjects participating in the primary prevention program, analysis of the period 2009–2012. Blood Press. 2015;24(1):41–47. doi: 10.3109/08037051.2014.961744
  • 40. Wiznitzer A, Mayer A, Novack V, et al. Association of lipid levels during gestation with preeclampsia and gestational diabetes mellitus: a population-based study. Am J Obstet Gynecol. 2009;201(5):482.e1–482.e8. doi: 10.1016/j.ajog.2009.05.032
  • 41. Brizzi P, Tonolo G, Esposito F, et al. Lipoprotein metabolism during normal pregnancy. Am J Obstet Gynecol. 1999;181(2):430–434. doi: 10.1016/S0002-9378(99)70574-0
  • 42. Piechota W, Staszewski A. Reference ranges of lipids and apolipoproteins in pregnancy. Eur J Obstet Gynecol Reprod Biol. 1992;45(1):27–35. doi: 10.1016/0028-2243(92)90190-A
  • 43. Belo L, Santos-Silva A, Rocha S, et al. Fluctuations in C-reactive protein concentration and neutrophil activation during normal human pregnancy. Eur J Obstet Gynecol Reprod Biol. 2005;123(1):46–51. doi: 10.1016/j.ejogrb.2005.02.022
  • 44. Saade GR. Pregnancy as a Window to Future Health. Obstet Gynecol. 2009;114(5):958–960. doi: 10.1097/AOG.0b013e3181bf5588
  • 45. Shazly SA, Trabuco EC, Ngufor CG, Famuyide AO. Introduction to Machine Learning in Obstetrics and Gynecology. Obstet Gynecol. 2022;139(4):669–679. doi: 10.1097/AOG.0000000000004706
  • 46. Haas DM, Parker CB, Marsh DJ, et al. Association of Adverse Pregnancy Outcomes With Hypertension 2 to 7 Years Postpartum. J Am Heart Assoc. 2019;8(19):e013092. doi: 10.1161/JAHA.119.013092

Supplementary Materials

Supplemental Figure 1. Receiver operating characteristic curves.

Supplemental Figure 2. Calibration plot of the final model.

The ideal curve would be a 45-degree straight line (dashed line). The x-axis shows the predicted probability of developing metabolic syndrome. The y-axis shows the observed proportion of individuals who developed metabolic syndrome within 2–7 years after delivery. Vertical bars represent the corresponding 95% confidence intervals.