
This is a preprint.

It has not yet been peer reviewed by a journal.


medRxiv
[Preprint]. 2023 Feb 7:2023.02.03.23285385. [Version 1] doi: 10.1101/2023.02.03.23285385

Prediction of Preeclampsia from Clinical and Genetic Risk Factors in Early and Late Pregnancy Using Machine Learning and Polygenic Risk Scores

Vesela P Kovacheva, Braden W Eberhard, Raphael Y Cohen, Matthew Maher, Richa Saxena, Kathryn J Gray
PMCID: PMC9934723  PMID: 36798188

ABSTRACT

Background

Preeclampsia, a pregnancy-specific condition characterized by new-onset hypertension after 20 weeks' gestation, is a leading cause of maternal and neonatal morbidity and mortality. Predictive tools that identify the individuals at highest risk are needed.

Methods

We identified a cohort of N=1,125 pregnant individuals who delivered between May 2015 and May 2022 at Mass General Brigham hospitals and had both electronic health record (EHR) data and linked genetic data available. Using clinical EHR data and systolic blood pressure polygenic risk scores (SBP PRS) derived from a large genome-wide association study, we developed machine learning (xgboost) and linear regression models to predict preeclampsia risk.
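
For readers unfamiliar with polygenic risk scores, the following is a minimal sketch of the most common construction: a weighted sum of risk-allele dosages, with weights taken from genome-wide association study effect sizes. The abstract does not state how the SBP PRS was computed, so the function name, the variants, and the numbers below are hypothetical and for illustration only.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (illustrative, not the authors' pipeline).

    dosages      -- per-variant count of the effect allele (0, 1, or 2)
    effect_sizes -- per-variant GWAS effect estimate, in the same order
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical example: three variants with made-up effect sizes.
example_effects = [0.5, -0.2, 0.1]
example_dosages = [2, 1, 0]
example_score = polygenic_risk_score(example_dosages, example_effects)
```

In practice, scores of this kind are usually standardized within the cohort before being grouped into quartiles or passed to a model as a feature.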

Results

Pregnant individuals with an SBP PRS in the top quartile had higher blood pressures throughout pregnancy than those with an SBP PRS in the lowest quartile. In the first trimester, the most predictive model was xgboost, with an area under the curve (AUC) of 0.73. Adding the SBP PRS improved the performance of the linear regression model only, from an AUC of 0.70 to 0.71; the predictive power of the other models remained unchanged. In late pregnancy, with data obtained up to the delivery admission, the best-performing model was xgboost using clinical variables, which achieved an AUC of 0.91.
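
The AUC values reported above can be read as the probability that a randomly chosen individual who developed preeclampsia receives a higher predicted risk than a randomly chosen individual who did not. The sketch below computes that quantity directly from pairwise comparisons; it is an illustration of the metric with made-up labels and scores, not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    labels -- 1 for a case, 0 for a control
    scores -- predicted risk, in the same order as labels
    Ties between a case and a control count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: two cases and two controls.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 case-control pairs ranked correctly
```

An AUC of 0.5 corresponds to ranking no better than chance, and 1.0 to perfect separation of cases from controls.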

Conclusions

Integrating clinical and genetic factors into predictive models can inform personalized preeclampsia risk assessment and achieve higher predictive power than current practice. In the future, personalized tools can be implemented in clinical practice to identify high-risk patients for preventative therapies and timely intervention, reducing adverse maternal and neonatal outcomes.

Full Text Availability

The license terms selected by the author(s) for this preprint version do not permit archiving in PMC. The full text is available from the preprint server.


Articles from medRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
