BMC Medicine. 2024 Oct 25;22:495. doi: 10.1186/s12916-024-03717-0

Accelerating adverse pregnancy outcomes research amidst rising medication use: parallel retrospective cohort analyses for signal prioritization

Yeon Mi Hwang 1,2,3, Samantha N Piekos 1, Alison G Paquette 1,4,5, Qi Wei 1, Nathan D Price 1,6,7, Leroy Hood 1,6,8, Jennifer J Hadlock 1,9
PMCID: PMC11520034  PMID: 39456023

Abstract

Background

Pregnant women are significantly underrepresented in clinical trials, yet most of them take medication during pregnancy despite the limited safety data. The objective of this study was to characterize medication use during pregnancy and apply propensity score matching method at scale on patient records to accelerate and prioritize the drug effect signal detection associated with the risk of preterm birth and other adverse pregnancy outcomes.

Methods

This was a retrospective study on continuously enrolled women who delivered live births between 2013/01/01 and 2022/12/31 (n = 365,075) at Providence St. Joseph Health. Our exposures of interest were all outpatient medications prescribed during pregnancy. We limited our analyses to medication that met the minimal sample size (n = 600). The primary outcome of interest was preterm birth. Secondary outcomes of interest were small for gestational age and low birth weight. We used propensity score matching at scale to evaluate the risk of these adverse pregnancy outcomes associated with drug exposure after adjusting for demographics, pregnancy characteristics, and comorbidities.

Results

The total medication prescription rate increased from 58.5 to 75.3% (P < 0.0001) from 2013 to 2022. The prevalence rate of preterm birth was 7.7%. One hundred seventy-five out of 1329 prenatally prescribed outpatient medications met the minimum sample size. We identified 58 medications statistically significantly associated with the risk of preterm birth (P ≤ 0.1; decreased: 12, increased: 46).

Conclusions

Most pregnant women are prescribed medication during pregnancy. This highlights the need to utilize existing real-world data to enhance our knowledge of the safety of medications in pregnancy. We narrowed down from 1329 to 58 medications that showed statistically significant association with the risk of preterm birth even after addressing numerous covariates through propensity score matching. This data-driven approach demonstrated that multiple testable hypotheses in pregnancy pharmacology can be prioritized at scale and lays the foundation for application in other pregnancy outcomes.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12916-024-03717-0.

Keywords: Drug-related side effects and adverse reactions, Pharmacoepidemiology, Data mining, Pregnancy, Preterm birth

Background

Pharmaceutical companies primarily rely on pre-marketing randomized clinical trials to prevent and assess adverse drug reactions (ADRs). Despite the effort, studies conducted on inpatient populations estimated a serious ADR incidence rate of 6.7% (N ≥ 2,216,000) with a fatality rate of 0.32% (N ≥ 106,000), placing ADRs as the fourth leading cause of morbidity and mortality in the United States (US) health care systems [1, 2]. The incidence rate of ADRs in outpatients is harder to estimate, with studies suggesting rates ranging from 3 to 38% [3–8]. The estimated incidence rates of ADRs in both inpatients and outpatients demonstrate that unintended drug responses are common and should be expected.

Pre-marketing randomized clinical trials rarely include pregnant women unless the product targets pregnant women [9]. Consequently, drug efficacy, safety, and dosages are determined based on data from men and non-pregnant women. While pregnant women are the most underrepresented population in clinical trials, they can experience some of the most complex medical situations. During pregnancy, women undergo marked physiological changes that significantly alter the pharmacokinetics and pharmacodynamics of drugs [10]. Therefore, current knowledge in pharmacology should not be directly applied to pregnant women, as inadequate information on the pharmacology of pregnancy exposes them to a high likelihood of experiencing unintended drug responses.

Despite the limited availability of safety information regarding medication use during pregnancy, many pregnant women continue to use medications. Overall, 93.9% of pregnant women take at least one medication (over-the-counter or prescribed) and use an average of 4.2 medications during pregnancy. Usage of prescribed medication by pregnant women varies globally, ranging from 23 to 96%, with the US in 2008 reporting a usage rate of 49% among pregnant women [11]. Given the prevalent use of medication among pregnant women and the challenges associated with conducting prospective clinical trials on this population, leveraging real-world data has emerged as a promising supplemental approach to investigate the effects of drugs during pregnancy. Electronic health records (EHRs) are particularly suitable candidates among these real-world data sources. EHRs contain rich and comprehensive information about patients' longitudinal health profiles, potential confounding factors, and prescription history. Research on novel EHR-based methodologies is ongoing, not only for ADRs [12, 13] but also for drug repositioning [14] and drug-drug interactions [15, 16].

However, despite these advancements in data-driven healthcare research, the field of pregnancy research has been slower in adopting these novel methodologies. There is a pressing need to establish a foundational framework for systematically investigating drug responses during pregnancy at scale using real-world data. Such an effort is crucial, as it can lead to the generation of testable hypotheses related to drug effects on pregnancy outcomes, both positive and negative. Furthermore, identifying medications that are not associated with an increased risk of adverse pregnancy outcomes can provide valuable insights into drug safety during pregnancy. Here, we selected preterm birth (PTB) as our primary outcome of interest. PTB, defined as birth occurring before 37 weeks of gestation, significantly contributes to perinatal morbidity and mortality in developed countries. PTB accounts for 75% of perinatal mortality cases and over half of long-term morbidity [17].

We employed a large-scale propensity score matching approach on patient records to expedite the generation and prioritization of testable hypotheses related to the risk of PTB. We hypothesized that there are not-yet-characterized pharmacological signals linking medications to the risk of PTB. Beyond hypothesis generation, we investigated a few detected drug effect signals using traditional pharmacoepidemiology methods.

Methods

Study design, setting, and participants

Providence St. Joseph Health (PSJH) is an integrated US community healthcare system that provides care in urban and rural settings across seven states: Alaska, California, Montana, Oregon, New Mexico, Texas, and Washington. We used PSJH records of pregnant patients who delivered live infants from January 1, 2013, through December 31, 2022 (n = 543,408). We excluded multiple pregnancies and deliveries with gestational age (GA) of less than 20 weeks (n = 516,881 remaining). GA was limited to 20 weeks or greater because ascertainment bias is particularly high for EHR data earlier in pregnancy. This study population may be biased toward lower-risk pregnancy cases because high-risk pregnancy cases are often transferred to third-level academic medical centers. We excluded patients who were not continuously enrolled from 180 days before the start of pregnancy (last menstrual period, LMP) to the time of delivery (n = 365,075 remaining). Our definition of continuous enrollment was at least one encounter 180 days before LMP and one encounter on or after the delivery date. This was done to partially address surveillance bias.

All procedures were reviewed and approved by the Institutional Review Board at PSJH through expedited review on 11-04-2020 (study number STUDY2020000196). Consent was waived because disclosure of protected health information for the study involved no more than minimal risk to the privacy of individuals.

Variables

Exposures

We mapped all prescription records during pregnancy to RxNorm codes based on ingredients. We split the cohort into exposed and unexposed groups for individual medication ingredients. Women with medication orders that overlapped with at least 1 day of pregnancy were considered exposed. While medication records may not accurately capture actual medication exposure, there is generally strong agreement between the medication use reported by pregnant women and their medication records [18]. We excluded medications whose exposed group did not reach a minimum sample size of 600. This minimum sample size was calculated using Epitools [19], with the following parameters: PTB prevalence rate of the PSJH maternity cohort (7.7%), assumed relative risk (1.55), desired level of confidence (0.9), and desired power for the detection of significant difference (0.8). The calculated minimum sample size was 582, which we rounded up to 600.
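The sample size reasoning above can be illustrated with a minimal sketch. It assumes the textbook normal-approximation formula for comparing two proportions with equal group sizes and a two-sided test; the function name `cohort_sample_size` is hypothetical, and Epitools may apply a different variant or different options, so this sketch is not expected to reproduce the reported value of 582 exactly. With the parameters stated in the text, it yields roughly 610 per group, which is the same order of magnitude.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cohort_sample_size(p_unexposed, relative_risk, confidence=0.9, power=0.8):
    """Per-group sample size for detecting a given relative risk.

    Assumed formula: two-sided normal approximation for two proportions
    with equal group sizes. This is an illustration, not the Epitools code.
    """
    p1 = p_unexposed
    p2 = p_unexposed * relative_risk  # expected risk in the exposed group
    alpha = 1.0 - confidence
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2.0
    numerator = (
        z_alpha * sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Parameters taken from the text: PTB prevalence 7.7%, relative risk 1.55,
# confidence 0.9, power 0.8.
n_per_group = cohort_sample_size(0.077, 1.55, confidence=0.9, power=0.8)
print(n_per_group)
```

A larger assumed relative risk lowers the required sample size, which is why the choice of 1.55 directly determines how many medications clear the threshold.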

Outcomes

The primary outcome of interest was PTB, defined as gestational age (GA) at birth < 37 weeks. Secondary outcomes were low birth weight (LBW; birth weight < 2500 g) and small for gestational age (SGA; birth weight < 10th percentile for gestational age).
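As a small illustration of these definitions, the sketch below flags the three outcomes for a single live birth. The function name `classify_outcomes` and the `sga_cutoff_g` argument are hypothetical: the text does not state which reference chart supplies the 10th-percentile weight, so the caller is assumed to look it up for the infant's gestational age.

```python
def classify_outcomes(ga_days, birth_weight_g, sga_cutoff_g):
    """Return PTB, LBW, and SGA flags for one live birth.

    sga_cutoff_g is the 10th-percentile birth weight for this infant's
    gestational age, taken from a reference chart chosen by the caller
    (an assumption; the source does not name the chart).
    """
    return {
        "ptb": ga_days < 37 * 7,           # GA < 37 weeks
        "lbw": birth_weight_g < 2500,      # birth weight < 2500 g
        "sga": birth_weight_g < sga_cutoff_g,
    }


# Illustrative values only: 36 weeks 3 days, 2400 g, cutoff of 2300 g.
example = classify_outcomes(ga_days=255, birth_weight_g=2400, sga_cutoff_g=2300)
print(example)
```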

Covariates

We extracted maternal, pre-pregnancy, and prenatal characteristics and comorbidities information from EHR data. Pregnancy and maternal characteristics were collected during prenatal care or at time of delivery. These included parity, preterm history, delivery year, fetal sex, age at LMP, race, ethnicity, insurance status, pregravid body mass index (BMI), smoking, and use of alcohol and illegal drugs (Additional File 1: Table S1).

We conducted a parallel analysis with three different sets of covariates. First, we conducted propensity score matching with the covariates without comorbidities. Second, we addressed pre-pregnancy comorbidities based on the obstetric comorbidity index [20]. Selected comorbidities were renal diseases, chronic lung diseases, diabetes, leukemia, pneumonia, sepsis, cardiovascular diseases, sickle cell diseases, anemia, cystic fibrosis, and asthma (Additional File 1: Table S2). A similar practice was followed in an at-scale study conducted by the Sentinel System, one of the US Food and Drug Administration (FDA) efforts in surveillance of medical products [21]. We excluded comorbidities specific to the prenatal period, such as gestational diabetes; the obstetric comorbidity index is designed to assess the mortality risk at delivery. Third, we selected the 25 most common comorbidities before and during the pregnancy (Additional File 1: Method S1). We acknowledge prenatal comorbidities do not satisfy the covariate definition. However, this study aims to explore the usefulness of EHRs and generate hypotheses. To do so, we employed an exploratory approach beyond the conventional one.

Analysis

Descriptive statistics

We described the source population on maternal characteristics, outcomes, and covariates. The descriptive statistics are presented in Additional File 1: Table S3. We characterized the prescription rate within the PSJH pregnant population in Fig. 1. We used the chi-square test and linear regression to evaluate the difference in prescription rate across categorical variables and continuous variables, respectively. Age distribution of this source population is described in Additional File 1: Fig. S2. Prescription patterns from 2013 to 2022 based on their ingredient and ATC classification categories are displayed in Additional File 1: Fig. S3.

Fig. 1.


Overall prescription rate of PSJH pregnant population. A Plot shows the increase in total prescription rate from 2013 to 2022. The total medication prescription rate increased from 58.5 to 75.3% from 2013 to 2022 (P < 0.0001). The inpatient prescription rate increased from 29.3 to 32.4% (P = 0.2). In contrast, outpatient medication prescriptions increased from 50.5 to 70.1% (P < 0.0001). We evaluated the increase in prescription rate using linear regression. B Plot shows the total prescription rate across age groups (P < 0.0001). We evaluated the decrease in prescription rate across ages using linear regression. C Plot shows the difference in prescription rates between insurance groups (P < 0.0001). We evaluated the difference in prescription rate across categorical variables using the chi-square test. D Plot shows the difference in prescription rate across race groups (P < 0.0001). We evaluated the difference in prescription rate across categorical variables using the chi-square test. E Plot shows the increase in prescription rate based on comorbidity count. We evaluated the increase in prescription rate across comorbidity count using linear regression.

Propensity score matching

We calculated the risk ratio of PTB, LBW, and SGA for individual outpatient medications that reached the minimum sample size. For each medication, the unexposed group was matched to the exposed group on the covariates. Missing values for parity and preterm history were imputed as 0. Missing values for pregravid BMI were imputed to be in the normal BMI category. The remaining covariates were imputed using the median. We used propensity score matching to account for covariates associated with adverse pregnancy outcomes. Compared to other propensity score methods and covariate adjustment methods, propensity score matching provided exceptional covariate balance across most circumstances [22]. An unsupervised learning model with k-nearest neighbors (k = 1), as recommended by a prior study [23], was used to match with replacement on the propensity logit metric. We evaluated the covariate balance using the average standardized mean difference and excluded medication ingredients whose average standardized mean difference after matching was 0.2 or higher. We categorized medications with statistically significant associations into three categories based on their indication: preterm labor (PTL) or PTB, PTB risk factors, and infection (Additional File 1: Table S4 [24–96]). Here, we considered an association with a P value below 0.1 statistically significant. This is not a conventional practice in hypothesis-testing studies, but our study is hypothesis-generating. We are suggesting potential hypotheses for researchers to investigate further.
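The matching and balance-checking steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes propensity scores have already been estimated by some model (for example, a logistic regression on the covariates), and the helper names and all numeric values are hypothetical. It shows 1-nearest-neighbor matching with replacement on the propensity logit, followed by a standardized mean difference (SMD) for one covariate, where values closer to zero indicate better balance.

```python
from math import log, sqrt
from statistics import mean, variance


def logit(p):
    """Log-odds of a propensity score strictly between 0 and 1."""
    return log(p / (1.0 - p))


def match_with_replacement(exposed_ps, unexposed_ps):
    """For each exposed patient, return the index of the unexposed patient
    with the closest propensity logit (k = 1). The same unexposed patient
    may be selected more than once."""
    unexposed_logits = [logit(p) for p in unexposed_ps]
    matches = []
    for p in exposed_ps:
        target = logit(p)
        nearest = min(
            range(len(unexposed_logits)),
            key=lambda j: abs(unexposed_logits[j] - target),
        )
        matches.append(nearest)
    return matches


def standardized_mean_difference(x_exposed, x_matched):
    """Absolute SMD for one covariate, using the pooled standard deviation."""
    pooled_sd = sqrt((variance(x_exposed) + variance(x_matched)) / 2.0)
    if pooled_sd == 0:
        return 0.0
    return abs(mean(x_exposed) - mean(x_matched)) / pooled_sd


# Illustrative (made-up) propensity scores and maternal ages.
exposed_ps = [0.30, 0.31, 0.60]
unexposed_ps = [0.10, 0.28, 0.55, 0.62, 0.90]
exposed_age = [30, 32, 35]
unexposed_age = [25, 31, 34, 36, 40]

matches = match_with_replacement(exposed_ps, unexposed_ps)
matched_age = [unexposed_age[j] for j in matches]
smd = standardized_mean_difference(exposed_age, matched_age)
print(matches, round(smd, 3))
```

In this toy example the second unexposed patient is matched twice, which is what matching with replacement permits. Averaging the SMD over all covariates gives the single balance figure that the text compares against 0.2.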

Validation

We selected sertraline, acyclovir, and ferrous sulfate for further investigation. They had relatively large exposure groups and were statistically significant in an analysis adjusted for pre-pregnancy/prenatal common diagnoses. Details of the method are described in Additional File 1: Method S2, Method S3, and Method S4.

Sertraline

Sertraline is a selective serotonin reuptake inhibitor (SSRI) antidepressant. Depression is a treatable disease and a risk factor for PTB [97]. We limited our analytic population to patients who had any depression diagnosis before pregnancy (Additional File 1: Table S2). Within this population, we evaluated the risk of PTB in patients exposed to sertraline. Additionally, we assessed the likelihood of delivering preterm in patients exposed to any SSRI within the same analytic population.

Acyclovir

Acyclovir is a treatment for herpes virus infection, including shingles, chicken pox, and genital herpes. Genital herpes is a sexually transmitted disease, which is a risk factor for PTB. We determined the indication of treatment based on dosage [98]. According to the CDC treatment guideline [99], acyclovir is recommended starting at GA 36 weeks to suppress the reactivation of genital herpes among pregnant women. Patients who adhered to this treatment guideline delivered after 36 weeks of gestation, potentially introducing selection bias and leading to a lowered risk of PTB. Initially, we characterized the number of patients who initiated their prescription at 36 weeks of gestation to assess the proportion of patients following this CDC treatment guideline. Subsequently, we examined the likelihood of PTB in patients exposed to acyclovir before 36 weeks of gestation. We replicated the analysis on a subsample of patients who had indications of genital herpes (Additional File 1: Table S2). We then evaluated the risk of PTB among patients exposed to acyclovir or valacyclovir (oral prodrug of acyclovir) before 36 weeks of gestation.

Ferrous sulfate

Ferrous sulfate is a treatment for iron deficiency anemia, which is a risk factor for PTB. We assessed the impact of ferrous sulfate in the anemic group. The anemic group was determined based on the presence of iron-deficiency anemia diagnosis within 180 days before LMP to LMP (Additional File 1: Table S2).

Results

Descriptive statistics

Our analytic population comprised 365,075 continuously enrolled patients with singleton pregnancies. This population was enriched with people who were aged 30–34 (32.7%), White or Caucasian race (63.2%), non-Hispanic or Latino ethnicity (77.2%), Medicaid/Medicare insured, living in metropolitan areas (84.2%), and delivered in 2022 (12.4%). Median maternal age increased from 30.3 to 31.5 (P < 0.0001) from 2013 to 2022. The proportion of women aged 35 or older increased from 20.8 to 27.0% from 2013 to 2022 (Additional File 1: Fig. S2). The mean gestational age at delivery was 275.0 days. The average prevalence rates of PTB, SGA, and LBW were 7.7%, 12.1%, and 5.4%, respectively (Additional File 1: Table S3).

The total medication prescription rate increased from 58.5 to 75.3% from 2013 to 2022 (P < 0.0001). The inpatient prescription rate slightly increased from 29.3 to 32.4% (P = 0.2). In contrast, outpatient medication prescriptions increased from 50.5 to 70.1% (P < 0.0001) (Fig. 1). The maternal age group of 18–24 had the highest prescription rate (73.0%), and mothers aged 40 or older had the lowest (63.4%; P < 0.0001). The Medicare/Medicaid insurance group had a higher prescription rate (72.2%) than the commercial insurance group (62.6%; P < 0.0001). Among race groups, pregnant women who reported Black or African American race had the highest prescription rate (77.3%), and Asian women had the lowest (64.4%; P < 0.0001). The prescription rate increased as the number of comorbidities increased. This trend was similar for both pre-pregnancy and prenatal comorbidities. Approximately half of the pregnant people with no pre-pregnancy/prenatal problem diagnosis had a prescription during pregnancy. Patients with eleven or more pre-pregnancy/prenatal problem diagnoses had a prescription rate higher than 90% (Fig. 1).

Propensity score matching

From the initial pool of 1329 medications, 175 prenatally prescribed medications met the minimum sample size. None of the medications had an effect size (average standardized mean difference) above 0.2 after matching in any of the three analyses. When we adjusted for baseline characteristics, pregnancy, and maternal characteristics, we identified a total of 76 (RR < 1: 20, RR ≥ 1: 56) associations with a P value below 0.1. When we additionally accounted for pre-pregnancy comorbidities in the obstetric comorbidity index, we observed 75 (RR < 1: 5, RR ≥ 1: 70) medications associated with the risk of PTB with statistical significance. Finally, we identified 58 (RR < 1: 12, RR ≥ 1: 46) medications associated with the risk of PTB in an analysis adjusted for common diagnoses during the pre-pregnancy and prenatal period (Fig. 2, Fig. 3, and Table 1). Statistically significant associations were categorized into three categories based on their indication: PTL/PTB, risk factor of PTB, and infection (Additional File 1: Table S4) [24–96]. Forty-three medications had indications categorized into at least one category. Four medications fell into the category of PTL/PTB indication. Thirty-two medications had indications that were risk factors for PTB. Nine medications were prescribed in case of infections, including bacterial, fungal, and viral.
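To make the quantities in these results concrete, the following sketch computes a risk ratio, a Wald confidence interval on the log scale, and a two-sided P value from a 2×2 table. The function name `risk_ratio` and the example counts are hypothetical, and the interval level is left as a parameter because the source does not state it here. The simple standard error used below also ignores the reuse of unexposed patients under matching with replacement, so it should be read as an approximation rather than as the authors' estimator.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed,
               confidence=0.95):
    """Risk ratio with a log-scale Wald interval and two-sided P value."""
    rr = (events_exposed / n_exposed) / (events_unexposed / n_unexposed)
    # Standard error of log(RR) for two independent binomial samples.
    se = sqrt(1 / events_exposed - 1 / n_exposed
              + 1 / events_unexposed - 1 / n_unexposed)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = exp(log(rr) - z_crit * se)
    upper = exp(log(rr) + z_crit * se)
    z = log(rr) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rr, lower, upper, p_value


# Made-up counts: 66 preterm births among 600 exposed patients versus
# 46 among 600 matched unexposed patients.
rr, lower, upper, p_value = risk_ratio(66, 600, 46, 600)
print(round(rr, 2), round(lower, 2), round(upper, 2), round(p_value, 3))
```

An association like this one would fall in the RR ≥ 1 group and would be counted as statistically significant under the P < 0.1 threshold used in this study.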

Fig. 2.


Forest plots of association between medication and risk of PTB. The left plot shows the baseline analysis that adjusted for maternal and pregnancy characteristics. The center plot shows the analysis that adjusted for maternal and pregnancy characteristics and pre-pregnancy comorbidities from the obstetric comorbidity index. The right plot shows the analysis that adjusted for maternal and pregnancy characteristics and prenatal/pre-pregnancy common comorbidities. The Y-axis lists the medications that met the minimum sample size in descending order of RR in the center plot. This figure is summarized in Table 1. RR, confidence interval, and p-values are reported in Additional File 2. Supplementary Data

Fig. 3.


Forest plot of statistically significant association with risk of PTB. This plot is a forest plot of analysis that adjusted for maternal and pregnancy characteristics and prenatal/pre-pregnancy common comorbidities. Selection of prenatal and pre-pregnancy common comorbidities is described in Additional File 1: Method S1. RR, confidence interval, and p-values are reported in Additional File 2. Supplementary Data

Table 1.

Summary of associations based on statistical significance and relative risk

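| Direction | P value | Baseline (maternal/pregnancy characteristics) | Baseline + pre-pregnancy comorbidity index | Baseline + pre-pregnancy/prenatal common comorbidity |
|---|---|---|---|---|
| RR < 1 | P ≤ 0.05 | 4 | 3 | 8 |
| RR < 1 | 0.05 < P ≤ 0.1 | 16 | 2 | 4 |
| RR < 1 | P > 0.1 | 23 | 26 | 36 |
| RR ≥ 1 | P ≤ 0.05 | 49 | 55 | 42 |
| RR ≥ 1 | 0.05 < P ≤ 0.1 | 7 | 15 | 4 |
| RR ≥ 1 | P > 0.1 | 77 | 75 | 82 |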

Validation

Sertraline

There were 29,352 patients who had a depression diagnosis before the pregnancy. Of these, 3214 were exposed to sertraline and 5910 to any SSRI. They were 1.28 times [1.14, 1.45] and 1.16 times [1.05, 1.28] more likely to deliver preterm, respectively, than patients without exposure.

Acyclovir

The majority of patients (58.8%; 4947 out of 8420) who had prenatal acyclovir exposure started their prescription at or after 36 weeks of gestation. Those exposed to acyclovir before 36 weeks of pregnancy had a 1.77 times [1.52, 2.07] higher likelihood of delivering preterm compared to patients without prenatal acyclovir exposure. However, within the subsample of patients diagnosed with genital herpes, we did not observe an elevated risk of PTB (OR = 1.19 [0.94, 1.50]). Additionally, there was no observed association between exposure to acyclovir and elevated risk of PTB when comparing individuals exposed to acyclovir and valacyclovir before 36 weeks of gestation (OR = 0.86 [0.74, 1.00]).

Ferrous sulfate

There were 774 patients diagnosed with iron deficiency anemia within a 180-day pre-pregnancy period. We observed 294 patients with a prescription for ferrous sulfate during pregnancy. Our analysis revealed no association between the prescription of ferrous sulfate and the risk of PTB (OR = 0.85 [0.48, 1.50]).

Discussion

To our knowledge, this was the first study to use propensity score matching at scale on EHR to generate and prioritize testable hypotheses on drug effects associated with the risk of PTB. We retrospectively assessed 365,075 people who were continuously enrolled in PSJH. The majority of women took prescribed medication during pregnancy. From an initial pool of 1762 medications, we narrowed it down to 172 medications for hypothesis evaluation. Three of these detected signals were selected based on their relatively large exposure groups and statistical significance in an analysis adjusted for pre-pregnancy/prenatal common diagnoses. We investigated the heightened likelihood of delivering preterm associated with sertraline exposure and decreased chance related to acyclovir and ferrous sulfate exposures. We confirmed the association with sertraline, while the associations with acyclovir and ferrous sulfate lacked statistical significance.

We employed propensity score matching at scale on EHR and produced hypotheses for 172 medications. Among them, 57 of 172 medications had statistically significant associations with the risk of PTB. There were a few prior studies with similar aims. Maric et al. (2019) [100] assessed administrative claims data on 2,538,255 deliveries and identified 863 medications with statistically significant associations. Their number of signals (statistically significant associations) far exceeds ours because their sample size was greater and they did not eliminate medications that did not meet a minimum sample size. That study had only 5 medications with an odds ratio below 1, whereas we had 12. Another effort to establish a framework to detect drug effect signals in maternal–fetal medicine was conducted by the Sentinel working group. The Sentinel Initiative, led by the US Food and Drug Administration (FDA), has created novel methods to evaluate the safety of approved medical products, including medications, vaccines, and devices. They used propensity score matching and tree-based scan statistics on Medicaid data to discover infant outcomes associated with prenatal cephalosporin exposure in the first trimester [21]. That study took a different approach, focusing on multiple outcomes and a single exposure, whereas our study assessed a single outcome across multiple exposures. Both prior studies utilized claims data, whereas we used EHR.

The majority of patients were prescribed medications during pregnancy. This finding corresponds to observations in earlier studies. According to Mitchell et al. (2011) [11], in the US, 48% of women were exposed to prescribed medication during pregnancy in 2008. A systematic review of peer-reviewed literature from 1989 to 2010 in developed countries reported that 27% to 93% of pregnant women used prescription drugs, depending on the country [101]. In our study, we observed an increase in prenatal prescription rate from 58.5% to 75.3% from 2013 to 2022. This rate is higher than the prescription rate reported in 2008. The discrepancy in the prescription rate for medication during pregnancy may be attributable to a gradual increase in usage. Mitchell et al. in 2011 described an incremental increase in the use of prescription medications by 60% from 1986 to 2008. We also observed a rise in the prescription rate from 2013 to 2022. As discussed in the introduction, the common use of medication during pregnancy underscores the necessity to promote pharmacology research in pregnant women and to leverage already generated real-world data to expand our understanding of the efficacy and safety of medications during pregnancy.

Surprisingly, the prescription rate decreased as the maternal age increased. We first assumed that the increased prescription rate over the study period was attributable to increasing maternal age based on observation from Mitchell et al. (2011) [11]. Indeed, the median maternal age increased, and the proportion of women aged 30 or older gradually increased over our study period. However, the prescription rate did not correlate with the maternal age, contrary to our speculation. Women in the oldest age group, 40 or older, had the lowest prescription rate, whereas women aged 24 or younger had the highest. The major difference between our study and Mitchell et al. (2011) [11] is the study period and population. Their observation was based on 5008 deliveries from 1997 to 2003 in the US. In contrast, our observation was relatively similar to that from a more recent study [102] on 2.3 million patients who delivered live births from 2000 to 2019. Their study reported that the most prevalent medication exposures (antibacterial agents, antiemetics, and contraceptives) during pregnancy had a prescription pattern across age groups similar to our study. The younger group, 24 or younger, had much higher prescription rates for these medications than those of the older group, 35 or older. This counterintuitive finding regarding the decreasing prescription rate with age might be partly explained by the overall increase in prescription rates over time. Historically, there may have been more reluctance to take medication, but this hesitancy may be diminishing in recent years. Yet, older mothers may still retain some of this reluctance.

We employed traditional pharmacoepidemiology methods to evaluate the detected drug effect signals for sertraline, acyclovir, and ferrous sulfate. Specifically, we focused on assessing the association between sertraline/SSRI exposure and an increased risk of PTB among patients who had an onset of depression before the pregnancy. We further validated this in a separate study [103]. We confirmed the association between exposure to sertraline/SSRI and the risk of PTB, and this association remained strong and significant through extensive sensitivity analyses. However, our study faced limitations in properly evaluating the association of ferrous sulfate with a lower risk of PTB due to the small sample size. Only 774 patients received a diagnosis of iron deficiency anemia within the 180-day pre-pregnancy period. Although our own sample was too small to be conclusive, a recent study reported that patients exposed to iron supplementation (ferrous sulfate, ferrous gluconate, ferrous fumarate, and ferrous glycinate) experienced reduced odds of preeclampsia and/or PTB [104].

In contrast to sertraline and ferrous sulfate, the signal we observed for acyclovir was misleading. According to CDC treatment guidelines [99], acyclovir is recommended for administration at 36 weeks of gestation for patients with genital herpes. This practice likely introduced selection bias, as the exposure group included patients who surpassed 36 weeks of gestation. In fact, nearly 60% of patients were exposed to acyclovir at or after GA 36 weeks. When we restricted our exposure group to patients exposed to acyclovir before 36 weeks, the protective result associated with the risk of PTB disappeared. Interestingly, our result slightly differed from prior studies. In a previous study, exposure to valacyclovir, not to acyclovir, was associated with a lower risk of spontaneous PTB [100]. The investigation on sertraline and ferrous sulfate demonstrates the potential of our approach to produce and prioritize hypotheses to evaluate. However, misleading signals do exist. Thus, we must take a conservative stance and carefully verify detected drug effect signals.

We identified 118 medications with no statistical significance. Restricting our analyses to medications that satisfy the minimum sample size ensures that associations lacking statistical significance are not dismissed as meaningless. Considering that pregnant women are typically excluded from clinical medication trials despite their medication use, the absence of an association with the risk of PTB is a valuable finding supporting potential drug safety in relation to PTB. It underscores the need for similar studies in pregnancy pharmacology to be conducted and repeated on real-world data to gather more evidence on the safety, risks, and benefits of medications for pregnant women.

We had one of the largest sample sizes for hypothesis-generating retrospective EHR studies for pregnant women. While similar studies exist, they often rely on claims data [21, 100]. Claims data may offer a larger sample size, but EHR provides richer data on patients’ longitudinal health conditions, encompassing lab results, vital signs, and surveys [105]. Moreover, our study setting, PSJH, serves community hospitals/clinics in both rural and urban settings in seven western states in the US. This setting reflects the general population better than a third-level academic hospital, which may focus more on high-risk pregnancies.

To ensure the integrity and reliability of our analyses, we implemented several measures to mitigate bias and ensure the robustness of our findings. We reduced the surveillance bias by restricting to continuously enrolled patients and leveraging propensity score matching. By limiting our study population to patients who were continuously enrolled, we excluded transient patients admitted for delivery who were likely to lack prenatal information. Furthermore, we mitigated the bias by matching patients in the treatment group to those in the control group with similar characteristics across covariates. Given that individuals exposed to medication may have more frequent doctor visits, ensuring comparability of patient health was crucial. Another noteworthy aspect of our approach was our commitment to evaluating all medications without introducing systematic bias. In research, there can be a tendency to focus on variables or hypotheses previously explored or considered more interesting. By assessing all medications that reached the minimum sample size, we aimed to prevent such biases from influencing our analysis, which contributed to the overall rigor of our study.

One major limitation of this study was the absence of multiple testing corrections. We recognize that conducting multiple comparisons increases the likelihood of producing false positives. However, we deliberately did not correct for multiple testing, as the primary objective of this study was to produce hypotheses rather than to test them. Furthermore, different methods for multiple testing correction can yield varying adjusted p-values. Instead of applying specific correction methods, we presented confidence intervals. This decision allows future researchers to use them for meta-analysis, as recommended by a prior study [106]. We underscore the need for cautious consideration of these associations and advocate for thorough evaluation through meticulously designed studies that reflect the characteristics of exposures of interest and their indications.
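As a small illustration of why the choice of correction matters, the sketch below applies two common procedures, Bonferroni and Benjamini-Hochberg, to the same set of p-values. The p-values are hypothetical and are not results from this study; the function names and the 0.05 threshold are likewise chosen only for the example.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjustment, which controls the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five medications (illustrative only).
p_values = [0.001, 0.008, 0.02, 0.03, 0.2]
bonf = bonferroni(p_values)
bh = benjamini_hochberg(p_values)

alpha = 0.05
n_bonf = sum(p < alpha for p in bonf)
n_bh = sum(p < alpha for p in bh)
print(f"Bonferroni keeps {n_bonf} signals; Benjamini-Hochberg keeps {n_bh}")
```

In this toy input the two procedures retain different numbers of signals at the same threshold, which is the kind of method dependence described above.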

Another limitation was the high number of missing values for pregravid BMI. To address this, we imputed pregravid BMI as normal, based on the assumption that the absence of this information suggested it was not a primary concern from the clinician’s perspective. However, we acknowledge that the resulting distribution of pregravid BMI categories does not align with the national distribution for women of reproductive age. It is possible that some BMI information is documented in unstructured notes, which could be extracted using Natural Language Processing (NLP) techniques. As this study is exploratory and aimed at generating hypotheses, we recommend that future research testing the hypotheses generated by this study address this limitation by conducting subgroup analyses of patients with complete pregravid BMI information and by leveraging NLP to collect more pregravid BMI information.
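The imputation rule and the recommended complete-case subgroup can be sketched as follows. The record layout, field names, and BMI cut-offs (standard WHO-style thresholds) are assumptions for illustration and may differ from the variable definitions used in the study.

```python
def bmi_category(bmi):
    """Map a pregravid BMI (kg/m^2) to a category; None means the value is missing."""
    if bmi is None:
        return None
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obese"


def prepare(records):
    """Impute missing BMI as 'normal' and flag imputed rows for later subgroup analysis."""
    prepared = []
    for rec in records:
        category = bmi_category(rec["pregravid_bmi"])
        prepared.append({
            **rec,
            "bmi_category": category if category is not None else "normal",
            "bmi_imputed": category is None,
        })
    return prepared


# Hypothetical patient records (illustrative only).
records = [
    {"patient": "A", "pregravid_bmi": 22.4},
    {"patient": "B", "pregravid_bmi": None},
    {"patient": "C", "pregravid_bmi": 31.0},
]
prepared = prepare(records)

# Complete-case subgroup: only patients whose BMI was actually recorded.
complete_cases = [r for r in prepared if not r["bmi_imputed"]]
print(len(prepared), "patients in total;", len(complete_cases), "with recorded BMI")
```

Keeping an explicit imputation flag makes it straightforward to rerun an analysis on the complete-case subgroup and compare it with the full cohort.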

Another fundamental limitation of EHR data is that medication records might not accurately reflect actual medication exposure. EHRs do not capture over-the-counter medications unless the patient specifically reported them. This omission may lead to an underestimation of total medication use. Additionally, the EHR data do not indicate whether prescribed medications were actually filled, which could result in an overestimation of medication adherence. Nevertheless, pregnant women are a unique clinical cohort with regular clinic visits and close monitoring by healthcare providers. Because of the potential risks to the fetus, pregnant women are especially vigilant about medication use [107] and generally demonstrate high agreement between self-reported medication use and recorded medication data [18]. While some bias may remain, we assumed that women adhered to prescribed medications and that any discontinued medications were properly removed from the records. A further limitation is that route of administration was not considered in this analysis.

Lastly, the use of uniform sets of comorbidities is a limitation. Although we conducted multiple analyses with several groups of comorbidities, individual medications are prescribed for specific indications. Medications with less common indications may not be adequately represented in the covariates we investigated. Several promising approaches could address this limitation in future work. One is high-dimensional propensity score matching [108], which offers a robust way to control for confounding variables in observational studies. Unlike traditional propensity score matching, which considers a limited number of covariates, high-dimensional matching can involve hundreds of empirical covariates. Another is leveraging external databases such as ChEMBL, which provides information about drug indications, contraindications, and other clinical data. Such databases would enable researchers to automatically select analytic cohorts and covariates relevant to each drug’s indication and treatment.

Conclusions

Most pregnant women are prescribed medication during pregnancy. This highlights the crucial need to advance pharmacology research in pregnant individuals, utilizing existing real-world data to enhance our knowledge of the safety of medications in pregnancy. We demonstrated the potential of using statistical data mining methods to generate and prioritize hypotheses on medication associations with the risk of PTB. This foundational framework can be extended to other adverse outcomes such as gestational diabetes or preeclampsia. We note that these results should be further validated in studies that reflect the characteristics of the exposures of interest and their indications. We only investigated drug effects associated with the risk of PTB. The mentioned drugs may be associated with other adverse pregnancy outcomes or congenital disorders.

Supplementary Information

12916_2024_3717_MOESM1_ESM.docx (177.5KB, docx)

Additional File 1. Supplementary Materials. Supplementary Methods S1-S4. Supplementary Tables S1-S4. Method S1- Selection of pre-pregnancy and prenatal common comorbidity. Method S2- Investigation on the association between sertraline/SSRI and elevated risk of PTB. Method S3-Investigation on the negative correlation between exposure to acyclovir and PTB risk. Method S4- Investigation on the association between exposure to ferrous sulfate and decreased risk of PTB. Table S1. Variable definition. Table S2. SNOMED diagnosis codes. Table S3. Descriptive statistics of source population. Table S4. Categorization of medications with statistically significant association with risk of PTB based on their indications.

12916_2024_3717_MOESM2_ESM.xlsx (132.5KB, xlsx)

Additional File 2: Supplementary Data. Results of large-scale propensity score matching analysis on preterm birth, low birth weight, and small for gestational age.

Acknowledgements

We are grateful to the Institute for Systems Biology for startup funds, and to PSJH for sharing their data engineering expertise and computational resources. We would also like to acknowledge SNOMED International for developing and maintaining SNOMED-CT©.

Abbreviations

ADR: Adverse drug reaction
CDC: Centers for Disease Control and Prevention
EHR: Electronic health record
FDA: Food and Drug Administration
GA: Gestational age
LBW: Low birth weight
LMP: Last menstrual period
OR: Odds ratio
PSJH: Providence St. Joseph Health
PTB: Preterm birth
PTL: Preterm labor
RR: Relative risk
SGA: Small for gestational age
SSRI: Selective serotonin reuptake inhibitor
US: United States

Authors’ contributions

Y.H. conceptualized the study, managed resources, curated data, performed formal analysis and investigation, created visualizations, interpreted data, contributed to methodology, wrote the original draft, and reviewed and edited the manuscript. S.N.P. contributed to conceptualization, data curation, data interpretation, and reviewed and edited the manuscript. A.G.P. contributed to conceptualization. Q.W. curated data and reviewed and edited the manuscript. N.D.P. contributed to conceptualization and reviewed and edited the manuscript. L.H. contributed to conceptualization and reviewed and edited the manuscript. J.J.H. contributed to conceptualization, supervised the study, and reviewed and edited the manuscript. Y.H., Q.W., S.N.P., and J.J.H. had full access to and verified the data. All authors read and approved the final manuscript.

Authors’ twitter handles

Twitter handles: @isbscience (Institute for Systems Biology).

Funding

This work was supported by the United States National Institute of Child Health and Human Development under Grant HD091527; Eunice Kennedy Shriver National Institute of Child Health and Human Development.

Data availability

All clinical logic has been shared in the manuscript and the GitHub repository (https://github.com/Hadlock-Lab/PSM_Maternity_at_scale). Results have been aggregated and reported within this paper to the extent possible while maintaining privacy from personal health information (PHI) as required by law. All data is archived within PSJH systems in a HIPAA-secure audited compute environment to facilitate verification of study conclusions.

Declarations

Ethics approval and consent to participate

All procedures were reviewed and approved by the Institutional Review Board at the PSJH through expedited review on 11-04-2020 (study number STUDY2020000196). Consent was waived because disclosure of protected health information for the study involved no more than minimal risk to the privacy of individuals.

Consent for publication

Not applicable.

Competing interests

Y.H., S.N.P., and A.G.P. declare no conflict of interest. J.J.H. has received research funding (paid to institute) from Pfizer, Novartis, Janssen, Gilead and Bristol Myers Squibb. Q.W. has been funded in part by Pfizer, Novartis, and Janssen (paid to the institute). L.H. and N.D.P. are scientific advisors for Sera Prognostics, a pregnancy diagnostics company, and N.D.P. holds stock options. Sera Prognostics is not associated with this study or any of the findings.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References


Articles from BMC Medicine are provided here courtesy of BMC
