Journal of the American Medical Informatics Association : JAMIA. 2021 Sep 24;29(2):306–320. doi: 10.1093/jamia/ocab184

Recurrent preterm birth risk assessment for two delivery subtypes: A multivariable analysis

Ilia Rattsev 1,2, Natalie Flaks-Manov 3, Angie C Jelin 4,5, Jiawei Bai 6, Casey Overby Taylor 1,2,3
PMCID: PMC8757309  PMID: 34559221

Abstract

Objective

The study sought to develop and apply a framework that uses a clinical phenotyping tool to assess risk for recurrent preterm birth.

Materials and Methods

We extended an existing clinical phenotyping tool and applied a 4-step framework for our retrospective cohort study. The study was based on data collected in the Genomic and Proteomic Network for Preterm Birth Research Longitudinal Cohort Study (GPN-PBR LS). A total of 52 sociodemographic, clinical and obstetric history-related risk factors were selected for the analysis. Spontaneous and indicated delivery subtypes were analyzed both individually and in combination. Chi-square analysis and Kaplan-Meier estimate were used for univariate analysis. A Cox proportional hazards model was used for multivariable analysis.

Results

A total of 428 women with a history of spontaneous preterm birth qualified for our analysis. The predictors of preterm delivery used in the multivariable model were maternal age, maternal race, household income, marital status, previous caesarean section, number of previous deliveries, number of previous abortions, previous birth weight, cervical insufficiency, decidual hemorrhage, and placental dysfunction. The models stratified by delivery subtype performed better than the naïve model (concordance 0.76 for the spontaneous model, 0.87 for the indicated model, and 0.72 for the naïve model).

Discussion

The proposed 4-step framework is effective for analyzing risk factors for recurrent preterm birth in a retrospective cohort and possesses practical features for future analyses with other data sources (eg, electronic health record data).

Conclusions

We developed an analytical framework that utilizes a clinical phenotyping tool and performed a survival analysis to analyze risk for recurrent preterm birth.

Keywords: premature birth, pregnancy complications, proportional hazards models, medical informatics, risk factors

INTRODUCTION

The World Health Organization defines preterm birth as delivery before 37 weeks of gestation.1 Complications of preterm birth include higher risk for infection and noninfectious respiratory conditions, neurodevelopmental disorders, and visual and cognitive impairments.2,3 In 2020, complications of preterm birth were the leading cause of death in children below 5 years of age.4 Quantifying the risk for preterm birth can provide a framework to aid physicians in selecting the best management plan for at-risk patients.

Prior preterm deliveries represent one significant risk factor for preterm delivery in future pregnancies.5–8 For patients at high recurrence risk for preterm birth, knowing if they are more likely to deliver early has clinical implications for interventions to prevent preterm delivery, such as cervical cerclage,9–11 zinc supplementation,9 prophylactic tocolytics,12 or progesterone therapy.10,11,13 Many of these interventions, however, have severe adverse effects and have been shown to be associated with neonatal complications.9,14 Therefore, a risk-benefit analysis of a given intervention is an important consideration when choosing the most effective prevention strategy for a particular patient.

The obstetric precursors for preterm birth can be divided into 2 major subtypes: (1) spontaneous preterm birth, which encompasses both spontaneous labor with intact membranes and preterm premature rupture of membranes (PPROM); and (2) medically indicated preterm birth due to fetal or maternal conditions.7,15 Although there are some common risk factors for spontaneous and medically indicated preterm birth, many risk factors are unique for a specific delivery subtype.15 Therefore, one model may not be sufficient to effectively calculate risk scores for both spontaneous and indicated preterm birth.

Risk score prediction for preterm birth is a largely unexplored area, owing to the complex etiology of the outcome. Previous studies primarily focused on modeling one or a few risk factors for preterm birth risk assessment at a time.15–17 In real-world clinical settings, however, patients often express a phenotype that is a combination of many risk factors. As a result, the methods for calculating risk for preterm delivery based on individual risk factors have had poor predictive value and low clinical utility.18,19 Several recent studies employed multivariable analysis to explore the risk factors for preterm birth.20–23 As more data are published on individual predictors of early delivery, more comprehensive analyses can be performed. Such analyses are needed to adjust for confounders and to improve the accuracy of predictions relevant to preterm birth and its complications so that their utility is high enough for use in clinical decision support systems.

The objective of this study was to develop and apply a framework that uses a clinical phenotyping tool to assess risk for recurrent preterm birth. Our approach employs clinical phenotypes to combine individual clinical risk factors and builds a multivariable model incorporating previous obstetric history, sociodemographic factors, and clinical risk factors for each delivery subtype, as well as for the naïve model.

MATERIALS AND METHODS

To efficiently apply the multivariable model developed in this study on different datasets, we proposed a 4-step framework that can be used by researchers to guide their analysis. Figure 1 provides the overview of the suggested methodology. The first step of the framework is data collection, which is followed by a feature extraction step, in which risk factors for preterm birth are selected from a set of all patient characteristics. The third step of the framework is phenotype mapping, in which clinical risk factors are grouped into clinical phenotypes. The final step is to perform statistical modeling. While the data collection step is unique for each individual study, the feature extraction, clinical phenotyping, and statistical modeling steps remain highly conserved. Each step of the framework is described in detail in the following sections.

Figure 1.

An overview of the proposed analytical framework. PTB: preterm birth.

Data collection

This is a retrospective cohort study based on data collected as a part of the Genomic and Proteomic Network for Preterm Birth Research Longitudinal Cohort (GPN-PBR LC) in 2009 to 2010.24,25 The GPN-PBR LC includes data collected at multiple times during gestation, data collected at labor and delivery, and biospecimens for 446 high-risk women. The setting and recruitment eligibility criteria are described in detail elsewhere.26 In short, women with a history of spontaneous preterm birth and current singleton pregnancy were recruited at the 3 GPN-PBR clinical sites at the University of Alabama at Birmingham, University of Texas Medical Branch at Galveston, and University of Utah during their routine prenatal visits before 19 weeks of gestation. Exclusion criteria included maternal uterine anomalies, planned cervical cerclage, multifetal gestation, fetal aneuploidy or lethal fetal anomalies, polyhydramnios, inability to follow up, and serious maternal conditions.26 Following the initial visit, the study participants had 2 follow-up visits (one at 19 0/7 to 23 6/7 weeks and another at 28 0/7 to 31 6/7 weeks of gestation) and were subsequently admitted for delivery. In addition, a hospital admission form was completed for each study participant in case of an emergency hospital stay during the pregnancy. Demographic, socioeconomic, past medical history, obstetric history, lifestyle, substance use, and current medication data were collected from the participants via a survey during the enrollment visit. Experienced symptoms, newly identified conditions, pregnancy complications, and newly prescribed medications were recorded for each participant during the follow-up visits. Maternal peripheral blood samples were collected during all study visits, including admission for delivery. Cervical length and cervical dilation measurements, as well as sonographic parameters of the fetus, were recorded at the 2 follow-up visits for most study participants. A psychosocial questionnaire assessing maternal stress, anxiety, and depression levels was completed at the enrollment visit and during the admission for delivery.26

Estimated date of conception (EDC) was determined by the date of last menstrual period (LMP) reported by the participant, and by ultrasound. If EDC determined by LMP and ultrasound differed by more than 1 week, the final EDC was the one estimated by ultrasound. Otherwise, participant-reported date of LMP was used as the final EDC. Gestational age at delivery was reported as the number of days since final EDC to delivery.26 We excluded from the analysis the study participants who were lost to follow-up, had a miscarriage, or delivered a stillborn baby.
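The dating rule above can be sketched in a few lines of Python. This is a minimal illustration that follows the wording of the paragraph literally; the function names and the example dates are hypothetical and are not taken from the GPN-PBR LC data dictionary.

```python
from datetime import date


def final_edc(edc_lmp: date, edc_ultrasound: date) -> date:
    """Choose the final EDC as described in the text.

    The ultrasound estimate is used when the two estimates differ by
    more than 1 week (7 days); otherwise the LMP-based estimate is kept.
    """
    if abs((edc_ultrasound - edc_lmp).days) > 7:
        return edc_ultrasound
    return edc_lmp


def gestational_age_days(edc: date, delivery: date) -> int:
    """Gestational age at delivery: days from the final EDC to delivery."""
    return (delivery - edc).days


# Hypothetical participant: the estimates differ by 9 days, so ultrasound wins.
chosen = final_edc(date(2009, 3, 1), date(2009, 3, 10))
ga_days = gestational_age_days(chosen, date(2009, 11, 15))
ga_weeks = ga_days / 7
```

A difference of exactly 7 days keeps the LMP-based estimate, because the text specifies "more than 1 week".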

Data processing

Feature selection

To perform a comprehensive analysis of potential individual risk factors of preterm birth, we conducted a literature review and consulted a clinician specializing in the field of maternal-fetal medicine (A.C.J.). The overarching list of risk factors considered for the analysis is presented in Table 1. Overall, we selected 52 risk factors that can be divided into 3 major subgroups: clinical, sociodemographic, and related to previous obstetric history. All sociodemographic risk factors, preexisting chronic conditions, and previous pregnancy data were extracted from the surveys completed at the enrollment visit. Participants’ complications during the current pregnancy were obtained from the forms completed at follow-up visits, emergency hospital encounters, and admission for delivery. To determine previous birth weight for participants with multiple previous gestations, the most recent gestation was used as a reference. Ultrasonographic and newly administered medication reports were obtained from the respective forms filled out at follow-up and hospital admission visits. There were 4 delivery types assigned to participants by the GPN-PBR clinicians: (1) spontaneous; (2) spontaneous, augmented; (3) induced; and (4) no labor (caesarean section). For our analysis, spontaneous augmented deliveries were combined with spontaneous, and induced were combined with no labor into the broader category of indicated deliveries.15

Table 1.

Risk factors for preterm birth

Risk factor | Clinical phenotype | Evidence level
Sociodemographic
Age27 | NA | NA
Maternal race15 | NA | NA
Paternal race28 | NA | NA
Maternal ethnicity29 | NA | NA
Paternal ethnicity28 | NA | NA
Education level30 | NA | NA
Household income31 | NA | NA
Employment status32,33 | NA | NA
Insurance status31 | NA | NA
Marital status30 | NA | NA
Body mass index15 | NA | NA
Clinical
Syphilis,34 hepatitis,35 gonorrhea,35 chlamydia,35 trichomonas,35 HSV,36 or GBS37 in current gestation | Infection/inflammation | Possible
Urinary tract infection or asymptomatic bacteriuria35 | Infection/inflammation | Possible
Kidney infection35 | Infection/inflammation | Possible
Pyelonephritis35 | Infection/inflammation | Possible
Chorioamnionitis35 | Infection/inflammation | Moderate
Vaginal bleeding35 | Decidual hemorrhage | Possible
Nonreassuring fetal heart tones, or fetal tachycardia35 | Decidual hemorrhage | Moderate (when together with vaginal bleeding)
Placenta previa35 | Decidual hemorrhage | Possible
Abruption35 | Decidual hemorrhage | Moderate
History of cervical conization procedure35 | Cervical insufficiency | Possible
History of loop electro-excision procedure35 | Cervical insufficiency | Possible
Short cervix15 | Cervical insufficiency | None to strong depending on measures
Funneling or hourglass membranes35 | Cervical insufficiency | Moderate (when together with short cervix)
Uterine fibroids35 | Uterine distension | Possible
Low amniotic fluid index35 | Placental dysfunction | Moderate
Preeclampsia or eclampsia15 | Placental dysfunction | Moderate for mild, strong for severe preeclampsia and eclampsia
Diabetes38 | Maternal comorbidities | Strong
Gestational diabetes in the current gestation39 | Maternal comorbidities | Moderate
Chronic hypertension40 | Maternal comorbidities | Strong
Cardiac,40a renal,35 autoimmune,41 pulmonary condition,35 anemia,42a or history of seizures35 | Maternal comorbidities | Moderate
Family history43 | Family history | None to strong depending on the degree of relatives and preterm delivery subtype
Stress, anxiety, or depression44 | Maternal stress | None to strong depending on survey responses
IUGR15a | Placental dysfunction | Moderate
Newly identified fetal anomalies45 | Placental dysfunction | Possible
Motor vehicle accident in current pregnancy35 | Decidual hemorrhage | Possible
Analytes
PAPP-A46,47a | Placental dysfunction | Possible when <0.52 MOM47
hCG48a | Placental dysfunction | Possible when >50 mIU/mL48
AFP49a | Placental dysfunction | Possible when >2.5 MOM50
uEstriol51a | Placental dysfunction | Possible when >2.6 ng52
Inhibin-A53a | Placental dysfunction | Possible when >2.25 MOM53
Obstetric history
Number of previous deliveries54 | NA | NA
Number of previous abortions55 | NA | NA
Time since the last pregnancy16 | NA | NA
Previous birth weight56 | NA | NA
Previous caesarean section57 | NA | NA
Previous preeclampsia or gestational hypertension56a | Placental dysfunction | Possible
IUGR in previous pregnancy58a | Placental dysfunction | Possible
Previous oligohydramnios59a | Placental dysfunction | Possible
Maternal medical condition in previous pregnancy35a | Maternal comorbidities | Possible
Previous gestational diabetes60a | Maternal comorbidities | Possible
Previous placental abruption56a | Placental dysfunction | Possible

Risk factors for preterm birth are listed in the first column. Clinical phenotype and the level of evidence that a risk factor provides for the phenotype are also presented where applicable. Phenotype mapping was done with the phenotyping tool derived from the study by Manuck et al35 and expanded in this work. NA in the clinical phenotype and evidence level columns corresponds to the risk factors that were not mapped to any phenotype and were themselves final variables included in the model.

AFP: alpha-fetoprotein; GBS: group B Streptococcus; hCG: human chorionic gonadotropin; HSV: herpes simplex virus; IUGR: intrauterine growth restriction; MOM: multiple of the median of normal pregnancies; PAPP-A: pregnancy-associated plasma protein A.

a New risk factors added by the clinical expert.

Clinical phenotype mapping

To reduce the dimensionality of our data, we used the clinical phenotyping tool for spontaneous preterm birth35 that was adjusted by a field specialist (A.C.J.) to incorporate additional clinical risk factors. The phenotyping tool maps the clinical features into 9 comprehensive phenotypes: (1) infection and inflammation, (2) decidual hemorrhage, (3) cervical insufficiency, (4) uterine distension, (5) placental dysfunction, (6) PPROM, (7) family history, (8) maternal comorbidities, and (9) maternal stress.35 A participant’s clinical characteristics can provide possible, moderate, strong, or no evidence for a particular phenotype (Table 1).35 The additional risk factors incorporated into the original clinical phenotyping tool included laboratory test results, complications during previous pregnancies, newly identified fetal anomalies, maternal chronic cardiac conditions, maternal anemia during pregnancy, gestational hypertension, severe preeclampsia, herpes simplex virus, and group B Streptococcus in current gestation. Table 1 describes the corresponding phenotype, and level of evidence for each risk factor that was mapped to a clinical phenotype.

For cervical length and dilation measurements, ultrasound reports before 28 weeks of gestation were reviewed for each participant. Cervical length <0.50 cm or cervical dilation >2 cm were considered indicative of strong evidence for cervical insufficiency, cervical length between 0.50 and 1.50 cm corresponded to the moderate level of evidence, cervical length between 1.50 and 2.50 cm suggested possible evidence, and cervical length >2.50 cm together with cervical dilation measurement of <2 cm indicated no evidence for the phenotype. In case of funneling or hourglass membranes observed together with a cervical length between 1.50 and 2.50 cm, the level of evidence was raised from possible to moderate. Several participants had a clinical diagnosis of short cervix, with ultrasound measures showing no evidence for the phenotype. In such cases, participants with the clinical diagnosis were assigned possible level of evidence for cervical insufficiency. Cerclage placement provided moderate evidence for the phenotype. In case different clinical characteristics indicated different levels of evidence for the same participant, the highest level of evidence was assigned to the phenotype.
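The cervical insufficiency rules in this paragraph can be expressed as a small lookup function. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and the handling of values that fall exactly on a threshold (1.50 cm, 2.50 cm) is an assumption, because the text does not state which side of each boundary is inclusive.

```python
from typing import Optional

# Evidence levels ordered from weakest to strongest.
LEVELS = ["none", "possible", "moderate", "strong"]


def cervical_insufficiency_evidence(
    length_cm: Optional[float],
    dilation_cm: Optional[float],
    funneling: bool = False,
    clinical_short_cervix: bool = False,
    cerclage: bool = False,
) -> str:
    """Return the highest evidence level supported by any characteristic."""
    candidates = ["none"]
    if length_cm is not None:
        if length_cm < 0.50:
            candidates.append("strong")
        elif length_cm <= 1.50:  # assumption: 1.50 cm counts as moderate
            candidates.append("moderate")
        elif length_cm <= 2.50:  # assumption: 2.50 cm counts as possible
            # Funneling or hourglass membranes raise possible to moderate.
            candidates.append("moderate" if funneling else "possible")
    if dilation_cm is not None and dilation_cm > 2:
        candidates.append("strong")
    if clinical_short_cervix:
        candidates.append("possible")
    if cerclage:
        candidates.append("moderate")
    # Conflicting levels resolve to the strongest one.
    return max(candidates, key=LEVELS.index)
```

Collecting every applicable level and taking the maximum mirrors the final rule of the paragraph, under which the highest level of evidence is assigned when characteristics disagree.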

The level of evidence for the family history phenotype depended on whether the relative with the history of preterm birth was a first- or second-degree relative, and on their delivery subtype. Having a first-degree relative with a history of spontaneous preterm birth provided strong evidence for family history, history of indicated preterm birth for a first-degree relative or spontaneous preterm birth for a second-degree relative was indicative of moderate evidence, having a second-degree relative with a history of indicated preterm delivery corresponded to possible evidence for the phenotype, and having no first- or second-degree relatives delivering preterm provided no evidence for the family history.
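The family history rule is a 2-by-2 lookup on degree of relation and delivery subtype. A minimal Python sketch follows; the function name and input format are hypothetical, and taking the strongest level when several relatives are reported is an assumption carried over from the cervical insufficiency rule rather than something stated for this phenotype.

```python
# (degree of relation, delivery subtype) -> evidence level, as described in the text.
FAMILY_HISTORY_TABLE = {
    (1, "spontaneous"): "strong",
    (1, "indicated"): "moderate",
    (2, "spontaneous"): "moderate",
    (2, "indicated"): "possible",
}
ORDER = ["none", "possible", "moderate", "strong"]


def family_history_evidence(relatives):
    """relatives: iterable of (degree, subtype) pairs for relatives who delivered preterm.

    Returns "none" when no first- or second-degree relative delivered preterm.
    Assumption: with several relatives, the strongest level is reported.
    """
    levels = [FAMILY_HISTORY_TABLE[(degree, subtype)] for degree, subtype in relatives]
    return max(levels, key=ORDER.index, default="none")
```

For example, a participant whose mother had an indicated preterm delivery and whose aunt had a spontaneous one would be assigned moderate evidence under either relative.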

To determine the level of evidence for maternal stress, we examined the scores received on Perceived Stress Scale,61 Beck Anxiety Inventory,62 and Beck Depression Inventory (BDI),63 which assessed the GPN-PBR LC study participants’ stress, anxiety, and depression levels at enrollment. A BDI score higher than or equal to 31 and a Perceived Stress Scale score of 27 or greater provided moderate evidence for maternal stress. A BDI score of 21 or higher and a Beck Anxiety Inventory score of 22 or higher were indicative of possible evidence. Clinical diagnosis of anxiety or depression and prescription of antidepressants provided strong evidence for the phenotype.
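The questionnaire cutoffs can be sketched as follows. This Python example rests on an explicit assumption: each listed criterion is treated as sufficient on its own (for instance, a BDI score of 31 or more without an elevated Perceived Stress Scale score still yields moderate evidence). The text joins the criteria with "and", so the alternative reading, in which both scores must be elevated, is also possible. All names are hypothetical.

```python
from typing import Optional


def maternal_stress_evidence(
    bdi: Optional[int] = None,   # Beck Depression Inventory score
    bai: Optional[int] = None,   # Beck Anxiety Inventory score
    pss: Optional[int] = None,   # Perceived Stress Scale score
    diagnosis: bool = False,     # clinical diagnosis of anxiety or depression
    antidepressants: bool = False,
) -> str:
    """Map survey scores and clinical data to an evidence level.

    Assumption: any single criterion in a tier is enough to reach that tier.
    """
    if diagnosis or antidepressants:
        return "strong"
    if (bdi is not None and bdi >= 31) or (pss is not None and pss >= 27):
        return "moderate"
    if (bdi is not None and bdi >= 21) or (bai is not None and bai >= 22):
        return "possible"
    return "none"
```

Checking the tiers from strongest to weakest means a participant who meets several criteria receives the highest applicable level.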

To achieve a larger sample size in each category, the evidence levels with a small number of participants were combined. In particular, we grouped together possible, moderate, and strong levels of evidence, and formed the category of “some evidence” for infection and inflammation, decidual hemorrhage, cervical insufficiency, and maternal comorbidities. For maternal stress, moderate level of evidence was combined with strong, while for family history possible and moderate levels of evidence were grouped together. The lack of placental pathology data made it impossible for us to infer strong evidence for the placental dysfunction phenotype. The uterine distension phenotype had a small sample size for any level of evidence and was not included in our analysis. In addition, after a consultation with a clinical professional (A.C.J.), we excluded PPROM from the analysis, as PPROM is normally caused by the other risk factors, rather than representing a risk factor itself.
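The level-merging step amounts to a per-phenotype recoding. The Python sketch below is one way to write it; the dictionary keys are hypothetical variable names, and the labels chosen for the merged categories ("some", "strong" for maternal stress, "moderate" for family history) are assumptions about naming only.

```python
# Levels merged into a single "some evidence" category.
_SOME = {"possible": "some", "moderate": "some", "strong": "some"}

# Per-phenotype recoding; levels not listed are left unchanged.
COLLAPSE = {
    "infection_inflammation": _SOME,
    "decidual_hemorrhage": _SOME,
    "cervical_insufficiency": _SOME,
    "maternal_comorbidities": _SOME,
    "maternal_stress": {"moderate": "strong"},   # moderate merged with strong
    "family_history": {"possible": "moderate"},  # possible merged with moderate
}


def collapse_level(phenotype: str, level: str) -> str:
    """Return the merged evidence category used for modeling."""
    return COLLAPSE.get(phenotype, {}).get(level, level)
```

Phenotypes without an entry, such as placental dysfunction, pass through unchanged, and "none" is never recoded.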

After grouping the variables into clinical phenotypes, a total of 23 covariates were established for the statistical modeling: 11 sociodemographic risk factors, 7 clinical phenotypes, and 5 characteristics related to previous obstetric history. Continuous variables included age, body mass index (BMI), number of previous deliveries, number of abortions, time since last delivery, and previous birth weight. Categorical variables included the 7 clinical phenotypes, maternal race, paternal race, maternal ethnicity, paternal ethnicity, education level, employment status, insurance status, marital status, household income, and previous caesarean section. The final list of variables included in the model is presented in Table 2.

Table 2.

Resultant variables for the analysis

Risk factor | Type
Sociodemographic
Age | Continuous
Maternal race | Categorical
Paternal race | Categorical
Maternal ethnicity | Categorical
Paternal ethnicity | Categorical
Education level | Categorical
Household income | Categorical
Employment status | Categorical
Insurance status | Categorical
Marital status | Categorical
BMI | Continuous
Clinical phenotype
Infection/inflammation | Categorical
Decidual hemorrhage | Categorical
Cervical insufficiency | Categorical
Placental dysfunction | Categorical
Family history | Categorical
Maternal comorbidities | Categorical
Maternal stress | Categorical
Obstetric history
Number of previous deliveries | Continuous
Number of previous abortions | Continuous
Time since the last pregnancy | Continuous
Previous birth weight | Continuous
Previous caesarean section | Categorical

BMI: body mass index.

Statistical modeling

Univariate analysis

The primary outcomes for the univariate analysis were preterm birth (delivery before 37 weeks) and gestational duration (in weeks). To determine the effect size of individual risk factors on the binary outcome, we applied univariate analysis by chi-square test for categorical variables and by 1-way analysis of variance for continuous variables. In addition, the Kaplan-Meier estimator was used to compare survival curves between different groups for categorical variables, with gestational duration being the outcome. Pairwise log-rank test was used to quantify the difference between survival curves. A P value of less than .05 was considered significant. Each statistical approach was applied separately to the participants with the spontaneous delivery subtype, to participants with the indicated delivery subtype, and to the entire cohort.
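To make the Kaplan-Meier step concrete, the product-limit estimator can be written without any external dependency. The analysis in this study was run with the "lifelines" library; the pure-Python function below is only a sketch of the underlying calculation, and the toy gestational durations are hypothetical.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: time to delivery or censoring (eg, weeks of gestation)
    events:    True if the delivery is counted as an event, False if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Process all participants who share the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Hypothetical group of 5 pregnancies; the last 2 are censored at 37 weeks.
toy_curve = kaplan_meier([30, 34, 36, 37, 37], [True, True, True, False, False])
```

In this setting the "survival" probability at a given week is the estimated probability of still being pregnant at that week, so a curve that drops earlier corresponds to a group at higher risk of preterm delivery.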

Multivariable analysis

Similar to other studies aiming to predict the risk for preterm birth, we used survival analysis via a Cox proportional hazards regression model.17,64–66 In real-world clinical settings, patients are often lost to follow-up, which makes their records difficult to analyze with standard statistical methods. This is especially true for pregnant women, who may need to deliver at a site different from their regular clinic. A Cox regression model has the advantage of accommodating censored observations, such as participants lost to follow-up. With the prospect of applying our model to electronic health record (EHR) data to predict an individual’s risk for preterm birth in the future, we used Cox regression for estimating risk for preterm birth.

Gestational duration (in weeks) was the outcome variable. Deliveries past 37 weeks of gestation were treated as censored data. Regression imputation was used to predict missing values for BMI and previous birth weight by regressing the known values on gestational duration. The regression imputation strategy was chosen for continuous variables to preserve the individual risk factor contribution to the outcome. Three separate models were developed: for spontaneous delivery, for indicated delivery, and for the entire cohort (naïve model). Hazard ratios (HRs) for delivering before 37 weeks of gestation and 95% confidence intervals (CIs) were calculated for each covariate.
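The censoring rule can be illustrated with a short Python sketch that turns a gestational age into the (duration, event) pair expected by survival models. One detail is an assumption: the text states only that deliveries past 37 weeks were treated as censored, and here the censoring time is set to 37 weeks rather than to the actual delivery time. The function name is hypothetical.

```python
def to_survival_record(ga_weeks: float, term_threshold: float = 37.0):
    """Encode one delivery for a time-to-event model.

    Preterm deliveries (before the threshold) are events at their own time.
    Term deliveries are censored; assumption: censoring time = threshold.
    """
    is_preterm = ga_weeks < term_threshold
    duration = ga_weeks if is_preterm else term_threshold
    return duration, is_preterm


# Hypothetical gestational ages at delivery, in weeks.
records = [to_survival_record(ga) for ga in (34.5, 39.0, 37.0)]
```

With this encoding, a hazard ratio above 1 for a covariate corresponds to a higher instantaneous rate of delivering before 37 weeks.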

Model assessment

To test the proportional hazards assumption for Cox regression models, we tested whether the scaled Schoenfeld residuals for each covariate were independent of time.67 To assess the performance of Cox proportional hazards models, we used concordance index, Akaike information criterion (AIC), and log-rank test. A higher concordance index and a lower AIC indicate a better performing model.
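The two comparison metrics can be sketched in plain Python. The concordance function below follows the usual definition of Harrell's concordance index with one simplification, stated here as an assumption: pairs with tied times are skipped. The values reported in this study came from the R "survival" package, whose handling of ties may differ. The AIC helper uses the standard formula, with the partial log-likelihood playing the role of the log-likelihood for a Cox model.

```python
def concordance_index(durations, events, risk_scores):
    """Share of comparable pairs in which the higher risk score delivered earlier.

    A pair (i, j) is comparable when i had the event strictly before j's time.
    Ties in risk score count as half concordant. Tied times are skipped.
    """
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        for j in range(n):
            if events[i] and durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_likelihood
```

A concordance index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why the higher value is preferred when comparing the subtype-specific and naïve models.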

Kaplan-Meier analysis was done in Python using the “lifelines” library.68 Cox regression models were built with the “survival” package in R version 3.6.3.69,70

RESULTS

Study population

Among 446 women with a history of spontaneous preterm birth enrolled in the GPN-PBR LC study, 428 qualified for our analysis. Gestational age at delivery ranged from 23 2/7 to 41 6/7 weeks. A total of 138 (32%) study participants delivered preterm. A total of 308 (72%) women had a spontaneous delivery subtype, and 120 (28%) had an indicated delivery. The age of included participants ranged between 17 and 47 years, with the mean age being 27.7 years. The analyzed individuals were primarily White or Caucasian (59%, n = 252) and not Hispanic or Latino (74%, n = 318). About half of the population came from a low-income background (51%, n = 219). The majority of the study participants were living with a partner (75%, n = 320), were unemployed (60%, n = 255), had public insurance (61%, n = 261), and had either completed high school or had a college degree (42%, n = 181 and 41%, n = 174, respectively). A large proportion of women had some evidence for infection or inflammation during current pregnancy (45%, n = 194), and a third of the population had strong evidence for family history (31%, n = 134). The majority of participants had not undergone caesarean section in previous pregnancies (79%, n = 340) and had no evidence for decidual hemorrhage (88%, n = 376), cervical insufficiency (92%, n = 392), placental dysfunction (73%, n = 313), maternal comorbidities (77%, n = 329), or maternal stress (75%, n = 323). On average, the enrolled participants had 2.2 previous deliveries, 0.45 previous abortions, and slightly over 3 years since their last pregnancy. The descriptive statistics for the study population are presented in Table 3.

Table 3.

Characteristics of the study cohort and univariate risk factor analysis for preterm birth

Characteristic | Spontaneous: Total (n = 308) | Spontaneous: Preterm (n = 113) | Spontaneous: Term (n = 195) | Spontaneous: OR (95% CI) | Indicated: Total (n = 120) | Indicated: Preterm (n = 25) | Indicated: Term (n = 95) | Indicated: OR (95% CI) | Total: Total (N = 428) | Total: Preterm (n = 138) | Total: Term (n = 290) | Total: OR (95% CI)
Demographics
Maternal race
Caucasian | 178 (57.8) | 66 (58.4) | 112 (57.4) | ref | 74 (61.7) | 12 (48) | 62 (65.3) | ref | 252 (58.9) | 78 (56.5) | 174 (60) | ref
Black or African American | 101 (32.8) | 39 (34.5) | 62 (31.8) | 1.07 (0.65-1.77) | 35 (29.2) | 9 (36) | 26 (27.4) | 1.79 (0.67-4.76) | 136 (31.8) | 48 (34.8) | 88 (30.3) | 1.22 (0.78-1.89)
Other | 29 (9.4) | 8 (7.1) | 21 (10.8) | 0.65 (0.27-1.54) | 11 (9.2) | 4 (16) | 7 (7.4) | 2.95 (0.75-11.68) | 40 (9.3) | 12 (8.7) | 28 (9.7) | 0.96 (0.46-1.98)
Paternal race
Caucasian | 164 (53.2) | 63 (55.8) | 101 (51.8) | ref | 70 (58.3) | 11 (44) | 59 (62.1) | ref | 234 (54.7) | 74 (53.6) | 160 (55.2) | ref
Black or African American | 109 (35.4) | 42 (37.2) | 67 (34.4) | 1.00 (0.61-1.65) | 38 (31.7) | 9 (36) | 29 (30.5) | 1.66 (0.62-4.47) | 147 (34.3) | 51 (37) | 96 (33.1) | 1.15 (0.74-1.78)
Other | 35 (11.4) | 8 (7.1) | 27 (13.8) | 0.48 (0.20-1.11) | 12 (10) | 5 (20) | 7 (7.4) | 3.83 (1.03-14.28)a | 47 (11) | 13 (9.4) | 34 (11.7) | 0.83 (0.41-1.66)
Maternal ethnicity
Not Hispanic or Latino | 223 (72.4) | 90 (79.6) | 133 (68.2) | ref | 95 (79.2) | 18 (72) | 77 (81.1) | ref | 318 (74.3) | 108 (78.3) | 210 (72.4) | ref
Hispanic or Latino | 85 (27.6) | 23 (20.4) | 62 (31.8) | 0.55 (0.32-0.95)a | 25 (20.8) | 7 (28) | 18 (18.9) | 1.66 (0.60-4.58) | 110 (25.7) | 30 (21.7) | 80 (27.6) | 0.73 (0.45-1.18)
Paternal ethnicity
Not Hispanic or Latino | 229 (74.4) | 89 (78.8) | 140 (71.8) | ref | 95 (79.2) | 18 (72) | 77 (81.1) | ref | 324 (75.7) | 107 (77.5) | 217 (74.8) | ref
Hispanic or Latino | 79 (25.6) | 24 (21.2) | 55 (28.2) | 0.69 (0.40-1.19) | 25 (20.8) | 7 (28) | 18 (18.9) | 1.66 (0.60-4.58) | 104 (24.3) | 31 (22.5) | 73 (25.2) | 0.86 (0.53-1.39)
Education level
Less than high school | 54 (17.5) | 15 (13.3) | 39 (20) | ref | 19 (15.8) | 5 (20) | 14 (14.7) | ref | 73 (17.1) | 20 (14.5) | 53 (18.3) | ref
High school or GED | 138 (44.8) | 52 (46) | 86 (44.1) | 1.57 (0.79-3.13) | 43 (35.8) | 11 (44) | 32 (33.7) | 0.96 (0.28-3.29) | 181 (42.3) | 63 (45.7) | 118 (40.7) | 1.41 (0.78-2.57)
College | 116 (37.7) | 46 (40.7) | 70 (35.9) | 1.71 (0.85-3.45) | 58 (48.3) | 9 (36) | 49 (51.6) | 0.51 (0.15-1.78) | 174 (40.7) | 55 (39.9) | 119 (41) | 1.22 (0.67-2.24)
Employment
Employed | 125 (40.6) | 51 (45.1) | 74 (37.9) | ref | 48 (40) | 9 (36) | 39 (41.1) | ref | 173 (40.4) | 60 (43.5) | 113 (39) | ref
Unemployed | 183 (59.4) | 62 (54.9) | 121 (62.1) | 0.74 (0.46-1.19) | 72 (60) | 16 (64) | 56 (58.9) | 1.24 (0.50-3.09) | 255 (59.6) | 78 (56.5) | 177 (61) | 0.83 (0.55-1.25)
Insurance status
Public | 197 (64) | 71 (62.8) | 126 (64.6) | ref | 64 (53.3) | 14 (56) | 50 (52.6) | ref | 261 (61) | 85 (61.6) | 176 (60.7) | ref
Private | 91 (29.5) | 38 (33.6) | 53 (27.2) | 1.27 (0.77-2.11) | 47 (39.2) | 8 (32) | 39 (41.1) | 0.73 (0.28-3.29) | 138 (32.2) | 46 (33.3) | 92 (31.7) | 1.04 (0.67-1.61)
None/self-pay | 20 (6.5) | 4 (3.5) | 16 (8.2) | 0.44 (0.14-1.38) | 9 (7.5) | 3 (12) | 6 (6.3) | 1.79 (0.40-8.06) | 29 (6.8) | 7 (5.1) | 22 (7.6) | 0.66 (0.27-1.60)
Marital status
Living with partner | 224 (72.7) | 84 (74.3) | 140 (71.8) | ref | 96 (80) | 20 (80) | 76 (80) | ref | 320 (74.8) | 104 (75.4) | 216 (74.5) | ref
Not living with partner | 84 (27.3) | 29 (25.7) | 55 (28.2) | 0.88 (0.52-1.49) | 24 (20) | 5 (20) | 19 (20) | 1.00 (0.33-3.01) | 108 (25.2) | 34 (24.6) | 74 (25.5) | 0.95 (0.60-1.52)
Household income
$0-$12 000 | 97 (31.5) | 31 (27.4) | 66 (33.8) | ref | 39 (32.5) | 12 (48) | 27 (28.4) | ref | 136 (31.8) | 43 (31.2) | 93 (32.1) | ref
$12 001-$24 000 | 65 (21.1) | 16 (14.2) | 49 (25.1) | 0.70 (0.34-1.41) | 18 (15) | 4 (16) | 14 (14.7) | 0.64 (0.17-2.37) | 83 (19.4) | 20 (14.5) | 63 (21.7) | 0.69 (0.37-1.28)
$24 001-$50 000 | 51 (16.6) | 20 (17.7) | 31 (15.9) | 1.37 (0.68-2.78) | 25 (20.8) | 5 (20) | 20 (21.1) | 0.56 (0.17-1.85) | 76 (17.8) | 25 (18.1) | 51 (17.6) | 1.06 (0.58-1.93)
>$50 000 | 58 (18.8) | 24 (21.2) | 34 (17.4) | 1.50 (0.77-2.95) | 30 (25) | 4 (16) | 26 (27.4) | 0.35 (0.10-1.21) | 88 (20.6) | 28 (20.3) | 60 (20.7) | 1.01 (0.57-1.80)
Unknown | 37 (12) | 22 (19.5) | 15 (7.7) | 3.12 (1.43-6.83)a | 8 (6.7) | 0 (0) | 8 (8.4) | 0.00 (NA) | 45 (10.5) | 22 (15.9) | 23 (7.9) | 2.07 (1.04-4.11)a
Age, yb | 27.52 | 27.08 | 27.77 | P-val .29 | 27.98 | 27.08 | 28.22 | P-val .31 | 27.65 | 27.08 | 27.92 | P-val .13
BMI, kg/m2b | 26.87 | 26.64 | 27.00 | P-val .65 | 28.02 | 27.82 | 28.07 | P-val .86 | 27.19 | 26.86 | 27.35 | P-val .47
Clinical phenotypes
Infection or inflammation
No evidence | 161 (52.3) | 54 (47.8) | 107 (54.9) | ref | 73 (60.8) | 14 (56) | 59 (62.1) | ref | 234 (54.7) | 68 (49.3) | 166 (57.2) | ref
Some evidence | 147 (47.7) | 59 (52.2) | 88 (45.1) | 1.33 (0.83-2.11) | 47 (39.2) | 11 (44) | 36 (37.9) | 1.29 (0.53-3.14) | 194 (45.3) | 70 (50.7) | 124 (42.8) | 1.38 (0.92-2.07)
Decidual hemorrhage
No evidence | 267 (86.7) | 84 (74.3) | 183 (93.8) | ref | 109 (90.8) | 22 (88) | 87 (91.6) | ref | 376 (87.9) | 106 (76.8) | 270 (93.1) | ref
Some evidence | 41 (13.3) | 29 (25.7) | 12 (6.2) | 5.26 (2.56-10.82)a | 11 (9.2) | 3 (12) | 8 (8.4) | 1.48 (0.36-6.06) | 52 (12.1) | 32 (23.2) | 20 (6.9) | 4.08 (2.23-7.44)a
Cervical insufficiency
No evidence | 281 (91.2) | 95 (84.1) | 186 (95.4) | ref | 111 (92.5) | 22 (88) | 89 (93.7) | ref | 392 (91.6) | 117 (84.8) | 275 (94.8) | ref
Some evidence | 27 (8.8) | 18 (15.9) | 9 (4.6) | 3.92 (1.69-9.05)a | 9 (7.5) | 3 (12) | 6 (6.3) | 2.02 (0.47-8.73) | 36 (8.4) | 21 (15.2) | 15 (5.2) | 3.29 (1.64-6.61)a
Placental dysfunction
No evidence | 243 (78.9) | 85 (75.2) | 158 (81) | ref | 70 (58.3) | 11 (44) | 59 (62.1) | ref | 313 (73.1) | 96 (69.6) | 217 (74.8) | ref
Possible evidence | 42 (13.6) | 21 (18.6) | 21 (10.8) | 1.86 (0.96-3.60) | 17 (14.2) | 3 (12) | 14 (14.7) | 1.15 (0.28-4.68) | 59 (13.8) | 24 (17.4) | 35 (12.1) | 1.55 (0.87-2.75)
Moderate evidence | 23 (7.5) | 7 (6.2) | 16 (8.2) | 0.81 (0.32-2.05) | 33 (27.5) | 11 (44) | 22 (23.2) | 2.68 (1.02-7.06)a | 56 (13.1) | 18 (13) | 38 (13.1) | 1.07 (0.58-1.97)
Family history
No evidence | 176 (57.1) | 61 (54) | 115 (59) | ref | 68 (56.7) | 11 (44) | 57 (60) | ref | 244 (57) | 72 (52.2) | 172 (59.3) | ref
Moderate evidence | 32 (10.4) | 9 (8) | 23 (11.8) | 0.74 (0.32-1.69) | 18 (15) | 3 (12) | 15 (15.8) | 1.04 (0.26-4.19) | 50 (11.7) | 12 (8.7) | 38 (13.1) | 0.75 (0.37-1.53)
Strong evidence | 100 (32.5) | 43 (38.1) | 57 (29.2) | 1.42 (0.86-2.35) | 34 (28.3) | 11 (44) | 23 (24.2) | 2.48 (0.94-6.51) | 134 (31.3) | 54 (39.1) | 80 (27.6) | 1.61 (1.04-2.51)a
Maternal comorbidities
No evidence | 243 (78.9) | 88 (77.9) | 155 (79.5) | ref | 86 (71.7) | 16 (64) | 70 (73.7) | ref | 329 (76.9) | 104 (75.4) | 225 (77.6) | ref
Some evidence | 65 (21.1) | 25 (22.1) | 40 (20.5) | 1.10 (0.63-1.94) | 34 (28.3) | 9 (36) | 25 (26.3) | 1.58 (0.62-4.01) | 99 (23.1) | 34 (24.6) | 65 (22.4) | 1.13 (0.70-1.82)
Maternal stress
No evidence | 235 (76.3) | 84 (74.3) | 151 (77.4) | ref | 88 (73.3) | 15 (60) | 73 (76.8) | ref | 323 (75.5) | 99 (71.7) | 224 (77.2) | ref
Possible evidence | 44 (14.3) | 17 (15) | 27 (13.8) | 1.13 (0.58-2.20) | 18 (15) | 7 (28) | 11 (11.6) | 3.10 (1.03-9.29)a | 62 (14.5) | 24 (17.4) | 38 (13.1) | 1.43 (0.81-2.51)
Strong evidence | 29 (9.4) | 12 (10.6) | 17 (8.7) | 1.27 (0.58-2.78) | 14 (11.7) | 3 (12) | 11 (11.6) | 1.33 (0.33-5.34) | 43 (10) | 15 (10.9) | 28 (9.7) | 1.21 (0.62-2.37)
Previous obstetric history
Previous caesarean section
No | 262 (85.1) | 87 (77) | 175 (89.7) | ref | 78 (65) | 16 (64) | 62 (65.3) | ref | 340 (79.4) | 103 (74.6) | 237 (81.7) | ref
Yes | 46 (14.9) | 26 (23) | 20 (10.3) | 2.61 (1.38-4.95)a | 42 (35) | 9 (36) | 33 (34.7) | 1.06 (0.42-2.65) | 88 (20.6) | 35 (25.4) | 53 (18.3) | 1.52 (0.94-2.47)
Number of previous deliveriesb | 2.23 | 2.26 | 2.22 | P-val .81 | 2.12 | 2.44 | 2.03 | P-val .09 | 2.20 | 2.29 | 2.16 | P-val .33
Number of abortionsb | 0.45 | 0.42 | 0.48 | P-val .54 | 0.43 | 0.68 | 0.36 | P-val .07 | 0.45 | 0.46 | 0.44 | P-val .81
Time since last pregnancy, yb | 3.32 | 3.21 | 3.38 | P-val .55 | 3.51 | 3.48 | 3.52 | P-val .95 | 3.37 | 3.26 | 3.42 | P-val .54
Previous birth weight, gb | 2335.37 | 2139.30 | 2448.99 | P-val .001a | 2394.79 | 2004.48 | 2497.50 | P-val .02a | 2352.03 | 2114.87 | 2464.88 | P-val < .0001a

Values are n (%) or mean. Univariable ORs and 95% CIs were calculated by chi-square test for categorical variables in comparison to the reference category. P-vals were calculated by 1-way analysis of variance test and are reported instead of ORs for continuous variables.

BMI: body mass index; CI: confidence interval; OR: odds ratio; P-val: P value.

a Statistically significant value.

b Continuous variable.

Univariate analysis

Findings from univariate analyses are summarized in Table 3. In terms of demographics, women delivering preterm were on average 0.84 years younger and had lower BMI (26.86 kg/m2 for preterm vs 27.35 kg/m2 for term deliveries), although these differences were not statistically significant. Being Hispanic or Latino was associated with decreased risk for spontaneous preterm birth (odds ratio [OR], 0.55; 95% CI, 0.32-0.95). Paternal race other than African American or Caucasian was associated with increased risk for indicated preterm birth (OR, 3.83; 95% CI, 1.03-14.28). In terms of clinical phenotypes, increased risk for spontaneous preterm birth was observed for participants with decidual hemorrhage (OR, 5.26; 95% CI, 2.56-10.82) and with cervical insufficiency (OR, 3.92; 95% CI, 1.69-9.05). Increased risk for indicated preterm birth was observed for participants with possible evidence for maternal stress (OR, 3.10; 95% CI, 1.03-9.29) and with moderate evidence for placental dysfunction (OR, 2.68; 95% CI, 1.02-7.06). An additional risk factor that was significant only when the delivery subtypes were combined was strong evidence for family history of preterm birth (OR, 1.61; 95% CI, 1.04-2.51). In terms of previous obstetric history, participants delivering preterm had, on average, given birth to lighter babies in the previous gestation (2115 g vs 2465 g; P < .0001). The association of lower birth weight in the previous gestation with preterm birth remained statistically significant across both delivery subtypes (P = .001 for spontaneous and P = .02 for indicated). The mean number of previous abortions was higher among participants with indicated preterm delivery (0.68 vs 0.36), although this difference did not reach statistical significance (P = .07). In addition, increased risk for spontaneous preterm birth was observed for participants with previous caesarean section (OR, 2.61; 95% CI, 1.38-4.95).
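The univariable ORs reported above can be recomputed from the counts in Table 3. The following is a minimal sketch, assuming a Wald (log-odds) interval; the text does not state which interval method was used, and the function name is hypothetical. The counts are those of the decidual hemorrhage rows for the spontaneous subtype (29 preterm and 12 term with some evidence; 84 preterm and 183 term with no evidence).

```python
import math

def odds_ratio_wald_ci(exposed_preterm, exposed_term, ref_preterm, ref_term, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table of counts.

    Assumption: the interval is built on the log-odds scale (Wald method).
    """
    odds_ratio = (exposed_preterm * ref_term) / (exposed_term * ref_preterm)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts
    se = math.sqrt(1 / exposed_preterm + 1 / exposed_term
                   + 1 / ref_preterm + 1 / ref_term)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)

# Decidual hemorrhage, spontaneous delivery subtype (counts from Table 3)
or_value, ci_low, ci_high = odds_ratio_wald_ci(29, 12, 84, 183)
print(f"OR {or_value:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

Under this assumption the sketch gives values that round to the OR and CI reported for that row.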

A pairwise log-rank test for the difference in Kaplan-Meier survival curves showed statistically significant differences for 8 participant risk factors for spontaneous delivery, 1 risk factor for indicated delivery, and 1 additional for the combined cohort. The Kaplan-Meier curves that showed statistically significant differences are displayed in Figure 2. The median gestational age at delivery for participants with the spontaneous delivery subtype was 37 weeks, while the median for indicated delivery was 38 weeks. We observed a difference between the delivery subtype survival curves (P = .04) (Figure 2A). The difference between participants with strong family history and those with no family history appeared significant when delivery subtypes were combined and analyzed together (P = .02) (Figure 2B). For spontaneous delivery, Kaplan-Meier analysis showed 4 sociodemographic groups that delivered earlier than the reference group: participants who reported paternal race other than Caucasian or African American (P < .005), maternal ethnicity not Hispanic or Latino (P = .04), having a private insurance status (P = .01), and having an unknown household income (P = .05). Participants who reported household income of $50 000 or more were found to deliver significantly later than the group with the lowest household income (P = .04 [not shown]) (Figure 2C-2F). None of the sociodemographic or previous obstetric history risk factors were found to be different for indicated delivery participants. Among the factors related to previous obstetric history, only previous caesarean section was associated with earlier birth timing for the spontaneous delivery subtype (P = .01) (Figure 2G). Participants with some evidence for 3 clinical phenotypes delivered at an earlier gestational age: decidual hemorrhage (P < .005), cervical insufficiency (P < .005), and possible evidence for placental dysfunction (P = .02) (Figure 2H-2J).
The only phenotype that had statistically significant findings for indicated delivery was placental dysfunction: moderate evidence for the phenotype was associated with delivery at an earlier gestational age (P < .005) (Figure 2K).
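To make the survival framing concrete, the sketch below implements the Kaplan-Meier product-limit estimator in plain Python. It is illustrative only: the toy cohort is hypothetical, and it assumes that the event is preterm delivery and that participants who reach 37 weeks are censored at that point.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : gestational age at delivery or censoring, in weeks
    events : 1 if a preterm delivery was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deliveries = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deliveries > 0:
            # Product-limit step: multiply by the fraction still pregnant after t
            survival *= 1 - deliveries / at_risk
            curve.append((t, survival))
        # Everyone observed at t (event or censored) leaves the risk set
        at_risk -= leaving
    return curve

# Hypothetical toy cohort of 10 participants; 5 are censored at 37 weeks
weeks = [30, 33, 35, 35, 36, 37, 37, 37, 37, 37]
events = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
curve = kaplan_meier(weeks, events)
for week, prob in curve:
    print(f"week {week}: probability of not yet having delivered = {prob:.2f}")
```

Each value is the estimated probability of not having delivered by that gestational age, which is how the curves in Figure 2 are read. A log-rank test then compares such curves between groups.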

Figure 2.

Univariate survival analysis by Kaplan-Meier estimate. Change in survival probability over time is plotted with Kaplan-Meier estimator. The survival probability at a given time point is interpreted as probability of not delivering at a given gestational age. The most significant P value for the pairwise log-rank test for significance is displayed in the upper right corner of each graph. The patient characteristics for which Kaplan-Meier curves differed significantly between groups included delivery subtype (A), family history in total population (B), paternal race (C), maternal ethnicity (D), insurance status (E), household income (F), previous caesarean section (G), decidual hemorrhage (H), cervical insufficiency (I), placental dysfunction in the spontaneous delivery subgroup (J), and placental dysfunction in the indicated delivery subgroup (K). The groups that differed significantly are described in the text.

Multivariable analysis

Among 308 women who had a spontaneous delivery, 113 participants experienced a preterm delivery event, and 195 participants were treated as censored subjects. Four covariates were found to increase the risk for experiencing the event in the multivariable model (Table 4). Participants with some evidence for either decidual hemorrhage (HR, 3.64; 95% CI, 2.17-6.12), or cervical insufficiency (HR, 3.63; 95% CI, 2.00-6.58) were 3.6 times as likely to deliver preterm as women with no evidence for those phenotypes. In addition, lower birth weight in previous delivery was associated with a higher risk for subsequent preterm birth (HR, 1.00; 95% CI, 0.99-1.00; P < .005). Among the sociodemographic factors, unknown household income increased the risk for preterm delivery more than 2-fold when compared with the lowest-income group (HR, 2.40; 95% CI, 1.24-4.66).
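A hazard ratio that rounds to 1.00 for previous birth weight can still be meaningful, because it is expressed per gram. The sketch below rescales a per-unit hazard ratio to a larger increment. The per-gram value used here is hypothetical; only the rounded value 1.00 is reported, so the output illustrates the arithmetic and is not an estimate from the study.

```python
import math

def rescale_hazard_ratio(hr_per_unit, units):
    """Hazard ratio for a change of `units`, given the per-unit hazard ratio.

    In a Cox model the log hazard is linear in the covariate, so the
    log hazard ratio scales with the size of the change.
    """
    return math.exp(math.log(hr_per_unit) * units)

# Hypothetical per-gram hazard ratio (would be displayed as 1.00 after rounding)
hr_per_gram = 0.9995

hr_500g_heavier = rescale_hazard_ratio(hr_per_gram, 500)
hr_500g_lighter = rescale_hazard_ratio(hr_per_gram, -500)
print(f"500 g heavier previous baby: HR {hr_500g_heavier:.2f}")
print(f"500 g lighter previous baby: HR {hr_500g_lighter:.2f}")
```

With this assumed coefficient, a difference of 500 g changes the hazard by roughly a quarter, even though the per-gram value is indistinguishable from 1.00 at two decimal places.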

Table 4.

Association between birth timing and obstetric history, sociodemographic, and clinical risk factors (Cox regression model)

Hazard ratio (95% CI)
Spontaneous Indicated Total
Sociodemographic factors
Maternal race
Caucasian ref ref ref
Black or African American 2.08 (0.54-8.03) 1.94 × 10^9 (0.00-inf) 1.73 (0.49-6.12)
Other 1.42 (0.49-4.15) 0.01 (0.00-0.54)a 1.17 (0.46-2.97)
Paternal race
Caucasian ref ref ref
Black or African American 0.67 (0.19-2.37) 5.78 × 10^-10 (0.00-inf) 0.70 (0.21-2.33)
Other 0.68 (0.23-2.04) 30.85 (0.79-1208.15) 0.92 (0.36-2.36)
Maternal ethnicity
Not Hispanic or Latino ref ref ref
Hispanic or Latino 0.42 (0.13-1.36) 14.09 (0.14-1427.37) 0.46 (0.15-1.37)
Paternal ethnicity
Not Hispanic or Latino ref ref ref
Hispanic or Latino 2.19 (0.76-6.37) 0.23 (0.01-9.98) 2.26 (0.82-6.25)
Education level
Less than high school ref ref ref
High school or GED 1.32 (0.67-2.61) 2.30 (0.32-16.64) 1.36 (0.77-2.39)
College 0.92 (0.38-2.23) 2.21 (0.20-24.79) 0.80 (0.39-1.63)
Employment
Employed ref ref ref
Unemployed 0.90 (0.57-1.44) 1.17 (0.32-4.20) 0.85 (0.57-1.28)
Insurance status
Public ref ref ref
Private 0.63 (0.28-1.45) 0.07 (0.00-1.26) 0.62 (0.29-1.30)
None/self-pay 0.32 (0.08-1.26) 0.64 (0.02-26.79) 0.57 (0.19-1.73)
Marital status
Living with partner ref ref ref
Not living with partner 0.84 (0.47-1.50) 0.12 (0.02-0.80)a 0.75 (0.45-1.24)
Household income
$0-$12 000 ref ref ref
$12 001-$24 000 0.69 (0.34-1.40) 0.04 (0.01-0.31)a 0.64 (0.35-1.17)
$24 001-$50 000 1.31 (0.54-3.18) 0.02 (0.00-0.43)a 0.86 (0.41-1.82)
>$50 000 1.26 (0.48-3.32) 0.01 (0.00-0.29)a 1.13 (0.48-2.62)
Unknown 2.40 (1.24-4.66)a 3.77 × 10^-11 (0.00-inf) 1.85 (1.04-3.30)a
Age 0.98 (0.92-1.04) 0.94 (0.77-1.15) 0.94 (0.89-0.99)a
BMI 0.98 (0.95-1.01) 0.91 (0.78-1.05) 0.98 (0.95-1.01)
Clinical phenotypes
Infection/inflammation
No evidence ref ref ref
Some evidence 1.18 (0.78-1.79) 1.83 (0.45-7.40) 1.12 (0.78-1.62)
Decidual hemorrhage
No evidence ref ref ref
Some evidence 3.64 (2.17-6.12)a 1.17 (0.19-7.08) 2.97 (1.88-4.69)a
Cervical insufficiency
No evidence ref ref ref
Some evidence 3.63 (2.00-6.58)a 2.20 (0.27-17.78) 3.00 (1.74-5.16)a
Placental dysfunction
No evidence ref ref ref
Possible evidence 1.67 (0.98-2.85) 0.36 (0.04-3.30) 1.16 (0.70-1.91)
Moderate evidence 1.07 (0.44-2.61) 10.70 (2.71-42.22)a 1.22 (0.71-2.11)
Family history
No evidence ref ref ref
Moderate evidence 0.54 (0.35-1.17) 0.23 (0.03-2.01) 0.55 (0.28-1.06)
Strong evidence 1.13 (0.74-1.72) 2.65 (0.72-9.73) 1.22 (0.84-1.78)
Maternal comorbidities
No evidence ref ref ref
Some evidence 1.04 (0.64-1.70) 1.96 (0.54-7.16) 1.05 (0.69-1.59)
Maternal stress
No evidence ref ref ref
Possible evidence 1.33 (0.74-2.39) 4.36 (0.93-20.47) 1.16 (0.70-1.90)
Strong evidence 1.29 (0.62-2.66) 1.23 (0.16-9.60) 1.27 (0.71-2.29)
Previous obstetric history
Previous caesarean section
No ref ref ref
Yes 1.65 (0.97-2.82) 4.83 (1.01-23.05)a 1.28 (0.82-2.00)
Number of previous deliveries 1.08 (0.90-1.30) 2.21 (1.17-4.20)a 1.25 (1.08-1.45)a
Number of abortions 1.01 (0.79-1.29) 3.33 (1.54-7.24)a 1.15 (0.93-1.42)
Time since last pregnancy 1.04 (0.94-1.14) 1.24 (0.95-1.62) 1.05 (0.96-1.14)
Previous birth weightb 1.00 (0.99-1.00)a 1.00 (0.99-1.00)a 1.00 (0.99-1.00)a
Model performance
Concordance 0.76 (SE = 0.02) 0.87 (SE = 0.04) 0.72 (SE = 0.02)
AIC 1212.706 239.2805 1593.33
Log-rank test 115.8 (P < .001)a 47.22 (P = .05)a 106.8 (P < .001)a

Hazard ratios and 95% CIs are presented for each category in regard to the reference group. Owing to a small number of participants in the dataset who had preterm indicated delivery (n = 25), maternal race and paternal race were perfectly collinear for the Black or African American category, which resulted in abnormal hazard ratios and infinite CIs. None of the participants who had an indicated preterm delivery had an unknown income, which led to a hazard ratio approaching zero and an infinite CI. Concordance, AIC values, and log-rank test statistics are presented as model performance measures. Higher concordance, lower AIC, and lower P values for log-rank test are characteristics of a better fit.

AIC: Akaike information criterion; BMI: body mass index; CI: confidence interval; inf: infinity; ref: reference.

a Statistically significant value.

b The 95% CI for previous birth weight does not cross the line of no effect. The upper CI value is reported as 1.00 due to rounding.

Only 25 of 120 participants in the indicated delivery cohort delivered before 37 weeks of gestation, and the majority of participants were censored. Among the sociodemographic determinants, the HR was decreased for participants who had a race other than Caucasian or Black or African American (HR, 0.01; 95% CI, 0.00-0.54). The other significant sociodemographic factors that reduced the risk included not living with partner (HR, 0.12; 95% CI, 0.02-0.80) and having income higher than $12 000 (for the $12 001-$24 000 group: HR, 0.04; 95% CI, 0.01-0.31; for the $24 001-$50 000 group: HR, 0.02; 95% CI, 0.00-0.43; and for the higher than $50 000 group: HR, 0.01; 95% CI, 0.00-0.29). Previous obstetric history played a major role in predicting subsequent indicated preterm delivery. Lower previous birth weight (HR, 1.00; 95% CI, 0.99-1.00; P < .01), higher number of previous deliveries (HR, 2.21; 95% CI, 1.17-4.20), higher number of abortions (HR, 3.33; 95% CI, 1.54-7.24), and previous caesarean section (HR, 4.83; 95% CI, 1.01-23.05) were all predictive of indicated preterm birth. Moderate evidence for placental dysfunction remained the only clinical risk factor for the indicated cohort (HR, 10.70; 95% CI, 2.71-42.22).

Multivariable analysis of the cohort independent of the delivery type identified 6 important covariates. Maternal age was the only one that did not show any significance for distinct delivery subtypes but appeared significant for the combined cohort. Lower maternal age was associated with increased risk for delivering preterm (HR, 0.94; 95% CI, 0.89-0.99). The findings for decidual hemorrhage and cervical insufficiency were consistent with the results for the spontaneous delivery cohort, with higher level of evidence increasing the risk in both phenotypes (HR, 2.97; 95% CI, 1.88-4.69; and HR, 3.00; 95% CI, 1.74-5.16). Higher number of previous deliveries and lower birth weight were the final important features in the Cox proportional hazards model (HR, 1.25; 95% CI, 1.08-1.45; and HR, 1.00; 95% CI, 0.99-1.00; P < .005).

Model assessment

Proportional hazards assumption

The scaled Schoenfeld residuals for the Cox regression models were significantly correlated with time for the naïve model fit on the combined data (P < .005 globally). The problematic variables included employment (P < .01), cervical insufficiency (P < .005), and placental dysfunction (P = .04). For the spontaneous delivery subtype, no significant correlation was found globally, although maternal ethnicity and cervical insufficiency were time-dependent (P = .02 and P < .005, respectively). Owing to a low number of subjects delivering before 37 weeks in the indicated cohort, maternal race and paternal race were perfectly collinear for Black or African American in both groups, and none of the observed participants had an unknown household income. This resulted in the infinite upper CI values for these 3 categories. As a consequence, we were unable to estimate Schoenfeld residuals for the resultant model for the indicated delivery cohort.

Model performance

The performance metrics for the Cox proportional hazards regression models are summarized at the bottom of Table 4. The higher concordance index and lower AIC score indicate a better-performing model. Both models specific for a delivery subtype were superior to the naïve model in terms of the concordance index and AIC. Cox regression model was more concordant for indicated delivery than for spontaneous delivery (0.87 vs 0.76) and had a lower AIC value (239.28 vs 1212.71), which indicates better performance of the model when applied to the indicated delivery cohort. The models fit on the total and spontaneous delivery cohorts were highly statistically significant, and the indicated delivery model showed borderline statistical significance on the log-rank test.
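The concordance index used above can be illustrated with a short sketch of Harrell's C in plain Python. This is an assumption about the general form of the statistic, not the authors' code, and the toy data and risk scores are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the participant who
    delivered earlier also has the higher predicted risk (ties count 0.5)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i has an observed event strictly
            # before j's event or censoring time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical toy data: weeks at delivery or censoring, event flag, model risk
weeks = [30, 33, 35, 36, 37, 37]
events = [1, 1, 1, 1, 0, 0]
risk = [2.5, 1.0, 1.8, 0.4, 0.6, 0.2]
c_index = concordance_index(weeks, events, risk)
print(f"concordance = {c_index:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the 0.72, 0.76, and 0.87 values in Table 4 are read.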

DISCUSSION

The analytical framework developed in this study was effective for modeling preterm birth in a retrospective cohort of women with a history of preterm birth. Our comprehensive analysis included 52 known risk factors for premature delivery. To reduce dimensionality of the data, we utilized a clinical phenotyping tool for preterm birth guided by clinician expertise. We examined the association between individual risk factors and preterm birth by chi-square and univariate survival analysis. We then developed a multivariable Cox regression model that calculates the risk for recurrent preterm birth. Use of the multivariable model helped us account for the complex phenotype of preterm birth and adjust for confounding risk factors. In addition to the entire cohort, we performed separate analyses of spontaneous and indicated delivery subtypes. Both delivery subtype-specific models performed better than the combined model, with the indicated delivery model showing the best performance. Half of the covariates included were predictive of an earlier delivery time across all models. To our knowledge, the present study is the most comprehensive analysis of known risk factors for preterm birth published to date.20–23

The 4-step framework described in this study provides researchers with a meaningful and reliable strategy for analyzing risk factors for preterm birth on various datasets. The comprehensive nature of the variables selected for the analysis approximates the complex patient profile, while clinical phenotyping organizes the data and reduces complexity. We believe that due to these features, the proposed framework can be efficiently integrated into analysis of preterm birth using EHR data. More work, however, would be needed to explore potential applications with EHR data. Although there exists a tool developed by Gao et al71 that predicts preterm birth from EHR data, it is not based on known risk factors, does not separate the patients by delivery subtype, and has a low positive predictive value. Therefore, there is a need for more efficient strategies to analyze a patient’s risk for preterm birth from EHR data.

It is important to note that the present study did not aim to identify new risk factors predictive of preterm birth, but rather sought to combine already known risk factors in a meaningful way. The inclusion of many factors more closely reflects the complex patient profile observed in a real-world setting, in which the interplay between individual risk factors may change a patient’s overall risk for delivering preterm. The multivariable model accounts for such interplay and adjusts for potential confounding factors.

Previous studies have found numerous clinical risk factors associated with preterm birth. Many of them are either closely related in etiology or affect the mother in similar ways.35 Therefore, modeling preterm birth with individual factors as covariates adds unnecessary complexity to the models and can lead to false discoveries. The work of others has shown that feature engineering with clinical expert knowledge can reduce complexity of machine learning models without affecting their performance.72 In the present study, clinical risk factors were combined into clinical phenotypes according to a previously developed classification of risk factors and expanded upon by a clinical specialist. Dimension reduction is an important step in analyzing unstructured and noisy medical records data. The clinical phenotyping step of the proposed framework helps with dimension reduction and organizes data in a meaningful way, reducing the complexity of the models and improving interpretability. This demonstrates the potential of the proposed framework to be applied for the analysis of highly unstructured EHR data.

It is important to recognize the difference in the etiology of preterm birth subtypes. Spontaneous and indicated preterm births have different underlying risk factors, and therefore separate risk scores must be calculated for different delivery subtypes. As was expected, stratification by delivery subtype improved the performance of our model. The better performance of the model for indicated delivery may be explained in part by the fact that patients are usually induced for delivery before term due to very particular clinical characteristics, whereas spontaneous preterm birth depends on many other factors such as the mother’s lifestyle44,73–75 and genetics,76–78 which were not included in the current analysis.

In a clinical setting, the results obtained from the multivariable model can help physicians identify the patients at high risk of delivering preterm and apply appropriate interventions to mitigate the risk. It is important to perform risk assessment early to improve the effectiveness of interventions. Therefore, we suggest that preterm birth risk assessment should be conducted during a routine prenatal care visit in the first or early second trimester of pregnancy. However, because the indication for delivery does not usually occur until the late phase of gestation, it is impossible to know which subtype-specific model to apply at the suggested time point. Considering the superior performance of subtype-specific models over the naïve model, we suggest estimating the risk for both spontaneous and indicated preterm birth with 2 separate models. The results from the spontaneous model can help a physician choose an appropriate course of action to reduce the risk for spontaneous preterm birth. The results of an indicated model, however, are harder to interpret, given the fact that some major risk factors are themselves the clinician’s reasons for indication. A high risk score derived from the indicated model in the absence of the reason for indication would signal to the physician that a patient is likely to develop a condition that would lead to indication prior to 37 weeks. With that knowledge, the physician can apply proper preventative measures to avoid such developments.
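The suggestion to score each patient with both subtype-specific models can be sketched as follows. This is a deliberately simplified illustration: it uses only a handful of hazard ratios from Table 4, whereas the fitted models include every covariate, and the dictionary keys, function name, and example patient are hypothetical. The result is a hazard relative to a patient at all reference levels, not an absolute probability of preterm birth.

```python
import math

# Hazard ratios copied from Table 4; restricting each model to two
# covariates is an illustrative simplification, not the fitted model.
SPONTANEOUS_HR = {"decidual_hemorrhage": 3.64, "cervical_insufficiency": 3.63}
INDICATED_HR = {"placental_dysfunction_moderate": 10.70, "previous_caesarean": 4.83}

def relative_hazard(hazard_ratios, patient):
    """exp(sum of beta_k * x_k), relative to a patient at all reference levels."""
    log_hazard = sum(math.log(hr) * patient.get(name, 0)
                     for name, hr in hazard_ratios.items())
    return math.exp(log_hazard)

# Hypothetical patient: evidence of decidual hemorrhage and a previous caesarean
patient = {"decidual_hemorrhage": 1, "previous_caesarean": 1}

spontaneous_score = relative_hazard(SPONTANEOUS_HR, patient)
indicated_score = relative_hazard(INDICATED_HR, patient)
print(f"spontaneous model: {spontaneous_score:.2f}x reference hazard")
print(f"indicated model:   {indicated_score:.2f}x reference hazard")
```

Reporting the two scores side by side reflects the point made above: the delivery subtype is not known early in pregnancy, so neither model can be chosen in advance.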

Our study had some important limitations. First, the relatively small number of participants who delivered preterm in the indicated delivery cohort reduced the statistical significance of the findings and made the model difficult to interpret. Second, we imputed prepregnancy BMI and previous birth weight for participants with missing values, which may have affected our findings. Third, the data used in this study were collected in 2009 to 2010. There may be changes in practice or new interventions since then to prevent preterm birth that were not captured in the dataset. Fourth, the proportional hazards assumption did not hold for several covariates in the naïve and spontaneous models, although it held globally for the spontaneous delivery subtype. In the future, time-dependent covariates could be introduced for the problematic variables to overcome this limitation. Recent work by Stensrud et al.,79 however, reported that virtually all real-world datasets will violate the proportional hazards assumption, and that the assumption does not necessarily need to hold true when the goal is to predict the outcome. Finally, only risk factors related to participants’ previous pregnancy history, sociobehavior, and clinical characteristics were included in the analysis. Future studies are needed to explore changes in model performance with the inclusion of genomic and behavior pattern covariates.

Other future directions include exploring the utility of the proposed framework and assessing the predictive potential of the created Cox regression models on the EHR data. We believe that the analytical framework proposed in this study will be useful when working with highly unstructured medical records data. The expanded version of the clinical phenotyping tool can help with extracting meaningful data in that setting, and Cox regression analysis can adjust for patients that are “lost to follow-up.”

CONCLUSION

The analytical framework developed in this study was effective for comprehensively analyzing risk factors for recurrent preterm birth. Delivery subtype–specific multivariable models were statistically significant and performed better than the combined model. The proposed clinical phenotyping strategy reduces the complexity of the models and may serve as a tool to extract relevant information from medical records. Further studies, however, are needed to assess the performance of the models on EHR data before they can be used to predict a patient’s risk for preterm birth prospectively.

FUNDING

This work was supported in part by a Microsoft Investigator Fellowship awarded to COT.

AUTHOR CONTRIBUTIONS

The first draft of the manuscript was prepared by IR, NFM, and COT. IR, NFM, and COT contributed to the conceptualization and design of the study. ACJ provided critical clinical direction. COT acquired the data. IR, NFM, JB, and COT contributed to the design of the statistical approaches. IR and NFM contributed to data analysis. All authors contributed to writing of the manuscript and approved the submitted version.

CONFLICT OF INTEREST STATEMENT

None declared.

DATA AVAILABILITY STATEMENT

Data underlying the study were provided by a third party, the Data and Specimen Hub maintained by the National Institutes of Health Eunice Kennedy Shriver National Institute of Child Health and Human Development. Data and biospecimens are available upon request from https://dash.nichd.nih.gov/study/13. This study uses data readily available for download from DASH.

REFERENCES

  • 1.World Health Organization (WHO). Preterm birth. https://www.who.int/news-room/fact-sheets/detail/preterm-birth. Accessed June 4, 2021.
  • 2. Platt MJ.  Outcomes in preterm infants. Public Health  2014; 128 (5): 399–403. [DOI] [PubMed] [Google Scholar]
  • 3. Mwaniki MK, Atieno M, Lawn JE, Newton CRJC.  Long-term neurodevelopmental outcomes after intrauterine and neonatal insults: a systematic review. Lancet  2012; 379 (9814): 445–52. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.UNICEF. Levels and trends in child mortality 2020. https://www.unicef.org/reports/levels-and-trends-child-mortality-report-2020. Accessed June 1, 2021.
  • 5. Yang J, Baer RJ, Berghella V, et al.  Recurrence of preterm birth and early term birth. Obstet Gynecol  2016; 128 (2): 364–72. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Phillips C, Velji Z, Hanly C, Metcalfe A.  Risk of recurrent spontaneous preterm birth: a systematic review and meta-analysis. BMJ Open  2017; 7 (6): e015402. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Ananth CV, Ananth CV, Vintzileos AM.  Epidemiology of preterm birth and its clinical subtypes. J Matern Neonatal Med  2006; 19 (12): 773–82. [DOI] [PubMed] [Google Scholar]
  • 8. Mazaki-Tovi S, Romero R, Kusanovic JP, et al.  Recurrent preterm birth. Semin Perinatol  2007; 31 (3): 142–58. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Medley N, Vogel JP, Care A, Alfirevic Z.  Interventions during pregnancy to prevent preterm birth: an overview of Cochrane systematic reviews. Cochrane Database Syst Rev  2018; 11: CD012505. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Conde-Agudelo A, Romero R, Da Fonseca E, et al.  Vaginal progesterone is as effective as cervical cerclage to prevent preterm birth in women with a singleton gestation, previous spontaneous preterm birth, and a short cervix: updated indirect comparison meta-analysis. Am J Obstet Gynecol  2018; 219 (1): 10–25. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Alfirevic Z, Stampalija T, Medley N.  Cervical stitch (cerclage) for preventing preterm birth in singleton pregnancy. Cochrane Database Syst Rev  2017; 6 (6): CD008991. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Hubinont C, Debieve F.  Prevention of preterm labour: 2011 update on tocolysis. J Pregnancy  2011; 2011: 941057. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Cetingoz E, Cam C, Sakalli M, Karateke A, Celik C, Sancak A.  Progesterone effects on preterm birth in high-risk pregnancies: a randomized placebo-controlled trial. Arch Gynecol Obstet  2011; 283 (3): 423–9. [DOI] [PubMed] [Google Scholar]
  • 14. Mackeen AD, Seibel-Seamon J, Muhammad J, Baxter JK, Berghella V.  Tocolytics for preterm premature rupture of membranes. Cochrane Database Syst Rev  2014; 2: CD007062. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Goldenberg RL, Culhane JF, Iams JD, Romero R.  Epidemiology and causes of preterm birth. Lancet  2008; 371 (9606): 75–84. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Schummers L, Hutcheon JA, Hernandez-Diaz S, et al.  Association of short interpregnancy interval with pregnancy outcomes according to maternal age. JAMA Intern Med  2018; 178 (12): 1661–70. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Shen M, Smith GN, Rodger M, White RR, Walker MC, Wen SW.  Comparison of risk factors and outcomes of gestational hypertension and pre-eclampsia. PLoS One  2017; 12 (4): e0175914. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Oskovi Kaplan ZA, Ozgu-Erdinc AS.  Prediction of preterm birth: maternal characteristics, ultrasound markers, and biomarkers: an updated overview. J Pregnancy  2018; 2018: 8367571. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Son M, Miller ES.  Predicting preterm birth: cervical length and fetal fibronectin. Semin Perinatol  2017; 41 (8): 445–51. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Zhang Q, Ananth CV, Li Z, Smulian JC.  Maternal anaemia and preterm birth: a prospective cohort study. Int J Epidemiol  2009; 38 (5): 1380–9. [DOI] [PubMed] [Google Scholar]
  • 21. van de Mheen L, Schuit E, Lim AC, et al.  Prediction of preterm birth in multiple pregnancies: development of a multivariable model including cervical length measurement at 16 to 21 weeks’ gestation. J Obstet Gynaecol Canada  2014; 36 (4): 309–19. [DOI] [PubMed] [Google Scholar]
  • 22. Tellapragada C, Eshwara VK, Bhat P, et al.  Risk factors for preterm birth and low birth weight among pregnant Indian women: a hospital-based prospective study. J Prev Med Public Health  2016; 49 (3): 165–75. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Letouzey M, Foix-L’Hélias L, Torchin H, et al. ; EPIPAGE-2 Working Group on Infections. Cause of preterm birth and late-onset sepsis in very preterm infants: the EPIPAGE-2 cohort study. Pediatr Res  2021. Feb 24 [E-pub ahead of print]. doi:10.1038/s41390-021-01411-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.National Institute of Child Health and Human Development. NICHD genomic and proteomic network for preterm birth research. https://www.omicsdi.org/dataset/dbgap/phs000714. Accessed April 1, 2021.
  • 25.NICHD genomic and proteomic network for preterm birth research, dbgap, V1 [daetaset]. 1969. https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000714. Accessed September 6, 2021.
  • 26.National Institute of Child Health and Human Development DASH - study overview. https://dash.nichd.nih.gov/study/13. Accessed April 1, 2021.
  • 27. Fuchs F, Monet B, Ducruet T, Chaillet N, Audibert F. Effect of maternal age on the risk of preterm birth: A large cohort study. PLoS One 2018; 13 (1): e0191002.
  • 28. Meng Y, Groth SW. Fathers count: the impact of paternal risk factors on birth outcomes. Matern Child Health J 2018; 22 (3): 401–8.
  • 29. Manuck TA. Racial and ethnic differences in preterm birth: a complex, multifactorial problem. Semin Perinatol 2017; 41 (8): 511–8.
  • 30. Hidalgo-Lopezosa P, Jiménez-Ruz A, Carmona-Torres JM, Hidalgo-Maestre M, Rodríguez-Borrego MA, López-Soto PJ. Sociodemographic factors associated with preterm birth and low birth weight: a cross-sectional study. Women Birth 2019; 32 (6): e538–e543.
  • 31. Dang B. Birth outcomes among low-income women—documented and undocumented. Perm J 2011; 15 (2): 39–43. doi:10.7812/tpp/10-131.
  • 32. Rodrigues T, Barros H. Maternal unemployment: an indicator of spontaneous preterm delivery risk. Eur J Epidemiol 2008; 23 (10): 689–93.
  • 33. Ahern J, Pickett KE, Selvin S, Abrams B. Preterm birth among African American and white women: a multilevel analysis of socioeconomic characteristics and cigarette smoking. J Epidemiol Community Health 2003; 57 (8): 606–11.
  • 34. Korenromp EL, Rowley J, Alonso M, et al. Global burden of maternal and congenital syphilis and associated adverse birth outcomes—estimates for 2016 and progress since 2012. PLoS One 2019; 14 (2): e0211720.
  • 35. Manuck TA, Esplin MS, Biggio J, et al. The phenotype of spontaneous preterm birth: application of a clinical phenotyping tool. Am J Obstet Gynecol 2015; 212 (4): 487.e1–11.
  • 36. Li DK, Raebel MA, Cheetham TC, et al. Genital herpes and its treatment in relation to preterm delivery. Am J Epidemiol 2014; 180 (11): 1109–17.
  • 37. Bianchi-Jassir F, Seale AC, Kohli-Lynch M, et al. Preterm birth associated with Group B streptococcus maternal colonization worldwide: systematic review and meta-analyses. Clin Infect Dis 2017; 65 (suppl_2): S133–42.
  • 38. Kong L, Nilsson IAK, Gissler M, Lavebratt C. Associations of maternal diabetes and body mass index with offspring birth weight and prematurity. JAMA Pediatr 2019; 173 (4): 371–8.
  • 39. Padula AM, Yang W, Lurmann FW, Balmes J, Hammond SK, Shaw GM. Prenatal exposure to air pollution, maternal diabetes and preterm birth. Environ Res 2019; 170: 160–7.
  • 40. Battarbee AN, Sinkey RG, Harper LM, Oparil S, Tita ATN. Chronic hypertension in pregnancy. Am J Obstet Gynecol 2020; 222 (6): 532–41.
  • 41. Korevaar TIM, Derakhshan A, Taylor PN, et al.; The Consortium on Thyroid and Pregnancy—Study Group on Preterm Birth. Association of thyroid function test abnormalities and thyroid autoimmunity with preterm birth: a systematic review and meta-analysis. JAMA 2019; 322 (7): 632–41.
  • 42. Rahman MM, Abe SK, Rahman MS, et al. Maternal anemia and risk of adverse birth and health outcomes in low- and middle-income countries: Systematic review and meta-analysis. Am J Clin Nutr 2016; 103 (2): 495–504.
  • 43. Koire A, Chu DM, Aagaard K. Family history is a predictor of current preterm birth. Am J Obstet Gynecol MFM 2021; 3 (1): 100277.
  • 44. Staneva A, Bogossian F, Pritchard M, Wittkowski A. The effects of maternal depression, anxiety, and perceived stress during pregnancy on preterm birth: a systematic review. Women Birth 2015; 28 (3): 179–93.
  • 45. Craigo SD. Indicated preterm birth for fetal anomalies. Semin Perinatol 2011; 35 (5): 270–6.
  • 46. Goetzinger KR, Cahill AG, Macones GA, Odibo AO. Association of first-trimester low PAPP-A levels with preterm birth. Prenat Diagn 2010; 30 (4): 309–13.
  • 47. Gomes MS, Carlos-Alves M, Trocado V, Arteiro D, Pinheiro P. Prediction of adverse pregnancy outcomes by extreme values of first trimester screening markers. Obstet Med 2017; 10 (3): 132–7.
  • 48. Ville Y, Rozenberg P. Predictors of preterm birth. Best Pract Res Clin Obstet Gynaecol 2018; 52: 23–32.
  • 49. Wang X, Chen Y, Kuang H, et al. Associations between maternal AFP and β-HCG and preterm birth. Am J Perinatol 2019; 36 (14): 1459–63.
  • 50. Waller DK, Lustig LS, Cunningham GC, Feuchtbaum LB, Hook EB. The association between maternal serum alpha-fetoprotein and preterm birth, small for gestational age infants, preeclampsia, and placental complications. Obstet Gynecol 1996; 88 (5): 816–22.
  • 51. Olsen RN, Dunsmoor-Su R, Capurro D, McMahon K, Gravett MG. Correlation between spontaneous preterm birth and mid-trimester maternal serum estriol. J Matern Neonatal Med 2014; 27 (4): 376–80.
  • 52. Soghra K, Zohreh S, Kobra AK, Reza MM. Single measurement of salivary estriol as a predictor of preterm birth. Pak J Biol Sci 2014; 17 (5): 730–4.
  • 53. Huang SY, Wang YC, Yin WC, et al. Is maternal serum inhibin A a good predictor in preterm labor? - Experience from a community hospital in Taiwan. Biomed J 2020; 43 (2): 183–8.
  • 54. Hiersch L, Pasternak Y, Melamed N, et al. The risk of preterm birth in women with three consecutive deliveries—the effect of number and type of prior preterm births. J Clin Med 2020; 9 (12): 3933. doi:10.3390/jcm9123933.
  • 55. Magro Malosso ER, Saccone G, Simonetti B, Squillante M, Berghella V. US trends in abortion and preterm birth. J Matern Neonatal Med 2018; 31 (18): 2463–7.
  • 56. Kvalvik LG, Wilcox AJ, Skjærven R, Østbye T, Harmon QE. Term complications and subsequent risk of preterm birth: registry based study. BMJ 2020; 369 (8243): m1007. doi:10.1136/bmj.m1007.
  • 57. Williams CM, Asaolu I, Chavan NR, et al. Previous cesarean delivery associated with subsequent preterm birth in the United States. Eur J Obstet Gynecol Reprod Biol 2018; 229: 88–93.
  • 58. Ananth CV, Kaminsky L, Getahun D, Kirby RS, Vintzileos AM. Recurrence of fetal growth restriction in singleton and twin gestations. J Matern Neonatal Med 2009; 22 (8): 654–61.
  • 59. Chhabra S, Dargan R, Bawaskar R. Oligohydramnios: a potential marker for serious obstetric complications. J Obstet Gynaecol 2007; 27 (7): 680–3.
  • 60. Getahun D, Fassett MJ, Jacobsen SJ. Gestational diabetes: risk of recurrence in subsequent pregnancies. Am J Obstet Gynecol 2010; 203 (5): 467.e1–6.
  • 61. Cohen S, Kamarck T, Mermelstein R. A global measure of perceived stress. J Health Soc Behav 1983; 24 (4): 385–96.
  • 62. Beck AT, Epstein N, Brown G, Steer RA. An inventory for measuring clinical anxiety: psychometric properties. J Consult Clin Psychol 1988; 56 (6): 893–7.
  • 63. Beck AT, Ward CH, Mendelson M, Mock J, Erbaugh J. An inventory for measuring depression. Arch Gen Psychiatry 1961; 4 (6): 561–71.
  • 64. Weile LKK, Hegaard HK, Wu C, et al. Alcohol intake in early pregnancy and spontaneous preterm birth: a cohort study. Alcohol Clin Exp Res 2020; 44 (2): 511–21.
  • 65. Wang YY, Li Q, Guo Y, et al. Association of long-term exposure to airborne particulate matter of 1 μm or less with preterm birth in China. JAMA Pediatr 2018; 172 (3): e174872.
  • 66. Cox DR. Regression models and life-tables. J R Stat Soc Ser B 1972; 34 (2): 187–202.
  • 67. Schoenfeld D. Partial residuals for the proportional hazards regression model. Biometrika 1982; 69: 239–41.
  • 68. Davidson-Pilon C, Kalderstam J, Jacobson N, et al. CamDavidsonPilon/lifelines: 0.25.10. 2021. https://zenodo.org/record/4579431#.YTg70o5KhPZ. Accessed September 8, 2021.
  • 69. Therneau T, Grambsch P. Modeling Survival Data: Extending the Cox Model. Berlin, Germany: Springer; 2000.
  • 70. Therneau T. A package for survival analysis in R. https://cran.r-project.org/web/packages/survival/index.html. Accessed September 6, 2021.
  • 71. Gao C, Osmundson S, Velez Edwards DR, Jackson GP, Malin BA, Chen Y. Deep learning predicts extreme preterm birth from electronic health records. J Biomed Inform 2019; 100: 103334.
  • 72. Roe KD, Jawa V, Zhang X, et al. Feature engineering with clinical expert knowledge: a case study assessment of machine learning model complexity and performance. PLoS One 2020; 15 (4): e0231300.
  • 73. Ion R, Bernal AL. Smoking and preterm birth. Reprod Sci 2015; 22 (8): 918–26.
  • 74. Gete DG, Waller M, Mishra GD. Effects of maternal diets on preterm birth and low birth weight: a systematic review. Br J Nutr 2020; 123 (4): 446–61.
  • 75. Okun ML, Schetter CD, Glynn LM. Poor sleep quality is associated with preterm birth. Sleep 2011; 34 (11): 1493–8.
  • 76. Zhang G, Feenstra B, Bacelis J, et al. Genetic associations with gestational duration and spontaneous preterm birth. N Engl J Med 2017; 377 (12): 1156–67.
  • 77. Zhang G, Srivastava A, Bacelis J, Juodakis J, Jacobsson B, Muglia LJ. Genetic studies of gestational duration and preterm birth. Best Pract Res Clin Obstet Gynaecol 2018; 52: 33–47.
  • 78. Wadon M, Modi N, Wong HS, Thapar A, O’Donovan MC. Recent advances in the genetics of preterm birth. Ann Hum Genet 2020; 84 (3): 205–13.
  • 79. Stensrud MJ, Hernán MA. Why test for proportional hazards? JAMA 2020; 323 (14): 1401–2.

Data Availability Statement

Data underlying the study were provided by a third party, the Data and Specimen Hub (DASH) maintained by the National Institutes of Health Eunice Kennedy Shriver National Institute of Child Health and Human Development. Data and biospecimens are available upon request from https://dash.nichd.nih.gov/study/13. This study uses data readily available for download from DASH.


Articles from Journal of the American Medical Informatics Association : JAMIA are provided here courtesy of Oxford University Press
