Predicting Neonatal Encephalopathy From Maternal Data in Electronic Medical Records

Thomas Li; Cheng Gao; Chao Yan; Sarah Osmundson; Bradley A Malin; You Chen

2018 May 18;2018:359–368.

PMCID: PMC5961831 PMID: 29888094

Abstract

Neonatal encephalopathy (NE) is a leading cause of neonatal mortality and lifetime neurological disability. The earlier the risk of NE can be assessed, the more effective interventions can be in preventing adverse outcomes. Existing studies that focus on intrapartum risk factors do not provide the early prognostic forecasting necessary to prepare healthcare professionals to intervene early in a high-risk NE case. This work used maternal data in a supervised machine learning framework to predict NE events. Specifically, we 1) collected the electronic medical records (EMRs) for 104 NE newborns and 31,054 non-NE newborns and their mothers, 2) trained and tested a regularized logistic regression on imbalanced and high-dimensional EMR data, and 3) discerned important features that could be possible risk factors. The learned model offers prenatal predictions of NE cases with an average area under the receiver operating characteristic curve (AUC) of 87% and identifies the most important predictors.

1. Introduction

Neonatal encephalopathy (NE) is a leading cause of infant mortality and long-term neurological morbidity^1,2. The American College of Obstetricians and Gynecologists (ACOG) defines it as a syndrome of disturbed neurological function in the earliest days of life in an infant born at or beyond 35 weeks of gestation, manifested by a subnormal level of consciousness or seizures, and often accompanied by difficulty with initiating and maintaining respiration and depression of tone and reflexes¹. Historically, the cause of NE was assumed to be asphyxial in nature, and the term hypoxic-ischemic encephalopathy was widely used² until the broader term neonatal encephalopathy was recently standardized by the ACOG. The change in terminology reflects a paradigm shift and a recognition that the syndrome has multiple causal pathways. It was found that 70% of NE cases were most likely attributable to antepartum risk factors such as abnormal fetal growth, maternal infection, and socioeconomic status^3,4. Recognizing distal risk factors such as these is important because they may exert early pathological damage and, as a consequence, predispose the fetus to NE. If this is the case, then early, preventative interventions can minimize the effects of distal risk factors and decrease the risk of NE¹.

Traditionally, approaches based on the knowledge of clinical experts have been relied upon to learn risk factors associated with NE. Domain expertise is leveraged to determine a set of candidate variables, after which statistical models are applied to ascertain which variables demonstrate an association with NE^3–10. Most notably, a Western Australian study found high birth weight to be highly correlated with NE through a univariate analysis³. Another study considered over 50 antepartum and intrapartum features and constructed a high-performance logistic regression model using eight features that were identified as significantly associated with NE via univariate chi-squared tests¹⁰. There are, however, several limitations to these previous approaches. First, this type of investigation restricts the study to a small number of variables, making it difficult to identify novel risk factors. Second, since these approaches rely on expert knowledge, the variables investigated may be subject to selection bias, such that potential predictor variables could be neglected. Third, previous studies focused on assessing the risk of NE only after birth, using variables that are often inaccessible before delivery, such as birth weight and Apgar scores.

To the best of our knowledge, there has not been an investigation into the extent to which NE can be predicted ahead of birth. If prenatal predictions could be made, clinicians could be afforded an opportunity to improve the quality of NE care or prevent NE from developing. For instance, it has been shown that planned or elective Cesarean delivery is associated with a reduced incidence of infants suffering from NE^12–14. To this end, we aimed to leverage antenatal maternal variables to model the risk of NE before birth. Maternal features have been extensively studied^3,5,7,15,16. However, maternal features were often treated as discrete variables and mixed in with intrapartum and neonatal features to perform NE risk assessment. In the context of NE, there has not been a targeted effort to broadly investigate the maternal profile as a whole, perhaps because expert-driven approaches cannot investigate a large number of features and, consequently, cannot capture the full diversity of the maternal condition during pregnancy.

Our investigation was motivated by the need for a data-driven approach that models the risk of NE from a broad set of maternal variables. Our goal was to study the predictor variables that can present early in the NE causal pathway while mitigating variable selection bias. To this end, we present a retrospective study of 31,158 mother-infant pairs from over a decade of maternal electronic medical record (EMR) data at Vanderbilt University Medical Center (VUMC)¹⁷. The maternal variables were defined by billing codes that represent symptoms, diagnoses, and procedures, as well as demographic information. Given the low incidence of NE, we adopted a bootstrapping strategy to handle the imbalanced data (104 NE cases vs. 31,054 NE controls). We introduce a regularized logistic regression model that ranks the importance of maternal variables in terms of their ability to distinguish NE cases from controls.

2. Materials and Methods

2.1. Materials

2.1.1. Dataset

The dataset was drawn from the EMR system at VUMC, which covers over 2.5 million patient records over a 29-year period. All data were de-identified and the study was approved by the Vanderbilt IRB. Table 1 summarizes the characteristics of the population and the balance of the dataset. We identified 31,158 links between mothers and infants. The EMRs of the infants were inspected for the purpose of case identification of NE¹. Maternal EMR information prior to delivery was collected to train and test the classification models, as well as for the subsequent analysis of risk factors. Specifically, for each delivery event, we collected age, race, and the ICD-9 and CPT codes assigned before the time of the linked delivery. As shown in Table 1, the average age of mothers at delivery was 23 years, with the youngest at age 12 and the oldest at age 50. The mothers were 66% White, 17.2% Black, and 16.8% other races. There were 7,860 unique ICD-9 codes and 6,095 unique CPT procedural codes observed in the mothers’ EMRs prior to delivery.

Table 1:

Summary of the VUMC Maternal Dataset. (W = White; B = Black; O = Other)

#Mother-Infant Pairs	Mother Age	Mother Race	#NE Cases	#Controls	#Unique ICD-9	#Unique CPT
31,158	23 [12,50]	W: 66%, B: 17.2%, O: 16.8%	104	31,054	7,860	6,095

2.1.2. Identification of Cases and Controls

In this study, a case corresponds to a mother who had at least one infant who was diagnosed with NE. We identified cases through a two-step process. First, we automatically identified NE candidates from the clinical notes of the delivered infants of the mothers by matching keywords in the natural language. Specifically, we conducted a free text search using regular expressions built on a set of keywords: {HIE, hypoxic-ischemic encephalopathy, hypoxemia, neonatal encephalopathy, asphyxia}. In doing so, we extracted 141 NE candidates. Since matching a keyword does not guarantee the actual incidence of the disease, in the second step, we manually reviewed each of the clinical notes of NE candidates, and with the confirmation of a clinically-knowledgeable expert, we labeled 104 of the candidates as NE and their mothers as cases. One of the primary reasons that the 37 candidates were excluded is that their clinical notes had some expression of a negation of the matched term (e.g., the infant did not suffer from HIE).
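The first, automated step can be illustrated with a minimal sketch. The keyword list is the one given above; the negation cues, the sentence splitting, and the function name `find_candidates` are illustrative assumptions rather than the study's actual implementation, and in the study negated mentions were excluded by manual review, not by a rule like this.

```python
import re

# Keyword list from the text.
KEYWORDS = [
    "HIE",
    "hypoxic-ischemic encephalopathy",
    "hypoxemia",
    "neonatal encephalopathy",
    "asphyxia",
]
KEYWORD_RE = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)
# Hypothetical negation cues, used only to flag notes for closer manual review.
NEGATION_RE = re.compile(
    r"\b(no|not|without|denies|negative for|ruled out)\b", re.IGNORECASE
)


def find_candidates(notes):
    """Map infant id -> list of (matched keyword, possibly_negated) pairs.

    `notes` maps an infant identifier to the free text of a clinical note.
    Only infants with at least one keyword match are returned.
    """
    candidates = {}
    for infant_id, text in notes.items():
        hits = []
        # Naive sentence split so a negation cue only applies within a sentence.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            for match in KEYWORD_RE.finditer(sentence):
                # Flag the hit if a negation cue precedes it in the same sentence.
                negated = bool(NEGATION_RE.search(sentence[: match.start()]))
                hits.append((match.group(1), negated))
        if hits:
            candidates[infant_id] = hits
    return candidates
```

A flagged hit is only a hint for the reviewer; the final case label still comes from the manual second step.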

We defined a control as a mother who did not deliver any infants diagnosed with NE during a birthing event. As such, controls include mothers whose infants did not match on any term in the keyword list, as well as the mothers of infants who were considered NE candidates in the first step of case identification but were filtered out by manual review in the second step. In the end, the number of controls was 31,054. Note that case identification is based on each delivery event, such that a mother who gave birth to both an NE infant and one or more non-NE infants could be labeled as both a case and a control (though this would constitute two separate instances). However, we did not observe any mothers who would be labeled as both a case and a control.

2.2. Methods

We designed a data-driven framework that consists of 5 core modules: 1) feature construction to characterize each case/control, 2) cohort establishment to handle the imbalanced data, 3) predictive model training, 4) performance evaluation to assess predictive capability, and 5) feature analysis to determine which features were most associated with NE. The 5 modules are depicted in Figure 1. Details for each module are reported in the following subsections.

Figure 1: A framework for NE prediction and risk factor analysis based on EMR data.

2.2.1. Feature Construction

A feature space from the records of mothers was constructed. For each mother in the cases/controls, we extracted age, race, and the ICD-9 and CPT codes that were documented in the EMR within one year prior to delivery. The billing codes were limited to one year prior to delivery in an attempt to control for variability of observational window duration between cases and controls. This decision is explained in more depth in the discussion section. To reduce the dimensionality of the feature space and mitigate redundancy, we generalized ICD-9 codes by rolling them up the hierarchy as follows. First, all codes that pertain to pregnancy, childbirth, or conditions originating from the perinatal period (e.g., 630.* to 679.*, 760.* to 779.*, and 23.* to 23.9*) were rolled up to one character after their decimal. Second, all other codes were rolled up to the set of characters before the decimal point (e.g., 320.1 and 320.82 both generalized to 320).
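The two roll-up rules can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, only the two numeric ranges (630 to 679 and 760 to 779) are hard-coded, and the third range mentioned above is left to a caller-supplied set of prefixes (`extra_prefixes`, an assumed parameter) because its exact prefix is not clear from the text.

```python
def is_pregnancy_or_perinatal(prefix, extra_prefixes=frozenset()):
    """True if the part of the code before the decimal falls in a detailed range."""
    if prefix.isdigit():
        number = int(prefix)
        return 630 <= number <= 679 or 760 <= number <= 779
    # Non-numeric prefixes are only kept detailed if the caller lists them.
    return prefix in extra_prefixes


def roll_up_icd9(code, extra_prefixes=frozenset()):
    """Generalize an ICD-9 code following the two rules described in the text."""
    code = code.strip().upper()
    prefix, _, suffix = code.partition(".")
    if is_pregnancy_or_perinatal(prefix, extra_prefixes) and suffix:
        # Rule 1: keep one character after the decimal point.
        return prefix + "." + suffix[0]
    # Rule 2: keep only the characters before the decimal point.
    return prefix


# Examples: roll_up_icd9("660.01") gives "660.0"; roll_up_icd9("320.82") gives "320".
```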

As summarized in Table 2, all of the ICD-9 and CPT codes were represented as binary features, indicating the presence or absence of such codes in the EMR. Age and race were represented as ordinal and nominal features respectively. We borrowed the age categories from previous clinical studies on NE^3,5. Since only a small number of mothers fell into race categories other than “black” or “white”, we grouped them into the “other” category.

Table 2:

Summary of Features

Feature Type	ICD-9 codes	CPT codes	Age	Race
Value	{0,1}	{0,1}	1: [20, 29], 2: [30, 34], 3: greater than 34	1: White, 2: Black, 3: Other

2.2.2. Cohort Construction

As mentioned earlier, our dataset is highly imbalanced in terms of cases and controls, which is a concern for several reasons. First, it is widely accepted that learning from an imbalanced dataset leads to models that lack stability and exhibit poor generalization performance¹⁸. Second, there is a strong possibility that training in this setting will yield a model that simply predicts each newborn as a control (i.e., no NE). The model would achieve both a high accuracy and high false negative rate (i.e., predicting cases as controls) and would, thus, lack clinical utility. To address the class imbalance problem, we adopted an undersampling technique to construct a set of controls that contain the same number of instances as the cases.

Given that a single sample, although constructed randomly, may yield a biased prediction model, we conducted sampling in a bootstrapped manner, such that we composed k sample sets to evaluate the variance and confidence interval of the learned statistics. Specifically, we sampled 300 sets of 104 controls, each of which was analyzed with respect to the same set of 104 cases. For each cohort, features that appeared in less than 1% of the instances in a single cohort were eliminated from that cohort’s feature space to reduce sparsity and streamline data transfer.
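A minimal sketch of this cohort construction is shown below. It assumes that controls are drawn without replacement within a cohort and independently across cohorts, and that each instance is represented as a set of feature names; neither detail is specified in the text, and the function names are hypothetical.

```python
import random


def build_cohorts(case_ids, control_ids, n_cohorts=300, seed=0):
    """Pair the fixed case set with n_cohorts random control samples of equal size."""
    rng = random.Random(seed)
    control_ids = list(control_ids)
    cohorts = []
    for _ in range(n_cohorts):
        # Undersample controls to the number of cases (no repeats within a cohort).
        sampled_controls = rng.sample(control_ids, len(case_ids))
        cohorts.append((list(case_ids), sampled_controls))
    return cohorts


def drop_sparse_features(rows, min_fraction=0.01):
    """Remove features present in fewer than min_fraction of a cohort's instances.

    `rows` is a list with one set of present feature names per instance.
    """
    counts = {}
    for features in rows:
        for name in features:
            counts[name] = counts.get(name, 0) + 1
    threshold = min_fraction * len(rows)
    keep = {name for name, count in counts.items() if count >= threshold}
    return [features & keep for features in rows]
```

With 104 cases and 104 controls, a cohort has 208 instances, so under this reading a feature must appear in at least 3 instances to be retained.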

2.2.3. Predictive Models

To train the models and evaluate their performance on our 300 cohorts, we performed stratified sampling within each cohort: 75% of the cohort was used for training with 10-fold cross-validation, and the remaining 25% served as a test dataset. This sampling strategy ensures that the models are both trained and evaluated on the rebalanced data.
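As an illustration of the stratified 75%/25% split, the sketch below splits each class separately so that both parts keep the 1:1 case-to-control ratio. The function name and the use of index lists are assumptions made for the example.

```python
import random


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Return (train_indices, test_indices), splitting each class separately."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)
```

For a cohort of 104 cases and 104 controls, this yields 156 training instances (78 per class) and 52 test instances (26 per class).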

During the training phase, we trained a regularized logistic regression model with a mixture of the l₁ and l₂ penalties, also known as the elastic net penalty, that identified important features. This method has been shown to perform better on various classification problems than either the l₁ or the l₂ penalty alone on input spaces where features greatly outnumber samples. Regularization is designed to limit error due to variance, and thus improve the generalizability of a trained model¹⁹. This is a highly desirable quality for our use case of training many models on many cohorts from a single population, because all the models should perform consistently well and be robust to the noise in the specific cohort they were trained on.
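For reference, one common way to write the elastic net objective for logistic regression is the parameterization used in the coordinate descent work cited as reference 20. The symbols here are assumptions introduced for illustration: $N$ training instances $(x_j, y_j)$ with $y_j \in \{0, 1\}$, intercept $\beta_0$, coefficient vector $\beta$, overall penalty strength $\lambda$, and mixing parameter $\alpha$. The values of $\lambda$ and $\alpha$ used in this study are not reported here.

```latex
$$
\min_{\beta_0,\, \beta} \;
-\frac{1}{N} \sum_{j=1}^{N}
\left[ y_j \left( \beta_0 + x_j^{\top} \beta \right)
- \log\left( 1 + e^{\beta_0 + x_j^{\top} \beta} \right) \right]
+ \lambda \left[ \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2
+ \alpha \lVert \beta \rVert_1 \right]
$$
```

Setting $\alpha = 1$ recovers the pure l₁ penalty and $\alpha = 0$ the pure l₂ penalty; intermediate values give the mixture described above.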

Regularized logistic regression is able to perform automatic feature selection^19,20 by ranking features and selecting a subset of heavily weighted features. In doing so, it eliminates or reduces the impact of less important features by setting their coefficients to zero or close to zero. The coefficients corresponding to features, i.e. the parameters of the model, indicate the importance of features. Based on this metric, feature selection is conducted.

After executing the training and validation module, we generated 300 predictive models with different coefficients, which we represent as {f₁, f₂,…, f_m} in Figure 1.

2.2.4. Performance Evaluation and Feature Ranking

To evaluate the generalization performance of each of the learned predictive models, we applied each model to its corresponding test dataset (i.e., the remaining 25% of samples in each cohort). We report the distribution of the area under the receiver operating characteristic curve (AUC) measure. Moreover, we recorded precision, sensitivity, and specificity for each model and computed the mean and the 95% confidence interval of the AUC, precision, sensitivity, and specificity values across all 300 models.
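The per-model metrics can be computed as in the following sketch, which uses the standard definitions (1 = NE, 0 = non-NE). The AUC is computed with the rank-based pairwise formulation; the function names and the 0.5 decision threshold used in the usage note are assumptions for illustration, since the text does not state how predicted labels were derived from predicted probabilities.

```python
def classification_metrics(y_true, y_pred):
    """Precision, sensitivity, and specificity from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


def auc_score(y_true, scores):
    """AUC as the probability that a random case outscores a random control."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Ties between a case and a control count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, predicted labels could be obtained as `[1 if s >= 0.5 else 0 for s in scores]` before calling `classification_metrics`.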

As mentioned above, regularization distinguishes important features² from among a large feature space in each classifier. These features are either positive or negative indicators for the presence of NE. To a certain degree, the learned coefficients reflect the relative contribution of the corresponding features to the NE prediction. If 0 represents the non-NE class and 1 represents the NE class, then the greater the weight of a certain feature, the more its presence is associated with NE. In contrast, the more negative a weight is, the more its presence is associated with a non-NE outcome. If a feature is eliminated from a model, then its corresponding coefficient will be zero.

To determine the predictive performance of a feature for NE, we computed the proportion that a feature’s weight contributes to the sum of all weights in the model. These proportions were then averaged across all models that utilized that feature, or, in other words, all models that retained the feature with a non-zero coefficient. We refer to this value as the feature importance, which is formalized in Equation 1, where the importance of feature x_p is computed based on the learned $m \times n$ weights $\{w_{k}^{i}\}$ from m classifiers and n features. We also designed another metric, average frequency, to measure the contribution of each feature to NE classification. Average frequency for feature x_p is calculated as the fraction of models having a non-zero coefficient for feature x_p out of all of the m models. This metric is formalized as Equation 2, where the function I(γ) equals 1 if condition γ is satisfied, and 0 otherwise.

$$
\mathrm{Importance}(x_{p}) = \frac{1}{m} \sum_{i=1}^{m} \frac{w_{p}^{i}}{\sum_{k=1}^{n} w_{k}^{i}} \tag{1}
$$

$$
\mathrm{Average\_Frequency}(x_{p}) = \frac{1}{m} \sum_{i=1}^{m} I\left(w_{p}^{i} \neq 0\right) \tag{2}
$$

After computing the importance and the average frequency of each feature, we were able to rank features accordingly. Features above a certain rank are regarded as potentially important features, which may serve as early evidence for either the presence or absence of NE.
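The two ranking metrics can be sketched in a few lines. This follows one reading of Equations 1 and 2: for each model, the weight of feature p is divided by the sum of that model's weights, and the result is averaged over all m models, so a model that eliminated the feature contributes zero. Skipping models whose weights sum to zero is an assumption added here to avoid division by zero, and the function names are hypothetical.

```python
def feature_importance(weights, p):
    """Equation 1: mean over models of w_p divided by the sum of that model's weights.

    `weights` is a list of m coefficient lists, each of length n.
    """
    total = 0.0
    for model_weights in weights:
        denominator = sum(model_weights)
        if denominator != 0:  # assumption: models with a zero weight sum add nothing
            total += model_weights[p] / denominator
    return total / len(weights)


def average_frequency(weights, p):
    """Equation 2: fraction of models with a non-zero coefficient for feature p."""
    return sum(1 for model_weights in weights if model_weights[p] != 0) / len(weights)
```

Features can then be ranked by sorting on either value, for example to apply the thresholds on importance and average frequency used in the Results.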

3. Results

An average of 796.6 sparse features (occurring in less than 1% of instances within a cohort) were eliminated from each cohort. Table 3 shows that after the feature and cohort construction steps, cohorts had 213.8 ICD-9 codes and 336.8 CPT codes on average, constituting 552.6 features on average including Age and Race. Models had an average of 45.06 features with non-zero coefficients. In other words, an average of 507.54 features were eliminated during model training. In the subsequent sections, we report 1) the balanced performance of the 300 predictive models on NE prediction and 2) the importance of the features.

Table 3:

Summary of Cohort Features

	# Sparse Features Eliminated	# ICD-9 codes	# CPT codes	# Selected Features
Mean	796.6	213.8	336.8	45.06
95% Confidence Interval	[782.1, 811.0]	[212.7, 214.9]	[335.1, 338.5]	[41.34, 48.78]

3.1. The balanced AUC, precision, sensitivity and specificity

For each of the 300 predictive models, we recorded the AUC score, precision, sensitivity, and specificity. Summary statistics for these measures are reported in Table 4. It can be seen that 89% of the models achieved an AUC larger than 0.8. The mean of the AUC scores is 0.869 and its distribution is shown in Figure 2.

Table 5:

The number of positive cases with an assigned code from CPT group 829.

CPT code	Cases with Code	Description
82947	16	Glucose; quantitative, blood (except reagent strip)
82950	69	Glucose; post glucose dose (includes glucose)
82951	14	Glucose; tolerance test, 3 specimens (includes glucose)
82952	15	Glucose; tolerance test, each additional beyond 3 specimens
82962	22	Blood glucose by glucose monitoring devices cleared by the FDA for home use
82977	4	Gamma Glutamyl Transferase

Figure 2: Histogram of AUC scores of the 300 models. The red curve is the normal distribution generated from the observed mean and standard deviation.

A Kolmogorov-Smirnov test²¹ failed to reject the null hypothesis that the sample distribution of AUC scores follows a normal distribution at the 0.05 significance level. The 95% confidence interval for the mean of AUC scores was 0.8625 to 0.8745. The narrow confidence interval suggests that the proposed framework can support NE prediction models with both stable and high predictive performance.

Our models also achieved an average precision of 0.8080 (95% confidence interval: 0.8000 to 0.8159), an average sensitivity of 0.8079 (0.7982 to 0.8177), and an average specificity of 0.8104 (0.8014 to 0.8194).
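The text does not state how the confidence intervals were computed. One plausible approach, sketched below under that caveat, is a normal-approximation interval for the mean of a metric across the 300 models, which is in keeping with the normality result of the Kolmogorov-Smirnov test. The function name and the critical value 1.96 are assumptions of this sketch.

```python
import math
import statistics


def mean_ci95(values):
    """Mean and normal-approximation 95% confidence interval for the mean."""
    mean = statistics.fmean(values)
    # Standard error of the mean from the sample standard deviation.
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    half_width = 1.96 * standard_error
    return mean, (mean - half_width, mean + half_width)
```

Under this approach, the reported AUC interval half-width of 0.006 with 300 models would correspond to a standard deviation of roughly 0.05 across models.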

3.2. Feature Importance

On average, each predictive model retained a small number of important features (~50); over 90% of features were eliminated. To orient the reader, we only report on features that, on average, have an importance value greater than or equal to 0.05 and an average frequency value greater than or equal to 0.2. These two criteria yielded 15 features, shown in Figure 3, for further investigation. Intuitively, features such as ICD-9 code 660.0 (obstructed labor by malposition of fetus at onset of labor) and 660.6 (failed trial of labor) make sense as positive predictors and are consistent with the finding that acute intrapartum events can lead to NE¹. Note that the features that appear to represent observations after a birth, such as the negative predictors ICD-9: V24 and CPT: 99238, are most likely from a previous delivery. These features that occur after a mother’s previous deliveries are still considered prenatal features with respect to her subsequent deliveries.

Figure 3: The 15 features with average frequency ≥ 0.2 and average importance ≥ 0.05.

Age range and race category were included in the model to control for variability of age and race between cases and controls. The importance and frequency of these potential confounding variables, shown in Figure 4, are relatively minimal, indicating that the distribution of age and race is similar between cases and controls.

Figure 4: Average importance and frequency for Age Range and Race Category.

CPT code 82947 was a positive feature that exhibited a much higher average importance than any other feature and appeared in all 300 models. To verify that the cases were not dominated by this feature, we calculated the number of cases having that code and found only 16 cases with this feature. Furthermore, we investigated whether any codes related to 82947 appeared in the model. Because CPT codes for related procedures share a common prefix, we identified codes with 829 as the first three digits and calculated the number of cases having each of them, as shown in Table 5. If these codes are correlated with each other, in the context of the NE prediction objective, then all of our models will likely be dominated by the 829 procedure group. To investigate further, we measured the correlation between pairs of codes; the correlation matrix for the CPT codes observed in the 829 group is visualized in Figure 5. From the figure, it can be seen that, surprisingly, little correlation exists between the codes aside from 82951 and 82952.

Figure 5: A plot of the correlation matrix for CPT codes in the 829 group. Crosses indicate that the relationship between the corresponding pair of codes failed to be confirmed as significant. Blue and red represent positive and negative relationships, respectively. The larger the circle, the greater the correlation.
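The correlation measure and significance test behind Figure 5 are not specified in the text. As one possibility for binary presence/absence features, the sketch below computes the phi coefficient, which is the Pearson correlation for two 0/1 indicator vectors; the function name is hypothetical and the choice of measure is an assumption.

```python
import math


def phi_coefficient(x, y):
    """Pearson correlation between two binary (0/1) indicator vectors."""
    pairs = list(zip(x, y))
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)
    n10 = sum(1 for a, b in pairs if a == 1 and b == 0)
    n01 = sum(1 for a, b in pairs if a == 0 and b == 1)
    n00 = sum(1 for a, b in pairs if a == 0 and b == 0)
    denominator = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denominator == 0:
        # One of the codes is constant across patients; correlation is undefined.
        return 0.0
    return (n11 * n00 - n10 * n01) / denominator
```

Applied to every pair of indicator columns for the 829-group codes, this would yield a symmetric matrix of the kind plotted in the figure.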

4. Discussion

This work presents a prediction model of NE built around early maternal variables. Our prediction AUC of 87% on a relatively small number of features about an expectant mother is consistent with the hypothesis that cases of NE can be the result of distal risk factors, rather than acute intrapartum events. Since our model was trained with widely used medical billing codes, the same approach is easily replicable by other healthcare organizations (HCOs) that are interested in modeling their maternal population.

Out of the selected features, a form of glucose test (CPT 82947) appeared as a highly important positive predictor. We offer two possible interpretations for this result. First, glucose tests would accompany a case of pregnancy-induced diabetes, which could be a real risk factor for NE. Pregnancy-induced diabetes is linked to fetal macrosomia²², a condition of high birth weight due to increased fetal glucose intake, and high birth weight is a well-known risk factor for birth trauma and NE^1,3,22. Second, glucose tests could be ordered for maternal patients new to VUMC. These new maternal patients may have been identified as predisposed to a high-risk delivery and redirected to VUMC, the premier medical institution in the area. The fact that all cases fall within a year's length of encounter, discussed further below, supports this interpretation.

We acknowledge that this is a pilot study and has several limitations. First, the findings may be biased by the way in which high-risk cases present to the VUMC. As the leading academic medical center in the area, VUMC receives patient cases that are often more complex or require more specialized care than other institutions can handle. As a consequence, many of the high-risk patients exhibited relatively short observational windows in the VUMC EMR. Figures 6a and 6b illustrate the length of the observation history in the EMR before delivery for control and case mothers, respectively. Mothers who had babies with NE had one year or less of medical history, which accentuates the concern that positive cases may have had complications before receiving care at VUMC, introducing ascertainment bias into the data. Although we attempted to control for this confounding variable by collecting features within one year of the delivery in the controls, the models may, to some degree, be learning the distribution of diagnoses and procedures a patient receives after being classified as high risk. Second, the number of NE cases in this study was relatively small and may not capture sufficient information about mothers with NE babies to perform prediction. Third, the data were taken from a single healthcare organization, and thus the patient population may be specific to VUMC. Our approach, however, can be applied to the data from any HCO that uses standardized billing terminologies.

Figure 6: Distribution of the length of observation (in years) for mothers whose baby (a) did not have NE and (b) did have NE. Only features that fell within one year prior to delivery were incorporated in our models.

Based on our findings and the limitations of the study, we outline the following improvements and extensions for the next iteration of this study. To begin, additional maternal and prenatal features can be investigated, such as maternal lab values (in particular glucose values) and fetal ultrasound results. Medication data were previously included in the model, using the same binary feature construction scheme, but these features had low importance in the resulting model and no improvement in prediction was observed. Perhaps a non-binary feature encoding may improve the signal received from medication prescriptions. Finally, we hope to understand the billing practices and clinical workflow behind the top features in order to infer their potentially causal relationship with NE.

5. Conclusion

NE is a lethal condition that threatens millions of infants with premature death or lifelong morbidity. To date, the pathology of this complex affliction has eluded thorough understanding. Almost all existing NE studies are unable to perform a prenatal risk assessment of NE, making it challenging to adopt efficient interventions to prevent or treat NE in a timely manner. This paper introduces the first data-driven approach to predicting NE before an infant is born, and provides a mechanism to identify high-risk maternal factors for NE. The evaluation relied on a retrospective study of maternal EMR data at a large academic medical center. We considered a representation of the maternal condition centered around symptoms, diagnoses, and procedures, and trained classification models that consistently predict with high performance. However, this study is still preliminary in its scope and size. Further investigation and clinical interpretation of the reported features are required to understand the model, and replication of the study with other populations is needed to arrive at a better picture of the NE pathology. We believe this investigation opens up new and easily replicable approaches to understanding NE, and may even be a prelude to a prediction system for recommending early NE interventions through the electronic health system.

Table 4:

Performance of the NE models.

	AUC	Precision	Sensitivity	Specificity
Mean	0.8685	0.8080	0.8079	0.8104
95% Confidence Interval	[0.8625, 0.8745]	[0.8000, 0.8159]	[0.7982, 0.8177]	[0.8014, 0.8194]
K-S Test	Normal	Normal	Normal	Normal

Acknowledgements

This research was supported, in part, by grants T15LM00740, R00LM011933, and R01LM010207 of the National Library of Medicine, National Institutes of Health.

Footnotes

1. The VUMC creates an EMR for each delivered infant.

2. The threshold |ω| > 0 is used to define the importance of features.

References

1.Academy College of Obstetricians and Gynecologists. Neonatal Encephalopathy and Neurologic Outcome. Pediatrics. (Second Edition) 2014;133(5):e1482–8. [Google Scholar]
2.Kurinczuk J, White-Koning M, Badawi N. Epidemiology of neonatal encephalopathy and hypoxicischaemic encephalopathy. Early Hum Dev. 2010;86(6):329–38. doi: 10.1016/j.earlhumdev.2010.05.010. [DOI] [PubMed] [Google Scholar]
3.Badawi N, Kurinczuk J, Keogh J, Alessandri L, O’Sullivan F, Burton P, et al. Antepartum risk factors for newborn encephalopathy: the Western Australian case-control study. BMJ. 1998;317(7172):1549–53. doi: 10.1136/bmj.317.7172.1549. [DOI] [PMC free article] [PubMed] [Google Scholar]
4.Blume H, Loch C, Li C. Neonatal Encephalopathy and Socioeconomic Status. Archives of Pediatrics & Adolescent Medicine. 2007;161(7):663. doi: 10.1001/archpedi.161.7.663. [DOI] [PubMed] [Google Scholar]
5.Bahubali G, Vishnu B, Ramachandra R, Nandakumar S, Adhisivam B, Rojo J, et al. Antenatal and intrapartum risk factors for perinatal asphyxia: A case control study. Curr Pediatr Res. 2017;17(2):119–22. [Google Scholar]
6.Kaye D. Antenatal and intrapartum risk factors for birth asphyxia among emergency obstetric referrals in Mulago Hospital, Kampala, Uganda. East Afr Med J. 2004;80(3):140–3. doi: 10.4314/eamj.v80i3.8683. [DOI] [PubMed] [Google Scholar]
7.Milulescu A, Vladareanu S, Filipescu A, Mureanu N, Vladareanu R. Maternal risk factors for the neonatal hypoxic-ischemic encephalopathy. Preliminary results. Romanian Society of Ultrasonography in Obstetrics and Gynecology. 2017;12:135–8. References. [Google Scholar]
8. Jones R, Heep A, Odd D. Biochemical and clinical predictors of hypoxic-ischemic encephalopathy after perinatal asphyxia. The Journal of Maternal-Fetal & Neonatal Medicine. 2017:1–6. doi: 10.1080/14767058.2017.1297790.
9. Lundgren C, Brudin L, Wanby A, Blomberg M. Ante- and intrapartum risk factors for neonatal hypoxic ischemic encephalopathy. The Journal of Maternal-Fetal & Neonatal Medicine. 2017:1–7. doi: 10.1080/14767058.2017.1321628.
10. Martinez-Biarge M, Diez-Sebastian J, Wusthoff C, Mercuri E, Cowan F. Antepartum and Intrapartum Factors Preceding Neonatal Hypoxic-Ischemic Encephalopathy. Pediatrics. 2013;132(4):e952–e959. doi: 10.1542/peds.2013-0511.
11. Landon M, Hauth J, Leveno K, Spong C, Leindecker S, Varner M, et al. Maternal and perinatal outcomes associated with a trial of labor after prior Cesarean delivery. New Engl J Med. 2004;351(25):2581–9. doi: 10.1056/NEJMoa040405.
12. Spong C, Landon M, Gilbert S, Rouse D, Leveno K, Varner M, et al. Risk of Uterine Rupture and Adverse Perinatal Outcome at Term After Cesarean Delivery. Obstet Gynecol. 2007;110(4):801–7. doi: 10.1097/01.AOG.0000284622.71222.b2.
13. Lilford R, Groot H, Moore P, Bingham P. The relative risks of caesarean section (intrapartum and elective) and vaginal delivery: a detailed analysis to exclude the effects of medical disorders and br J Obstet Gynaecol. 1990;97(10):883–92. doi: 10.1111/j.1471-0528.1990.tb02442.x.
14. Yang X, Sun S. Comparison of maternal and fetal complications in elective and emergency cesarean section: a systematic review and meta-analysis. Arch Gynecol Obstet. 2017;296(3):503–12. doi: 10.1007/s00404-017-4445-2.
15. Cooke R. Chorioamnionitis, maternal fever and neonatal encephalopathy. Developmental Medicine & Child Neurology. 2008;50(1):9–9. doi: 10.1111/j.1469-8749.2007.00009.x.
16. Shalak L, Perlman J, Jackson G, Laptook A. Depression at Birth in Term Infants Exposed to Maternal Chorioamnionitis: Does Neonatal Fever Play a Role? Journal of Perinatology. 2005;25(7):447–452. doi: 10.1038/sj.jp.7211326.
17. Giuse DA. Supporting communication in an integrated patient record system. AMIA Annu Symp Proc. 2003:1065.
18. He H, Garcia E. Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering. 2009;21(9):1263–84.
19. Zou H, Hastie T. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2005;67(2):301–320.
20. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Soft. 2010;33(1):1–22.
21. Massey FJ Jr. The Kolmogorov-Smirnov test for goodness of fit. Journal of the American Statistical Association. 1951;46(253):68–78.
22. KC K, Shakya S, Zhang H. Gestational Diabetes Mellitus and Macrosomia: A Literature Review. Annals of Nutrition and Metabolism. 2015;66(2):14–20. doi: 10.1159/000371628.

