Author manuscript; available in PMC: 2022 Nov 1.
Published in final edited form as: Comput Methods Programs Biomed. 2021 Sep 13;211:106397. doi: 10.1016/j.cmpb.2021.106397

Learning the impact of acute and chronic diseases on forecasting neonatal encephalopathy

Eugene Jeong 1, Sarah Osmundson 2, Cheng Gao 1, Digna R Velez Edwards 1,2, Bradley Malin 1,3,4, You Chen 1,4,*
PMCID: PMC8551018  NIHMSID: NIHMS1741443  PMID: 34530389

Abstract

Objective:

There is a wide range of risk factors predisposing to the onset of neonatal encephalopathy (NE), including maternal antepartum/intrapartum comorbidities or events. However, few studies have investigated the difference in the impact of acute and chronic diseases on forecasting NE, which could assist clinicians in choosing the best course of action to prevent NE or reduce its severity and complications. In this study, we aimed to engineer features based on acute and chronic diseases and assess the differences of the impact of acute and chronic diseases on NE prediction using machine learning models.

Materials and methods:

We used ten years of electronic health records of mothers from a large academic medical center to develop three types of features: chronic disease, recurrence of an acute disease, and temporal relationships between acute diseases. Two types of NE prediction models, based on acute and chronic diseases, respectively, were trained with feature selection. We further compared the prediction performance of the models with two state-of-the-art NE forecasting models. The machine learning models ranked the three types of engineered features based on their contributions to the NE prediction.

Results:

The NE model trained on acute disease features showed significantly higher AUC than the model relying on chronic disease features (AUC difference: 0.161, p-value < 0.001). The NE model trained on both acute and chronic disease features achieved the highest average AUC (0.889), with a significant improvement over the best existing model (0.854) with p = 0.0129. Recurrence of “known or suspected fetal abnormality affecting management of mother (655)” was assigned the highest weights in predicting NE.

Conclusions:

Machine learning models based on the three types of engineered features significantly improve NE prediction. Our results specifically suggest that acute disease-associated features play a more important role in predicting NE.

Keywords: Machine Learning, Neonatal Encephalopathy, Electronic Health Records, Maternal Medical History

1. Introduction

Neonatal encephalopathy (NE) is characterized by disordered neurologic function in the first few days of life in an infant born at or beyond 35 weeks of gestation, manifested by difficulty breathing, seizures, low muscle tone and poor reflexes, and reduced level of consciousness [1]. NE is a major cause of neonatal mortality and short- and long-term disabilities [2–4]. Infants who experience NE may require significant medical care over the course of their life [5]. In one study from the United States, the median cost for an NE survivor requiring therapeutic hypothermia was $58,552 (IQR $32,476 - $130,203) for the neonatal intensive care unit stay alone [6]. However, the indirect costs to the child, family, social services, and education systems are many times greater [2, 7–10]. Forecasting and preventing NE can reduce mortality, disability, and associated costs.

Investigations into forecasting NE have focused on applying statistical models to identify risk factors from maternal medical history (e.g., maternal myasthenia gravis, myopathies, or diabetes), pregnancy or labor history (e.g., maternal hypertension or pre-eclampsia), or neonatal examination (e.g., head circumference abnormalities or dysmorphic features) [11–15]. These studies have provided deeper insight into NE; however, they identified a limited number of NE risk factors. More recent studies have applied machine learning algorithms to a wide range of risk factors in electronic health record (EHR) data to predict NE [11, 13, 15–17]. These algorithms leveraged demographics (e.g., age and race), as well as maternal medical history represented by diagnosis and procedure codes. Natural language processing and deep learning models were applied to EHRs to improve NE prediction performance. The machine learning-based approaches incorporated a large number of factors into the NE prediction models but did not investigate the contextual information for each factor. As a result, both the NE models and the investigated factors were hard to interpret. For instance, these investigations did not quantify the impact of acute and chronic diseases on the NE forecast, which could assist healthcare organizations (HCOs) and clinicians in prioritizing the management of chronic and acute diseases to prevent NE or reduce the severity of its complications [18, 19].

This study aims to assess differences in the impact of acute and chronic diseases on predicting NE via feature engineering and machine learning. Specifically, we develop three types of features based on EHRs: 1) chronic illness, 2) recurrence of acute diseases, and 3) the transition from one acute disease to another. We then build NE prediction models on these features and further compare our NE prediction models with the state-of-the-art models to demonstrate the effectiveness of the engineered three types of features.

2. Materials and methods

The workflow of the NE prediction framework is shown in Figure 1. First, we used inclusion and exclusion criteria to create a case cohort (i.e., mothers who gave birth to infants with NE) from EHRs. Next, propensity score matching was used to identify controls (mothers who did not give birth to infants with NE) matched on age, race, singleton status, mode of delivery, and delivery year. We then developed feature engineering approaches to create three types of features and applied feature selection to reduce the feature space to a manageable size.

Figure 1.

The overall framework of Neonatal Encephalopathy prediction models

Next, we train three types of NE prediction models: 1) Acute Disease Model (ADM), 2) Chronic Disease Model (CDM), and 3) Acute and Chronic Disease Model (ACDM), and identify features that are assigned with a higher weight in the NE prediction. Finally, we compare the three types of models with the two state-of-the-art NE prediction models.

2.1. Cohort construction

We collected all EHR data for women who gave birth, along with data on their newborns, from January 2007 to May 2017, at Vanderbilt University Medical Center (VUMC). The diagnostic codes in the EHR were documented as either ICD-9 or ICD-10 (after 2015) codes. Given that ICD codes were designed to support insurance billing and not clinical research, we converted the billing codes into phenome-wide association study (PheWAS) codes, or Phecodes. The Phecode system was built upon ICD-9 and ICD-10, and it consists of 1866 hierarchical groups of related billing codes that describe a similar clinical syndrome or medical problem [20, 21]. ICD codes were mapped to Phecodes using the Phecode Map (available at https://phewascatalog.org/phecodes). The Institutional Review Board at Vanderbilt University Medical Center approved this study.
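The ICD-to-Phecode conversion can be pictured as a table lookup. The sketch below is only illustrative: it uses the handful of ICD-9 category-to-Phecode correspondences quoted elsewhere in this paper, and it matches on the three-digit ICD-9 category, which is a simplification assumed here (the published Phecode Map is defined on full codes). The names `ICD9_CATEGORY_TO_PHECODE` and `icd9_to_phecode` are hypothetical.

```python
# Partial lookup built only from ICD-9 ranges quoted in this paper.
# Assumption: matching on the 3-digit ICD-9 category is enough for illustration.
ICD9_CATEGORY_TO_PHECODE = {
    "652": "652", "653": "652", "660": "652",                # malposition / obstruction
    "655": "655", "656": "655", "659": "655", "678": "655",  # fetal abnormality
    "664": "665", "665": "665", "767": "665",                # obstetrical/birth trauma
    "784": "293",                                            # symptoms involving head and neck
    "789": "785",                                            # abdominal pain
}


def icd9_to_phecode(icd9_code):
    """Return the Phecode for an ICD-9 code, or None if it is not in the partial table."""
    category = icd9_code.strip().split(".")[0]
    return ICD9_CATEGORY_TO_PHECODE.get(category)
```

For example, `icd9_to_phecode("656.3")` falls in the 655–656 range and is therefore grouped under Phecode 655.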

Women who delivered an NE infant were designated as cases, and those who did not have NE newborns were controls. If a case had delivered two or more NE infants, we predicted only the first NE delivery, since prediction of a later NE delivery might be affected by the earlier one. To identify mothers who gave birth to an NE infant, we reviewed the clinical notes of both the infants and the mothers by using a set of keywords and processes as described in Li et al. [15]. The regular expression is built on a set of keywords: {asphyxia, HIE, hypoxemia, hypoxic-ischemic encephalopathy, neonatal encephalopathy, reflexes, seizures, consciousness, respiration, magnesium, metabolic, chorioamnionitis, anesthesia}. Notably, a clinical expert then manually reviewed the clinical notes of all NE candidates to exclude records with negations, such as “no NE.” This process yielded 99 cases and 31,054 controls.
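A keyword screen of this kind might look like the following sketch. The keyword list is copied from the text; how the authors actually assembled their regular expression is not described here, so the word-boundary and case-insensitive matching, and the names `NE_PATTERN` and `find_ne_keywords`, are assumptions. The screen only flags candidate notes; negations such as “no NE” were handled by manual review in the study.

```python
import re

# Keywords copied from the text; the pattern construction is an assumption.
NE_KEYWORDS = [
    "asphyxia", "HIE", "hypoxemia", "hypoxic-ischemic encephalopathy",
    "neonatal encephalopathy", "reflexes", "seizures", "consciousness",
    "respiration", "magnesium", "metabolic", "chorioamnionitis", "anesthesia",
]

# Longer phrases are listed first so they are tried before shorter alternatives;
# \b keeps "HIE" from matching inside ordinary words such as "achieved".
NE_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(k) for k in sorted(NE_KEYWORDS, key=len, reverse=True))
    + r")\b",
    flags=re.IGNORECASE,
)


def find_ne_keywords(note):
    """Return the distinct keywords (lower-cased, sorted) found in one clinical note."""
    return sorted({m.group(0).lower() for m in NE_PATTERN.finditer(note)})
```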

The cases and controls were matched by age at delivery, race, singleton status, mode of delivery, and year of delivery. The cases were all singletons. The delivery year was matched to exclude the possibility of a change in the diagnostic method (e.g., changes from ICD-9 to ICD-10). Propensity score matching was performed using the nearest-neighbor matching method. After matching, the 99 cases were matched with 297 controls (mean difference of 0 and variance ratio of 1).
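As a rough illustration of nearest-neighbor matching at the 1:3 ratio implied by 99 cases and 297 controls, the sketch below greedily assigns each case the three unused controls with the closest propensity scores. The text does not say whether matching was done with or without replacement, whether a caliper was used, or how the scores were estimated; matching without replacement on precomputed scores is an assumption, and `match_nearest_neighbors` is a hypothetical name.

```python
def match_nearest_neighbors(case_scores, control_scores, ratio=3):
    """Greedy 1:ratio nearest-neighbor matching without replacement.

    case_scores and control_scores map a subject id to a precomputed
    propensity score. Returns a dict: case id -> list of matched control ids.
    """
    available = dict(control_scores)  # controls not yet used
    matches = {}
    for case_id, score in case_scores.items():
        # Rank the remaining controls by distance to this case's score.
        nearest = sorted(available, key=lambda c: abs(available[c] - score))[:ratio]
        matches[case_id] = nearest
        for control_id in nearest:
            del available[control_id]  # without replacement
    return matches
```

Greedy matching depends on the order in which cases are processed; an optimal-matching variant would remove that dependence at higher computational cost.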

2.2. Feature engineering

We define three sets of features based on Phecodes, which are used to characterize a mother’s disease: 1) chronic disease, 2) recurrence of an acute disease, and 3) temporal relationships between acute diseases. Chronic disease was chosen as a feature to examine the effects of chronic diagnosis history on NE prediction, while recurrence of an acute disease and temporal relationships between acute diseases were used to investigate the effect of the management of acute disease on NE prediction. We used the Chronic Condition Indicator (CCI) created by the Agency for Healthcare Research and Quality (AHRQ) to categorize the diagnosis codes as chronic or non-chronic diseases [22–24]. The CCI defines chronic diseases as those lasting 12 months or longer and either 1) placing limitations on self-care, independent living, and social interactions or 2) resulting in the need for ongoing intervention with medical products, services, and special equipment. The CCI has been widely used as a standardized classification of chronic conditions in the medical literature [25–27]. The CCI tool was expanded to identify four types of conditions: acute, chronic, both, and not applicable. We used ICD codes that fall into the acute and chronic categories to engineer our three types of features.

Chronic conditions develop slowly over time and require ongoing clinical follow-up. They have the potential to put women at high risk for pregnancy complications [14]. Chronic diseases were represented as binary features, indicating whether or not the mothers were diagnosed with each of the chronic conditions. Examples of chronic disease features are shown in Figure 2A.

Figure 2.

The three types of features.

A recurrence feature corresponds to the recurrence of an acute disease. The recurrence of diseases during the antepartum period can increase the risk of adverse fetal complications[28, 29]. We developed eight subtypes of recurrence features. These are defined as the number of times an acute disease was observed within the investigated period, the average value and standard deviation of durations between recurrences, the median value and standard error of durations between recurrences, minimum and maximum values of durations between recurrences, and the interquartile range of durations between recurrences. Examples of recurrence features are shown in Figure 2B.
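The eight recurrence subtypes can be computed from the sorted diagnosis dates of one acute Phecode for one mother. The following is a minimal sketch under several assumptions not stated in the text: durations are measured in days between consecutive diagnoses, the standard deviation is the sample standard deviation, the standard error is SD divided by the square root of the number of durations, the interquartile range uses the inclusive quantile method, and duration statistics are set to zero when there are too few observations. The function name `recurrence_features` is hypothetical.

```python
import math
import statistics
from datetime import date


def recurrence_features(dates):
    """Eight recurrence features for one acute Phecode in one mother."""
    dates = sorted(dates)
    # Durations (in days) between consecutive diagnoses of the same disease.
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    feats = {"count": len(dates)}
    if not gaps:
        # Assumption: zero-fill when the disease was seen at most once.
        feats.update(mean=0.0, sd=0.0, median=0.0, se=0.0, min=0.0, max=0.0, iqr=0.0)
        return feats
    if len(gaps) > 1:
        sd = statistics.stdev(gaps)
        q1, _, q3 = statistics.quantiles(gaps, n=4, method="inclusive")
        iqr = q3 - q1
    else:
        sd, iqr = 0.0, 0.0  # a single duration has no spread
    feats.update(
        mean=statistics.mean(gaps),
        sd=sd,
        median=statistics.median(gaps),
        se=sd / math.sqrt(len(gaps)),
        min=min(gaps),
        max=max(gaps),
        iqr=iqr,
    )
    return feats
```

With diagnoses on 1 March, 11 March, 31 March, and 30 April 2016, the durations are 10, 20, and 30 days, giving a count of 4, a mean and median of 20 days, and an SD of 10 days.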

Temporal features include information about the temporal order of acute diseases. We added temporal features to see whether occurrences of temporal patterns between diagnoses before delivery are a risk factor for NE. Several disease-spectrum-wide studies have shown that the order of diagnoses affects the risk of future perinatal complications [30–33]. The details by which temporal features are engineered are provided in Appendix A.1. Examples of temporal features are shown in Figure 2B.
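Because the construction details live in Appendix A.1, which is not reproduced here, the following is only one plausible reading of a temporal feature: an ordered pair A->B is flagged when the first diagnosis of acute disease A precedes the first diagnosis of acute disease B. Both that rule and the name `temporal_pair_features` are assumptions made for illustration and may differ from the authors' definition.

```python
from itertools import permutations


def temporal_pair_features(first_seen):
    """Return the set of ordered pairs 'A->B' where A was first diagnosed before B.

    first_seen maps a Phecode to the time of its first diagnosis
    (any comparable value, e.g. days before delivery counted from a fixed origin).
    """
    return {
        f"{a}->{b}"
        for a, b in permutations(first_seen, 2)
        if first_seen[a] < first_seen[b]
    }
```

Under this reading, a mother first diagnosed with Phecode 652 and later with Phecode 655 would carry the feature `652->655`; diseases first seen at the same time produce no ordered pair.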

2.3. Feature selection

Features occurring in fewer than 10 of the matched cases and controls in total were excluded, because features supported by so few samples have little influence on the NE predictions. Additionally, feature selection algorithms were used to select the most relevant features to improve classification accuracy.
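The prevalence filter can be sketched as follows, assuming each mother is represented as a dictionary of feature values and that a feature "occurs" for a mother when its value is non-zero. That representation and the name `drop_rare_features` are illustrative assumptions.

```python
def drop_rare_features(rows, min_samples=10):
    """Keep only features that are observed (non-zero) in at least min_samples rows.

    rows: one dict per mother, mapping feature name -> value.
    """
    counts = {}
    for row in rows:
        for name, value in row.items():
            if value:  # zero or missing counts as "not observed"
                counts[name] = counts.get(name, 0) + 1
    keep = {name for name, n in counts.items() if n >= min_samples}
    return [{k: v for k, v in row.items() if k in keep} for row in rows]
```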

Among various feature selection algorithms, we used a wrapper method based on sequential forward feature selection (SFS) [34, 35], a greedy search algorithm that builds a feature subset by evaluating each remaining feature individually and selecting the one that results in the best performance. The SFS process adds one feature in each iteration until a pre-set criterion is reached. The stopping criterion in our study is that the AUC does not increase by more than 0.5% for 3 consecutive iterations. We performed hyper-parameter tuning and feature selection simultaneously in the inner loop of leave-one-out nested cross-validation because the investigated feature subsets may impact the values of hyper-parameters (Appendix A.2). The process of feature selection, hyper-parameter tuning, and leave-one-out cross-validation is shown in Figure A1 in the Appendix.
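The SFS loop with this stopping rule can be sketched generically. In the sketch, `score_fn` is a hypothetical callback that returns the cross-validated AUC of a model trained on a given feature subset, and "0.5%" is interpreted as an absolute AUC gain of 0.005 over the best subset seen so far; both are assumptions, since the text does not specify them.

```python
def sequential_forward_selection(features, score_fn, min_gain=0.005, patience=3):
    """Greedy forward selection.

    Adds the single best remaining feature per iteration and stops once
    `patience` consecutive additions fail to beat the best score by more
    than `min_gain`. Returns (best_subset, best_score).
    """
    selected, best_subset = [], []
    best_score = float("-inf")
    stale = 0  # consecutive iterations without a sufficient gain
    remaining = list(features)
    while remaining and stale < patience:
        # Score every candidate when added to the current subset.
        scored = [(score_fn(selected + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda pair: pair[0])
        selected.append(feature)
        remaining.remove(feature)
        if score > best_score + min_gain:
            best_score, best_subset, stale = score, list(selected), 0
        else:
            stale += 1
    return best_subset, best_score
```

Returning the best subset rather than the last one means the features added during the final non-improving iterations are discarded.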

2.4. Training models

We constructed models based on typical machine learning frameworks to compare and contrast their individual performance: elastic net regularized logistic regression (LR), random forest (RF), and XGBoost. Logistic regression was chosen since it is one of the most common algorithms used for binary classification. We chose a random forest (RF) prediction algorithm because it is a robust method that performs consistently well with noisy and missing data. XGBoost uses the same model representation and inference as gradient-boosted decision trees but incorporates a different training algorithm.

To learn the impact of acute and chronic diseases on NE forecasts, we trained three types of models: CDM relying on chronic disease features, ADM based on recurrence and temporal features, and ACDM using a combination of all three types of features.

In addition, we compared the ACDM models with two state-of-the-art NE forecasting models. The first model, from Li et al. [15], represented ICD-9 and CPT codes as binary features (i.e., present/absent) and applied an elastic net logistic regression (ENLR), which we refer to as ENLR-Li. The second model, from Gao et al. [13], mapped the ICD-9 codes into a vector space using word2vec and built a Long Short-Term Memory (LSTM) model, which we refer to as LSTM-Gao. To enable a direct comparison, we mapped ICD codes to Phecodes and reconstructed these prediction models using our matched cases and controls. We did not use an LSTM to train our own NE prediction models because an LSTM discovers relevant time-dependent relationships on its own, so our feature engineering steps are not compatible with the LSTM architecture.

We employed leave-one-out nested cross-validation for hyper-parameter tuning, feature selection, and evaluation of the classification models' performance. The training and testing processes are depicted in Figure A1 in the Appendix.
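The index bookkeeping behind nested cross-validation can be sketched as below. The outer loop holds out one sample for testing; the inner loop works only on the remaining samples, so tuning and feature selection never see the test sample. Whether the inner loop is itself leave-one-out is not stated in the text and is assumed here; `nested_loo_splits` is a hypothetical name.

```python
def nested_loo_splits(n_samples):
    """Yield (test_index, outer_train, inner_splits) for nested leave-one-out CV.

    inner_splits is a list of (inner_train_indices, validation_index) pairs
    built only from outer_train, so the test sample never leaks into tuning.
    """
    indices = list(range(n_samples))
    for test_idx in indices:
        outer_train = [i for i in indices if i != test_idx]
        inner_splits = [
            ([j for j in outer_train if j != val_idx], val_idx)
            for val_idx in outer_train
        ]
        yield test_idx, outer_train, inner_splits
```

With n samples this produces n outer folds and n − 1 inner folds per outer fold, so the number of model fits grows quadratically; that is manageable for a matched cohort of a few hundred mothers but costly for larger data.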

2.5. Performance evaluation

We measured sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the receiver operating characteristic curve (AUC) to evaluate model performance. Since we used leave-one-out nested cross-validation, we did not obtain 95% confidence intervals for the performance measures (e.g., AUC). We used DeLong’s test [34], a nonparametric test, to assess whether there is a statistically significant difference in the AUC between two models (i.e., CDM vs. ADM, and ACDM vs. LSTM-Gao). To determine the importance of each feature, we used the Gini importance, which is derived directly from the resulting random forest models. The higher the Gini importance score of a feature, the more significant its role in prediction [36, 37]. In addition, we applied Fisher’s exact test and the Mann-Whitney U test to categorical and continuous variables, respectively, to measure the ability of a variable to distinguish NE cases from controls.
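The four threshold-based measures follow directly from the confusion matrix. The sketch below uses hypothetical counts chosen only for illustration; the function name `classification_metrics` is likewise an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that are flagged
        "specificity": tn / (tn + fp),  # share of true controls that are cleared
        "ppv": tp / (tp + fp),          # share of flagged samples that are cases
        "npv": tn / (tn + fn),          # share of cleared samples that are controls
    }


# Hypothetical counts: 20 cases (10 detected) and 60 controls (54 cleared).
metrics = classification_metrics(tp=10, fp=6, tn=54, fn=10)
```

Unlike sensitivity and specificity, PPV and NPV depend on the case-to-control ratio of the evaluated sample, so values obtained on a 1:3 matched cohort would not carry over unchanged to an unmatched population where NE is far rarer.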

3. Results

3.1. Clinical and demographic characteristics of the patients

Since we identified the controls by matching their age, race, singletons, mode of delivery, and year of delivery, the distributions of these factors are exactly the same in cases and controls (Table 1).

Table 1.

A summary of the characteristics of the study population.

| Characteristic | Cases (n = 99) | Controls (n = 297) |
|---|---|---|
| Age | | |
| 19 and under | 7 (7.1) | 21 (7.1) |
| 20–24 | 19 (19.2) | 51 (19.2) |
| 25–29 | 32 (32.3) | 96 (32.3) |
| 30–34 | 26 (26.3) | 78 (26.3) |
| 35 and over | 15 (15.1) | 45 (15.1) |
| Race | | |
| White | 61 (61.6) | 183 (61.6) |
| Black | 26 (26.3) | 81 (26.3) |
| Other | 12 (12.1) | 36 (12.1) |
| Mode of delivery | | |
| Natural | 37 (37.4) | 111 (37.4) |
| Cesarean | 62 (62.6) | 186 (62.6) |
| Year of birth | | |
| 2007–2010 | 20 (20.2) | 60 (20.2) |
| 2011–2014 | 49 (49.5) | 147 (49.5) |
| 2015–2017 | 30 (30.3) | 90 (30.3) |

Data are n (%) unless otherwise specified

The ability of the chronic and acute features to distinguish cases from controls is summarized in Table A.1. None of the chronic disease features differed significantly between cases and controls. Overall, 13 of the 304 features could distinguish cases from controls. Of these 13 features, eight are recurrence features and five are temporal features. They are related to Known or suspected fetal abnormality affecting management of mother (Phecode: 655; ICD-9: 655–656, 659 and 678; ICD-10: O30, O35-O36, O43, and O75-O77).

3.2. Performance comparison of the ADM and CDM models

The total number of all three types of features was 304 (Table A.1.). Consequently, the total number of features used for training the CDM was 43. For the ADM and ACDM, we applied feature selection, and the number of features in both ADM and ACDM was 54. The ADM models were trained based on 35 recurrent features and 19 temporal features, whereas the ACDM models were trained based on 8, 32, and 14 chronic, recurrence, and temporal features, respectively.

The performance measures for all ADM and CDM models are listed in Table A.2. The average AUC for CDM was 0.681, while the average AUC for ADM was 0.842. The random forest models had the highest AUC in both ADM and CDM models. The differences in NE prediction performance between the best ADM and CDM models are shown in Table 2. Their ROCs are shown in Figure 3. The best ADM model outperformed the CDM model, with an AUC difference of 0.161 and p-value < 0.001. This suggests that chronic features provide limited information in NE prediction, as the AUC of the models based on them is approximately 0.7.

Table 2.

Performance of the best CDM and ADM models.

| Model | AUC | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|
| CDM (Random forest) | 0.681 | 0.7 | 0.644 | 0.4 | 0.864 |
| ADM (Random forest) | 0.842 | 0.25 | 0.983 | 0.833 | 0.797 |

CDM, chronic disease model; ADM, acute disease model; AUC, area under the receiver operating characteristic curve; PPV, positive predictive value; NPV, negative predictive value.

Figure 3.

Receiver operating characteristic curves of the CDM and ADM models.

3.3. Performance comparison of the ACDM with the state-of-the-art

The performance measures for all ACDM, ENLR-Li, and LSTM-Gao models are provided in Table A.2. The ACDM achieved the best AUC (0.889), compared to ENLR-Li (AUC: 0.828) and LSTM-Gao (AUC: 0.854), as shown in Table 3. Their ROCs are shown in Figure 4. DeLong’s test showed that the ACDM significantly outperformed both ENLR-Li and LSTM-Gao, with AUC differences of 0.0612 (p < 0.001) and 0.0351 (p = 0.0129), respectively.

Table 3.

Performance of the best ACDM, ENLR-Li, and LSTM-Gao models.

| Model | AUC | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|
| ACDM (Random forest) | 0.889 | 0.45 | 0.933 | 0.692 | 0.836 |
| ENLR-Li | 0.828 | 0.1 | 0.967 | 0.5 | 0.76 |
| LSTM-Gao | 0.854 | 0.5 | 0.9 | 0.625 | 0.844 |

ACDM, acute and chronic disease model; AUC, area under the receiver operating characteristic curve; PPV, positive predictive value; NPV, negative predictive value.

Figure 4.

Receiver operating characteristic curves of the ACDM, ENLR-Li, and LSTM-Gao models.

3.4. Feature importance measured by the ACDM model

The twenty features with the largest Gini importance according to the best ACDM model are listed in Table 4. Thirteen of these features (65%) corresponded to recurrence of acute diseases, such as the recurrence of Known or suspected fetal abnormality affecting management of mother (Phecode: 655). Five of the features (25%) were temporal relations among acute illnesses, such as Malposition and malpresentation of fetus or obstruction (Phecode: 652; ICD-9: 652–653, and 660; ICD-10: O32–33, and O64–66) ➔ Known or suspected fetal abnormality affecting management of mother (Phecode: 655).

Table 4.

The twenty features with the highest Gini importance in the best ACDM model.

| Rank | Feature | Gini importance |
|---|---|---|
| 1 | The number of 655 recurrences | 0.108 |
| 2 | SD between 655 recurrences | 0.0919 |
| 3 | SE between 655 recurrences | 0.0809 |
| 4 | Interquartile range of 655 recurrences | 0.0731 |
| 5 | Mean duration between 665 recurrences | 0.0599 |
| 6 | Mean duration between 655 recurrences | 0.0592 |
| 7 | Maximum duration between 655 recurrences | 0.0565 |
| 8 | Median duration between 655 recurrences | 0.0511 |
| 9 | The number of 665 recurrences | 0.0432 |
| 10 | Maximum duration between 665 recurrences | 0.0398 |
| 11 | Minimum duration between 655 recurrences | 0.0379 |
| 12 | 665→655 | 0.0302 |
| 13 | 652→655 | 0.0233 |
| 14 | 653→655 | 0.0196 |
| 15 | 293 | 0.0155 |
| 16 | 652→653 | 0.0132 |
| 17 | 655→665 | 0.0112 |
| 18 | The number of 653 recurrences | 0.0108 |
| 19 | Mean duration between 635 recurrences | 0.0104 |
| 20 | 785 | 0.00818 |

Phecode 293: “Symptoms involving head and neck”

Phecode 635: “Hemorrhage during pregnancy; childbirth and postpartum”

Phecode 652: “Malposition and malpresentation of fetus or obstruction”

Phecode 653: “Problems associated with amniotic cavity and membranes”

Phecode 655: “Known or suspected fetal abnormality affecting management of mother”

Phecode 665: “Obstetrical/birth trauma”

Phecode 785: “Abdominal pain”

Two of the features (10%) were chronic illnesses, including Symptoms involving head and neck (Phecode: 293; ICD-9: 784; ICD-10: R90-R22) and Abdominal pain (Phecode: 785; ICD-9: 789; ICD-10: R10). The Gini importance for each feature is listed in Table A.3.

4. Discussion

This investigation augments the existing knowledge base for NE in that it differentiates acute from chronic diseases when predicting NE. The CDM models showed poor performance on predicting NE compared with the ADM models, while the ACDM models accounting for the impact of both acute and chronic diseases achieved significantly better performance than the current state of the art.

Beyond improving NE prediction performance, our work also generated many recurrent and temporal features, which were assigned the highest weights in predicting NE. The twenty features with the highest importance identified by the best ACDM model included well-known risk factors. For example, Known or suspected fetal abnormality affecting management of mother identified by our model is a known predictor of NE [11, 12, 38]. Also, Obstetrical/birth trauma (Phecode: 665; ICD-9: 664–665, and 767; ICD-10: O70–71, P10–15, and P52), which was identified as an important predictor in the ACDM model, can occur from forceps or difficult cesarean delivery and is highly associated with NE [39, 40]. Hemorrhage in early pregnancy is another example of a well-known NE factor identified by the ACDM model. It is recognized that in any situation where there is a hemorrhage in early pregnancy, the infant can be affected if not delivered fast enough [41].

Beyond the features confirmed by the literature, we also discovered several potential novel NE risk factors. Specifically, the temporal relation of Problems associated with amniotic cavity and membranes (Phecode: 653) ➔ Known or suspected fetal abnormality affecting management of mother (Phecode: 655) was one of the most important features for predicting NE. One likely interpretation is that if maternal membranes have been ruptured for a prolonged period of time, it can cause infection and ultimately lead to NE [42, 43]. A second temporal relation we discovered was Malposition and malpresentation of fetus or obstruction (Phecode: 652) ➔ Known or suspected fetal abnormality affecting management of mother (Phecode: 655). This also appears to have face validity: malposition usually refers to breech presentation, and most women have a cesarean delivery if this is discovered. We suspect that what is happening here is either a preterm delivery of a fetus in breech presentation or an obstruction leading to head entrapment and NE. Symptoms involving head and neck and Abdominal pain are not generally considered predictors of NE, but they emerged as novel predictors in our results, which suggests other avenues for research.

The duration between the initial onset and recurrence of an acute disease also plays a key role in predicting NE. For instance, the number of recurrences for fetal abnormality affecting management of mother and mean duration between the recurrences were the 1st and 6th ranked features by the best ACDM model. A higher frequency of an acute disease recurrence with a short duration between the recurrences may reflect a patient with high severity of the disease. The more severe the disease, the more likely a patient will visit the hospital often and receive the same diagnosis.

The recurrent and temporal features generated by this study can provide evidence for clinicians to assess acute diseases in terms of their recurrences and the temporal relations among them. This, in turn, can assist clinicians in understanding the progression pathways of NE. The NE predictions also have the potential to assist clinicians in making clinical decisions. For instance, given a set of patients whose due dates are on the horizon, the model could be used to predict their risk of delivering an NE infant. The predicted results could be sent to clinicians along with the identified risks (chronic and acute features) to help them decide which patients and which types of risk need to be managed in advance to prevent NE or reduce its severity. Although the NE prediction model developed in this study achieves an AUC of 0.889, the recall (0.45) and PPV (0.692) are not high. Thus, there is still considerable room for performance improvement.

We created a list of recurrent features (which represent the impact of recurrence of an acute disease) and temporal features (which represent the impact of temporal relationships between acute diseases) and showed their importance in predicting NE. The ranks of these features could provide evidence for researchers when they design hypotheses for clinical trials. Researchers who focus on the prediction of perinatal morbidity may add the engineered features used in this study to their models to improve prediction performance.

The strength of our study is that we developed approaches to engineer three types of features and designed extensive experiments to measure the difference in the impact of acute and chronic diseases on NE prediction. Both the features and the experimental designs can be leveraged by future machine learning-based NE studies. Meanwhile, compared to the state-of-the-art NE prediction models, our model achieved the best performance, with an improvement of AUC ranging from 2% to 4%.

However, this study has several limitations, which can serve as guidelines for future investigations. First, we relied upon data from a single organization with a relatively small number of cases, reducing the generalizability of our findings. Further validation studies need to use a larger quantity of EHR data from a variety of hospitals with varying practice patterns and patient populations. Second, although we adjusted for certain potential confounders (e.g., age, race, singleton status, mode of delivery, and year of delivery), there are others that may affect disease progression (e.g., medications, smoking habits, social determinants, access to health care, health insurance, health inequality, and general health status). We did not find an efficient way to incorporate such a large number of factors (e.g., over 10 thousand medications) in our setting, which has a small number of cases. Finally, the reasons why models based on chronic features perform worse than those based on acute features are unclear. One potential reason is that we engineered a single binary feature for each chronic disease, which carries less information than the eight continuous variables engineered for each acute condition. Most chronic diseases can occur with varying grades of severity, but, unfortunately, due to the limitations of the EHR data, it is hard to determine the severity of chronic illnesses in our study.

This study investigated the differences in NE prediction models that rely on acute and chronic diseases. The empirical results indicated that acute diseases play a more important role in predicting NE.

Summary points.

What was already known on the topic

  • Neonatal encephalopathy (NE) is associated with high mortality and morbidity, and antenatal maternal factors have been implicated in the pathogenesis of NE.

  • Many studies have applied machine learning algorithms for the prediction of neonatal encephalopathy using electronic health records.

What this study adds to our knowledge

  • The developed models can predict NE risk ahead of childbirth rather than several hours or days after the delivery.

  • This study demonstrated that acute diseases play a more important role in predicting NE risk than chronic diseases (AUC difference: 0.161, p-value < 0.001).

  • Incorporating the recurrence of an acute disease and temporal relationships between acute diseases is important to predict NE.

  • Symptoms involving head and neck and Abdominal pain are not generally considered predictors of NE, but they emerged as important predictors in our model.

  • The machine-learned NE prediction model achieves a 4% improvement in the AUC over state-of-the-art models.

Highlights.

  • We measure and quantify the impact of acute and chronic diseases on the neonatal encephalopathy (NE) forecast, which can help healthcare organizations and clinicians prioritize the management of chronic and acute diseases to prevent NE.

  • Compared to chronic illnesses, acute diseases occurring before childbirth play a more important role in predicting NE.

  • Recurrences of acute diseases and the transition from one acute disease to another before childbirth were assigned the highest weight in predicting NE.

Acknowledgment

The authors would like to acknowledge everybody who participated in this study.

Funding

This research was supported, in part, by the National Library of Medicine of the National Institutes of Health under Award Number R01LM012854 and T15LM007450.

Footnotes

Declarations of Competing interests

The authors declare that they have no competing interests.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

