Abstract
Objective:
Assess a machine learning method of serially updated mortality risk.
Design:
Retrospective analysis of a national database (Health Facts®, Cerner Corporation, Kansas City, MO).
Setting:
Hospitals caring for children in intensive care units (ICUs).
Patients:
A total of 27,354 admissions cared for in ICUs from 2009 to 2018.
Interventions:
None
Main Outcome:
Hospital mortality risk estimates determined at 6-hour time periods during care in the ICU. Models were truncated at hour 180 due to decreased sample size secondary to discharges and deaths.
Measurements and Main Results:
The Criticality Index, based on physiology, therapy, and care intensity, was computed for each admission for each time period and calibrated to hospital mortality risk (Criticality Index - Mortality (CI-M)) at each of 29 time periods (initial assessment: 6 hours; last assessment: 180 hours). Performance metrics and clinical validity were determined from the held-out test sample (n = 3,453, 13%). Discrimination assessed with the area under the receiver operating characteristic curve was 0.852 (95% CI: 0.843, 0.861) overall and ≥ 0.80 for all individual time periods. Calibration assessed by the Hosmer-Lemeshow goodness-of-fit test showed good fit overall (p = 0.196) and was not statistically significant for 28 of the 29 time periods. Calibration plots for all models revealed the intercept ranged from −0.002 to 0.009, the slope ranged from 0.867 to 1.415, and the R2 ranged from 0.862 to 0.989. Clinical validity assessed using population trajectories and changes in the risk status of admissions (clinical volatility) revealed clinical trajectories consistent with clinical expectations and greater clinical volatility in deaths than survivors (p < 0.001).
Conclusion and Relevance:
Machine learning models incorporating physiology, therapy, and care intensity can track changes in hospital mortality risk during intensive care. The CI-M’s framework and modeling method are potentially applicable to monitoring clinical improvement and deterioration in real-time.
Keywords: pediatrics, severity of illness, machine learning, pediatric intensive care unit, dynamic modeling, criticality index, mortality risk
Introduction
Mortality risk is the traditional measure of severity of illness for children in intensive care units (ICUs). Static mortality risk measures, including the Pediatric Risk of Mortality (PRISM) score and the Pediatric Index of Mortality (PIM), were developed with a focus on quality assessment and, therefore, focus on the early portion of an ICU stay (1, 2). These were not designed for, nor widely adopted for, use in individual patients, at least in part because rapid changes in severity and therapies and the frequent acquisition of data made risk assessments obsolete soon after they were obtained. The Pediatric Logistic Organ Failure (PELOD) score can be collected serially, but PELOD is updated only daily, and its relatively simplistic framework limits its potential to enhance clinician assessment (3).
Early identification of ICU patients at increasing or high risk of in-hospital mortality might improve clinical and operational decision-making and potentially improve outcomes. Dynamic mortality risk assessment, objectively tracking changes in mortality risk, could support the clinical decision-making of health care providers, especially those with less experience or expertise. Machine learning methods applicable to the continuous flow of ICU data could assess improvement or deterioration of individual patients. A recent “proof of concept” report showed potential for this approach (4). Recently, we used machine learning to develop the Criticality Index, a new severity index for pediatric inpatients based on physiology, therapies, and intensity of care (5–7). This framework follows the prominent threads of critical care severity research, including early qualitative assessments, identification and quantification of therapies, and physiological profiles (8–12). The Criticality Index was computed every 6 hours and was calibrated to the probability of ICU care. As the Criticality Index increased for ICU patients, the intensity and complexity of care increased. Therefore, we hypothesized that we could use the Criticality Index to estimate the probability of hospital death in ICU patients (Criticality Index – Mortality (CI-M)) using 6-hour time periods to measure dynamic changes in mortality risk. Before use in individuals, predictive models developed from populations that purport to measure changing clinical status should maximize predictive performance, especially calibration at the different times used to assess change: if models are not well calibrated, change cannot be reliably assessed. Additionally, other measures of construct validity should be used to assess “real-life” scenarios.
The overall aim of this study was the assessment of a machine learning method of serially updating mortality risk for children in ICUs. In this analysis, we estimated mortality risk using the Criticality Index and used serial models over time to maximize performance for individual admissions. We hypothesized that this method would perform well in terms of discrimination, calibration and other performance metrics and would reflect serial risk changes for survivors and deaths. A priori, we anticipated trajectories of mortality risk as follows: the risks of high-risk deaths would stay high or increase over time; the risks of high-risk survivors would decrease; the risks of low-risk deaths would increase; and the risks of low-risk survivors would stay low. We also anticipated that deaths would, in general, have a more volatile clinical course than survivors when assessed with serial mortality risks.
Methods
Sample
The dataset was derived from Health Facts® (Cerner Corporation, Kansas City, MO), a voluntary, comprehensive de-identified clinical database on admissions from U.S. hospitals with a Cerner data-use agreement. Data are obtained from the electronic health record (EHR), are date- and time-stamped, and include admission and demographic information, laboratory results, medications information, diagnostic and procedure codes, vital signs, respiratory data, and hospital outcome. Not all data are available for all admissions. Health Facts® is representative of the United States (13) and used in previous care assessments, including the APACHE score (14), the Criticality Index (5–7), and pediatric medication use (15, 16).
Details on data cleaning and definitions, medication classification, and diagnostic classifications have been published (5, 6, 17). For this analysis, we emphasized the multi-system nature of ICU disease by categorizing all systems of dysfunction based on the discharge ICD-9 and ICD-10 classifications. Inclusion criteria included age <22 years (18) and care in the ICU from 2009 to 2018. Exclusion criteria included hospital length of stay (LOS) >100 days, ICU LOS >30 days, or neonatal ICU care. There were 88 hospitals with an average of 311 admissions/hospital. The dataset included hospitals contributing both small and large samples to expand the generalizability of the methodology. The study was approved by the Children’s National institutional review board (protocol #9282). The information and methodology conform to the “Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD)” guidelines and are included in the Supplemental Digital Data (SDD) 1.
Variables
The Criticality Index independent variables consist of 6 vital signs, 30 routinely measured laboratory variables, and medications classified into 143 categories, as well as the number of measurements per time period (5). The variables, definitions, and statistics have been reported in detail (5–7). Detailed data on the independent variables and imputed values are shown in SDD 1. The primary diagnosis was not used for modeling because it was determined at discharge. In the initial studies, positive pressure ventilation was used only to classify high-intensity ICU care. For this study, we used positive pressure ventilation as an independent variable and categorized an admission as “Yes” for ventilation for all time periods following its implementation. Criteria for positive pressure were continuous positive airway pressure, positive end expiratory pressure, and/or peak inspiratory pressure. Consistent with other machine learning models, we imputed laboratory results and vital signs using the last known result (4, 19–21). These imputed values have been reported (5, 6) and were identified by setting the count equal to zero (see below).
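As a minimal illustration of this carry-forward scheme (a hypothetical sketch, not the authors' code; the function name `locf_impute` and the one-value-per-period input are simplifying assumptions):

```python
def locf_impute(periodic_values):
    """Last-observation-carried-forward imputation for one variable.

    `periodic_values` holds one measurement per 6-hour period, with
    None marking periods in which nothing was measured. Returns the
    imputed series plus a per-period measurement count; count == 0
    flags a carried-forward (imputed) value, mirroring the paper's
    convention of identifying imputed values by a zero count.
    """
    imputed, counts, last = [], [], None
    for value in periodic_values:
        if value is None:
            imputed.append(last)  # carry the last known result forward
            counts.append(0)
        else:
            imputed.append(value)
            counts.append(1)
            last = value
    return imputed, counts
```

For example, `locf_impute([98, None, None, 101])` returns `([98, 98, 98, 101], [1, 0, 0, 1])`, so downstream models can distinguish measured from carried-forward values.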
The outcome for all CI-M models (see below) was hospital outcome as survival or death. Time to death was not included in the model since a majority of deaths in many pediatric ICUs are associated with brain death or withdrawal and limitations of care (22).
Machine Learning Methodology and Statistical Analysis
The hospital course was discretized into consecutive 6-hour time periods. Models were independently developed for each time period (n=29) from the second (hour 6) to the 30th (hour 180). Modeling was truncated at 180 hours to ensure an appropriate sample for goodness-of-fit (GOF) testing because discharges and deaths had reduced the test sample to <750 admissions and 30 deaths by that time. For each time period, 87% of admissions were randomly used for model development and 13% for testing. We used individual machine learning models for each time interval to maximize predictive performance because the eventual intent is to apply these methods to individual admissions.
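A minimal sketch of the discretization and hold-out split described above; this assumes a simple random split, and the names `period_index` and `split_admissions` (and the seed) are illustrative, not the authors' implementation:

```python
import random

PERIOD_HOURS = 6
MAX_HOUR = 180  # modeling truncated at the 30th period

def period_index(hour):
    """Map an elapsed ICU hour to its 6-hour period (1..30).

    Hour 6 closes period 1, hour 6.5 falls in period 2, and so on;
    hours beyond 180 fall outside the modeled window (returns None).
    """
    if hour <= 0 or hour > MAX_HOUR:
        return None
    return int((hour - 1e-9) // PERIOD_HOURS) + 1

def split_admissions(admission_ids, test_fraction=0.13, seed=0):
    """Randomly hold out ~13% of admissions for testing (87% train)."""
    rng = random.Random(seed)
    ids = list(admission_ids)
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]  # (train, test)
```

With 100 admissions this yields an 87/13 train/test partition, with every admission assigned to exactly one side.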
We computed each admission’s Criticality Index for each time period using the previously published machine learning methodology, detailed in SDD 1 (5–7). Previously, we demonstrated that as the Criticality Index increases, the intensity and complexity of care increases. For each time period, we added a final step that calibrated the Criticality Index and positive pressure ventilation variables to the hospital outcome of survival or death using generalized thin plate splines for binary outcomes (23, 24). The performance of the individual CI-M models and the composite of all models was assessed using a) discrimination (area under the receiver operating characteristic curve [AUROC]) and calibration and b) specificity, precision, F1 score, Matthews correlation coefficient (MCC), and negative predictive value at sensitivities of 0.85, 0.90, and 0.95. Calibration was assessed using two approaches. First, we computed the Hosmer-Lemeshow GOF test for each time period using risk intervals with at least 250 admissions. Second, we assessed calibration plots of the observed and expected proportions of deaths with linear regression using at least 10 risk intervals with equal numbers of admissions. Performance metrics for the calibration plots included the regression line slope, intercept, and coefficient of determination (R2). We also compared the observed to predicted number of outcomes for each of the calibration plot risk intervals and reported the percentage of intervals with no statistical evidence of difference (p > .05) (25–28). For perfect calibration, the intercept would be zero, the slope would be 1, and the R2 would be 1. We expected approximately 5% of the observed versus expected outcome intervals in the calibration plots to be statistically different.
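The two calibration checks can be sketched as follows. This is a hypothetical numpy illustration, not the authors' R code: the function name and binning details are assumptions, the paper's ≥250-admissions requirement for GOF intervals is not enforced here, and the chi-square p-value lookup is omitted (the statistic has n_bins − 2 degrees of freedom).

```python
import numpy as np

def calibration_metrics(y_true, y_prob, n_bins=10):
    """Equal-count risk intervals: observed vs. expected death proportions.

    Returns the calibration-plot regression intercept, slope, and R^2,
    plus the Hosmer-Lemeshow chi-square statistic (df = n_bins - 2).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    order = np.argsort(y_prob, kind="stable")
    bins = np.array_split(order, n_bins)  # risk intervals with equal counts
    expected = np.array([y_prob[b].mean() for b in bins])
    observed = np.array([y_true[b].mean() for b in bins])
    # Regression of observed on expected proportions (perfect: 0, 1, 1).
    slope, intercept = np.polyfit(expected, observed, 1)
    resid = observed - (intercept + slope * expected)
    ss_tot = observed - observed.mean()
    r2 = 1.0 - (resid @ resid) / (ss_tot @ ss_tot)
    # Hosmer-Lemeshow: sum of (O - E)^2 / (n * pbar * (1 - pbar)) per bin.
    hl = 0.0
    for b in bins:
        n, pbar = len(b), y_prob[b].mean()
        o, e = y_true[b].sum(), y_prob[b].sum()
        hl += (o - e) ** 2 / (n * pbar * (1.0 - pbar))
    return intercept, slope, r2, hl
```

On perfectly calibrated data the intercept approaches 0, the slope and R^2 approach 1, and the Hosmer-Lemeshow statistic approaches 0, matching the targets stated above.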
Since real-time application would be dependent, in part, on the worst performance at any time period, we examined the worst performing models for each of the calibration methods using the AUROCs, the GOF tests, the calibration plots, and observed versus expected proportion of deaths, and compared these data to the first and the last time period models.
Construct validity was assessed using population trajectories and mortality risk changes in consecutive time periods (clinical volatility) for individuals. First, we plotted the clinical trajectories (6) in the following groups based on the mortality risk determined from the first time period: Deaths in the highest risk decile of deaths (high-risk deaths), survivors in the highest risk decile of survivors (high-risk survivors), deaths in the lower risk deciles of deaths (low-risk deaths), and survivors in the lower risk deciles of survivors (low-risk survivors). A priori, we expected that the mortality risk of high-risk deaths would remain high risk or increase over time, the risk of high-risk survivors would decrease, the risk of low-risk deaths would increase, and the risk of low-risk survivors would remain low. Second, we computed the change in mortality risk for individuals over consecutive time intervals and evaluated all changes, and the maximum positive (clinical deterioration) and maximum negative (clinical improvement) change per admission. A priori, we expected that deaths would have larger changes representing more clinical volatility (instability). These comparisons used the Mann–Whitney U test.
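The per-admission volatility quantities can be sketched as below. This is an illustrative helper (the name `volatility` is hypothetical); the between-group Mann–Whitney U comparison described above is omitted.

```python
import numpy as np

def volatility(risk_series):
    """Consecutive-period mortality-risk changes for one admission.

    Returns all consecutive deltas, the maximum increase (worst clinical
    deterioration), and the maximum decrease (best clinical improvement,
    reported as a positive number). Admissions with fewer than two
    periods, or with only one-directional change, get 0.0 for the
    missing direction.
    """
    deltas = np.diff(np.asarray(risk_series, dtype=float))
    max_deterioration = float(deltas.max(initial=0.0))
    max_improvement = float(-deltas.min(initial=0.0))
    return deltas, max_deterioration, max_improvement
```

For a risk series of 0.10, 0.30, 0.20, the deltas are +0.20 and −0.10, so the maximum deterioration is 0.20 and the maximum improvement is 0.10.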
Results
A total of 27,354 admissions were included (Table 1). The mortality rate was 1.8% and the median ICU length of stay was 53 hours (25th, 75th percentiles: 24, 117 hours). Respiratory, cardiovascular, and neurological dysfunction occurred in 34.8% (n=9,515), 31.2% (n=8,541), and 25.7% (n=7,024) of admissions, respectively. The test sample started with 3,453 admissions and 97 deaths, with a decrease due to discharges and deaths of 6–7% per time period during the first 48 hours in the ICU and 3–6% per time period after 48 hours. A total of 47.5% (n=46) of the deaths occurred in the first 48 hours.
Table 1.
Population characteristics of children in intensive care units.
| Characteristic | Data |
|---|---|
| N | 27,354 |
| Femaleᵇ | 12,477 (45.6%) |
| Raceᵇ | |
| Black | 7,516 (27.5%) |
| Caucasian | 14,445 (52.8%) |
| Other-Unknown | 5,392 (19.7%) |
| Age (Months)ᵃ | 155.5 (26.6, 224.8) |
| Hospital LOS (hours)ᵃ | 86 (45, 171) |
| ICU LOS (hours)ᵃ | 53 (24, 117) |
| Hospital Mortalityᵇ | 501 (1.8%) |
| Positive Pressure Ventilationᵇᶜ | 3,881 (14.2%) |
| Systems of Dysfunctionᵇᵈ | |
| Respiratory System | 9,515 (34.8%) |
| Nervous System | 7,024 (25.7%) |
| Cardiovascular System | 8,541 (31.2%) |
| Infectious and Parasitic Diseases | 3,047 (11.1%) |
| Gastrointestinal System | 6,464 (23.6%) |
| Hematology | 5,790 (21.2%) |
| Musculoskeletal System | 2,832 (10.4%) |
| Endocrine, Nutritional, Metabolic Diseases | 9,281 (33.9%) |
| Neoplasms | 1,099 (4.0%) |
| Injury and Poisonings | 9,035 (33.0%) |
| Mental Disorders | 7,379 (27.0%) |
ᵃ Median (25th percentile, 75th percentile)
ᵇ N (%)
ᶜ Criteria for positive pressure were continuous positive airway pressure, positive end expiratory pressure, and/or peak inspiratory pressure.
ᵈ Categorization based on all discharge ICD-9 and ICD-10 data.
The AUROC assessing discrimination ranged from 0.797 to 0.894 for individual CI-M models (Figure 1a) and was 0.852 (95% CI: 0.843, 0.861) for all time periods combined (Figure 1b). Calibration assessed by GOF testing revealed that all models except one, at 72 hours of ICU care, had p values > .05 (see below). The calibration plots for all models for all time periods (Figure 2a) had intercepts ranging from −0.002 (hour 60) to 0.009 (hour 126), slopes ranging from 0.867 (hour 108) to 1.415 (hour 60), and R2 values ranging from 0.862 (hour 72) to 0.989 (hour 96). For all models combined (Figure 2b), the GOF significance level was 0.195, the intercept was 0.010, the slope was 0.903, and the R2 was 0.862. Comparison of the observed versus expected proportions of deaths in all calibration plot risk intervals found that 290 of 294 risk intervals (98.6%) were not statistically different. The overall and individual performance metrics at sensitivities of 0.85, 0.90, and 0.95 are shown in SDD 2. Overall, at a sensitivity of 0.90, specificity = 0.630 (95% CI: 0.625, 0.634), precision = 0.069 (95% CI: 0.065, 0.072), negative predictive value = 0.995 (95% CI: 0.995, 0.996), MCC = 0.184 (95% CI: 0.177, 0.191), and F1 = 0.127 (95% CI: 0.123, 0.132). Individual models were similar.
Figure 1.
Area under the receiver operating characteristic curves (AUROC) for all Criticality Index – Mortality models and for all models combined.
a. Separate models were developed for 29 6-hour time periods from hour 6 to hour 180. The dots are AUROCs for the individual models and the shaded areas are the 95% confidence intervals.
b. Composite AUROC. All models have been combined.
Figure 2.
Calibration plot performance metrics for all Criticality Index – Mortality models and for all models combined
a. Separate models were developed for 29 6-hour time periods from hour 6 to hour 180. Calibration plot metrics include the regression line intercepts (black), slopes (tan) and R2’s (blue). The shaded areas are the 95% confidence intervals.
b. Composite calibration plot. The dashed line is the line of identity and the solid line is the regression line. The plot includes 294 risk intervals from all models. The comparison of observed proportions to expected proportions of outcome for each interval were not statistically different (p > 0.05) for 290 (98.64%) of the risk intervals. Each interval had ≥ 170 admissions. The Hosmer-Lemeshow goodness-of-fit test for the overall calibration was 0.195.
The two worst performing time periods were hour 72, the only time period when the GOF p value was <0.05, and hour 60, when the calibration plot regression line slope was 1.415 and the R2 was 0.862. The AUROCs, calibration plots, and observed versus expected comparisons for the risk intervals of these time periods, together with comparison data from the first and last time periods, are shown in SDD 2. Neither hour 72 nor hour 60 was associated with other indicators of poor performance.
Construct validity for potential clinical application was assessed using population trajectories and clinical volatility. The mortality risk trajectories (Figure 3) demonstrated that the a priori expectations were correct. The high-risk death cohort had the highest mortality risks, which remained high throughout the ICU course. The high-risk survivor cohort and the low-risk death cohort had similar mortality risks for the first 48 hours, but the survivor cohort’s risks improved and reached the level of the low-risk survivor group at approximately 5 days, while the low-risk death cohort’s mortality risks rose slightly over the ICU course. The low-risk survivors’ mortality risks remained very low. Deaths demonstrated more clinical volatility than survivors (Figure 4a). The average increase in mortality risk in consecutive time periods was 0.021 for deaths and 0.006 for survivors (p<0.001), and the average decrease was 0.022 for deaths and 0.008 for survivors (p<0.001). Figures 4b and 4c show the maximum deterioration and maximum improvement for survivors and deaths, illustrating the higher volatility of deaths compared to survivors. The average maximum deterioration was 0.050 for deaths and 0.015 for survivors (p<0.001), and the average maximum improvement was 0.063 for deaths and 0.022 for survivors (p<0.001).
Figure 3.
Trajectories for high-risk deaths and survivors and low-risk deaths and survivors. Risk was computed with the Criticality Index – Mortality. High risk indicates the highest mortality risk decile determined from the initial time period. The shaded areas are the 95% confidence intervals. The mortality risk cutpoints for the high-risk cohorts were 15.1% for deaths and 11.3% for survivors. The trajectories were constructed from the total sample with the following sample sizes: low-risk survivors: 24,168 admissions; high-risk survivors: 2,685 admissions; low-risk deaths: 451 admissions; high-risk deaths: 50 admissions.
Figure 4.
Change in mortality risk for individual admissions in consecutive time periods. Risk was computed with the Criticality Index – Mortality. Data are shown for survivors (tan) and deaths (black). The frequencies (vertical axis) are standardized such that the added areas under the bars are 1 for each outcome group in each figure.
a. Change in mortality risk for all consecutive time intervals. Deaths demonstrate increased clinical volatility compared to survivors. The average increase in mortality was 0.021 for deaths and 0.006 for survivors (p<0.001) and the average decrease was 0.022 for deaths and 0.008 for survivors (p<0.001).
b. Maximum mortality risk increase (clinical deterioration). The average maximum deterioration was 0.050 for deaths and 0.015 for survivors (p<0.001).
c. Maximum mortality risk decrease (clinical improvement). The average maximum decrease was 0.063 for deaths and 0.022 for survivors (p<0.001).
Discussion
This study demonstrated the applicability of the CI-M for assessing serial changes in mortality risk for individuals. The Criticality Index was initially calibrated to the probability of ICU care and has been applied to determining future care needs for hospitalized children (7). This study expands its use by recalibrating it to mortality risk. We used individual machine learning models for 6-hour time intervals from 6 hours to 180 hours of ICU care to maximize predictive performance. Overall, model performance metrics were very good. The composite AUROC was 0.852 and, perhaps more important for potential individual applications, calibration was excellent: 28 of 29 Hosmer-Lemeshow GOF tests had p > .05, observed versus expected outcomes were not statistically different in 290 of 294 (98.6%) risk intervals of the individual models, and calibration plot metrics were very good for both the overall performance and the individual models. We evaluated in detail the two time periods with the worst calibration metrics and found that, while single metrics were notable, assessment with multiple metrics indicated that calibration was not consistently reduced. Thus, the methodology performed consistently well over the first seven and a half days of ICU care. Clinical validity, including clinical trajectories and clinical volatility, supported the potential for use in individuals by capturing the anticipated patterns of illness in survivors and deaths. Eventually, however, the methodology must be implemented and assessed in real-world use, including comparison to other severity assessment methods.
Experienced intensivists are excellent at assessing patients using clinical snapshots (29). The flow and amount of ICU data for patient assessment are substantial. For example, over 200 variables have been estimated as useful for the care of ventilated patients (30). The ability to successfully integrate this large amount of changing information on a continuous basis lies beyond the capabilities of most knowledgeable and perceptive caregivers (29). Less experienced and less skilled providers will integrate this information less well. Therefore, the addition of continuous or frequently updated risk assessments for children in ICUs could result in detection of clinical deterioration or improvement that might otherwise have gone unappreciated, providing an opportunity for earlier interventions and the potential for improved outcomes.
The CI-M has a strong conceptual framework based in physiology, therapeutics, and therapeutic intensity. It is currently calibrated to 6-hour time intervals but could be calibrated for continuous data. Single time period machine learning models for ICU mortality prediction have also performed very well (21, 31, 32). Our results are consistent with the recent single-site “proof of concept” analysis demonstrating the potential for a machine learning approach to follow changes in clinical status (4). However, there are notable differences between the studies. First, our analysis is based on a multicenter dataset, demonstrating that the methodology can be applied widely. Second, our method uses a transparent set of variables (5, 6). Since these methodologies should be expected to supplement physician judgment, we excluded variables that might have incorporated clinicians’ prognoses. Third, the neural networks were different. Since both approaches have positive attributes, future studies will be needed to assess the best approach. Fourth, we took special effort to assess the CI-M’s potential for use in individuals, including both GOF tests and calibration plots at 6-hour intervals, comparison of observed versus expected outcomes in 294 risk intervals, and assessments of mortality risk changes over time. Notably, we detected substantial differences in clinical volatility between deaths and survivors that could be useful at the bedside.
At this time, it is not clear if machine learning models developed on a multi-institutional dataset should be applied to individual sites, or if individual sites should apply the experience and machine learning approach from multi-institutional research to their sites. It is likely that models developed in individual units using a conceptual framework such as the Criticality Index or that of Aczon et al. supplemented with site-specific data and local decision making cutpoints will have improved performance and clinical applicability (4, 33). Optimizing performance is important if the application is intended for individual patients.
There are limitations to this analysis. First, we used a retrospective EHR dataset. While we have utilized this dataset for multiple pediatric analyses (5–7, 15, 16), prospective data collection could add additional data elements. Second, a more extensive exploration of machine learning methods might have uncovered better performing models. Our methodology was primarily designed to evaluate our overall aims by assessing the potential for eventual clinical use, not necessarily to optimize it for clinical use. Third, we did not evaluate the relative importance of individual data elements. Previously, we analyzed the global factors associated with prediction in our models and found that a relatively limited dataset composed primarily of physiological data and medication classes may be sufficient (34). Finally, prior to use as a patient-level assessment method, this methodology or any other will need real-world validation, including background or silent use, correlation of changes in clinical status with changes in the mortality risk computed by the models, and analyses of usefulness vis-à-vis providers of different experience and expertise.
Conclusion
Changing mortality risks for pediatric ICU patients can be measured with machine learning models based on the Criticality Index. Discrimination and calibration for all CI-M models was very good, and clinical validity was demonstrated using clinical trajectories and clinical volatility. The CI-M framework and modeling method are potentially applicable to monitoring patient improvement and deterioration in real-time.
Supplementary Material
Conflicts of Interest and Source of Funding.
Supported by philanthropic funding from Mallinckrodt LLC and by award numbers UL1TR001876 and KL2TR001877 from the NIH National Center for Advancing Translational Sciences (Anita Patel). The contents are solely the responsibility of the authors and do not necessarily represent the official views of the National Center for Advancing Translational Sciences or the National Institutes of Health.
Copyright form disclosure: Dr. Trujillo Rivera’s institution received funding from Children’s National Medical Center. Drs. Chamberlain, Patel, Morizono, and Pollack’s institutions received funding from Mallinckrodt. Drs. Patel and Morizono’s institutions received funding from Awards UL1TR001876 and KL2TR001877 from the National Institutes of Health (NIH), National Center for Advancing Translational Sciences. Drs. Patel, Morizono, and Pollack received support for article research from the NIH. Dr. Morizono disclosed having a 16% share as a founder of Cogthera LLC, a company that will develop drugs for cognitive impairment. Dr. Pollack’s institution received funding from the NIH. Dr. Heneghan disclosed that she does not have any potential conflicts of interest.
Footnotes
The authors have no conflicts of interest.
Code availability: The data extraction was done in SQL and R with custom code. The data preparation and exploration, model development and evaluation, generation of tables, plots, and results were done in R with custom code. Code for specific tasks is available upon request.
Article tweet: Dynamic mortality risk assessments for children in intensive care units can track changes in severity of illness.
Contributor Information
Eduardo A. Trujillo Rivera, George Washington University School of Medicine and Health Sciences, Washington, DC.
James M. Chamberlain, Department of Pediatrics, Division of Emergency Medicine, Children’s National Hospital and George Washington University School of Medicine and Health Sciences, Washington, DC.
Anita K. Patel, Department of Pediatrics, Division of Critical Care Medicine, Children’s National Hospital and George Washington University School of Medicine and Health Sciences, Washington, DC.
Hiroki Morizono, Children’s National Research Institute, Associate Research Professor of Genomics and Precision Medicine, George Washington University School of Medicine and Health Sciences, Washington, DC.
Julia A. Heneghan, Department of Pediatrics, Division of Critical Care Medicine, Children’s National Hospital and George Washington University School of Medicine and Health Sciences, Washington, DC. Current affiliation: Department of Pediatrics, Division of Critical Care Medicine, University of Minnesota Masonic Children’s Hospital, Minneapolis, MN.
Murray M. Pollack, Department of Pediatrics, Division of Critical Care Medicine, Children’s National Hospital and George Washington University School of Medicine and Health Sciences, Washington, DC.
References
- 1.Pollack MM, Holubkov R, Funai T, et al. The Pediatric Risk of Mortality Score: Update 2015. Pediatr Crit Care Med. 2016;17(1):2–9. doi: 10.1097/PCC.0000000000000558 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Straney L, Clements A, Parslow RC, et al. Paediatric index of mortality 3: an updated model for predicting mortality in pediatric intensive care. Pediatr Crit Care Med. 2013;14(7):673–681. doi: 10.1097/PCC.0b013e31829760cf [DOI] [PubMed] [Google Scholar]
- 3.Leteurtre S, Duhamel A, Salleron J, Grandbastien B, Lacroix J, Leclerc F; Groupe Francophone de Réanimation et d’Urgences Pédiatriques (GFRUP). PELOD-2: an update of the PEdiatric logistic organ dysfunction score. Crit Care Med. 2013. Jul;41(7):1761–73. doi: 10.1097/CCM.0b013e31828a2bbd. PMID: 23685639.doi:10.1016/S0140–6736(03)13908–6 [DOI] [PubMed] [Google Scholar]
- 4.Aczon MD, Ledbetter DR, Laksana E, Ho LV, Wetzel RC. Continuous Prediction of Mortality in the PICU: A Recurrent Neural Network Model in a Single-Center Dataset. Pediatr Crit Care Med. 2021;22(6):519–529. doi: 10.1097/PCC.0000000000002682
- 5.Rivera EAT, Patel AK, Chamberlain JM, et al. Criticality: A New Concept of Severity of Illness for Hospitalized Children. Pediatr Crit Care Med. 2021;22(1):e33–e43. doi: 10.1097/PCC.0000000000002560
- 6.Rivera EAT, Patel AK, Zeng-Treitler Q, et al. Severity Trajectories of Pediatric Inpatients Using the Criticality Index. Pediatr Crit Care Med. 2021;22(1):e19–e32. doi: 10.1097/PCC.0000000000002561
- 7.Trujillo Rivera EA, Chamberlain JM, Patel AK, et al. Predicting Future Care Requirements Using Machine Learning for Pediatric Intensive and Routine Care Inpatients. Crit Care Explor. 2021;3(8):e0505. Published 2021 Aug 10. doi: 10.1097/CCE.0000000000000505
- 8.Cullen DJ, Civetta JM, Briggs BA, Ferrara LC. Therapeutic intervention scoring system: a method for quantitative comparison of patient care. Crit Care Med. 1974;2(2):57–60.
- 9.Keene AR, Cullen DJ. Therapeutic Intervention Scoring System: update 1983. Crit Care Med. 1983;11(1):1–3. doi: 10.1097/00003246-198301000-00001
- 10.Yeh TS, Pollack MM, Holbrook PR, Fields AI, Ruttiman U. Assessment of pediatric intensive care--application of the Therapeutic Intervention Scoring System. Crit Care Med. 1982;10(8):497–500. doi: 10.1097/00003246-198208000-00002
- 11.Proulx F, Gauthier M, Nadeau D, Lacroix J, Farrell CA. Timing and predictors of death in pediatric patients with multiple organ system failure. Crit Care Med. 1994;22(6):1025–1031. doi: 10.1097/00003246-199406000-00023
- 12.Pollack MM, Ruttimann UE, Getson PR. Accurate prediction of the outcome of pediatric intensive care. A new quantitative method. N Engl J Med. 1987;316(3):134–139. doi: 10.1056/NEJM198701153160304
- 13.DeShazo JP, Hoffman MA. A comparison of a multistate inpatient EHR database to the HCUP Nationwide Inpatient Sample. BMC Health Serv Res. 2015;15:384. Published 2015 Sep 15. doi: 10.1186/s12913-015-1025-7
- 14.Bryant C, Johnson A, Henson K, et al. Apache Outcomes Across Venues Predicting Inpatient Mortality Using Electronic Medical Record Data. Crit Care Med. 2018;46:8. doi: 10.1097/01.ccm.0000528072.62631.7f
- 15.Heneghan JA, Trujillo Rivera EA, Zeng-Treitler Q, et al. Medications for Children Receiving Intensive Care: A National Sample. Pediatr Crit Care Med. 2020;21(9):e679–e685. doi: 10.1097/PCC.0000000000002391
- 16.Patel AK, Trujillo-Rivera E, Faruqe F, et al. Sedation, Analgesia, and Neuromuscular Blockade: An Assessment of Practices From 2009 to 2016 in a National Sample of 66,443 Pediatric Patients Cared for in the ICU. Pediatr Crit Care Med. 2020;21(9):e599–e609. doi: 10.1097/PCC.0000000000002351
- 17.Fung KW, Kapusnik-Uner J, Cunningham J, Higby-Baker S, Bodenreider O. Comparison of three commercial knowledge bases for detection of drug-drug interactions in clinical decision support. J Am Med Inform Assoc. 2017;24(4):806–812. doi: 10.1093/jamia/ocx010
- 18.Hardin AP, Hackell JM; Committee on Practice and Ambulatory Medicine. Age Limit of Pediatrics. Pediatrics. 2017;140(3):e20172151. doi: 10.1542/peds.2017-2151
- 19.Ma J, Lee DKK, Perkins ME, Pisani MA, Pinker E. Using the Shapes of Clinical Data Trajectories to Predict Mortality in ICUs. Crit Care Explor. 2019;1(4):e0010. Published 2019 Apr 17. doi: 10.1097/CCE.0000000000000010
- 20.Mohamadlou H, Panchavati S, Calvert J, et al. Multicenter validation of a machine-learning algorithm for 48-h all-cause mortality prediction. Health Informatics J. 2020;26(3):1912–1925. doi: 10.1177/1460458219894494
- 21.Ho LV, Aczon M, Ledbetter D, Wetzel R. Interpreting a recurrent neural network’s predictions of ICU mortality risk. J Biomed Inform. 2021;114:103672. doi: 10.1016/j.jbi.2021.103672
- 22.Meert KL, Keele L, Morrison W, et al. End-of-Life Practices Among Tertiary Care PICUs in the United States: A Multicenter Study. Pediatr Crit Care Med. 2015;16(7):e231–e238. doi: 10.1097/PCC.0000000000000520
- 23.Gu C. Smoothing Spline ANOVA Models. Springer Series in Statistics; 2002.
- 24.Gu C. Smoothing Spline ANOVA Models: R Package gss. Journal of Statistical Software. 2014;58(5):1–25. doi: 10.18637/jss.v058.i05
- 25.Barnard G. A New Test for 2×2 Tables. Nature. 1945;156:177. doi: 10.1038/156177a0
- 26.Calhoun P. Exact: Unconditional Exact Test. R package version 2.1. 2020. https://cran.r-project.org/web/packages/Exact/
- 27.Martin Andres A, Silva Mato A. Choosing the optimal unconditioned test for comparing two independent proportions. Computational Statistics & Data Analysis. 1994;17(5):555–574.
- 28.Mehrotra DV, Chan IS, Berger RL. A cautionary note on exact unconditional inference for a difference between two independent binomial proportions. Biometrics. 2003;59(2):441–450. doi: 10.1111/1541-0420.00051
- 29.Gutierrez G. Artificial Intelligence in the Intensive Care Unit. Crit Care. 2020;24(1):101. Published 2020 Mar 24. doi: 10.1186/s13054-020-2785-y
- 30.Morris AH. Human Cognitive Limitations. Broad, Consistent, Clinical Application of Physiological Principles Will Require Decision Support. Ann Am Thorac Soc. 2018;15(Suppl 1):S53–S56. doi: 10.1513/AnnalsATS.201706-449KV
- 31.Kim SY, Kim S, Cho J, et al. A deep learning model for real-time mortality prediction in critically ill children. Crit Care. 2019;23(1):279. Published 2019 Aug 14. doi: 10.1186/s13054-019-2561-z
- 32.Lee B, Kim K, Hwang H, et al. Development of a machine learning model for predicting pediatric mortality in the early stages of intensive care unit admission. Sci Rep. 2021;11(1):1263. Published 2021 Jan 13. doi: 10.1038/s41598-020-80474-z
- 33.Brajer N, Cozzi B, Gao M, et al. Prospective and External Evaluation of a Machine Learning Model to Predict In-Hospital Mortality of Adults at Time of Admission. JAMA Netw Open. 2020;3(2):e1920733. Published 2020 Feb 5. doi: 10.1001/jamanetworkopen.2019.20733
- 34.Ahmad MA, Trujillo-Rivera EA, Pollack MM, Eckert C, Patel A, Teredesai A. Machine Learning Approaches for Patient State Prediction in Pediatric ICUs. IEEE International Conference on Healthcare Informatics; 2021.