Abstract
Objectives:
To validate the conceptual framework of criticality, a new pediatric inpatient severity measure based on physiology, therapy, and therapeutic intensity, calibrated to care intensity operationalized as intensive care unit (ICU) care.
Design:
Deep neural network analysis of a pediatric cohort from the Health Facts® (Cerner Corporation, Kansas City, MO) national database.
Setting:
Hospitals with pediatric routine inpatient and ICU care.
Patients:
Children cared for in the ICU (n = 20,014) and children cared for in routine care units without an ICU admission (n = 20,130) from 2009 to 2016. All patients had laboratory, vital sign, and medication data.
Interventions:
None.
Measurements and Main Results:
A calibrated, deep neural network used physiology (laboratory tests and vital signs), therapy (medications), and therapeutic intensity (number of physiology tests and medications) to model care intensity, operationalized as ICU (versus routine) care, for every 6-hour period of a patient’s hospital course. The probability of ICU care is termed the Criticality Index. First, the model demonstrated excellent separation of criticality distributions across a severity hierarchy of 5 patient groups: routine care, routine care for those who also received ICU care, transition from routine to ICU care, ICU care, and high intensity ICU care. Second, model performance assessed with statistical metrics was excellent, with an area under the receiver operating characteristic curve (AUC) of 0.95 for 327,189 6-hour time periods, excellent calibration, sensitivity = 0.817, specificity = 0.892, accuracy = 0.866, and precision = 0.799. Third, in individual patients with >1 care designation, 88.03% (95% CI: 87.72, 88.34) of the Criticality Indices in the more intensive locations were higher than those in the less intensive locations.
Conclusion and Relevance:
The Criticality Index is a quantification of severity of illness for hospitalized children using physiology, therapy, and care intensity. This new conceptual model is applicable to clinical investigations and predicting future care needs.
Keywords: pediatrics, severity of illness, intensive care, machine learning, pediatric intensive care unit, dynamic modeling
Introduction
Severity measures are fundamental to many aspects of medicine including quality assessment and case-mix adjustments for clinical research.1,2 Quantifying severity has been central for pediatric studies exploring clinical and administrative relationships, including the relationship between physiological derangements and mortality, morbidity, quality factors, and efficiency.3,4 Physiology-based severity measures are currently used in various care locations. For example, pediatric early warning systems use bedside variables to predict adverse events in routine care areas.5 The Pediatric Risk of Mortality (PRISM) score is used in intensive care, primarily for quality assessments and severity of illness adjustment in research studies.4 The Apgar score is used in delivery rooms to assess risk in the newborn period.6 Therapy-based severity measures are also common. For example, therapeutic intensity is measured by the vasoactive inotrope score, and quantity and intensity of therapy is measured with the Therapeutic Intervention Scoring System (TISS) score.7,8 Measures of therapeutic intensity currently include assessments of staffing needs.9 These methods have not been calibrated to short-term clinical status that could influence clinical decision making or predict clinical trajectories.
We consider the concept of criticality as the combination of physiologic profiles, therapies, and therapeutic intensity. As care for a patient progresses, severity may be reflected by changes in different combinations of these measures. For example, early in a patient’s course, when therapy is being initiated, severity is primarily reflected in physiologic variables. If therapy successfully stabilizes the patient’s physiology, severity is reflected primarily in the specific therapies and therapeutic intensity. If therapy is unsuccessful or only partially successful, severity is reflected in the physiologic status, the specific therapies, and therapeutic intensity. Throughout the course of clinical illness, clinical concern is reflected in therapeutic intensity, as indicated by factors including the number of therapies, the frequency of vital signs, and the number and frequency of laboratory measurements. This framework is consistent with prominent threads of severity of illness research, including the early qualitative assessments of severity of illness that used care location as an indicator of severity,10 methods of quantifying therapeutic intensity,11,12 the definitions of multiple organ system failures that included both physiology and therapies,13–15 and physiological profiles.4,16 Characterization of severity of illness using these clinical elements is important in the prediction of short-term clinical status or clinical trajectories for individual patients.
The central aim of this investigation was the development and validation of the Criticality Index for hospitalized children to enhance future efforts to predict short term clinical status and clinical trajectories. This model is calibrated to care intensity, operationalized as ICU care, because ICU care is a relevant gauge of severity for inpatients, and mortality is uncommon in children in ICUs4 and rare in non-ICU areas. The probability of intensive care is termed the Criticality Index. We developed the Criticality Index using a calibrated, deep neural network to be applicable to consecutive time periods during hospitalization. A companion analysis assesses another important aspect of validation, the ability of the Criticality Index to reflect clinical course.17 These analyses are presented as prerequisites to dynamic models using the Criticality Index to predict a patient’s short term clinical status and clinical trajectory.
Methods
Database
The dataset was derived from the Health Facts® database (Cerner Corporation, Kansas City, MO), which collects comprehensive clinical data on patient encounters from hospitals in the United States with a Cerner data use agreement. Health Facts® is a voluntary program that has captured patient data from the electronic health records of more than 500 hospitals since 2000. Health Facts® provides episodic and longitudinal, date- and time-stamped data from affiliated patient care locations, including admission and demographic data, care-setting characteristics, laboratory results, medication data derived from pharmacy records, diagnostic and procedure codes, vital signs, respiratory data, and hospital outcome, providing a temporal relationship between treatment patterns and clinical information. Cerner Corporation has established Health Insurance Portability and Accountability Act-compliant operating policies to ensure the de-identification of Health Facts® data. Notably, not all data are available for all patients. Health Facts® has been assessed as representative of the United States18 and used in previous care assessments, including the APACHE score19 and medication assessments for children in ICUs.20
Details on preparing and using the database are in the Supplemental Digital Content 1: Appendix. Appendix A details the data cleansing and data definitions, Appendix B contains the information for the medications and medication classification, Appendix C details the data elements for laboratory variables, and Appendix D details the vital signs and respiratory variables. Medication data used for modeling were determined from pharmacy records using start and discontinuation times. Medications were assigned to Multum™ drug categories using the National Drug Code (NDC).21,22 If multiple doses of a medication or medication class occurred concomitantly, the length of administration was determined by the start and end time. Other data included age and diagnostic information. Diagnoses were categorized based on the ICD9 and ICD10 classifications.23,24 The primary diagnosis was used for descriptive purposes but not for modeling because it was determined at discharge. Positive pressure ventilation, determined from respiratory parameters, was used to indicate high-intensity ICU care, but not for modeling.
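To illustrate how medication exposure can be summarized from pharmacy start and discontinuation times, a minimal sketch in R follows; the data frame and column names (med_orders, admit_time, med_category, etc.) are hypothetical and not part of the Health Facts® schema.

```r
library(dplyr)
library(tidyr)

# Sketch only: expand each pharmacy order (start to discontinuation) into the
# hours it spans, then count hourly administrations per drug category within
# each 6-hour period measured from admission.
hourly <- med_orders %>%   # hypothetical columns: patient_id, admit_time, med_category, start_time, stop_time
  mutate(start_hr = floor(as.numeric(difftime(start_time, admit_time, units = "hours"))),
         stop_hr  = floor(as.numeric(difftime(stop_time,  admit_time, units = "hours")))) %>%
  rowwise() %>%
  mutate(hour = list(seq(start_hr, stop_hr))) %>%   # hours during which the order was active
  ungroup() %>%
  unnest(hour)

med_6h <- hourly %>%
  mutate(period = hour %/% 6) %>%                   # 6-hour period index from admission
  group_by(patient_id, period, med_category) %>%
  summarise(hourly_administrations = n(), .groups = "drop")
```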
Sample
Inclusion criteria were age <22 years,25 availability of laboratory, vital sign, and medication data, and care in routine care units or ICUs from January 2009 to June 2016. Exclusion criteria were hospital length of stay >100 days, ICU length of stay >30 days, and care in the neonatal ICU. All patients receiving ICU care were included, as was a randomly selected sample of patients receiving only routine care approximately equal in size to the ICU sample.
The hospital course was discretized into consecutive 6-hour time periods. Each 6-hour time period was categorized into the mutually exclusive groups of a criticality hierarchy in an ascending order of severity primarily based on care area: routine care patients (routine (non-ICU)), routine care patients eventually progressing to ICU care (routine (ICU)), the immediate transition time period when patients were transferring from routine to ICU care and therefore were cared for in both the routine care area and the ICU in the same time period (transition), ICU care, and ICU care with positive pressure ventilation (excluding high flow oxygenation or continuous positive airway pressure; ICU high intensity). Using the criticality hierarchy, we hypothesized the following: Patients cared for in the ICU would have high criticality and those cared for in routine care areas would have low criticality. Within the ICU, patients receiving positive pressure ventilation would have higher criticality than those without this intervention. In routine care units, patients who would later receive ICU care would have a higher criticality than those who never received ICU care. Patients transitioning from routine to intensive care would, in general, have criticality intermediate between routine and ICU care.
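A minimal sketch of this group assignment in R, assuming hypothetical per-period flags (in_icu, in_routine, on_ppv, ever_icu) that would be derived from the care-location and respiratory data:

```r
library(dplyr)

# Sketch only: map each 6-hour period to one mutually exclusive hierarchy group.
periods <- periods %>%
  mutate(group = case_when(
    in_icu & in_routine ~ "transition",          # cared for in both areas in the same period
    in_icu & on_ppv     ~ "ICU high intensity",  # positive pressure ventilation in the ICU
    in_icu              ~ "ICU",
    ever_icu            ~ "routine (ICU)",       # routine care in a patient with ICU care during the admission
    TRUE                ~ "routine (non-ICU)"
  ))
```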
Independent Variables
The variable set, definitions, and the statistics for each variable type used for modeling are shown in Table 1. The machine learning methodology required laboratory and vital sign measurements in each time period, requiring imputation for missing data. We imputed laboratory results and vital signs using the last known result because, in general, physicians use the last measured values and repeat measurements when change is anticipated, when required for clinical care, or when results are acquired routinely. If values were missing during the first six-hour time period, they were set to the median of the first six-hour time periods using 9 age groups (Table 1). These imputed values are shown in Appendix E. All are either in the normal range or have minor deviations from normal. This imputation scheme is similar to other severity scores, which assume normal values for unmeasured variables.4,15 The imputed values were identified in the modeling (below) by setting the count equal to zero. We explored the possibility that imputation induced a systematic bias by pairwise comparison of distributions of laboratory and vital signs with and without imputation.26,27 No bias was evident.
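The imputation scheme can be sketched in R as follows; labs_6h, age_group_medians, and their columns are hypothetical names standing in for the prepared laboratory and vital sign data.

```r
library(dplyr)
library(tidyr)

# Sketch only: last-observation-carried-forward within patient and variable,
# then age-group medians for values still missing in the first 6-hour period.
labs_imputed <- labs_6h %>%   # hypothetical columns: patient_id, period, variable, value, count, age_group
  group_by(patient_id, variable) %>%
  arrange(period, .by_group = TRUE) %>%
  fill(value, .direction = "down") %>%            # carry the last known result forward
  ungroup() %>%
  left_join(age_group_medians, by = c("age_group", "variable")) %>%
  mutate(value = coalesce(value, median_value)) %>%  # remaining first-period gaps -> age-group median
  select(-median_value)
# Periods without a measurement keep count = 0, which is how the model
# distinguishes imputed from measured values.
```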
Table 1.
Independent Variables.1
| Lab Variables1,3,4 | | | | Vital Signs1,3,4 | Medications2,3 | Other |
|---|---|---|---|---|---|---|
| Albumin | Bilirubin Indirect | Hemoglobin | Platelets | BP-systolic | 1113 individual medications | Age, Sex |
| ALT | Bilirubin Total | Hematocrit | Potassium | BP-diastolic | 143 medication categories (6) | |
| Arterial Lactate | BUN | INR | Protime | Heart Rate | ||
| PO2 (arterial) | Calcium | Glucose | Sodium | Respiratory Rate | ||
| AST | Calcium Ionized | PTT | Total Protein | Temperature | ||
| Base Excess | Chloride | PCO2 (5) | Venous Lactate | Coma Score | ||
| Bicarbonate | Creatinine | pH (5) | WBC | |||
| Bilirubin Direct | Fibrinogen | |||||
1. Summarized for modeling with the following statistics for each variable: the count, sample mean, sample standard deviation (0 if the count was <2), maximum, and minimum. There were a total of 934 variables used for modeling: 300 derived from the 30 laboratory variables, 60 from the 6 vital signs, and 572 from the 143 medication categories, plus the sex and age of the patient at admission.
2. Summarized for modeling with the following statistics: 1) the 6-hour sum per medication category of the number of medications given each hour; 2) the count of the previous time periods per medication category that the patient received one or more medications; 3) the proportion of the previous time periods per medication category that the patient received one or more medications.
3. Therapeutic intensity is reflected in the number of vital sign and laboratory measurements and medications.
4. If values were missing during the first six-hour time period, they were set to the median of the first six-hour time periods for the following age groups: <1 week, 1 week-<4 weeks, 4 weeks-<3 months, 3 months-<1 year, 1 year-<2 years, 2 years-<3 years, 3 years-<8 years, 8 years-<12 years, 12 years-<22 years.
5. Arterial, venous, capillary.
Machine Learning Methodology
For each patient group, we randomly chose 64% for training, 8% for validation, and 28% for testing. This distribution was chosen to maximize the test sample. Random selection was at the patient level.28–31 The training set was used for model development, the validation set was used for minor model changes to avoid model overfitting,32,33 and the test sample was used to evaluate model performance and calibration on patients not used for model development. The training and validation sets were combined for calibration.34,35 Transition time periods were not used for model development because a single care location could not be assigned to the time periods. Similarly, the transition time period was not used in the statistical evaluation of model performance when a single care location was required.
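A patient-level random split along these lines could be coded as below (sketch only; shown unstratified for brevity, whereas the study sampled each patient group separately):

```r
set.seed(1)   # arbitrary seed, for reproducibility of the sketch only

# Split at the patient level so all time periods from one patient fall in a
# single partition (64% train / 8% validation / 28% test).
ids  <- unique(periods$patient_id)
part <- sample(c("train", "validation", "test"), length(ids),
               replace = TRUE, prob = c(0.64, 0.08, 0.28))
periods <- merge(periods, data.frame(patient_id = ids, partition = part),
                 by = "patient_id")   # every 6-hour period inherits its patient's partition
```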
First, we estimated the probability of ICU care for each 6-hour time period using a deep neural network with 934 input variables. The neural network was fully connected and designed to maximize accuracy for the binary task of classifying patients as being in the ICU or not. The modeling software was the R package keras.36,37 Since deep neural networks have a tendency to be over-confident in classifications, we added an additional calibration step.38 Calibration was accomplished by using B-spline polynomials39,40 of the neural network output as covariates in a linear logistic regression with the outcome of ICU care. This calibration method is similar to the Platt scaling method for support vector machines.41
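A minimal sketch of this two-step approach is shown below; the layer sizes, optimizer, and spline degrees of freedom are illustrative assumptions rather than the published architecture.

```r
library(keras)
library(splines)

# Sketch only: fully connected network on the 934 predictors with a sigmoid
# output for ICU (1) versus routine (0) care.
model <- keras_model_sequential() %>%
  layer_dense(units = 512, activation = "relu", input_shape = 934) %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dense(units = 1,   activation = "sigmoid")

model %>% compile(optimizer = "adam", loss = "binary_crossentropy", metrics = "accuracy")
model %>% fit(x_train, y_train, epochs = 20, batch_size = 256,
              validation_data = list(x_val, y_val))

# Calibration step: B-spline basis of the raw network output as covariates in
# a logistic regression with ICU care as the outcome (Platt-like scaling).
raw_dev  <- as.numeric(model %>% predict(x_dev))   # x_dev = combined training + validation sets
calib    <- glm(icu_dev ~ bs(raw_dev, df = 5), family = binomial)
raw_test <- as.numeric(model %>% predict(x_test))
criticality_index <- predict(calib, newdata = data.frame(raw_dev = raw_test),
                             type = "response")
```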
Model performance for correctly classifying care as routine or ICU was assessed on the non-transition time periods of the test patients. Performance metrics included the area under the receiver operating characteristic curve (AUC), confusion matrix derivations42 at a classification cut point of 0.5 and over the full spectrum of classification cut points, and preservation of the criticality hierarchy in individual patients. Model calibration over the full range of the Criticality Index was evaluated using the differences between the observed and expected proportions of ICU outcomes for 2000 ascending Criticality Index intervals in the test sample.35,41 All Criticality Index intervals used for calibration had a Criticality Index range of less than 0.0025, and >99% of the intervals had at least 106 6-hour time periods. We assessed the criticality hierarchy in individual patients by comparing the pairs of Criticality Indices from the different care locations.
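These metrics and the calibration check could be computed along the following lines (sketch only; icu_test is a hypothetical 0/1 vector of observed care locations for the test time periods, and the calibration bins here are equal-count approximations of the ascending intervals described above):

```r
library(pROC)

# Confusion-matrix derivations at a 0.5 cut point, plus the AUC.
pred <- as.integer(criticality_index >= 0.5)
tp <- sum(pred == 1 & icu_test == 1); fp <- sum(pred == 1 & icu_test == 0)
tn <- sum(pred == 0 & icu_test == 0); fn <- sum(pred == 0 & icu_test == 1)
c(accuracy    = (tp + tn) / length(pred),
  sensitivity = tp / (tp + fn),
  specificity = tn / (tn + fp),
  precision   = tp / (tp + fp))
auc(roc(icu_test, criticality_index))

# Calibration check: observed versus expected ICU proportions over ~2000
# ascending Criticality Index intervals.
calib_df <- data.frame(bin      = cut(rank(criticality_index, ties.method = "first"), breaks = 2000),
                       observed = icu_test,
                       expected = criticality_index)
obs_vs_exp <- aggregate(cbind(observed, expected) ~ bin, data = calib_df, FUN = mean)
```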
Results are presented for the 28% test sample not used for model development or calibration. Criticality Index distributions were described with the median, 25th and 75th percentiles, mean, and the 95% confidence intervals (CI) for the mean and median. We displayed the distribution of patients in the transition time period using violin plots. We computed Pearson chi-square tests for differences of proportions and the Mann–Whitney U test for stochastic differences for comparisons between the ICU and non-ICU patients.
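For example (sketch only, with a hypothetical patient-level data frame pts containing group, sex, and hospital_hours columns):

```r
# Pearson chi-square test for a difference in proportions (e.g., sex) and
# Mann-Whitney U test for a stochastic difference (e.g., hospital length of
# stay) between ICU and routine care patients.
chisq.test(table(pts$group, pts$sex))
wilcox.test(hospital_hours ~ group, data = pts)
```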
Results
There were 20,014 patients with an ICU stay, and 20,130 randomly selected patients cared for in routine care areas only (Table 2). Compared to patients with ICU stays, routine care patients were older (132.2 vs. 28.0 months, p<0.0001), had shorter hospital stays (90 hours vs. 129 hours, p<0.0001), and had a lower mortality rate (0.13% vs. 3.15%, p<0.0001) (Table 2). Diagnostic distributions also differed between the two patient groups (p<0.0001), with the exceptions of respiratory diseases, infectious and parasitic diseases, and nervous system diseases. The number of 6-hour time periods per patient also differed (p<0.0001). Supplemental Digital Content 2: Supplemental Table 1 displays the distributions of the training, validation, and testing time periods and patient numbers.
Table 2.
Population Characteristics.
| Characteristics | All Patients (N = 40,144) | ICU Patients (N = 20,014) | Routine Care Patients (N = 20,130) | p-value (ICU vs Routine) |
|---|---|---|---|---|
| Age (months) (median) | 96 | 28 | 132 | <0.0001 |
| Age Groups (N (%)) | <0.0001 | |||
| < 2 years | 11823 (29.5%) | 9772 (48.8%) | 2051 (10.2%) | |
| 2 years - <6 years | 6038 (15.0%) | 1641 (8.2%) | 4397 (21.8%) | |
| 6 years - <13 years | 7106 (17.7%) | 2172 (10.9%) | 4934 (24.5%) | |
| 13 years - <22 years | 15177 (37.8%) | 6429 (32.1%) | 8748 (43.5%) | |
| Gender (N (% Male)) | 20545 (51.2%) | 11101 (55.5%) | 9444 (46.9%) | <0.0001 |
| Race (N (%)) | <0.0001 | |||
| African American | 9354 (23.3%) | 5256 (26.3%) | 4098 (20.4%) | |
| Caucasian | 19048 (47.5%) | 10335 (51.6%) | 8713 (43.3%) | |
| Other/Unknown | 11742 (29.5%) | 4423 (22.1%) | 7319 (36.4%) | |
| Hospital Length of Stay (hours) (50th (25th, 75th)) | 108 (60, 210) | 129 (66, 276) | 90 (60, 162) | <0.0001 |
| ICU Length of Stay (hours) (50th (25th, 75th)) | 0 (0, 74) | 75 (33, 169) | 0 (0, 0) | <0.0001 |
| 6-hour Periods/Pt (N) (50th (25th, 75th)) | 16 (9, 32) | 20 (10, 41) | 14 (9, 24) | <0.0001 |
| Hospital Mortality (N (%)) | 657 (1.6%) | 631 (3.2%) | 26 (0.1%) | <0.0001 |
| Positive Pressure Support (n (%)) (1) | 6131 (15.3%) | 5313 (26.6%) | 818 (4.1%) | <0.0001 |
| Diagnostic Categories | ||||
| Respiratory system | 3614 (13.8%) | 1563 (14.3%) | 2051 (13.5%) | 0.0590 |
| Endocrine, nutritional, metabolic and immunity disorders | 3158 (12.1%) | 1624 (14.8%) | 1534 (10.1%) | <0.0001 |
| Not otherwise specified | 2731 (10.4%) | 1735 (15.9%) | 996 (6.5%) | <0.0001 |
| Diseases of the digestive system | 2556 (9.8%) | 503 (4.6%) | 2053 (13.5%) | <0.0001 |
| Infectious and parasitic diseases | 2377 (9.1%) | 962 (8.8%) | 1415 (9.3%) | 0.1760 |
| Injury and poisoning | 2261 (8.6%) | 1390 (12.7%) | 871 (5.7%) | <0.0001 |
| Nervous system | 1856 (7.1%) | 788 (7.2%) | 1068 (7.0%) | 0.5666 |
| Blood forming organs | 1758 (6.7%) | 324 (3.0%) | 1434 (9.4%) | <0.0001 |
| Neoplasms | 1636 (6.3%) | 231 (2.1%) | 1405 (9.2%) | <0.0001 |
| Circulatory system | 1196 (4.6%) | 735 (6.7%) | 461 (3.0%) | <0.0001 |
| Genitourinary system | 973 (3.7%) | 340 (3.1%) | 633 (4.2%) | <0.0001 |
| Mental disorders | 863 (3.3%) | 251 (2.3%) | 612 (4.0%) | <0.0001 |
| Musculoskeletal system and connective tissue | 766 (2.9%) | 245 (2.2%) | 521 (3.2%) | <0.0001 |
| Skin and subcutaneous tissue | 436 (1.7%) | 252 (2.3%) | 184 (1.2%) | <0.0001 |
Abbreviations: IQR = Interquartile range; Pt = patient.
1. Criteria for positive pressure ventilation were continuous positive airway pressure, positive end-expiratory pressure, or peak inspiratory pressure.
The criticality hierarchy was demonstrated with increasing Criticality Index values (median (25th–75th percentiles)) from routine care (non-ICU) (0.00 (0.00, 0.03)), routine care (ICU) (0.22 (0.05, 0.58)), transition from routine to ICU (0.61 (0.27, 0.81)), ICU (0.83 (0.58, 0.95)), and ICU high intensity (0.97 (0.90, 0.99)) (Figure 1). The data were similar for the training and validation samples (Supplemental Digital Content 3: Supplemental Table 2). The confusion matrix results for correctly classifying patients as routine or ICU are shown in Table 3. Important derivations include an accuracy of 0.866, sensitivity of 0.817, specificity of 0.892, and precision of 0.799. The results of the confusion matrix derivations were relatively stable across a wide range of cut points (Supplemental Digital Content 4: Supplemental Figure 1). The AUC for the ROC (Supplemental Digital Content 5: Supplemental Figure 2) was 0.95 (95% CI: 0.95, 0.95) for 327,189 6-hour time periods. The calibration of the model for routine and ICU care classification was evaluated by comparing the observed and expected proportions of ICU time periods within Criticality Index intervals (Figure 2). Most time periods occur at the high and low ends of the criticality spectrum, where the model performs exceptionally well. There is a small tendency to underestimate the Criticality Index in the lower mid-range of criticality intervals.
Figure 1.
Criticality Index for Survivors, Deaths, and All Patients for Each of the Patient Categories in the Test Sample. Note that the horizontal axis uses the logit function. The diamond and bars indicate the mean and 95 percent confidence interval. The boxes indicate the median and the 25th to 75th percentiles. Routine indicates patients in the routine care units, some of whom were later admitted to the ICU (ICU Patient) and some of whom were not (Non ICU Patient). ICU indicates patients in the ICU, and ICU High Intensity indicates patients receiving positive pressure ventilation. Transition (Trans.) indicates the 6-hour time period when patients transitioned from Routine to ICU and were cared for in both areas during that time period.
Table 3:
Confusion Matrix Derivations with Classification Cut Point 0.5 on the Test Sample. Classification as routine care versus ICU care is assessed.
| Derivation | Value |
|---|---|
| Accuracy | 0.866 |
| F1 score | 0.808 |
| False discovery rate | 0.201 |
| False negative rate | 0.183 |
| False omission rate | 0.097 |
| False positive rate | 0.108 |
| Informedness | 0.709 |
| Matthews correlation coefficient (MCC) | 0.706 |
| Neg. pred. value | 0.903 |
| Precision (Positive Predictive Value) | 0.799 |
| Sensitivity (Recall, True Positive Rate) | 0.817 |
| Specificity (True Negative Rate) | 0.892 |
Figure 2.
Model calibration. The y-axis shows the expected proportion of ICU admission based on the Criticality Index and the x-axis shows the observed proportion. The line of identity is the dashed line. The circles indicate the observed and expected proportions of ICU 6-hour time periods over 2000 ascending Criticality Index intervals. More than 99% of the intervals had at least 106 time periods. All Criticality Index intervals spanned a range of less than 0.0025.
The performance of the Criticality Index in patients with multiple care locations (Table 4) confirms the results in Figure 1 and Supplemental Digital Content 3: Supplemental Table 2 and strongly preserves the criticality hierarchy. The Criticality Index values within an individual patient’s course fit into the criticality hierarchy. For example, for patients with both routine care and ICU high intensity care time periods, 94.4% of the ICU high intensity Criticality Index values were higher than the routine care indices. Other pairwise comparisons were similar. Overall, 88.03% (95% CI: 87.72, 88.34) of the Criticality Indices in the more intensive locations were higher than those in the less intensive locations.
Table 4:
Pairwise Comparisons of Criticality Indices Within Patients with Multiple Care Locations. Data are the proportions of pairwise comparisons for which the criticality hierarchy is preserved.
| Comparison Care Location | Routine (ICU Patient)1 | Transition Routine to ICU1 | ICU1 | ICU High Intensity1 |
|---|---|---|---|---|
| Routine (ICU Patient) | - | 73.1, 89.7 (50.0, 100.0) | 84.7, 95.0 (77.9, 100.0) | 94.4, 100.0 (99.6, 100.0) |
| Transition Routine to ICU | 26.9, 10.3 (0.0, 50.0) | - | 53.7, 53.5 (14.8, 100.0) | 75.1, 100.0 (50.0, 100.0) |
| ICU | 15.3, 5.0 (0.0, 22.1) | 46.3, 46.5 (0.0, 85.2) | - | 82.0, 91.2 (75.0, 99.5) |
| ICU High Intensity | 5.6, 0.0 (0.0, 0.4) | 24.9, 0.0 (0.0, 50.0) | 18.0, 8.8 (0.5, 25.0) | - |
1. Data are the average and median (25th and 75th percentiles) of the proportions of pairwise comparisons for which the criticality hierarchy is preserved.
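A sketch of the within-patient pairwise comparisons underlying Table 4, with hypothetical object and column names:

```r
library(dplyr)

# Sketch only: for each patient, pair every Criticality Index from a more
# intensive care location with every index from a less intensive location and
# report the proportion of pairs in which the hierarchy is preserved.
rank_map <- c("routine (ICU)" = 1, "transition" = 2, "ICU" = 3, "ICU high intensity" = 4)

scored <- periods %>%
  filter(group %in% names(rank_map)) %>%
  mutate(rank = rank_map[group]) %>%
  select(patient_id, rank, criticality)

pairs <- inner_join(scored, scored, by = "patient_id", suffix = c("_lo", "_hi")) %>%
  filter(rank_hi > rank_lo)

mean(pairs$criticality_hi > pairs$criticality_lo)   # proportion preserving the hierarchy
```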
We investigated the distribution of the Criticality Index in the transition time periods (Figure 1). For transitions from routine care to ICU high intensity, the distribution was skewed toward higher Criticality Index values, consistent with the need for active therapies. In contrast, for transitions from routine care to ICU, the Criticality Index was relatively constant over a wide range, consistent with a mix of patients needing monitoring services as well as active therapies (Supplemental Digital Content 6: Supplemental Figure 3). This is consistent with the pairwise comparisons in individual patients (Table 4), where the transition time period was, in general, an intermediate level of criticality. Transition Criticality Index values were higher than routine care values for 73.1% of the pairwise comparisons. Similarly, 53.7% and 75.1% of ICU and ICU high intensity Criticality Indices, respectively, were higher than the transition indices. Violin plots for the transition time period categorized by diagnoses and procedures (Supplemental Digital Content 7: Supplemental Figure 4; Supplemental Digital Content 8: Supplemental Figure 5) did not indicate specific diagnoses or procedures that dominated these patterns.
Discussion
This analysis focused on developing and validating the conceptual foundation of criticality, our severity model that uses care intensity as the outcome and data elements from physiology, therapy, and therapeutic intensity as the predictor variables. Criticality may be an important aspect of predicting short-term clinical needs, clinical trajectories, and dynamic assessment of severity. A high Criticality Index indicates high physiological dysfunction and/or therapeutic intensity while a low Criticality Index indicates physiological stability and low therapeutic intensity. The conceptual foundation of the Criticality Index is based on historical milestones in critical care severity of illness research, including the importance of care locations, care intensity, therapies, and physiological profiles.4,10–15 Our data suggest this severity framework could be useful in dynamic patient-level predictors.43
We evaluated the criticality model from four perspectives. First, we evaluated the criticality hierarchy, where the Criticality Index demonstrated an ascending order of criticality from routine care to high intensity ICU care with large separations in the Criticality Index between the groups. In general, all patient groups except the transition group had low-variance, unimodal criticality distributions. The variance in the transition group is consistent with the clinical experience that some patients are transferred to the ICU for life support while others are admitted as part of routine protocols (e.g., post-operative care) or for monitoring, and with inter-institutional variability in care protocols. Second, we assessed model performance with statistical metrics for classification of care areas as routine or ICU. Discrimination was excellent, with an AUC = 0.95. Other statistical assessments (e.g., accuracy and precision) were also very good. Third, the calibration of the Criticality Index was excellent across 2000 risk intervals. Fourth, the criticality hierarchy was preserved in individual patients, in whom the Criticality Index distributions were similar to the population data and consistent with the AUC. A companion study demonstrates that the Criticality Index is also excellent at tracking the expected clinical course of patient groups.17 There are several factors that favored accurate classification in this study. First, many of the time periods are easy to classify, such as the routine care time periods for patients never receiving ICU care. Second, most severity models predict a future state (i.e., mortality), while this model classifies the current state (care intensity operationalized as care location). These analyses are presented as prerequisites to dynamic models using the Criticality Index that anticipate a patient’s clinical trajectory. We expect the performance indicators for predicting future criticality states to be aligned with other high-performing models.
This study has several limitations. First, while Health Facts® is an excellent dataset to develop and test the conceptual framework, the current database did not allow utilization of the full spectrum of data elements in the bedside electronic health record (EHR) or enterprise data warehouse. Second, while we did confirm the conceptual validity and clinical potential for criticality, final models should be tested using data from a production EHR. Third, we did not explore the full range of potential machine learning models because our main focus was on supporting the conceptual framework. Improved performance might be possible with different machine learning methods. Fourth, we used 6-hour time periods in anticipation of predicting clinical trajectories; other time periods could have been used. Fifth, we did not include diagnostic information because only discharge data were available in this data set. Diagnostic categorization could significantly improve model performance. Computing risk using quantified physiological profiles is improved when diagnostic information is included.4,44,45 Lastly, the outcome of care intensity, operationalized as ICU care, is variable both between institutions, where admission and discharge criteria vary, and within institutions, where criteria vary depending on resource strain such as census, bed availability, and clinical needs.46
We used machine learning for model development for several reasons. First, as large data sets become more available, machine learning is particularly adept at handling the large number of data elements available in the EHR.34,47,48 Additionally, machine learning may improve predictive performance over more traditional biostatistical techniques in some situations.49 In this analysis, we used a feed-forward, fully connected deep neural network.50 As a consequence, we did not explicitly use the Criticality Index from previous time periods or the order of the time periods within patients. Other machine learning methods may hold promise for future applications.50,51 In particular, machine learning has the ability to account for non-linearities and complex, multidimensional interactions without explicit pre-specification, making it an attractive method for dynamic predictions.32,52 Although machine learning methods have these advantages, deep neural network models are not transparent. The clinical importance of individual or sets of variables is difficult to ascertain.
Conclusion
Criticality is a quantification of severity of illness for pediatric inpatients using physiology, therapy and therapeutic intensity that estimates care intensity operationalized as ICU care during 6-hour time intervals. This new method of quantifying severity of illness has direct applicability to predicting short-term clinical status, clinical trajectories and dynamic assessments of care needs.
Supplementary Material
Supplemental Digital Content 4. Supplemental Figure 1. Confusion Matrix Derivations with Different Classification (Routine Care versus ICU Care) Decision Values.
Supplemental Digital Content 5. Supplemental Figure 2. Area Under the Receiver Operating Characteristic Curve for the Test Sample Classifying Care as Routine or ICU.
Supplemental Digital Content 6. Supplemental Figure 3. Violin Plot of the 6-hour Time Periods of Patients Transitioning from Routine to ICU Care or High Intensity ICU Care During the Same Time Period. The violin plot displays the relative sample size of the sample for different Criticality Indices. ICU Hi indicates high intensity ICU care defined by positive pressure ventilation.
Supplemental Digital Content 7. Supplemental Figure 4. Violin Plot of the 6-hour Time Periods of Patients Transitioning from Routine to ICU Care or High Intensity ICU Care During the Same Time Period for the Diagnostic Categories. The violin plot displays the relative sample size of the sample for different Criticality Indices. ICU Hi indicates high intensity ICU care defined by positive pressure ventilation.
Supplemental Digital Content 8. Supplemental Figure 5. Violin Plot of the 6-hour Time Periods of Patients Transitioning from Routine to ICU Care or High Intensity ICU Care During the Same Time Period for the Procedure Categories. The violin plot displays the relative sample size of the sample for different Criticality Indices. ICU Hi indicates high intensity ICU care defined by positive pressure ventilation.
Acknowledgments
Research Support
Supported by philanthropic support from Mallinckrodt LLC, and award numbers UL1TR001876 from the NIH National Center for Advancing Translational Sciences, and KL2TR001877 from the NIH National Center for Advancing Translational Sciences (Anita Patel). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the National Center for Advancing Translational Sciences or the National Institutes of Health.
Copyright form disclosure: Drs. Rivera, Patel, Chamberlain, Morizono, Kim, Bost, and Pollack’s institutions received funding from Mallinckrodt LLC. Drs. Patel, Workman, Morizono and Pollack received support for article research from the National Institutes of Health (NIH). Dr. Patel’s institution received funding from Awards Ul1TR001876 and KL2TR001877 from the NIH, National Center for Advancing Translational Sciences (NCATS). Drs. Workman and Pollack’s institution received funding from the NIH. Dr. Workman received funding from IEEE. Dr. Morizono’s institution received funding from the NIH NCATS; he received funding from Cogthera LLC; and he received support for article research from Mallinckrodt. The remaining authors have disclosed that they do not have any potential conflicts of interest.
Footnotes
Article tweet: A new scoring system for hospitalized children including those in the ICU is applicable to dynamic assessments of clinical status.
No reprints
Contributor Information
Eduardo A. Trujillo Rivera, George Washington University School of Medicine and Health Sciences.
Anita K. Patel, Department of Pediatrics, Division of Critical Care Medicine, Children’s National Hospital and George Washington University School of Medicine and Health Sciences.
James M. Chamberlain, Department of Pediatrics, Division of Emergency Medicine, Children’s National Hospital and George Washington University School of Medicine and Health Sciences.
T. Elizabeth Workman, George Washington University School of Medicine and Health Sciences.
Julia A. Heneghan, Department of Pediatrics, Division of Critical Care Medicine, Children’s National Hospital and George Washington University School of Medicine and Health Sciences. Current affiliation: Department of Pediatrics, Division of Critical Care Medicine, University of Minnesota Masonic Children’s Hospital.
Douglas Redd, George Washington University School of Medicine and Health Sciences.
Hiroki Morizono, Children’s National Research Institute, Associate Research Professor of Genomics and Precision Medicine, George Washington University School of Medicine and Health Sciences.
Dongkyu Kim, Children’s National Hospital and George Washington University School of Medicine and Health Sciences.
James E. Bost, Children’s National Hospital and George Washington University School of Medicine and Health Sciences.
Murray M. Pollack, Department of Pediatrics, Division of Critical Care Medicine, Children’s National Hospital and George Washington University School of Medicine and Health Sciences.
References
- 1. Gross PA, Beyt BE Jr, Decker MD, et al. Description of case-mix adjusters by the Severity of Illness Working Group of the Society of Hospital Epidemiologists of America (SHEA). Infect Control Hosp Epidemiol. 1988;9(7):309–316.
- 2. Bennett CE, Wright RS, Jentzer J, et al. Severity of illness assessment with application of the APACHE IV predicted mortality and outcome trends analysis in an academic cardiac intensive care unit. Journal of Critical Care. 2019;50:242–246.
- 3. Pollack MM, Cuerdon TT, Patel KM, Ruttimann UE, Getson PR, Levetown M. Impact of quality-of-care factors on pediatric intensive care unit mortality. JAMA. 1994;272(12):941–946.
- 4. Pollack MM, Holubkov R, Funai T, et al. Simultaneous Prediction of New Morbidity, Mortality, and Survival Without New Morbidity From Pediatric Intensive Care: A New Paradigm for Outcomes Assessment. Crit Care Med. 2015;43(8):1699–1709.
- 5. Parshuram CS, Dryden-Palmer K, Farrell C, et al. Effect of a Pediatric Early Warning System on All-Cause Mortality in Hospitalized Pediatric Patients: The EPOCH Randomized Clinical Trial. JAMA. 2018;319(10):1002–1012.
- 6. Casey BM, McIntire DD, Leveno KJ. The continuing value of the Apgar score for the assessment of newborn infants. The New England Journal of Medicine. 2001;344(7):467–471.
- 7. Musick MA, Loftis LL, Kennedy CE. Comparing Vasoactive-Inotropic Score Reporting Strategies in the PICU Relative to Mortality Risk. Pediatr Crit Care Med. 2018;19(12):1130–1136.
- 8. Trope R, Vaz S, Zinger M, Sagy M. An Updated Therapeutic Intervention Scoring System for Critically Ill Children Enables Nursing Workload Assessment With Insight Into Potential Untoward Events. Journal of Intensive Care Medicine. 2015;30(6):344–350.
- 9. Nieri AS, Manousaki K, Kalafati M, et al. Validation of the nursing workload scoring systems “Nursing Activities Score” (NAS), and “Therapeutic Intervention Scoring System for Critically Ill Children” (TISS-C) in a Greek Paediatric Intensive Care Unit. Intensive & Critical Care Nursing. 2018;48:3–9.
- 10. Cullen DJ, Civetta JM, Briggs BA, Ferrara LC. Therapeutic intervention scoring system: a method for quantitative comparison of patient care. Crit Care Med. 1974;2(2):57–60.
- 11. Keene AR, Cullen DJ. Therapeutic Intervention Scoring System: update 1983. Crit Care Med. 1983;11(1):1–3.
- 12. Yeh TS, Pollack MM, Holbrook PR, Fields AI, Ruttiman U. Assessment of pediatric intensive care--application of the Therapeutic Intervention Scoring System. Crit Care Med. 1982;10(8):497–500.
- 13. Proulx F, Joyal JS, Mariscalco MM, Leteurtre S, Leclerc F, Lacroix J. The pediatric multiple organ dysfunction syndrome. Pediatr Crit Care Med. 2009;10(1):12–22.
- 14. Proulx F, Gauthier M, Nadeau D, Lacroix J, Farrell CA. Timing and predictors of death in pediatric patients with multiple organ system failure. Crit Care Med. 1994;22(6):1025–1031.
- 15. Leteurtre S, Martinot A, Duhamel A, et al. Validation of the paediatric logistic organ dysfunction (PELOD) score: prospective, observational, multicentre study. Lancet (London, England). 2003;362(9379):192–197.
- 16. Ko M, Shim M, Lee SM, Kim Y, Yoon S. Performance of APACHE IV in Medical Intensive Care Unit Patients: Comparisons with APACHE II, SAPS 3, and MPM0 III. Acute and Critical Care. 2018;33(4):216–221.
- 17. Rivera EAT, Patel AK, Zeng-Treitler Q, et al. Severity Trajectories of Pediatric Inpatients Using the Criticality Index. Pediatr Crit Care Med. 2020; submitted.
- 18. DeShazo JP, Hoffman MA. A comparison of a multistate inpatient EHR database to the HCUP Nationwide Inpatient Sample. BMC Health Services Research. 2015;15:384.
- 19. Bryant C, Johnson A, Henson K, Freeseman-Freeman, Stark M, Higgins T. APACHE Outcomes Across Venues: Predicting Inpatient Mortality Using Electronic Medical Record Data. Critical Care Medicine. 2018;46:8.
- 20. Patel AK, Trujillo-Rivera E, Faruqe F, et al. Sedation, Analgesia, and Neuromuscular Blockade: An Assessment of Practices From 2009 to 2016 in a National Sample of 66,443 Pediatric Patients Cared for in the ICU. Pediatr Crit Care Med. 2020.
- 21. Fung KW, Kapusnik-Uner J, Cunningham J, Higby-Baker S, Bodenreider O. Comparison of three commercial knowledge bases for detection of drug-drug interactions in clinical decision support. J Am Med Inform Assoc. 2017;24(4):806–812.
- 22. Stamm C, Kreutzer C, Zurakowski D, et al. Forty-one years of surgical experience with congenital supravalvular aortic stenosis. Journal of Thoracic & Cardiovascular Surgery. 1999;118(5):874–885.
- 23. Centers for Disease Control and Prevention. ICD-9-CM official guidelines for coding and reporting. Atlanta, GA: Centers for Medicare & Medicaid Services; 2011.
- 24. Centers for Medicare and Medicaid Services. ICD-10-CM official guidelines for coding and reporting FY 2018. 2018.
- 25. Hardin AP, Hackell JM, Committee on Practice and Ambulatory Medicine. Age Limit of Pediatrics. Pediatrics. 2017;140(3).
- 26. Kowarik A, Templ M. Imputation with the R Package VIM. Journal of Statistical Software. 2016;74(7):1–16.
- 27. Zhang Z. Missing data exploration: highlighting graphical presentation of missing pattern. Ann Transl Med. 2015;3(22):356.
- 28. Tandon R, Adak S, Kaye JA. Neural networks for longitudinal studies in Alzheimer’s disease. Artificial Intelligence in Medicine. 2006;36(3):245–255.
- 29. Zhao J, Feng Q, Wu P, et al. Learning from longitudinal data in electronic health record and genetic data to improve cardiovascular event prediction. Scientific Reports. 2019;9(1):1–10.
- 30. Klein Klouwenberg PMC, Spitoni C, van der Poll T, Bonten MJ, Cremer OL, consortium M. Predicting the clinical trajectory in critically ill patients with sepsis: a cohort study. Critical Care (London, England). 2019;23(1):408.
- 31. Duan H, Sun Z, Dong W, He K, Huang Z. On Clinical Event Prediction in Patient Treatment Trajectory Using Longitudinal Electronic Health Records. IEEE Journal of Biomedical and Health Informatics. 2019.
- 32. Rajkomar A, Dean J, Kohane I. Machine Learning in Medicine. New England Journal of Medicine. 2019;380(14):1347–1358.
- 33. Avati A, Jung K, Harman S, Downing L, Ng A, Shah NH. Improving palliative care with deep learning. BMC Medical Informatics and Decision Making. 2018;18(4):122.
- 34. Shah ND, Steyerberg EW, Kent DM. Big Data and Predictive Analytics: Recalibrating Expectations. JAMA. 2018;320(1):27–28.
- 35. Mozafari AS, Gomes HS, Leão W, Janny S, Gagné C. Attended Temperature Scaling: A Practical Approach for Calibrating Deep Neural Networks. arXiv preprint arXiv:1810.11586. 2018.
- 36. Allaire JJ, Chollet F. keras: R Interface to ‘Keras’. R package version 2.2.5.0. 2019. https://CRAN.R-project.org/package=keras. Accessed March 12, 2020.
- 37. Chollet F, Allaire JJ. Deep Learning with R. Manning Publications Co; 2018.
- 38. Tchircoff A. The mostly complete chart of Neural Networks, explained. Towards Data Science. 2017:1–29.
- 39. De Boor C. On calculating with B-splines. Journal of Approximation Theory. 1972;6(1):50–62.
- 40. Eilers PH, Marx BD. Flexible smoothing with B-splines and penalties. Statistical Science. 1996:89–102.
- 41. Platt J. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers. 1999;10(3):61–74.
- 42. Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Information Processing & Management. 2009;45(4):427–437.
- 43. Trujillo-Rivera EA, Patel AK, Heneghan JA, Zeng Q, Bost JE, Morizono H, Kim D, Chamberlain JC, Pollack MM. Predicting Severity of Illness in Pediatric and PICU Inpatients with Medications, Laboratory Tests, Vital Signs and Neural Networks. Pediatric Academic Societies; 2019; Baltimore, MD.
- 44. Kramer AA, Higgins TL, Zimmerman JE. Comparing observed and predicted mortality among ICUs using different prognostic systems: why do performance assessments differ? Crit Care Med. 2015;43(2):261–269.
- 45. Chamberlain JM, Patel KM, Pollack MM. The Pediatric Risk of Hospital Admission score: a second-generation severity-of-illness score for pediatric emergency patients. Pediatrics. 2005;115(2):388–395.
- 46. Halpern SD. ICU capacity strain and the quality and allocation of critical care. Current Opinion in Critical Care. 2011;17(6):648–657.
- 47. Miller DD, Brown EW. Artificial Intelligence in Medical Practice: The Question to the Answer? The American Journal of Medicine. 2018;131(2):129–133.
- 48. Seymour CW, Kennedy JN, Wang S, et al. Derivation, Validation, and Potential Treatment Implications of Novel Clinical Phenotypes for Sepsis. JAMA. 2019;321(20):2003–2017.
- 49. Han D, Kolli KK, Gransar H, et al. Machine learning based risk prediction model for asymptomatic individuals who underwent coronary artery calcium score: Comparison with traditional risk prediction approaches. J Cardiovasc Comput Tomogr. 2019.
- 50. Van Veen F. The neural network zoo. The Asimov Institute. 2016.
- 51. Che Z, Purushotham S, Cho K, Sontag D, Liu Y. Recurrent neural networks for multivariate time series with missing values. Scientific Reports. 2018;8(1):6085.
- 52. Chen JH, Asch SM. Machine Learning and Prediction in Medicine - Beyond the Peak of Inflated Expectations. N Engl J Med. 2017;376(26):2507–2509.