ABSTRACT
Objective/Introduction: Sequential vital-sign information and trends in vital signs are useful for predicting changes in patient state. This study aims to predict latent shock by observing sequential changes in patient vital signs. Methods: The dataset for this retrospective study contained a total of 93,194 emergency department (ED) visits from January 1, 2016, to December 31, 2020, together with Medical Information Mart for Intensive Care (MIMIC)-IV-ED data. We further divided the data into training and validation datasets by random sampling without replacement at a 7:3 ratio. We carried out external validation with MIMIC-IV-ED. Our prediction models included logistic regression (LR), a random forest (RF) classifier, a multilayer perceptron (MLP), and a recurrent neural network (RNN). To analyze the model performance, we used the area under the receiver operating characteristic curve (AUROC). Results: Data of 89,250 visits of patients who met prespecified criteria were used to develop a latent-shock prediction model. Data of 142,250 patient visits from MIMIC-IV-ED satisfying the same inclusion criteria were used for external validation of the prediction model. The AUROC values of prediction for latent shock were 0.822, 0.841, 0.852, and 0.830 with RNN, MLP, RF, and LR methods, respectively, at 3 h before latent shock. These values were higher than those of the shock index or adjusted shock index. Conclusion: We developed a latent shock prediction model based on 24 h of time-varying vital-sign sequences that produces predictions for individual patients.
KEYWORDS: Shock, clinical decision support system, emergency department, artificial intelligence
INTRODUCTION
Shock is a physiologic state that progresses continuously unless treated. Without proper management, patients in a shock state eventually progress to end-stage organ dysfunction (1,2). If shock continues without proper management, the patient will die. However, patients in the compensation stage of shock are difficult to detect in the emergency department (ED) based on a single measurement of vital signs at triage. Clinicians use several scoring tools and triage systems to detect patients in the early stage of shock to prevent progression (3–5).
Considerable effort has been devoted to classifying the causes of shock. Scoring tools such as the Modified Early Warning Score (MEWS), shock index, and quick Sequential Organ Failure Assessment (qSOFA) or Sequential Organ Failure Assessment (SOFA) are widely used in EDs (6–8). In addition, machine learning–based shock-prediction models are being studied (9–12), and vital-sign information can be obtained in real time. A previous study reported the use of serial vital signs in predicting patient status in intensive care units (ICUs) or wards (13). However, few studies have explored the use of sequential vital-sign information in EDs (14).
The early phase of shock is difficult to define clinically or to predict because, as a compensatory-response state, signs of organ failure can be absent and difficult to detect, based on laboratory results (15). Scoring and shock predictions have been based on information collected at one specific time, hindering detection of the early phase (11). To predict or detect shock at the early stage, not only the values of vital signs, but also their trends are important (1,13,16–20).
This study aims to predict progress to latent shock among patients in the ED by observing sequential vital-sign changes.
METHODS
This study was approved by the Institutional Review Board (IRB) of Samsung Medical Center. The need for informed consent was waived due to the retrospective, observational, and anonymous nature of the study (IRB no. IRB 2022-04-063-001).
Study setting
This retrospective study was carried out in an ED of a tertiary teaching hospital in a metropolitan city. The hospital holds 1,975 beds, and its ED contains 69 beds and treats an average of 75,000 to 80,000 patients each year.
Study population
Patients who visited the ED between January 1, 2016, and December 31, 2020, were included in our study. Patients who were younger than 18 years, were dead on arrival (DOA), were in cardiac arrest, left without being seen, or had no available initial Korean Triage and Acuity Scale (KTAS) score were excluded from the population. The KTAS is a five-level rating of disease/injury severity used in South Korea. Patients with fewer than three vital-sign recordings within the 24 h before the measured outcome were excluded. Patients who were hypotensive or on inotropic medication at the time of ED admission were also excluded, as they were already on high alert and clinically diagnosed by the ED clinician as being in shock.
We used the MIMIC-IV-ED database, a publicly available database sourced from the electronic health record of the Beth Israel Deaconess Medical Center, for external validation. This dataset is freely available and contains ED visit information about triage, medication, discharge, and diagnosis (21,22).
Outcome
The primary outcome measure of this study is latent shock. A latent shock patient is defined as one whose initial vital signs were normal but was later prescribed inotropic agents (dobutamine, dopamine, epinephrine, norepinephrine, or vasopressin) or who had two consecutive recordings of mean blood pressure (MBP) less than 65 mm Hg during the ED visit. The other patients were defined as non–latent shock patients.
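The outcome definition above can be sketched as a small labeling function. This is an illustrative sketch rather than the study code: the names `first_sustained_hypotension`, `is_latent_shock`, `mbp_series`, and `inotrope_given` are hypothetical, MBP readings are assumed to be in chronological order, and treating a missing reading as breaking a run of consecutive low values is an assumption.

```python
from typing import Optional, Sequence

MBP_THRESHOLD = 65.0  # mm Hg, threshold taken from the outcome definition


def first_sustained_hypotension(mbp_series: Sequence[Optional[float]]) -> Optional[int]:
    """Return the index of the second of two consecutive MBP readings
    below the threshold, or None if no such pair exists.

    Assumption: a missing reading (None) breaks a run of low readings.
    """
    previous_low = False
    for i, mbp in enumerate(mbp_series):
        low = mbp is not None and mbp < MBP_THRESHOLD
        if low and previous_low:
            return i
        previous_low = low
    return None


def is_latent_shock(mbp_series: Sequence[Optional[float]], inotrope_given: bool) -> bool:
    """Label a visit whose initial vital signs were normal (checked upstream)."""
    return inotrope_given or first_sustained_hypotension(mbp_series) is not None
```

The sketch assumes that the exclusion criteria (for example, hypotension at ED admission) have already been applied before labeling.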
Predictors
Vital-sign recordings of diastolic blood pressure (DBP; mm Hg), systolic blood pressure (SBP; mm Hg), pulse rate (PR; beats per minute), body temperature (BT; °C), respiratory rate (RR; breaths per minute), and peripheral capillary oxygen saturation (SpO2; %) were used as predictors. Vital-sign measurements within 24 h before latent shock or the last vital-sign measurements in non–latent shock patients were included (Fig. 1). Vital-sign records were grouped by hour, and recordings were averaged when vital signs were measured more than once an hour.
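The hourly grouping described above might be sketched as follows for a single vital sign. The function name `hourly_means`, the representation of readings as `(time_h, value)` pairs, and the exact bin boundaries (a reading taken at the outcome time itself is excluded) are assumptions made for illustration.

```python
import math
from statistics import mean


def hourly_means(readings, outcome_time_h, window_h=24):
    """Average one vital sign into hourly bins over the window before the outcome.

    readings: iterable of (time_h, value), with time expressed in hours.
    Returns a list of length window_h; the last element is the hour
    immediately before the outcome, and None marks an hour with no reading.
    """
    bins = [[] for _ in range(window_h)]
    for time_h, value in readings:
        hours_before = outcome_time_h - time_h
        # Keep only readings strictly before the outcome and inside the window.
        if 0 < hours_before <= window_h:
            bins[window_h - math.ceil(hours_before)].append(value)
    return [mean(b) if b else None for b in bins]
```

Applying this to each of the six vital signs yields the six rows of the per-visit matrix, with `None` marking the hours that still require imputation.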
Data preparation and missing data handling
We split the dataset into training, validation, and test subsets. Training and validation datasets (73,067 ED visits) were collected from January 1, 2016, to December 31, 2019. We further divided the training and validation datasets by random sampling without replacement at a ratio of 7:3. The training dataset was used to train our prediction models, and the validation dataset was used to tune the hyperparameters. The test dataset (20,127 ED visits) was collected from January 1, 2020, to December 31, 2020, comprised data not used during the training phase, and was used only to determine the final model performance.
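A minimal sketch of this two-stage split is shown below: a temporal hold-out of 2020 visits as the test set, followed by a random 7:3 split of the remaining visits without replacement. The function name `split_visits`, the `(visit_id, visit_date)` input format, and the fixed seed are hypothetical choices, not details reported in the study.

```python
import random
from datetime import date

DEV_START, DEV_END = date(2016, 1, 1), date(2019, 12, 31)
TEST_START, TEST_END = date(2020, 1, 1), date(2020, 12, 31)


def split_visits(visits, train_fraction=0.7, seed=0):
    """visits: list of (visit_id, visit_date). Returns (train, validation, test) ids."""
    dev = [vid for vid, d in visits if DEV_START <= d <= DEV_END]
    test = [vid for vid, d in visits if TEST_START <= d <= TEST_END]
    # Shuffling a copy and cutting it once samples without replacement.
    shuffled = dev[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:], test
```

Holding out the most recent year keeps the test set temporally separate from the data used for training and hyperparameter tuning.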
Because the vital-sign data in electronic health records (EHRs) vary over time, data manipulation techniques were needed to appropriately represent patient conditions. First, we calculated the average value to create only one value for each time point. Because there are six vital signs and 24 timeframes, each visit is represented as a 6 × 24 matrix. Second, we forward-filled the null values for times without a value.
For example, if the sequence had values of [1, 2, missing, 3], we carried forward the most recent value for imputation. With the given example, the output would be [1, 2, 2, 3]. However, carry-forward imputation cannot be applied when a sequence begins with missing values. In this case, we applied three other imputation methods: impute with −1, impute with 0, and impute with average. In a given sequence [missing, missing, 1, 2, 3], the first two missing values do not have a preceding value to carry forward. Instead, we impute with −1 to create the output sequence [−1, −1, 1, 2, 3], with 0 to produce output sequence [0, 0, 1, 2, 3], or with the average to produce [2, 2, 1, 2, 3]. We compared the performance among imputation methods.
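The imputation rules in this paragraph can be sketched as a single function. The name `forward_fill` and the option labels `"minus_one"`, `"zero"`, and `"mean"` are hypothetical, and using the mean of the observed values of the same sequence as the "average" is an assumption.

```python
def forward_fill(seq, leading="mean"):
    """Carry the most recent value forward; fill leading gaps by a fallback rule.

    seq: list of numbers with None marking a missing value.
    leading: "minus_one", "zero", or "mean" (mean of the observed values).
    """
    observed = [v for v in seq if v is not None]
    if leading == "minus_one":
        fill = -1
    elif leading == "zero":
        fill = 0
    elif leading == "mean":
        if not observed:
            raise ValueError("cannot take the mean of a fully missing sequence")
        fill = sum(observed) / len(observed)
    else:
        raise ValueError(f"unknown fallback rule: {leading}")

    filled, last = [], None
    for v in seq:
        if v is not None:
            last = v
        # Before the first observation there is nothing to carry forward.
        filled.append(last if last is not None else fill)
    return filled
```

For instance, `forward_fill([1, 2, None, 3])` carries the 2 forward, while `forward_fill([None, None, 1, 2, 3], leading="minus_one")` fills only the two leading gaps with −1.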
The dimensionalities of the input data differed by prediction model. For logistic regression (LR), random forest (RF), and multilayer perceptron (MLP) methods, we flattened the 6 × 24 matrix into a vector with 144 values. For recurrent neural network (RNN), the dimensionality was unchanged (Fig. 1). Following the development of the prediction model, we performed external validation using the MIMIC-IV-ED database.
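The two input layouts can be illustrated with plain Python lists. The row-major flattening order and the time-major (24 steps × 6 features) arrangement for the sequence model are assumptions; the text specifies only the 144-value vector and the unchanged 6 × 24 matrix.

```python
N_VITALS, N_HOURS = 6, 24


def flatten_visit(matrix):
    """Flatten a 6 x 24 matrix (rows = vital signs) into a 144-value vector
    for LR, RF, and MLP. Assumption: row-major order."""
    return [value for row in matrix for value in row]


def to_time_major(matrix):
    """Rearrange the same matrix into 24 time steps of 6 features each,
    the layout most sequence-model libraries expect (an assumption here)."""
    return [list(step) for step in zip(*matrix)]
```

Both functions carry exactly the same 144 values; only the arrangement presented to the model differs.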
Machine learning
Our prediction model included LR, RF classifier, MLP, and RNN. The baseline model was LR with L2 regularization. For the RF classifier, we used 100 trees in the forest and a Gini impurity to measure the quality of the split. In our MLP, we used eight hidden layers, each with 128 neurons. For RNN, we used one linear embedding layer with 300 neurons, four LSTM layers with 64 neurons, and four linear layers with 64 neurons, followed by four LSTM layers.
Model evaluation
To analyze model performance, we used the AUROC approach. Model performance was reported on the test set using 1,000 bootstrapped samples to calculate the mean and 95% confidence interval. As this model predicts latent shock, we compared the results with actual outcomes in the ED.
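The evaluation procedure can be sketched without external libraries. The pairwise (Mann-Whitney) AUROC below is a standard formulation; the percentile method for the 95% confidence interval, the fixed seed, and the redrawing of resamples that contain a single class are assumptions, since the text reports only that 1,000 bootstrapped samples were used.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one
    (ties count one half). labels must be 0 or 1."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc(labels, scores, n_boot=1000, seed=0):
    """Return (mean, lower, upper) of the AUROC over n_boot resamples,
    using a percentile 95% interval."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the classes
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(0.025 * n_boot)]
    upper = stats[int(0.975 * n_boot) - 1]
    return sum(stats) / n_boot, lower, upper
```

The pairwise loop is quadratic in the number of visits, so a rank-based implementation would be preferable for a test set of this size; the sketch favors readability.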
We compared the model detection time and the time of actual initial bolus hydration for management of shock (1,7,23,24). As this was a retrospective study, we examined bolus hydration time to indirectly evaluate the time of clinician recognition of and management time of shock. Bolus hydration time was defined as that at which a specific amount of fluid was ordered to be infused in less than 1 h. We also compared the AUROC of the model with a shock index and an adjusted shock index.
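For reference, the comparator indices can be written as simple functions. The shock index is conventionally the pulse rate divided by the systolic blood pressure. The text does not define the adjusted shock index; the age-adjusted form shown here (age multiplied by the shock index) is an assumption, and the function names are hypothetical.

```python
def shock_index(pulse_rate: float, sbp: float) -> float:
    """Shock index: pulse rate (beats per minute) divided by SBP (mm Hg)."""
    if sbp <= 0:
        raise ValueError("SBP must be positive")
    return pulse_rate / sbp


def age_adjusted_shock_index(age_years: float, pulse_rate: float, sbp: float) -> float:
    """Assumed definition of the adjusted index: age multiplied by the shock index."""
    return age_years * shock_index(pulse_rate, sbp)
```

Both indices use vital signs from a single time point, whereas the prediction models take the full 24 h sequence as input.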
Frequency of vital-sign assessment
The frequency of vital-sign assessment for latent shock and non–latent shock patients was analyzed by hour before outcome. Average vital-sign recordings for latent shock and non–latent shock patients were recorded by hour before outcome.
Statistical analysis
Patient characteristics were described using descriptive statistics. Demographic data, vital-sign data, KTAS scores, initial nurse assessments, vital-sign reassessment time gaps, and frequency of vital-sign reassessment for latent shock and non–latent shock patients were compared using t test and chi-square test at a 0.01 significance level. Repeated vital-sign measurements within the 24 h timeframe were described as median and IQR.
RESULTS
Study population and demographics
Patients younger than 18 years (n = 60,122), DOA, in cardiac arrest, missing KTAS information (n = 10,634), left without being seen (n = 20,480), with a mean blood pressure lower than 65 mm Hg (n = 8,984), or with a missing or abnormal vital-sign record (n = 111) were excluded from the study population. Data of 89,250 visits of patients who met the prespecified criteria were included in the final analysis. Among these visits, 4% (3,650) involved latent shock patients (Fig. 2, Table 1).
Table 1.
Characteristic | Latent shock | Non–latent shock | P* |
---|---|---|---|
SMC | (n = 3,650) | (n = 85,600) | |
Age, median [IQR] | 66 [56;75] | 62 [50;72] | <0.001 |
Sex, n (%) | | | 0.151 |
Female | 1,666 (45.6%) | 40,121 (46.9%) | |
Male | 1,984 (54.4%) | 45,479 (53.1%) | |
KTAS, n (%) | | | <0.001 |
1 (most urgent) | 69 (1.9%) | 282 (0.3%) | |
2 | 699 (19.2%) | 6,925 (8.1%) | |
3 | 2,360 (64.7%) | 51,312 (59.9%) | |
4 | 499 (13.7%) | 25,113 (29.3%) | |
5 (least urgent) | 23 (0.6%) | 1,968 (2.3%) | |
Length of stay, h, median [IQR] | 17.1 [8.8;24.6] | 12.0 [7.1;21.9] | <0.001 |
MIMIC | (n = 825) | (n = 141,425) | |
Age, mean ± SD | 65.0 ± 17.6 | 56.2 ± 20.2 | <0.01 |
Sex, n (%) | | | 0.384 |
Female | 436 (52.8%) | 76,983 (54.4%) | |
Male | 389 (47.2%) | 64,442 (45.6%) | |
Acuity, n (%) | | | <0.01 |
5 (least urgent) | 0 (0.0%) | 47 (0.0%) | |
4 | 1 (0.1%) | 2,374 (1.7%) | |
3 | 119 (14.4%) | 72,461 (51.2%) | |
2 | 415 (50.3%) | 60,006 (42.4%) | |
1 (most urgent) | 290 (35.2%) | 6,537 (4.6%) | |
KTAS, Korean Triage and Acuity Scale.
*P values were calculated using t test or chi-square test based on variable type.
The MIMIC-IV-ED database was used to externally validate our latent shock prediction model. Data of 142,250 patient visits satisfied the same inclusion criteria. Of these visits, 825 were classified into the latent shock group. The patient demographics are described in Table 1.
Vital-sign records
Results for latent shock and non–latent shock patients are shown in Figure 3. The mean (standard deviation) values of vital-sign reassessment time gap and frequency of reassessment for latent shock and non–latent shock patients were 1.0 (1.2) h and 2.0 (1.7) h and 7.0 (4.2) and 6.0 (3.2) times, respectively. Generally, latent shock patients had lower SBP, DBP, and SpO2 and higher RR, PR, and BT (Fig. 3).
Model performance
Figure 4 and Table S1, http://links.lww.com/SHK/B729, show the AUROC values for prediction of latent shock. These findings were 0.822, 0.841, 0.852, and 0.830 with RNN, MLP, RF, and LR methods, respectively, at 3 h before latent shock. The average AUROC of predictions for latent shock was greater than 0.7 at 12 h before and greater than 0.85 at 1 h before latent shock using the four methods. The AUROC values of shock index and adjusted shock index were between 0.49 and 0.73 (Table S2, http://links.lww.com/SHK/B729). Table S3, http://links.lww.com/SHK/B729, shows the sensitivity analysis results of the RF model.
Figure S1, http://links.lww.com/SHK/B729, shows the interval between bolus hydration and the hypotensive shock event. The number is the time between the hypotensive shock event and hydration. Most hydration, whose timing may reflect when clinician recognition of and response to shock occur, was performed when or after shock occurred, whereas the latent shock prediction model showed an AUROC greater than 0.8 at 3 h before latent shock.
Table S3, http://links.lww.com/SHK/B729, shows variation in vital signs. The factors with the largest difference between latent shock and non–latent shock patients were PR and SBP. Similar trends were observed in the MIMIC-IV-ED dataset. Generally, latent shock patients had lower SBP, DBP, and SpO2 and higher RR, PR, and BT. Table S4, http://links.lww.com/SHK/B729, shows variation of repeated measures for each vital sign. The proportion of vital-sign information recorded without missing values is reported in Table S5, http://links.lww.com/SHK/B729.
DISCUSSION
This study predicted latent shock with a greater than 0.8 AUROC at 3 h before shock and showed better performance than traditional tools, such as shock index or adjusted shock index, using sequential vital signs in the ED (11,12). Also, this study conducted external validation with MIMIC-IV-ED data, which had no relationship with the original data, to show applicability with other databases.
Our study attempts to predict latent shock in the ED at the initial stage, while physiologic compensation is still present. If properly managed, the patient might recover and avoid shock. Viewing shock as a continuous process, detection of its early stage can help to prevent its progression. Few studies have used continuous data to predict shock along a continuum or to define a preshock stage or latent shock progression.
Also, as our latent shock prediction model was based on ED vital-sign records, which can be recorded both manually and automatically, it can produce real-time predictions in the presence of more than three measures of each vital sign.
Machine learning–based latent shock prediction models outperform shock indexes and age-adjusted shock indexes, and their performance increases as the time to shock approaches (Table S1, http://links.lww.com/SHK/B729). This might be because the shock index only considers vital signs at a single time point, while the machine learning model uses sequential vital-sign information to monitor serial changes. In addition, the latent shock prediction model incorporates all vital-sign data, including BT, RR, and SpO2, which are not included in the shock index. These additional vital indicators can be used to predict shock more accurately. Since our prediction method is based only on vital signs, the latent shock prediction model based on machine learning might be used instead of the shock or adjusted shock index.
As shown in Figure S1, http://links.lww.com/SHK/B729, the first bolus hydration for shock management was administered typically after shock recognition. Here, we demonstrate that our latent shock prediction model can accurately predict latent shock within a few hours before onset. The latent shock prediction model based on MLP, RF, and LR methods showed a greater than 0.8 AUROC at 3 h before latent shock. This suggests that our model may be able to predict latent shock based on vital-sign trends before it can be recognized and managed clinically (18,23,24). This could allow early intervention to prevent progression to a shock state.
As this study was performed in patients in the ED, which is a very different environment from the ICU, patient information might not be consistent and detailed. Therefore, patients had variable numbers of vital-sign measurements, and some had only a short sequence of vital signs. Despite this, the latent shock prediction model showed a high prediction rate for latent shock, supporting its use in the ED, where the amount of available vital-sign information varies by individual.
Also, the newly available MIMIC-IV-ED database was used for external validation. Despite being developed with data from a single tertiary hospital, our model showed high accuracy in external validation with MIMIC-IV-ED. As MIMIC-IV-ED data are from a different nation from the original data, their agreement suggests that the model can be used in other settings. A prospective study is needed before clinical use of this model.
This study has several limitations. First is its retrospective nature. However, several dataset-based validations were performed to minimize bias, including external validation using an open-source database, MIMIC-IV-ED. Second, this study was based on limited, segmented, and unstructured vital-sign information rather than continuous data. This is related to the ED environment, which is different from that of the ICU, where intensive monitoring is possible in selected patients. The ED is a crowded and dynamic place where patients are continuously moving in and out with different time stamps and regularity. Therefore, it is hard to obtain vital-sign information as frequently as in the ward or ICU. Frequency and regularity of vital-sign measurements vary depending on patient condition. For this reason, we tried many methods for handling missing vital signs, such as carry-forward, carry-backward, and imputation, as explained above. Third, for this same reason, there were several missing values in the vital-sign records that required imputation (Table S5, http://links.lww.com/SHK/B729). As explained above, we performed imputation in several ways to produce a prediction model. In addition, we performed external validation. Fourth, there was no distinct shock category. Although we attempted to define and categorize shock based on its variety of physiological underpinnings and causes, the cause of shock was not always obvious and sometimes was multifactorial. In addition, since this was a retrospective study, it was impossible to identify every cause of shock from the medical record. We simplified the prediction outcome using only vital signs and vasopressor and inotrope usage. Finally, the loaded volume was not considered because we attempted to concentrate on the time of shock detection rather than its management. In addition, bolus hydration cannot represent the exact time of clinician recognition of shock in the real world.
However, because this was a retrospective study, we used bolus hydration time as an indirect measure of the time of clinician recognition and management of shock.
CONCLUSION
In conclusion, this study predicted latent shock well using cumulative 24 h sequential vital-sign information in the ED and showed better performance than traditional tools. A prospective study is needed, and individual prediction must be validated before use in the field.
Footnotes
All the authors made substantial contributions to the concept and design of the article. H.C. and W.J. contributed equally to this work. H.C. and J.H. contributed to the conceptualization and methodology. W.J., J.H., and J.Y.Y. contributed to the validation and formal analysis. H.C., J.H.H., and W.J. contributed to the data curation. H.C. and W.J. contributed to the writing–original draft preparation and visualization. H.C., S.H., and T.K. contributed to the writing–review and editing. G.T.L., J.E.P., S.U.L., S.Y.H., H.Y., W.C.C., T.G.S., and T.K. contributed to the supervision. All the authors have read and given their final approval for publication of this version of the article.
The authors report no conflicts of interest.
The authors received no financial support for the research, authorship, and/or publication of this article.
The datasets generated and analyzed during the current study are not publicly available because they include some patient information. However, the datasets are available from the corresponding author on reasonable request.
Supplemental digital content is available for this article. Direct URL citation appears in the printed text and is provided in the HTML and PDF versions of this article on the journal’s Web site (www.shockjournal.com).
Contributor Information
Hansol Chang, Email: briquet90@naver.com.
Weon Jung, Email: angela.weon@gmail.com.
Juhyung Ha, Email: ha2399@gmail.com.
Jae Yong Yu, Email: icalust@naver.com.
Sejin Heo, Email: silversh06@naver.com.
Gun Tak Lee, Email: zenky07@naver.com.
Jong Eun Park, Email: jebbfirst@gmail.com.
Se Uk Lee, Email: seukemmd@gmail.com.
Sung Yeon Hwang, Email: romblon@naver.com.
Hee Yoon, Email: wildhi.yoon@gmail.com.
Won Chul Cha, Email: docchaster@gmail.com.
Tae Gun Shin, Email: tackles@naver.com.
REFERENCES
1. Cannon JW: Hemorrhagic shock. N Engl J Med 378(4):370–379, 2018.
2. Kakihana Y, Ito T, Nakahara M, et al.: Sepsis-induced myocardial dysfunction: pathophysiology and management. J Intensive Care 4:22, 2016.
3. Fernandes M, Vieira SM, Leite F, et al.: Clinical decision support systems for triage in the emergency department using intelligent systems: a review. Artif Intell Med 102:101762, 2020.
4. Partovi SN, Nelson BK, Bryan ED, Walsh MJ: Faculty triage shortens emergency department length of stay. Acad Emerg Med 8(10):990–995, 2001.
5. Abdulwahid MA, Booth A, Kuczawski M, Mason SM: The impact of senior doctor assessment at triage on emergency department performance measures: systematic review and meta-analysis of comparative studies. Emerg Med J 33(7):504–513, 2016.
6. Park H, Shin TG, Kim WY, et al.: A quick sequential organ failure assessment-negative result at triage is associated with low compliance with sepsis bundles: a retrospective analysis of a multicenter prospective registry. Clin Exp Emerg Med 9(2):84–92, 2022.
7. Khwannimit B, Bhurayanontachai R, Vattanavanit V: Comparison of the accuracy of three Early Warning Scores with SOFA score for predicting mortality in adult sepsis and septic shock patients admitted to intensive care unit. Heart Lung 48(3):240–244, 2019.
8. Kim I, Song H, Kim HJ, et al.: Use of the National Early Warning Score for predicting in-hospital mortality in older adults admitted to the emergency department. Clin Exp Emerg Med 7(1):61–66, 2020.
9. Giannini HM, Ginestra JC, Chivers C, et al.: A machine learning algorithm to predict severe sepsis and septic shock: development, implementation, and impact on clinical practice. Crit Care Med 47(11):1485–1492, 2019.
10. Kim J, Chang H, Kim D, et al.: Machine learning for prediction of septic shock at initial triage in emergency department. J Crit Care 55:163–170, 2020.
11. Fleuren LM, Klausch TLT, Zwager CL, et al.: Machine learning for the prediction of sepsis: a systematic review and meta-analysis of diagnostic test accuracy. Intensive Care Med 46(3):383–400, 2020.
12. Darwiche A, Mukherjee S: Machine learning methods for septic shock prediction. AIVR 2018: Proceedings of the 2018 International Conference on Artificial Intelligence and Virtual Reality. 104–110. doi:10.1145/3293663.3293673.
13. Yoon JH, Jeanselme V, Dubrawski A, et al.: Prediction of hypotension events with physiologic vital sign signatures in the intensive care unit. Crit Care 24(1):661, 2020.
14. Chang H, Cha WC: Artificial intelligence decision points in an emergency department. Clin Exp Emerg Med 9(3):165–168, 2022.
15. Chien S: Role of the sympathetic nervous system in hemorrhage. Physiol Rev 47(2):214–288, 1967.
16. Kumar A, Roberts D, Wood KE, et al.: Duration of hypotension before initiation of effective antimicrobial therapy is the critical determinant of survival in human septic shock. Crit Care Med 34(6):1589–1596, 2006.
17. Brekke IJ, Puntervoll LH, Pedersen PB, et al.: The value of vital sign trends in predicting and monitoring clinical deterioration: a systematic review. PLoS One 14(1):e0210875, 2019.
18. Rhodes A, Evans LE, Alhazzani W, et al.: Surviving Sepsis Campaign: international guidelines for management of sepsis and septic shock: 2016. Intensive Care Med 43(3):304–377, 2017.
19. Harjola VP, Lassus J, Sionis A, et al.: Clinical picture and risk prediction of short-term mortality in cardiogenic shock. Eur J Heart Fail 17(5):501–509, 2015.
20. Shoemaker WC: Temporal physiologic patterns of shock and circulatory dysfunction based on early descriptions by invasive and noninvasive monitoring. New Horiz 4(2):300–318, 1996.
21. Johnson A, Bulgarelli L, Pollard T, et al.: MIMIC-IV-ED (version 1.0). PhysioNet. 2021. doi:10.13026/77z6-9w59.
22. Goldberger AL, Amaral LA, Glass L, et al.: PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101(23):e215–e220, 2000.
23. Singer M, Deutschman CS, Seymour CW, et al.: The third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA 315(8):801–810, 2016.
24. Hofmeyr GJ, Mohlala BK: Hypovolaemic shock. Best Pract Res Clin Obstet Gynaecol 15(4):645–662, 2001.