Abstract
Background:
We used machine learning models incorporating rich electronic medical record (EMR) data to predict neurological outcomes after venoarterial extracorporeal membrane oxygenation (VA-ECMO).
Methods:
This was a retrospective review of adult (≥ 18 years) patients undergoing VA-ECMO between 6/2016 and 4/2022 at a single center. The primary outcome was good neurological outcome, defined as a modified Rankin Scale score of 0 to 3, evaluated at hospital discharge. We extracted every measurement of 74 vital and laboratory values, as well as circuit and ventilator settings, from 24 h before cannulation through the entire duration of ECMO. An XGBoost model with Shapley Additive Explanations was developed and evaluated with leave-one-out cross-validation.
Results:
Overall, 194 patients undergoing VA-ECMO (median age 58 years, 63% male) were included. We extracted more than 14 million individual data points from the EMR. Of 194 patients, 39 patients (20%) had good neurological outcomes. Three models were generated: model A, which contained only pre-ECMO data; model B, which added data from the first 48 h of ECMO; and model C, which included data from the entire ECMO run. The leave-one-out cross-validation areas under the receiver operator characteristics curve for models A, B, and C were 0.72, 0.81, and 0.90, respectively. The inclusion of on-ECMO physiologic, laboratory, and circuit data greatly improved model performance. Both modifiable and nonmodifiable variables, such as lower body mass index, lower age, higher mean arterial pressure, and higher hemoglobin, were associated with good neurological outcome.
Conclusions:
An interpretable machine learning model from EMR-extracted data was able to predict neurological outcomes for patients undergoing VA-ECMO with excellent accuracy.
Keywords: ECMO, Neurological outcomes, Machine learning
Introduction
Venoarterial extracorporeal membrane oxygenation (VA-ECMO) is an advanced life support modality capable of mechanical blood circulation and extracorporeal gas exchange for patients experiencing cardiorespiratory failure refractory to standard treatments [1]. Although VA-ECMO can be lifesaving, it carries a major risk for patient morbidity and mortality, with acute brain injury (ABI) being one of the most frequent complications, affecting 8% to 30% of patients during VA-ECMO support [2-4]. Because ECMO use is expanding and ABI is one of the leading causes of mortality, predicting neurological outcomes using precannulation and pericannulation variables—including “modifiable” factors—is critical in improving outcomes in this high-risk population.
To this end, noninvasive multimodal neuromonitoring protocols and prognostic models for survival capturing either pre-ECMO or peri-ECMO variables have been developed [5-11]. The Survival After Venoarterial-ECMO score and Simplified Acute Physiology Score II have been validated to predict long-term mortality after VA-ECMO [5]; however, their discriminatory power leaves substantial room for improvement [12]. As such, recent studies have started to use machine learning (ML) algorithms to develop prognostic models, but these have been limited to pre-ECMO variables and have yielded modest areas under the curve (AUCs) [13], possibly because their databases lack on-ECMO data. Previous studies using traditional statistical methods have revealed individual predictors for ABI during VA-ECMO, including high precannulation partial pressure of carbon dioxide in arterial blood (PaCO2), severe hyperoxia, and early low pulse pressure (< 20 mm Hg). Pre-ECMO hemoglobin levels, serum lactic acid levels, and hypothermia have also been identified as contributors to functional neurological outcomes [14-18]. Still, there exists a need for a modern comprehensive prediction model for neurological outcomes to better guide patient selection and management.
The critical illness and complex physiology of patients undergoing VA-ECMO, as well as frequently changing mechanical support factors, pose a challenge in modeling outcomes. Thus, we aimed to develop an ML framework for a new prognostic model for neurological outcomes after VA-ECMO, using granular data both prior to initiation of ECMO support and during the treatment course. We hypothesized that an ML model incorporating rich electronic medical record (EMR) data on both pre-ECMO and on-ECMO variables can be constructed with clinically relevant accuracy and predictive capability. This model may inform the proper use of VA-ECMO in the appropriate patient populations (i.e., those who have a good chance of neurological recovery) and inform clinicians on modifiable factors associated with good neurological outcomes that can be further studied in clinical trials.
Methods
Study Population
This study was approved by the Institutional Review Board (no. 00216321) of the Johns Hopkins University School of Medicine on October 22, 2019, with a waiver of informed consent. We retrospectively reviewed a cohort of consecutive adult patients supported by ECMO at the Johns Hopkins Hospital between June 2016 and April 2022. Patients who received venovenous (VV) ECMO or extracorporeal cardiopulmonary resuscitation (ECPR) support were excluded (Fig. 1). All patients undergoing ECMO at our institution were housed in the cardiovascular surgical intensive care unit and received standardized noninvasive multimodal neurological monitoring, including serial neurologic examinations, transcranial Doppler, electroencephalogram, somatosensory evoked potentials, and neuroimaging as clinically indicated [19]. The presence of preexisting significant neurologic issues is an exclusion criterion for VA-ECMO at our institution; however, patients with a history of ABI without permanent and significant deficits may be considered eligible on a case-by-case basis.
Fig. 1.

Patient cohort selection. A total of 310 adult patients underwent ECMO support between June 2016 and April 2022. Patients with VV-ECMO and ECPR (n = 102) were excluded, and an additional 14 patients undergoing VA-ECMO without available laboratory data or determination of neurological outcome were excluded. The final cohort was 194 patients undergoing VA-ECMO, of whom 39 (20%) had a good neurological outcome. ECMO, extracorporeal membrane oxygenation, ECPR, extracorporeal cardiopulmonary resuscitation, mRS, modified Rankin Scale, VV, venovenous, VA, venoarterial
Data Collection and Preprocessing
For this cohort, we prospectively collected data on baseline characteristics, past medical history, pre-ECMO variables, on-ECMO clinical events, survival to discharge, and neurological outcome. Further, we a priori compiled a list of time series variables that are electronically recorded as standard-of-care for all patients undergoing ECMO. All time series laboratory values, ECMO and ventilator settings, medications, and vitals during hospitalization were retrieved through an automated data query of the EMR system (Table 1). In total, 74 variables were collected (Fig. 2), yielding a total of more than 14 million individual data points. Vitals, including blood pressure, heart rate, oxygen saturation (SpO2), temperature, and central venous pressure, were recorded into the EMR at least every 15 min. Blood pressure was measured from a right radial arterial line whenever possible. Arterial blood gases were collected every 2 to 4 h as clinically indicated, with more frequent collections if necessary. We also calculated pericannulation changes in the partial pressure of oxygen and carbon dioxide (“delta PaO2” and “delta PaCO2”), defined as the difference between the mean arterial partial pressure of oxygen and carbon dioxide, respectively, in the 12 h before and after cannulation. Ventilator data were recorded at any time a change was made to ventilator settings. ECMO settings were recorded at every change in actual flow (L/min) or pump speed; at minimum, a value was recorded every 2 h if there were no measurable changes in ECMO parameters. Near-infrared spectroscopy data were recorded every hour. Dosing of medications on a continuous infusion was recorded at least every 15 min or whenever there was a change in dosing, whichever was more frequent. Finally, all other laboratory measurements, including complete blood counts, metabolic panels, coagulation labs, and arterial or venous blood gases, were collected when clinically indicated.
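The pericannulation delta described above can be illustrated with a minimal Python sketch. The function name, the sign convention (mean after cannulation minus mean before), and the handling of windows with no measurements are assumptions made for illustration; the text specifies only that the difference is taken between the means in the 12 h before and after cannulation.

```python
from statistics import mean


def pericannulation_delta(samples, window_h=12.0):
    """Mean in the window after cannulation minus the mean in the window before.

    samples: iterable of (hours_from_cannulation, value); negative hours are pre-ECMO.
    Returns None when either window contains no measurement (an assumption).
    """
    before = [v for t, v in samples if -window_h <= t < 0]
    after = [v for t, v in samples if 0 <= t <= window_h]
    if not before or not after:
        return None
    return mean(after) - mean(before)


# Hypothetical PaCO2 readings (mm Hg); the reading at 20 h lies outside the window.
paco2 = [(-6.0, 60.0), (-2.0, 50.0), (1.0, 40.0), (8.0, 36.0), (20.0, 35.0)]
delta_paco2 = pericannulation_delta(paco2)  # 38.0 - 55.0
```

With this sign convention, a negative delta PaCO2 indicates a fall in PaCO2 after cannulation.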
Table 1.
Baseline demographics, medical history, mechanical support characteristics, and incidence of acute brain injury in patients with good (mRS 0-3) and poor (mRS 4-6) neurological outcomes after venoarterial extracorporeal membrane oxygenation
| | Total (N = 194) | Good Neurological Outcome (n = 39, 20%) | Poor Neurological Outcome (n = 155, 80%) | p value |
|---|---|---|---|---|
| Demographics | ||||
| Age | 58.0 (48.0, 68.0) | 47.0 (34.0, 57.5) | 60.0 (50.0, 69.5) | < 0.001 |
| BMI | 29.2 (25.0, 34.8) | 25.6 (23.7, 28.9) | 30.7 (25.2, 36.2) | < 0.001 |
| Male | 123 (63.4%) | 25 (64.1%) | 98 (63.2%) | 1.000 |
| Race | ||||
| White | 117 (60.3%) | 21 (53.8%) | 96 (61.9%) | 0.117 |
| Black | 52 (26.8%) | 16 (41.0%) | 36 (23.2%) | |
| Asian | 10 (5.2%) | 0 (0%) | 10 (6.5%) | |
| Hispanic | 3 (1.5%) | 0 (0%) | 3 (1.9%) | |
| Others | 12 (6.2%) | 2 (5.1%) | 10 (6.5%) | |
| Past medical history | ||||
| Hypertension | 136 (70.1%) | 21 (53.8%) | 115 (74.2%) | 0.022 |
| Hyperlipidemia | 101 (52.1%) | 14 (35.9%) | 87 (56.1%) | 0.037 |
| Diabetes | 63 (32.5%) | 10 (25.6%) | 53 (34.2%) | 0.408 |
| Congestive heart failure | 60 (30.9%) | 10 (25.6%) | 50 (32.3%) | 0.545 |
| Chronic kidney disease | 27 (13.9%) | 3 (7.7%) | 24 (15.5%) | 0.318 |
| Atrial fibrillation | 49 (25.3%) | 3 (7.7%) | 46 (29.7%) | 0.009 |
| Prior ischemic stroke | 16 (8.2%) | 2 (5.1%) | 14 (9.0%) | 0.641 |
| Prior intracranial hemorrhage | 3 (1.5%) | 0 (0%) | 3 (1.9%) | 0.881 |
| ECMO variables | ||||
| Indications | ||||
| Cardiogenic shock | 94 (48.5%) | 24 (61.5%) | 70 (45.2%) | 0.114 |
| Post cardiotomy shock | 87 (44.8%) | 11 (28.2%) | 76 (49.0%) | |
| Others | 13 (6.7%) | 4 (10.3%) | 9 (5.8%) | |
| Cannulation strategy | ||||
| Central | 85 (43.8%) | 7 (17.9%) | 78 (50.3%) | < 0.001 |
| Peripheral | 109 (56.2%) | 32 (82.1%) | 77 (49.7%) | |
| ECMO duration (hours) | 120 (64.4, 209) | 136 (71.6, 176) | 117 (62.5, 214) | 0.603 |
| Outcomes | ||||
| Acute brain injury | 67 (34.5%) | 10 (25.6%) | 57 (36.8%) | 0.263 |
| Ischemic stroke | 18 (9.3%) | 5 (12.8%) | 13 (8.3%) | 0.807 |
| Hypoxic ischemic brain injury | 11 (5.7%) | 1 (2.7%) | 10 (6.5%) | 0.449 |
| Diffuse cerebral edema | 7 (3.6%) | 1 (2.7%) | 6 (3.9%) | 0.931 |
| Seizure | 7 (3.6%) | 2 (5.1%) | 5 (3.2%) | 1.000 |
| Intraparenchymal hemorrhage | 5 (2.6%) | 0 (0%) | 5 (3.2%) | 0.487 |
| Subarachnoid hemorrhage | 4 (2.1%) | 1 (2.7%) | 3 (1.9%) | 1.000 |
| Subdural hemorrhage | 6 (3.1%) | 1 (2.7%) | 5 (3.2%) | 1.000 |
Good neurological outcomes are defined as a modified Rankin Scale (mRS) of 0 to 3 at hospital discharge; poor outcomes are defined as mRS of 4 to 6. Numerical values are reported as median (interquartile range/IQR) and binary values are reported as n (%). Abbreviations: BMI-body mass index; ECMO-extracorporeal membrane oxygenation
Fig. 2.

Variables included in model training. Electronic medical record query and chart review yielded 14 million individual measurements available for analysis. BMI, body mass index, BP, blood pressure, BUN, blood urea nitrogen, CAM-ICU, Confusion Assessment Method-Intensive Care Unit, ECMO, extracorporeal membrane oxygenation, MCH, mean corpuscular hemoglobin, MCHC, mean corpuscular hemoglobin concentration, MCV, mean corpuscular volume, RASS, Richmond Agitation-Sedation Scale, RBC, red blood cell, RDW, red blood cell distribution width, WBC, white blood cell
Data merging and preprocessing were done using packages in R (version 4.2.2). For every time series variable, the median, minimum, maximum, standard deviation (as a measure of variability), and slope were separately calculated for the following time periods: 24 h before ECMO cannulation, 0 to 6 h on ECMO, 0 to 12 h on ECMO, 0 to 24 h on ECMO, 0 to 48 h on ECMO, and for the entire ECMO course. The vasopressor dosage equivalence (VDE) score for each patient was computed as follows [14]: (norepinephrine dosage (μg/kg/min) × 100) + (epinephrine dosage (μg/kg/min) × 100) + (phenylephrine dosage (μg/kg/min) × 10) + (dopamine dosage (μg/kg/min) × 1) + (vasopressin dosage (U/min) × 250) + (angiotensin II dosage (μg/kg/min) × 1000) + (metaraminol dosage (μg/kg/min) × 12.5). The maximum vasopressor dosage equivalence score was taken for each of the previously mentioned time periods.
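The following Python sketch illustrates the per-window summary features and the VDE score. It is not the study's R code. The half-open window boundaries, the use of the sample standard deviation, the least-squares definition of slope, and all function and variable names are assumptions; only the five summary statistics and the VDE weights come from the text.

```python
from statistics import median, stdev

# VDE weights as listed in the text (vasopressin in U/min, all others in ug/kg/min).
VDE_WEIGHTS = {
    "norepinephrine": 100.0,
    "epinephrine": 100.0,
    "phenylephrine": 10.0,
    "dopamine": 1.0,
    "vasopressin": 250.0,
    "angiotensin_ii": 1000.0,
    "metaraminol": 12.5,
}


def vde_score(doses):
    """Weighted sum of concurrent vasopressor doses; absent drugs contribute zero."""
    return sum(VDE_WEIGHTS[drug] * dose for drug, dose in doses.items())


def slope(points):
    """Ordinary least-squares slope of value against time (value units per hour)."""
    n = len(points)
    if n < 2:
        return None
    t_bar = sum(t for t, _ in points) / n
    v_bar = sum(v for _, v in points) / n
    sxx = sum((t - t_bar) ** 2 for t, _ in points)
    if sxx == 0:
        return None
    return sum((t - t_bar) * (v - v_bar) for t, v in points) / sxx


def summarize(samples, start_h, end_h):
    """Summary features for measurements with start_h <= time < end_h (assumed bounds)."""
    pts = [(t, v) for t, v in samples if start_h <= t < end_h]
    vals = [v for _, v in pts]
    if not vals:
        return None
    return {
        "median": median(vals),
        "min": min(vals),
        "max": max(vals),
        "sd": stdev(vals) if len(vals) > 1 else 0.0,
        "slope": slope(pts),
    }


# Hypothetical mean arterial pressure readings: (hours on ECMO, mm Hg).
map_samples = [(0.0, 60.0), (2.0, 64.0), (4.0, 68.0), (10.0, 90.0)]
first_6h = summarize(map_samples, 0.0, 6.0)

# Hypothetical concurrent infusion doses at one time point.
vde = vde_score({"norepinephrine": 0.1, "vasopressin": 0.04})
```

Applying `summarize` to each of the listed time windows, and taking the maximum of `vde_score` over the time points within a window, would produce features of the kind described above.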
Categorical variables were one-hot encoded. Variables with > 10% missingness were excluded. Of the remaining included variables, there was 5.3% missingness across the entire dataset. These were typically specialized measurements, such as thromboelastography or left and right cerebral near-infrared oximetry measurements. We handled the remaining missing values by imputation: binary variables were imputed by taking patients with the shortest Hamming distance from the patients with the missing value and replacing it with the mode from the reference patients [20], whereas continuous variables were imputed by median imputation.
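A minimal Python sketch of the two imputation rules follows. How distance ties, mode ties, and missing entries in the comparison columns were handled is not stated in the text, so the choices below (use all patients tied at the shortest distance, skip missing entries when counting differences, take the first mode on a tie) are assumptions, as are the function names.

```python
from statistics import median, mode


def hamming(a, b, skip):
    """Number of positions where two binary rows differ, ignoring column `skip`
    and any position where either row has a missing (None) entry."""
    return sum(
        1
        for j, (x, y) in enumerate(zip(a, b))
        if j != skip and x is not None and y is not None and x != y
    )


def impute_binary(rows, i, j):
    """Impute rows[i][j] with the mode of column j among the nearest patients."""
    target = rows[i]
    candidates = [r for k, r in enumerate(rows) if k != i and r[j] is not None]
    dists = [hamming(target, r, j) for r in candidates]
    shortest = min(dists)
    reference = [r[j] for r, d in zip(candidates, dists) if d == shortest]
    return mode(reference)  # on a tie, returns the first mode encountered


def impute_median(values):
    """Replace missing entries of a continuous variable with the observed median."""
    m = median(v for v in values if v is not None)
    return [m if v is None else v for v in values]


# Hypothetical binary history flags; patient 0 is missing the last flag.
rows = [
    [1, 0, 1, None],
    [1, 0, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
]
imputed_flag = impute_binary(rows, 0, 3)
imputed_lab = impute_median([1.0, None, 3.0, 10.0])
```

In this toy example, two patients match patient 0 exactly on the observed flags, so their shared value is used.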
Model Timing
The resulting imputed file was then split into three files for algorithm fitting: variables within the 24 h before ECMO cannulation (model A), within 24 h before cannulation up to 48 h after cannulation (model B), and within 24 h before cannulation through the entire duration of the ECMO run (model C). We chose to fit these three models at different time points because they represent three distinct clinical questions. Model A, which only uses data available before cannulation, is meant to evaluate if ML can assist in determining which patients may be good ECMO candidates from a neurologic outcomes perspective. Model B adds a substantial amount of mechanical and physiologic information collected after cannulation and is meant to assist in continuing prognostication of neurologic outcomes for patients recently cannulated. Finally, we evaluated model C, which captures the most comprehensive amount of data, from 24 h before cannulation through the entire ECMO run, to investigate potentially modifiable variables associated with neurologic outcomes.
Outcome
The primary outcome variable was neurological status at hospital discharge classified as “good neurological outcome” for those with a modified Rankin Scale (mRS) score of 0 to 3 and as “poor neurological outcome” for those with an mRS score of 4 to 6. These outcomes were independently evaluated by two clinicians and further adjudicated in times of disagreement.
ML Algorithm
Using Python, we assessed the suitability of 4 ML algorithms in predicting neurological outcomes from our dataset containing variables from the entire hospitalization: random forest, CatBoost, LightGBM, and XGBoost. Hyperparameter tuning for each algorithm was performed using Bayesian optimization on the dataset randomly divided into training (70%) and testing (30%) subsets. Each algorithm was then applied to our entire cohort using a leave-one-out cross-validation (LOOCV) approach. In LOOCV, a single observation is removed from a dataset with n observations to act as the test set, and the remainder of the dataset is used to train the ML model. The process is repeated across the dataset such that each observation is used as the test set once, so a total of n models are trained and each is tested on its held-out observation. The n held-out predictions can then be pooled to form a single testing set of size n, from which metrics including receiver operating characteristic (ROC) AUC and precision-recall AUC can be calculated. Given no substantial differences between the four algorithms’ performances, measured by their respective AUC-ROC from the LOOCV, XGBoost was chosen as the gradient boosted decision tree algorithm for this study because it required the least time for training and testing compared with the other models.
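The LOOCV procedure with pooled held-out predictions can be sketched in plain Python as follows. To keep the example self-contained, a simple nearest-centroid scorer stands in for XGBoost; this stand-in, the toy data, and all function names are assumptions for illustration and are not the study's pipeline. The AUC-ROC is computed from the pooled scores as the probability that a randomly chosen positive case outscores a randomly chosen negative case.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_centroids(X, y):
    """Stand-in model (not XGBoost): one centroid per outcome class."""
    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    pos = centroid([x for x, lab in zip(X, y) if lab == 1])
    neg = centroid([x for x, lab in zip(X, y) if lab == 0])
    return pos, neg


def predict_centroids(model, x):
    """Higher score means closer to the positive-class centroid."""
    pos, neg = model
    d_pos = sum((a - b) ** 2 for a, b in zip(x, pos))
    d_neg = sum((a - b) ** 2 for a, b in zip(x, neg))
    return d_neg - d_pos


def loocv_scores(X, y, fit, predict):
    """Train n models, each on n - 1 observations, and pool the n held-out scores."""
    pooled = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        pooled.append(predict(model, X[i]))
    return pooled


# Toy one-feature dataset with a clean class separation.
X = [[1.0], [2.0], [3.0], [7.0], [8.0], [9.0]]
y = [0, 0, 0, 1, 1, 1]
pooled = loocv_scores(X, y, fit_centroids, predict_centroids)
auc = roc_auc(y, pooled)
```

Because every observation is scored by a model that never saw it, the pooled scores behave like a single test set of size n.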
ML Pipeline
For each dataset, an XGBoost framework was trained to predict neurological outcomes. Hyperparameters for the XGBoost models were selected using a Bayesian optimization algorithm applied to the desired dataset, randomly divided into a training (70%) and testing (30%) subset. For each dataset, an XGBoost model with the specified hyperparameters was then trained and tested on the respective train and test subsets. Given the relatively low sample size of our dataset (n = 194), our model framework was also tested on each dataset using an LOOCV approach. The LOOCV approach reduces the risk of bias by testing the ML algorithm on the entirety of the dataset and ensuring replicability of our results.
We assessed the predictive performance of our XGBoost models, evaluated either on the train and test subsets or with the LOOCV approach, by calculating the AUC-ROC, AUC precision-recall, and a Brier score from our models’ performance on the testing set (n = 194 for LOOCV). After choosing a threshold that maximizes the F1 score (a score that combines precision and recall), further model metrics including sensitivity, specificity, negative predictive value (NPV), and positive predictive value were calculated. The F1 score is the harmonic mean of precision and recall, with a score of 1 indicating perfect precision and recall and a score of 0 indicating that precision or recall is zero. A threshold to maximize the F1 score was chosen to maximize predictive power of the models for potential clinical applications. However, there is no clear consensus on the best methodology for model evaluation in a highly specialized use situation such as the one we are describing.
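The threshold selection and the derived metrics can be illustrated with the following Python sketch. The candidate thresholds (each distinct predicted score), the convention that a score at or above the threshold counts as a predicted good outcome, and the toy labels and scores are assumptions for illustration.

```python
def confusion(labels, scores, threshold):
    """Counts (tp, fp, tn, fn), predicting positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(labels, scores):
    """Scan every distinct score as a threshold; keep the first one with maximal F1."""
    best_t, best_f = None, -1.0
    for t in sorted(set(scores)):
        tp, fp, _, fn = confusion(labels, scores, t)
        f = f1_score(tp, fp, fn)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical pooled predictions (1 = good neurological outcome).
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.35, 0.9]

threshold, f1 = best_f1_threshold(labels, scores)
tp, fp, tn, fn = confusion(labels, scores, threshold)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
```

Note that precision equals positive predictive value and recall equals sensitivity, so all four metrics follow from the same confusion counts once the threshold is fixed.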
We also sought to understand how our models were making their decisions and which variables were most important in predicting neurological outcomes. Across the XGBoost models trained for LOOCV on a particular dataset, we analyzed the ranked feature importance, which reveals the contribution of features in the construction of the boosted decision trees within the models, and the Shapley Additive Explanations (SHAP) values, which reveal the contribution of a feature on the models’ predictions. For each variable in the SHAP summary plot, the values are indicated as being high (red) or low (blue) relative to each variable distribution; each dot, which represents the feature attribution value of each patient, is plotted as a SHAP value on the x-axis. SHAP values measure the predictive impact and those that exceed zero represent a higher likelihood of having a good neurological outcome at discharge. Thus, the SHAP dependence plot can be used to assess how each feature contributes across prognostic models for each dataset. Finally, we also graphically demonstrate the relative importance of each variable by ranking them according to the mean absolute value of the SHAP value. In other words, variables are ranked according to the magnitude of their impact to the model’s output. Although all available variables are fed into the model, certain variables may not provide any incremental discriminatory ability and are subsequently given zero weight in the final model. Both feature importance and SHAP values add interpretability to the XGBoost framework and reveal pertinent clinical variables associated with positive and negative neurological outcomes. All model training, testing, and analysis were done in Python 3.10.9.
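The ranking by mean absolute SHAP value can be sketched as below. The sketch assumes a matrix of SHAP values has already been computed (one row per patient, one column per feature); it does not compute SHAP values itself, and the feature names and numbers are hypothetical.

```python
def rank_by_mean_abs_shap(shap_values, feature_names):
    """Return (feature, mean |SHAP|) pairs sorted from most to least important."""
    n = len(shap_values)
    importance = {
        name: sum(abs(row[j]) for row in shap_values) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical SHAP matrix: 3 patients x 3 features.
feature_names = ["age", "bmi", "map_min"]
shap_values = [
    [-0.4, 0.1, 0.0],
    [0.6, -0.3, 0.0],
    [-0.5, 0.2, 0.0],
]
ranking = rank_by_mean_abs_shap(shap_values, feature_names)
top_feature = ranking[0][0]
```

A feature whose SHAP values are zero for every patient, like the last one here, corresponds to a variable that receives no weight in the final model.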
Results
Patient Characteristics
We collected data on a total of 194 adults who underwent VA-ECMO cannulation. We identified 155 (79.9%) patients as having “poor” neurological outcomes (mRS score 4–6) and 39 (20.1%) patients as having “good” neurological outcomes (mRS score 0–3). Overall, patients had a median age of 58 years (interquartile range 48–68 years), were mostly male (63.4%), and had a median ECMO duration of 120 h, with 34.5% of patients experiencing ABI. Patients with poor neurological outcomes had significantly higher median age (60 vs. 47 years, p < 0.001) and body mass index (30.7 vs. 25.6, p < 0.001), and were more likely to have preexisting hypertension (74% vs. 54%, p = 0.022) and atrial fibrillation (30% vs. 8%, p = 0.009) (Table 1). The most common indication for VA-ECMO was medical cardiogenic shock, the frequency of which did not differ significantly between those with good and poor neurological outcome (62% vs. 45%, p = 0.11). Patients with poor neurological outcome were more likely to have been centrally cannulated than those with good neurological outcome (50% vs. 18%, p < 0.001).
Table 2.
Full model metrics derived from leave-one-out cross-validation for model A (using only variables available before extracorporeal membrane oxygenation), model B (adding in data from the first 48 h of extracorporeal membrane oxygenation), and model C (adding in data from the entire duration of extracorporeal membrane oxygenation support)
| | Model A | Model B | Model C |
|---|---|---|---|
| Accuracy (gross) | 0.78 | 0.78 | 0.84 |
| AUC-ROC | 0.72 | 0.81 | 0.90 |
| Brier score | 0.63 | 0.33 | 0.33 |
| True positive rate | 0.46 | 0.64 | 0.87 |
| True negative rate | 0.86 | 0.81 | 0.83 |
| False positive rate | 0.14 | 0.19 | 0.17 |
| False negative rate | 0.54 | 0.36 | 0.13 |
| Positive predictive value | 0.45 | 0.46 | 0.56 |
| Negative predictive value | 0.86 | 0.9 | 0.96 |
| Precision | 0.46 | 0.47 | 0.56 |
| Recall | 0.49 | 0.67 | 0.9 |
| F1 score | 0.48 | 0.55 | 0.69 |
| Thresholds | 0.64 | 0.54 | 0.53 |
AUC-ROC, area under the receiver operator characteristics curve
After the data collection, processing, and imputation, model A yielded a total of 73 variables, model B yielded 656 variables, and model C yielded 826 variables for model development.
Model Performance
The LOOCV approach was used for all model evaluations. Using only variables available before ECMO initiation, model A achieved an AUC-ROC of 0.72, Brier score of 0.63, accuracy of 78%, and NPV of 86% (Fig. 3, Table 2). Model B, incorporating variables gathered within the first 48 h of the ECMO run, achieved an AUC-ROC of 0.81, a Brier score of 0.33, an accuracy of 78%, and an NPV of 90%. Finally, after incorporating all available data throughout the entirety of the ECMO course, model C achieved the highest AUC-ROC of 0.90 with a Brier score of 0.33, an accuracy of 84%, and an NPV of 96%. Our results indicate a graded increase in model performance as more variables over the course of the ECMO run were included in algorithm training. Notably, the NPV was high for all three models (NPV > 0.85, Table 2), regardless of the time point to which data evaluation was limited. A confusion matrix can be found in the Supplemental material (Table S1).
Fig. 3.

Model performance. Receiver operator characteristics curves for leave-one-out-cross-validation models A (green line, only pre-ECMO features), B (orange line, with data from the first 48 h of ECMO support), and C (blue line, with data from the entire ECMO run). AUC, area under the curve, ECMO, extracorporeal membrane oxygenation
Feature Importance
To better understand the clinical application of the prognostic models, SHAP summary plots were used to identify and visualize the features that contributed most to the models’ predictions. The top 20 variables ranked from most to least impactful on model outcomes across the LOOCV models were plotted for each of the three datasets (Fig. 4).
Fig. 4.

Most important features for each model. Ranked Shapley values for the 20 variables in each machine learning model that provide the most discriminative value. Each dot is a patient, with red and blue values representing higher and lower numerical values, respectively, of the listed variable. Dots to the right and left of the vertical line indicate that the variable was associated with a model prediction of “good” or “poor” neurological outcome, respectively. Variables are vertically ranked by their relative importance to model predictions, calculated by the mean absolute value of the Shapley values. As such, variables with the largest spread along the x-axis are ranked higher, indicating their higher importance in distinguishing between good and poor neurologic outcomes. BP, blood pressure, ECMO, extracorporeal membrane oxygenation, IABP, intraaortic balloon pump, MAP, mean arterial pressure, pRBC, packed red blood cell, RBC, red blood cell, TV, tidal volume
The features with highest importance in model A include many demographic variables: low body mass index, younger age, Black and White race, and no history of congestive heart failure or atrial fibrillation were among the top 20 predictors for good neurological outcome (Fig. 4). Among the clinical variables acquired before ECMO cannulation, higher platelet levels tended to predict a good neurological outcome. Additionally, variables related to SpO2 (in particular, higher median SpO2 with lower SpO2 variability) were among the top 10 features of this model (Fig. 4). Higher precannulation PaCO2 was associated with poor neurological outcome. Two individual patients’ SHAP plots are highlighted in Fig. 5a and b. Features in red drove the model to predict a good neurological outcome (as illustrated by arrows on the line graph pointing in the positive direction), whereas features in blue drove the prediction toward a poor outcome. In both of these patients, the model predicted a poor neurological outcome, although certain individual features were protective (such as having an ejection fraction above 50% and a higher nadir SpO2).
Fig. 5.

Shapley force plots for two individual example patients. Features in red drove the model to predict a good neurological outcome (as illustrated by arrows on the line graph pointing in the positive direction) while features in blue drove the prediction toward a poor outcome. The “f(x)” value denotes the strength of the final prediction, with negative and positive values representing a predicted poor and good outcome, respectively. Please note that the x-axis scales are different between the two presented patients. BMI, body mass index, Max, maximum, pRBC, packed red blood cell, SpO2, oxygen saturation
For model B, higher exhaled tidal volumes within the first 48 h of ECMO initiation were predictive of good neurological outcome, as well as higher nadir hemoglobin and red blood cell counts. Of note, younger age and higher platelet count—two high-importance features also observed in model A—remained highly predictive of outcomes even after ECMO initiation.
Lastly, model C similarly identified higher median and higher nadir exhaled tidal volume from the entire ECMO run as major predictors of good neurological outcomes (Fig. 4). Additionally in model C, higher core body temperature, higher nadir mean arterial pressure, lower nadir peak airway pressure, lower ECMO pump flow, up-trending pulse pressure, and younger age were associated with good neurological outcome.
Discussion
Our study used 14 million individual measurements on 194 patients collected during standard care to develop the first prognostic model to predict neurological outcomes after VA-ECMO support using ML. The use of XGBoost, a gradient boosting algorithm, with LOOCV for model testing was intended to provide a robust estimate of model performance and minimize selection bias. These ML algorithms demonstrate the ability to predict neurological outcomes from pre-ECMO variables, on-ECMO variables within the first 48 h of cannulation, and variables from the entire duration of ECMO support with high NPVs (Table 2). As such, our model A can help inform the practice of determining which patients may not be optimal candidates for VA-ECMO due to a predicted poor neurological outcome even prior to cannulation. However, further analysis demonstrated that the addition of on-ECMO variables early in cannulation (which many previous scoring systems do not incorporate) greatly bolsters predictive power. Because the decision to cannulate is frequently emergent, without time for model evaluation in a real-world clinical environment, model B was also developed to better inform the chances of meaningful neurological recovery by incorporating early on-ECMO variables. Model C was developed to elucidate potentially modifiable risk factors associated with neurologic outcomes when examining the entire duration of ECMO support.
Our motivation to build models for patients undergoing VA-ECMO stems from clinical challenges faced during patient selection and management. Because of the indications for and invasiveness of VA-ECMO, patients are at high risk for mortality and complications, notably neurological injury. With in-hospital mortality up to 56% and ABI estimated to occur in 15% to 33% of patients, there are significant limitations to this potentially lifesaving therapy [4, 19]. As such, many studies have investigated the factors contributing to the development of ABI during ECMO to mitigate these risks and reduce mortality, identifying variables such as longer ECMO duration, renal replacement therapy, and higher pre-ECMO PaCO2 [4, 15]. However, despite elucidating the associated risk factors, there remains a lack of prevention and management strategies in practice [16]. For this reason, our institutional neuromonitoring protocol was developed to increase the sensitivity of ABI detection and has been validated to have an association with increased ABI detection, contributing to timely intervention and improvements in neurological outcomes [17]. Yet, even with the development of protocols for proactive neuromonitoring, predicting neurological outcomes after ECMO support remains challenging [21, 22].
Multiple studies have attempted to build predictive models to elucidate indicators that are prognostic of patient outcomes. Although all currently published models report on in-hospital mortality as the primary outcome, they still serve as important comparisons in developing future prognostication tools for cohorts of patients undergoing VA-ECMO. The popularized Survival After Venoarterial-ECMO score, developed using the international Extracorporeal Life Support Organization Registry, includes pre-ECMO variables in a logistic regression to determine the likelihood of survival but only achieves modest discriminatory power (AUC = 0.62) [5]. Similarly, the ECMO-ACCEPTS score derived from multivariable Cox-regression analysis of pre-ECMO variables only achieves an accuracy of 54.7% [11]. Systematic validation of all current prognostic scores for in-hospital mortality further revealed limited generalizability and only modest predictive power, indicating the need to incorporate variables collected during ECMO support and for use of artificial intelligence–modeling techniques in place of traditional regression models [12, 18, 19]. More specifically, limited work has been done in the space of ML applications to predictive modeling in VA-ECMO.
Currently, the only previously published literature employing the use of ML in predicting mortality in patients on VA-ECMO support combined on-ECMO variables with deep neural networks and dimensionality reduction to achieve an AUC-ROC of 0.92 and accuracy of 82% [13]. However, the methods from the study by Ayers et al. [13] are susceptible to selection bias due to the application of a train-test set on their small sample size. Our models mitigate this issue by applying the LOOCV method to reduce variability in the test set selection by ensuring every individual in our dataset is represented. Although estimation of univariate feature importance or use of Gini importance scores aid with identifying the variables that contribute most to the model’s predictive power [13, 23], it lacks the granularity to inform clinical management. Thus, to elucidate what has traditionally been a black box model, we used SHAP summary plots to allow for better visualization of how the characteristics of each feature influence the primary outcome, which is critical in understanding the potential modifiable and nonmodifiable factors before and after cannulation that can improve functional neurological status for patients receiving VA-ECMO support. The SHAP analysis identified many features that align with what is already clinically accepted. Characteristics such as younger age, low pre-ECMO PaCO2, higher temperature, up-trending pulse pressures, better functional status as indicated by higher exhaled tidal volumes, and high mean arterial pressure have been shown to produce better neurological outcomes [15, 17, 25].
This study has several limitations. As with any single-center retrospective study, our model was restricted to the variables that could be queried from the EMR and omits variables that are not measured or recorded in a flowsheet. Similarly, different institutions, and certainly individual clinicians, may have different criteria for ECMO eligibility based on center experience, and the models we have generated reflect a single institution’s practice pattern [24]. Additionally, because the model was generated using summary variables of what are otherwise time series data, it does not provide a real-time risk score; however, our primary outcome was not time dependent. Although some physiological factors typically associated with clinical status (such as SpO2 and hemoglobin levels) were important in the feature analysis, the exact relationships between these variables and neurological outcome are confounded by the ability of providers to directly influence these values. We attempted to account for this by including variables such as fraction of inspired oxygen and number of packed red blood cell transfusions. We also recognize that the neuromonitoring protocol specific to our institution improves our ground truth but limits the model’s generalizability. Notably, a total of 19 patients in our cohort experienced some form of intracerebral hemorrhage or ischemic stroke prior to initiation of ECMO. We do not have data on the severity and timing of these ABIs or on these patients’ pre-ECMO mRS, which could confound their post-ECMO mRS. However, it is unlikely that these previously diagnosed ischemic or hemorrhagic strokes caused significant permanent functional neurologic impairment, as such patients would generally have been deemed ineligible for ECMO initiation.
Furthermore, our model assessed neurological status only at discharge because it is more reliably recorded and can serve as a direct determination of ECMO outcomes. However, future modeling of long-term neurological outcomes after ECMO support should be explored. Lastly, given the small study population limited to a single center, external validation of our model in a large cohort, such as the Extracorporeal Life Support Organization Registry, is warranted; however, this is likely to be difficult because our database has a level of granularity not typically seen in registry data.
Conclusions
In summary, we have developed the first interpretable prognostic models using ML techniques capable of predicting neurological outcomes in patients receiving VA-ECMO support with high discriminatory power and accuracy. This framework may be used to identify the variables before and after cannulation that are important for risk assessment of neurological dysfunction, to help improve patient prognosis and optimize resource allocation. Implementation and external validation of these findings may identify modifiable variables that could improve neurological outcomes in VA-ECMO.
Supplementary Material
The online version contains supplementary material available at https://doi.org/10.1007/s12028-025-02233-0.
Source of support
SMC is supported by the National Heart, Lung, and Blood Institute (1K23HL157610). All other authors report no funding sources.
Funding
National Heart, Lung, and Blood Institute, 1K23HL157610, Sung-Min Cho.
Footnotes
Conflicts of interest
The authors declare no conflict of interest.
Ethical approval/informed consent
IRB00216321, titled: “Retrospective Analysis of Outcomes of Patients on Extracorporeal Membrane Oxygenation” approved on October 22, 2019. Procedures were conducted in line with the ethical standards of the institution and with the Helsinki Declaration of 1975. Study-specific informed consent was waived by the institutional review board.
References
- 1. Rao P, Khalpey Z, Smith R, Burkhoff D, Kociol RD. Venoarterial extracorporeal membrane oxygenation for cardiogenic shock and cardiac arrest. Circ Heart Fail. 2018;11(9):e004905. 10.1161/CIRCHEARTFAILURE.118.004905.
- 2. Zangrillo A, Landoni G, Biondi-Zoccai G, et al. A meta-analysis of complications and mortality of extracorporeal membrane oxygenation. Crit Care Resusc. 2013;15(3):172–8.
- 3. Lorusso R, Barili F, Mauro MD, et al. In-hospital neurologic complications in adult patients undergoing venoarterial extracorporeal membrane oxygenation: results from the extracorporeal life support organization registry. Crit Care Med. 2016;44(10):e964–972. 10.1097/CCM.0000000000001865.
- 4. Cho SM, Canner J, Chiarini G, et al. Modifiable risk factors and mortality from ischemic and hemorrhagic strokes in patients receiving venoarterial extracorporeal membrane oxygenation: results from the extracorporeal life support organization registry. Crit Care Med. 2020;48(10):e897–905. 10.1097/CCM.0000000000004498.
- 5. Schmidt M, Burrell A, Roberts L, et al. Predicting survival after ECMO for refractory cardiogenic shock: the survival after veno-arterial-ECMO (SAVE)-score. Eur Heart J. 2015;36(33):2246–56. 10.1093/eurheartj/ehv194.
- 6. Akin S, Caliskan K, Soliman O, et al. A novel mortality risk score predicting intensive care mortality in cardiogenic shock patients treated with veno-arterial extracorporeal membrane oxygenation. J Crit Care. 2020;55:35–41. 10.1016/j.jcrc.2019.09.017.
- 7. Smith M, Vukomanovic A, Brodie D, Thiagarajan R, Rycus P, Buscher H. Duration of veno-arterial extracorporeal life support (VA ECMO) and outcome: an analysis of the extracorporeal life support organization (ELSO) registry. Crit Care. 2017;21(1):45. 10.1186/s13054-017-1633-1.
- 8. Chen WC, Huang KY, Yao CW, et al. The modified SAVE score: predicting survival using urgent veno-arterial extracorporeal membrane oxygenation within 24 hours of arrival at the emergency department. Crit Care. 2016;20(1):336. 10.1186/s13054-016-1520-1.
- 9. Wengenmayer T, Duerschmied D, Graf E, et al. Development and validation of a prognostic model for survival in patients treated with venoarterial extracorporeal membrane oxygenation: the PREDICT VA-ECMO score. Eur Heart J Acute Cardiovasc Care. 2019;8(4):350–9. 10.1177/2048872618789052.
- 10. Peigh G, Cavarocchi N, Keith SW, Hirose H. Simple new risk score model for adult cardiac extracorporeal membrane oxygenation: simple cardiac ECMO score. J Surg Res. 2015;198(2):273–9. 10.1016/j.jss.2015.04.044.
- 11. Becher PM, Twerenbold R, Schrage B, et al. Risk prediction of in-hospital mortality in patients with venoarterial extracorporeal membrane oxygenation for cardiopulmonary support: the ECMO-ACCEPTS score. J Crit Care. 2020;56:100–5. 10.1016/j.jcrc.2019.12.013.
- 12. Schrutka L, Rohmann F, Binder C, et al. Discriminatory power of scoring systems for outcome prediction in patients with extracorporeal membrane oxygenation following cardiovascular surgery. Eur J Cardio-Thorac Surg. 2019;56(3):534–40. 10.1093/ejcts/ezz040.
- 13. Ayers B, Wood K, Gosev I, Prasad S. Predicting survival after extracorporeal membrane oxygenation by using machine learning. Ann Thorac Surg. 2020;110(4):1193–200. 10.1016/j.athoracsur.2020.03.128.
- 14. Shou BL, Wilcox C, Florissi I, et al. Early low pulse pressure in VA-ECMO is associated with acute brain injury. Neurocrit Care. 2023;38(3):612–21. 10.1007/s12028-022-01607-y.
- 15. Shou BL, Ong CS, Zhou AL, et al. Arterial carbon dioxide and acute brain injury in venoarterial extracorporeal membrane oxygenation. ASAIO J. 2022;68(12):1501–7. 10.1097/MAT.0000000000001699.
- 16. Ryu JA, Cho YH, Sung K, et al. Predictors of neurological outcomes after successful extracorporeal cardiopulmonary resuscitation. BMC Anesthesiol. 2015;15:26. 10.1186/s12871-015-0002-3.
- 17. Al-Kawaz M, Shou B, Prokupets R, Whitman G, Geocadin R, Cho SM. Mild hypothermia and neurologic outcomes in patients undergoing venoarterial extracorporeal membrane oxygenation. J Card Surg. 2022;37(4):825–30. 10.1111/jocs.16308.
- 18. Shou BL, Ong CS, Premraj L, et al. Arterial oxygen and carbon dioxide tension and acute brain injury in extracorporeal cardiopulmonary resuscitation patients: analysis of the extracorporeal life support organization registry. J Heart Lung Transplant. 2023;42(4):503–11. 10.1016/j.healun.2022.10.019.
- 19. Ong CS, Etchill E, Dong J, et al. Neuromonitoring detects brain injury in patients receiving extracorporeal membrane oxygenation support. J Thorac Cardiovasc Surg. 2023;165(6):2104–2110.e1. 10.1016/j.jtcvs.2021.09.063.
- 20. Subasi MM, Subasi E, Anthony M, Hammer PL. A new imputation method for incomplete binary data. Discrete Appl Math. 2011;159(10):1040–7. 10.1016/j.dam.2011.01.024.
- 21. Giordano L, Francavilla A, Bottio T, et al. Predictive models in extracorporeal membrane oxygenation (ECMO): a systematic review. Syst Rev. 2023;12(1):44. 10.1186/s13643-023-02211-7.
- 22. Pladet LCA, Barten JMM, Vernooij LM, et al. Prognostic models for mortality risk in patients requiring ECMO. Intens Care Med. 2023;49(2):131–41. 10.1007/s00134-022-06947-z.
- 23. Goradia S, Sardaneh AA, Narayan SW, Penm J, Patanwala AE. Vasopressor dose equivalence: a scoping review and suggested formula. J Crit Care. 2021;61:233–40. 10.1016/j.jcrc.2020.11.002.
- 24. Cho SM, Farrokh S, Whitman G, Bleck TP, Geocadin RG. Neurocritical care for extracorporeal membrane oxygenation patients. Crit Care Med. 2019;47(12):1773–81. 10.1097/CCM.0000000000004060.
- 25. Lee YI, Ko RE, Yang JH, Cho YH, Ahn J, Ryu JA. Optimal mean arterial pressure for favorable neurological outcomes in survivors after extracorporeal cardiopulmonary resuscitation. J Clin Med. 2022;11(2):290. 10.3390/jcm11020290.
