Author manuscript; available in PMC: 2024 Mar 28.
Published in final edited form as: IEEE/ACM Trans Comput Biol Bioinform. 2022 Dec 8;19(6):3595–3603. doi: 10.1109/TCBB.2021.3122405

OnAI-Comp: An Online AI Experts Competing Framework for Early Sepsis Detection

Anni Zhou 1, Raheem Beyah 2, Rishikesan Kamaleswaran 3
PMCID: PMC10975783  NIHMSID: NIHMS1856430  PMID: 34699366

Abstract

Sepsis remains a major public health concern due to its high mortality, morbidity, and financial cost. Many existing works use machine learning models for early sepsis prediction in order to mitigate its outcomes. In practice, however, the dataset grows dynamically as new patients visit the hospital. Most existing models are "offline" models trained on retrospective observational data and cannot be updated and improved with new data; incorporating the new data requires retraining the model, which is computationally expensive. To address this challenge, we propose an Online Artificial Intelligence Experts Competing Framework (OnAI-Comp) for early sepsis detection based on an online learning algorithm, the multi-armed bandit. We select several machine learning models as the artificial intelligence experts and use average regret to evaluate the performance of our model. The experimental analysis demonstrates that our model converges to the optimal strategy in the long run. Meanwhile, our model can provide clinically interpretable predictions using existing local interpretable model-agnostic explanation technologies, which can aid clinicians in making decisions and might improve the probability of survival.

Keywords: Early Sepsis Detection, Multi-armed Bandit, Online Learning

1. Introduction

1.1. Backgrounds

SEPSIS is a significant cause of mortality and morbidity among critically ill patients admitted to the intensive care unit (ICU) [1], [2]. Sepsis is a "life-threatening organ dysfunction" caused by a dysregulated host response to infection, which can lead to organ failure, tissue damage, or death [3], [4]. A recent epidemiological report suggests that up to a fifth of all global deaths in 2017 were attributed to sepsis, with the heaviest burden falling on low-resourced countries [5]. The majority of the reported deaths were among the immunocompromised, children under the age of 5, and the elderly. Sepsis has long been associated with health disparities, including among racially and socioeconomically disadvantaged populations [6].

In the United States, the Centers for Disease Control and Prevention estimates that at least 1.7 million adults develop sepsis each year, and it is reported that one-third of the people who die in a hospital have sepsis. Moreover, the financial costs caused by sepsis have risen significantly: from 2012 to 2018, the total number of fee-for-service beneficiaries with an inpatient hospital admission (IHA) related to sepsis rose from 811,644 to 1,136,889 [7]. Thus, sepsis remains a common and expensive life-threatening condition with significant mortality for beneficiaries.

1.2. Challenges

Machine learning methods have long been used to improve predictions for sepsis [8], [9]. Indeed, a recent international challenge [4] was conducted to improve modeling strategies and prediction times before clinical recognition of sepsis. A major source of data for sepsis prediction has come from structured data derived from electronic medical records (EMR) [10]–[12], bed-side monitoring [13]–[15], or cytokine/biomarker data [16], [17].

In Ref. [18], S. Nemati et al. proposed an interpretable machine learning model, the Artificial Intelligence Sepsis Expert (AISE), for sepsis prediction in ICUs. In Ref. [19], Q. Mao et al. applied a machine learning-based algorithm using only six vital signs to detect and predict three sepsis-related gold standards; their experiments were based on a mixed-ward retrospective dataset. M. Yang et al. [20], [21] proposed an Explainable Artificial-intelligence Sepsis Predictor (EASP) based on XGBoost learning and Bayesian optimization, which won first place in the PhysioNet/Computing in Cardiology Challenge 2019 [4]. In Ref. [22], T. Vicar et al. addressed the sepsis prediction problem using a recurrent neural network (RNN).

While these approaches incorporate rich clinical and physiological aspects of the disorder, most of these algorithms use retrospective observational data [23]. In other words, the events to be studied have already occurred before data collection starts, so valuable information from new data is neglected. In addition, retraining the model with a new dataset leads to high computational costs. As a result, we need a novel methodology that takes advantage of the new data while avoiding retraining the model. One opportunity may rest in integrating online frameworks, such as the multi-armed bandit (MAB) algorithm, to augment observational learning with interactive and adaptive training.

The MAB problem is a special case of the reinforcement learning (RL) framework and has become a major field of interest in its own right. MAB simplifies RL by ignoring the state and focusing directly on the balance between exploration and exploitation. In general, the goal of RL is to learn a policy that maximizes cumulative reward, although the optimal policy might not yield the best reward in every single instance.

1.3. Multi-armed Bandit Problems

The MAB problem (or Bandit problem) can be defined as [24]: “A sequential allocation problem defined by a set of actions. At each time step, a unit resource is allocated to an action and some observable payoff is obtained. The goal is to maximize the total payoff obtained in a sequence of allocations.” One interesting thing is that the original motivation to study MAB problems came from a clinical application in which one must decide which treatment policy to adopt for each patient [24], [25]. The payoff gained at each round (i.e., time step) is usually defined as reward. At each round, a parameter called regret is used to evaluate the performance of the algorithm. Specifically, regret is defined as the gap between the optimal reward and the actual reward gained.

Based on the assumptions about the reward (i.e., payoff), the fundamental formalization of MAB problems falls into three categories: stochastic [26], adversarial [27], and Markovian [28] MAB. In this paper, we focus on the stochastic MAB problem. In a stochastic MAB setting, each arm is associated with an unknown probability distribution supported on [0,1], and at each round the reward is drawn independently from the distribution associated with the selected arm. The stochastic MAB problem inherently poses an exploration-exploitation dilemma: the decision-maker needs to strike a balance between exploring alternative arms that might yield higher rewards and exploiting the arms that seem to generate the highest rewards based on current empirical knowledge [24], [29], [30]. The Upper Confidence Bound (UCB) strategy [29], [31] is one of the most widely applied solutions to this dilemma, as it handles exploration and exploitation simultaneously.

1.4. Proposed Solutions

While most algorithms use retrospective observational data to construct models, in this work we apply pre-trained models in an online environment to evaluate performance in the context of predicting sepsis. This is important because, in acute and highly dynamic conditions such as sepsis, there may be significant variation in the clinical status of the patient; a static model trained on population-level data might therefore not fully capture the saliency of such a critical and highly complex state. In this work, we use MAB to model the process of a trusted third party selecting among a group of pretrained machine learning (ML) algorithms, which are called AI experts in the framework and correspond to the arms in the MAB setting. The system infrastructure is shown in Fig. 1.

Fig. 1: System Infrastructure.

As shown in Fig. 2, we model the sepsis prediction process as an online expert-competing game. We have a list of candidate artificial intelligence (AI) experts that have been trained in advance. These experts can evaluate the risk of developing sepsis based on the patient's information (i.e., basic demographic information, recorded vital signs, fluid boluses administered, etc.). As mentioned above, the dataset accumulates dynamically as new patients arrive. By applying UCB in the online learning module, we can learn and update "on the fly" using new data while striking an exploration-exploitation balance (explained further with Eq. (4)).

Fig. 2: System Overview.

Our contributions can be summarized as follows:

  • We propose an online learning-based sepsis prediction framework, which can deal with dynamically increasing datasets and strike a balance between exploration and exploitation.

  • Our experimental results show that our proposed methodology converges to the optimal strategy in the long run.

  • Our model is not limited to sepsis prediction tasks and can be extended to other applications as long as all the experts and performance metrics are normalized appropriately.

2. Problem Formulation

For clarity, we first provide a brief introduction to the MAB problem in this section. In the most basic setting, a MAB problem (i.e., the $K$-armed bandit problem) is defined by a series of random variables $R_{i,n}$ ($1 \le i \le K$, $n \ge 1$), in which $i$ is the unique ID of an AI expert² (i.e., an arm) and $n$ is the round index. When arm $i$ is played at successive rounds, it yields rewards $R_{i,1}, R_{i,2}, \ldots, R_{i,T}$, where $T$ is the total number of rounds.

In the basic setting, $R_{i,1}, R_{i,2}, \ldots, R_{i,T}$ are assumed to be independent and identically distributed (i.i.d.) according to an unknown law with unknown expectation $\xi_i$ [29]. However, this is not consistent with the practical scenario, where the rewards gained at rounds $t-1$ and $t$ from the same patient are probably correlated. For example, the cardiac output recorded at the moment is usually correlated with the measurement recorded one hour ago. Therefore, as in [29], we do not require the rewards for each arm to be i.i.d.; instead, the rewards only need to satisfy the following weaker assumption³.

Assumption 1.

For $1 \le i \le K$ and $1 \le t \le T$, we assume

$\mathbb{E}\left[R_{i,t} \mid R_{i,1}, \ldots, R_{i,t-1}\right] = \xi_i \quad \text{and} \quad R_{i,t} \in [0,1].$ (1)

In other words, $R_{i,t}$ and $R_{j,s}$ might be dependent for $1 \le i, j \le K$ and $1 \le s, t \le T$. This very weak assumption is satisfied in most applications. Since there is no collaboration among experts, decisions made by individual experts do not affect the decisions of other experts.

3. Methods

3.1. System Overview

We can view the whole decision process as an AI expert competing game or an AI experts recommendation system. As shown in Fig. 2, the system is composed of the following components:

  • Patient. Suppose that at each round $t$, a patient $P_t$ arrives at the hospital/expert recommendation system, where $t \in \{1, 2, \ldots, T\}$ and each $t$ is associated with a unique timestamp. We view the patient arriving at each round as a unique patient, even if the patient is the same person.

  • AI Expert. An AI expert can be any trusted third party that provides Machine Learning as a Service (MLaaS), aiming to help clinicians analyze EMRs. In this system, there is a group of AI experts $\mathcal{E}_K = \{E_1, E_2, \ldots, E_K\}$, where $K$ is the total number of AI experts. Each AI expert is a well-trained offline model built on the same training dataset.

  • Trusted third party (TTP). A TTP can be a hospital that has hired a group of AI experts or clinicians to analyze EMRs.

In this work, we assume that none of the entities in Fig. 2 is malicious or gives away patients' sensitive information. In a practical scenario, however, other agents in the system might be malicious or might be compromised by attackers, so a TTP is required to provide the input for all records during the training process.

3.2. Online AI Experts Competing Framework (OnAI-Comp)

Reward.

At the end of each round, the system calculates a reward based on the performance of the AI experts, which is evaluated by a utility function. For a specific expert $E_k$, let $r_t$ be the reward calculated at round $t$ and $u_k(t)$ the utility score achieved by expert $E_k$ at round $t$. We calculate $r_t$ as $r_t = \max_{1 \le k \le K} u_k(t)$, i.e., the maximum utility score received by any AI expert at round $t$.

Let $t_{\text{predict}}$ be the round associated with the first hour during which $E_k$ reports sepsis, and $t_{\text{sepsis}}$ the round associated with the first hour during which patient $P_t$ actually develops sepsis. If $t_{\text{predict}} \in [t_{\text{sepsis}} - \Delta_{\text{early}}, t_{\text{sepsis}} + \Delta_{\text{late}}]$, expert $E_k$ qualifies for a positive reward. Herein, we set $\Delta_{\text{early}} = 12$ and $\Delta_{\text{late}} = 3$, following the parameters provided by the PhysioNet sepsis challenge [4], so that we can apply the standard evaluation function provided by PhysioNet. There are four different scenarios: false negative (FN), false positive (FP), true positive (TP), and true negative (TN). Similar to [4], we define $u_k(t)$ as follows:

$u_k(t) = \begin{cases} 0 & \text{FN for sepsis patient } P_t \\ 0 & \text{FP for non-sepsis patient } P_t \\ 1 & \text{TN for non-sepsis patient } P_t \\ \dfrac{|t_{\text{sepsis}} - t_{\text{predict}}|}{\Delta_{\text{early}} + \Delta_{\text{late}}} & \text{TP for sepsis patient } P_t \end{cases}$ (2)
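
To make Eq. (2) concrete, the following minimal Python sketch scores a single expert's prediction; it mirrors the equation as printed above (including the TP case) and uses hypothetical function and argument names rather than the authors' released code.

```python
# Illustrative sketch of the utility score in Eq. (2); names are ours, not from the released code.
DELTA_EARLY, DELTA_LATE = 12, 3   # hours, as in the PhysioNet challenge setup

def utility_score(is_septic, predicted_septic, t_sepsis=None, t_predict=None):
    """Utility u_k(t) of one expert's prediction for patient P_t."""
    if not is_septic:
        return 1.0 if not predicted_septic else 0.0       # TN earns 1, FP earns 0
    if not predicted_septic:
        return 0.0                                         # FN earns 0
    # TP qualifies for a positive reward only inside the prediction window.
    if t_sepsis - DELTA_EARLY <= t_predict <= t_sepsis + DELTA_LATE:
        return abs(t_sepsis - t_predict) / (DELTA_EARLY + DELTA_LATE)
    return 0.0
```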

Regret.

At each round, based on the subsequent feedback from the patients, the system calculates the difference between the optimal strategy and the selected strategy, called regret. In this work, we evaluate performance by the average regret (AR). Specifically, we denote by $\Delta_T$ the AR up to round $T$:

$\Delta_T = \dfrac{1}{T}\sum_{t=1}^{T}\left(r_{\text{optimal}} - r_t\right),$ (3)

where $r_{\text{optimal}}$ is the reward achieved by the expected optimal strategy. In this work, we normalize all the parameters so that $r_{\text{optimal}} = 1$.

Algorithm 1.

OnAI-Comp.

Initialization:
$t \leftarrow 1$;
for $j = 1, 2, \ldots, K$ do
  $\hat{\mu}_j(t) \leftarrow 0$
end for
 1: procedure OnAI-Comp($t$, $P_t$, $K$, $\mathcal{E}_K$)
 2:  for $t = 1, 2, \ldots, n$ do
 3:   Patient data fed to each AI expert;
 4:   All the AI experts provide their predictions;
 5:   for $j = 1, 2, \ldots, K$ do
 6:    $u_j(t) \leftarrow$ Get_Utility($t$, $P_t$, $E_j$)
 7:    Update_hat_mu($t$, $\hat{\mu}_j(t)$, $u_j(t)$)
 8:    Update_UCB($t$, $\alpha$, $\hat{\mu}_j(t)$, $N_j(t)$)
 9:    $j^* \leftarrow \arg\max_m \mathrm{UCB}_m(t)$
 10:    $N_{j^*}(t) \leftarrow N_{j^*}(t-1) + 1$
 11:   end for
 12:   $r_t \leftarrow u_{j^*}(t)$
 13:   $\mathcal{N}(t) \leftarrow \{N_1(t), N_2(t), \ldots, N_K(t)\}$
 14:   $\tilde{\mu}(t) \leftarrow \{\hat{\mu}_1(t), \hat{\mu}_2(t), \ldots, \hat{\mu}_K(t)\}$
 15:   Calculate AR based on Eq. (3)
 16:  end for
 17: end procedure

Upper confidence bound (UCB).

For each round t, we keep a UCB for each expert given by:

$\mathrm{UCB}_j(t) = \hat{\mu}_j(t) + \sqrt{\dfrac{\alpha \log(t)}{N_j(t)}},$ (4)

where $\hat{\mu}_j(t)$ is the empirical average utility score achieved by expert $E_j$, $N_j(t)$ is the number of times expert $E_j$ has been selected up to round $t$, and $\alpha$ is a constant. At each round, the expert with the highest UCB is selected.

The first term in Eq. (4) is the average empirical utility score received by expert $j$ up to round $t$, which evaluates the performance of expert $j$ based on all historical data related to that expert. Thus, the first term drives the exploitation process, in which experts with higher empirical performance (in terms of utility score) are selected.

The second term in Eq. (4) represents the uncertainty due to limited exploration of expert $j$: the less expert $j$ has been selected before round $t$ (i.e., the smaller $N_j(t)$ is), the larger this term and the more likely the expert is to be selected at round $t$. Thus, the second term drives the exploration process, in which experts that have not yet been sufficiently explored are chosen. In this way, the system strikes a balance between exploitation and exploration, and the information from the new data is incorporated into the UCB values.
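
As an illustration of Eq. (4), a minimal NumPy sketch of the UCB computation and the resulting expert selection might look as follows (variable names and example values are ours, not from the released code):

```python
import numpy as np

def ucb_scores(mu_hat, n_sel, t, alpha=2.0):
    """UCB_j(t) = mu_hat_j(t) + sqrt(alpha * log(t) / N_j(t)), as in Eq. (4)."""
    with np.errstate(divide="ignore"):
        bonus = np.sqrt(alpha * np.log(t) / n_sel)   # experts never selected get an infinite bonus
    return mu_hat + bonus

# Example: three experts with their empirical mean utilities and selection counts.
mu_hat = np.array([0.78, 0.85, 0.80])
n_sel = np.array([40.0, 55.0, 5.0])
chosen = int(np.argmax(ucb_scores(mu_hat, n_sel, t=100)))   # index of the selected expert
```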

Algorithm 2.

Get_Utility.

1: function Get_Utility($t$, $P_t$, $E_j$)
2:  Calculate the utility score based on Eq. (2)
3:  return $u_j(t)$
Algorithm 3.

Update_hat_mu.

1: function Update_hat_mu($t$, $\hat{\mu}_j(t)$, $u_j(t)$)
2:  $\hat{\mu}_j(t) \leftarrow \dfrac{(t-1)\,\hat{\mu}_j(t-1) + u_j(t)}{t}$

Workflow.

As shown in Alg. 1, the workflow can be split into the following steps: (1) At each round, the TTP processes the information of the current patient and feeds it to all AI experts. (2) The experts predict the patient's risk of developing sepsis. (3) The TTP selects $E_j$ (i.e., the expert with the highest UCB) and shares the predictions of $E_j$ with clinicians to assist with the final decision. (4) The TTP receives the true labels, calculates the utility scores, and updates all variables (i.e., $\mathcal{N}(t)$, $\tilde{\mu}(t)$, and the UCBs). (5) The TTP calculates the reward and AR, and the process returns to step (1).
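
Putting the pieces together, a compact Python sketch of this workflow (our own simplification of Alg. 1; it assumes each expert exposes a `predict` method and that `get_utility` implements Eq. (2)) is:

```python
import math

def onai_comp(experts, patients, get_utility, alpha=2.0):
    """Sketch of the OnAI-Comp loop following Workflow steps (1)-(5)."""
    K = len(experts)
    mu_hat = [0.0] * K            # empirical mean utility per expert (Alg. 3)
    n_sel = [0] * K               # N_j(t): times expert j has been selected
    cum_regret, avg_regret = 0.0, []
    for t, patient in enumerate(patients, start=1):
        preds = [e.predict(patient) for e in experts]          # steps (1)-(2)
        ucb = [m + math.sqrt(alpha * math.log(t) / n) if n > 0 else float("inf")
               for m, n in zip(mu_hat, n_sel)]                 # Eq. (4), Alg. 4
        j = max(range(K), key=lambda k: ucb[k])                # step (3): selected expert
        n_sel[j] += 1
        # Step (4): once the true outcome is known, score every expert with Eq. (2).
        utils = [get_utility(patient, p) for p in preds]
        mu_hat = [((t - 1) * m + u) / t for m, u in zip(mu_hat, utils)]
        r_t = utils[j]                                         # reward of the chosen expert
        cum_regret += 1.0 - r_t                                # r_optimal is normalized to 1
        avg_regret.append(cum_regret / t)                      # Eq. (3), step (5)
    return avg_regret
```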

4. Regret Analysis

We denote by $\xi_i$ the expected reward received by expert $E_i$, let $\xi^* = \max_{1 \le i \le K} \xi_i$, and let $\alpha$ be the same constant as in Eq. (4). Following steps similar to those in [29], we have:

Lemma 1.

For $1 \le i \le K$, the expected number of times expert $E_i$ is selected up to round $t$ is upper bounded by:

$\mathbb{E}[N_i(t)] \le \dfrac{4\alpha \log(t)}{\delta_i^2} + 1 + \dfrac{\pi^2}{3},$ (5)

where $\delta_i = \xi^* - \xi_i$. Furthermore, we can write the expected cumulative regret after $T$ rounds as:

$\mathbb{E}\left[\sum_{i=1}^{T} \Delta_i\right] = \sum_{j=1}^{K} \delta_j\, \mathbb{E}[N_j(T)].$ (6)

Thus, we have:

Theorem 1 (AR Upper Bound).

For K>1, the expected regret after T rounds of OnAI-Comp is bounded as follows:

$\mathbb{E}\left[\sum_{i=1}^{T} \Delta_i\right] \le \left[4\alpha \sum_{j=1}^{K} \left(\dfrac{\log(T)}{\delta_j}\right)\right] + \left(1 + \dfrac{\pi^2}{3}\right)\left(\sum_{j=1}^{K} \delta_j\right).$ (7)

Theorem 1 shows that the bound on the expected regret depends on $T$ and $\delta_i$. Intuitively, the first term of Eq. (7) is caused by the uncertainty of the exploration process, while the second term is due to regret accumulated in the exploitation process.
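
For intuition, the bound in Eq. (7) can be evaluated numerically; the sketch below uses illustrative values ($\alpha = 2$, one suboptimal expert with gap $\delta = 0.1$), not quantities estimated from the dataset, and only suboptimal experts (those with $\delta_j > 0$) contribute to the sum.

```python
import math

def regret_bound(T, deltas, alpha=2.0):
    """Upper bound on the expected regret after T rounds, following Eq. (7)."""
    gaps = [d for d in deltas if d > 0]               # only suboptimal experts contribute
    return (4 * alpha * sum(math.log(T) / d for d in gaps)
            + (1 + math.pi ** 2 / 3) * sum(gaps))

for T in (100, 1000, 10000):
    print(T, round(regret_bound(T, deltas=[0.0, 0.1]), 1))   # grows only logarithmically in T
```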

Algorithm 4.

Update_UCB.

1: function Update_UCB($t$, $\alpha$, $\hat{\mu}_j(t)$, $N_j(t)$)
2:  $\mathrm{UCB}_j(t) \leftarrow \hat{\mu}_j(t) + \sqrt{\dfrac{\alpha \log(t)}{N_j(t)}}$

5. Experimental Results

In the experiments, we use the 2019 PhysioNet Computing in Cardiology Challenge dataset [4], in which sepsis is defined based on the Sepsis-3 guidelines [3]. It contains information on ICU patients from three hospital systems, where the data of each patient is stored separately in a .psv file. The patient features (e.g., demographics, vital signs, etc.) are recorded on an hourly basis. After a patient leaves the ICU, the system learns whether she/he developed sepsis during the stay (on an hourly basis). The system then calculates the utility score based on Eq. (2), updates the corresponding parameters (i.e., UCB values, etc.), and adapts the strategy before the next patient arrives.

In total, we have 40,336 patient records with more than 40 features. Four models, i.e., XGBoost (XGB) [20], [21], Support Vector Machine (SVC), Random Forest (RF), and Logistic Regression (LR), were trained as the AI experts. For a fair comparison, we also included a random guess (RG) model as a dummy expert. We set $\alpha = 2$ in Eq. (4). We have open-sourced our code.
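
As a rough illustration of how such an expert pool might be assembled with scikit-learn and XGBoost (the hyperparameters, preprocessing, and the random-guess class are placeholders, not the authors' released training code):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from xgboost import XGBClassifier

def build_experts(X_train, y_train):
    """Fit the four offline AI experts on the same training set (illustrative settings)."""
    experts = {
        "XGB": XGBClassifier(n_estimators=200, max_depth=4),
        "SVC": SVC(probability=True),              # probability=True exposes risk scores
        "RF": RandomForestClassifier(n_estimators=200),
        "LR": LogisticRegression(max_iter=1000),
    }
    for model in experts.values():
        model.fit(X_train, y_train)
    return experts

class RandomGuessExpert:
    """Dummy 'RG' expert that flags sepsis with probability 0.5."""
    def predict(self, x):
        return np.random.rand() < 0.5
```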

Regret Analysis.

First, we applied OnAI-Comp using different combinations of AI experts. As shown in Fig. 4 (a) and (b), our proposed OnAI-Comp converges to the expected optimal strategy in the long run. There may be a "cold start" at the beginning, but the model converges quickly. Comparing the results of the expert combinations XGB+RF+SVC and LR+RF+SVC (see Fig. 4 (b)), we can see that the LR+RF+SVC combination has a lower AR. This might happen because XGB [20] (which obtained the highest utility score in the challenge) is better trained than LR (which is taken from the Python library). Thus, when better-trained AI experts are used, it is possible to obtain better results. We also report the average regret and cumulative regret in TABLE 1 and TABLE 2, respectively. As shown in TABLE 1, the model converges within the first 500 time steps regardless of which expert combination is used.

Fig. 4: Average Regret. (a) Average regret gained by different numbers of experts. (b) Average regret gained by different expert combinations.

TABLE 1:

Average Regret of Different Expert Combinations.

Experts round = 10 round = 50 round = 100 round = 200 round = 500 round = 1000 round = 1500 round = 2000

XGB 0.1818 0.2431 0.1941 0.1821 0.1960 0.2080 0.2097 0.2105
RG+LR 0.131 0.123 0.1207 0.094 0.0892 0.0928 0.0915 0.0922
RG+SVC 0.1295 0.1019 0.1193 0.0961 0.0855 0.0931 0.0907 0.0906
RG+XGB 0.1974 0.1607 0.1317 0.1402 0.1633 0.1823 0.1876 0.1916
RG+RF 0.0521 0.1466 0.1003 0.0829 0.0778 0.0847 0.0861 0.0871
XGB+RF 0 0.0902 0.099 0.0826 0.0814 0.0885 0.0883 0.0898
XGB+LR 0 0.0902 0.099 0.0826 0.0814 0.0885 0.0883 0.0898
XGB+SVC 0 0.0902 0.099 0.0826 0.0814 0.0885 0.0881 0.0895
RF+LR 0 0.0980 0.1089 0.0896 0.0878 0.0949 0.0939 0.0945
RF+SVC 0 0.0980 0.1089 0.0896 0.0878 0.0949 0.0938 0.0944
LR+SVC 0 0.0980 0.1089 0.0896 0.0878 0.0949 0.0938 0.0944

RG+XGB+RF 0.1019 0.1654 0.1457 0.1123 0.0936 0.0918 0.089 0.0887
RG+XGB+LR 0.1126 0.1442 0.1227 0.0943 0.0811 0.0881 0.0872 0.0886
RG+XGB+SVC 0.0917 0.0963 0.0984 0.0874 0.0804 0.0869 0.0855 0.0865
RG+RF+SVC 0.1325 0.1443 0.1380 0.1000 0.0885 0.0947 0.0919 0.0915
RG+RF+LR 0.0738 0.0865 0.0965 0.0766 0.0824 0.087 0.0868 0.0886
RG+LR+SVC 0.0914 0.1308 0.1185 0.0891 0.0834 0.0897 0.0896 0.0911
XGB+RF+LR 0 0.1137 0.1149 0.0915 0.0870 0.0931 0.0913 0.0921
XGB+RF+SVC 0 0.1137 0.1149 0.0915 0.0870 0.0931 0.0915 0.0921
XGB+LR+SVC 0 0.1137 0.1149 0.0915 0.0870 0.0931 0.0915 0.0921
RF+LR+SVC 0 0.0980 0.1089 0.0896 0.0878 0.0949 0.0938 0.0944

RG+XGB+RF+SVC 0.1922 0.1819 0.1368 0.1065 0.0938 0.0939 0.0915 0.0918
RG+XGB+RF+LR 0.2026 0.1766 0.1438 0.1149 0.095 0.0965 0.0926 0.0926
RG+XGB+LR+SVC 0.1559 0.1653 0.1374 0.1068 0.0871 0.0923 0.0904 0.0904
RG+RF+LR+SVC 0.0762 0.1426 0.1402 0.1085 0.0955 0.0955 0.0944 0.0940
XGB+RF+LR+SVC 0 0.1176 0.1287 0.0985 0.0894 0.0937 0.0922 0.0927

RG+XGB+RF+LR+SVC 0.0322 0.1217 0.1283 0.1030 0.0898 0.0905 0.0879 0.0884

TABLE 2:

Cumulative Regret of Different Expert Combinations.

Experts round = 10 round = 50 round = 100 round = 200 round = 500 round = 1000 round = 1500 round = 2000

XGB 2.000 12.4000 19.6000 36.6000 98.2000 208.2000 314.8000 421.2000
RG+LR 1.441 6.2746 12.1922 18.8946 44.6685 92.9025 137.2798 184.5152
RG+SVC 1.4247 5.1945 12.0459 19.3167 42.8533 93.1598 136.1482 181.3200
RG+XGB 2.1714 8.1932 13.2985 28.1738 81.7912 182.4437 281.5643 383.4118
RG+RF 0.5727 7.4768 10.1285 16.6660 38.9590 84.7379 129.1850 174.3307
XGB+RF 0 4.6000 10.0000 16.6000 40.8000 88.6000 132.6000 179.6000
XGB+LR 0 4.6000 10.000 16.6000 40.8000 88.6000 132.6000 179.6000
XGB+SVC 0 4.600 10.000 16.6000 40.8000 88.6000 132.2000 179.0000
RF+LR 0 5.0000 11.0000 18.0000 44.0000 95.0000 141.0000 189.0000
RF+SVC 0 5.0000 11.0000 18.0000 44.0000 95.0000 140.8000 188.8000
LR+SVC 0 5.0000 11.0000 18.0000 44.0000 95.0000 140.8000 188.8000

RG+XGB+RF 1.1208 8.4329 14.7169 22.5649 46.9066 91.9052 133.5326 177.4717
RG+XGB+LR 1.2388 7.3562 12.3912 18.9492 40.619 88.1966 130.8845 177.3592
RG+XGB+SVC 1.0092 4.9128 9.9349 17.5769 40.2651 87.0092 128.3862 173.0308
RG+RF+LR 0.8121 4.4132 9.7481 15.3888 41.2942 87.0731 130.2903 177.3375
RG+RF+SVC 1.4572 7.3571 13.9346 20.0921 44.3526 94.7887 138.0153 183.0098
RG+LR+SVC 1.0058 6.6719 11.9658 17.9054 41.7606 89.7950 134.4917 182.2353
XGB+RF+LR 0 5.8000 11.6000 18.4000 43.6000 93.2000 137.0000 184.2000
XGB+RF+SVC 0 5.8000 11.6000 18.4000 43.6000 93.2000 137.4000 184.2000
XGB+LR+SVC 0 5.8000 11.6000 18.4000 43.6000 93.2000 137.4000 184.2000
RF+LR+SVC 0 5.0000 11.0000 18.0000 44.0000 95.0000 140.8000 188.8000

RG+XGB+RF+SVC 2.1145 9.2749 13.8171 21.4056 46.9877 93.9972 137.2963 183.7677
RG+XGB+RF+LR 2.2281 9.0083 14.527 23.0924 47.5925 96.5607 139.0171 185.3276
RG+XGB+LR+SVC 1.7146 8.4278 13.8764 21.4658 43.6148 92.3832 135.6365 180.9075
RG+RF+LR+SVC 0.8380 7.2735 14.1578 21.8025 47.8366 95.6304 141.7602 188.0392
XGB+RF+LR+SVC 0 6.0000 13.0000 19.8000 44.8000 93.8000 138.4000 185.4000

RG+XGB+RF+LR+SVC 0.3542 6.2084 12.9557 20.7061 44.9708 90.5720 131.9569 176.9695

Sometimes, adding an inferior expert (e.g., adding RG to XGB+RF+LR+SVC) can decrease the regret after the model converges (see TABLE 1). This might be because, when there is a poorly behaving expert such as RG, it is easier for the online-learning module to learn which expert is optimal. However, adding RG usually causes a more severe "cold start" problem (see TABLE 1 and 2).

Comparison with the offline model.

In order to evaluate the effectiveness of our online model, we compare OnAI-Comp with the best standalone offline AI expert [20] in our AI expert list. For a fair comparison, we implemented all the experiments in an online environment. As shown in Fig. 3, OnAI-Comp outperforms the standalone XGB model even though OnAI-Comp has only two AI experts, one of which is the dummy RG. This might be because RG promotes the exploration process in the MAB setting. From Fig. 3, we can conclude that our proposed model performs better than existing standalone offline models. Thus, if well-trained ML models with good performance are selected as AI experts, the performance of OnAI-Comp can be guaranteed.

Fig. 3: Comparison with the Offline Model.

Impact of the number of AI Experts.

To evaluate the impact of the number of AI experts on performance, we compared different combinations of AI experts (see Fig. 4 (a)). First, we used two experts (i.e., SVC and RF) as the primary candidates. Then, we increased the number of experts by adding new experts to the candidate list. We observe that adding one more expert decreases the AR and makes it converge faster than the two-expert combination. Even though the difference in AR may seem negligible, such a small improvement, corresponding to a prediction of sepsis several minutes earlier, is crucial for saving the lives of septic patients. However, we should choose experts wisely, because sometimes adding an inferior expert (e.g., RG) might increase the regret (see TABLE 1 and 2).

Model Interpretation.

Using existing technologies, it is not feasible to interpret the overall system; at each execution time, we can only interpret one AI expert. We overcome this challenge by having each AI expert provide the interpretation of its own prediction, thereby allowing the clinician to judge whether the model is reliable based on the interpretation. In this paper, we only interpret RF.

We used LIME [32], which assigns a score evaluating the impact of each feature on the final prediction, to interpret our models⁸. We show the impact of each feature on RF for patient #000967 (septic) and patient #000013 (non-septic) over the whole ICU stay in Fig. 5 (a) and (b), respectively. Specifically, positive impact values bias the predicted label toward "Septic", and negative impact values bias it toward "Non-septic". In Fig. 5 (a), the black vertical line marks the hour the patient developed sepsis, and the blue vertical line marks the hour the septic symptoms ended. As shown in Fig. 5, the impact of each feature changes over time. For example, the impact of ICUOS increases dramatically during the last few hours in the ICU. This observation is consistent with the practical scenario in which the probability of developing sepsis increases the longer the patient stays in the ICU. We also compare the interpretation results of patients #000013 and #000967 within a particular hour in Fig. 6. We observe that Temp, ICUOS, and Resp tend to be essential features for early sepsis prediction.
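
A minimal, self-contained sketch of producing such per-hour feature impacts with the `lime` package is shown below; the synthetic data, feature subset, and model settings are stand-ins for the real hourly records, not the authors' script.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# Tiny synthetic stand-in for the hourly feature matrix (names follow the paper's figures).
rng = np.random.default_rng(0)
feature_names = ["HR", "Temp", "Resp", "SBP", "ICUOS"]
X_train = rng.normal(size=(500, len(feature_names)))
y_train = (rng.random(500) < 0.1).astype(int)        # ~10% positive labels, purely synthetic

rf_model = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)

explainer = LimeTabularExplainer(
    X_train,
    feature_names=feature_names,
    class_names=["Non-septic", "Septic"],
    discretize_continuous=True,
)

# Explain the RF prediction for one hourly record of one patient.
hourly_record = X_train[0]
explanation = explainer.explain_instance(hourly_record, rf_model.predict_proba, num_features=5)
for feature, impact in explanation.as_list():
    print(f"{feature}: {impact:+.3f}")   # positive impacts push the predicted label toward "Septic"
```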

Fig. 5: Impact of Features over Time. (a) Patient #000967 (Septic Patient). The black vertical line marks the hour the patient developed sepsis and the blue vertical line marks the hour of the end of septic symptoms. (b) Patient #000013 (Non-septic Patient).

Fig. 6: Impact of Features within a Particular Hour (RF). (a) Patient #000967 (ICUOS = 36). (b) Patient #000013 (ICUOS = 13).

6. Discussion, Limitations and Conclusion

In this paper, we demonstrate the feasibility of an online learning framework that predicts sepsis using a pool of contextual experts consisting of several pre-trained offline models. The results of this work suggest that not only can the OnAI-Comp framework outperform the top-achieving offline models using MAB, but it can also incorporate dynamic information from the real world, with incremental improvements in predictions and contextual awareness of critically ill patients.

The experiments further demonstrate that the OnAI-Comp framework appropriately incorporates information found in newly arriving data without retraining an offline model. By dynamically optimizing the online model, we can surpass the performance of offline algorithms alone. Moreover, the resulting model can be easily interpreted as long as the AI experts are interpretable. Indeed, these findings also suggest that OnAI-Comp can be developed as an alternative to offline models where performance generalization is not achieved.

There are two limitations of OnAI-Comp. First, the AI experts we use in this work are traditional ML models and serve as a pilot demonstration of feasibility; further improvements may be possible with adaptive learning algorithms. In the future, we plan to incorporate more advanced deep learning RNN-based models such as long short-term memory (LSTM) networks [33]. Second, as shown in Section 5, although the proposed framework converges to the optimal strategy in the long run, it suffers from the cold start problem. There is thus a risk that patients visiting the system during the early rounds might receive poor detection results. A possible solution is to withhold the results from the early stage and use them merely for training the online module; after the framework converges to the optimal strategy, clinicians can take advantage of the online learning module. In conclusion, we demonstrate an online learning-based expert selection framework for early sepsis detection for patients in ICUs. Our model can be extended to other applications as long as the utility score and each expert are normalized appropriately. The experimental results show that our model converges to the optimal strategy in the long run.

Acknowledgments

R. Kamaleswaran is supported by the National Center for Advancing Translational Sciences of the National Institutes of Health under Award Number UL1TR002378. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Biographies

Anni Zhou is a third-year graduate student at the Georgia Institute of Technology. She received her Bachelor's degree in Electrical and Computer Engineering from Huazhong University of Science and Technology in 2018 and joined the Georgia Tech Communications Assurance and Performance (CAP) Group in 2018. Her current interests include severe sepsis detection and security.

Raheem Beyah is the Dean of the College of Engineering at Georgia Tech and the Southern Company Chair. Dr. Beyah received his Bachelor of Science in Electrical Engineering from North Carolina A&T State University in 1998. He received his Master's and Ph.D. in Electrical and Computer Engineering from Georgia Tech in 1999 and 2003, respectively. Dr. Beyah's work is at the intersection of the networking and security fields. He leads the Georgia Tech Communications Assurance and Performance (CAP) Group. The CAP Group develops algorithms that enable a more secure network infrastructure, with computer systems that are more accountable and less vulnerable to attacks. Through experimentation, simulation, and theoretical analysis, CAP provides solutions to current network security problems and to long-range challenges as current networks and threats evolve.

Rishikesan Kamaleswaran is an Assistant Professor at Emory University, Department of Biomedical Informatics, with secondary appointments in Pediatrics and Emergency Medicine. He earned his Ph.D. in Computer Science from the University of Ontario Institute of Technology in Canada. He was a research fellow at the Division of Neonatology and the Department of Critical Care Medicine at the Hospital for Sick Children (Toronto), where he led efforts on the collection and analysis of physiological data in the Neonatal Intensive Care Unit for multiple clinical conditions, including neonatal hypoglycaemia, physiological deterioration, nosocomial infection, and apnoea of prematurity. His current interests include severe sepsis detection and multi-organ dysfunction syndrome.

Footnotes

2.

The expert can also be a human expert or a clinical decision system based on both human and AI input. Without loss of generality, we treat all experts as AI experts throughout this paper.

3.

The assumption $R_{i,t} \in [0,1]$ is easily satisfied by normalization.

8.

XGB was interpreted in [21], so its interpretation is not shown in this work.

Contributor Information

Anni Zhou, Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, 30332.

Raheem Beyah, Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, 30332.

Rishikesan Kamaleswaran, Department of Biomedical Informatics, Emory University, with secondary appointments in Pediatrics and Emergency Medicine.

References

  • [1].Coopersmith CM, De Backer D, Deutschman CS, Ferrer R, Lat I, Machado FR, Martin GS, Martin-Loeches I, Nunnally ME, Antonelli M et al. , “Surviving sepsis campaign: research priorities for sepsis and septic shock,” Intensive care medicine, vol. 44, no. 9, pp. 1400–1426, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [2].Weiss SL, Peters MJ, Alhazzani W, Agus MS, Flori HR, Inwald DP, Nadel S, Schlapbach LJ, Tasker RC, Argent AC et al. , “Surviving sepsis campaign international guidelines for the management of septic shock and sepsis-associated organ dysfunction in children,” Intensive care medicine, vol. 46, no. 1, pp. 10–67, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [3].Singer M, Deutschman CS, Seymour CW, Shankar-Hari M, Annane D, Bauer M, Bellomo R, Bernard GR, Chiche J-D, Coopersmith CM et al. , “The third international consensus definitions for sepsis and septic shock (sepsis-3),” Jama, vol. 315, no. 8, pp. 801–810, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [4].Reyna MA, Josef C, Seyedi S, Jeter R, Shashikumar SP, Westover MB, Sharma A, Nemati S, and Clifford GD, “Early prediction of sepsis from clinical data: the physionet/computing in cardiology challenge 2019,” in 2019 Computing in Cardiology (CinC). IEEE, 2019, pp. Page–1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [5].Rudd KE, Johnson SC, Agesa KM, Shackelford KA, Tsoi D, Kievlan DR, Colombara DV, Ikuta KS, Kissoon N, Finfer S et al. , “Global, regional, and national sepsis incidence and mortality, 1990–2017: analysis for the global burden of disease study,” The Lancet, vol. 395, no. 10219, pp. 200–211, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [6].Esper AM, Moss M, Lewis CA, Nisbet R, Mannino DM, and Martin GS, “The role of infection and comorbidity: Factors that influence disparities in sepsis,” Critical care medicine, vol. 34, no. 10, p. 2576, 2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [7].Buchman TG, Simpson SQ, Sciarretta KL, Finne KP, Sowers N, Collier M, Chavan S, Oke I, Pennini ME, Santhosh A et al. , “Sepsis among medicare beneficiaries: 1. the burdens of sepsis, 2012–2018,” Critical care medicine, vol. 48, no. 3, p. 276, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [8].Barriere SL and Lowry SF, “An overview of mortality risk prediction in sepsis,” Critical care medicine, vol. 23, no. 2, pp. 376–393, 1995. [DOI] [PubMed] [Google Scholar]
  • [9].Fleuren LM, Klausch TL, Zwager CL, Schoonmade LJ, Guo T, Roggeveen LF, Swart EL, Girbes AR, Thoral P, Ercole A et al. , “Machine learning for the prediction of sepsis: a systematic review and meta-analysis of diagnostic test accuracy,” Intensive care medicine, vol. 46, no. 3, pp. 383–400, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [10].Giacobbe DR, Signori A, Del Puente F, Mora S, Carmisciano L, Briano F, Vena A, Ball L, Robba C, Pelosi P et al. , “Early detection of sepsis with machine learning techniques: a brief clinical perspective,” Frontiers in medicine, vol. 8, 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [11].Despins LA, “Automated detection of sepsis using electronic medical record data: a systematic review,” The Journal for Healthcare Quality (JHQ), vol. 39, no. 6, pp. 322–333, 2017. [DOI] [PubMed] [Google Scholar]
  • [12].Schinkel M, Paranjape K, Panday RN, Skyttberg N, and Nanayakkara PW, “Clinical applications of artificial intelligence in sepsis: A narrative review,” Computers in biology and medicine, vol. 115, p. 103488, 2019. [DOI] [PubMed] [Google Scholar]
  • [13].Mohammed A, Van Wyk F, Chinthala LK, Khojandi A, Davis RL, Coopersmith CM, and Kamaleswaran R, “Temporal differential expression of physiomarkers predicts sepsis in critically ill adults.” Shock (Augusta, Ga.), 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [14].van Wyk F, Khojandi A, Mohammed A, Begoli E, Davis RL, and Kamaleswaran R, “A minimal set of physiomarkers in continuous high frequency data streams predict adult sepsis onset earlier,” International journal of medical informatics, vol. 122, pp. 55–62, 2019. [DOI] [PubMed] [Google Scholar]
  • [15].Kamaleswaran R, Akbilgic O, Hallman MA, West AN, Davis RL, and Shah SH, “Applying artificial intelligence to identify physiomarkers predicting severe sepsis in the picu,” Pediatric Critical Care Medicine— Society of Critical Care Medicine, vol. 19, no. 10, pp. e495–e503, 2018. [DOI] [PubMed] [Google Scholar]
  • [16].Banerjee S, Mohammed A, Wong HR, Palaniyar N, and Kamaleswaran R, “Machine learning identifies complicated sepsis course and subsequent mortality based on 20 genes in peripheral blood immune cells at 24 h post-icu admission,” Frontiers in immunology, vol. 12, p. 361, 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [17].Larsen FF and Petersen JA, “Novel biomarkers for sepsis: A narrative review,” European journal of internal medicine, vol. 45, pp. 46–50, 2017. [DOI] [PubMed] [Google Scholar]
  • [18].Nemati S, Holder A, Razmi F, Stanley MD, Clifford GD, and Buchman TG, “An interpretable machine learning model for accurate prediction of sepsis in the icu,” Critical care medicine, vol. 46, no. 4, p. 547, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [19].Mao Q, Jay M, Hoffman JL, Calvert J, Barton C, Shimabukuro D, Shieh L, Chettipally U, Fletcher G, Kerem Y et al. , “Multicentre validation of a sepsis prediction algorithm using only vital sign data in the emergency department, general ward and icu,” BMJ open, vol. 8, no. 1, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [20].Yang M, Wang X, Gao H, Li Y, Liu X, Li J, and Liu C, “Early prediction of sepsis using multi-feature fusion based xgboost learning and bayesian optimization,” in The IEEE Conference on Computing in Cardiology (CinC), vol. 46, 2019, pp. 1–4. [Google Scholar]
  • [21].Yang M, Liu C, Wang X, Li Y, Gao H, Liu X, and Li J, “An explainable artificial intelligence predictor for early detection of sepsis,” Critical Care Medicine, vol. 48, no. 11, pp. e1091–e1096, 2020. [DOI] [PubMed] [Google Scholar]
  • [22].Vicar T, Novotna P, Hejc J, Ronzhina M, and Smisek R, “Sepsis detection in sparse clinical data using long short-term memory network with dice loss,” in 2019 Computing in Cardiology (CinC). IEEE, 2019, pp. Page–1. [Google Scholar]
  • [23].Bica I, Alaa AM, Lambert C, and van der Schaar M, “From real-world patient data to individualized treatment effects using machine learning: current and future methods to address underlying challenges,” Clinical Pharmacology & Therapeutics, vol. 109, no. 1, pp. 87–100, 2021. [DOI] [PubMed] [Google Scholar]
  • [24].Bubeck S and Cesa-Bianchi N, “Regret analysis of stochastic and nonstochastic multi-armed bandit problems,” arXiv preprint arXiv:1204.5721, 2012. [Google Scholar]
  • [25].Thompson WR, “On the likelihood that one unknown probability exceeds another in view of the evidence of two samples,” Biometrika, vol. 25, no. 3/4, pp. 285–294, 1933. [Google Scholar]
  • [26].Garivier A and Cappé O, “The kl-ucb algorithm for bounded stochastic bandits and beyond,” in Proceedings of the 24th annual conference on learning theory. JMLR Workshop and Conference Proceedings, 2011, pp. 359–376. [Google Scholar]
  • [27].Uchiya T, Nakamura A, and Kudo M, “Algorithms for adversarial bandit problems with multiple plays,” in International Conference on Algorithmic Learning Theory. Springer, 2010, pp. 375–389. [Google Scholar]
  • [28].Tekin C and Liu M, “Online algorithms for the multi-armed bandit problem with markovian rewards,” in 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 2010, pp. 1675–1682. [Google Scholar]
  • [29].Auer P, Cesa-Bianchi N, and Fischer P, “Finite-time analysis of the multiarmed bandit problem,” Machine learning, vol. 47, no. 2, pp. 235–256, 2002. [Google Scholar]
  • [30].Macready WG and Wolpert DH, “Bandit problems and the exploration/exploitation tradeoff,” IEEE Transactions on evolutionary computation, vol. 2, no. 1, pp. 2–22, 1998. [Google Scholar]
  • [31].Auer P and Ortner R, “Ucb revisited: Improved regret bounds for the stochastic multi-armed bandit problem,” Periodica Mathematica Hungarica, vol. 61, no. 1–2, pp. 55–65, 2010. [Google Scholar]
  • [32].Ribeiro MT, Singh S, and Guestrin C, "Why should I trust you?: Explaining the predictions of any classifier," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13–17, 2016, 2016, pp. 1135–1144. [Google Scholar]
  • [33].Hochreiter S and Schmidhuber J, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997. [DOI] [PubMed] [Google Scholar]
