Journal of Nursing Scholarship. 2024 Sep 11;57(1):105–118. doi: 10.1111/jnu.13024

The effects of applying artificial intelligence to triage in the emergency department: A systematic review of prospective studies

Nayeon Yi 1, Dain Baik 1, Gumhee Baek 2
PMCID: PMC11771688  PMID: 39262027

Abstract

Introduction

Accurate and rapid triage can reduce undertriage and overtriage, which may improve emergency department flow. This study aimed to identify the effects of applying artificial intelligence‐based triage in the clinical field, as reported in prospective studies.

Design

Systematic review of prospective studies.

Methods

CINAHL, Cochrane, Embase, PubMed, ProQuest, KISS, and RISS were searched from March 9 to April 18, 2023. All the data were screened independently by three researchers. The review included prospective studies that measured outcomes related to AI‐based triage. Three researchers extracted data and independently assessed the study's quality using the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) protocol.

Results

Of 1633 studies, seven met the inclusion criteria for this review. Most studies applied machine learning to triage, and only one was based on fuzzy logic. All studies, except one, utilized a five‐level triage classification system. Regarding model performance, the feed‐forward neural network achieved a precision of 33% in the level 1 classification, whereas the Fuzzy CLIPS model achieved a specificity and sensitivity of 99%. The accuracy of the models' triage predictions ranged from 80.5% to 99.1%. Other outcomes included time reduction, overtriage and undertriage checks, mistriage factors, and patient care and prognosis outcomes.

Conclusion

Triage nurses in the emergency department can use artificial intelligence as a supportive means for triage. Ultimately, we hope this review will serve as a resource that can reduce undertriage and positively affect patient health.

Protocol Registration

We have registered our review in PROSPERO (registration number: CRD 42023415232).

Keywords: artificial intelligence; decision support systems, clinical; emergency service, hospital; nurses; triage

INTRODUCTION

Overcrowding in emergency departments (EDs) and delays in the length of stay increase mortality rates of critically ill patients and occurrence rates of cardiac arrest, as well as decrease patient satisfaction (Kim et al., 2020; Sabaz et al., 2020; Walsh & Knott, 2010). This is also a negative factor not only for patients but also for healthcare professionals, leading to noncompliance with clinical guidelines, increased stress, and exposure to violence throughout emergency care (Morley et al., 2018). Consequently, it is essential to prioritize patients who require care through emergency patient triage and securely manage lower‐priority patients, thereby controlling the flow of patients within EDs (Park et al., 2017). Triage tools, developed and used worldwide, assess the severity and urgency of emergency patients to determine their priority, as relying solely on the experience and knowledge of emergency healthcare professionals can lead to inconsistencies. Despite the development of triage tools, improper classification and undertriaging of emergency patients as non‐emergency patients can lead to their condition deteriorating while they await treatment. However, overtriaging non‐emergency patients as emergency patients could lead to the misallocation of potential emergency resources needed for other patients who require immediate care (Silva et al., 2017). Hence, accurate assessment and triage are crucial for ensuring patient safety and saving lives.

Emergency patient triage is typically performed by trained healthcare professionals who evaluate patients, make judgments regarding priorities, and often record information manually in an electronic medical record system. In most healthcare settings (Roquette et al., 2020), this responsibility falls upon ED nurses (Dulandas & Brysiewicz, 2018). However, the manual input method for patient triage by nurses can be hindered by factors such as the pressure to deal with many patients, the chaotic environment within EDs, and the presence of waiting ambulances, which can disrupt the decision‐making process (Reay et al., 2020). It is also influenced by nurses’ triage competency, including their training, knowledge, clinical experience, and communication skills (Hwang & Shin, 2022; Moon, 2021). In addition, this manual input method is susceptible to subjective judgments by nurses, which can decrease the accuracy of triage and raise concerns regarding patient safety (Lee et al., 2019). As a result, there is an emphasis on the need for efficient triage systems that can support ED nurses in their triage roles, helping and facilitating timely and informed decision‐making (Fernandes et al., 2020; Reay et al., 2020).

Artificial intelligence (AI) aims to create systems with human‐like intelligence capable of learning and reasoning (Clancy, 2020). Machine learning (ML), a type of AI, comprises a series of algorithms that enable computers to continuously learn from data and make predictions or classify certain data. It is categorized into supervised learning, unsupervised learning, and reinforcement learning. Supervised ML is defined by using labeled data to learn mappings between input and outcome variables of interest. On the contrary, unsupervised learning refers to techniques that use unlabeled data to find naturally occurring groups or clusters. Reinforcement learning uses rewards or penalties as feedback based on the actions the agent performs, and over time, the agent learns a sequence of actions that maximizes the reward (Helm et al., 2020). Deep learning is a subfield of ML that has gained enormous popularity in the healthcare field over the past few years due to its success in various complex classification tasks (Esteva et al., 2021). ML includes various algorithms such as clustering, decision trees, deep learning, logistic regression, naive Bayes, natural language processing, neural networks, and random forests (Shafaf & Malek, 2019). Other types of AI include fuzzy logic and computer vision (Esteva et al., 2021; Zadeh, 1965). Advancements in such technologies have shown that AI‐based predictive models for emergency patient triage not only reduce the time required for manual input but also minimize the potential for errors resulting from subjective judgments by nurses (Raita et al., 2019). Consequently, AI‐based triage enables swift triage of variables and accurate detection of urgency levels, thereby improving the prediction of clinical outcomes compared to conventional manual methods (Sánchez‐Salmerón et al., 2022).
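
To make the distinction between these learning paradigms concrete, the following minimal Python sketch (using scikit-learn; the toy vital-sign values, feature names, and labels are hypothetical illustrations, not data from the reviewed studies) contrasts supervised classification of labeled records with unsupervised clustering of the same unlabeled records.

```python
# Minimal sketch, assuming scikit-learn is available; all data values are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans

# Hypothetical features per patient: [heart rate, systolic BP, oxygen saturation]
X = np.array([
    [130, 85, 90],   # unstable vital signs
    [88, 120, 98],   # stable vital signs
    [122, 90, 92],
    [76, 118, 99],
])

# Supervised learning: labels (e.g., urgent = 1, non-urgent = 0) guide the input-output mapping.
y = np.array([1, 0, 1, 0])
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
print(clf.predict([[125, 88, 91]]))  # predicted urgency for a new, unseen patient

# Unsupervised learning: no labels; the algorithm finds naturally occurring groups on its own.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(clusters)  # cluster membership discovered from the data alone
```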

Previous researchers have mostly used retrospective research techniques to verify prediction models or improve algorithms using previously acquired data (Miles et al., 2020; Sánchez‐Salmerón et al., 2022). EDs are dynamic environments and not stable settings, and triage of emergency patients is a complex task that involves evaluating multiple factors, including patient health data, resource availability, and interaction with patients. Hence, assessing the impact of AI triage, such as its accuracy, is a complex issue. However, the use of vignette verification with retrospective data may restrict the consideration of several variables mentioned earlier, which may disrupt the practical implementation of AI triage.

Therefore, the purpose of this study was to evaluate the efficacy of AI‐based emergency patient triage in real‐world clinical settings by carefully reviewing prospective research that makes use of AI‐based triage in actual clinical situations to improve triage accuracy and explore methods to assist ED nurses with triage tasks. The specific aims of the study were as follows: (1) Identify the intervention contents of AI‐based emergency patient triage systems; (2) Investigate the intervention effect of AI‐based emergency patient triage systems.

DESIGN

This systematic literature review aimed to select and synthesize relevant national and international studies that utilized prospective design to explore the efficacy of AI‐based emergency patient triage systems. The objective of this study was to conduct a comprehensive literature review and provide an overview of the topic. The protocol for the systematic review was registered in the International Prospective Register of Systematic Reviews (PROSPERO) before this study was conducted (registration number: CRD 42023415232).

METHODS

Inclusion and/or exclusion criteria

Inclusion criteria

According to the Preferred Reporting Items for Systematic Review and Meta‐Analysis (PRISMA) guidelines, the literature selection criteria for this study were defined using the Participants, Intervention, Comparison, Outcome, Study Design (PICO‐SD) format (Moher et al., 2009; Saaiq & Ashraf, 2017).

  • Patient/Population/Problem: Patients presenting to the ED.

  • Intervention: AI‐based emergency patient triage system.

  • Comparison: No limit.

  • Outcome: No limit.

  • Study design: All prospective experimental studies.

  • There were no limits to the study period.

Exclusion criteria

  • Studies conducted in settings other than EDs, such as pre‐arrival, intensive care units, and hospital wards.

  • Retrospective study design, academic conference papers, and study protocols.

  • Studies without the application of AI in triage.

  • Studies that did not apply an emergency patient triage system, such as those classifying severe coronavirus disease‐19 (COVID‐19) patients.

  • Studies that were not accessible in the full text.

  • Studies published in languages other than Korean or English.

Study search and selection

Study search

From March 9 to April 18, 2023, three researchers conducted the literature search and selection for this study. The search did not restrict the publication period, and studies published in academic journals up to February 2023 were included. Conference abstracts and research protocols were not included because they could not be assessed as part of the study's quality appraisal. The database search covered five international databases (CINAHL, Cochrane, Embase, PubMed, and ProQuest) and two Korean databases (KISS and RISS). Based on the selection and exclusion criteria, three researchers independently searched the seven databases for relevant studies. To conduct an extensive literature search, the study participants and research interventions of the PICO‐SD served as the search criteria. The search used MEDLINE's default search terms and the Boolean operators “AND” and “OR,” without controlled vocabulary. Truncation was used to broaden the search by capturing multiple word forms and alternative keyword endings. Only the title and abstract fields were searched, to exclude irrelevant information such as author, table of contents, date, and department. Three researchers initially devised the search strategy, which was then reviewed and refined by a medical library information specialist affiliated with one of the researchers' institutions. The following search terms were used in the international databases: [“emergency department” OR “emergency room” OR “emergency medical system” OR “emergency unit” OR “emergency care”] AND [“triage” OR “classif*” OR “CTAS” OR “ESI” OR “KTAS”] AND [“artificial intelligence” OR “AI” OR “ML” OR “deep learning” OR “neural network” OR “reinforcement learning”]. In the Korean databases, the following Korean search terms were used: [“응급실”] AND [“분류”] AND [“인공지능”]. A detailed description of the specific search terms used for each database queried during the systematic review process can be found in the supplemental file (File S1).

Study selection

Literature selection was conducted following the PRISMA reporting guidelines for systematic reviews (Moher et al., 2009). Retrieved records were exported from EndNote version X9 and uploaded to the literature review software Covidence (https://covidence.org) for further analysis. Covidence is a well‐known tool for efficient collaboration among researchers and for screening titles, abstracts, and full texts (Babineau, 2014). Seven studies were selected for the systematic literature review through a two‐step procedure. In the initial phase, seven national and international databases were searched, yielding 1633 studies. After eliminating 494 duplicates, the titles and abstracts of 1139 publications were evaluated against the inclusion and exclusion criteria. A total of 1109 studies were excluded because they did not meet the selection criteria, and 30 relevant studies were retained based on the research objectives. Of these, 18 studies with retrospective designs, four studies unrelated to AI‐based triage interventions, and studies published in languages other than English, with inaccessible full texts, or that were duplicates were excluded, leaving five studies for inclusion. Second, a manual search was conducted to identify additional pertinent literature by examining the reference lists of the retrieved studies; 11 studies were identified in this way. After evaluating their full texts, five studies with retrospective designs and four unrelated to AI‐based triage interventions were excluded, resulting in the inclusion of two additional studies. Finally, seven studies were selected for quality evaluation in the systematic review (Figure 1). Three researchers independently evaluated and selected the relevant literature. After collectively reviewing the complete texts, instances of disagreement were discussed among the researchers until a consensus was reached. The final decision on the inclusion or exclusion of a study was made based on the unanimous agreement of all three researchers.

FIGURE 1. Flow diagram corresponding to selection of the studies in the PRISMA 2020 format.

Quality appraisal

The final seven studies were prospective cohort studies. The quality of the studies was appraised using the STROBE checklist, the reporting guideline for observational studies, 4th edition (von Elm et al., 2008). The STROBE inventory consists of six domains with a total of 22 items: title and abstract, introduction, methods, results, discussion, and other information. Each item is evaluated as “yes,” “no,” or “not applicable.” In this study, the three researchers independently assessed the quality of the seven selected studies and, in cases of disagreement, reached a consensus through discussion. Each study's evaluation was converted into a Completeness of Reporting (COR) score. The COR (%) was calculated using the formula (yes/[yes + no]) × 100. Díaz Planelles et al. (2023) categorized COR scores as “low” for scores between 0% and 49%, “moderate” for scores between 50% and 74%, and “high” for scores of 75% or higher. In addition, the STROBE results for all the selected studies were summarized to assess the extent of reporting for each specific item in terms of frequency and percentage.
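
As a concrete illustration of the scoring just described, the short Python sketch below (our own illustration; the function names and example ratings are hypothetical, not taken from the review) converts a set of STROBE item ratings into a COR percentage and a quality category.

```python
# Minimal sketch of the COR calculation described above; the item ratings are illustrative only.
def cor_score(ratings):
    """Compute COR (%) = yes / (yes + no) * 100, ignoring 'not applicable' items."""
    yes = ratings.count("yes")
    no = ratings.count("no")
    return yes / (yes + no) * 100

def quality(cor):
    """Categorize per Diaz Planelles et al. (2023): low <50%, moderate 50-74%, high >=75%."""
    if cor < 50:
        return "low"
    if cor < 75:
        return "moderate"
    return "high"

# Hypothetical example: 17 of the 22 STROBE items rated 'yes', 5 rated 'no'.
ratings = ["yes"] * 17 + ["no"] * 5
cor = cor_score(ratings)
print(f"COR = {cor:.0f}%, quality = {quality(cor)}")  # COR = 77%, quality = high
```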

Data abstraction

For the systematic review, we recorded the general characteristics of the literature (first author and publication year, country, study design, clinical setting, characteristics of the study population and sample size, triage scale, applied AI programs, comparison groups, and key findings), the characteristics of AI‐based programs for triage (type of triage level, delivery methods of AI, types and techniques used for AI, use of retrospective data, model performance, and prediction accuracy of triage), and the effectiveness of AI‐based triage (model performance, prediction accuracy, and other outcomes) using Microsoft Excel 2019. To ensure the veracity of the data, three researchers independently analyzed the seven final studies. In the event of disagreement, the studies were re‐evaluated, and a secondary review was conducted for the three researchers to reach a consensus on the final decision.

RESULTS

General characteristics

From October 2017 to January 2023, we analyzed seven studies that applied AI‐based triage. Table 1 presents the main characteristics of the analyzed literature. All seven studies used prospective cohort designs (Cho et al., 2022; Cotte et al., 2022; Farahmand et al., 2017; Karlafti et al., 2023; Kipourgos et al., 2022; Leung et al., 2021; Liu et al., 2021). Two studies were performed in Greece (Karlafti et al., 2023; Kipourgos et al., 2022), and one study each was conducted in South Korea, Germany, Iran, Taiwan, and China.

TABLE 1.

Summary of the Studies.

A1. Cho et al. (2022), Republic of Korea. Design: prospective cohort. Setting: Level 1 ED of a tertiary hospital. Participants/sample size: ED patients (n = 1063). Triage scale: KTAS. AI intervention: real‐time medical record input assistance system with voice artificial intelligence (RMIS‐AI). Comparison: manual KTAS by ED nurses. Main outcomes: time to perform the triage task (RMIS‐AI 204 s [IQR 155, 277] vs. manual record 231 s [IQR 180, 313]); record completion rate (first chief concern 81.84%, the highest value in the RMIS‐AI record); accuracy of records reproduced by RMIS‐AI (SBP, DBP, SpO2, and chief‐concern variables ≥50%; categorical variables less accurate than continuous variables).

A2. Cotte et al. (2022), Germany. Design: prospective cohort. Setting: ED of a university hospital. Participants/sample size: ED patients (n = 378); ≥18 years; German‐speaking; walk‐in; MTS levels 3, 4, and 5. Triage scale: MTS. AI intervention: symptom assessment app (SAA) Ada. Comparison: manual MTS by ED nurses. Main outcomes: urgency assessment results (app urgency assessments matched the MTS in 33.9% of cases; overtriage 57.1%; undertriage 8.9%); agreement between the MTS and the app's advice level (Cohen's kappa coefficient 0.033, 95% CI −0.023 to 0.089).

A3. Farahmand et al. (2017), Iran. Design: prospective cohort. Setting: ED of a complex hospital. Participants/sample size: ED patients (n = 215); ≥18 years; acute abdominal pain. Triage scale: ESI. AI intervention: web‐based interface (view layer). Comparison: manual ESI by an emergency medicine physician. Main outcomes: first‐generation models (levels 1 and 5 omitted): at level 2, all systems (AR, CL, LR, DT, NB, NN) showed a fair level of prediction, with the neural network being the highest (ROC 0.769); at level 3, all systems showed a fair level of prediction; at level 4, the decision tree was the only system with fair prediction (ROC 0.713); among ensemble models for ESI level 4, the ensemble with the naïve Bayes algorithm had the highest overall accuracy (ROC 0.839).

A4. Karlafti et al. (2023), Greece. Design: prospective cohort. Setting: ED of a tertiary university hospital. Participants/sample size: ED patients (n = 322); ≥16 years; COVID‐19 patients excluded. Triage scale: ESI. AI intervention: automatic patient screening classifier (ANN). Comparison: manual ESI (assessor not specified). Main outcomes: the ANN was trained for an average of 21 epochs across all runs; performance of the trained model: overall accuracy 84.6%; F1 score (a more appropriate metric for unbalanced datasets) 72.2%.

A5. Kipourgos et al. (2022), Greece. Design: prospective cohort. Setting: ED of a tertiary university hospital. Participants/sample size: ED patients (n = 616). Triage scale: ESI. AI intervention: Intelligent‐Triage (i‐TRIAGE). Comparison: manual ESI by an expert triage nurse. Main outcomes: efficiency of the system (WEKA), ESI level accuracy: i‐TRIAGE_rt (resuscitation team system) 95%, i‐TRIAGE_derm (dermatological system) 72%; rules for exporting output classes (Fuzzy CLIPS): accuracy 0.99, precision 0.93, sensitivity 0.99, specificity 0.99.

A6. Leung et al. (2021), Taiwan. Design: prospective cohort. Setting: ED of a tertiary academic medical center. Participants/sample size: ED patients (n = 146); ≥20 years. Triage scale: TTAS. AI intervention: TabNet with a deep learning architecture. Comparison: retrospective data. Main outcomes: performance of the trained model: AUC‐ROC 0.836; mean accuracy 0.805.

A7. Liu et al. (2021), China. Design: prospective cohort. Setting: ED of a tertiary urban teaching hospital. Participants/sample size: ED patients (n = 17,072). Triage scale: ETS. AI intervention: machine learning system (MLS) protocol. Comparison: retrospective data. Main outcomes: MLS protocol model performance: AUC 0.875 ± 0.006 (95% CI); life‐threatening mistriage rate: control group 1.2% (110 of 8839), intervention group 0.9% (72 of 7936); variables associated with mistriage: arrival mode, arrival time, age, sex, heart rate, blood pressure, and oxygen saturation; the shock index showed higher sensitivity in older patients, whereas pulse pressure showed higher sensitivity in younger patients.

Abbreviations: AI, artificial intelligence; ANN, artificial neural network; AR, association rules; AUC, area under the curve; AUC‐ROC, area under the receiver operating characteristic curve; CI, confidence interval; CL, clustering; DBP, diastolic blood pressure; DT, decision tree; ED, emergency department; ESI, Emergency Severity Index; ETS, Emergency Triage Scale; ICU, intensive care unit; IQR, interquartile range; LR, logistic regression; MTS, Manchester Triage Scale; N/A, not applicable; NB, naïve Bayes; NN, neural network; ROC, receiver operating characteristic curve; SBP, systolic blood pressure; SpO2, oxygen saturation; WEKA, Waikato Environment for Knowledge Analysis.

All studies examined patients who visited university hospitals or tertiary care EDs, with sample sizes ranging from 146 to 17,072 individuals. Four of the seven studies (Cotte et al., 2022; Farahmand et al., 2017; Karlafti et al., 2023; Leung et al., 2021) imposed a minimum age restriction (16 to 20 years, depending on the study), whereas the remaining three studies (Cho et al., 2022; Kipourgos et al., 2022; Liu et al., 2021) did not. Regarding the selection criteria for study participants, two studies did not include all triage levels (Cotte et al., 2022; Farahmand et al., 2017), excluding levels 1 and 2 (Cotte et al., 2022) or levels 1 and 5 (Farahmand et al., 2017) owing to small sample sizes. In addition, one study (Farahmand et al., 2017) focused on a specific condition, acute abdominal pain, and another study (Karlafti et al., 2023) excluded all COVID‐19‐related patients.

Six of the seven included studies adopted a 5‐level triage scale. The Emergency Severity Index (ESI), developed in the United States, was used in three studies (Farahmand et al., 2017; Karlafti et al., 2023; Kipourgos et al., 2022). Other scales, each used in one study, included the Manchester Triage Scale (MTS), developed in the United Kingdom (Cotte et al., 2022); the Korean Triage and Acuity Scale (KTAS), developed in South Korea (Cho et al., 2022); and the Taiwan Triage and Acuity Scale (TTAS), developed in Taiwan (Leung et al., 2021). The remaining study used the 4‐level Emergency Triage Scale (ETS) developed in China (Liu et al., 2021).

Quality appraisal

Table 2 displays the outcomes of the quality assessment of the seven prospective cohort studies using the STROBE checklist. One study (Kipourgos et al., 2022) was rated “high” with a COR score of 77%, and the remaining six studies were also rated “high,” with COR scores ranging from 86% to 100% (Cho et al., 2022; Cotte et al., 2022; Farahmand et al., 2017; Karlafti et al., 2023; Leung et al., 2021; Liu et al., 2021). Table 3 displays the outcomes of the itemized analysis based on the STROBE guidelines. Fourteen of the 22 analyzed items (background and rationale, study design, setting, participants in the methods, variables, data sources and measurements, quantitative variables, descriptive data, outcome data, main results, key results, limitations, interpretation, and generalizability) were reported in full in all seven studies (100%). Three items (bias, participants in the results, and funding) were described well in six studies (85.7%), and four items (title and abstract, objectives, statistical methods, and other analyses) were described well in five studies (71.4%); in total, 21 of the 22 items (95.5%) were described relatively well. Only four studies (57.1%) explained how the study size was determined, making this the most inadequately reported item in the examined studies.

TABLE 2.

Critical appraisal with STROBE (strengthening the reporting of observational studies in epidemiology).

No. First author (year) COR/Quality
A1 Cho et al. (2022) 21/22 (95%), High
A2 Cotte et al. (2022) 22/22 (100%), High
A3 Farahmand et al. (2017) 20/22 (91%), High
A4 Karlafti et al. (2023) 19/22 (86%), High
A5 Kipourgos et al. (2022) 17/22 (77%), High
A6 Leung et al. (2021) 20/22 (91%), High
A7 Liu et al. (2021) 21/22 (95%), High

Note: The COR column indicates the number of STROBE items reported in the article relative to the total of 22 STROBE items (item‐by‐item reporting is summarized in Table 3). The quality of studies was rated according to the Completeness of Reporting (COR) score: “low” (COR 0%–49%), “moderate” (COR 50%–74%), and “high” (COR ≥75%).

TABLE 3.

Critical appraisal by detailed STROBE items (N = 7).

STROBE item No. Scale item N %
1 Title and Abstract 5 71.4%
2 Introduction Background/Rationale 7 100%
3 Objectives 5 71.4%
4 Methods Study Design 7 100%
5 Setting 7 100%
6 Participants 7 100%
7 Variables 7 100%
8 Data Sources/Measurement 7 100%
9 Bias 6 85.7%
10 Study size 4 57.1%
11 Quantitative variables 7 100%
12 Statistical methods 5 71.4%
13 Results Participants 6 85.7%
14 Descriptive data 7 100%
15 Outcome data 7 100%
16 Main results 7 100%
17 Other analyses 5 71.4%
18 Discussion Key results 7 100%
19 Limitations 7 100%
20 Interpretation 7 100%
21 Generalizability 7 100%
22 Other Information Funding 6 85.7%

Characteristics of AI‐based triage

Table 4 presents the characteristics of AI‐based triage literature. Six studies used cloud‐based network delivery (Cho et al., 2022; Farahmand et al., 2017; Karlafti et al., 2023; Kipourgos et al., 2022; Leung et al., 2021; Liu et al., 2021), whereas one study used a standalone executable application (Cotte et al., 2022).

TABLE 4.

Summary of the AI Intervention Features.

A1. Cho et al. (2022). Triage levels in the study: KTAS 5‐level. AI delivery: cloud. AI classification: machine learning. AI technology: natural language processing (NLP). Retrospective data used: yes. Model performance: N/A. Prediction accuracy of triage: N/A.

A2. Cotte et al. (2022). Triage levels: MTS levels 3, 4, and 5; Ada 8‐level. AI delivery: app. AI classification: machine learning. AI technology: deep learning (complex Bayesian networks). Retrospective data used: no. Model performance: N/A. Prediction accuracy of triage: N/A.

A3. Farahmand et al. (2017). Triage levels: ESI levels 2, 3, and 4 (levels 1 and 5 omitted owing to the low number of cases). AI delivery: cloud. AI classification: machine learning. AI technology: mixed‐model approach (AR, CL, LR, DT, NB, NN). Retrospective data used: yes. Model performance (AUC, in the order of levels 2, 3, and 4): association rules 0.737, 0.704, 0.459; clustering 0.790, 0.739, 0.751; decision tree 0.712, 0.712, 0.770; logistic regression 0.714, 0.763, 0.642; naïve Bayes 0.635, 0.708, 0.839; neural network 0.782, 0.749, 0.495. Prediction accuracy of triage: N/A.

A4. Karlafti et al. (2023). Triage levels: ESI 5‐level. AI delivery: cloud. AI classification: machine learning. AI technology: deep learning (feed‐forward neural network). Retrospective data used: no. Model performance: F1 score 72.2%; precision for levels 1–5 (in order): 0.33, 0.44, 0.94, 0.80, 0.85; recall for levels 1–5 (in order): 1.0, 0.67, 0.86, 0.86, 0.83. Prediction accuracy of triage: ESI level overall accuracy 84.6%.

A5. Kipourgos et al. (2022). Triage levels: ESI 5‐level. AI delivery: cloud. AI classification: fuzzy logic. AI technology: Fuzzy CLIPS model. Retrospective data used: no. Model performance: precision 93%, sensitivity 99%, specificity 99%. Prediction accuracy of triage: fuzzy accuracy 99%; WEKA accuracy by subgroup 72%–95%.

A6. Leung et al. (2021). Triage levels: TTAS 5‐level. AI delivery: cloud. AI classification: machine learning. AI technology: mixed model (TabNet with a deep learning architecture). Retrospective data used: yes. Model performance: mean AUC‐ROC 0.836, sensitivity 0.803, specificity 0.807. Prediction accuracy of triage: mean accuracy 80.5%.

A7. Liu et al. (2021). Triage levels: ETS 4‐level. AI delivery: cloud. AI classification: machine learning. AI technology: SHAP (a game‐theoretic approach). Retrospective data used: yes. Model performance: AUC 0.875 ± 0.006 (95% CI). Prediction accuracy of triage: life‐threatening mistriage rate: control group 1.2% (110 of 8839), intervention group 0.9% (72 of 7936).

Abbreviations: App, application; AR, association rules; AUC, area under the curve; AUC‐ROC, area under the receiver operating characteristic curve; CL, clustering; DT, decision tree; LR, logistic regression; NB, naïve Bayes; NN, neural network; SHAP, Shapley additive explanations.

Regarding triage, six studies used ML (Cho et al., 2022; Cotte et al., 2022; Farahmand et al., 2017; Karlafti et al., 2023; Leung et al., 2021; Liu et al., 2021), while one study employed fuzzy logic, another form of AI (Kipourgos et al., 2022). Two of the six ML‐based studies used hybrid models (Farahmand et al., 2017; Leung et al., 2021), and two applied deep learning (Cotte et al., 2022; Karlafti et al., 2023). In addition, one study used natural language processing (Cho et al., 2022) and another used a game‐theoretic approach, SHAP (Liu et al., 2021). The study based on fuzzy logic utilized a Fuzzy CLIPS model and compared its results with those of the Waikato Environment for Knowledge Analysis (WEKA), a type of ML (Kipourgos et al., 2022).

Comparative studies between AI and manual input methods comprised five of the seven studies (Cho et al., 2022; Cotte et al., 2022; Farahmand et al., 2017; Karlafti et al., 2023; Kipourgos et al., 2022), whereas comparative studies between AI and retrospective data comprised two studies (Leung et al., 2021; Liu et al., 2021).

Effects of AI‐based triage

Model performance

Model performance was measured in five of the seven studies, which evaluated performance using different indicators. Three studies used the C‐statistic, also called the area under the receiver operating characteristic curve (AUC‐ROC) or area under the curve (AUC), with the following results. In the study that used a hybrid model based on the ESI, the Naive Bayes algorithm used in the ensemble model achieved an AUC of 0.839, higher than that of association rules, clustering, logistic regression, and neural networks (Farahmand et al., 2017). A hybrid model introducing a new framework combining TabNet and deep learning achieved an AUC‐ROC of 0.836 (Leung et al., 2021). Using a game‐theoretic approach (SHAP), the model demonstrated a high AUC of 0.875 ± 0.006, and this was the only study to report confidence intervals (Liu et al., 2021).

Two studies did not use the C‐statistic, with the following results. The Fuzzy CLIPS model, which applied fuzzy logic, demonstrated high performance, achieving 93% precision, 99% sensitivity, and 99% specificity (Kipourgos et al., 2022). In addition, for the feed‐forward neural network, a type of deep learning, precision across ESI levels 1–5 ranged from 33% to 85%, and recall ranged from 67% to 100%. The F1 score of the feed‐forward neural network was 72.2%, and analysis of F1 scores by ESI level showed that levels 1 and 2 lowered overall performance (Karlafti et al., 2023).

Triage prediction

Four of the seven selected studies examined triage prediction, with accuracy ranging from 80.5% to 99.1% (Karlafti et al., 2023; Kipourgos et al., 2022; Leung et al., 2021; Liu et al., 2021). The accuracy of the Fuzzy CLIPS model based on fuzzy logic was 99% (Kipourgos et al., 2022). One study achieved an accuracy of 84.6% using a feed‐forward neural network algorithm (Karlafti et al., 2023). A hybrid model combining TabNet and deep learning achieved an accuracy of 80.5% in triage prediction (Leung et al., 2021). The ML system protocol model showed high accuracy, with a low mistriage rate of 0.9% in the intervention group (Liu et al., 2021).

Other results

Five of the seven included articles reported findings on reliability (kappa coefficient), the outcomes of the triage process, and factors related to treatment and prognosis (Cho et al., 2022; Cotte et al., 2022; Farahmand et al., 2017; Leung et al., 2021; Liu et al., 2021).

One study presented reliability using the kappa coefficient: in the study using the mixed‐model approach, the Naive Bayes kappa coefficient was 78.13% (Farahmand et al., 2017). Three studies presented additional classification‐related findings (Cho et al., 2022; Cotte et al., 2022; Liu et al., 2021). One study examined the average time required for triage and found that AI was 27 s faster than manual input; however, the record completion rate was only 81.84%, indicating incomplete documentation (Cho et al., 2022). Another study compared urgency classifications and reported an overtriage rate of 57.1% and an undertriage rate of 8.9%, with 94.7% of cases considered safe, not exposing patients to potential risks (Cotte et al., 2022). A third study identified factors associated with significant mistriage, including arrival mode, arrival time, age, sex, heart rate, blood pressure, and oxygen saturation; additionally, the shock index showed higher sensitivity in older patients, whereas pulse pressure showed higher sensitivity in younger patients (Liu et al., 2021).

Another study reported patient treatment and prognosis‐related findings: the classified urgency level was higher when patients arrived by ambulance, acute changes were associated with hospitalization, and low triage levels were associated with discharge (Leung et al., 2021).

DISCUSSION

Our study was conducted to systematically review and synthesize the characteristics and findings of prospective studies that applied AI to emergency triage systems, focusing on identifying key trends, challenges, and areas for future research.

The sample sizes of the included studies varied widely, ranging from 146 to 17,072. AI requires large amounts of data to be optimized. For the simplest ML algorithms, 1000 samples per category are considered minimal and may not be sufficient to solve the problem when patient outcomes are involved (Varoquaux & Cheplygina, 2022). The small sample sizes may have been due to the prospective design, making it difficult to collect a large amount of data in a short period of time.

Six types of triage scales were applied in the studies included in this review, and the ESI was the most common among them. The ESI is a classification scale divided into five levels, from level 1, which requires immediate resuscitation, to level 5, which is the least urgent. It was developed in the United States in 1999 and is used in an increasing number of countries. Except for one study that used the 4‐level ETS (Liu et al., 2021), most studies used a 5‐level triage scale. This reflects previous studies showing that 5‐level triage has higher reliability and validity than the 3‐level triage used in the past (Christ et al., 2010; Travers et al., 2002), and the triage scales developed and used in each country accordingly follow a 5‐level system.

In some studies, a predictive model was trained on selected target classification levels, which had the advantage of triaging patients at those target levels accurately and independently (Cotte et al., 2022; Farahmand et al., 2017). For example, in one study, patients at levels 1 and 2, which are considered severe and life‐threatening, were excluded because AI‐based intervention was judged to be unsafe and unfeasible for them (Cotte et al., 2022). This highlights a limitation of currently developed AI‐based triage. In this context, levels 1 and 2 were factors that lowered model performance, and level 2 had the lowest prediction accuracy for emergency patient triage (Farahmand et al., 2017; Karlafti et al., 2023). When triaging patients with severe illness, first‐impression risk assessment, in which the overall patient condition is grasped within the first 3–5 s, is very important (KTAS Committee, 2022). Currently developed AI‐based triage identifies and classifies patients using only data, whereas nurses can assess first‐impression risk through face‐to‐face observation. Therefore, patients at levels 1 and 2, who have severe or life‐threatening conditions, represent a limitation of AI‐based triage, which cannot observe patients directly.

According to the South Korean 2021 Emergency Medical Statistical Yearbook, only 7.1% of patients visiting EDs had high severity (KTAS 1 and 2), whereas 92.9% had KTAS 3–5, which is becoming the main cause of ED overcrowding (National Emergency Medical Center, 2022). In the current triage system, classification accuracy is reduced owing to the ambiguity of middle‐level classifications, such as level 3 (Kang et al., 2020), and problems of undertriage and overtriage occur when some patients are classified as non‐emergency (Lee et al., 2021). In addition, previous studies have shown that a triage system using an automated algorithm predicts hospitalization for ESI level 3–5 patients better than nurses do, suggesting the need for AI‐based triage targeting level 3–5 patients to increase classification accuracy (Davis et al., 2022). The use of AI‐based triage targeting specific levels is expected to increase the accuracy and speed of triage nurses' decision‐making, thereby reducing the length of stay in the ED and improving ED flow.

In our study, six studies applied ML and one applied fuzzy logic. Both types of AI demonstrated outstanding accuracy in classification prediction in this systematic review. Fuzzy logic, in particular, achieved a higher accuracy rate of 99%. ML has the advantage of being easy to implement because it is designed as a series of algorithms from which computers continuously learn (Shafaf & Malek, 2019). Fuzzy logic, compared with ML methods, facilitates the processing of ambiguous variables, thereby increasing the accuracy and predictive power of classification (Dehghani Soufi et al., 2018).
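
To illustrate how fuzzy logic handles an ambiguous variable, the hand‐rolled Python sketch below assigns partial membership of a borderline heart rate to several linguistic categories; the membership thresholds and category names are hypothetical illustrations, not values from the reviewed fuzzy system.

```python
# Minimal sketch of fuzzy membership for an "ambiguous" vital sign; thresholds are illustrative.
def triangular(x, low, peak, high):
    """Triangular membership function returning a degree of membership in [0, 1]."""
    if x <= low or x >= high:
        return 0.0
    if x <= peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)

heart_rate = 105  # a borderline value, neither clearly normal nor clearly tachycardic
memberships = {
    "normal":      triangular(heart_rate, 50, 75, 100),
    "elevated":    triangular(heart_rate, 90, 110, 130),
    "tachycardic": triangular(heart_rate, 120, 150, 200),
}
print(memberships)  # {'normal': 0.0, 'elevated': 0.75, 'tachycardic': 0.0}
```

In a full fuzzy system such as Fuzzy CLIPS, rules combine these partial memberships across variables before the result is defuzzified into a triage level; a crisp ML classifier, by contrast, must commit each input to a single category.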

Setting aside accuracy, among the reported model performance results, fuzzy logic demonstrated a high precision of 93% and high specificity and sensitivity of 99%. In contrast, the feed‐forward neural network in ML exhibited a poor precision of 33% at level 1. Comparing the two directly is difficult because of the differences in AI type. Nevertheless, the feed‐forward neural network study used approximately 50% fewer samples, even though both studies employed the same triage tool, the ESI, and likewise did not utilize retrospective data. This is generally consistent with the behavior of ML algorithms, whose performance improves within a certain range when large datasets are used and which overfit when the dataset is small (Rajput et al., 2023).

The AUC is a performance metric for binary classifiers; the closer the AUC value is to 1, the better the model's predictive ability (Naidu et al., 2023). The included studies reported a wide range of AUC values (0.495 to 0.875), with some showing high variability. The F1 score is the harmonic mean of precision and recall, providing a single metric that balances both concerns (Yacouby & Axman, 2020). Precision, also known as the positive predictive value, measures the proportion of true positive predictions among all positive predictions made by the model (Naidu et al., 2023). Recall, also known as sensitivity or the true positive rate, measures the proportion of actual positives that are correctly identified by the model (Yacouby & Axman, 2020). In this review, the F1 score, precision, and recall were not consistently reported across all studies, making cross‐study comparisons challenging. The datasets used varied significantly, encompassing different patient populations and types of emergencies, which naturally led to differing model performance. Additionally, sensitivity and specificity ranged from 80% to 99% in two studies, suggesting variability in model performance across studies. These inconsistencies highlight the need for standardized datasets and evaluation protocols in future research to facilitate more reliable comparisons. Such measures will ultimately enhance the reliability and applicability of AI tools in ED triage, ensuring better patient outcomes and resource management.
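
For clarity, the hedged Python sketch below computes the metrics discussed here (precision, recall, F1 score, and AUC) from a small set of invented predictions; the labels and scores are illustrative only, and scikit-learn is assumed to be available.

```python
# Minimal sketch of the performance metrics discussed above; labels and scores are illustrative.
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

# Hypothetical binary task: 1 = high-acuity patient, 0 = low-acuity patient.
y_true  = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred  = [1, 0, 0, 1, 0, 1, 1, 0]                   # hard class predictions
y_score = [0.9, 0.2, 0.4, 0.8, 0.3, 0.6, 0.7, 0.1]   # predicted probabilities

precision = precision_score(y_true, y_pred)  # TP / (TP + FP), the positive predictive value
recall = recall_score(y_true, y_pred)        # TP / (TP + FN), i.e., sensitivity
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall
auc = roc_auc_score(y_true, y_score)         # area under the ROC curve (threshold-free)

print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f} AUC={auc:.2f}")
```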

Evaluating model performance necessitates selecting measures that are appropriate for the specific task at hand, as well as an understanding of those measures. For instance, the AUC‐ROC effectively evaluates overall performance but does not establish precise thresholds, and if a model prioritizes sensitivity or specificity over accuracy, the AUC‐ROC can be low (Erickson & Kitamura, 2021). Hence, it is necessary to conduct an assessment that aligns with the particular task. Furthermore, a model's performance gradually declines over time (Allen et al., 2021), necessitating comprehensive insight at every stage. To ensure the successful integration and effective utilization of AI‐based clinical support systems in clinical practice, their accuracy, which is a critical indicator, must be continuously checked (Akhlaghi et al., 2024).

Therefore, it would be premature to conclude that fuzzy logic is superior simply because it showed higher accuracy than ML in this study's results. Notably, costs and benefits may vary depending on the sample used, available resources, and knowledge (Ivanova et al., 2021). Based on this study's results, the two AI methods can be expected to be used in a complementary manner to solve complex problems efficiently.

Two studies in this systematic review combined various ML models. When ML was first introduced, research focused on verifying the performance of single models. Recently, various algorithm models have been combined and repeatedly trained, and better‐performing algorithms have been constructed and applied in clinical environments (Inokuchi et al., 2022). For example, the trauma hybrid‐suite entry algorithm (THETA), which was developed to classify patients with severe trauma, combines six algorithms (Bayesian ridge regression, linear regression, multilayer perceptron, clustering, support vector machine, and XGBoost) into one model. It showed 2–3 times higher performance than existing algorithms, making predictions more robust (Senda et al., 2022). This suggests that follow‐up research is needed to identify an optimal algorithm for triage; a simple illustration of such a combined model is sketched below.
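
As an illustration of the kind of mixed‐model approach described above, the hedged Python sketch below combines three scikit-learn classifiers in a soft‐voting ensemble; the component models, synthetic data, and parameters are our own illustrative choices, not those used by THETA or by the reviewed studies.

```python
# Minimal sketch of an ensemble ("mixed-model") classifier; data and model choices are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for triage data (features such as vital signs and age).
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Combine three different algorithms; soft voting averages their predicted probabilities.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)
probs = ensemble.predict_proba(X_test)[:, 1]
print(f"Ensemble AUC on held-out data: {roc_auc_score(y_test, probs):.3f}")
```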

Even with the same sample, different ML methods can show different levels of accuracy (Alanazi et al., 2017). Therefore, for technical refinement, continuously monitored data should be fed back into the algorithm to optimize it, and follow‐up studies should be conducted to further improve performance and classification prediction. In addition, because triage is a serious issue linked to patient safety, it is important to develop new models for accurate prediction and to identify the optimal model, by continuously comparing against previous studies, before applying it to actual patients.

Most of the studies in this systematic review focused on classification accuracy and prediction; still, some studies addressed other noteworthy outcomes, such as reduced classification time, comparisons of urgency classification, risk factors for classification errors, treatment, and prognosis. In addition, some studies compared AI‐based triage not only with nurses' manual classification but also with a retrospective evaluation of the classification results, which helped offset the limitations of subjective judgment when manual classification by medical staff is used as the gold standard. Therefore, by identifying the variables to be considered in AI‐based triage and those to be considered when evaluating predictive effects, continued efforts should be made to address the limitations of the programs developed thus far.

The quality appraisal of the seven studies included in this systematic review revealed that the statistical methods and study sizes were somewhat insufficiently described. In three studies, the statistical analysis methods did not mention missing value treatment, confounding factor control, or sensitivity analysis. Although the number of samples was recorded in all studies, three studies did not fully describe the basis for calculating the study size. Depending on how they are handled, missing values and confounding factors threaten internal validity, which can cause errors or biases and affect the research results (van Smeden et al., 2021; Wang & Kattan, 2020). In the future, when designing a study to verify the effect of AI‐based triage, an analysis method that includes or controls for missing values and confounding factors should be considered and recorded as an important part of observational studies to present correct results. In addition, when planning a study, the basis for sampling, grounded in statistical justification and similar previous studies, should be specified in detail.

This study is meaningful in that it compares the effects of various AI models in prospective studies, providing basic data that can support the triage work of nurses in EDs. AI cannot replace real nurses, because nurses' interactions with patients may contain important information that is difficult to enter into the system (Davis et al., 2022). However, identifying the predictive performance of various models according to model type can provide a supportive means of clinical decision‐making for triage in EDs, where emergencies occur simultaneously. In addition, more rigorous interdisciplinary studies should be conducted to examine their effectiveness in improving patient health and other outcomes before they can be applied in real‐world clinical settings. For interdisciplinary research, it is necessary to provide an environment in which nurses can develop, use, and evaluate AI models. Support and intervention at the public level, such as from local governments, should precede the integration of AI‐related subjects into the nursing curriculum and provide nursing students with opportunities to learn new knowledge and skills. Thus, if the optimal algorithm is verified and the technology is standardized, it will become a powerful tool to support triage nurses' decision‐making in overcrowded EDs. Furthermore, accurate and rapid triage can reduce undertriage and overtriage and improve ED flow. Ultimately, we hope that this will become a resource that can positively affect the treatment outcomes of patients.

Limitations

This study had a few limitations. First, the AI models' predictive accuracy was measured against the judgment of nurses and doctors in the included studies, which may introduce bias owing to their subjective opinions (Hwang & Shin, 2022; Lee et al., 2019; Moon, 2021). Future studies should address this by ensuring inter‐rater reliability and establishing qualification standards for raters. Second, objective empirical criteria such as “length of stay in the ED,” “ICU admission rate,” and “in‐hospital mortality,” which are not open to interpretation, should be used to secure the reliability of the gold standard (Miles et al., 2020). Third, the reviewed studies all used a prospective observational design, and the results cannot be generalized to all ED environments because of the limited number of analyzed studies and the different national settings in which they were conducted. Therefore, high‐quality experimental studies applied to clinical practice should be conducted in the future through the verification of various models, identification of factors that can increase classification accuracy, and comparative analysis with the prior literature.

Implications of the findings

This study examined the characteristics and effects of AI triages in prospective studies conducted at EDs. The primary objective of emergency patient triage is to promptly and precisely categorize patients at high risk and efficiently allocate emergency resources. The findings of this study show that AI exhibits potential capacities in emergency patient triage; however, it has not yet established evidence for substituting human involvement (Davis et al., 2022; Farahmand et al., 2017; Karlafti et al., 2023). It is critical that we acknowledge the unique complexity of patient triage and take ethical considerations into account when using AI.

This study highlights the constraints of existing systems that need to be considered when creating AI triage models for future implementation, and it offers suggestions on how to address these limits. To effectively implement AI triage in an ED, it is crucial to prioritize stability, efficiency, and reliability. Constructing a suitable model, verifying it thoroughly, and applying it on the basis of these three conditions will require augmenting this study's findings with several randomized controlled trials.

CONCLUSION

The study findings revealed that AI‐applied triage in EDs plays a crucial role in classifying patients and optimizing resource allocation, particularly where triage experience or workforce is limited. AI‐based triage can improve care for patients with moderate to high needs and address overcrowding issues. However, challenges include potential bias due to subjective judgments by nurses and doctors and limited generalizability due to the diverse study environments. Future research should use objective criteria such as “length of stay,” “ICU admission rate,” and “in‐hospital mortality” to enhance accuracy and reliability. High‐quality experimental studies are needed to validate these findings and provide practical recommendations for AI integration in clinical practice.

This study significantly contributes to the existing body of knowledge on AI‐based triage, highlighting AI's potential to enhance triage accuracy and support nurses with valuable decision‐making data. While AI cannot replace the nuanced interactions between nurses and patients, it significantly improves prediction precision. Rigorous experimental studies implementing AI‐based triage targeting level 3–5 patients, who are less severe but contribute substantially to ED overcrowding, are essential for practical validation. Furthermore, the cumulative research results are expected to be applied in practice after the findings are confirmed using integrated statistical analysis methods. Creating an environment for nurses to develop, use, and evaluate AI models, supported by government initiatives to integrate AI education into nursing curricula, is crucial. Standardized AI algorithms can become powerful tools in overcrowded EDs, improving efficiency, reducing errors, and enhancing patient outcomes.

CONFLICT OF INTEREST STATEMENT

The authors declare no conflict of interest.

CLINICAL RELEVANCE

Verification of the optimal artificial‐intelligence algorithm through rigorous interdisciplinary research will provide a powerful tool to support triage nurses' decision‐making in overcrowded emergency departments. Artificial intelligence‐based triage has the potential to enhance emergency department flow and improve patient outcomes by reducing undertriage and overtriage through accurate, rapid triage.

CLINICAL RESOURCES

The Preferred Reporting Items for Systematic Reviews and Meta‐analysis (PRISMA) (prisma‐statement.org). International Prospective Register of Systematic Reviews (PROSPERO) (york.ac.uk). Korean Triage and Acuity Scale (KTAS). http://www.ktas.org/ Manchester Triage Scale (MTS). https://www.triagenet.net/. Canadian Triage and Acuity Scale (CTAS). https://ctas‐phctas.ca/. Emergency Severity Index (ESI). https://californiaena.org/wp‐content/uploads/2023/05/ESI‐Handbook‐5th‐Edition‐3‐2023.pdf.

Supporting information

Data S1.

JNU-57-105-s001.docx (1.9MB, docx)

ACKNOWLEDGMENTS

Not applicable.

Yi, N. , Baik, D. & Baek, G. (2025). The effects of applying artificial intelligence to triage in the emergency department: A systematic review of prospective studies. Journal of Nursing Scholarship, 57, 105–118. 10.1111/jnu.13024

Contributor Information

Nayeon Yi, Email: n990011@gmail.com.

Dain Baik, Email: lafore103@naver.com.

Gumhee Baek, Email: dnjsxka486@naver.com.

DATA AVAILABILITY STATEMENT

The data used in this systematic review is available. For inquiries regarding data access, please contact Dain Baik at lafore103@naver.com.

REFERENCES

  1. Akhlaghi, H. , Freeman, S. , Vari, C. , McKenna, B. , Braitberg, G. , Karro, J. , & Tahayori, B. (2024). Machine learning in clinical practice: Evaluation of an artificial intelligence tool after implementation. Emergency Medicine Australasia, 36(1), 118–124. 10.1111/1742-6723.14325 [DOI] [PubMed] [Google Scholar]
  2. Alanazi, H. O. , Abdullah, A. H. , & Qureshi, K. N. (2017). A critical review for developing accurate and dynamic predictive models using machine learning methods in medicine and health care. Journal of Medical Systems, 41(4), 69. 10.1007/s10916-017-0715-6 [DOI] [PubMed] [Google Scholar]
  3. Allen, B. , Dreyer, K. , Stibolt, R., Jr. , Agarwal, S. , Coombs, L. , Treml, C. , Elkholy, M. , Brink, L. , & Wald, C. (2021). Evaluation and real‐world performance monitoring of artificial intelligence models in clinical practice: Try it, buy it, check it. Journal of the American College of Radiology, 18(11), 1489–1496. 10.1016/j.jacr.2021.08.022 [DOI] [PubMed] [Google Scholar]
  4. Babineau, J. (2014). Product review: Covidence (systematic review software). Journal of the Canadian Health Libraries Association, 35(2), 68–71. 10.5596/c14-016 [DOI] [Google Scholar]
  5. Cho, A. , Min, I. K. , Hong, S. , Chung, H. S. , Lee, H. S. , & Kim, J. H. (2022). Effect of applying a real‐time medical record input assistance system with voice artificial intelligence on triage task performance in the emergency department: Prospective interventional study. JMIR Medical Informatics, 10(8), e39892. 10.2196/39892 [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Christ, M. , Grossmann, F. , Winter, D. , Bingisser, R. , & Platz, E. (2010). Modern triage in the emergency department. Deutsches Ärzteblatt International, 107(50), 892–898. 10.3238/arztebl.2010.0892 [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Clancy, T. R. (2020). Artificial intelligence and nursing: The future is now. Journal of Nursing Administration, 50(3), 125–127. 10.1097/NNA.0000000000000855 [DOI] [PubMed] [Google Scholar]
  8. Committee, K. T. A. S. (2022). Korean triage and acuity scale manual (2nd ed.). Korean Society of Emergency Medicine KTAS Committee. [Google Scholar]
  9. Cotte, F. , Mueller, T. , Gilbert, S. , Blümke, B. , Multmeier, J. , Hirsch, M. C. , Wicks, P. , Wolanski, J. , Tutschkow, D. , Schade Brittinger, C. , Timmermann, L. , & Jerrentrup, A. (2022). Safety of triage self‐assessment using a symptom assessment app for walk‐in patients in the emergency care setting: Observational prospective cross‐sectional study. JMIR mHealth and uHealth, 10(3), e32340. 10.2196/32340 [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Davis, S. , Ju, C. , Marchandise, P. , Diagne, M. , & Grant, L. (2022). The effect of human supervision on an electronic implementation of the Canadian triage acuity scale (CTAS). Journal of Emergency Medicine, 63(4), 498–506. 10.1016/j.jemermed.2022.01.014 [DOI] [PubMed] [Google Scholar]
  11. Dehghani Soufi, M. , Samad‐Soltani, T. , Shams Vahdati, S. , & Rezaei‐Hachesu, P. (2018). Decision support system for triage management: A hybrid approach using rule‐based reasoning and fuzzy logic. International Journal of Medical Informatics, 114, 35–44. 10.1016/j.ijmedinf.2018.03.008 [DOI] [PubMed] [Google Scholar]
  12. Díaz Planelles, I. , Navarro‐Tapia, E. , García‐Algar, Ó. , & Andreu‐Fernández, V. (2023). Prevalence of potentially inappropriate prescriptions according to the new STOPP/START criteria in nursing homes: A systematic review. Health, 11(3), 422. 10.3390/healthcare11030422 [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Dulandas, R. , & Brysiewicz, P. (2018). A description of the self‐perceived educational needs of emergency nurses in Durban, KwaZulu‐Natal, South Africa. African Journal of Emergency Medicine: Revue Africaine de la Medecine dUrgence, 8(3), 84–88. 10.1016/j.afjem.2018.03.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Erickson, B. J. , & Kitamura, F. (2021). Magicians corner: 9. Performance metrics for machine learning models. Radiology . Artificial Intelligence, 3(3), e200126. 10.1148/ryai.2021200126 [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Esteva, A. , Chou, K. , Yeung, S. , Naik, N. , Madani, A. , Mottaghi, A. , Liu, Y. , Topol, E. , Dean, J. , & Socher, R. (2021). Deep learning‐enabled medical computer vision. npj Digital Medicine, 4(1), 5. 10.1038/s41746-020-00376-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Farahmand, S. , Shabestari, O. , Pakrah, M. , Hossein‐Nejad, H. , Arbab, M. , & Bagheri‐Hariri, S. (2017). Artificial intelligence‐based triage for patients with acute abdominal pain in emergency department; a diagnostic accuracy study. Advanced Journal of Emergency Medicine, 1(1), e5. 10.22114/AJEM.v1i1.11 [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Fernandes, M. , Vieira, S. M. , Leite, F. , Palos, C. , Finkelstein, S. , & Sousa, J. M. C. (2020). Clinical decision support systems for triage in the emergency department using intelligent systems: A review. Artificial Intelligence in Medicine, 102, 101762. 10.1016/j.artmed.2019.101762 [DOI] [PubMed] [Google Scholar]
  18. Helm, J. M. , Swiergosz, A. M. , Haeberle, H. S. , Karnuta, J. M. , Schaffer, J. L. , Krebs, V. E. , Spitzer, A. I. , & Ramkumar, P. N. (2020). Machine learning and artificial intelligence: Definitions, applications, and future directions. Current Reviews in Musculoskeletal Medicine, 13(1), 69–76. 10.1007/s12178-020-09600-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Hwang, S. , & Shin, S. (2022). Factors affecting triage competence among emergency room nurses: A cross‐sectional study. Journal of Clinical Nursing, 32(13–14), 3589–3598. 10.1111/jocn.16441 [DOI] [PubMed] [Google Scholar]
  20. Inokuchi, R. , Iwagami, M. , Sun, Y. , Sakamoto, A. , & Tamiya, N. (2022). Machine learning models predicting undertriage in telephone triage. Annals of Medicine, 54(1), 2990–2997. 10.1080/07853890.2022.2136402 [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Ivanova, M. , Petkova, P. , & Petkov, N. (2021). Machine learning and fuzzy logic in electronics: Applying intelligence in practice. Electronics, 10(22), 2878. 10.3390/electronics10222878 [DOI] [Google Scholar]
  22. Kang, D. Y. , Cho, K. J. , Kwon, O. , Kwon, J. M. , Jeon, K. H. , Park, H. , Lee, Y. , Park, J. , & Oh, B. H. (2020). Artificial intelligence algorithm to predict the need for critical care in prehospital emergency medical services. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 28(1), 17. 10.1186/s13049-020-0713-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Karlafti, E. , Anagnostis, A. , Simou, T. , Kollatou, A. S. , Paramythiotis, D. , Kaiafa, G. , Didaggelos, T. , Savvopoulos, C. , & Fyntanidou, V. (2023). Support systems of clinical decisions in the triage of the emergency department using artificial intelligence: The efficiency to support triage. Acta Medica Lituanica, 30(1), 19–25. 10.15388/Amed.2023.30.1.2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Kim, J. S. , Seo, D. W. , Kim, Y. J. , Jeong, J. , Kang, H. , Han, K. S. , Kim, S. J. , Lee, S. W. , Ahn, S. , & Kim, W. Y. (2020). Prolonged length of stay in the emergency department and increased risk of in‐hospital cardiac arrest: A nationwide population‐based study in South Korea, 2016–2017. Journal of Clinical Medicine, 9(7), 2284. 10.3390/jcm9072284 [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Kipourgos, G. , Tzenalis, A. , Diamantidou, V. , Koutsojannis, C. , & Hatzilygeroudis, I. (2022). An artificial intelligence based application for triage nurses in emergency department, using the emergency severity index protocol. International Journal of Caring Sciences, 15(3), 1764–1772. [Google Scholar]
  26. Lee, J. H. , Park, Y. S. , Park, I. C. , Lee, H. S. , Kim, J. H. , Park, J. M. , Chung, S. P. , & Kim, M. J. (2019). Over‐triage occurs when considering the patient's pain in Korean triage and acuity scale (KTAS). PLoS One, 14(5), e0216519. 10.1371/journal.pone.0216519 [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Lee, J. T. , Hsieh, C. C. , Lin, C. H. , Lin, Y. J. , & Kao, C. Y. (2021). Prediction of hospitalization using artificial intelligence for urgent patients in the emergency department. Scientific Reports, 11(1), 19472. 10.1038/s41598-021-98961-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Leung, K. C. , Lin, Y. T. , Hong, D. Y. , Tsai, C. L. , Huang, C. H. , & Fu, L. C. (2021). A novel interpretable deep‐learning‐based system for triage prediction in the emergency department: A prospective study. In IEEE international conference on systems, man, and cybernetics (SMC), Melbourne, 2021 (pp. 2979–2985). IEEE. 10.1109/SMC52423.2021.9658729 [DOI] [Google Scholar]
  29. Liu, Y. , Gao, J. , Liu, J. , Walline, J. H. , Liu, X. , Zhang, T. , Wu, Y. , Wu, J. , Zhu, H. , & Zhu, W. (2021). Development and validation of a practical machine‐learning triage algorithm for the detection of patients in need of critical care in the emergency department. Scientific Reports, 11(1), 24044. 10.1038/s41598-021-03104-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Miles, J. , Turner, J. , Jacques, R. , Williams, J. , & Mason, S. (2020). Using machine‐learning risk prediction models to triage the acuity of undifferentiated patients entering the emergency care system: A systematic review. Diagnostic and Prognostic Research, 4, 16. 10.1186/s41512-020-00084-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Moher, D. , Liberati, A. , Tetzlaff, J. , Altman, D. G. , & PRISMA Group . (2009). Preferred reporting items for systematic reviews and meta‐analyses: The PRISMA statement. PLoS Medicine, 6(7), e1000097. 10.1371/journal.pmed.1000097 [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Moon, S. H. (2021). Factors affecting triage competency for nurses working in Korean emergency departments. Crisis and Emergency Management: Theory and Praxis, 17(3), 125–135. 10.14251/crisisonomy.2021.17.3.125 [DOI] [Google Scholar]
  33. Morley, C. , Unwin, M. , Peterson, G. M. , Stankovich, J. , & Kinsman, L. (2018). Emergency department crowding: A systematic review of causes, consequences and solutions. PLoS One, 13(8), e0203316. 10.1371/journal.pone.0203316 [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Naidu, G. , Zuva, T. , & Sibanda, E. M. (2023). A review of evaluation metrics in machine learning algorithms. In Silhavy R. & Silhavy P. (Eds.), Artificial intelligence application in networks and systems (Vol. 724, pp. 15–25). Springer International Publishing. 10.1007/978-3-031-35314-7_2 [DOI] [Google Scholar]
  35. National Emergency Medical Center . (2022). Emergency medical statistical annual report, 2021 (Publication No. 11‐1352000‐001248‐10). National Emergency Medical Center. https://www.e-gen.or.kr/nemc/statistics_annual_report.do
  36. Park, J. B. , Je, S. H. , Oh, J. H. , Kim, O. , Park, Y. S. , & Ko, J. (2017). Provider manual of Korean Triage and Acuity Scale (1st ed.). Koonja Publishing Inc. [Google Scholar]
  37. Raita, Y. , Goto, T. , Faridi, M. K. , Brown, D. F. M. , Camargo, C. A. , & Hasegawa, K. (2019). Emergency department triage prediction of clinical outcomes using machine learning models. Critical Care, 23(1), 64. 10.1186/s13054-019-2351-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Rajput, D. , Wang, W. J. , & Chen, C. C. (2023). Evaluation of a decided sample size in machine learning applications. BMC Bioinformatics, 24(1), 48. 10.1186/s12859-023-05156-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Reay, G. , Smith‐MacDonald, L. , Then, K. L. , Hall, M. , & Rankin, J. A. (2020). Triage emergency nurse decision‐making: Incidental findings from a focus group study. International Emergency Nursing, 48, 100791. 10.1016/j.ienj.2019.100791 [DOI] [PubMed] [Google Scholar]
  40. Roquette, B. P. , Nagano, H. , Marujo, E. C. , & Maiorano, A. C. (2020). Prediction of admission in pediatric emergency department with deep neural networks and triage textual data. Neural Networks: The Official Journal of the International Neural Network Society, 126, 170–177. 10.1016/j.neunet.2020.03.012 [DOI] [PubMed] [Google Scholar]
  41. Saaiq, M. , & Ashraf, B. (2017). Modifying “PICO” question into “PICOS” model for more robust and reproducible presentation of the methodology employed in a scientific study. World Journal of Plastic Surgery, 6(3), 390–392. [PMC free article] [PubMed] [Google Scholar]
  42. Sabaz, M. S. , Asar, S. , Cukurova, Z. , Sabaz, N. , Doğan, H. , & Sertcakacilar, G. (2020). Effect of delayed admission to intensive care units from the emergency department on the mortality of critically ill patients. Iranian Red Crescent Medical Journal, 22(6), e102425. 10.5812/ircmj.102425 [DOI] [Google Scholar]
  43. Sánchez‐Salmerón, R. , Gómez‐Urquiza, J. L. , Albendín‐García, L. , Correa‐Rodríguez, M. , Martos‐Cabrera, M. B. , Velando‐Soriano, A. , & Suleiman‐Martos, N. (2022). Machine learning methods applied to triage in emergency services: A systematic review. International Emergency Nursing, 60, 101109. 10.1016/j.ienj.2021.101109 [DOI] [PubMed] [Google Scholar]
  44. Senda, A. , Endo, A. , Kinoshita, T. , & Otomo, Y. (2022). Development of practical triage methods for critical trauma patients: Machine‐learning algorithm for evaluating hybrid operation theatre entry of trauma patients (THETA). European Journal of Trauma and Emergency Surgery, 48(6), 4755–4760. 10.1007/s00068-022-02002-0 [DOI] [PubMed] [Google Scholar]
  45. Shafaf, N. , & Malek, H. (2019). Applications of machine learning approaches in emergency medicine; a review article. Archives of Academic Emergency Medicine, 7(1), 34. [PMC free article] [PubMed] [Google Scholar]
  46. Silva, J. A. D. , Emi, A. S. , Leão, E. R. , Lopes, M. C. B. T. , Okuno, M. F. P. , & Batista, R. E. A. (2017). Emergency severity index: Accuracy in risk classification. Einstein, 15(4), 421–427. 10.1590/S1679-45082017AO3964 [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Travers, D. A. , Waller, A. E. , Bowling, J. M. , Flowers, D. , & Tintinalli, J. (2002). Five‐level triage system more effective than three‐level in tertiary emergency department. Journal of Emergency Nursing, 28(5), 395–400. 10.1067/men.2002.127184 [DOI] [PubMed] [Google Scholar]
  48. van Smeden, M. , Penning de Vries, B. B. L. , Nab, L. , & Groenwold, R. H. H. (2021). Approaches to addressing missing values, measurement error, and confounding in epidemiologic studies. Journal of Clinical Epidemiology, 131, 89–100. 10.1016/j.jclinepi.2020.11.006 [DOI] [PubMed] [Google Scholar]
  49. Varoquaux, G. , & Cheplygina, V. (2022). Machine learning for medical imaging: Methodological failures and recommendations for the future. npj Digital Medicine, 5, 48. 10.1038/s41746-022-00592-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. von Elm, E. , Altman, D. G. , Egger, M. , Pocock, S. J. , Gøtzsche, P. C. , Vandenbroucke, J. P. , & STROBE Initiative . (2008). The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: Guidelines for reporting observational studies. Journal of Clinical Epidemiology, 61(4), 344–349. 10.1016/j.jclinepi.2007.11.008 [DOI] [PubMed] [Google Scholar]
  51. Walsh, M. , & Knott, J. C. (2010). Satisfaction with the emergency department environment decreases with length of stay. Emergency Medicine Journal, 27(11), 821–828. [DOI] [PubMed] [Google Scholar]
  52. Wang, X. , & Kattan, M. W. (2020). Cohort studies: Design, analysis, and reporting. Chest, 158(1S), S72–S78. 10.1016/j.chest.2020.03.014 [DOI] [PubMed] [Google Scholar]
  53. Yacouby, R. , & Axman, D. (2020). Probabilistic extension of precision, recall, and F1 score for more thorough evaluation of classification models. In Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems (pp. 79–91). Association for Computational Linguistics. 10.18653/v1/2020.eval4nlp-1.9 [DOI] [Google Scholar]
  54. Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353. 10.1016/S0019-9958(65)90241-X [DOI] [Google Scholar]

Associated Data


Supplementary Materials

Data S1.

JNU-57-105-s001.docx (1.9MB, docx)

Data Availability Statement

The data used in this systematic review are available. For inquiries regarding data access, please contact Dain Baik at lafore103@naver.com.

