Acta Informatica Medica. 2025;33(2):152–157. doi: 10.5455/aim.2025.33.152-157

Using Machine Learning Technique in Managing Emergency Triage Flow

Mohammed Almulhim 1, Dunya Alfaraj 1, Dina Alabbad 2, Faisal A Alghamdi 3, Mubarak A AlKhudair 3, Khalid A AlKatout 3, Saud A AlShehri 3, Amal Alsulaibaikh 1
PMCID: PMC12212266  PMID: 40606243

Abstract

Background:

Triage is a critical component of emergency department care. Erroneous patient classification and mis-triaging are common in present triage systems worldwide. Therefore, several institutes have developed artificial intelligence-based algorithms that use machine learning approaches to sort and triage patients effectively.

Objective:

This study aimed to propose a machine learning model to predict the triage level of emergency department patients and to compare its performance with that of the standard nursing triage system.

Methods:

This retrospective pilot study used a dataset of emergency department records collected at King Fahad Hospital of the University in Khobar between January 1, 2020, and December 31, 2022. A sample of 998 randomly selected patients was included in this cohort. The machine learning model was trained using 10-fold cross-validation. Two experiments were conducted: the first included all five triage levels, and the second combined triage levels 2 and 3 and levels 4 and 5.

Results:

The machine learning model achieved an accuracy of 68% in experiment 1 (five triage levels) and 84% in experiment 2 (three merged classes). The mis-triage rates of the machine learning model were significantly lower than those of the standard nursing triage system.

Conclusion:

The machine learning model achieved higher accuracy and lower mis-triage rates than the standard nursing triage system. Thus, the proposed machine learning model can be a helpful tool for emergency department triage, enabling more efficient and accurate patient management.

Keywords: Canadian Triage and Acuity Scale, Machine Learning, Emergency Department, Mis-triage, Random Forest

1. BACKGROUND

Triage is vital to the efficient and effective management of contemporary emergency departments (EDs). Overcrowding in EDs is a widespread issue that causes delays in medical care. Identifying acutely unwell patients is a crucial aspect of ED triage, so that timely care reaches those who require it. Unfortunately, erroneous patient classification and mis-triaging are commonplace in present triage systems worldwide.

Triage scales were developed to categorize patients according to their health status and need for care, considering their subjective complaints, vital signs, and the nurse’s clinical judgment. The Ipswich Triage Scale (ITS), a five-category scale, was first used 30 years ago. This method was validated, adopted by other scales, and served as the foundation for the Canadian Triage and Acuity Scale (CTAS) and the Manchester Triage Scale (MTS). (1) The purpose of triage is to sort and process patients so that those who require immediate care receive it.

Nevertheless, mis-prioritizing patients by over- or under-triaging them could lead to patient harm and resource management concerns. (2) Due to variances in clinical judgment, subjective evaluation of vital signs, and fundamental human error, the reliability and accuracy of the currently employed triage systems are uncertain.

Several institutes worldwide sought to avoid this risk by developing an artificial intelligence (AI)-based algorithm that uses Machine Learning (ML) systems to effectively sort and triage patients depending on their urgency and need. These algorithms aimed to improve patient care by providing precise and rapid assessment (2). In one published study, an algorithm was designed and retrospectively verified using the data of 22,272 patients. The algorithm’s performance was prospectively compared to the standard method of triaging. The algorithm achieved an error rate of 0.9%, compared to 1.2% for conventional triaging. (3)

2. OBJECTIVE

This study aims to propose and optimize a supervised machine learning model for predicting the triage level of patients presenting to the emergency department (ED) with high accuracy. Patient demographics (such as gender), presenting symptoms, and vital signs are used to train the model, which is then used to evaluate the mis-triage rate of critically ill ED patients. The model’s performance is assessed by comparing its ability to assign triage levels to patients with that of standard nursing triage. The ultimate objective of the study is to provide an improved and efficient approach to triaging patients in the ED, leading to better patient care.

3. MATERIAL AND METHODS

Study design, settings, and data source

This comparative study aimed to evaluate the effectiveness of a machine learning (ML) model in increasing triaging accuracy in hospital emergency department (ED) settings. The study was conducted retrospectively using a dataset from a single-center ED of a university hospital in Al Khobar with more than 100,000 patient visits annually. The dataset consisted of records of 300,000 patients who visited the ED between January 1, 2020, and December 31, 2022. From this dataset, 998 patients were randomly selected and stratified based on their triage levels, as defined by the Canadian Triage and Acuity Scale (CTAS) guidelines. Each patient’s vital signs and chief complaints were thoroughly reviewed to ensure the accuracy and consistency of the dataset. The dataset did not include any personally identifiable information. The selected dataset was preprocessed and fed into the random forest classifier for training using 10-fold cross-validation. The study consisted of two experiments: the first included all five triage levels, and the second combined triage levels 2 and 3 and levels 4 and 5, resulting in only three classes.
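To make the validation scheme concrete, the following minimal sketch partitions 998 record indices into 10 folds, each used once as the held-out test set. It is an illustration only: the function name `k_fold_indices`, the shuffling, and the seed are assumptions, and the paper does not state whether the folds were shuffled or stratified.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for plain shuffled k-fold CV."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# 998 records, as in the study sample; indices stand in for patient rows.
folds = list(k_fold_indices(998, k=10))
fold_sizes = [len(test) for _, test in folds]
```

With 998 samples, eight folds hold 100 records and two hold 99, so every record is tested exactly once and used for training in the other nine rounds.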

Data analysis approach

The dataset underwent preprocessing to prepare it for machine learning model training. First, redundant features were removed. Next, categorical features were converted to numeric values using the Label Encoder technique. (4) Subsequently, the numeric features were scaled to the range -1 to 1 using the Min-Max Scaler technique. (5) To balance the number of samples in each class, the Synthetic Minority Over-sampling Technique (SMOTE) was employed. (6) The effect of SMOTE on the sample distribution is illustrated in Figure 1.

Figure 1. Sample distribution for experiment 1.
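The encoding and scaling steps described above can be sketched as follows. These are pure-Python stand-ins written for illustration, not the scikit-learn `LabelEncoder` and `MinMaxScaler` calls the authors presumably used; the function names and the toy values are assumptions and do not come from the study's dataset.

```python
def label_encode(values):
    """Map each distinct category to an integer, in sorted order."""
    classes = sorted(set(values))
    mapping = {c: i for i, c in enumerate(classes)}
    return [mapping[v] for v in values], mapping

def min_max_scale(values, low=-1.0, high=1.0):
    """Linearly rescale values so the minimum maps to low, the maximum to high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Constant column: a choice of this sketch is to use the midpoint.
        return [(low + high) / 2 for _ in values]
    span = v_max - v_min
    return [low + (v - v_min) * (high - low) / span for v in values]

# Hypothetical toy records (not taken from the study).
gender = ["male", "female", "female", "male"]
heart_rate = [60, 90, 120, 75]

encoded_gender, gender_mapping = label_encode(gender)
scaled_heart_rate = min_max_scale(heart_rate)
```

Here the lowest heart rate maps to -1, the highest to 1, and intermediate values fall proportionally in between.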


For the second set of experiments, we combined classes 2 and 3 and classes 4 and 5 to obtain three classes. SMOTE was applied again to balance the dataset. Figure 2 shows the sample distribution before and after applying SMOTE.

Figure 2. Sample distribution for experiment 2.
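The class merging used in the second experiment can be sketched as a simple lookup. The merged class labels (1, 2, 3) and the example triage levels are assumptions made for illustration; the paper does not specify how the merged classes were named.

```python
from collections import Counter

# Experiment 2 grouping: level 1 stays alone; levels 2-3 and 4-5 are merged.
# The merged labels 1, 2, 3 are an assumption of this sketch.
MERGE_MAP = {1: 1, 2: 2, 3: 2, 4: 3, 5: 3}

def merge_levels(ctas_levels):
    """Map five CTAS levels onto the three merged classes."""
    return [MERGE_MAP[level] for level in ctas_levels]

# Hypothetical triage labels, for illustration only.
ctas = [1, 2, 2, 3, 3, 3, 4, 4, 5]
merged = merge_levels(ctas)
distribution = Counter(merged)
```

Counting the merged labels shows how each class gains samples after merging, which is the effect the second experiment relies on.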


Data Preparation

The inclusion criteria for the sample consisted of all patients who presented to the ED and were above 18 years of age. The sample on which the machine learning model was based consisted of 998 randomly selected patients stratified based on their triage levels, as defined by the Canadian Triage and Acuity Scale (CTAS) guidelines.

The machine learning model was trained on this dataset, which included information on patient demographics regarding gender, presenting symptoms, and vital signs. The model was optimized through rigorous descriptive and exploratory analyses to predict and assess the mis-triage rate of critically ill ED patients. The performance of the model was then validated by conducting.

The study used preprocessing techniques such as feature reduction, Label Encoder, Min-Max Scaler, and SMOTE to clean and prepare the dataset. The first experiment included five triage levels, and the second included three classes after combining triage levels 2 and 3 and levels 4 and 5.
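The core idea of SMOTE, creating synthetic minority-class samples by interpolating between a sample and one of its nearest same-class neighbours, can be sketched as below. This is a simplified illustration and not the imbalanced-learn implementation cited by the authors; the function name `smote_like`, the neighbour count `k=2`, the seed, and the feature vectors are all assumptions.

```python
import math
import random

def smote_like(samples, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        base = samples[i]
        # Candidate neighbours: every other sample of the same class,
        # ordered by Euclidean distance to the base point.
        others = [s for j, s in enumerate(samples) if j != i]
        others.sort(key=lambda s: math.dist(base, s))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Hypothetical scaled feature vectors for one minority triage class.
minority = [(-0.5, 0.2), (-0.4, 0.1), (-0.6, 0.3)]
new_points = smote_like(minority, n_new=4)
```

Because each synthetic point lies on a segment between two real minority samples, the new points stay within the region already occupied by that class rather than duplicating existing records.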

Ethical approval to proceed with the study was granted by the Institutional Review Board (IRB) of the University (IRB-UGS-2022-01-425).

Variables

Independent variables: In an experiment, independent variables are those manipulated. In this study, the independent variables were the demographics (gender and age) of the patients, their presenting symptoms, and their vital signs. These variables were extracted from the ED’s medical records for use in training the machine learning model.

The independent variables influenced the outcomes by supplying the training data for the machine learning model. Using the input features, the model then predicted the triage level. The dependent variable (triage level) was determined by comparing the predicted triage level to the actual triage level based on CTAS guidelines.

Dependent variables: In this study, the dependent variable was the machine learning model’s determination of the triage level. The model predicted each patient’s triage level based on the collected independent variables.

Controlled variables: The data collection process, the hospital where the data were obtained, and the CTAS triage level assignment rules were the controlled variables in this study. We maintained data collection accuracy and consistency, evaluated the defined triage levels using CTAS criteria, and trained and validated the model using the same dataset.

The controlled variables aided the experiment’s measurement by ensuring that data collection and triage level assignment were precise and consistent. The CTAS rules used to determine the triage level were standardized, and the same dataset was used for training and validating the model. These controlled variables reduced the impact of outside factors on the experiment’s outcomes, improving its reliability and validity.

Machine Learning Algorithm

The machine learning algorithm used in the study was Random Forests (RF). RF builds an ensemble of decision trees and then trains them by bagging to reach a more accurate classification. The RF algorithm combines the output of multiple decision trees, which can determine the importance of variables in a novel way, model complex relationships between predictors, and perform many types of statistical data analysis, such as survival analysis, classification, regression, and unsupervised learning.

Random forests (RF) are a widely used machine learning algorithm developed by Leo Breiman and Adele Cutler. This algorithm has gained popularity recently because of its simplicity and versatility. It builds an ensemble of decision trees and then trains them by bagging; thus, it is called a random forest. The RF algorithm combines the output of multiple decision trees to achieve a more accurate classification. (7)
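The bagging-and-voting idea behind RF can be illustrated with a deliberately small sketch. It uses one-split decision stumps instead of full decision trees, so it is a simplification rather than the classifier used in the study; all function names, the number of trees, the seed, and the toy data are assumptions.

```python
import random
from collections import Counter

def fit_stump(X, y, feature):
    """Find the threshold on one feature that best separates the labels."""
    best = None
    for threshold in sorted({row[feature] for row in X}):
        left = [lab for row, lab in zip(X, y) if row[feature] <= threshold]
        right = [lab for row, lab in zip(X, y) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda row: left_label if row[feature] <= threshold else right_label

def fit_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bagging: each tree sees a bootstrap sample drawn with replacement.
        idx = [rng.randrange(len(X)) for _ in X]
        feature = rng.randrange(len(X[0]))  # random feature for this stump
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest

def predict(forest, row):
    # The ensemble prediction is the majority vote across all trees.
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0]

# Hypothetical scaled features and labels, for illustration only.
X = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]
y = ["urgent", "urgent", "urgent", "non-urgent", "non-urgent", "non-urgent"]
forest = fit_forest(X, y)
low_risk_vote = predict(forest, (0.95, 0.95))
high_risk_vote = predict(forest, (0.05, 0.05))
```

Individual stumps trained on an unlucky bootstrap sample can be wrong, but the majority vote across the ensemble is what makes the combined prediction stable.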

In comparison to other statistical classifiers, RF has the following advantages:

  • High accuracy of classification.

  • Determining the importance of variables in a novel way.

  • Modeling complex relationships between predictors.

  • Performing many types of statistical data analysis, such as survival analysis, classification, regression, and unsupervised learning.

  • Imputing missing values.

RF calculations also generate measures of data point similarity and variables’ importance, which can be used for clustering, multidimensional scaling, and graphical representation. (7)

Finally, random forest is a highly useful and flexible algorithm for both regression and classification. Furthermore, if the forest contains sufficient trees, over-fitting is greatly reduced and highly accurate predictions can be achieved. An unfavorable aspect is that RF becomes computationally expensive as the number of trees grows, which can make real-time predictions slow. More trees are generally required for more accurate predictions, which in turn results in a slower model. RF works well in most real-world applications, but in some cases run-time performance is critical, and other approaches may be more effective. (8)

Evaluation Metrics and Data Analysis

The model’s performance was evaluated using various metrics, including accuracy, the confusion matrix, recall, precision, specificity, and F1-score. Model 1 represents the experiment using five classes, and Model 2 uses three classes. The results obtained from the experiments show that the proposed model has high accuracy and can accurately predict the triage level of emergency department patients.

As well as accuracy, other metrics such as the confusion matrix, recall, precision, specificity, and F1-score are also used.

The confusion matrix evaluates the model’s performance by comparing predicted and actual values. Figure 3 shows a confusion matrix for a binary classification problem.

Figure 3. Confusion matrix.
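For the multi-class setting of this study, a confusion matrix can be built by counting (actual, predicted) label pairs, as in the minimal sketch below. The labels shown are hypothetical and the helper name `confusion_matrix` is an assumption of this sketch; it is not the scikit-learn function of the same name.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

# Hypothetical labels for three merged triage classes.
actual = [1, 1, 2, 2, 2, 3, 3, 3]
predicted = [1, 2, 2, 2, 3, 3, 3, 1]
labels = [1, 2, 3]

matrix = confusion_matrix(actual, predicted, labels)
# Correct predictions lie on the diagonal of the matrix.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(actual)
```

Off-diagonal cells show which triage classes are confused with each other, which is the information needed to distinguish over-triage from under-triage.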


True positive, true negative, false positive, and false negative are denoted TP, TN, FP, and FN, respectively. (1)

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Accuracy is the proportion of correctly predicted samples in the testing set. (4)

$$\text{Precision} = \frac{TP}{TP + FP}$$

Precision is the proportion of correctly predicted positive samples among all positive predictions. (4)

$$\text{Recall} = \frac{TP}{TP + FN}$$

Recall (also known as sensitivity) is the proportion of correctly predicted positive samples among all actual positive samples. (4)

$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

The F1-score is the harmonic mean of precision and recall. (4)
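The metric definitions above translate directly into code. The sketch below uses hypothetical counts and assumes all denominators are non-zero. Specificity, which the paper lists among its metrics but does not define, is included using its standard definition TN / (TN + FP).

```python
def binary_metrics(tp, tn, fp, fn):
    """Compute evaluation metrics from binary confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)     # standard definition, not given in the paper
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
```

For multi-class problems such as the five-level and three-class experiments, these quantities are typically computed per class (treating that class as positive) and then averaged; the paper does not state which averaging scheme was used.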

4. RESULTS

Features analysis

After visualizing the correlation matrix in Figure (3) and Figure (4), we observed some correlation between features.

Figure 3. Model 1 confusion matrix.


In this section, we analyze and present the performance of the proposed machine learning model. The results are summarized in Table 1, where model 1 represents the experiment using five triage classes. In contrast, model 2 represents the experiment using three classes obtained after combining levels 2 and 3 and levels 4 and 5.

Table 1. Experiments results.

Model     Precision   Recall   F1-score   Accuracy
Model 1   0.70        0.68     0.69       0.68
Model 2   0.84        0.84     0.84       0.84

Table 1 shows that model 2 outperformed model 1 in all evaluation metrics, with accuracies of 84% and 68%, respectively. The small size of the dataset limits the model’s ability to learn from the data. Reducing the number of classes gives the model a better opportunity to learn, as each class is represented by more samples.

In conclusion, this study provides valuable insights into using machine learning models to predict the triage level of emergency department patients. The study’s medical perspective highlights the significance of accurate triage decisions to ensure patients receive timely and appropriate care, which can significantly impact their health outcomes. Integrating machine learning models with medical data can potentially revolutionize the healthcare industry, leading to more efficient and effective patient care.

5. DISCUSSION

The study aimed to design and compare machine learning models for predicting patient triage outcomes in hospital emergency departments. The study collected a comprehensive dataset of emergency department records from King Fahad University Hospital in AlKhobar, which includes 998 randomly selected patients stratified based on their triage levels as defined by the Canadian Triage and Acuity Scale (CTAS) guidelines. The dataset included patient demographics (such as gender), presenting symptoms, and vital signs. We conducted rigorous descriptive and exploratory analyses and optimized a random forest classifier, trained using ten-fold cross-validation. The results suggest that combining machine learning models with traditional nursing triage can improve accuracy and reduce missed cases, thus benefiting emergency department patients. Our study found that the highest accuracy, 84%, was achieved with three classes. However, we identified the need for much larger real-world data samples to eliminate the imbalances in specific triage classes when developing machine learning triage models. Comparing model 1 and model 2, we concluded that these findings could enhance patient outcomes and decrease waiting times in emergency departments by offering more precise and efficient triage evaluations.

Figure 4. Model 2 confusion matrix.


In today’s fast-paced and ever-evolving healthcare landscape, triaging patients efficiently and effectively in the Emergency Department (ED) is crucial for optimal patient outcomes and resource management. With the integration of cutting-edge technologies into medical practice, there is a growing interest in harnessing the power of artificial intelligence to develop innovative solutions that can revolutionize healthcare in general. (9) Rapidly assessing and prioritizing incoming patients, ensuring timely interventions, reducing wait times, and ultimately improving the quality of care in the ED can be achieved by properly utilizing machine learning systems. (10) To achieve the highest quality of care provided to ED patients, machine learning (ML) can be used hand in hand with nurses’ and physicians’ expertise. In a scoping review done in 2020, a total of 150 articles were eligible to be utilized to study the implementation of artificial intelligence (AI) in emergency medicine. 24 (16%) identified studies found in the review had human comparators, and 12 of them showed an out-performance of the AI over the clinician’s intervention in at least one calculated outcome. (11) Another prospective study found that AI was more accurate than human decisions in recognizing medical conditions with only a brief history. Moreover, a higher percentage of safer triaging recommendations are done by AI compared to human decisions. (12)

However, this study used artificial intelligence and machine learning techniques to manage and simplify emergency triage workflows. A total of 998 patients were tested using the Canadian 5-point triage scale to predict the correct category based on the data entered. The results showed 68% accuracy. Three studies used machine learning to predict outcomes at the triage level. One study compared the performance of the Adaptive Neuro-Fuzzy Inference System (ANFIS) and the Artificial Neural Network (ANN) models using 3015 records and classified patients based on a four-level triage scale. The accuracy of the two models was 96% and 92%, respectively. (13) A similar study by Azeez et al. using 2223 records and applying them to a three-level triage scale showed that ANN achieved 99% accuracy and ANFIS achieved 96% accuracy. (14)

Reducing the severity scale to three triage levels drastically improved the overall accuracy of the model, from 68% with the original five-level scale to 84%. A similar study by Zmiri D. et al. reported an accuracy of 52.94% for the same triage scale and an overall accuracy of 71.71% for a 2-level triage scale. (15) From previous studies, we can conclude that increasing the amount of data reviewed by the system and decreasing the number of levels in the triage scale will lead to a higher hit rate.

Artificial intelligence is already establishing itself in the field of healthcare. (16) Currently, it is most commonly used to predict the most applicable treatment protocol based on various patient characteristics, using large datasets on which a model can be trained. (17) This is similar to what was done in our case, in which the 998-patient dataset was used to train the model to make a well-calculated decision on which triage level a patient should be allocated to. However, more complex forms of machine learning exist, such as deep learning, which utilizes neural network models. This method was used in the two studies that used AI to help triage patients in the ED. (13,14) The use of deep learning is also seen in the field of radiology, where AI is made capable of recognizing abnormalities in radiological images. (18) Similar to our study in terms of being implemented in the ED, a systematic review assessed the capabilities of another form of AI, natural language processing, to extract relevant information from unstructured clinical notes, such as physician notes and nursing notes, and integrate it to reduce the risk of medical errors due to missing or incomplete information. (19) However, even considering all the benefits that AI brings, it is important to note that while AI has the potential to revolutionize healthcare, it should not replace human judgment or the patient-provider relationship, but rather complement them to improve overall patient care.

The primary limitation of our study is the limited sample size of only 998 patients included in the training and validation of the algorithm; a larger sample size is preferred for creating a robust algorithm. Additionally, this limited sample may not represent the larger population and may affect how the results can be generalized to different populations. Furthermore, using the Canadian Triage and Acuity Scale may limit the external validity of the results, as the scale may not be applicable in other healthcare systems that use different triage scoring systems. Other studies included the patient triage level in the myriad scales available to enhance the algorithm. However, we only included the CTAS level. In addition, other important variables that may affect the triage level, such as the mechanism of injury, presenting time, visual pain scores, disposition, past medical history, and previous laboratory values, were not included. Lastly, the study only used a single machine learning algorithm (Random Forest); other algorithms could have been utilized to improve model performance. These limitations should be considered when interpreting the results.

6. CONCLUSION

In conclusion, this study aimed to develop a machine learning approach using Random Forest to predict the triage level of patients presenting to the emergency department. Our objectives were to evaluate the performance of our models in predicting 5-class and 3-class triage levels and to compare their accuracy using the Canadian Triage and Acuity Scale. The main findings of our study showed that our Random Forest models achieved accuracies of 68% and 84% for predicting 5-class and 3-class triage levels, respectively, within our sample patient cohort. Our study demonstrates the potential of machine learning algorithms to improve the accuracy and efficiency of the triage process in emergency departments, which can ultimately lead to better patient outcomes and more efficient resource management. However, we recommend that further research be performed to refine and test our models on larger and more diverse datasets to ensure their generalizability and to assess their clinical utility in real-world settings.

Acknowledgement:

We appreciate the contribution of medical students Abdulaziz K. AlNaimi and Nasser A. AlJoaib and Computer science students Dorieh M. Alomari, and Roaa H. AlAmri in this research project.

Author’s contribution:

All authors were involved in all steps of the preparation of this article. Final proofreading was done by the first author.

Conflict of interest:

None to declare.

Financial support and sponsorship:

None.

REFERENCES

  • 1. FitzGerald G, Jelinek GA, Scott D, Gerdtz MF. Emergency department triage revisited. Emerg Med J. 2010 Feb;27(2):86–92. doi: 10.1136/emj.2009.077081.
  • 2. Sánchez-Salmerón R, Gómez-Urquiza JL, Albendín-García L, Correa-Rodríguez M, Martos-Cabrera MB, Velando-Soriano A, et al. Machine learning methods applied to triage in emergency services: A systematic review. Int Emerg Nurs. 2022 Jan;60:101109. doi: 10.1016/j.ienj.2021.101109. Epub 2021 Dec 22.
  • 3. Liu Y, Gao J, Liu J, Walline JH, Liu X, Zhang T, et al. Development and validation of a practical machine-learning triage algorithm for the detection of patients in need of critical care in the emergency department. Sci Rep. 2021 Dec 15;11(1):24044. doi: 10.1038/s41598-021-03104-2.
  • 4. Sklearn.preprocessing.LabelEncoder (Internet) scikit. (cited 2023 Apr 30) Available from: https://scikitlearn.org/stable/modules/generated/sklearn.Preprocessing.Label Encoder.html
  • 5. Sklearn.preprocessing.LabelEncoder. scikit. n.d. Retrieved April 30, 2023 https://scikitlearn.org/stable/modules/generated/sklearn.preprocessing.Label Encoder.html
  • 6. SMOTE–Version 0.10.1. n.d. Retrieved April 30, 2023 from https://imbalancedlearn.org/stable/references/generated/imblearn.over_sampling.SMOTE
  • 7. Random Forest: A complete guide for machine learning. Built In. n.d. Retrieved April 30, 2023 from https://builtin.com/data-science/random-forest-algorithm
  • 8. DeepAI. Evaluation metrics. DeepAI. 2019 May 17; Retrieved April 30, 2023 from https://deepai.org/machine-learning-glossary-and-terms/evaluation-metrics
  • 9. Davenport T, Kalakota R. The potential for artificial intelligence in healthcare. Future Healthc J. 2019 Jun;6(2):94–98. doi: 10.7861/futurehosp.6-2-94.
  • 10. Tang KJW, Ang CKE, Constantinides T, Rajinikanth V, Acharya UR, Cheong KH. Artificial Intelligence and Machine Learning in Emergency Medicine. Biocybernetics and Biomedical Engineering. 2021 Dec 24;41(1):156–172. doi: 10.1016/j.bbe.2020.12.00. Epub 2021 Dec 24.
  • 11. Kirubarajan A, Taher A, Khan S, Masood S. Artificial intelligence in emergency medicine: A scoping review. J Am Coll Emerg Physicians Open. 2020 Nov 7;1(6):1691–1702. doi: 10.1002/emp2.12277.
  • 12. Baker A, Perov Y, Middleton K, Baxter J, Mullarkey D, Sangar D, Butt M, DoRosario A, Johri S. A Comparison of Artificial Intelligence and Human Doctors for the Purpose of Triage and Diagnosis. Front Artif Intell. 2020 Nov 30;3:543405. doi: 10.3389/frai.2020.543405.
  • 13. Taghavifard MT, Kalhori SR, Farazmand P, Farazmand K. An intelligent system for prioritising emergency services provided for people injured in road traffic accidents. Mediterranean journal of social sciences. 2016 Jan 7;7(1 S1):354. doi: 10.5901/mjss.2016.v7n1s1p35.
  • 14. Azeez D, Ali MA, Gan KB, Saiboon I. Comparison of adaptive neuro-fuzzy inference system and artificial neutral networks model to categorize patients in the emergency department. Springerplus. 2013 Aug 29;2:416. doi: 10.1186/2193-1801-2-416.
  • 15. Zmiri D, Shahar Y, Taieb-Maimon M. Classification of patients by severity grades during triage in the emergency department using data mining methods. J Eval Clin Pract. 2012 Apr;18(2):378–388. doi: 10.1111/j.1365-2753.2010.01592.x. Epub 2010 Dec 19.
  • 16. Yu KH, Beam AL, Kohane IS. Artificial intelligence in healthcare. Nat Biomed Eng. 2018 Oct;2(10):719–731. doi: 10.1038/s41551-018-0305-z. Epub 2018 Oct 10.
  • 17. Davenport T, Kalakota R. The potential for artificial intelligence in healthcare. Future Healthc J. 2019 Jun;6(2):94–98. doi: 10.7861/futurehosp.6-2-94.
  • 18. Saba L, Biswas M, Kuppili V, Cuadrado Godia E, Suri HS, Edla DR, et al. The present and future of deep learning in radiology. Eur J Radiol. 2019 May;114:14–24. doi: 10.1016/j.ejrad.2019.02.038. Epub 2019 Mar 2.
  • 19. Kreimeyer K, Foster M, Pandey A, Arya N, Halford G, Jones SF, Forshee R, Walderhaug M, Botsis T. Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review. J Biomed Inform. 2017 Sep;73:14–29. doi: 10.1016/j.jbi.2017.07.012. Epub 2017 Jul 17.

Articles from Acta Informatica Medica are provided here courtesy of Academy of Medical Sciences of Bosnia and Herzegovina, Sarajevo, Bosnia and Herzegovina
