Journal of the American Medical Informatics Association (JAMIA). 2024 Feb 13;31(4):958–967. doi: 10.1093/jamia/ocae018

Estimation of racial and language disparities in pediatric emergency department triage using statistical modeling and natural language processing

Seung-Yup (Joshua) Lee 1, Mohammed Alzeen 2, Abdulaziz Ahmed 3,
PMCID: PMC10990499  PMID: 38349846

Abstract

Objectives

The study aims to assess racial and language disparities in pediatric emergency department (ED) triage using analytical techniques and provide insights into the extent and nature of the disparities in the ED setting.

Materials and Methods

The study analyzed a cross-sectional dataset encompassing ED visits from January 2019 to April 2021. The study utilized analytical techniques, including K-means clustering, multivariate adaptive regression splines (MARS), and natural language processing (NLP) embedding. NLP embedding and K-means clustering were employed to handle the chief complaints and categorize them into clusters, while MARS was used to identify significant interactions among the clinical features. The study also explored important variables, including age-adjusted vital signs. Multiple logistic regression models with varying specifications were developed to assess the robustness of analysis results.

Results

The study consistently found that non-White children, especially African American (AA) and Hispanic, were often under-triaged, with AA children having >2 times higher odds of receiving lower acuity scores compared to White children. While the results are generally consistent, incorporating relevant variables modified the results for specific patient groups (eg, Asians).

Discussion

By employing a comprehensive analysis methodology, the study checked the robustness of the analysis results on racial and language disparities in pediatric ED triage. The study also recognized the significance of analytical techniques in assessing pediatric health conditions and analyzing disparities.

Conclusion

The study’s findings highlight the significant need for equal and fair assessment and treatment in the pediatric ED, regardless of patients’ race and language.

Keywords: natural language processing, racial triage disparity, language triage disparity, pediatric emergency department

Introduction

In the fast-paced environment of emergency departments (EDs), triage systems standardize and prioritize patient care to accelerate ED throughput. These systems enable ED nurses to promptly evaluate the severity of a patient’s condition and estimate the resources each ED patient needs.1 Pediatric triage requires intricate interpretation of patient information because of its unique features, such as the distinct age groups. The goal is bias-free, consistent, and equitable care delivery. Nonetheless, disparities in pediatric ED care and triage among racial and ethnic minorities have been reported. Payne and Puumala2 and Purtell et al3 discovered disparities in ED assessments and wait times among East African immigrants, younger patients, uninsured individuals, and patients from Native American, Biracial, Hispanic, and African American (AA) backgrounds compared to non-Hispanic White counterparts. Zook et al4 found that non-White patients, including AA, Hispanic, and American Indian children, were more likely to be under-triaged. Dennis5 suggested that this bias may stem from an asymmetry in urgency perception between the child’s parents and the nursing staff. Additionally, AA children were 22% more likely to receive inaccurately lower acuity scores. These results highlight the significant need for equal and fair assessment and treatment in the ED, regardless of patients’ race, ethnicity, or other demographic characteristics.

About 5% of pediatric ED cases suffer from under-triaging.6 Such unequal triage can lead to incomplete ED evaluations or treatments for children7 and reduced rates of diagnostic imaging and asthma steroid treatment.8 Delays stemming from under-triaging can aggravate the complications and severity of emergencies like stroke, sepsis, and myocardial infarction.9–11 Research has found that racial disparities in triage practices can negatively impact healthcare for specific groups of patients. Zhang et al12 and Boley et al13 indicated that AA patients are disproportionately under-triaged compared to White patients, leading to more frequent Fast-Track area assignment, which primarily addresses less urgent cases. This practice, tainted by implicit bias, resulted in lower admission rates for AA patients. Longer ED wait times due to systematic under-triaging among AA patients correlate with elevated mortality rates.14 Hence, the emergence of a tier-based system presents a critical issue of unequal care and outcomes.12,13

Studies on pediatric ED triage disparities have often employed Chi-square testing, along with bivariate, multivariate linear, and logistic regression models, considering factors like race, age, sex, insurance type, and geographical region.4,8 Metzger’s model15 considered language as a variable, while Zook’s study4 introduced geographical factors, such as the distance between the patient’s home and the nearest ED, as predictors. However, prior studies have often missed key patient-level characteristics in analyzing triage score assignment. In particular, chief complaints and vital signs are key predictors for estimating the triage score of a patient, yet most studies have been conducted without many of these features, which may lead to biased results due to endogeneity caused by omitted variables that influence triage score assignment. Challenges in utilizing this key information include unstructured electronic health records (EHRs) data (ie, complaint text) and the age-dependency involved in interpreting vital sign readings. Our study utilized a set of analytical techniques, including K-means clustering, multivariate adaptive regression splines (MARS), and natural language processing (NLP) embedding. NLP embedding and K-means clustering were employed to handle the chief complaints and categorize them into clusters, while MARS was used to identify significant interactions among the clinical features. We also explored important variables, including age-adjusted vital signs, crowding level, and time-related covariates. These techniques and considerations provide several options to investigate pediatric triage disparities and to assess the consistency of the estimated impact when adjusting for different confounders. By developing multiple logistic regression models with varying specifications, we checked the robustness of analysis results that report the odds ratios of low-acuity triage scores depending on the racial and language factors. Unlike other studies, our approach offers a comprehensive analysis methodology, using the analytical techniques and robustness check specifications, for better understanding racial and language disparities in pediatric ED triage. This is the major contribution of this study.

Methods

Study design and protocol

We analyzed a cross-sectional dataset from an academic urban level 1 pediatric trauma center, encompassing all ED visits from January 2019 to April 2021. The hospital serves pediatric patients aged 0 to 21, offering comprehensive pediatric specialties. Out of the initial 136 375 pediatric patient visits, after removing abnormal records such as duplicate patient visits and ones with negative ED length of stay (LOS), we retained 135 389 patient visits for analysis. Since the purpose of this study is to investigate triage disparity, we only included patient information that can be obtained during the triage process. Any feature collected after triage, such as LOS and medical diagnoses, was excluded. The descriptions of the data items and our processing approaches are presented in the following subsections. For the bivariate analysis, we compared the vital sign values by acuity level assignment (ie, low vs high acuity levels). We acknowledge that with a large sample, P-values obtained from t-tests can be misleading in terms of the practical significance of a variable.16 To address this, we calculated Hedges’ g, a well-established alternative, to measure the effect size for the continuous vital sign variables.17 To interpret the effect size using the Hedges’ g statistic, we employed the following rule-of-thumb thresholds: negligible (<0.2), small (0.2 ≤ g < 0.5), medium (0.5 ≤ g < 0.8), and large (≥0.8).
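The effect-size step can be illustrated with a minimal sketch. The text does not state which variant of the statistic was computed, so the pooled-standard-deviation form with the common small-sample correction factor is an assumption here, and the function names `hedges_g` and `effect_size_label` are hypothetical. Only the labeling thresholds come from the text.

```python
import math
import statistics


def hedges_g(group_a, group_b):
    """Absolute standardized mean difference between two independent samples.

    Assumption: pooled-SD form with the usual approximate bias correction;
    the source does not say which variant was used.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    cohens_d = (mean_a - mean_b) / pooled_sd

    # Approximate small-sample correction factor.
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)
    return abs(cohens_d * correction)


def effect_size_label(g):
    """Map |g| to the rule-of-thumb labels quoted in the text."""
    if g < 0.2:
        return "negligible"
    if g < 0.5:
        return "small"
    if g < 0.8:
        return "medium"
    return "large"


# Made-up heart-rate values for two acuity classes (illustrative only).
low_acuity = [140.0, 150.0, 145.0, 155.0]
high_acuity = [150.0, 160.0, 158.0, 165.0]
print(effect_size_label(hedges_g(low_acuity, high_acuity)))
```

With a sample as large as the one analyzed here, the correction factor is essentially 1, so the statistic is driven by the mean difference relative to the pooled spread.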

Dependent variable

In this study, the dependent variable is the urgency score, determined by the Emergency Severity Index (ESI), which ranges from level 1 (most urgent) to 5 (least urgent).18 The ESI applied to the pediatric setting has proven reliability and validity.19,20 We applied a dichotomous dependent variable structure: more urgent cases (ESI levels 1 to 3) and less urgent cases (ESI levels 4 and 5). This approach, adopted in prior studies,4,5 clarifies research findings, as levels 4 and 5 are considered low-acuity, and patients with such urgency levels are often treated differently, such as through fast track and vertical split-flow. Disparities in this binary structure could imply further disparities in care delivery beyond prioritization. In our study, 60.2% of all visits were level 4 or 5 patients.

Chief complaint variable processing

The chief complaint feature in our study includes a wide range of complaints that patients present when arriving at the ED. The use of chief complaint as a patient categorizer is a common practice in research on disparities in ED triage.13,21,22 However, using the chief complaint feature as it appears in the patient’s chart can increase the dimensionality of the data and make it difficult to interpret the results. Therefore, we reduced the dimension of the chief complaints using both a Bidirectional Encoder Representations from Transformers (BERT) sentence transformer and clustering.23 First, we employed the BERT sentence transformer, a cutting-edge model in NLP. BERT, known for its deep contextual understanding, analyzes the text bidirectionally, providing a more nuanced interpretation of language. In our study, BERT processed each chief complaint, interpreting its meaning based on the entire context of the sentence rather than isolated words. This method ensures a comprehensive understanding of each complaint’s semantic nuances. The output from BERT for each complaint is a dense numerical representation known as an embedding. These embeddings are high-dimensional vectors that capture the intricate semantic meanings of the sentences, transforming the textual data into a format that can be efficiently processed by machine learning algorithms. The embeddings effectively condense the rich information contained in the chief complaints into a manageable form without significant loss of meaning or context.

Subsequently, we applied K-means clustering to these embedding matrices. This clustering categorized the chief complaints into a smaller number of distinct groups based on their semantic similarities. During the clustering process, we used the elbow graph to determine the optimal number of clusters. We ran the K-means algorithm multiple times, each for a different number of clusters (eg, 1-50), and recorded the distortion (the average squared distance between data points and their respective cluster centers) for each iteration.24 The optimal cluster count was identified by analyzing the percentage change in distortion over 50 iterations. The point at which the change in distortion reached 0.0% (indicating convergence) was marked on the elbow graph, helping us decide the optimal number of clusters. The embedding process was implemented in Python, utilizing the Sentence-Transformers library for BERT embeddings23 and Scikit-learn for K-means clustering.25
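The selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the helper names `pct_changes` and `pick_k` are hypothetical, "reached 0.0%" is interpreted as rounding to zero at one decimal place, and the distortion values are made up. In a real pipeline the distortions would come from fitting K-means on the sentence embeddings for each candidate cluster count (for example, scikit-learn's `KMeans(...).inertia_` divided by the number of complaints).

```python
def pct_changes(distortions):
    """Absolute percentage change in distortion between consecutive cluster counts."""
    return [
        abs(curr - prev) / prev * 100.0
        for prev, curr in zip(distortions, distortions[1:])
    ]


def pick_k(cluster_counts, distortions, decimals=1):
    """Return the first cluster count whose change from the previous count rounds to 0.0%.

    Assumption: convergence is judged after rounding to `decimals` places.
    """
    changes = pct_changes(distortions)
    for k, change in zip(cluster_counts[1:], changes):
        if round(change, decimals) == 0.0:
            return k
    return None  # no plateau found in the scanned range


# Made-up distortions for k = 1..5 (illustrative only).
ks = [1, 2, 3, 4, 5]
toy_distortions = [100.0, 60.0, 45.0, 44.99, 44.99]
print(pick_k(ks, toy_distortions))
```

Working on the percentage change rather than the raw distortion makes the stopping point less dependent on the scale of the embedding vectors.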

Age-dependent vital sign processing

Triage data processing for pediatric patients presents a unique aspect due to the age-specific normal vital sign ranges. Recognizing the significance of the age-dependent nature of vital signs in assessing pediatric health conditions, we accounted for their impact by utilizing 2 approaches.

First, we utilized established clinical knowledge to process the numerical vital signs. We categorized the pediatric patients into 6 age groups: Newborn (0 to 1 month), Infant (1 to 12 months), Toddler (1 to 3 years), Preschooler (3 to 6 years), Schooler (6 to 12 years), and Adolescents (12 years and above). For normal vital sign ranges, we referred to the Pediatric Early Warning Signs (PEWS), which defines the normal ranges for heart rate, respirations, and blood pressure.26 For temperature, no significant variation in the normal range across the age groups has been reported.27,28 We relied on Leduc et al to categorize temperature as Normal (≥96.4°F and ≤100.4°F), Low (<96.4°F), and High (>100.4°F). Oxygen saturation levels higher than or equal to 92% were considered normal across all age ranges, resulting in 2 categories, Normal and Low. While research reports variability in oxygen saturation measurements depending on skin pigmentation,29,30 our categorization was based on a broad and inclusive criterion, considering that a saturation level of 95% is generally accepted as the lower limit of the normal range. We specifically aimed to identify clinically low oxygen saturation levels. Therefore, we grouped these lower levels together under a unified criterion, without further subdivision, acknowledging that skin pigmentation effects on pulse oximetry readings are predominantly reported at low saturation (below 90%). Table A in the Appendix provides the threshold values utilized to define the categories for each vital sign for each age group.
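The knowledge-based categorization can be sketched in a few lines. The age bands, temperature cut-offs, and the 92% oxygen saturation threshold are taken from the text; how boundary ages are assigned (half-open intervals), the use of months as the age unit, and the reading of the temperature values as degrees Fahrenheit are assumptions. The age-specific PEWS thresholds for heart rate, respirations, and blood pressure sit in the Appendix and are not reproduced here.

```python
def age_group(age_months):
    """Assign one of the six age groups named in the text.

    Assumption: a shared boundary (eg, exactly 12 months) goes to the older group.
    """
    if age_months < 1:
        return "Newborn"
    if age_months < 12:
        return "Infant"
    if age_months < 36:
        return "Toddler"
    if age_months < 72:
        return "Preschooler"
    if age_months < 144:
        return "Schooler"
    return "Adolescents"


def temperature_category(temp_f):
    """Normal is 96.4 to 100.4 inclusive (assumed degrees Fahrenheit)."""
    if temp_f < 96.4:
        return "Low"
    if temp_f > 100.4:
        return "High"
    return "Normal"


def oxygen_category(spo2_percent):
    """Saturation of 92% or higher is treated as Normal for every age group."""
    return "Normal" if spo2_percent >= 92 else "Low"


print(age_group(30), temperature_category(101.2), oxygen_category(95))
```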

The second approach is data-driven. We standardized each vital sign value using eqn (1), where $j$ is the age-vital sign group defined by both the age group and the type of vital sign, and the index $i$ denotes the patient within group $j$. $x_{ij}$ represents the vital sign value of patient $i$ within group $j$, $\bar{x}_j$ is the mean value across the patients within group $j$, and $s_j$ is the standard deviation of the measured values within group $j$. The standardization method was applied to heart rate, respiration, temperature, and systolic and diastolic blood pressure. Oxygen saturation values do not show normality, so standardization was not applied.

$$z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}. \qquad (1)$$
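A minimal sketch of the within-group standardization in eqn (1) follows. The function name `standardize_within_groups` and the record layout are hypothetical, and using the sample (n - 1) standard deviation is an assumption, since the text does not specify it.

```python
import statistics
from collections import defaultdict


def standardize_within_groups(records):
    """Return z-scores in input order for (group_key, value) records.

    Each group needs at least two values with non-zero spread.
    Assumption: the sample standard deviation is used for each group.
    """
    records = list(records)
    by_group = defaultdict(list)
    for key, value in records:
        by_group[key].append(value)

    # Mean and standard deviation for each age-vital sign group.
    group_stats = {
        key: (statistics.fmean(values), statistics.stdev(values))
        for key, values in by_group.items()
    }
    return [
        (value - group_stats[key][0]) / group_stats[key][1]
        for key, value in records
    ]


# Made-up heart-rate readings keyed by (age group, vital sign).
toy = [
    (("Infant", "heart_rate"), 140.0),
    (("Infant", "heart_rate"), 150.0),
    (("Infant", "heart_rate"), 160.0),
    (("Adolescents", "heart_rate"), 80.0),
    (("Adolescents", "heart_rate"), 100.0),
]
print(standardize_within_groups(toy))
```

Keying the groups by both age band and vital sign mirrors the definition of group j in the text, so a heart rate of 150 is scored against other infants rather than against the whole sample.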

ED visit history

Previous ED visit history can influence a nurse’s triage decision. Revisits within a 30-day period can indicate redundant visits by frequent ED utilizers, who may have higher odds of low-acuity triage scores.31 Conversely, revisits can also suggest an escalation in the severity of a patient’s condition if initial health issues were not adequately treated during previous visits.4 In this study, we categorized the number of 30-day revisits as 0, 1-2, and 3 or more.
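As a rough illustration of how such a revisit variable could be derived, the sketch below counts a patient's earlier arrivals in the preceding 30 days and bins the count. The function names, the half-open window, and the "3+" label are assumptions made for this example, not details from the study.

```python
from datetime import datetime, timedelta


def prior_visits_30d(arrival, earlier_arrivals):
    """Count the same patient's ED arrivals in the 30 days before `arrival`."""
    window_start = arrival - timedelta(days=30)
    return sum(window_start <= t < arrival for t in earlier_arrivals)


def revisit_category(count):
    """Bin the revisit count into the three levels used in the text."""
    if count == 0:
        return "0"
    if count <= 2:
        return "1-2"
    return "3+"


# Made-up visit history for one patient.
history = [datetime(2020, 1, 1), datetime(2020, 3, 1), datetime(2020, 3, 20)]
count = prior_visits_30d(datetime(2020, 3, 31), history)
print(count, revisit_category(count))
```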

System-level covariates

Behavioral variations, influenced by system status, can impact triage decisions. Research on ED crowding, for instance, has shown conflicting results. Chen et al32 concluded that increased ED crowding is associated with higher-acuity triage scores, suggesting potential over-triage of patients under workload pressure to speed up the process. Meanwhile, O’Connor et al33 and van der Linden et al34 found that although ED crowding significantly increased wait times and ED LOS, it was not associated with triage scoring. Therefore, to account for such non-clinical influences on triage decisions, we incorporated the ED crowding level, determined by the current patient count upon the patient’s arrival, and time-related categorical covariates, such as day of week and hour of day, to adjust indirectly for factors like staffing variations.
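The system-level covariates can be sketched as below. This assumes each encounter carries an arrival and a departure timestamp, which the text does not confirm; the function names and the exact definition of "present in the ED" are likewise hypothetical.

```python
from datetime import datetime


def census_at_arrival(arrival, other_visits):
    """Number of other patients in the ED when a patient arrives.

    other_visits: iterable of (arrival_time, departure_time) pairs.
    Assumption: a patient counts as present from arrival until departure.
    """
    return sum(start <= arrival < end for start, end in other_visits)


def time_covariates(arrival):
    """Categorical time features; day_of_week uses Monday = 0."""
    return {"day_of_week": arrival.weekday(), "hour_of_day": arrival.hour}


# Made-up encounters on a single day.
others = [
    (datetime(2020, 1, 1, 8, 0), datetime(2020, 1, 1, 12, 0)),
    (datetime(2020, 1, 1, 9, 0), datetime(2020, 1, 1, 10, 0)),
    (datetime(2020, 1, 1, 11, 0), datetime(2020, 1, 1, 13, 0)),
]
new_arrival = datetime(2020, 1, 1, 10, 30)
print(census_at_arrival(new_arrival, others), time_covariates(new_arrival))
```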

Analytically enhancing procedure

To analyze interactions among ED triage predictors, we used MARS, a non-parametric technique that detects interactions by modeling recursive partitioning regression. MARS was proposed by Friedman,35 and since then it has been widely applied to large-scale data analysis in genetic research and the study of diseases and complications, including gene-gene interaction detection (epistasis),35–37 genotype-environment interaction detection,38 and complex interactions among health risk factors.39,40 The MARS model can be represented by eqn (2).

$$\hat{f}(x) = \alpha_0 + \sum_{K_m=1} f_i(x_i) + \sum_{K_m=2} f_{ik}(x_i, x_k) + \sum_{K_m=3} f_{ikl}(x_i, x_k, x_l) + \cdots, \qquad (2)$$

where $K_m$ represents the number of splits (ie, categories in our modeling case) evaluated for interactions, with $i$, $k$, and $l$ denoting variable indices. Using the MARS procedure, we examined interactions between the chief complaint categories and vital sign categories as well as among the vital signs themselves, focusing on 2-way interactions ($K_m = 2$). With 30 labels from the unstructured chief complaints data, 210 potential interaction terms arise (ie, interactions between chief complaints and 6 vital signs, resulting in 180 possibilities [30 × 6 = 180] and interactions among 6 vital signs, resulting in 30 possibilities [6 × 5 = 30]). The interaction detection process utilizing MARS was conducted in R, utilizing the “earth” library version 5.3.2.41
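The count of candidate 2-way terms can be reproduced with a short enumeration. This sketch only lists the candidates; it does not fit MARS, which the study ran in R. The identifier names are hypothetical, and the 30 vital-by-vital terms are enumerated as ordered pairs because that is what matches the 6 × 5 count given in the text.

```python
from itertools import permutations, product

# The six vital signs analyzed in the study (identifier spellings are illustrative).
VITAL_SIGNS = [
    "heart_rate",
    "respiratory_rate",
    "systolic_bp",
    "diastolic_bp",
    "temperature",
    "oxygen_saturation",
]

# Thirty chief complaint cluster labels.
COMPLAINT_CLUSTERS = [f"complaint_{k}" for k in range(1, 31)]

# Complaint-by-vital-sign candidates: 30 x 6.
complaint_by_vital = list(product(COMPLAINT_CLUSTERS, VITAL_SIGNS))

# Vital-by-vital candidates as ordered pairs: 6 x 5.
vital_by_vital = list(permutations(VITAL_SIGNS, 2))

candidates = complaint_by_vital + vital_by_vital
print(len(complaint_by_vital), len(vital_by_vital), len(candidates))
```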

Robustness check

Unlike the existing studies on racial and language disparities in pediatric ED triage, we pay special attention to checking the robustness of analysis outcomes. We developed 6 logistic regression models, as shown in Table 1, with varying specifications to assess the impact of adjusting for potential confounders on the odds ratios of low-acuity triage scores in relation to racial and language factors. Model 1 includes the standard variables found in most triage disparity studies. Model 2 adds chief complaints. Complaints have often appeared in the pediatric ED triage disparities literature, although the categorization of complaints has not been described in sufficient detail.4,15 Model 3 incorporates non-clinical, system-related information, such as the patient’s ED revisit history, crowding level, and time-related covariates. Some of these items have been employed in the existing studies.4,15 Thus, the variables introduced in Models 1, 2, and 3 have been frequently covered in the pediatric ED triage literature, although we used NLP and clustering to construct the complaint categories in Model 2 and a more comprehensive set of time-related variables in Model 3. In Model 4, we incorporated the age-dependent vital sign categories based on medical knowledge. Model 5 adds standardized vital sign values to Model 3. Model 6 adds interactions between chief complaints and the age-dependent vital sign categories to Model 4. Therefore, Models 4, 5, and 6 introduce variables that have been less explored in the existing literature. By taking the robustness check approach, we aim for a comprehensive view of disparities analysis, including its methodology and findings, in the pediatric ED triage setting.

Table 1.

Summary of our models.

Variable Model 1 Model 2 Model 3 Model 4 Model 5 Model 6
Age X X X X X X
Gender X X X X X X
Insurance X X X X X X
Race X X X X X X
Language X X X X X X
Chief complaints - X X X X X
Vital sign (knowledge) - - - X - X
Vital sign (standardization) - - - - X -
Chief complaints and vital signs interactions - - - - - X
Past ED visits (last 30 days) - - X X X X
Crowding level - - X X X X
Time-related covariates - - X X X X

X = included in the model; - = not included.
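The nesting of the six specifications in Table 1 can be written down explicitly, which makes it easy to see what each model adds. The variable names below are hypothetical labels for the covariate blocks described in the text, not column names from the study dataset.

```python
# Covariate blocks, following the description of Models 1 to 6.
BASE = ["age", "gender", "insurance", "race", "language"]
COMPLAINTS = ["chief_complaints"]
SYSTEM = ["past_ed_visits_30d", "crowding_level", "time_covariates"]

MODEL_3 = BASE + COMPLAINTS + SYSTEM
MODEL_4 = MODEL_3 + ["vital_sign_knowledge"]

MODEL_SPECS = {
    "Model 1": BASE,
    "Model 2": BASE + COMPLAINTS,
    "Model 3": MODEL_3,
    "Model 4": MODEL_4,
    # Model 5 extends Model 3 (not Model 4) with standardized vital signs.
    "Model 5": MODEL_3 + ["vital_sign_standardized"],
    # Model 6 extends Model 4 with complaint-by-vital-sign interactions.
    "Model 6": MODEL_4 + ["complaint_x_vital_interactions"],
}

for name, covariates in MODEL_SPECS.items():
    print(name, len(covariates))
```

Race and language appear in every specification, so any change in their odds ratios across models reflects the added adjustment variables rather than a change in the exposure definition.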

Results

Table 2 presents the demographic profile of the sample included in the analysis. The sample consists of 135 387 unique encounters with patients in the pediatric age group. Of these, 24.93% were adolescents, 23.52% school age, 19.87% toddlers, 16.47% preschool, 13.39% infants, and 1.83% newborns. Male patients accounted for 52.15% of encounters and female patients for 47.85%. Encounters were distributed mainly between non-Hispanic Black (48.67%) and non-Hispanic White (42.66%) patients, followed by a smaller share of Hispanic White patients (6.59%). In terms of language, English-speaking patients were predominant at 94.75%, followed by Spanish-speaking patients at 4.94%. The study hospital records race, ethnicity, and language information for most pediatric patients. When this information is not collected, the system categorizes these entries as “Unknown.” As a result, we did not exclude any data points from our analysis.

Table 2.

Sample demographics.

Variable Levels Mean (standard deviation) for numerical factors, % for nominal factors
Age 6.09 (5.73)
 Newborn 1.83%
 Infant 13.39%
 Toddler 19.87%
 Preschool 16.47%
 School-age 23.52%
 Adolescents 24.93%
Gender
 Female 47.85%
 Male 52.15%
Race
 Black 48.95%
 White 49.44%
 Asian 0.57%
 Other 0.90%
 Unknown 0.14%
Hispanic
 Yes 7.25%
 No 92.37%
Language
 English 94.75%
 Spanish/Castilian 4.94%
 Other 0.29%

Data analysis results

Table 3 provides the bivariate analysis results reporting how the continuous vital sign values were distributed depending on the age groups and binary acuity classes.

Table 3.

Vital sign distribution across different age groups.

Vital sign variable Low acuity (ESI 4 or 5), mean (Q1, Q3) High acuity (ESI 1, 2, or 3), mean (Q1, Q3) Hedges’ g statistic (absolute value)
Heart rate
 Newborn 146.0 (134, 162) 150.1 (135, 167) 0.158
 Infant 143.5 (129, 157) 149.8 (134, 165) 0.295
 Toddler 133.5 (117, 148) 136.8 (118, 154) 0.143
 Preschool 112 (100, 129) 118 (103, 135) 0.221
 School age 97 (86, 112) 102 (88, 118) 0.227
 Adolescents 87 (76, 99) 93 (81, 108) 0.358
Respiratory rate
 Newborn 40 (36, 48) 44 (36, 50) 0.279
 Infant 36 (30, 42) 40 (32, 48) 0.545
 Toddler 28 (24, 32) 30 (26, 37) 0.451
 Preschool 24 (22, 26) 24 (24, 28) 0.437
 School age 22 (20, 24) 22 (20, 24) 0.256
 Adolescents 20 (18, 20) 20 (18, 22) 0.237
Systolic blood pressure
 Newborn 97 (86, 110) 93 (83, 104) 0.241
 Infant 109 (100, 118) 106 (96, 116) 0.197
 Toddler 114 (104, 122) 114 (105, 124) 0.063
 Preschool 111 (103, 118) 112 (104, 121) 0.157
 School age 117 (109, 124) 118 (110, 126) 0.136
 Adolescents 125 (117, 133) 126 (118, 135) 0.107
Diastolic blood pressure
 Newborn 55 (46, 64) 53 (44, 62) 0.181
 Infant 61 (55, 69) 59 (53, 67) 0.182
 Toddler 66 (59, 74) 66 (58, 75) 0.002
 Preschool 65 (59, 72) 66 (59, 74) 0.115
 School age 68 (62, 75) 69 (62, 76) 0.122
 Adolescents 70 (64, 77) 72 (65, 79) 0.154
Temperature
 Newborn 98.7 (98.3, 99.2) 98.8 (98.2, 99.3) 0.012
 Infant 99.3 (98.5, 100.5) 99.1 (98.3, 99.9) 0.244
 Toddler 99.1 (98.2, 100.9) 98.8 (98.1, 100.1) 0.211
 Preschool 98.5 (98.1, 99.2) 98.6 (98.2, 99.3) 0.114
 School age 98.6 (98.3, 99.1) 98.5 (98.2, 99.0) 0.167
 Adolescents 98.5 (98.2, 98.9) 98.5 (98.2, 98.9) 0.070
O2 saturation
 Newborn 99 (97, 100) 99 (97, 100) 0.109
 Infant 99 (98, 100) 99 (97, 100) 0.295
 Toddler 98 (97, 100) 98 (96, 99) 0.127
 Preschool 98 (97, 100) 98 (97, 100) 0.132
 School age 99 (98, 100) 98 (97, 100) 0.050
 Adolescents 98 (98, 100) 98 (97, 99) 0.015

Chief complaint feature processing

The analysis identified the optimal number of clusters as 30, where the absolute percentage change reached 0 (see Figure 1). Consequently, the K-means clustering was executed for 30 clusters. The resulting cluster labels were then mapped with the original chief complaints and manually annotated.

Figure 1.

Percentage of absolute change in distortion with number of clusters.

Age-dependent vital signs

Figure 2 reports the distribution of each vital sign across different age groups. While temperature and O2 saturation did not exhibit clear trends, the other vital signs showed a clear upward or downward trend as age increased. Table 3 presents the bivariate statistics (low vs high acuity levels) for the vital sign variables. According to the Hedges’ g thresholds, respiratory rate in the infant group showed a medium effect size between the 2 acuity level classes.

Figure 2.

Vital signs.

Analytically enhancing procedure

The MARS algorithm detected 55 significant interaction terms out of 210 possible combinations between chief complaints and vital signs, as well as among the vital signs themselves. Of the 55 interactions, 48 involved chief complaints, indicating that the chief complaints feature is instrumental in modeling interactions when classifying low versus high acuity levels. Table 4 summarizes the 48 interaction terms, organized according to the complaint annotation. For example, there were 3 interaction terms related to gastrointestinal distress that significantly contributed to the classification of low versus high acuity levels. The 3 terms were gastrointestinal distress × temperature, gastrointestinal distress × diastolic blood pressure, and gastrointestinal distress × respiratory rate. The remaining 7 terms were interactions between vital signs, selected out of 30 possible interaction terms: (1) heart rate and diastolic blood pressure, (2) respiratory rate and oxygen saturation, (3) heart rate and temperature, (4) respiratory rate and diastolic blood pressure, (5) respiratory rate and temperature, (6) heart rate and oxygen saturation, and (7) heart rate and respiratory rate.

Table 4.

Interactions summary.

Complaint label Complaint annotation Vital signs that have significant interaction effects with each complaint
1 Gastrointestinal distress Temperature, diastolic blood pressure, respiratory rate
2 Shortness of breath Temperature, diastolic blood pressure, respiratory rate, heart rate, oxygen saturation
3 Joint pain Heart rate, diastolic blood pressure
4 Mental health issue Respiratory rate, heart rate
6 Collision injury Heart rate
7 Ear-related issue Temperature, diastolic blood pressure
8 Other injury Diastolic blood pressure, systolic blood pressure
11 Seizure concern Diastolic blood pressure, respiratory rate
13 Laceration Diastolic blood pressure, systolic blood pressure, heart rate, temperature
15 Behavioral health concerns Temperature
16 Child check Respiratory rate
17 Fever Diastolic blood pressure, systolic blood pressure, respiratory rate, oxygen saturation, temperature, heart rate
19 Abdominal pain Diastolic blood pressure
20 Nausea and vomiting Diastolic blood pressure, temperature
22 Abnormal health indicator Systolic blood pressure
23 Skin rash Diastolic blood pressure, heart rate, respiratory rate
24 Cough Oxygen saturation, respiratory rate
26 Blood sugar concern Temperature
27 Gastric tube problem Oxygen saturation
28 Swelling concern Respiratory rate, heart rate
30 Sore throat Respiratory rate, temperature, diastolic blood pressure, oxygen saturation

Disparities analysis results

Table 5 summarizes the logistic regression results. To examine the impact of adjusting for the different types of variables on the model outcomes, we progressively introduced the different sets of variables from Model 1 to 6.

Table 5.

Summary of logistic regression results.

Variables Odds ratio from logistic regression
 Model 1 Model 2 Model 3 Model 4 Model 5 Model 6
Race
 White (reference)
 AA 2.70*** (2.64, 2.77) 2.65*** (2.58, 2.72) 2.67*** (2.60, 2.75) 2.43*** (2.34, 2.52) 2.48*** (2.38, 2.58) 2.44*** (2.34, 2.53)
 Asian 1.39*** (1.20, 1.61) 1.33*** (1.13, 1.56) 1.30** (1.11, 1.53) 1.13 (0.90, 1.43) 1.12 (0.89, 1.45) 1.15 (0.92, 1.46)
 Hispanic 1.76*** (1.48, 2.10) 1.78*** (1.47, 2.15) 1.64*** (1.35, 1.99) 1.46** (1.11, 1.93) 1.49** (1.13, 1.96) 1.48** (1.12, 1.96)
 Other 1.28*** (1.11, 1.48) 1.32*** (1.13, 1.55) 1.35*** (1.15, 1.58) 1.23+ (0.99, 1.53) 1.24+ (0.99, 1.54) 1.21+ (0.97, 1.51)
Language
 English (reference)
 Spanish/Castilian 2.10*** (1.98, 2.22) 1.93*** (1.82, 2.05) 1.85*** (1.74, 1.97) 1.73*** (1.59, 1.88) 1.73*** (1.59, 1.88) 1.72*** (1.58, 1.88)
 Other 1.41** (1.14, 1.74) 1.42** (1.12, 1.79) 1.38** (1.09, 1.74) 1.45* (1.05, 2.02) 1.41* (1.02, 1.95) 1.34+ (0.96, 1.87)
Pseudo-R2 0.122 0.291 0.313 0.340 0.327 0.352

+ P < .1, * P < .05, ** P < .01, *** P < .001.

The numerical values are the odds ratio values for lower acuity, which were computed based on the logistic regression coefficients.

The findings presented in Table 5 show higher odds of low-acuity assignment among the non-reference racial groups. This evidence of racial disparity implies that pediatric patients from non-White racial backgrounds were more likely to be designated a lower-acuity ESI level than their White counterparts. In particular, AA pediatric patients consistently exhibited an elevated odds ratio of low-acuity assignment (2.4-2.7) under all the different sets of control variables, signifying significant racial disparity in comparison to the reference group. Meanwhile, for Asian, Hispanic, and other racial groups, the odds ratios ranged between approximately 1.1 and 1.8. In Models 4, 5, and 6, the odds of low-acuity triage for Asian pediatric patients became insignificant. For the Hispanic group, the odds ratios decreased yet remained statistically significant (P < .01), whereas for the Other racial group they remained significant only at the P < .1 level.

For the impact of language, Table 5 shows that pediatric patients who communicate in languages other than English faced higher odds of being assigned a low-acuity level than their English-speaking counterparts, under all the different sets of control variables across Models 1 to 6. Although the odds of low-acuity assignment for patients who speak Spanish/Castilian and other languages decreased from Model 1 to Model 6, they still signaled language disparities in triage. Regarding the influence of adding the patient-level factors on estimating the binary ESI score, the more complex models generally produced modest improvements in model fit (in terms of the pseudo-R2 value).
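The odds ratios in Table 5 are described as computed from the logistic regression coefficients. A minimal sketch of that conversion is given below. The text does not say how the intervals in parentheses were obtained, so the Wald interval with z = 1.96 is an assumption, and the coefficient and standard error are made-up values chosen only so that the point estimate lands near the Model 1 AA odds ratio.

```python
import math


def odds_ratio(beta):
    """Convert a logistic regression coefficient (log-odds) to an odds ratio."""
    return math.exp(beta)


def wald_interval(beta, std_err, z=1.96):
    """Interval on the odds-ratio scale.

    Assumption: a Wald interval; z = 1.96 gives roughly 95% coverage.
    """
    return math.exp(beta - z * std_err), math.exp(beta + z * std_err)


# Hypothetical coefficient and standard error (not taken from the study).
beta_aa = 0.993252
se_aa = 0.012
print(round(odds_ratio(beta_aa), 2), wald_interval(beta_aa, se_aa))
```

Because the interval is built on the log-odds scale and then exponentiated, it is asymmetric around the odds ratio, which is the pattern visible in the reported intervals.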

Discussion

Our study utilized several analytical approaches to analyze triage assignment disparities in the pediatric ED setting. By applying various models to a large patient dataset, we demonstrated that there is room for improvement in pediatric patient triage over the traditional ESI system or the current practice.15 Follow-up studies exploring the feasibility of staff training or the development of a decision support system to reduce these identified disparities could contribute to making the ED triage workflow more equitable.

It is important to highlight the implications of Models 1 and 2 in understanding disparities in ED triage. Model 1 showed that non-White pediatric patients, particularly AA patients, and those whose primary language is not English are more likely to be assigned to low-acuity care. For example, AA pediatric patients had 2.7 times the odds of receiving a low-acuity assignment compared to their White counterparts. The incorporation of the chief complaint as a variable in Model 2 did not significantly alter these findings. This suggests that while chief complaints are valuable predictors for binary triage assignment, they do not account for the disparities observed. Our results are aligned with the findings of Dugas et al42 and Gouin et al,43 indicating that chief complaints are not the primary drivers of the association between disparity factors and ED triage assignments. This underscores the need for further investigation into the factors influencing triage decisions in pediatric emergency care.

Model 3, which included crowding level, produced results consistent with Models 1 and 2, indicating that while crowding level improved triage assignment predictions, it did not significantly change the association of race and language with pediatric ED triage outcomes. This suggests that factors beyond crowding levels contribute to these disparities, which offers a plausible explanation for the conflicting results of Chen et al32 on the one hand and van der Linden et al34 and O’Connor et al33 on the other.

In Model 4, knowledge-based vital signs were introduced to the models. This changed our findings significantly, particularly for racial disparities in ED triage. Notably, the previously significant association between the Asian category and low-acuity assignments became non-significant, highlighting the importance of age-dependent vital sign values in explaining these disparities. However, the associations with increased odds of low-acuity assignment for the AA and Hispanic categories remained, indicating that these associations were robust across the different model specifications for those 2 groups. The introduction of feature interactions in Model 6 underscores the significance of considering how different variables interact in ED triage. Although this addition did not substantially alter the empirical results, it enhanced the model’s performance, highlighting the nuanced contributions of variable interaction in the context of ED triage.
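The idea of age-dependent vital sign categories can be sketched as follows. This is an illustration only: the age bands and heart rate thresholds below are placeholder values, not the reference ranges used in the study and not clinical guidance, and the function name is hypothetical.

```python
# Placeholder age bands: (upper age bound in years, low threshold, high threshold).
# These numbers are illustrative assumptions, NOT the study's reference ranges.
HR_BANDS = [
    (1.0, 100, 160),
    (5.0, 80, 140),
    (12.0, 70, 120),
    (18.0, 60, 100),
]


def categorize_heart_rate(age_years: float, heart_rate: float) -> str:
    """Label a heart rate as 'low', 'normal', or 'high' relative to the patient's age band."""
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    for upper_age, low, high in HR_BANDS:
        if age_years < upper_age:
            if heart_rate < low:
                return "low"
            if heart_rate > high:
                return "high"
            return "normal"
    raise ValueError("age_years is outside the pediatric bands defined here")


# The same raw value maps to different categories depending on age.
print(categorize_heart_rate(0.5, 130))
print(categorize_heart_rate(15, 130))
```

The point of the sketch is that a raw vital sign entered directly into a regression treats 130 beats per minute identically for an infant and a teenager, whereas an age-aware category does not.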

Limitations

Our study has some limitations. First, about 10% of visits were made by patients who had also visited the ED within the previous month (high utilizers), and triage nurses might have perceived these visits differently from others. While we adjusted for this by including a visit history-related variable (the number of visits in the last 30 days), there may be a non-linear association between visit history and acuity level assignment (eg, a single revisit considered urgent vs frequent visits regarded as redundant), which may require a different analysis approach. Second, while our feature processing technique applied to the unstructured complaint data is a state-of-the-art text mining approach, the resulting clusters were annotated by only one annotator, which could introduce bias.
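One way such a non-linear association could be represented is with a hinge basis function of the kind used in MARS, where the slope is allowed to change at a knot. The sketch below is a hypothetical illustration: the knot location and both slopes are made-up values, not estimates from the study, and the function names are assumptions.

```python
def hinge(x: float, knot: float) -> float:
    """Hinge function max(0, x - knot): zero up to the knot, linear beyond it."""
    return max(0.0, x - knot)


def visit_history_term(
    n_visits_30d: int,
    slope_first: float = 0.4,    # hypothetical slope up to the knot
    slope_change: float = -0.7,  # hypothetical change in slope after the knot
    knot: float = 1.0,           # hypothetical knot: one prior visit
) -> float:
    """Piecewise-linear contribution of visit history to a model's linear predictor."""
    return slope_first * n_visits_30d + slope_change * hinge(n_visits_30d, knot)


# The term rises for a single revisit and then declines as visits accumulate.
for n in range(5):
    print(n, round(visit_history_term(n), 2))
```

A single linear coefficient on the visit count cannot produce this rise-then-fall shape, which is the kind of pattern the limitation describes.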

The research was conducted at a single site, which may limit the generalizability of the findings to other settings. Additionally, there were inherent challenges associated with the data on race, ethnicity, and language in EHRs, including misclassification and missing data, but these issues are unlikely to have had a major impact on the overall results given the large size of the dataset. Moreover, the analyses were constrained by the data available in the EHRs, so unmeasured confounders may not have been accounted for. Despite these limitations, the study’s conclusions remain robust: we adjusted for an extensive list of clinical and administrative variables, used a large dataset, and found consistent results across various model specifications, which reduces the risk of data misinterpretation. Our approach to ensuring reliability in studying racial and language disparities in the pediatric ED can guide future research efforts in similar contexts.

Conclusion and future work

This work expands on the existing literature on healthcare disparities, particularly those associated with clinical decision-making (ie, ESI score). Our research specifically focused on racial disparities and language barriers in the pediatric ED triage assignment setting. By incorporating analytical enhancements, such as unstructured complaint data, age-dependent vital sign values, and interaction detection modeling, we aimed to provide options for testing the robustness of the results. To the best of our knowledge, this work is the first attempt to inform the analysis procedure for studying disparities in the pediatric ED triage setting from a robustness-check point of view. With increasing EHR data availability and advances in analytical techniques, disparity studies should include robustness checks to minimize the possibility of reporting biased analysis outcomes. Our results consistently indicated disparities for AA patients regardless of the model specification used. The results also showed that pediatric patients who speak languages other than English were more likely to be assigned to a lower acuity ESI level. These findings held across the specifications examined under the proposed methodology. The study’s findings on racial and language disparities in pediatric ED triage could enhance ED practices by informing targeted awareness and training programs for staff, highlighting the need to address unconscious biases. Furthermore, these findings could spur the development of clinical decision support tools that reduce bias by providing recommendations based explicitly on clinical presentations (eg, chief complaints and vital sign categories), thus supporting fairer and more equitable treatment for all pediatric patients, regardless of their race or language.
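The robustness check described above can be summarized as a simple rule: an association is treated as robust when its direction and statistical significance are the same under every model specification. The sketch below illustrates that rule under stated assumptions. All odds ratios and confidence intervals in it are invented placeholders, not results from this study, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """Odds ratio with a confidence interval from one model specification."""
    model: str
    odds_ratio: float
    ci_low: float
    ci_high: float

    def is_significant(self) -> bool:
        # An odds ratio is conventionally significant when its CI excludes 1.
        return self.ci_low > 1.0 or self.ci_high < 1.0

    def direction(self) -> int:
        # +1 for increased odds, -1 for decreased odds, 0 for no effect.
        if self.odds_ratio > 1.0:
            return 1
        if self.odds_ratio < 1.0:
            return -1
        return 0


def is_robust(estimates: list[Estimate]) -> bool:
    """True when every specification agrees in direction and is significant."""
    directions = {e.direction() for e in estimates}
    return len(directions) == 1 and all(e.is_significant() for e in estimates)


# Placeholder values for two hypothetical groups (NOT the study's estimates).
stable_group = [
    Estimate("Model 1", 2.5, 2.0, 3.1),
    Estimate("Model 4", 2.3, 1.8, 2.9),
    Estimate("Model 6", 2.2, 1.7, 2.8),
]
attenuated_group = [
    Estimate("Model 1", 1.4, 1.1, 1.8),
    Estimate("Model 4", 1.2, 0.9, 1.6),
]

print(is_robust(stable_group), is_robust(attenuated_group))
```

In this toy setup the second group mirrors the pattern discussed for an association that loses significance once additional variables are introduced, while the first mirrors an association that persists.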

Our study contributes to the understanding of healthcare disparities in triage but does not provide insights into the implications of those disparities. A future study could examine how the disparities found in acuity level assignment have influenced the quality of care for the affected patients.

IRB statement

This research project was reviewed by the Institutional Review Board of the University of Alabama at Birmingham. It was determined that the study does not involve Human Subjects.

Supplementary Material

ocae018_Supplementary_Data

Contributor Information

Seung-Yup (Joshua) Lee, Department of Health Services Administration, School of Health Professions, The University of Alabama at Birmingham, Birmingham, AL 35233, United States.

Mohammed Alzeen, Department of Health Services Administration, School of Health Professions, The University of Alabama at Birmingham, Birmingham, AL 35233, United States.

Abdulaziz Ahmed, Department of Health Services Administration, School of Health Professions, The University of Alabama at Birmingham, Birmingham, AL 35233, United States.

Author contributions

S-Y.(J.)L. contributed to the study design and the development of the regression models; conducted data cleaning and analysis, with particular expertise in data mining and MARS modeling; drafted the article; and contributed to the article revisions. M.A., with guidance from all co-authors, conducted data analysis and drafted the article. A.A. conducted data cleaning and analysis, with particular expertise in NLP and clustering of chief complaints; drafted the article; and contributed to the article revisions.

Supplementary material

Supplementary material is available at Journal of the American Medical Informatics Association online.

Funding

None declared.

Conflicts of interest

None declared.

Data availability

The data underlying this article was provided by the University of Alabama at Birmingham by permission. Data will be shared upon request to the corresponding author with permission of the University of Alabama at Birmingham.

References

  • 1. Barata I, Brown KM, Fitzmaurice L, et al.; Emergency Nurses Association Pediatric Committee. Best practices for improving flow and care of pediatric patients in the emergency department. Pediatrics. 2015;135(1):e273-e283. 10.1542/peds.2014-3425
  • 2. Payne NR, Puumala SE. Racial disparities in ordering laboratory and radiology tests for pediatric patients in the emergency department. Pediatr Emerg Care. 2013;29(5):598-606.
  • 3. Purtell R, Tam RP, Avondet E, Gradick K. We are part of the problem: the role of children’s hospitals in addressing health inequity. Hosp Pract (1995). 2021;49(sup1):445-455.
  • 4. Zook HG, Kharbanda AB, Flood A, Harmon B, Puumala SE, Payne NR. Racial differences in pediatric emergency department triage scores. J Emerg Med. 2016;50(5):720-727.
  • 5. Dennis JA. Racial/ethnic disparities in triage scores among pediatric emergency department fever patients. Pediatr Emerg Care. 2021;37(12):e1457-e1461.
  • 6. Escobar MA Jr, Morris CJ. Using a multidisciplinary and evidence-based approach to decrease undertriage and overtriage of pediatric trauma patients. J Pediatr Surg. 2016;51(9):1518-1525. 10.1016/j.jpedsurg.2016.04.010
  • 7. Weber TL, Ziegler KM, Kharbanda AB, Payne NR, Birger C, Puumala SE. Leaving the emergency department without complete care: disparities in American Indian children. BMC Health Serv Res. 2018;18(1):267-266.
  • 8. Zook HG, Payne NR, Puumala SE, Burgess K, Kharbanda AB. Racial/ethnic variation in emergency department care for children with asthma. Pediatr Emerg Care. 2019;35(3):209-215.
  • 9. De Luca G, Suryapranata H, Ottervanger JP, Antman EM. Time delay to treatment and mortality in primary angioplasty for acute myocardial infarction: every minute of delay counts. Circulation. 2004;109(10):1223-1225. 10.1161/01.Cir.0000121424.76486.20
  • 10. Gaieski DF, Mikkelsen ME, Band RA, et al. Impact of time to antibiotics on survival in patients with severe sepsis or septic shock in whom early goal-directed therapy was initiated in the emergency department. Crit Care Med. 2010;38(4):1045-1053. 10.1097/CCM.0b013e3181cc4824
  • 11. Lees KR, Bluhmki E, von Kummer R, et al.; ECASS, ATLANTIS, NINDS and EPITHET rt-PA Study Group. Time to treatment with intravenous alteplase and outcome in stroke: an updated pooled analysis of ECASS, ATLANTIS, NINDS, and EPITHET trials. Lancet. 2010;375(9727):1695-1703. 10.1016/S0140-6736(10)60491-6
  • 12. Zhang X, Carabello M, Hill T, Bell SA, Stephenson R, Mahajan P. Trends of racial/ethnic differences in emergency department care outcomes among adults in the United States from 2005 to 2016. Front Med (Lausanne). 2020;7:300. 10.3389/fmed.2020.00300
  • 13. Boley S, Sidebottom A, Vacquier M, et al. Investigating racial disparities within an emergency department rapid-triage system. Am J Emerg Med. 2022;60:65-72. 10.1016/j.ajem.2022.07.030
  • 14. Vigil JM, Alcock J, Coulombe P, et al. Ethnic disparities in emergency severity index scores among US Veteran’s Affairs emergency department patients. PLoS One. 2015;10(5):e0126792.
  • 15. Metzger P, Allum L, Sullivan E, Onchiri F, Jones M. Racial and language disparities in pediatric emergency department triage. Pediatr Emerg Care. 2022;38(2):e556-e562.
  • 16. Lin M, Lucas HC Jr, Shmueli G. Research commentary—too big to fail: large samples and the p-value problem. Inf Syst Res. 2013;24(4):906-917.
  • 17. Cohen J, Cohen P, West SG, Aiken LS. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Routledge; 2013.
  • 18. Wuerz RC, Milne LW, Eitel DR, Travers D, Gilboy N. Reliability and validity of a new five‐level triage instrument. Acad Emerg Med. 2000;7(3):236-242.
  • 19. de Magalhães-Barbosa MC, Robaina JR, Prata-Barbosa A, de Souza Lopes C. Validity of triage systems for paediatric emergency care: a systematic review. Emerg Med J. 2017;34(11):711-719.
  • 20. Travers DA, Waller AE, Katznelson J, Agans R. Reliability and validity of the emergency severity index for pediatric triage. Acad Emerg Med. 2009;16(9):843-849.
  • 21. Patel MD, Lin P, Cheng Q, et al. Patient sex, racial and ethnic disparities in emergency department triage: a multi-site retrospective study. Am J Emerg Med. 2023;76:29-35. 10.1016/j.ajem.2023.11.008
  • 22. Sørensen SF, Ovesen SH, Lisby M, Mandau MH, Thomsen IK, Kirkegaard H. Predicting mortality and readmission based on chief complaint in emergency department patients: a cohort study. Trauma Surg Acute Care Open. 2021;6(1):e000604. 10.1136/tsaco-2020-000604
  • 23. Reimers N, Gurevych I. Sentence-BERT: sentence embeddings using siamese BERT-networks. arXiv, arXiv:1908.10084, 2019, preprint: not peer reviewed. Accessed February 2, 2023. https://arxiv.org/abs/1908.10084
  • 24. Kotu V, Deshpande B. Data Science: Concepts and Practice. Morgan Kaufmann; 2018.
  • 25. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825-2830.
  • 26. Akre M, Finkelstein M, Erickson M, Liu M, Vanderbilt L, Billman G. Sensitivity of the pediatric early warning score to identify patient deterioration. Pediatrics. 2010;125(4):e763-e769.
  • 27. Oguz F, Yildiz I, Varkal MA, et al. Axillary and tympanic temperature measurement in children and normal values for ages. Pediatr Emerg Care. 2018;34(3):169-173. 10.1097/PEC.0000000000000693
  • 28. Leduc D, Woods S, Community Paediatrics Committee. Temperature measurement in paediatrics. Paediatr Child Health. 2000;5(5):273-284. 10.1093/pch/5.5.273
  • 29. Bickler PE, Feiner JR, Severinghaus JW. Effects of skin pigmentation on pulse oximeter accuracy at low saturation. J Am Soc Anesthesiol. 2005;102(4):715-719.
  • 30. Feiner JR, Severinghaus JW, Bickler PE. Dark skin decreases the accuracy of pulse oximeters at low oxygen saturation: the effects of oximeter probe type and gender. Anesth Analg. 2007;105(6 Suppl):S18-S23.
  • 31. Wajnberg A, Hwang U, Torres L, Yang S. Characteristics of frequent geriatric users of an urban emergency department. J Emerg Med. 2012;43(2):376-381.
  • 32. Chen W, Linthicum B, Argon NT, et al. The effects of emergency department crowding on triage and hospital admission decisions. Am J Emerg Med. 2020;38(4):774-779. 10.1016/j.ajem.2019.06.039
  • 33. O’Connor E, Gatien M, Weir C, Calder L. Evaluating the effect of emergency department crowding on triage destination. Int J Emerg Med. 2014;7:16. 10.1186/1865-1380-7-16
  • 34. van der Linden MC, Meester BE, van der Linden N. Emergency department crowding affects triage processes. Int Emerg Nurs. 2016;29:27-31. 10.1016/j.ienj.2016.02.003
  • 35. Friedman JH. Multivariate adaptive regression splines. Ann Statist. 1991;19(1):1-67.
  • 36. Van Steen K. Travelling the world of gene–gene interactions. Brief Bioinform. 2012;13(1):1-19.
  • 37. Cook NR, Zee RY, Ridker PM. Tree and spline based association analysis of gene–gene interaction models for ischemic stroke. Stat Med. 2004;23(9):1439-1453.
  • 38. York TP, Eaves LJ, van den Oord EJ. Multivariate adaptive regression splines: a powerful method for detecting disease–risk relationship differences among subgroups. Stat Med. 2006;25(8):1355-1367.
  • 39. Austin PC. A comparison of regression trees, logistic regression, generalized additive models, and multivariate adaptive regression splines for predicting AMI mortality. Stat Med. 2007;26(15):2937-2957.
  • 40. Menon R, Bhat G, Saade GR, Spratt H. Multivariate adaptive regression splines analysis to predict biomarkers of spontaneous preterm birth. Acta Obstet Gynecol Scand. 2014;93(4):382-391.
  • 41. Milborrow S, Hastie T, Tibshirani R, Miller A, Lumley T. earth: Multivariate adaptive regression splines. R Package Version. Vol. 5, No. 2. 2017. Accessed January 30, 2024. http://cran.nexr.com/web/packages/earth/earth.pdf
  • 42. Dugas AF, Kirsch TD, Toerper M, et al. An electronic emergency triage system to improve patient distribution by critical outcomes. J Emerg Med. 2016;50(6):910-918.
  • 43. Gouin S, Gravel J, Amre DK, Bergeron S. Evaluation of the Paediatric Canadian Triage and Acuity Scale in a pediatric ED. Am J Emerg Med. 2005;23(3):243-247.
