Abstract
Objectives:
Incorporating social determinants of health to identify distinct pediatric asthma patient groups can help stratify populations by their risk of adverse events, improving targeted outreach and care.
Methods:
Insurance claims and enrollment data from the Arkansas All-Payer Claims Database were used to identify 22,169 children aged 5-18 years with an asthma diagnosis in 2018 and continuous Medicaid enrollment in 2018 and 2019. The clustering approach used information on comorbid conditions, asthma controller medication intensity, total controller and reliever medications filled, zip code-level Child Opportunity Index, and rural-urban classification. Binary and categorical variables were first transformed into continuous latent variables using Generalized Low-Rank Models. K-means clustering with Euclidean distance was then applied. The resulting clusters were compared based on asthma-related emergency department (ED) visits and hospitalizations in 2018.
Results:
K-means clustering identified six clusters. The distribution of ED visits differed significantly across the clusters (p<0.001), with Cluster 1 having the highest observed percentages (1 ED visit: 9.5%; ≥2 ED visits: 2.6%). Black children made up 65.9% of this cluster, which also had the highest proportion of children residing in neighborhoods with very low child opportunity scores: 90.5% had very low education scores, 85.5% very low health and environment scores, and 94.4% very low social and economic scores.
Conclusions:
Interventions to reduce pediatric asthma disparities should address social, economic, and environmental inequities. Clustering identified children from low child opportunity areas in Arkansas, with a high percentage of Black children, as a high-risk group for asthma exacerbations, underscoring the potential of population risk stratification for tailoring interventions.
Keywords: administrative claims data, Medicaid, Child Opportunity Index, Generalized Low-Rank Models, adverse events, k-Means Clustering, asthma exacerbations, pediatric asthma
Introduction
Asthma is the most prevalent chronic illness in children and a major contributor to preventable emergency department (ED) visits (1) and hospitalizations (2,3). A recent study on asthma prevalence among children under 18 in the United States using data from the National Center for Health Statistics from 2003 to 2019 shows that asthma prevalence is higher among Black children (14.3%) compared to other racial and ethnic groups (4). Additionally, the results of the same study indicate a higher prevalence of asthma among children with Medicaid insurance compared to those with private insurance (4). Children with asthma covered by Medicaid are also at higher risk for asthma exacerbations, including more frequent ED visits and hospitalizations (5–7). Strategies aimed at effectively providing care to large populations, such as Medicaid-enrolled children at higher risk for asthma and poor asthma control, require risk-stratifying the population based on their characteristics to aid in designing focused, specific interventions and improving asthma outcomes.
Identifying children at risk for asthma exacerbations is challenging, and many organizations and insurance companies still rely on the National Committee for Quality Assurance’s Health Plan Employer Data and Information Set (HEDIS) to identify patients with persistent asthma who may ultimately be at risk for asthma exacerbations (8). According to HEDIS, a child has persistent asthma if they meet one of the following criteria during the measurement year and the prior year: (1) an ED visit with a primary diagnosis of asthma, (2) a hospitalization with a primary diagnosis of asthma, (3) four outpatient visits with any diagnosis of asthma along with at least two prescriptions for HEDIS-specified asthma controller or reliever medications, (4) four prescriptions for HEDIS-specified asthma controller or reliever medications, or (5) four prescriptions for only leukotriene modifiers or antibody inhibitors without other asthma medications (9). However, previous studies have highlighted the limitations of claims-based approaches such as HEDIS for identifying patients at risk for asthma exacerbations, including lower specificity in identifying persistent asthma (10), reduced sensitivity in detecting asthma exacerbations (8), and limitations in identifying Black children at risk for poor asthma outcomes (11). Other studies have shown a relationship between social determinants of health and asthma-related adverse outcomes (12,13), as well as the power of social determinants of health to predict avoidable healthcare utilization in machine learning algorithms (14), neither of which HEDIS addresses.
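The five criteria listed above amount to a rule applied to each child's claims in each of the two years. The following Python sketch illustrates that reading only. All field names are hypothetical, the thresholds are interpreted as "at least", the criteria are assumed to be required in both years, and the code is neither the official HEDIS specification nor the logic used in this study.

```python
from dataclasses import dataclass


@dataclass
class YearSummary:
    """One child's claim counts for one year (all field names are hypothetical)."""
    ed_primary_asthma: int = 0         # ED visits with a primary asthma diagnosis
    inpatient_primary_asthma: int = 0  # hospitalizations with a primary asthma diagnosis
    outpatient_any_asthma: int = 0     # outpatient visits with any asthma diagnosis
    asthma_med_fills: int = 0          # controller or reliever fills, all drug classes
    ltm_or_antibody_fills: int = 0     # subset of the above: leukotriene modifiers or antibody inhibitors


def qualifies(year: YearSummary) -> bool:
    """True if any of the five listed criteria is met within a single year."""
    only_ltm_or_antibody = year.asthma_med_fills == year.ltm_or_antibody_fills
    return (
        year.ed_primary_asthma >= 1                                              # criterion 1
        or year.inpatient_primary_asthma >= 1                                    # criterion 2
        or (year.outpatient_any_asthma >= 4 and year.asthma_med_fills >= 2)      # criterion 3
        or (year.asthma_med_fills >= 4 and not only_ltm_or_antibody)             # criterion 4
        or (year.ltm_or_antibody_fills >= 4 and only_ltm_or_antibody)            # criterion 5
    )


def persistent_asthma(measurement_year: YearSummary, prior_year: YearSummary) -> bool:
    """Assumes the criteria must be met in both the measurement year and the prior year."""
    return qualifies(measurement_year) and qualifies(prior_year)
```

A child with one qualifying ED visit in the measurement year but no qualifying event in the prior year would not be flagged under this reading, which is one way a child can have an exacerbation and still fall outside the HEDIS population.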
Our study proposes a data-driven approach that goes beyond the HEDIS criteria by incorporating additional factors such as social determinants of health, including place of residence and the child opportunity index, to enhance population stratification. This approach also considers comorbid conditions, which may provide a deeper understanding of the characteristics of various clusters and help identify children more likely to experience asthma exacerbations. By identifying the patients most at risk for asthma exacerbations, along with their clinical and sociodemographic characteristics, clinical and population health management teams may be able to develop tailored case management outreach.
Materials and Methods
Study sample
The study used data from the Arkansas All-Payer Claims Database (APCD) to identify a sample of Medicaid-enrolled children aged 5 to 18 with a diagnosis of asthma in 2018. The Arkansas APCD is a comprehensive administrative database containing medical, pharmacy, and dental claims, plus enrollment and provider files. As of October 2021, nearly 60% of Arkansas children aged 18 and under had Medicaid, 9% had commercial insurance, and 5% were self-insured; others were covered through a Provider-Led Arkansas Shared Savings Entity (PASSE) (4%), Qualified Health Plans (2%), or the Employee Benefits Division (EBD) (3%). Approximately 17% of children aged 18 and under were covered by Tricare, had unknown insurance, or were uninsured; these groups are not represented in the Arkansas APCD (15).
A broad case definition of asthma was applied, identifying individuals with at least one asthma diagnosis (primary or secondary) coded as J45.xx, based on the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM). The study excluded children with fewer than 11 months of continuous enrollment in either 2018 or 2019, those lacking Evaluation and Management (E&M) codes, and those diagnosed with other chronic lung diseases such as chronic obstructive pulmonary disease, cystic fibrosis, emphysema, or acute respiratory failure. The clustering analysis throughout the paper was conducted using 2018 data, while the HEDIS population was identified using two years of data (2018 and 2019), as required by the HEDIS measure for persistent asthma.
Cluster variables
The clustering variables comprised comorbid conditions, asthma management (medication type, controller intensity, and number of fills), and social determinants of health. The comprehensive list of the variables included in the clustering process is provided in Additional File 1. We excluded demographic variables from the clustering stage to avoid clusters being formed around demographic differences. We also excluded cohort-defining variables, such as asthma-related ED visits and hospitalizations, which were treated as outcome-related rather than explanatory features. This approach allowed us to focus the clustering on potentially modifiable risk factors and later examine how these factors may explain differences in asthma exacerbations, as measured by ED visits and hospitalizations, and in demographic characteristics. Adverse outcomes were defined as asthma-related hospitalizations or ED visits in 2018, categorized as 0, 1, or 2 or more events. Inpatient hospitalization and ED visit claims were flagged as asthma-related if they had a primary or secondary asthma diagnosis code (J45.xx). ED visits resulting in an inpatient hospitalization were counted only as inpatient hospitalizations. Other healthcare utilization included spirometry and fractional exhaled nitric oxide (FeNO) usage in 2018.
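The outcome definition above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's code: the claim layout and the field names `setting`, `dx_codes`, and `led_to_admission` are hypothetical, and an ED visit that ends in admission is assumed to appear alongside a separate inpatient claim.

```python
from collections import Counter


def is_asthma_related(dx_codes):
    """A claim is asthma-related if any primary or secondary diagnosis is J45.xx."""
    return any(code.upper().startswith("J45") for code in dx_codes)


def bucket(n):
    """Collapse an event count into the three reported categories."""
    return "0" if n == 0 else "1" if n == 1 else ">=2"


def outcome_categories(claims):
    """Count asthma-related ED visits and hospitalizations for one child in one year.

    ED visits that led to an admission are skipped here so that the event is
    counted once, through its inpatient claim.
    """
    counts = Counter()
    for claim in claims:
        if not is_asthma_related(claim["dx_codes"]):
            continue
        if claim["setting"] == "ED" and not claim.get("led_to_admission", False):
            counts["ed_visits"] += 1
        elif claim["setting"] == "inpatient":
            counts["hospitalizations"] += 1
    return {
        "ed_visits": bucket(counts["ed_visits"]),
        "hospitalizations": bucket(counts["hospitalizations"]),
    }


example_claims = [
    {"setting": "ED", "dx_codes": ["J45.901"]},                          # treat-and-release ED visit
    {"setting": "ED", "dx_codes": ["J45.41"], "led_to_admission": True}, # counted via the admission
    {"setting": "inpatient", "dx_codes": ["J18.9", "J45.41"]},           # asthma as secondary diagnosis
    {"setting": "ED", "dx_codes": ["S52.501A"]},                         # not asthma-related
]
result = outcome_categories(example_claims)
```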
HEDIS value set definitions using the National Drug Code were used to identify all asthma medications (controllers and relievers). Controller medication intensity was defined based on the use of leukotriene receptor antagonists (LTRA), inhaled corticosteroids (ICS), ICS/long-acting beta-2 agonists (LABA), and asthma biologics (16). The categories suggested by Bloom et al. (17) were used to define the number of reliever fills, with a reliever refill of more than 3 per year considered high use. The asthma medication ratio (AMR) guidelines were used to define the categories for the number of controllers, with fewer than 6 refills considered poor use, and 6 or more refills considered adequate use. The 2010 Rural-Urban Commuting Area Codes (RUCA) classifications at the zip code level were used to define urban versus rural residence, with lower numbers indicating more urban areas (18). The social determinants of health included both neighborhood quality, assessed using the child opportunity index (COI 2.0) (19), and individual-level social determinants of health identified in medical claims by ICD-10 diagnosis codes (Z55.xx through Z65.xx) (20). The child opportunity index includes three main domains: education (early childhood education; elementary education; secondary and post-secondary education; educational and social resources), health and environment (healthy environment; toxic exposures; health resources indicators), and social and economic factors (economic opportunities; economic and social resources), aggregated at the zip code level (19).
Statistical analysis
We followed the approach recommended by a previous study to apply k-means clustering (21). A generalized low-rank model (GLRM) approximation technique was used to generate a low-rank representation of the dataset. GLRM extends principal component analysis (PCA) by handling various data types, including continuous, binary, categorical, ordinal, and others (22). The original data matrix A is approximated as the product of two matrices, X and Y, where X represents the original observations in the latent feature space (with reduced dimension k), and Y represents the latent features in terms of the original features (23). The objective function in GLRM consists of two components: the loss function and the regularization terms (22).
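For orientation, the general GLRM objective described above can be written as follows. This is the generic textbook form rather than the exact configuration fitted in this study, and the symbols $m$, $n$, $\gamma_x$, and $\gamma_y$ are notation assumed here for illustration.

```latex
\min_{X \in \mathbb{R}^{m \times k},\; Y \in \mathbb{R}^{k \times n}}
\;\sum_{i=1}^{m} \sum_{j=1}^{n} L_j\!\left(x_i y_j,\, A_{ij}\right)
\;+\; \gamma_x \sum_{i=1}^{m} \lVert x_i \rVert_2^2
\;+\; \gamma_y \sum_{j=1}^{n} \lVert y_j \rVert_2^2
```

Here $A$ is the $m \times n$ data matrix, $x_i$ is the $i$-th row of $X$ (one child's coordinates in the $k$-dimensional latent space), $y_j$ is the $j$-th column of $Y$, and $L_j$ is the loss chosen for column $j$ according to its data type. The last two terms are quadratic regularizers with assumed weights $\gamma_x$ and $\gamma_y$. In practice a categorical variable is expanded into several columns that share one loss, a detail this simplified form omits.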
In this study, the GLRM settings, including the number of latent factors and the regularization parameters, were configured to maximize reconstruction accuracy. Quadratic regularization functions were applied, and the maximum number of iterations was set to 1,000. The loss function was set to logistic for binary variables and categorical for the categorical variables. K-means clustering was then applied to the reconstructed data with the Euclidean distance metric. The h2o package in R was used to implement the GLRM and k-means (24). The number of clusters was determined by maximizing the average silhouette width using the cluster package (25). The silhouette score compares tightness (i.e., how close a data point is to the other points in the same cluster) with separation (i.e., how far a data point is from the points in other clusters) (26). To assess whether unequal variance across the latent dimensions affected clustering performance, we evaluated the silhouette score for the latent representation matrix both before and after standardization, prior to applying k-means. After determining the clusters, differences among them were examined using chi-square tests. To account for the increased risk of Type I error due to multiple comparisons, we applied the Bonferroni correction to adjust p-values. To assess the strength of association between cluster membership and categorical variables, Cramér’s V was also calculated following each chi-square test. This study was deemed non-human subjects research by the authors’ Institutional Review Board (274140).
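The cluster comparison statistics described above can be illustrated with a short, dependency-free Python sketch. It is not the study's R code: it computes the Pearson chi-square statistic and the plain (uncorrected) Cramér's V from a contingency table of counts, and applies the Bonferroni adjustment to p-values that are assumed to have been obtained elsewhere, since converting the statistic to a p-value requires the chi-square distribution from a statistics library.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows of counts.

    Assumes every row and column total is positive, so expected counts are nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1))), without bias correction."""
    stat, n = chi_square_statistic(table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(stat / (n * k))


def bonferroni(p_values):
    """Multiply each p-value by the number of tests and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy 2x2 table (rows: two clusters, columns: outcome yes/no); the counts are made up.
toy_table = [[10, 20], [20, 10]]
toy_v = cramers_v(toy_table)
toy_adjusted = bonferroni([0.01, 0.5])
```

Cramér's V ranges from 0 (no association) to 1, which makes it useful in a cohort this large, where even weak associations can produce very small p-values.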
Results
A cohort of 22,169 children with asthma was identified in 2018. The cohort consisted of 58% males, 59% children aged 5-11 years, 45% non-Hispanic White (“White”), 36% non-Hispanic Black (“Black”), 6% Hispanic, 4.3% from other racial and ethnic backgrounds, and 8.7% missing race and ethnicity (Table 1).
Table 1.
Demographic characteristics of Medicaid enrolled children with asthma by cluster, 2018
| Variable | Value | Overall | Cluster 1 (n=3367) | Cluster 2 (n=3765) | Cluster 3 (n=3644) | Cluster 4 (n=3371) | Cluster 5 (n=5380) | Cluster 6 (n=2642) | p-value | Adjusted p-value | Cramer’s V |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | 5-11 | 12989 (59%) | 62.2% | 64.1% | 53.6% | 52.3% | 60.0% | 58.1% | <0.001 | <0.001 | 0.09 |
| Gender | Female | 9240 (42%) | 40.8% | 39.6% | 43.8% | 41.9% | 42.0% | 41.9% | 0.01 | 0.58 | 0.03 |
| Race | Black | 7888 (36%) | 65.9% | 31.1% | 25.7% | 34.8% | 36.8% | 15.3% | <0.001 | <0.001 | 0.16 |
| Race | Hispanic | 1391 (6%) | 5.2% | 7.6% | 7.8% | 5.3% | 5.0% | 7.6% | |||
| Race | missing | 1934 (8.7%) | 9.3% | 9.6% | 10.2% | 8.3% | 7.2% | 8.5% | |||
| Race | Other | 1085 (4.3%) | 4.0% | 5.7% | 5.5% | 5.1% | 3.8% | 5.9% | |||
| Race | White | 9871 (45%) | 15.6% | 45.9% | 50.8% | 46.5% | 47.2% | 62.8% | |||
Cluster characteristics
The optimal number of clusters was 6, based on a silhouette score of 0.16. Standardizing the reconstructed matrix did not change the silhouette score. The six clusters were then interpreted based on study characteristics.
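The selection of the number of clusters by average silhouette width can be sketched as follows. This is a dependency-free Python illustration on made-up toy data, not the h2o and cluster package workflow used in the study. The farthest-first seeding is an assumption chosen only to make the example deterministic.

```python
import math


def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with Euclidean distance; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def mean_silhouette(points, labels):
    """Average of (b - a) / max(a, b), where a is the mean distance to the point's own
    cluster (tightness) and b is the mean distance to the nearest other cluster (separation)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes a width of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Toy data: three tight 5x5 grids of points centered far apart.
offsets = [(0.3 * dx, 0.3 * dy) for dx in range(-2, 3) for dy in range(-2, 3)]
points = [(cx + ox, cy + oy) for cx, cy in [(0, 0), (10, 0), (0, 10)] for ox, oy in offsets]

scores = {k: mean_silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
```

On well-separated toy data the average silhouette width is close to 1 at the true number of groups. A value such as the 0.16 reported here indicates that the clusters in the study data overlap substantially.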
Demographics
There were significant differences among the clusters in both age and race distributions (adjusted p<0.001). Clusters 1, 2, and 5 had higher proportions of younger children (ages 5-11). Cluster 1 had the highest proportion of Black children (65.9%), while Cluster 6 had the highest proportion of White children (62.8%) (Table 1).
Comorbid Conditions
There were significant differences in the proportions of comorbid conditions across the six clusters, except for autism (adjusted p-value=1), diabetes (adjusted p-value=1), iron deficiency (adjusted p-value=1), primary immunodeficiency (adjusted p-value=0.1) and sleep-related breathing disorders (adjusted p-value=1) (Additional File 2). Cluster 1 had the highest percentage of children with allergic conditions, including allergic rhinitis (66.3%), atopic dermatitis (24.8%), and food allergy (8.9%), followed by Cluster 2 with 55.9%, 14.6%, and 6.5%, respectively.
Medications
The controller medication intensity varied significantly across the six clusters (all adjusted p<0.001), with ICS therapy having notably higher percentages across all clusters (except Cluster 4) than LTRA and ICS/LABA. Cluster 2 had the highest use of ICS (98.9%), Cluster 5 had the highest use of LTRA (6.6%), and Cluster 3 had the highest use of ICS/LABA (9.2%). Cluster 4 had no controller prescriptions. Despite the lack of controller use in Cluster 4, 12.6% of children in this cluster had prescriptions for albuterol.
Fluticasone was the most commonly filled controller medication in Clusters 1, 2, and 5, with Cluster 2 showing the highest usage (98.4%). Budesonide-formoterol was the most commonly filled ICS/LABA, with Cluster 3 showing the highest usage (6.2%). Albuterol was also frequently used across all clusters (except Cluster 4), with nearly all children in Clusters 1 (98.4%) and 2 (95.3%) being prescribed albuterol (Table 2).
Table 2.
Controller medication intensity and asthma medication use among Medicaid enrolled children with asthma by cluster, 2018
| Variable | Cluster 1 (n=3367) | Cluster 2 (n=3765) | Cluster 3 (n=3644) | Cluster 4 (n=3371) | Cluster 5 (n=5380) | Cluster 6 (n=2642) | p-value | Adjusted p-value | Cramer’s V |
|---|---|---|---|---|---|---|---|---|---|
| Controller Medication Intensity | |||||||||
| Leukotriene Modifier | 1.0% | 0.6% | 2.1% | 0.0% | 6.6% | 3.8% | <0.001 | <0.001 | 0.15 |
| Inhaled Corticosteroid | 70.7% | 98.9% | -a | 0.0% | 60.2% | 36.6% | <0.001 | <0.001 | 0.72 |
| Inhaled Corticosteroid/Long-Acting Beta-2-Agonist | 5.4% | 0.5% | 9.2% | 0.0% | 1.7% | 3.5% | <0.001 | <0.001 | 0.18 |
| Medications | |||||||||
| Albuterol | 98.4% | 95.3% | 92.1% | 12.6% | 88.8% | 83.9% | <0.001 | <0.001 | 0.72 |
| Montelukast | 26.0% | 35.3% | 8.9% | 0.0% | 26.6% | 20.0% | <0.001 | <0.001 | 0.29 |
| Beclomethasone | 1.4% | 3.5% | - a | 0.0% | 1.7% | 1.5% | <0.001 | <0.001 | 0.09 |
| Fluticasone | 72.7% | 98.4% | 3.7% | 0.0% | 60.1% | 37.3% | <0.001 | <0.001 | 0.70 |
| Budesonide-Formoterol | 3.0% | - a | 6.2% | 0.0% | 1.0% | 2.1% | <0.001 | <0.001 | 0.15 |
| Fluticasone-Salmeterol | 0.5% | 0.0% | 0.8% | 0.0% | - a | - a | <0.001 | <0.001 | 0.05 |
| Mometasone-Formoterol | 2.1% | - a | 2.5% | 0.0% | 0.5% | 1.1% | <0.001 | <0.001 | 0.09 |
a: The percentage is not displayed due to the small sample size.
Number of refills
A higher percentage of children in Clusters 1 (19.7%) and 2 (17.9%) had 7 or more controller refills in 2018, while only 8.7% of children in Cluster 3 had 7 or more controller medication refills, and Cluster 4 had no controller refills.
Notably, higher percentages of children had 3 or more refills for reliever medications in Cluster 1 (53.1%) and Cluster 2 (48.8%), whereas nearly all children in Cluster 4 (99.2%) had either no reliever prescriptions or only one fill (Table 3). Clusters 2 and 5 had the highest percentages of children with oral corticosteroid bursts (40.5% and 40.2%, respectively).
Table 3.
Number of controller, reliever, and oral corticosteroid burst fills among Medicaid enrolled children with asthma by cluster, 2018
| Variable | Value | Cluster 1 (n=3367) | Cluster 2 (n=3765) | Cluster 3 (n=3644) | Cluster 4 (n=3371) | Cluster 5 (n=5380) | Cluster 6 (n=2642) | p-value | Adjusted p-value | Cramer’s V |
|---|---|---|---|---|---|---|---|---|---|---|
| No. of Refills | ||||||||||
| Controllers | 0 | 22.8% | 0.0% | 88.5% | 100.0% | 31.5% | 56.1% | <0.001 | <0.001 | 0.41 |
| Controllers | 1-6 | 57.6% | 82.1% | 2.8% | 0.0% | 53.6% | 29.9% | |||
| Controllers | 7-12 | 15.9% | 13.8% | 5.8% | 0.0% | 11.4% | 9.8% | |||
| Controllers | >=13 | 3.8% | 4.1% | 2.9% | 0.0% | 3.5% | 4.3% | |||
| Relievers | 0-1 | 23.7% | 28.0% | 46.4% | 99.2% | 52.7% | 55.1% | <0.001 | <0.001 | 0.25 |
| Relievers | 2 | 23.2% | 23.2% | 23.5% | 0.7% | 23.0% | 18.6% | |||
| Relievers | 3-6 | 45.5% | 40.2% | 23.7% | 0.0% | 16.2% | 19.0% | |||
| Relievers | 7-12 | 7.1% | 8.0% | 5.9% | 0.0% | 7.5% | 6.4% | |||
| Relievers | >=13 | 0.5% | 0.6% | 0.5% | 0.0% | 0.7% | 0.9% | |||
| Oral Corticosteroid Bursts | | 35.3% | 40.5% | 33.0% | 15.7% | 40.2% | 35.9% | <0.001 | <0.001 | 0.17 |
Social determinants of health
Clusters 1 and 2 had the lowest percentages of children living in rural areas (16.2% and 15.2%, respectively). A majority of children in Cluster 5 (76.4%) were from rural areas. Individual-level Z codes for social determinants of health were undercoded across all clusters, showing no significant differences in problems related to other psychosocial circumstances (adjusted p=1), primary support group (adjusted p=1), and social environment (adjusted p=1). A sensitivity analysis was performed excluding the social determinants of health Z codes. Although these codes were underutilized, their exclusion resulted in a weaker clustering structure (silhouette score of 0.14); therefore, they were retained in the clustering process. A higher percentage of children in Cluster 1 were from neighborhoods with very low child opportunity index scores in education (90.5%), health and environment (85.5%), and social and economic factors (94.4%) compared to the other clusters. Clusters 4 and 5 also showed higher percentages of children from very low and low opportunity neighborhoods in terms of education (Cluster 4: 61.8%, Cluster 5: 80.8%), health and environment (Cluster 4: 63%, Cluster 5: 63.1%), and social and economic factors (Cluster 4: 66.5%, Cluster 5: 81.8%) (Table 4).
Asthma exacerbations and healthcare utilization
Overall, the rate of asthma-related ED visits was 9.0%, with 7.5% having 1 ED visit and 1.5% having ≥2 ED visits. The rate of asthma-related hospitalizations was 4.6%, with 3.5% having only 1 hospitalization and 1.1% having ≥2 hospitalizations.
Clusters 1 (12.1%) and 2 (10.8%) had the highest percentages of children with at least one ED visit. Cluster 4 had the highest percentage of hospitalizations, with 6.4% having only one hospitalization and 1.9% having two or more. Both Clusters 1 and 2 had higher spirometry usage (42.6% and 46.1%, respectively), and Cluster 1 had the highest utilization of FeNO (4.1%). Cluster 4 had the lowest usage of spirometry and FeNO (3.7% and 0%, respectively). Cluster 4 also showed no use of controller medications, and only 0.7% of its children had two or more reliever fills in 2018 (Tables 2 and 3).
The percentage of children meeting the HEDIS indicator requirements for persistent asthma was calculated for each cluster. The HEDIS population was distributed across the clusters, with Clusters 1 and 2 having the highest percentages of children meeting the HEDIS requirements (37.6% and 38.3%, respectively), while Cluster 4 had no children meeting the HEDIS requirements.
Adverse outcomes were assessed within each cluster among both HEDIS-flagged and non-HEDIS-flagged children to identify the proportion of children experiencing adverse events outside the HEDIS population. For ED visits, Cluster 1 continued to show a higher rate of adverse outcomes even among the non-HEDIS population. In the HEDIS population, Cluster 2 had the highest rate of hospitalizations (5.3%) (Table 5). In the non-HEDIS population, Cluster 4 had the highest hospitalization rate (8.3%), which also exceeded the rates observed in the HEDIS population. This unexpectedly high hospitalization rate in the non-HEDIS population was further investigated by examining the number of claims with an asthma diagnosis: 70% of the children in Cluster 4 had only one claim with an asthma diagnosis.
Table 5.
Asthma-related adverse outcomes, HEDIS status and healthcare utilization among Medicaid enrolled children with asthma by cluster, 2018
| Variable | Value | Cluster 1 (n=3367) | Cluster 2 (n=3765) | Cluster 3 (n=3644) | Cluster 4 (n=3371) | Cluster 5 (n=5380) | Cluster 6 (n=2642) | p-value | Adjusted p-value | Cramer’s V |
|---|---|---|---|---|---|---|---|---|---|---|
| ED (overall) | 0 | 87.9% | 89.2% | 92.4% | 92.7% | 91.0% | 92.5% | <0.001 | <0.001 | 0.05 |
| ED (overall) | 1 | 9.5% | 8.6% | 6.5% | 6.5% | 7.3% | 6.0% | |||
| ED (overall) | >=2 | 2.6% | 2.2% | 1.1% | 0.8% | 1.7% | 1.4% | |||
| Hospitalization (overall) | 0 | 96.0% | 95.7% | 95.4% | 91.7% | 96.5% | 96.2% | <0.001 | <0.001 | 0.06 |
| Hospitalization (overall) | 1 | 3.2% | 3.2% | 3.2% | 6.4% | 2.6% | 3.1% | |||
| Hospitalization (overall) | >=2 | 0.7% | 1.1% | 1.4% | 1.9% | 0.9% | 0.7% | |||
| FeNO | | 4.1% | 2.5% | 1.7% | 0.0% | 1.3% | 1.2% | <0.001 | <0.001 | 0.09 |
| Spirometry | | 42.6% | 46.1% | 24.1% | 3.7% | 18.9% | 22.6% | <0.001 | <0.001 | 0.32 |
| HEDIS (Yes/No) | | 37.6% | 38.3% | 9.8% | 0% | 20.2% | 17.4% | <0.001 | <0.001 | 0.33 |
| ED (HEDIS) | 0 | 86.1% | 87.0% | 86.2% | 100% | 89.0% | 89.6% | 0.17 | 0.35 | 0.04 |
| ED (HEDIS) | 1 | 10.4% | 9.3% | 9.9% | 0% | 8% | 6.1% | |||
| ED (HEDIS) | >=2 | 3.5% | 3.7% | 3.9% | 0% | 3% | 4.3% | |||
| Hospitalization (HEDIS) | 0 | 96.1% | 94.7% | 97.1% | 100% | 96.5% | 97.1% | 0.05 | 0.1 | 0.04 |
| Hospitalization (HEDIS) | 1 | 3.9% | 5.3% | 2.9% | 0% | 3.5% | 2.9% | |||
| Hospitalization (HEDIS) | >=2 | - a | - a | - a | 0% | - a | - a | |||
| ED (non-HEDIS) | 0 | 89% | 90.5% | 93.1% | 92.7% | 91.5% | 93.2% | <0.001 | <0.001 | 0.04 |
| ED (non-HEDIS) | 1 | 9.0% | 8.2% | 6.1% | 6.5% | 7.1% | 6.0% | |||
| ED (non-HEDIS) | >=2 | 2.0% | 1.3% | 0.8% | 0.8% | 1.4% | 0.8% | |||
| Hospitalization (non-HEDIS) | 0 | 96.0% | 96.3% | 95.2% | 91.7% | 96.5% | 96.0% | <0.001 | <0.001 | 0.06 |
| Hospitalization (non-HEDIS) | 1 | 3.1% | 2.6% | 3.3% | 6.4% | 2.5% | 3.3% | |||
| Hospitalization (non-HEDIS) | >=2 | 0.9% | 1.1% | 1.5% | 1.9% | 1% | 0.7% | |||
a: Due to small sample sizes, hospitalization≥2 has been merged with hospitalization=1. HEDIS: Health Plan Employer Data and Information Set; ED: emergency department; FeNO: fractional exhaled nitric oxide.
Discussion
Clustering approaches have been used to describe distinct clinical phenotypes of asthma. One study identified five clusters that differed in pathophysiologic mechanisms by lung function, age of asthma onset and duration, atopy, sex, medication use, and healthcare utilization (27). Another assessed the clinical relevance and temporal stability of phenotypic clusters of children with asthma and identified five clusters based on atopic burden, lung function, and exacerbation rate (28). Both studies highlight the importance of unsupervised clustering approaches in understanding asthma subtypes within and across asthma severity levels and the risk of future asthma exacerbations (27,28).
The current study differs from prior studies in multiple ways. Instead of data from a clinical trial (28) or an intensive asthma characterization study (27), the current study uses administrative health claims data to develop variables to identify children with asthma. This approach may enhance the inclusivity and applicability of findings for population health management. Additionally, the current study uses the zip codes available in the APCD to account for social determinants of health, such as the child opportunity index, in the clustering process. Including social determinants of health in the clustering process extends the application of unsupervised learning approaches to a focus on health disparities. This ultimately supports the design of interventions aimed at improving asthma management in a manner that accounts for these disparities.
Building on this framework, the findings of the current study provide novel insights into asthma population health improvement by identifying distinct clusters among children with asthma. This study identified six clusters characterized by varying asthma exacerbations, differences in race and ethnicity, rurality, child opportunity index, medication intensity, and overall medication use. We also found differences in asthma exacerbations stratified by an established HEDIS measure for persistent asthma, which is used to identify children with persistent asthma who may be at risk for poor outcomes.
We identified two clusters with the highest rates of ED visits, Cluster 1 and Cluster 2. The characteristics of these clusters, however, are distinct. Both Cluster 1 and Cluster 2 had high rates of children in urban areas and had the highest proportions of children meeting the HEDIS criteria. Cluster 1 had the highest rate of Black children, high rates of reliever inhaler overuse, adequate controller fills, and low child opportunity index scores. Cluster 2 also had high rates of reliever inhaler overuse but had higher rates of oral corticosteroid bursts and of asthma-related hospitalizations within the HEDIS population, despite higher rates of ICS use compared to all other clusters and high child opportunity index scores. Both Cluster 1 and Cluster 2 had higher rates of spirometry use. These two urban populations, one with low neighborhood opportunity and one with high neighborhood opportunity, highlight the utility of understanding differences in risk factors within and across regions for poor asthma outcomes.
Our findings show that Black urban-dwelling children experience high rates of asthma-related ED visits, consistent with prior studies (29,30). Most children in Cluster 1 were from areas with high neighborhood inequality (very low and low child opportunity index). Our findings may be explained by the effects of environmental injustice and institutional racism impacting differential exposures to asthma triggers (e.g., housing, allergens, mold, pollution) as a root cause of racial disparities in asthma outcomes (31). In contrast to prior studies that have described differences in access to controller inhalers among Black children (32), we found higher rates of adequate controller fills (defined as >6) in Cluster 1; however, reliever inhaler overuse (>3) suggests persistent symptoms and poor outcomes. In Cluster 2, most children were from areas with high neighborhood opportunity (high and very high child opportunity index) but with higher rates of oral corticosteroid bursts. This could represent more severe disease or underlying differences in access to acute medications for asthma symptoms, such as oral corticosteroids.
Regarding hospitalizations, Cluster 4 had the highest rate; however, when analyzed by HEDIS criteria, most children in Cluster 4 did not meet those criteria (most children in Cluster 4 had only one asthma claim). This indicates a potential limitation of the study in differentiating whether these children have intermittent asthma, poorly controlled asthma, or undiagnosed persistent asthma. A study of 362 children seeking ED care for respiratory symptoms showed that 36% had undiagnosed asthma and 66% had more than one asthma exacerbation in the prior 12 months (33). These findings underscore the need for improved strategies to enhance asthma diagnosis and identification of at-risk children.
In addition to informing population-level interventions that address social and environmental inequities, clustering results also highlight clinically relevant patterns that may guide more immediate, individualized medical interventions. For example, Cluster 4 was characterized by higher hospitalization rates among children not using controller medications, while Cluster 1 reflected frequent utilization among children with comorbid allergic conditions. These patterns suggest that unsupervised learning can help identify children who may benefit from targeted clinical outreach, medication optimization, or enhanced follow-up, even in the absence of structural reforms.
This study has several limitations. First, the current methodological approach of using GLRM followed by k-means clustering with Euclidean distance may not fully capture the true clustering structure of the data. The reconstruction process in GLRM can distort the geometry of the latent space, potentially affecting the accuracy of clustering (as indicated by a weak silhouette score). Therefore, further methodological refinement is warranted to enhance clustering performance. Second, the medications analyzed are those that were filled, which may not represent the total number or all types of medications prescribed by providers. Furthermore, our analysis does not account for adherence, as filling a prescription does not necessarily mean the medication was taken. Third, the individual-level Z codes are undercoded, as evidenced by other studies (34). Fourth, the clusters identified in this study may not be temporally stable, as the study does not span multiple years of data. Fifth, since this study relies on insurance claims data, the age of asthma onset among the children is not available, unlike in other studies (27,28). Lastly, the findings of this study are based on Medicaid-enrolled children from a single state.
Conclusions
Clustering identified children from low child opportunity index areas with a high proportion of Black children as a high-risk group for asthma exacerbations, highlighting the potential of population stratification to tailor interventions. Policymakers and healthcare systems can use cluster analysis to identify cohorts of children who would benefit from social risk interventions. This is essential, as the results of a systematic review indicate that social risk interventions are associated with reduced asthma-related ED visits and hospitalizations (35). Although these findings are based solely on Medicaid-enrolled children from a single state, this study provides a framework for extending similar analyses to other populations (e.g., other states) and insurance payors.
Supplementary Material
Table 4.
Social determinants of health among Medicaid-enrolled children with asthma by cluster, 2018
| Variable | Level | Cluster 1 (n=3367) | Cluster 2 (n=3765) | Cluster 3 (n=3644) | Cluster 4 (n=3371) | Cluster 5 (n=5380) | Cluster 6 (n=2642) | p-value | Adjusted p-value | Cramer’s V |
|---|---|---|---|---|---|---|---|---|---|---|
| Rural/Urban designation | | | | | | | | | | |
| Rural | | 16.2% | 15.2% | 18.1% | 41.4% | 76.4% | 49.4% | <0.001 | <0.001 | 0.50 |
| Social determinants of health Z codes | | | | | | | | | | |
| Problems Related to Education and Literacy | | 1.4% | 1.0% | 1.2% | 1.6% | 1.7% | 0.6% | <0.001 | 0.01 | 0.03 |
| Problems Related to Other Psychosocial Circumstances | | 0.8% | 0.5% | 0.7% | 0.7% | 0.6% | - a | 0.12 | 1 | 0.02 |
| Other Problems Related to Primary Support Group, Including Family Circumstances | | 1.6% | 1.3% | 1.9% | 1.7% | 1.4% | 1.1% | 0.08 | 1 | 0.02 |
| Problems Related to Social Environment | | 0.5% | 0.3% | 0.5% | 0.4% | 0.5% | - a | 0.48 | 1 | 0.01 |
| Stress | | - a | 0.4% | 0.5% | 0.6% | 1.4% | - a | <0.001 | <0.001 | 0.06 |
| Problems Related to Upbringing | | 1.3% | 1.9% | 4.0% | 5.0% | 2.5% | 2.3% | <0.001 | <0.001 | 0.07 |
| COI Domains | | | | | | | | | | |
| Education | very low | 90.5% | 8.2% | 9.1% | 39.7% | 25.3% | 0.7% | <0.001 | <0.001 | 0.49 |
| Education | low | 2.4% | 8.4% | 11.5% | 22.1% | 55.5% | 14.0% | | | |
| Education | moderate | 7.1% | 8.3% | 14.2% | 14.9% | 13.5% | 75.8% | | | |
| Education | high | - b | 45.9% | 41.2% | 13.5% | 5.1% | - b | | | |
| Education | very high | - b | 29.1% | 23.9% | 9.8% | 0.6% | 9.5% | | | |
| Health and Environment | very low | 85.5% | 4.6% | 1.8% | 35.8% | 38.8% | 13.1% | <0.001 | <0.001 | 0.45 |
| Health and Environment | low | 14.0% | 28.7% | 35.6% | 27.2% | 24.3% | 0.0% | | | |
| Health and Environment | moderate | 0.5% | 21.3% | 21.8% | 18.6% | 30.1% | 5.8% | | | |
| Health and Environment | high | 0.0% | 18.0% | 13.0% | 9.2% | 2.6% | 80.0% | | | |
| Health and Environment | very high | 0.0% | 27.4% | 27.7% | 9.2% | 4.2% | 1.1% | | | |
| Social and Economic | very low | 94.4% | 1.1% | 4.9% | 40.1% | 33.6% | 18.4% | <0.001 | <0.001 | 0.43 |
| Social and Economic | low | 4.9% | 18.6% | 20.4% | 26.4% | 48.2% | - b | | | |
| Social and Economic | moderate | 0.7% | 18.5% | 21.7% | 12.3% | 10.3% | 46.4% | | | |
| Social and Economic | high | - b | 28.5% | 23.3% | 13.4% | 7.6% | 34.0% | | | |
| Social and Economic | very high | 0.0% | 33.4% | 29.6% | 7.7% | 0.4% | 1.2% | | | |
a: The percentage is not displayed due to the small sample size.
b: When the sample size was small, the percentage was combined with the category level above it.
COI: Child Opportunity Index
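Table 4 reports Cramer’s V as an effect size for the association between cluster membership and each variable. As a minimal sketch of how such a value can be obtained, the code below computes V = sqrt(chi-square / (N × min(r − 1, c − 1))) for the rural/urban row. It assumes counts can be approximated by multiplying the reported percentages by the cluster sizes and that every non-rural child is urban; the authors computed their statistics from the underlying data, so this is an illustrative reconstruction rather than their procedure. The function and variable names are hypothetical.

```python
import math


def cramers_v(table):
    """Cramer's V for an r x c contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(table) - 1, len(table[0]) - 1)
    return math.sqrt(chi2 / (n * k))


# Cluster sizes and rural percentages as reported in Table 4.
cluster_sizes = [3367, 3765, 3644, 3371, 5380, 2642]
rural_pct = [16.2, 15.2, 18.1, 41.4, 76.4, 49.4]

# Assumption: approximate counts reconstructed from rounded percentages.
rural = [round(n * p / 100) for n, p in zip(cluster_sizes, rural_pct)]
# Assumption: children not classified as rural are treated as urban.
urban = [n - r for n, r in zip(cluster_sizes, rural)]

v_rural = cramers_v([rural, urban])
```

Under these assumptions the reconstructed value agrees closely with the 0.50 reported in the table. The same function applies to the five-level COI rows, although cells suppressed or combined for small sample sizes cannot be recovered from the published percentages.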
Footnotes
Declaration of Interest
Ethics approval and consent to participate: The University of Arkansas for Medical Sciences Institutional Review Board deemed this study non-human subjects research. During the course of preparing this work, the author(s) used ChatGPT4 from OpenAI for the purpose of linguistic and grammatical proofreading. Following the use of this tool/service, the author(s) formally reviewed the content for its accuracy and edited it as necessary. The author(s) take full responsibility for all the content of this publication.
Competing interests: The authors declare that they have no competing interests.
Availability of data and materials:
The datasets used in this study are not publicly available due to a Data Use Agreement with the Arkansas Insurance Department (AID) and the Arkansas Center for Health Improvement (ACHI). Researchers must request access to the data directly from the Arkansas Center for Health Improvement.
References
1. Nath JB, Hsia RY. Children’s emergency department use for asthma, 2001–2010. Acad Pediatr. 2015;15:225.
2. Fuhrman C, Dubus JC, Marguet C, Delacourt C, Thumerelle C, De Blic J, Delmas MC. Hospitalizations for asthma in children are linked to undertreatment and insufficient asthma education. J Asthma. 2011;48:565–571.
3. Hasegawa K, Tsugawa Y, Brown DFM, Camargo CA. Childhood asthma hospitalizations in the United States, 2000-2009. J Pediatr. 2013;163:1127.
4. Ojo RO, Okobi OE, Ezeamii PC, Ezeamii VC, Nwachukwu EU, Gebeyehu YH, Okobi E, David AB, Akinsola Z. Epidemiology of current asthma in children under 18: a two-decade overview using National Center for Health Statistics (NCHS) data. Cureus. 2023;15.
5. Goff SL, Shieh MS, Lindenauer PK, Ash AS, Krishnan JA, Geissler KH. Differences in health care utilization for asthma by children with Medicaid versus private insurance. Popul Health Manag. 2024;27:105–113.
6. Kaufmann J, Marino M, Lucas J, Bailey SR, Giebultowicz S, Puro J, Ezekiel-Herrera D, Suglia SF, Heintzman J. Racial and ethnic disparities in acute care use for pediatric asthma. Ann Fam Med. 2022;20:116–122.
7. Chang J, Freed GL, Prosser LA, Patel I, Erickson SR, Bagozzi RP, Balkrishnan R. Comparisons of health care utilization outcomes in children with asthma enrolled in private insurance plans versus Medicaid. J Pediatr Health Care. 2014;28:71–79.
8. Hatoun JTEVL. Identifying children at risk of asthma exacerbations: beyond HEDIS. Am J Manag Care. 2018;24:e170–e174.
9. Asthma Medication Ratio (AMR) - NCQA [Internet]. [cited 2024 Oct 10]. Available from: https://www.ncqa.org/hedis/measures/medication-management-for-people-with-asthma-and-asthma-medication-ratio/.
10. Cabana MD, Slish KK, Nan B, Clark NM. Limits of the HEDIS criteria in determining asthma severity for children. Pediatrics. 2004;114:1049–1055.
11. Jefferson AA, Brown CC, Eyimina A, Goudie A, Rezaeiahari M, Perry TT, Mick Tilford J. Asthma quality measurement and adverse outcomes in Medicaid-enrolled children. Pediatrics. 2023;152.
12. Federico MJ, McFarlane AE, Szefler SJ, Abrams EM. The impact of social determinants of health on children with asthma. J Allergy Clin Immunol Pract. 2020;8:1808–1814.
13. Tyris J, Gourishankar A, Ward MC, Kachroo N, Teach SJ, Parikh K. Social determinants of health and at-risk rates for pediatric asthma morbidity. Pediatrics. 2022;150.
14. Chen S, Bergman D, Miller K, Kavanagh A, Frownfelter J, Showalter J. Using applied machine learning to predict healthcare utilization based on socioeconomic determinants of care. Am J Manag Care. 2020;26:26–31.
15. Arkansas All-Payer Claims Database. Available from: https://www.arkansasapcd.net/Home/.
16. Cloutier MM, Baptist AP, Blake KV, Brooks EG, Bryant-Stephens T, DiMango E, Dixon AE, Elward KS, Hartert T, Krishnan JA, et al. 2020 Focused updates to the asthma management guidelines: A report from the National Asthma Education and Prevention Program Coordinating Committee Expert Panel Working Group. J Allergy Clin Immunol. 2020;146:1217.
17. Bloom CI, Cabrera C, Arnetorp S, Coulton K, Nan C, van der Valk RJP, Quint JK. Asthma-related health outcomes associated with short-acting β2-agonist inhaler use: An observational UK study as part of the SABINA global program. Adv Ther. 2020;37:4190–4208.
18. USDA ERS - Rural-Urban Commuting Area Codes. Available from: https://www.ers.usda.gov/data-products/rural-urban-commuting-area-codes.aspx.
19. Child Opportunity Index (COI) | diversitydatakids.org. Available from: https://www.diversitydatakids.org/child-opportunity-index.
20. Resource on ICD-10-CM coding for social determinants of health | AHA. Available from: https://www.aha.org/dataset/2018-04-10-resource-icd-10-cm-coding-social-determinants-health.
21. Grant RW, McCloskey J, Hatfield M, Uratsu C, Ralston JD, Bayliss E, Kennedy CJ. Use of latent class analysis and k-means clustering to identify complex patient profiles. JAMA Netw Open. 2020;3.
22. Udell M, Horn C, Zadeh R, Boyd S. Generalized Low Rank Models. Foundations and Trends in Machine Learning. 2014;9:1–118.
23. Schuler A, Liu V, Wan J, Callahan A, Udell M, Stark DE, Shah NH. Discovering patient phenotypes using generalized low rank models. Pac Symp Biocomput. 2016;21:144.
24. R Interface for the “H2O” Scalable machine learning platform [R package h2o version 3.44.0.3]. 2024.
25. “Finding groups in data”: Cluster analysis extended Rousseeuw et al. [R package cluster version 2.1.6]. 2023.
26. Rousseeuw PJ. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math. 1987;20:53–65.
27. Moore WC, Meyers DA, Wenzel SE, Teague WG, Li H, Li X, D’Agostino R, Castro M, Curran-Everett D, Fitzpatrick AM, et al. Identification of asthma phenotypes using cluster analysis in the severe asthma research program. Am J Respir Crit Care Med. 2009;181:315.
28. Howrylak JA, Fuhlbrigge AL, Strunk RC, Zeiger RS, Weiss ST, Raby BA. Classification of childhood asthma phenotypes and long-term clinical responses to inhaled anti-inflammatory medications. J Allergy Clin Immunol. 2014;133:1300.e12.
29. Smith LB, O’Brien C, Kenney GM, Waidmann TA. Black-white disparities in asthma hospitalizations and ED visits among Medicaid-enrolled children. Hosp Pediatr. 2024;14:490–498.
30. Akinbami LJ, Moorman JE, Simon AE, Schoendorf KC. Trends in racial disparities for asthma outcomes among children 0-17 years, 2001-2010. J Allergy Clin Immunol. 2014;134:547.
31. Correa-Agudelo E, Ding L, Beck AF, Brokamp C, Altaye M, Kahn RS, Mersha TB. Understanding racial disparities in childhood asthma using individual- and neighborhood-level risk factors. J Allergy Clin Immunol. 2022;150:1427.
32. Lintzenich A, Teufel RJ, Basco WT. Under-utilization of controller medications and poor follow-up rates among hospitalized asthma patients. Hosp Pediatr. 2011;1:8–14.
33. Pade KH, Thompson LR, Ravandi B, Chang TP, Barry F, Halterman JS, … Okelo SO. Children with under-diagnosed asthma presenting to a pediatric emergency department. J Asthma. 2021;59(7):1353–1359.
34. McQuistion K, Stokes S, Allard B, Bhansali P, Davidson A, Hall M, Magyar M, Parikh K. Social determinants of health ICD-10 code use in inpatient pediatrics. Pediatrics. 2023;152.
35. Tyris J, Keller S, Parikh K. Social risk interventions and health care utilization for pediatric asthma: a systematic review and meta-analysis. JAMA Pediatr. 2022;176:E215103.
