AMIA Annual Symposium Proceedings. 2021 Jan 25;2020:263–272.

Mental Health Comorbidity Analysis in Pediatric Patients with Autism Spectrum Disorder Using Rhode Island Medical Claims Data

Katherine A Brown 1, Indra Neil Sarkar 1,2, Elizabeth S Chen 1
PMCID: PMC8075466  PMID: 33936398

Abstract

Identification of comorbidity subgroups linked with Autism Spectrum Disorder (ASD) could provide promising insight into this disorder. This study used the Rhode Island All-Payer Claims Database to examine mental health conditions linked to ASD. Medical claims data for patients with ASD and one or more mental health conditions were analyzed using descriptive statistics, association rule mining (ARM), and sequential pattern mining (SPM). The results indicated that patients with ASD have a higher proportion of mental health diagnoses than the general pediatric population. ARM and SPM methods identified patterns of comorbidities commonly seen among ASD patients. Based on the observed patterns and temporal sequences, suicidal ideation, mood disorders, anxiety, and conduct disorders may need focused attention prospectively. Understanding more about groupings of ASD patients and their comorbidity burden can help bridge gaps in knowledge and make strides toward improved outcomes for patients with ASD.

Introduction

Autism Spectrum Disorder (ASD) is a group of complex behavioral and developmental disorders that have a broad range of clinical presentations including communication barriers, social-emotional deficits, hyperactivity, hypoactivity, and repetitive movements.1 While ASD prevalence continues to rise at an estimated 1 in 59 children based on the latest report by the Centers for Disease Control and Prevention (CDC) in 2014,2 there is still much unknown about this disorder. Epidemiological evidence indicates that certain medical conditions are more prevalent in children with ASD compared to children in the general population, which makes the study of comorbidities an important focus in the field.3 In particular, recent efforts have focused on identifying and characterizing subgroup patterns of ASD comorbidities. In several studies, common comorbidity profiles or subgroups that are linked with ASD include seizures, attention deficit hyperactivity disorder (ADHD), gastrointestinal (GI) disorders and other psychiatric disorders.3–9 A few recent articles have identified that co-occurring psychiatric and medical conditions with ASD account for a substantial burden, and further research and attention is needed in this area for better understanding of the effects of these comorbidities on patient outcomes and direction for overall treatment.10,11 Being able to effectively subcategorize patients with ASD based on comorbidity patterns may have broad implications. A characterization of disease development trajectory can facilitate prognosis, inform treatment decisions, and assist in risk stratification for future complications and safety concerns. Categorizing a heterogeneous patient cohort into more homogeneous subgroups is also the first step toward more powerful genomic and molecular studies that can lead toward a better understanding of the etiologies involved in ASD.12

Previous studies have indicated that children with ASD are at an increased risk for being diagnosed with one or more co-occurring mental health conditions,4,10 but not much is known about the unique relationships and patterns of the different mental health comorbidities along with the unique trajectories and outcomes of these patients. Research focusing on the treatment and understanding of mental health comorbidities associated with ASD patients has been limited.10 Due to overlapping symptoms and diagnostic bias or interpretation, assessment and prognosis can be extremely challenging and complex. Unsupervised learning approaches such as association rule mining (ARM) and sequential pattern mining (SPM) can be used as a preliminary analysis to ascertain similarities and temporal patterns for ASD patients diagnosed with one or more mental health disorders.

Data mining methods can be used to link comorbidities and identify unknown patterns to gain knowledge and understanding about the disorder. ARM is a well-established method to discover and identify unique relationships in a dataset. SPM uses longitudinal data to identify temporal or ordered patterns in a dataset.13 In this study, ARM and SPM were used to find unique associations between different combinations of mental health disorders in ASD patients compared to the general pediatric population. Combining these two methods can help identify common mental health diagnoses and combinations of diagnoses seen in ASD patients. Taking an unsupervised approach throughout is preferred in an attempt to minimize clinician label bias with the goal of uncovering unknown patterns and relationships. Although this study does not present new methods, this work presents a new framework on combining different machine learning techniques used in a variety of fields to find patterns and relationships between mental health comorbidity patterns in patients with ASD from a large dataset with limited previous research.

This study sought to use the All-Payer Claims Database in Rhode Island to better understand mental health conditions linked to ASD. The goal of this study was two-fold. First, this study aimed to identify and validate common mental health conditions seen from patients with ASD by using medical claims data in the state of Rhode Island.14–19 Most research thus far has only been preliminary with little validation from disparate data sources. By using a large set of medical claims data, trends within the ASD population can be tracked and visualized. The diagnosis rates within a large medical claims dataset can further validate common mental health comorbidities seen in the ASD population. The second aim of this study was to further analyze and identify patterns of ASD patients also diagnosed with mental health conditions using ARM and SPM. Further analysis is needed to determine the importance of the subgroups and infer similarities and relationships.

Methods

Overview

This study used data from the Rhode Island All-Payer Claims Database (or HealthFacts RI). All analyses were performed using Julia Version 1.3 and R-statistical software Version 3.6.1. All of the data were preprocessed in a secure computing environment at Brown University. Code for pre-processing and analyses can be found in the supplemental GitHub repository.

Dataset

HealthFacts RI is a deidentified dataset with data beginning in 2011, which includes information from healthcare claims, insurance enrollment information, provider information and other data from Medicare, Medicaid and large commercial insurance companies (Figure 1). This dataset is publicly available and is monitored and run by the State of Rhode Island Department of Health. It comprises over 300 million medical claims and includes insurance claim information for over 1 million patients, which is most of the insured Rhode Island residents.

Figure 1.

Overview of Study Approach

The database consists solely of structured data from medical claims, and the only features of interest for this project were each patient’s internal member ID, age, gender, and all associated diagnoses. ASD patient cases were extracted from the database using MySQL queries that joined different tables of the HealthFacts RI database and extracted each patient’s diagnosis codes, reported as International Classification of Diseases, Ninth and Tenth Revision (ICD-9-CM and ICD-10-CM) codes. Two datasets were extracted: one with all pediatric patients ages 2-21 and another with all pediatric patients ages 2-21 with a diagnosis of ASD (ICD-9/10-CM: 299*/ F84*). The data extracted covered the entire lifetime of the database (2011-2018). The age range was chosen because these are the time points at which ASD is most likely to be detected. Ages 18-21 were included because most pediatricians can see patients up to 21 years old, and because this range captures the development of mental health trajectories into the young adult years.
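The cohort definition above can be illustrated with a small sketch. The study itself used MySQL queries against HealthFacts RI; the Python below is only an illustration, and the row layout, the sample claims, and the names `is_asd_code` and `asd_cohort` are assumptions introduced here, as is the choice to apply the age window to the age at the time of the claim.

```python
def is_asd_code(code: str) -> bool:
    """Return True for ICD-9-CM 299* or ICD-10-CM F84* diagnosis codes."""
    normalized = code.strip().upper()
    return normalized.startswith("299") or normalized.startswith("F84")


# Hypothetical claim rows: (member_id, age_at_claim, diagnosis_code)
claims = [
    ("m1", 4, "299.00"),
    ("m1", 6, "314.01"),
    ("m2", 10, "F90.9"),
    ("m3", 15, "F84.0"),
    ("m4", 30, "F84.0"),  # outside the 2-21 age window
]


def asd_cohort(rows, min_age=2, max_age=21):
    """Collect member IDs with at least one ASD code inside the age window."""
    return {
        member
        for member, age, code in rows
        if min_age <= age <= max_age and is_asd_code(code)
    }


print(sorted(asd_cohort(claims)))  # ['m1', 'm3']
```

Matching on code prefixes mirrors the wildcard notation (299*/F84*) used in the text; a production query would also need to handle codes stored without the decimal point.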

Mental Health Comorbidity Analysis

The data were pre-processed to aggregate various comorbidities into major categories using Phenome Wide Association Studies (PheWAS) categories called PheCodes.20 The dataset consists of all unique patients and their associated PheCode categories. Table 1 shows an example of ICD diagnosis codes mapped to their corresponding PheCode group.

Table 1.

Example of ICD-9/10-CM diagnosis billing codes linked to a PheWAS group for depression

PheWAS Mapping from ICD-9/10-CM to PheCode Example
ICD-9-CM: 311, 296.36, 296.35, 296.34, 296.33, 296.32, 296.31, 296.3, 296.3, 296.26, 296.25, 296.24, 296.23, 296.22, 296.21, 296.2, 296.2
ICD-10-CM: F33.9, F32.4, F33, F32.5, F32.81, F32.8, F33.8, F33.1, F32.1, F33.0, F32.3, F33.41, F33.3, F33.4, F32.2, F32.9, F33.42, F32.0, F32, F33.40, F33.2, F32.89
PheCode: 296.2, 296.22
Phenotype: Depression/Major Depressive Disorder
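The mapping in Table 1 amounts to a lookup from billing codes to a phenotype group. A minimal sketch follows, using only a handful of the codes listed in Table 1; the dictionary layout and the name `phenotypes_for_patient` are illustrative assumptions, and the real PheWAS map is far larger.

```python
# A small excerpt of the depression codes listed in Table 1.
DEPRESSION_ICD9 = {"311", "296.21", "296.22", "296.32"}
DEPRESSION_ICD10 = {"F32.9", "F33.1", "F33.9"}

# Illustrative lookup table: ICD code -> phenotype group label.
ICD_TO_PHENOTYPE = {
    code: "Depression/Major Depressive Disorder"
    for code in DEPRESSION_ICD9 | DEPRESSION_ICD10
}


def phenotypes_for_patient(icd_codes):
    """Map a patient's ICD codes to phenotype groups; unmapped codes are dropped."""
    return {ICD_TO_PHENOTYPE[c] for c in icd_codes if c in ICD_TO_PHENOTYPE}


# Two depression codes collapse to one group; the asthma code is not mapped.
print(phenotypes_for_patient(["311", "F32.9", "J45.909"]))
```

Returning a set means that repeated or synonymous codes for one patient collapse into a single item, which is the form needed for the transaction-based analyses described later.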

The initial MySQL query included all diagnoses for each patient. The PheCodes were then filtered to retain only the mental health diagnoses recorded for each patient in the dataset. This was done for both the entire pediatric population (ages 2-21) and for the dataset with only patients diagnosed with ASD (ages 2-21). Subsets of the ASD patient population were also separated to take a closer look at gender (male, female) and age groups (toddler (2), preschooler (3-5), middle childhood 1 (6-8), middle childhood 2 (9-11), young teen (12-14), teenager (15-17), and young adult (18-21)). The age groups were determined based on the CDC definition of age groups.21 Mental health diagnoses and PheCodes were chosen based on a review article by Rosen et al. that identifies the different mental health diagnosis groups and calls for more research to understand risk and treatment more comprehensively for individuals diagnosed with ASD. These included: anxiety disorders, depressive disorders, bipolar disorder, externalizing/ behavior disorders, oppositional defiant disorder/ conduct disorder, schizophrenia/ psychotic disorders, and ADHD as well as a section with other co-occurring conditions (e.g., post-traumatic stress disorder [PTSD]).10 Table 2 describes the 38 PheWAS groups included for analysis in this study.
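The age stratification described above can be expressed as a simple binning function. This is a sketch that assumes ages are whole years; the names `AGE_GROUPS` and `age_group` are introduced here for illustration only.

```python
# Age bands as listed in the text: (lowest age, highest age, label).
AGE_GROUPS = [
    (2, 2, "toddler"),
    (3, 5, "preschooler"),
    (6, 8, "middle childhood 1"),
    (9, 11, "middle childhood 2"),
    (12, 14, "young teen"),
    (15, 17, "teenager"),
    (18, 21, "young adult"),
]


def age_group(age):
    """Return the age-group label for an integer age, or None if out of range."""
    for low, high, label in AGE_GROUPS:
        if low <= age <= high:
            return label
    return None


print(age_group(10))  # middle childhood 2
```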

Table 2.

Mental health diagnoses PheWAS groups included in study

PheCode | Phenotype | PheCode | Phenotype | PheCode | Phenotype
295.0 | Schizophrenia and other psychotic disorders | 300.11 | Generalized anxiety disorder | 305.2 | Eating disorder
295.1 | Schizophrenia | 300.12 | Agoraphobia; social phobia; and panic disorder | 306.0 | Other mental disorder
295.2 | Paranoid disorders | 300.13 | Phobia | 312.0 | Conduct disorders
295.3 | Psychosis | 300.2 | Generalized anxiety & phobic disorders | 312.3 | Impulse control disorder
296.0 | Mood disorders | 300.3 | Obsessive-compulsive disorder | 313.1 | Attention-deficit/hyperactivity disorder (ADHD)
296.1 | Bipolar | 300.4 | Dysthymic disorder | 313.3 | Autism
296.2 | Depression | 300.8 | Acute reaction to stress | 315.0 | Developmental delays and disorders
296.22 | Major depressive disorder | 300.9 | Posttraumatic stress disorder | 315.1 | Learning disorder
297.0 | Suicidal ideation or attempt | 301.0 | Personality disorders | 315.2 | Speech and language disorders
297.1 | Suicidal ideation | 302.0 | Sexual and gender identity disorders | 315.3 | Mental retardation
297.2 | Suicide or self-injury | 303.3 | Psychogenic disorder | 316.0 | Substance addiction and disorders
300.0 | Anxiety; phobic and dissociative disorder | 303.4 | Somatoform disorder | 317.0 | Alcohol-related disorders
300.1 | Anxiety disorder | 304.0 | Adjustment reaction | |

Association Rule Mining

ARM is a data mining technique that can be useful for determining relationships between data elements through analysis of patterns in frequent itemsets in the data. This unsupervised learning technique, originally proposed by Agrawal et al. in 1993, uses the Apriori algorithm to find rules relating combinations of items that occur together across transactions.22 The classic example of ARM is market basket analysis, where association rules help uncover relationships among products purchased together at a grocery store. It is used in many other fields, including healthcare. Several studies have used ARM to investigate comorbidities among certain patient populations;23–27 however, to the best of our knowledge, few studies have comprehensively studied the associations between mental health comorbidities and ASD. ARM uses an efficient algorithm to compute a complete set of frequent itemsets based on a minimum support parameter. An association rule can be written as {X→Y} (commonly shown as left-hand side {lhs} → right-hand side {rhs}) and expresses how likely Y is to be found in transactions that also contain X. An itemset, {X,Y}, is the list of all items that form the association rule. Table 3 is an example of transactions of mental health comorbidities seen in this study along with two arbitrary examples of returned association rules.

Table 3.

Examples of mental health diagnoses per patient in dataset (transactions) and arbitrary examples of how the association rules are returned (itemset {X,Y} as {X→Y} rule)

Patient ID | Mental Health Diagnoses (Transactions)
1 | ADHD, Anxiety disorder, Depression, Major depressive disorder, Adjustment reaction
2 | Learning disorder, Other mental disorder, Speech and language disorder, Mental retardation
3 | ADHD, Mental retardation, Mood disorders, Conduct disorders, Anxiety disorder

Example Rule | Association Rules
1 | {Developmental delays and disorders, Mood disorders} => {Mental retardation}
2 | {ADHD, Anxiety disorder, Conduct disorders} => {Bipolar}
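To make the notion of transactions and frequent itemsets concrete, the sketch below treats the three example patients in Table 3 as sets of diagnoses and enumerates itemsets that meet a minimum support. It uses brute-force enumeration rather than the candidate pruning of the Apriori algorithm, and the threshold of 0.6 and the function names are illustrative choices, not values from the study.

```python
from itertools import combinations

# The three example transactions from Table 3, one set of diagnoses per patient.
transactions = [
    {"ADHD", "Anxiety disorder", "Depression", "Major depressive disorder",
     "Adjustment reaction"},
    {"Learning disorder", "Other mental disorder",
     "Speech and language disorder", "Mental retardation"},
    {"ADHD", "Mental retardation", "Mood disorders", "Conduct disorders",
     "Anxiety disorder"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Enumerate all itemsets up to max_size whose support meets the threshold."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[frozenset(candidate)] = s
    return result


frequent = frequent_itemsets(transactions, min_support=0.6)
for itemset, s in frequent.items():
    print(sorted(itemset), round(s, 3))
```

With only three patients, the itemsets that survive are those present in at least two of them, such as {ADHD, Anxiety disorder}. Apriori reaches the same result more efficiently by only extending itemsets that are already frequent.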

The support metric indicates how frequently an itemset occurs across all transactions; it is the fraction of transactions containing that particular itemset. Confidence is another metric commonly used to rank and analyze association rules. Confidence is the probability of the occurrence of {Y} given that {X} is present. A higher confidence means that Y is more likely to be present in transactions that contain X. Lift is the ratio of confidence to the baseline probability of occurrence of {Y}. Lastly, this study also used the chi-square metric for evaluation and for the final determination of the statistical significance of the association rules. Chi-square can be calculated directly from the values of support, confidence and lift.28 The chi-square statistic is used to test for independence between the left-hand side and right-hand side of the rule {X→Y}. The null hypothesis is that there is no relationship between the itemsets and they are independent. A higher chi-square value provides stronger evidence that {X} and {Y} are not independent. The contingency table below shows the relationship between itemsets {X,Y} and the formulas for support, confidence, lift, and chi-square (Table 4).

Table 4.

Contingency table for {X→Y} and associated formulas for measures of interestingness used for ranking and evaluation of the results from ARM and SPM rules

[Table 4 image not reproduced: 2×2 contingency table for {X→Y} with the formulas for support, confidence, lift, and chi-square]
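Since the formulas in Table 4 are not available as text, the sketch below uses the standard textbook definitions of these measures. It computes chi-square two ways: from the 2×2 contingency table, and from support, confidence, and lift via an algebraically equivalent identity. Whether these match the exact form printed in Table 4 is an assumption, and the counts in the example are invented for illustration.

```python
def rule_metrics(n, n_x, n_y, n_xy):
    """Support, confidence and lift of X -> Y from transaction counts.

    n: all transactions; n_x, n_y: transactions containing X, Y;
    n_xy: transactions containing both X and Y.
    """
    support = n_xy / n               # P(X and Y)
    confidence = n_xy / n_x          # P(Y | X)
    lift = confidence / (n_y / n)    # P(Y | X) / P(Y)
    return support, confidence, lift


def chi_square_from_counts(n, n_x, n_y, n_xy):
    """Pearson chi-square for the 2x2 table (rows: X yes/no, columns: Y yes/no)."""
    observed = [
        [n_xy, n_x - n_xy],
        [n_y - n_xy, n - n_x - n_y + n_xy],
    ]
    row_totals = [n_x, n - n_x]
    col_totals = [n_y, n - n_y]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def chi_square_from_metrics(n, support, confidence, lift):
    """The same statistic expressed through support, confidence and lift."""
    return (n * (lift - 1) ** 2 * support * confidence
            / ((confidence - support) * (lift - confidence)))


# Invented counts: 1000 patients, 100 with X, 200 with Y, 60 with both.
s, c, l = rule_metrics(1000, 100, 200, 60)
print(s, c, l)
print(chi_square_from_counts(1000, 100, 200, 60))
print(chi_square_from_metrics(1000, s, c, l))
```

Both routes give the same value up to floating-point error. The second form is undefined when confidence equals support or lift equals confidence, which happens when X or Y appears in every transaction.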

The ARules and ARulesViz packages were used in Julia and R to run the Apriori algorithm and visualize the top-ranked rules.29 ARules was run on the dataset of all pediatric patients and then on the ASD dataset. The resulting rules were joined in a table, and rules common to both were removed to leave only the rules unique to ASD. The same process was followed for the gender and age groups. The “transactions” entered into the Apriori algorithm were comma-separated phenotype descriptions of the PheCodes, and the output was a set of ranked rules (Table 3). A support cutoff of 0.01 was used so that each rule was backed by enough transactions to be meaningful while still admitting diagnoses or combinations of diagnoses with lower prevalence in the dataset, which could still be important for evaluation. As mentioned above, confidence, support, and lift are the measures most commonly used to rank rules. The association rules for each group were ranked and evaluated based on these statistics. Chen et al. found that these measures are not always the best for ranking because important rules can be missed when certain itemsets occur infrequently.30 In the case of medical comorbidities, it is important not to overlook less frequent diagnoses, as more common diagnoses can overshadow interesting patterns. The chi-square statistic was therefore used as the final measure to rank the rules and determine statistical significance or importance. ARulesViz was used to generate a visualization of the top rules (see GitHub repository for all code).
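The filtering and ranking step described above (keep rules found in the ASD dataset but not in the general pediatric dataset, then rank by chi-square) can be sketched as follows. The rule representation and function names are assumptions made for illustration; two of the rules and their chi-square values are taken from Table 7, and the shared rule is hypothetical.

```python
# Rules keyed by (lhs itemset, rhs item) with their chi-square statistic.
asd_rules = {
    (frozenset({"Developmental delays and disorders", "Mood disorders"}),
     "Mental retardation"): 413.8745,
    (frozenset({"Conduct disorders", "Mood disorders"}), "Bipolar"): 316.2073,
    (frozenset({"ADHD"}), "Anxiety disorder"): 120.0,  # hypothetical rule and value
}
pediatric_rules = {
    (frozenset({"ADHD"}), "Anxiety disorder"): 95.0,  # hypothetical rule and value
}


def unique_rules(target, reference):
    """Keep rules from target whose (lhs, rhs) pair does not appear in reference."""
    return {rule: stat for rule, stat in target.items() if rule not in reference}


def rank_by_chi_square(rules, top_n=None):
    """Sort rules by chi-square statistic, largest first."""
    ranked = sorted(rules.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


for (lhs, rhs), chi2 in rank_by_chi_square(unique_rules(asd_rules, pediatric_rules)):
    print(sorted(lhs), "=>", rhs, chi2)
```

Matching rules on the (lhs, rhs) pair alone treats a rule as shared regardless of how its statistics differ between the two populations, which is one plausible reading of "rules common between the two".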

Sequential Pattern Mining

SPM is a data mining technique used to detect ordered patterns within a dataset. It was introduced by Rakesh Agrawal in 1995 as a follow-up to association rule mining, and was originally used in retail to determine the order of items purchased sequentially over time.31 Since then, there have been many applications in medicine to help determine risk of health-related events with a temporal relationship. Applications are seen in pharmacology, hospital readmissions, and comorbidity analyses.32–38 In one study, Wang et al. studied comorbidities commonly seen with Type 2 Diabetes.38 To the best of our knowledge, there have not been any applications to ASD. The cSPADE algorithm (SPADE: Sequential Pattern Discovery using Equivalence classes) was run using the R package arulesSequences to obtain ordered items. Each sequence ID corresponded to a patient, the items were that patient’s mental health diagnoses, and the transaction time was the age of the patient when the diagnosis first occurred in the medical claims.
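The quantities that a sequential pattern miner reports can be illustrated without the cSPADE machinery. The sketch below checks whether an ordered pattern occurs in a patient's sequence and derives support and rule confidence from that; it is a plain containment check, not an implementation of cSPADE, and the sequences, function names, and the treatment of the left-hand side as an ordered list of events are all assumptions.

```python
def contains(sequence, pattern):
    """True if each pattern element is a subset of a later and later event."""
    position = 0
    for element in pattern:
        # Advance to the first remaining event that includes this element.
        while position < len(sequence) and not element <= sequence[position]:
            position += 1
        if position == len(sequence):
            return False
        position += 1
    return True


def sequence_support(sequences, pattern):
    """Fraction of patient sequences that contain the ordered pattern."""
    return sum(contains(s, pattern) for s in sequences) / len(sequences)


def sequence_rule_confidence(sequences, lhs, rhs):
    """Support of lhs followed by rhs, divided by support of lhs alone."""
    return sequence_support(sequences, lhs + [rhs]) / sequence_support(sequences, lhs)


# Hypothetical patients: each is a list of events (sets of new diagnoses) by age.
sequences = [
    [{"Autism"}, {"ADHD"}],
    [{"Autism", "Speech and language disorder"}, {"ADHD"}, {"Anxiety disorder"}],
    [{"ADHD"}, {"Autism"}],
    [{"Autism"}],
]

print(sequence_support(sequences, [{"Autism"}, {"ADHD"}]))        # 0.5
print(sequence_rule_confidence(sequences, [{"Autism"}], {"ADHD"}))  # 0.5
```

The third patient does not count toward the pattern because ADHD appears before Autism, which is the ordering information that distinguishes SPM from ARM.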

Since the goal of SPM was to identify common temporal or ordered patterns of diagnosis, the dataset was preprocessed to only include the diagnosis on the first time it appeared as an ICD-9/10-CM code in the medical claims dataset. SPM was performed to analyze the data for frequent diagnosis patterns across the entire ASD population. Further work will be done to break down and analyze more granular age and gender groups along with determining if a next diagnosis in a sequence can be accurately predicted based on the ordered data. Similarly, as mentioned above with the ARM analysis, a support parameter cutoff of 0.01 was used to include enough rules for analysis but still have enough cases within the dataset to be significant. Confidence was used as the measure for ranking the rules and evaluation of significance.
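The first-occurrence preprocessing described above can be sketched as follows: for each patient, keep only the earliest age at which each diagnosis appears, then group diagnoses by that age to form an ordered sequence. The input layout and the name `first_occurrence_sequences` are illustrative assumptions, and the claims shown are invented.

```python
from collections import defaultdict


def first_occurrence_sequences(claims):
    """Build per-patient sequences from (patient_id, age, diagnosis) rows.

    Returns {patient_id: [(age, {diagnoses first seen at that age}), ...]}
    with events ordered by age.
    """
    first_age = {}
    for patient, age, diagnosis in claims:
        key = (patient, diagnosis)
        if key not in first_age or age < first_age[key]:
            first_age[key] = age

    events = defaultdict(lambda: defaultdict(set))
    for (patient, diagnosis), age in first_age.items():
        events[patient][age].add(diagnosis)

    return {
        patient: [(age, by_age[age]) for age in sorted(by_age)]
        for patient, by_age in events.items()
    }


# Invented claims; repeated diagnoses at later ages should be discarded.
claims = [
    ("p1", 5, "Autism"),
    ("p1", 7, "ADHD"),
    ("p1", 9, "ADHD"),
    ("p1", 7, "Anxiety disorder"),
    ("p2", 3, "Autism"),
    ("p2", 3, "Speech and language disorder"),
    ("p2", 6, "Autism"),
]

for patient, sequence in first_occurrence_sequences(claims).items():
    print(patient, sequence)
```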

Results

Some initial descriptive statistics within the dataset were evaluated to examine the diagnosis rates of ASD per year to compare to the CDC national rise in the past decade (2011-2018). As seen in Figure 2, the rise in diagnosis rates since the inception of the HealthFacts RI database (2011) shows a similar trend to the CDC data. It cannot be compared 1-to-1 because the CDC collects survey data at specific time points and ages,2 but the numbers give insight into diagnosis rates within this specific database over time. From the data extracted from HealthFacts RI, there were medical claims data included for a total of 328,378 pediatric patients (ages 2-21), 8,806 ASD patients (ages 2-21), 6,853 males, and 1,976 females. Out of the 8,806 patients with ASD in the dataset, 6,213 (or ~70%) had a mental health diagnosis identified in one of the 38 PheWAS groupings listed in Table 2. Table 5 includes descriptive statistics for mental health disorders and total patients per gender and age group analyzed using ARM.

Figure 2.

Diagnosis rates from HealthFacts RI database

Table 5.

HealthFacts RI database ASD prevalence by gender and age group

Group | Number | Percentage
Pediatric Patients (2-21) | 328,378 | --
ASD Total (Ages 2-21) | 8806 | 2.68%
ASD-Mental Health | 6213 | 70.55%
Males | 6853 | 77.82%
Females | 1976 | 22.44%
Toddler | 387 | 4.44%
Preschooler | 1138 | 13.04%
Middle Childhood 1 | 1418 | 16.25%
Middle Childhood 2 | 1634 | 18.73%
Young Teen | 1619 | 18.56%
Teenager | 1425 | 16.33%
Young Adult | 1103 | 12.64%

Mental Health Comorbidities

Previous studies have shown mental health comorbidities to be more common in patients with a diagnosis of ASD compared to the general pediatric population. The HealthFacts RI database was analyzed to look at the basic statistics of diagnosis rates reported in the dataset. Some phenotypes were grouped together based on similarity of conditions. A majority of mental health conditions showed a higher proportion within the ASD patient population compared to the general pediatric population (Figure 3). Out of the initial 38 mental health PheWAS groupings chosen for this study (Table 2), 14 groups were selected as being most relevant to the ASD population. Some of the groupings were combined (as seen in the PheCode column of Table 6), and some were removed based on prevalence (<0.5%) and specificity of the diagnosis grouping. Table 6 describes these groups, which, with the exception of depression/major depressive disorders, showed higher prevalence in the ASD population.

Figure 3.

Mental Health Comorbidities – All pediatric population compared to ASD pediatric population

Table 6.

Mental health condition statistics from the HealthFacts RI database for ASD pediatric patients ages 2-21

Pediatric Population total: 328378; ASD Population total: 8806
PheCode | Phenotype | Pediatric Total | Pediatric Percentage | ASD Total | ASD Percentage
313.1 | Attention-deficit/hyperactivity disorder (ADHD) | 47110 | 14.35% | 3072 | 34.89%
300.0, 300.1, 300.11, 300.2 | Anxiety disorders | 56141 | 17.10% | 2335 | 26.52%
315.0 | Developmental delays and disorders | 16187 | 4.93% | 1506 | 17.10%
315.2 | Speech and language disorders | 20120 | 6.13% | 1454 | 16.51%
312.0 | Conduct disorders | 21221 | 6.46% | 1135 | 12.89%
296.2, 296.22 | Depression/Major Depressive Disorders | 35456 | 10.80% | 945 | 10.73%
296.0 | Mood disorders | 14652 | 4.46% | 893 | 10.14%
315.3 | Mental retardation | 4081 | 1.24% | 683 | 7.76%
297.0, 297.1, 297.2 | Suicidal ideation or attempt/Suicide or self-injury | 10575 | 3.22% | 438 | 4.97%
300.3 | Obsessive-compulsive disorder | 4075 | 1.24% | 413 | 4.69%
296.1 | Bipolar | 6301 | 1.92% | 327 | 3.71%
300.9 | Posttraumatic stress disorder (PTSD) | 9085 | 2.77% | 286 | 3.25%
312.3 | Impulse control disorder | 2843 | 0.87% | 165 | 1.87%
295.0, 295.1 | Schizophrenia/Psychotic Disorders | 1182 | 0.36% | 72 | 0.82%
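One way to read Table 6 is as a ratio of the two prevalence columns. The sketch below computes that ratio for a few rows using the counts from the table. The name `prevalence_ratio` is introduced here, and because the pediatric population appears to include the ASD patients, the result is not a comparison of ASD against non-ASD patients.

```python
PEDIATRIC_TOTAL = 328378
ASD_TOTAL = 8806

# (pediatric count, ASD count) for selected rows of Table 6.
table6 = {
    "ADHD": (47110, 3072),
    "Anxiety disorders": (56141, 2335),
    "Depression/Major Depressive Disorders": (35456, 945),
    "Mental retardation": (4081, 683),
}


def prevalence_ratio(pediatric_count, asd_count):
    """Prevalence in the ASD population divided by prevalence in all pediatric patients."""
    return (asd_count / ASD_TOTAL) / (pediatric_count / PEDIATRIC_TOTAL)


for phenotype, (pediatric_count, asd_count) in table6.items():
    print(f"{phenotype}: {prevalence_ratio(pediatric_count, asd_count):.2f}")
```

A ratio above 1 means the condition is proportionally more common among ASD patients. Among the rows shown here, only the depression group falls slightly below 1.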

Association Rules

ARM was performed on the pediatric population, ASD, gender, and age groups. Included below are the results for several of the top 30 association rules unique to the ASD patient population (Table 7). ARM analysis showed frequent combinations of anxiety, mood disorders, ADHD, conduct disorders, and suicidal ideation together. This is visualized in the networking diagram shown in Figure 4 of the top 25 rules ranked by chi-square statistic. The combinations of disorders seen in the middle of the diagram occur frequently together in this dataset. This initial analysis of mental health conditions can inform future studies that design algorithms to help predict patients at risk for some of these disorders and the safety concerns associated with them. The co-occurrence of anxiety, mood disorders, and conduct disorders in patients with ASD is supported by multiple studies in the literature.3–9 Suicidal ideation was a somewhat unexpected diagnosis. One combination seen frequently in the dataset had ADHD, anxiety, mood disorders, and conduct disorders on the left-hand side and suicidal ideation on the right-hand side; the high chi-square value indicates that these are likely not independent of each other. There is little literature available on the risk of suicidal ideation or suicide attempt linked to ASD. Future studies are needed to better understand the risk of suicidal ideation/attempt or self-harm in this patient population and identify interventions that could be specifically tailored for patients with ASD. Figure 5 includes a visualization for the top 30 rules in the male and female ASD groups. The female gender group showed more diagnosis patterns centered around depression, which was not seen in the top rules for the entire ASD population. Other combinations of diagnoses seen in the female group included: anxiety, depression, mood disorders, dysthymic disorder, and suicidal ideation. Ten of the top 30 rules for the male group focused more on conduct disorders, impulse control disorders, and ADHD, but overall the male group contained rules similar to those for the whole ASD population. Association rules for each group can be found in the supplementary material in the GitHub repository.

Table 7.

Top association rules for ASD patient population

Association Rules
Rule | lhs | rhs | chiSquare | p-value (χ²) | confidence | support
1 | {ADHD, Anxiety disorder, Mood disorders, Other mental disorder} | {Suicidal ideation} | 657.3981 | 5.50E-145 | 0.5096 | 0.0171
2 | {ADHD, Conduct disorders, Mood disorders, Other mental disorder} | {Suicidal ideation} | 554.7208 | 1.18E-122 | 0.4757 | 0.0158
3 | {ADHD, Anxiety disorder, Conduct disorders, Mood disorders} | {Suicidal ideation} | 548.5109 | 2.65E-121 | 0.4755 | 0.0156
5 | {ADHD, Anxiety disorder, Conduct disorders, Other mental disorder} | {Suicidal ideation} | 520.7474 | 2.91E-115 | 0.4202 | 0.0174
8 | {ADHD, Conduct disorders, Suicidal ideation} | {Mood disorders} | 452.2976 | 2.28E-100 | 0.7355 | 0.0183
11 | {ADHD, Conduct disorders, Other mental disorder, Suicidal ideation} | {Mood disorders} | 415.3604 | 2.50E-92 | 0.7717 | 0.0158
12 | {Developmental delays and disorders, Mood disorders} | {Mental retardation} | 413.8745 | 5.26E-92 | 0.5374 | 0.0185
20 | {Anxiety disorder, Mental retardation, Mood disorders} | {Conduct disorders} | 358.6943 | 5.42E-80 | 0.7770 | 0.0185
21 | {Anxiety disorder, Developmental delays and disorders, Mood disorders} | {Conduct disorders} | 356.8445 | 1.37E-79 | 0.7971 | 0.0177
29 | {Conduct disorders, Mood disorders} | {Bipolar} | 316.2073 | 9.71E-71 | 0.2426 | 0.0159

Figure 4.

Networking diagram of the top 30 association rules for the ASD patient population.

Figure 5.

Networking diagram of the top 30 association rules for the male and female ASD groups.

Sequential Patterns

The top SPM rules were analyzed to examine the order in which mental health conditions were diagnosed over time, using the age at which each diagnosis first appeared among the ICD-9/10-CM diagnoses in the medical claims (several of the top 50 rules can be seen in Table 8). The SPM analysis identified ordered sequences of mental health disorders in the dataset. Combinations of diagnoses that commonly preceded a diagnosis of ADHD included conduct disorders, anxiety disorders, mood disorders, and suicidal ideation. For example, the top SPM rule, {Developmental delays and disorders, Speech and language disorder, Autism} => {ADHD}, indicated that this combination was followed by a diagnosis of ADHD with a confidence of 40.6%. Two other rules, {Conduct disorders, Autism} => {ADHD} and {Mood Disorders, Autism} => {ADHD}, were seen with a confidence of 29.3%. This is consistent with the literature, in which ADHD is a very common co-occurring diagnosis with ASD.9,26,39 Suicidal ideation appeared as the right-hand side several times throughout the top 100 SPM rules. As stated with the ARM results, future work is needed to stratify the risk and design predictive models for the risk of suicidal ideation/attempt and self-harm behaviors. These are major safety concerns, and limited studies have focused on the correlation between suicide and ASD.

Table 8.

Top Sequential Pattern Mining (SPM) rules for ASD patient population

Sequential Pattern Mining Rules
Rule | lhs | rhs | confidence | support
1 | Developmental delays and disorders, Speech and language disorder, Autism | ADHD | 0.40566038 | 0.01333333
2 | Autism | ADHD | 0.36491557 | 0.36186047
3 | Mental retardation, Autism | Conduct disorders | 0.36082474 | 0.01085271
4 | ADHD, Anxiety disorder, Autism | Conduct disorders | 0.33766234 | 0.01612403
5 | Speech and language disorder, Autism | ADHD | 0.32231405 | 0.02418605
6 | Developmental delays and disorders, Autism | ADHD | 0.3137931 | 0.02821705
7 | Mood disorders, Autism | Conduct disorders | 0.30726257 | 0.01705426
8 | Developmental delays and disorders, Autism | Speech and language disorder | 0.30344828 | 0.02728682
9 | ADHD, Autism | Anxiety disorder | 0.29606299 | 0.05829457
41 | Mood disorders, Autism | Suicidal ideation | 0.18994413 | 0.01054264

Discussion

Research has indicated that early intervention is key to improving outcomes for patients with ASD. Many research efforts have focused on the early diagnosis of ASD; however, not many studies have focused on studying the progression of the disorder. Studying the comorbidity profiles and characterizing the safety outcomes for patients with ASD is an important focus in the field. Data mining methods leveraging clinical data can be used to identify unknown patterns and characterize the risk factors and outcomes leading to these comorbidities. The progression of ASD is complex and risk stratification, or the development of tools to predict risk for divergent progressions of ASD, is needed to intervene as early as possible and improve outcomes for ASD patients. This study confirmed that mental health comorbidities in children with ASD have a high prevalence, generating concern for patient safety.10 Safety concerns include suicide attempt and aggressive behaviors that lead to harm to self or others. Implementation of clinical interventions using informatics techniques can help improve outcomes for patients with ASD at increased risk for these significant safety concerns.

Analyzing data within a large medical claims database such as HealthFacts RI provides valuable insight into patterns and similarities in the patient population. This database was created to be accessible to providers, state agencies, researchers, health systems, and insurers to help identify areas to improve healthcare and reduce costs. Rhode Island is an ideal area to obtain more holistic data due to the wide variety of demographic, social and economic factors within one entire state. The use of medical claims data in healthcare research has been shown to have advantages such as large amounts of data, diverse sample sizes, consistent variables, anonymity, and accessibility.40 Although challenges exist with medical claims data, the use of ICD-9/10-CM diagnosis codes allows us to gain knowledge about large patient cohorts. Previous literature has identified high rates of mental health disorders in patients that have ASD compared to the general pediatric population. The results from this study validated that for a majority of mental health conditions (PheWAS groupings), there was a higher proportion of diagnoses seen in the ASD population than the general pediatric population. ARM showed interesting combinations of comorbidities related to common outputs (rhs). Based on the observed patterns and temporal sequences, suicidal ideation, mood disorders, anxiety, and conduct disorders may need focused attention prospectively. Understanding more about groupings of ASD patients and their comorbidity burden can help bridge gaps in knowledge and make strides toward improved outcomes for patients with ASD. Rules with large chi-square values included combinations of disorders strongly related to suicidal ideation, which was the right-hand side in the majority of these rules. These rules were often missed by the confidence, support, and lift measures because of the low frequency of the condition in the dataset. This is notable because the unique combinations of diagnoses linked with suicidal ideation are a safety concern that is important to focus on in future work.

Medical claims datasets are a rich source of information for a large population. This makes them a valuable data source to study, but there are also limitations. Medical claims data only reflect billing data, and it is well known that ICD-9/10-CM codes have some disadvantages and do not represent the whole picture. For example, a large number of patients were excluded from the study because the only diagnosis on file was one for ASD. Mental health conditions are often hard to study because much information resides in clinical notes. Due to data incompleteness, future studies will look at other data sources including Electronic Health Record (EHR) data and data from current clinical studies for a cohort of ASD patients. These future studies can take into account factors other than diagnosis codes and use techniques such as natural language processing to better study patients with mental health comorbidities in the ASD population. This study was a preliminary analysis of mental health comorbidities seen in patients diagnosed with ASD. The results indicated that patients with ASD have a higher proportion of mental health diagnoses than the general pediatric population. ARM and SPM methods identified patterns of comorbidities that are commonly seen among ASD patients. Based on the patterns and temporal sequences seen, some disorders that need focused attention in future work include suicidal ideation, mood disorders, anxiety, and conduct disorders. Since it is now well-established that patients with ASD are at a higher risk of being diagnosed with another mental health disorder, other factors can be studied to stratify risk for being diagnosed with specific mental health disorders. Time series analysis and predictive modeling can help determine patients who may be at risk for some of these mental health disorders and help prevent safety concerns such as suicidal ideation and self-harm. Future work will include the use of clustering techniques to subcategorize patients based on their entire comorbidity profile. Additionally, behavioral and mental health clinicians will validate the patterns of mental health comorbidities identified in this study as part of formal evaluations.

Conclusion

This study aimed to use unsupervised machine learning methods to better characterize patients with ASD, and specifically to characterize patterns and similarities among patients with multiple mental health comorbidities. Using the HealthFacts RI database, structured ICD-9/10-CM codes were extracted to look at correlated conditions across the pediatric patient population. The results indicated that specific mental health disorders commonly group together in patients with ASD, in combinations that are distinct from those seen in the general pediatric population. We were able to identify common, clinically accurate combinations of mental health conditions seen in patients with ASD. The same held for SPM, which considered the order in which these diagnoses first appear by date in the dataset. Further analysis will study interactions among conditions that have not been studied extensively to date, along with other medical (non-mental-health) comorbidities in this patient population. These results are consistent with the current literature and help to inform future work on the design and development of models and algorithms to predict trajectories and outcomes for patients with ASD.
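
To make the notion of ordering diagnoses by first appearance concrete, the following is a minimal Python sketch. It is not the SPM implementation used in this study; it only illustrates how one ordered sequence per patient might be derived from dated claims and how the support of an ordered diagnosis pair could then be counted. The claim rows, diagnosis labels, and function names (`first_diagnosis_sequences`, `ordered_pair_support`) are hypothetical and chosen for illustration.

```python
from collections import Counter
from datetime import date
from itertools import combinations

# Hypothetical claim rows: (patient_id, diagnosis_label, service_date).
# These are made-up values, not data from the study.
claims = [
    ("p1", "ADHD", date(2015, 3, 1)),
    ("p1", "anxiety", date(2016, 7, 9)),
    ("p1", "ADHD", date(2017, 1, 5)),  # repeat claim, ignored for ordering
    ("p2", "ADHD", date(2014, 2, 2)),
    ("p2", "anxiety", date(2014, 9, 30)),
    ("p2", "mood disorder", date(2018, 4, 12)),
    ("p3", "anxiety", date(2016, 5, 20)),
    ("p3", "ADHD", date(2017, 8, 15)),
]


def first_diagnosis_sequences(rows):
    """Return {patient: [diagnoses ordered by the date of their first claim]}."""
    first_seen = {}
    for patient, dx, day in rows:
        key = (patient, dx)
        # Keep only the earliest date on which each diagnosis appears.
        if key not in first_seen or day < first_seen[key]:
            first_seen[key] = day
    per_patient = {}
    for (patient, dx), day in first_seen.items():
        per_patient.setdefault(patient, []).append((day, dx))
    # Sorting (date, label) tuples breaks same-day ties alphabetically,
    # which is an arbitrary choice made for this sketch.
    return {p: [dx for _, dx in sorted(items)] for p, items in per_patient.items()}


def ordered_pair_support(sequences):
    """Fraction of patients in whom diagnosis a first appears before diagnosis b."""
    counts = Counter()
    for seq in sequences.values():
        # combinations() preserves sequence order, so (a, b) means a came first.
        for a, b in combinations(seq, 2):
            counts[(a, b)] += 1
    n = len(sequences)
    return {pair: c / n for pair, c in counts.items()}


sequences = first_diagnosis_sequences(claims)
support = ordered_pair_support(sequences)
```

A full sequential pattern mining algorithm would extend this idea to longer sequences and apply minimum-support pruning; the sketch stops at ordered pairs to keep the first-appearance logic visible.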

Acknowledgements

This work was supported in part by National Institutes of Health grants U54GM115677 and R25MH116440, as well as the Hassenfeld Child Health Innovation Institute. Access to and use of HealthFacts RI were with the permission of the Rhode Island Department of Health. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health, the Hassenfeld Child Health Innovation Institute, or the Rhode Island Department of Health.

References

  • 1. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders (DSM-5®). American Psychiatric Pub; 2013.
  • 2. Baio J, Wiggins L, Christensen DL, et al. Prevalence of Autism Spectrum Disorder Among Children Aged 8 Years - Autism and Developmental Disabilities Monitoring Network, 11 Sites, United States, 2014. MMWR Surveill Summ. 2018;67(6):1–23. doi: 10.15585/mmwr.ss6706a1.
  • 3. Tye C, Runicles AK, Whitehouse AJO, Alvares GA. Characterizing the Interplay Between Autism Spectrum Disorder and Comorbid Medical Conditions: An Integrative Review. Front Psychiatry. 2018;9:751. doi: 10.3389/fpsyt.2018.00751.
  • 4. Doshi-Velez F, Ge Y, Kohane I. Comorbidity clusters in autism spectrum disorders: an electronic health record time-series analysis. Pediatrics. 2014;133(1):e54–63. doi: 10.1542/peds.2013-0819.
  • 5. Kohane IS, McMurry A, Weber G, et al. The co-morbidity burden of children and young adults with autism spectrum disorders. PLoS One. 2012;7(4):e33224. doi: 10.1371/journal.pone.0033224.
  • 6. Aldinger KA, Lane CJ, Veenstra-VanderWeele J, Levitt P. Patterns of Risk for Multiple Co-Occurring Medical Conditions Replicate Across Distinct Cohorts of Children with Autism Spectrum Disorder. Autism Res. 2015;8(6):771–81. doi: 10.1002/aur.1492.
  • 7. Soke GN, Maenner MJ, Christensen D, Kurzius-Spencer M, Schieve LA. Prevalence of Co-Occurring Medical and Behavioral Conditions/Symptoms Among 4- and 8-Year-Old Children with Autism Spectrum Disorder in Selected Areas of the United States in 2010. J Autism Dev Disord. 2018;48(8):2663–76. doi: 10.1007/s10803-018-3521-1.
  • 8. Memari A, Ziaee V, Mirfazeli F, Kordi R. Investigation of autism comorbidities and associations in a school-based community sample. J Child Adolesc Psychiatr Nurs. 2012;25(2):84–90. doi: 10.1111/j.1744-6171.2012.00325.x.
  • 9. Muskens JB, Velders FP, Staal WG. Medical comorbidities in children and adolescents with autism spectrum disorders and attention deficit hyperactivity disorders: a systematic review. Eur Child Adolesc Psychiatry. 2017;26(9):1093–103. doi: 10.1007/s00787-017-1020-0.
  • 10. Rosen TE, Mazefsky CA, Vasa RA, Lerner MD. Co-Occurring psychiatric conditions in autism spectrum disorder. Int Rev Psychiatry. 2018;30(1):40–61. doi: 10.1080/09540261.2018.1450229.
  • 11. McCormick CEB, Kavanaugh BC, Sipsock D, et al. Autism Heterogeneity in a Densely Sampled U.S. Population: Results From the First 1,000 Participants in the RI-CART Study. Autism Res. 2020;13(3):474–88. doi: 10.1002/aur.2261.
  • 12. Lingren T, Chen P, Bochenek J, et al. Electronic Health Record Based Algorithm to Identify Patients with Autism Spectrum Disorder. PLoS One. 2016;11(7):e0159621. doi: 10.1371/journal.pone.0159621.
  • 13. Ou-Yang C, Wulandari CP, Hariadi RAR, Wang H-C, Chen C. Applying sequential pattern mining to investigate cerebrovascular health outpatients’ re-visit patterns. PeerJ. 2018;6:e5183. doi: 10.7717/peerj.5183.
  • 14. All-Payer Claims Databases [Internet]. [cited 2020 Mar 24]. Available from: https://www.ahrq.gov/data/apcd/index.html
  • 15. Porter J, Love D, Costello A, Peters A, Rudolph B. All-Payer Claims Database Development Manual: Establishing a Foundation for Health Care Transparency and Informed Decision Making. 2015. [cited 2020 Mar 24]. Available from: https://scholars.unh.edu/ihpp/6
  • 16. Love D, Custer W, Miller P. All-Payer claims databases: state initiatives to improve health care transparency. Issue Brief. 2010;99:1–14.
  • 17. Solomon JG, Monteiro KA, Zonfrillo MR. Prevalence of Tobacco Use and Overweight/Obesity in Rhode Island: Comparisons of Survey and Claims Data. R I Med J. 2019;102(2):19–23.
  • 18. Figueroa JF, Frakt AB, Lyon ZM, Zhou X, Jha AK. Characteristics and spending patterns of high cost, non-elderly adults in Massachusetts. Healthc (Amst). 2017;5(4):165–70. doi: 10.1016/j.hjdsi.2017.05.001.
  • 19. HealthFacts RI Program: Department of Health [Internet]. [cited 2019 Apr 17]. Available from: http://www.health.ri.gov/programs/detail.php?pgm_id=117
  • 20. PheWAS - Phenome Wide Association Studies [Internet]. [cited 2019 May 11]. Available from: https://phewascatalog.org/phecodes
  • 21. CDC. Child Development Positive Parenting Tips | CDC [Internet]. Centers for Disease Control and Prevention. 2019. [cited 2019 Sep 19]. Available from: https://www.cdc.gov/ncbddd/childdevelopment/positiveparenting/index.html
  • 22. Agrawal R, Imieliński T, Swami A. Mining Association Rules Between Sets of Items in Large Databases. SIGMOD Rec. 1993;22(2):207–16.
  • 23. Wang C-H, Lee T-Y, Hui K-C, Chung M-H. Mental disorders and medical comorbidities: Association rule mining approach. Perspect Psychiatr Care. 2019;55(3):517–26. doi: 10.1111/ppc.12362.
  • 24. Held FP, Blyth F, Gnjidic D, et al. Association Rules Analysis of Comorbidity and Multimorbidity: The Concord Health and Aging in Men Project. J Gerontol A Biol Sci Med Sci. 2016;71(5):625–31. doi: 10.1093/gerona/glv181.
  • 25. Chen Y, Xu R. Mining cancer-specific disease comorbidities from a large observational health database. Cancer Inform. 2014;13(Suppl 1):37–44. doi: 10.4137/CIN.S13893.
  • 26. Tai Y-M, Chiu H-W. Comorbidity study of ADHD: Applying association rule mining (ARM) to National Health Insurance Database of Taiwan. Int J Med Inform. 2009;78(12):e75–83. doi: 10.1016/j.ijmedinf.2009.09.005.
  • 27. Lakshmi KS, Vadivu G. A novel approach for disease comorbidity prediction using weighted association rule mining. J Ambient Intell Humaniz Comput [Internet]. 2019. Available from: https://doi.org/10.1007/s12652-019-01217-1
  • 28. Alvarez SA. Chi-Squared computation for association rules: preliminary results [Internet]. Boston, MA: Boston College; 2003. Available from: http://www.cs.bc.edu/~alvarez/ChiSquare/chi2tr.pdf
  • 29. Hahsler M. arulesViz: Interactive Visualization of Association Rules with R. R J [Internet]. 2017;9(2). Available from: https://journal.r-project.org/archive/2017/RJ-2017-047/RJ-2017-047.pdf
  • 30. Wright A, Chen ES, Maloney FL. An automated technique for identifying associations between medications, laboratory results and problems. J Biomed Inform. 2010;43(6):891–901. doi: 10.1016/j.jbi.2010.09.009.
  • 31. Agrawal R, Srikant R. Mining sequential patterns. Proceedings of the Eleventh International Conference on Data Engineering. 1995. pp. 3–14.
  • 32. Wright AP, Wright AT, McCoy AB, Sittig DF. The use of sequential pattern mining to predict next prescribed medications. J Biomed Inform. 2015;53:73–80. doi: 10.1016/j.jbi.2014.09.003.
  • 33. Reps J, Garibaldi JM, Aickelin U, Soria D, Gibson JE, Hubbard RB. Discovering sequential patterns in a UK general practice database. Proceedings of 2012 IEEE-EMBS International Conference on Biomedical and Health Informatics. 2012. pp. 960–3.
  • 34. Batal I, Valizadegan H, Cooper GF, Hauskrecht M. A Pattern Mining Approach for Classifying Multivariate Temporal Data. Proceedings. 2011;2011:358–65. doi: 10.1109/BIBM.2011.39.
  • 35. Norén GN, Bate A, Hopstadius J, Star K, Edwards IR. Temporal pattern discovery for trends and transient effects: its application to patient records. Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD 08; New York, New York, USA. ACM Press; 2008. p. 963.
  • 36. Jin HW, Chen J, He H, Williams GJ, Kelman C, O’Keefe CM. Mining unexpected temporal associations: applications in detecting adverse drug reactions. IEEE Trans Inf Technol Biomed. 2008;12(4):488–500. doi: 10.1109/TITB.2007.900808.
  • 37. Boytcheva S, Angelova G, Angelov Z, Tcharaktchiev D. Mining comorbidity patterns using retrospective analysis of big collection of outpatient records. Health Inf Sci Syst. 2017;5(1):3. doi: 10.1007/s13755-017-0024-y.
  • 38. Wang Y, Hou W, Wang F. Mining co-occurrence and sequence patterns from cancer diagnoses in New York State. PLoS One. 2018;13(4):e0194407. doi: 10.1371/journal.pone.0194407.
  • 39. Gordon-Lipkin E, Marvin AR, Law JK, Lipkin PH. Anxiety and mood disorder in children with autism spectrum disorder and ADHD. Pediatrics [Internet]. 2018;141(4). doi: 10.1542/peds.2017-1377. Available from: http://www.embase.com/search/results?subaction=viewrecord&from=export&id=L621531384
  • 40. Stein JD, Lum F, Lee PP, Rich WL 3rd, Coleman AL. Use of health care claims data to study patients with ophthalmologic conditions. Ophthalmology. 2014;121(5):1134–41. doi: 10.1016/j.ophtha.2013.11.038.

Articles from AMIA Annual Symposium Proceedings are provided here courtesy of American Medical Informatics Association
