Abstract
Individuals with autism spectrum disorder (ASD) have heterogeneous comorbid conditions. This study examined whether comorbid conditions in ASD are associated with polygenic risk scores (PRS) of ASD or PRS of comorbid conditions in non-ASD specific populations. Genome-wide single nucleotide polymorphism (SNP) data were obtained from 1386 patients with ASD from the Autism Genetic Resource Exchange (AGRE) study. After excluding individuals with missing clinical information concerning comorbid conditions, a total of 707 patients were included in the study. A total of 18 subgroups of comorbid conditions (‘topics’) were identified using a machine learning algorithm, topic modeling. PRS for ASD were computed using a genome-wide association meta-analysis of 18,381 cases and 27,969 controls. From these 18 topics, Topic 6 (over-represented by allergies) (p = 1.72 × 10−3) and Topic 17 (over-represented by sensory processing issues such as low pain tolerance) (p = 0.037) were associated with PRS of ASD. The associations between these two topics and the multi-locus contributors to their corresponding comorbid conditions based on non-ASD specific populations were further explored. The results suggest that these two topics were not associated with the PRS of allergies and chronic pain disorder, respectively. Note that characteristics of the present AGRE sample and those samples used in the original GWAS for ASD, allergies, and chronic pain disorder, may differ due to significant clinical heterogeneity that exists in the ASD population. Additionally, the AGRE sample may be underpowered and therefore insensitive to weak PRS associations due to a relatively small sample size. Findings imply that susceptibility genes of ASD may contribute more to the occurrence of allergies and sensory processing issues in individuals with ASD, compared with the susceptibility genes for their corresponding phenotypes in non-ASD individuals. 
Since these comorbid conditions (i.e., allergies and pain sensory issues) may not be attributable to the corresponding comorbidity-specific biological factors in non-ASD individuals, clinical management for these comorbid conditions may still depend on treatments for core symptoms of ASD.
Subject terms: Genetics, Medical research, Risk factors, Signs and symptoms
Introduction
Psychiatric and medical comorbidities are a norm rather than an exception in autism spectrum disorder (ASD), a complex neurodevelopmental disorder characterized by social communication deficits and restricted/repetitive behaviors1. The importance of understanding medical comorbidities of ASD cannot be overstated2. Appropriate management of comorbid medical conditions may lead to quality of life improvement for both children and their parents3. In this regard, understanding of shared etiologies (including potential genetic factors4,5) for ASD and comorbid conditions may be critical in management decisions. While research into the medical comorbidities of ASD has been ongoing2, research into the genetic bases of ASD’s psychiatric comorbidities is only now getting underway6–8. Understanding how genetic factors contribute to the comorbidities may provide novel insight into molecular mechanisms underlying heterogeneous clinical features of individuals with ASD. Such genetic components could be used to subgroup patients with ASD to generate clinical subtypes that reflect biological differences, which might provide opportunities for individualized treatment options9.
Prior studies that have investigated genetic contributions for comorbidities in ASD have implemented several different approaches. For instance, David et al. conducted an automated extraction of genes associated with ASD and its comorbid disorders, finding 1031 genes associated with ASD; 262 of these genes were involved in ASD only, while the remaining 779 genes were also associated with other comorbid disorders10. Their study results suggest that the majority of candidate genes for ASD have pleiotropic effects. Diaz-Beltran et al. used a two-fold systems biology approach to perform a comparative analysis of ASD with 31 frequently encountered comorbid disorders and determined a multi-comorbidity subtype of ASD, which led to the discovery of novel candidate genes of ASD11. Tylee et al. used data from previous genome-wide association studies (GWAS) to determine whether commonly varying single nucleotide polymorphisms (SNPs) are shared between psychiatric and immune-related phenotypes, and found that ASD was more likely than the other major psychiatric disorders to correlate with allergy12. There is also evidence to suggest that ASD is often accompanied by altered perception of pain, with ASD patients having a higher threshold of pain13,14. In this regard, Johnston et al. conducted a linkage disequilibrium score regression using GWAS associations for ASD and chronic pain to show that the two traits are genetically correlated15. Finally, a recent study used polygenic risk scores (PRS) derived from five psychiatric disorders, namely schizophrenia, major depressive disorder, attention deficit hyperactivity disorder, obsessive–compulsive disorders, and anxiety, and found that the polygenic contributions could distinguish Asperger syndrome (a diagnostic category in the Diagnostic and Statistical Manual (DSM) 4th Edition, although this term is no longer used in the updated DSM-5) from other non-Asperger subtypes of ASD16,17.
Despite these advances in understanding the relationships between the genetic bases for ASD and associated comorbidities, it is clear that these approaches require further development to fully understand the role of comorbid genetic risk within ASD. In the present study, we used topic modeling, a spatial clustering process tolerant of feature sparsity (e.g., diagnostic features in low prevalence medical conditions), to identify clusters of comorbid conditions. Similar methods were utilized in an earlier study by McCoy and colleagues to investigate which types of comorbid conditions are attributable to polygenic loading of major depressive disorder (MDD)18. The authors observed that using standard phenome-wide association studies (PheWAS) to test across all possible predictors, such as individual risk variants or genome-wide variants, increases Type I error because the process involves testing against all diagnostic codes (e.g., 1500+ codes), or Type II error because the approach needs to correct for all these codes. The PheWAS approach also does not take into consideration the correlation between individual diagnosis codes or the inconsistency and lack of reliability of the codes per se. By first using topic modeling on a corpus of ICD-9 diagnostic codes, McCoy et al. were able to reduce the dimensionality of possible MDD comorbidity and thereby also mitigate the risk of imprecision when examining associations with genetic indicators. However, it is important to note several limitations to their approach, including that it cannot distinguish whether depression and associated comorbidities were caused by shared versus unique but convergent genetic factors. Given that the number of topics in that study was chosen arbitrarily, it is likely that an alternative number of topics would yield different results in terms of cluster composition. Moreover, it is possible to optimize the selection of the number of topics to cluster upon using a range of semantic and statistical indices developed by researchers in the applied topic modeling literature.
The present study extends the topic modeling approach applied by McCoy and colleagues by (1) implementing a novel method for constructing document data from epidemiological data; (2) determining the number of topics through a semi-supervised process that optimizes the trade-off between topic sensitivity and term specificity; and (3) jointly investigating associations between topics of ASD and three sets of PRSs, namely those of ASD and of its clinically comorbid conditions of interest, chronic pain and allergies.
Methods
All methods were carried out in accordance with relevant guidelines and regulations.
Sample description
The discovery sample of ASD comprised the whole-genome genotypic data retrieved from the Autism Genetic Resource Exchange (AGRE), of which the subject recruitment has been described elsewhere19. Briefly, AGRE is a joint effort of the Cure Autism Now (CAN) Foundation and the Human Biological Data Interchange (HBDI). The diagnosis was made by all of the NIH autism collaborative groups using the Autism Diagnostic Interview-Revised (ADI-R)20 and the Autism Diagnostic Observational Schedule (ADOS)21. We downloaded the clinical and SNP data (generated by the Affymetrix SNP 5.0 platform). We implemented the same data-cleaning algorithm used in the discovery sample. A total of 325,971 valid SNPs were obtained for 1387 subjects diagnosed with ASD, 97.3% of whom had European ancestry22. The final dataset, consisting of 707 subjects with comorbid physical and psychiatric features as well as perinatal factors, was used to examine the patterns of comorbid conditions in the present study.
Data preparation
Data preparation was performed using the R software (v3.6.1)23 within the RStudio integrated development environment24. Full reproducible code and session information are available upon request.
Pseudo-EMRs
Documents were constructed as electronic medical record (EMR) analogues on the basis of a multistep process. Each psychiatric feature (e.g., symptoms, diagnoses, historical presentations) was considered a candidate term for inclusion in the ‘pseudo-EMR’ for each subject. Categorical data were collapsed into the presence versus absence of features. Continuous data were then dichotomized using k-means clustering (i.e., k = 2)25,26. Some continuous features required special consideration in relation to dichotomization, e.g., features relating to developmental delays, for which delay thresholds were drawn from clinical literature. Features were excluded where counts per feature were less than or equal to 1, as these were considered prohibitively sparse. Following dichotomization, psychiatric features were transcoded into labels indicating the presence of features by subject. Hyphenation was used to coerce features with complex clinical descriptions into single “terms” (e.g., ‘floppy infant’ became ‘infant-floppy’). This was done to ensure that subsequent topic modeling did not tokenize features in ways that would disrupt correspondence with the observed data.
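The pseudo-EMR construction described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names, the Lloyd-style two-cluster split, and the toy records are all assumptions made for illustration. The sketch also hyphenates words in their original order, whereas the paper's example reorders them (‘floppy infant’ became ‘infant-floppy’).

```python
from statistics import mean


def two_means_split(values, iters=100):
    """Dichotomize 1-D values with a Lloyd-style k-means (k = 2).

    Returns one boolean per value: True if it falls in the higher cluster.
    """
    lo, hi = min(values), max(values)
    if lo == hi:  # no variation, nothing to split
        return [False] * len(values)
    for _ in range(iters):
        cut = (lo + hi) / 2.0  # nearest-centroid boundary in one dimension
        new_lo = mean(v for v in values if v <= cut)
        new_hi = mean(v for v in values if v > cut)
        if (new_lo, new_hi) == (lo, hi):  # centroids stopped moving
            break
        lo, hi = new_lo, new_hi
    cut = (lo + hi) / 2.0
    return [v > cut for v in values]


def to_term(label):
    """Hyphenate a multi-word label so a tokenizer keeps it as a single term."""
    return "-".join(label.lower().split())


def build_pseudo_emrs(records, min_count=2):
    """records: {subject_id: {feature_label: present?}} -> {subject_id: document}.

    Features present in fewer than min_count subjects are dropped, mirroring
    the exclusion of features with counts less than or equal to 1.
    """
    counts = {}
    for feats in records.values():
        for label, present in feats.items():
            if present:
                counts[label] = counts.get(label, 0) + 1
    keep = {label for label, c in counts.items() if c >= min_count}
    return {
        sid: " ".join(to_term(l) for l, p in feats.items() if p and l in keep)
        for sid, feats in records.items()
    }


# Hypothetical toy data, for illustration only.
toy_records = {
    "s1": {"allergies": True, "floppy infant": True},
    "s2": {"allergies": True, "floppy infant": False},
    "s3": {"allergies": False, "floppy infant": True, "rare feature": True},
}
toy_docs = build_pseudo_emrs(toy_records)
```

Each resulting document is a whitespace-separated string of terms, so a standard tokenizer that splits on whitespace will recover exactly one token per retained feature.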
Topic modelling
The optimal number of topics to model was explored prior to topic modeling27. Briefly, topic modeling, such as Latent Dirichlet Allocation (LDA), is used to identify ‘topics’ (i.e., clusters of comorbid conditions in the current study) that occur in a collection of documents28. Accordingly, a parallel process was run over a range of candidate models in order to evaluate the relationship between the number of topics and model diagnostics, following the recommendations of Chang et al. (Supplementary Fig. 1)29. The outcome of this procedure was the selection of 18 as the optimal number of topics to model given the observed data. LDA was then applied using Gibbs sampling with the following settings: 2000 iterations were performed with 500 iterations for thinning and a burn-in of 1000 iterations. Symmetric Dirichlet priors (α = 0.1, β = 0.01) were applied to ensure topics would be well-separated30. A threshold of 16 terms (i.e., comorbid conditions) per topic was chosen following inspection of the resulting topic models (for term-to-topic probabilities, see Fig. 1). Topic modeling was conducted using the ‘topicmodels’ package (v0.12)31.
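To make the sampling procedure more concrete, the following is a minimal pure-Python sketch of a collapsed Gibbs sampler for LDA with symmetric Dirichlet priors. It is an illustrative stand-in, not the ‘topicmodels’ implementation used in the study: the function name `lda_gibbs` and the toy documents are hypothetical, thinning is omitted, and only the final sampler state is summarized. The defaults for α and β follow the values reported above.

```python
import random


def lda_gibbs(docs, n_topics, alpha=0.1, beta=0.01, burn_in=1000, n_iter=2000, seed=0):
    """Collapsed Gibbs sampling for LDA.

    docs: list of token lists. Returns (theta, phi) computed from the final
    state: theta[d][k] is the topic share of document d, phi[k][w] is the
    probability of term w under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_vocab = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]      # tokens in doc d assigned to topic k
    topic_term = [dict() for _ in range(n_topics)]  # count of term w in topic k
    topic_total = [0] * n_topics                    # tokens assigned to topic k
    assign = []
    for d, doc in enumerate(docs):                  # random initial assignment
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_term[k][w] = topic_term[k].get(w, 0) + 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(burn_in + n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]                    # remove the token's current assignment
                doc_topic[d][k] -= 1
                topic_term[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_term[t].get(w, 0) + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k                    # record the resampled topic
                doc_topic[d][k] += 1
                topic_term[k][w] = topic_term[k].get(w, 0) + 1
                topic_total[k] += 1
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha) for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        {w: (topic_term[t].get(w, 0) + beta) / (topic_total[t] + n_vocab * beta) for w in vocab}
        for t in range(n_topics)
    ]
    return theta, phi


# Hypothetical toy documents; short chains keep the example fast.
toy_docs = [
    ["allergies", "gluten-free", "casein-free"],
    ["low-pain", "low-tactile", "low-acoustic"],
    ["allergies", "casein-free", "low-pain"],
]
theta, phi = lda_gibbs(toy_docs, n_topics=2, burn_in=20, n_iter=30)
```

Sorting each `phi[k]` by probability and keeping the leading entries gives the kind of per-topic term list that the 16-term threshold above refers to.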
Genetic association analyses
Genotype and imputation
All subjects were genotyped using the Affymetrix GeneChip Human Mapping 500 K Array. We retained subjects with genotyping call rates exceeding 90% and single nucleotide polymorphisms (SNPs) with a call rate of 90% or greater, and Hardy–Weinberg equilibrium p-value > 1 × 10−6. We remapped the raw genotype data from the GRCh35 to GRCh37 and conducted quality control by removing SNPs (1) with ambiguous alleles, (2) with > 0.2 allele frequency difference from the reference panel, and (3) not available in the reference panel using the Pre-imputations checks toolbox (https://www.well.ox.ac.uk/~wrayner/tools/) and the 1000 Genome European reference panel. Genotypes were next imputed using the Michigan Imputation Server implementing Minimac432, based on the European subset from the 1000 Genomes Phase 3 v5 (GRCh37/hg19) as reference panel with an imputation filter of R2 > 0.3. Phasing of haplotypes was conducted using ‘Eagle’ v2.433.
Polygenic risk (PRS) calculation
We generated polygenic risk scores (PRS) for ASD (n = 18,381 cases and 27,969 controls)6, allergic disease (n = 180,128 cases and 180,709 controls)34 and chronic pain (n = 387,649)15, using seven tranches of SNPs (1 × 10−2, 1 × 10−3, 1 × 10−4, 1 × 10−5, 1 × 10−6, 1 × 10−7, 5 × 10−8, labelled as S2–S8) drawn from recent GWAS. The value for each p-value tranche represents the maximum p-value that is included in that tranche. This list was linkage-disequilibrium pruned using the ‘--clump’ procedure as implemented in ‘PLINK’ v1.9, with a 250 kb window and minimum R2 set at 0.5 by default35.
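As a hedged illustration of how tranche-based scores of this kind are commonly computed, the sketch below sums effect-size-weighted allele dosages over the SNPs whose GWAS p-value does not exceed each tranche's maximum. The weighted-sum formula, the function name `polygenic_scores`, the mapping of labels S2 to S8 onto the listed thresholds in order, and the toy SNP identifiers are assumptions; the paper does not spell out its scoring formula. Clumping is assumed to have been done beforehand.

```python
# Assumed mapping of tranche labels to maximum p-values, in the order listed in the text.
TRANCHES = {
    "S2": 1e-2, "S3": 1e-3, "S4": 1e-4, "S5": 1e-5,
    "S6": 1e-6, "S7": 1e-7, "S8": 5e-8,
}


def polygenic_scores(gwas, dosages):
    """Compute one score per tranche for a single individual.

    gwas:    {snp_id: (effect_size, p_value)} from the reference GWAS, after clumping
    dosages: {snp_id: effect-allele dosage (0 to 2)} for the individual
    """
    scores = {}
    for label, p_max in TRANCHES.items():
        scores[label] = sum(
            beta * dosages[snp]
            for snp, (beta, p) in gwas.items()
            if p <= p_max and snp in dosages
        )
    return scores


# Hypothetical summary statistics and genotype dosages, for illustration only.
toy_gwas = {"snp_a": (0.2, 5e-9), "snp_b": (-0.1, 5e-4), "snp_c": (0.05, 5e-3)}
toy_dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 1.5}
toy_scores = polygenic_scores(toy_gwas, toy_dosages)
```

Because every tranche includes all SNPs at or below its threshold, the SNP sets are nested: anything counted in S8 is also counted in S2 through S7.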
Statistical analyses
The relationship between PRS tranches and topics across ASD, allergic disease and chronic pain was tested using a linear regression model. All model topics were inverse normal transformed prior to analysis because they were not normally distributed. Covariates included age, sex, and the first five genetic principal components. We conducted analysis of variance (ANOVA) tests to compare the fits of models. Linear regression and ANOVA tests were conducted using the R language (v3.6.1)23 within the RStudio IDE24. Further, we estimated the proportion of the variance of the dependent variable (i.e., topics of comorbid conditions) that could be explained by each predictor (i.e., PRS specific to ASD, PRS specific to the corresponding topic, age, and sex) using partial and semi-partial correlation coefficients of a specified predictor; these were used to compare the relative contributions of PRS specific to different phenotypes to the topics. Finally, we examined genetic correlations between ASD and comorbid conditions associated with PRS specific to ASD based on the lists of candidate genes extracted using the web tool Genepanel.iobio36. The correlation was inferred based on the probability of detecting significant phenotype–phenotype associations by random chance, calculated using the hypergeometric distribution37.
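The text does not specify which inverse normal transformation was applied to the topic values. A common choice is the rank-based version with the Blom offset, sketched below in Python under that assumption; the function name and the example values are hypothetical.

```python
from statistics import NormalDist


def inverse_normal_transform(values, c=0.375):
    """Rank-based inverse normal transform (Blom offset when c = 0.375).

    Ties receive their average rank. Each rank r is mapped to the standard
    normal quantile at (r - c) / (n - 2c + 1).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend the run of tied values
        avg_rank = (i + j) / 2.0 + 1.0  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


# Hypothetical skewed topic values for five subjects.
toy_topic = [0.1, 0.5, 0.2, 0.9, 0.3]
toy_z = inverse_normal_transform(toy_topic)
```

The transform preserves the ordering of subjects while forcing the marginal distribution toward normality, which is why it is often applied before linear regression on skewed outcomes.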
Results
We found a strong association between PRSASD and Topic 6, the main allergy factor (the first five terms: past-diet, allergies, gluten-free, abnormal-gastro, casein-free) (Fig. 2). The most strongly associated PRSASD tranche, S5 (p = 1.72 × 10−3), accounted for 1.41% of the variance in Topic 6. We also found an association between PRSASD and Topic 17 (leading terms: handed-right, low-acoustic, low-pain, low-tactile, abnorm-skin, diet-preferences; minimum p = 0.037 for S4), with PRSASD accounting for 0.80% of the variance in Topic 17. Topic 14 was over-represented by maternal substance use and hence was excluded from the subsequent analyses (see Supplementary Table 1 for full results).
We further assessed the association between Topic 6 and PRSallergy and found no evidence for an association (minimum p = 0.083 for PRSallergy S2). When fitting both PRSASD and PRSallergy into the model, the strengths of the association for the two PRSs slightly increased (PRSASD S5: p = 1.48 × 10−3; PRSallergy S2: p = 0.070), but the overall model fit did not improve compared against the PRSASD only model (pANOVA = 0.070). Similarly, no evidence was found for an association between Topic 17 and PRSpain (minimum p = 0.052 for S5). However, combining PRSASD and PRSpain improved the prediction of Topic 17 (pANOVA = 0.031; the two PRSs together account for 1.8% of the variance) and boosted the strength of associations for each of the two PRSs (PRSASD S4: p = 0.022; PRSpain S5: p = 0.031). Overall, PRSASD contributed to a greater degree to the variance in Topic 6 than PRSallergy, and PRSASD contributed to a greater degree to the variance in Topic 17 than PRSpain (see Fig. 3).
We have identified 681 candidate genes for ASD, 374 candidate genes for allergies, and 346 candidate genes for pain-related disorder (e.g., abnormality in pain sensory perception or impaired pain sensory perception) (see Fig. 4). There were 59 overlapping genes between the lists of candidate genes for ASD and allergies, and 135 overlapping genes between the lists of candidate genes for ASD and pain-related disorder. The p-value for the correlation between ASD and allergies was 1.19 × 10−27, while the p-value for the correlation between ASD and pain-related disorder was 1.11 × 10−117.
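A gene-list overlap test of this kind can be sketched with the hypergeometric upper tail, as below. This is only an illustration: the size of the background gene universe is not reported in the text, so the value of 20,000 used here is an assumption, and the resulting probability should not be expected to reproduce the p-values reported above. The function name is hypothetical.

```python
from math import comb


def hypergeom_tail(universe, set_a, set_b, overlap):
    """P(X >= overlap) for a hypergeometric overlap between two gene lists.

    universe: number of genes in the background (assumed, not from the paper)
    set_a:    size of the first gene list
    set_b:    size of the second gene list
    overlap:  observed number of shared genes
    """
    upper = min(set_a, set_b)
    favourable = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    # Exact integer arithmetic above; a single division at the end limits rounding error.
    return favourable / comb(universe, set_b)


ASSUMED_UNIVERSE = 20_000  # assumption: approximate number of human protein-coding genes

# Gene-list sizes and overlaps as reported in the text.
p_asd_allergy = hypergeom_tail(ASSUMED_UNIVERSE, 681, 374, 59)
p_asd_pain = hypergeom_tail(ASSUMED_UNIVERSE, 681, 346, 135)
```

The result is sensitive to the assumed universe size: a smaller background raises the overlap expected by chance and therefore increases the tail probability.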
Discussion
The current study shows that polygenic loading in ASD may play a larger role in certain subgroups of comorbid conditions in ASD, such as allergy-related conditions and sensory processing issues (e.g., low pain tolerance), than in other subgroups of comorbid conditions. We further found that these two subgroups (i.e., topics) of comorbid conditions could not be attributed to PRS of either allergies or chronic pain disorder. These findings suggest that ASD-associated genetic variants could contribute to ASD-related comorbidities including allergies and sensory processing issues such as low pain tolerance. In other words, these two types of comorbidities may share a proportion of risk variants with ASD. Notably, we found that neither of these two subgroups (i.e., topics) of comorbid conditions in individuals with ASD was correlated with its corresponding PRS constructed using GWAS of non-ASD specific individuals.
Nevertheless, although both allergies and pain sensory issues may share susceptibility genes with ASD38,39, our findings suggest that ASD may be more likely to correlate with abnormality in pain tolerance than with allergies at the gene level. This is consistent with our findings that PRSpain might contribute, at least to a slightly higher degree, to the topic enriched with sensory processing issues than PRSallergy did to the allergy-related topic, and that combining PRSASD and PRSpain strengthened the association with the corresponding comorbid conditions. These findings, to our knowledge, have not previously been reported.
The current study has several limitations. First, characteristics of the present AGRE sample and those samples used in the original GWAS for ASD, allergies, and chronic pain disorder may differ. For example, there may be significant clinical heterogeneity in the ASD population. This is further compounded by the significant changes in diagnostic classification that have occurred in the past decade, which would have potentially changed the ascertainment and phenotypic characterization of ASD. For example, in 2013 the Diagnostic and Statistical Manual 5th edition collapsed the different subcategories, such as Asperger syndrome and disintegrative disorder, into a single category of ASD, and further changes were made to the diagnostic criteria, such as the inclusion of sensory issues16, all of which would have impacted the ascertainment of the sample and the phenotypic characterization. Second, the clinical data, which relied on self-report information, might not provide the most accurate or robust information about comorbid conditions due to false clinical assumptions. This may lead to biased results that make it difficult to replicate the findings in independent populations, and could partially contribute to the failure to observe allergy-related and pain-related polygenic loads associated with the allergy-related topic and pain-related topic, respectively, in our sample. Further, depending on the developmental stage at which the sample was recruited, there may be differences in the phenotypic characterization, as development is a dynamic process and phenotypic manifestations may emerge later in life or change over the life course. Therefore, the reference GWAS studies of ASD, allergies, and chronic pain disorder, which were based on adult samples, might not provide unbiased references for the test sample of ASD, which was based on pediatric samples. Specifically, some comorbid conditions may only manifest after certain ages, and hence reference effect sizes based on the adult sample may lead to biased prediction using PRS values in the test sample consisting of children. The null association between PRSpain and Topic 17 may indicate that age also played a substantial role in Topic 17, a cluster of comorbid conditions over-represented by sensory processing issues such as low pain tolerance, despite adjusting for age in the model. Third, each of the topics of comorbidities refers to a cluster of various correlated but distinct phenotypes, where the genetic architectures may not be well captured by PRS derived from a GWAS study of one single phenotype. Fourth, predictive values of PRS may substantially decrease if training (i.e., reference cohort) and testing (i.e., scored cohort) data sets are drawn from different populations40. Fifth, the AGRE sample may be underpowered to detect weak PRS associations due to its small sample size.
Nonetheless, our research has several noteworthy strengths. First and foremost, we successfully applied the methodology employed by McCoy and colleagues to comorbid conditions associated with ASD, which is an arguably more complex psychiatric category for diagnosis and detection than MDD41,42, the domain of mental illness in which this approach was first tested. Second, our implementation used a semi-supervised approach to the selection of the optimal number of topics based upon recent advances in the applied topic modeling literature. Third, the use of a phenotypically well characterized sample to identify ASD subgroups using topic modeling is a substantial strength of the study. Finally, the way in which pseudo-EMRs were generated on the basis of clinical/diagnostic data for the purpose of topic modeling is novel and may well be useful in other analytic contexts.
In summary, the current findings suggest that the occurrence of two subgroups of ASD-related comorbidities (allergies and chronic pain) may be driven by shared underlying genetic risk factors for ASD. Notably, these two types of comorbid conditions could not be attributed to genetic variants associated with either allergies or chronic pain disorder in non-ASD populations. Such findings suggest that genetic mechanisms of certain comorbid conditions such as allergies and sensory processing issues in patients with ASD could differ from those of the non-ASD population. Further, it is possible that there is a subgroup where the specific comorbidities may indicate an alternate converging process such as a common immune pathway43. In this regard, immune system deficiencies and immune dysregulation in ASD may result in a wide variety of comorbidities such as allergic sensitivities, asthma, rashes, gastrointestinal and skin sensitivities as well as sensory issues. Thus, the findings of genetic contributors for comorbid conditions in ASD may inform clinical management strategies. For example, treatment of comorbid allergies in persons diagnosed with ASD may still depend on the clinical management of core symptoms of ASD rather than allergy-specific therapies, despite previously reported genetic correlations between ASD and allergies12. On the other hand, the pharmacological treatment for ASD may need to take underlying immune-related issues into account. Further, although emerging evidence suggests that peripheral somatosensory neurons are involved in tactile-related phenotypes in ASD44,45, genetic variants associated with pain disorder seem at best to play only a limited role in abnormalities in pain sensory perception in children with ASD. This deserves further exploration through research involving larger population samples, as better understanding of the underlying pathogenetic mechanisms involved in comorbid conditions in ASD may have clinical implications in the comprehensive assessment and management of these patients.
Acknowledgements
We thank all patients and families who provided their clinical data and biospecimens for this collaborative multi-site database. This project was not supported by any grant.
Author contributions
P.-I.L. conceptualized the study. L.K., S.D., L.-D.H., and P.-I.L. drafted the manuscript. L.K., S.D., L.-D.H., and P.-I.L. conducted analyses. P.-I.L. provided access to the data. V.E. provided expert advice and review of clinical implications. All authors revised and approved the manuscript.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Louis Klein and Shannon D’Urso.
These authors jointly supervised this work: Liang-Dar Hwang and Ping-I Lin.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-022-07399-7.
References
- 1. Hodges H, Fealko C, Soares N. Autism spectrum disorder: Definition, epidemiology, causes, and clinical evaluation. Transl. Pediatr. 2020;9:S55–S65. doi: 10.21037/tp.2019.09.09.
- 2. Al-Beltagi M. Autism medical comorbidities. WJCP. 2021;10:15–28. doi: 10.5409/wjcp.v10.i3.15.
- 3. Isaksen J, et al. Children with autism spectrum disorders - The importance of medical investigations. Eur. J. Paediatr. Neurol. 2013;17:68–76. doi: 10.1016/j.ejpn.2012.08.004.
- 4. Ramaswami G, Geschwind DH. Genetics of autism spectrum disorder. In: Geschwind DH, Paulson HL, Klein C, editors. Handbook of Clinical Neurology, Vol. 147 of Neurogenetics, Part I, Chap. 21. Elsevier B. V.; 2018. pp. 321–329.
- 5. Yoo H. Genetics of autism spectrum disorder: Current status and possible clinical applications. Exp. Neurobiol. 2015;24:257–272. doi: 10.5607/en.2015.24.4.257.
- 6. Autism Spectrum Disorder Working Group of the Psychiatric Genomics Consortium et al. Identification of common genetic risk variants for autism spectrum disorder. Nat. Genet. 2019;51:431–444. doi: 10.1038/s41588-019-0344-8.
- 7. Rylaarsdam L, Guemez-Gamboa A. Genetic causes and modifiers of autism spectrum disorder. Front. Cell. Neurosci. 2019;13:385. doi: 10.3389/fncel.2019.00385.
- 8. Solberg BS, et al. Patterns of psychiatric comorbidity and genetic correlations provide new insights into differences between attention-deficit/hyperactivity disorder and autism spectrum disorder. Biol. Psychiatry. 2019;86:587–598. doi: 10.1016/j.biopsych.2019.04.021.
- 9. Masi A, DeMayo MM, Glozier N, Guastella AJ. An overview of autism spectrum disorder, heterogeneity and treatment options. Neurosci. Bull. 2017;33:183–193. doi: 10.1007/s12264-017-0100-y.
- 10. David MM, et al. Comorbid analysis of genes associated with autism spectrum disorders reveals differential evolutionary constraints. PLoS ONE. 2016;11:e0157937. doi: 10.1371/journal.pone.0157937.
- 11. Diaz-Beltran L, et al. Cross-disorder comparative analysis of comorbid conditions reveals novel autism candidate genes. BMC Genom. 2017;18:315. doi: 10.1186/s12864-017-3667-9.
- 12. Tylee DS, et al. Genetic correlations among psychiatric and immune-related phenotypes based on genome-wide association data. Am. J. Med. Genet. 2018;177:641–657. doi: 10.1002/ajmg.b.32652.
- 13. Clarke C. Autism spectrum disorder and amplified pain. Case Rep. Psychiatry. 2015;2015:1–4. doi: 10.1155/2015/930874.
- 14. Gu X, et al. Heightened brain response to pain anticipation in high-functioning adults with autism spectrum disorder. Eur. J. Neurosci. 2018;47:592–601. doi: 10.1111/ejn.13598.
- 15. Johnston KJA, et al. Genome-wide association study of multisite chronic pain in UK Biobank. PLoS Genet. 2019;15:e1008164. doi: 10.1371/journal.pgen.1008164.
- 16. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 5th ed. American Psychiatric Association; 2013.
- 17. González-Peñas J, et al. Psychiatric comorbidities in Asperger syndrome are related with polygenic overlap and differ from other Autism subtypes. Transl. Psychiatry. 2020;10:258. doi: 10.1038/s41398-020-00939-7.
- 18. McCoy TH, et al. Polygenic loading for major depression is associated with specific medical comorbidity. Transl. Psychiatry. 2017;7:e1238. doi: 10.1038/tp.2017.201.
- 19. Geschwind DH, et al. The autism genetic resource exchange: A resource for the study of autism and related neuropsychiatric conditions. Am. J. Hum. Genet. 2001;69:463–466. doi: 10.1086/321292.
- 20. Lord C, Rutter M, Le Couteur A. Autism diagnostic interview-revised: A revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. J. Autism Dev. Disord. 1994;24:659–685. doi: 10.1007/BF02172145.
- 21. Lord C, et al. The autism diagnostic observation schedule-generic: A standard measure of social and communication deficits associated with the spectrum of autism. J. Autism Dev. Disord. 2000;30:205–223. doi: 10.1023/A:1005592401947.
- 22. Exchange, A. G. R. AGRE: Autism Genetic Resource Exchange (2008).
- 23. RCore. R: A language and environment for statistical computing (2018).
- 24. RStudio. RStudio: Integrated development for R (2020).
- 25. Hartigan JA, Wong MA. Algorithm AS 136: A K-means clustering algorithm. Appl. Stat. 1979;28:100. doi: 10.2307/2346830.
- 26. MacQueen, J. Some methods for classification and analysis of multivariate observations. in Proc. Fifth Berkeley Symposium on Mathematical Statistics Probability, Vol. 1, pp. 281–297 (1967).
- 27. De Battisti F, Ferrara A, Salini S. A decade of research in statistics: A topic model approach. Scientometrics. 2015;103:413–433. doi: 10.1007/s11192-015-1554-1.
- 28. Blei D, Ng A, Jordan M. Latent Dirichlet allocation. J. Mach. Learn. Res. 2001;3:601–608.
- 29. Chang J, Boyd-Graber J, Gerrish S, Wang C, Blei D. Reading tea leaves: How humans interpret topic models. Neural Inf. Process. Syst. 2009;32:288–296. doi: 10.5555/2984093.2984126.
- 30. Tang, J., Meng, Z., Nguyen, X., Mei, Q. & Zhang, M. Understanding the limiting factors of topic modeling via posterior contraction analysis. in 31st International Conference on Machine Learning (ICML 2014), 190–198 (Stroudsburg, 2014).
- 31. Grün B, Hornik K. Topicmodels: An R package for fitting topic models. J. Stat. Soft. 2011. doi: 10.18637/jss.v040.i13.
- 32. Fuchsberger C, Abecasis GR, Hinds DA. minimac2: Faster genotype imputation. Bioinformatics. 2015;31:782–784. doi: 10.1093/bioinformatics/btu704.
- 33. Loh P-R, et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat. Genet. 2016;48:1443–1448. doi: 10.1038/ng.3679.
- 34. 23andMe Research Team et al. Shared genetic origin of asthma, hay fever and eczema elucidates allergic disease biology. Nat. Genet. 2017;49:1752–1757. doi: 10.1038/ng.3985.
- 35. Chang CC, et al. Second-generation PLINK: Rising to the challenge of larger and richer datasets. GigaSci. 2015;4:7. doi: 10.1186/s13742-015-0047-8.
- 36. Ekawade A, et al. Genepanel.iobio - An easy to use web tool for generating disease- and phenotype-associated gene lists. BMC Med. Genom. 2019;12:190. doi: 10.1186/s12920-019-0641-1.
- 37. Garcia-Albornoz M, Nielsen J. Finding directionality and gene-disease predictions in disease associations. BMC Syst. Biol. 2015;9:35. doi: 10.1186/s12918-015-0184-9.
- 38. Australian Asthma Genetics Consortium (AAGC) et al. Meta-analysis of genome-wide association studies identifies ten loci influencing allergic sensitization. Nat. Genet. 2013;45:902–906. doi: 10.1038/ng.2694.
- 39. Brown CO, Uy J, Singh KK. A mini-review: Bridging the gap between autism spectrum disorder and pain comorbidities. Can. J. Pain. 2020;4:37–44. doi: 10.1080/24740527.2020.1775486.
- 40. Gola D, et al. Population bias in polygenic risk prediction models for coronary artery disease. Circ. Genom. Precis. Med. 2020. doi: 10.1161/CIRCGEN.120.002932.
- 41. Happé F, Ronald A, Plomin R. Time to give up on a single explanation for autism. Nat. Neurosci. 2006;9:1218–1220. doi: 10.1038/nn1770.
- 42. Waterhouse L. Rethinking Autism. Elsevier; 2013.
- 43. Mead J, Ashwood P. Evidence supporting an altered immune response in ASD. Immunol. Lett. 2015;163:49–55. doi: 10.1016/j.imlet.2014.11.006.
- 44. Orefice LL, et al. Peripheral mechanosensory neuron dysfunction underlies tactile and behavioral deficits in mouse models of ASDs. Cell. 2016;166:299–313. doi: 10.1016/j.cell.2016.05.033.
- 45. Orefice LL, et al. Targeting peripheral somatosensory neurons to improve tactile-related phenotypes in ASD models. Cell. 2019;178:867–886.e24. doi: 10.1016/j.cell.2019.07.024.