Abstract
Rationale:
Comorbid insomnia and sleep apnea (COMISA) is reported to have worse outcomes than either condition alone. The local genetic correlations between these disorders are unknown.
Objectives:
To identify local genomic regions with heritability for clinically diagnosed sleep apnea (SA) and insomnia; and identify local genetic correlations between these disorders and/or hypersomnia.
Methods:
50,217 patients of European ancestry were examined. Global and local heritability and genetic correlations for independent regions were calculated adjusting for obesity and other covariates.
Results:
SA and insomnia were significantly globally heritable and had 118 and 168 genetic regions with local heritability p-values < 0.05 respectively. One region had a significant genetic correlation for SA and hypersomnia (p-value = 9.85 × 10−4).
Conclusions:
Clinically diagnosed SA and insomnia have minimal shared genetic architecture, supporting genetically distinct COMISA components. However, additional correlated regions may be identified with additional sample size and methodological improvements.
Keywords: Sleep apnea, Insomnia, COMISA, Genetics, Heritability
Introduction
Sleep apnea (SA) is a common disease characterized by repetitive upper airway obstructions, leading to intermittent hypoxemia and increased risk of a range of disorders [1]. SA may present with either insomnia symptoms (comorbid insomnia and sleep apnea, termed “COMISA” [2]) or hypersomnia symptoms. Patients with comorbid SA presentations may conceivably have distinct comorbidity profiles and/or SA endophenotypes when compared with patients presenting with SA without insomnia or hypersomnia. SA as measured through objective recordings is heritable, with a family-based estimate of ≈ 0.35, over half of which is explained by non-obesity pathways [3]. Multiple genetic associations with SA have been identified (e.g., [4]). Complex trait analyses may require >250,000 participants to fully discover grouped pathway effects [5]. Electronic health record studies may boost study power, but issues including data completeness, billing diagnoses, and the ability of ICD codes to fully describe complex phenotypes must be addressed [6]. We have demonstrated an SA phenotyping algorithm based on natural language processing (NLP) with improved positive predictive value in chart reviews [7] that may provide utility in large-scale genetic analyses.
Insomnia is also a common disorder with a significant genetic basis [8]. Recent epidemiological studies have identified increased mortality risks for individuals with COMISA compared to individuals with obstructive sleep apnea (OSA) [9-11]. Insomnia hazard ratio point estimates were sometimes higher than those for OSA, underscoring a need to further characterize the potential relationship between insomnia and SA. Global genetic correlation analyses are one means of understanding heritable and shared genetic architecture between two traits. Whether these clinically overlapping phenotypes map to distinct genetic regions is unknown. Genome-wide genetic relationships between pairs of traits may be nuanced, and local interactions may differentially contribute to risk [12]. Local genetic correlation analyses can be used to identify and prioritize specific regions with positive or negative genetic correlations between two traits that may be obscured when performing global genetic correlation analyses.
In this study, we performed global and local genetic heritability analyses of SA and insomnia considering obesity and other covariates in a large clinical biobank. We consider potential local genetic correlations between the two disorders. Since cluster analyses indicate increased risk for patients with SA who report sleepiness [13,14], we also evaluated genetic correlations between SA and hypersomnia.
Methods
This study was approved by the Mass General Brigham Institutional Review Board (2021P013546). Studied participants from the MGB biobank [15] were aged ≥ 18 years and of European ancestry, met a minimum data floor of three clinical notes, diagnoses, and encounters, and had available age, body mass index (BMI), and sex at birth information.
We performed NLP on clinic notes, extracted relevant SA ICD diagnoses (seven ICD9 and four ICD10 codes, Table S1), and phenotyped SA as per our published algorithm [7], excluding patients with SA ICD codes who were classified as controls by the algorithm. We used grouped ICD codes and a semi-supervised NLP algorithm to classify patients with insomnia (eight ICD9 and ten ICD10 codes) and hypersomnia (eleven ICD9 and twelve ICD10 codes, Table S1) [16,17]. Insomnia and hypersomnia NLP terms matched the text of relevant ICD diagnoses.
Age was set to the first diagnosis date for a given case or the last encounter date for a control. BMI was set as the average of the two measurements from the two closest dates to the assigned age. Current and former smoking status was assigned based on a questionnaire administered by the MGB biobank supplemented by tobacco use disorder diagnoses (Table S1).
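The covariate assignment rule described above can be illustrated with a minimal sketch. The function names, the tuple layout of the BMI records, and the example values are hypothetical and are not taken from the study pipeline; falling back to a single measurement when only one exists is also an assumption.

```python
from datetime import date


def index_date(is_case, first_diagnosis, last_encounter):
    """Cases are indexed at first diagnosis, controls at last encounter."""
    return first_diagnosis if is_case else last_encounter


def covariate_bmi(measurements, index):
    """Average the two BMI values recorded closest in time to the index date.

    measurements: list of (date, bmi) tuples (hypothetical input layout).
    If only one measurement exists, it is returned unchanged (assumption).
    """
    if not measurements:
        raise ValueError("no BMI measurements available")
    # Rank measurements by absolute distance in days from the index date.
    nearest = sorted(measurements, key=lambda m: abs((m[0] - index).days))[:2]
    return sum(bmi for _, bmi in nearest) / len(nearest)


# Illustrative records only; not study data.
records = [
    (date(2015, 3, 1), 29.0),
    (date(2018, 6, 15), 31.0),
    (date(2019, 1, 10), 32.0),
    (date(2021, 9, 5), 30.0),
]
idx = index_date(True, date(2018, 12, 1), date(2022, 1, 1))
bmi = covariate_bmi(records, idx)
```

In this example the case is indexed at the first diagnosis date, and the two measurements nearest that date (one before, one after) are averaged.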
Biobank participants were genotyped using Illumina MEGA and GSA arrays and imputed to the NHLBI Trans-Omics for Precision Medicine (TOPMed) panel using the TOPMed Imputation Server [18]. Imputed variants with a minimum r2 of 0.95 and present in high-coverage whole-genome sequencing of 1000 Genomes European ancestry participants at 0.5% minor allele frequency were retained. All regions are listed using Build 38 coordinates.
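The variant retention criteria can be sketched as a simple filter. This is an illustration under stated assumptions: the `Variant` record, its field names, and the variant identifiers are hypothetical, and the reference-panel criterion is interpreted here as a minor allele frequency of at least 0.5% among 1000 Genomes European ancestry participants.

```python
from dataclasses import dataclass
from typing import Optional

MIN_IMPUTATION_R2 = 0.95   # imputation quality threshold stated in the text
MIN_REFERENCE_MAF = 0.005  # 0.5% minor allele frequency in the reference panel


@dataclass(frozen=True)
class Variant:
    variant_id: str                 # hypothetical identifier
    imputation_r2: float            # imputation quality reported by the server
    reference_maf: Optional[float]  # None if absent from the reference panel


def passes_qc(v):
    """Keep well-imputed variants that are present and common enough in the reference."""
    if v.reference_maf is None:
        return False
    return v.imputation_r2 >= MIN_IMPUTATION_R2 and v.reference_maf >= MIN_REFERENCE_MAF


# Illustrative variants only; not study data.
variants = [
    Variant("chr5:81200000:A:G", 0.99, 0.12),
    Variant("chr3:168600000:C:T", 0.90, 0.30),  # fails imputation r2
    Variant("chr8:85500000:G:A", 0.97, 0.001),  # fails reference MAF
    Variant("chr6:27600000:T:C", 0.96, None),   # absent from reference
]
retained = [v.variant_id for v in variants if passes_qc(v)]
```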
Global heritability analyses were performed using PCGC and LDAK [19,20], adjusting for age, sex at birth, BMI, and 10 population principal components [21], both with and without smoking status to evaluate potential heritability improvements in this clinically diagnosed sample. LAVA was used to calculate local genetic heritability estimates [22]. Bivariate genetic correlation analyses were conducted for independent genomic regions. Genetic correlations considering SA versus insomnia and hypersomnia were performed for loci with univariate heritability p-values < 0.05.
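The locus selection step for the bivariate analyses can be sketched as follows. This is not the LAVA interface; it only illustrates the filtering logic, assuming that a locus is carried forward when both traits have univariate heritability p-values below 0.05. The function name, locus labels, and p-values are hypothetical.

```python
NOMINAL_P = 0.05  # univariate heritability threshold stated in the text


def regions_for_bivariate(h2_p_trait_a, h2_p_trait_b, threshold=NOMINAL_P):
    """Return locus IDs where both traits show nominal local heritability.

    Each argument maps a locus ID to that trait's univariate h2 p-value.
    """
    shared = h2_p_trait_a.keys() & h2_p_trait_b.keys()
    return sorted(
        locus
        for locus in shared
        if h2_p_trait_a[locus] < threshold and h2_p_trait_b[locus] < threshold
    )


# Illustrative p-values only; not study results.
sa_p = {"locus_1": 1e-4, "locus_2": 0.20, "locus_3": 0.03, "locus_4": 0.01}
insomnia_p = {"locus_1": 0.60, "locus_2": 0.01, "locus_3": 0.04, "locus_4": 0.002}
selected = regions_for_bivariate(sa_p, insomnia_p)
```

Restricting bivariate tests to loci with a univariate signal in both traits reduces the number of comparisons, since a genetic correlation is not interpretable where one trait shows no local heritability.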
Results
Sample characteristics
We examined 50,225 adult patients with European ancestry. The median (interquartile range) of age was 61.2 (24.1) and BMI was 27.3 (7.8). 54.2% of this sample were women, with 17.6% classified as current smokers and 20.2% classified as former smokers. The prevalence rates of NLP-defined clinically diagnosed SA, insomnia, hypersomnia, and COMISA were 8.3%, 7.4%, 1.4%, and 1.3% respectively. Patients with both insomnia and SA diagnoses on average had later insomnia diagnoses (men median 1.1 years later, IQR 5.2 years; women 0.2 years later, IQR 5.5 years), although a large portion of the patients had earlier insomnia diagnoses (Figure S1).
Global heritability estimates
Global heritability estimates are listed in Table S2. SA was significantly heritable (h2g [SE] 0.168 [0.067], p = 2.79 × 10−5), as was insomnia (h2g [SE] 0.077 [0.031], p = 1.24 × 10−4). Hypersomnia, which was much rarer, had a non-significant heritability estimate of 0.126 (SE 0.102; p = 0.08). Adjustment for smoking did not appreciably change the heritability estimates.
Local heritability estimates and genetic correlations
Local heritability analyses can highlight specific regions of biological interest, even when genome-wide effects are largely neutral. We calculated local genetic heritability estimates for SA and insomnia in 2,818 autosomal regions. Associated regions are shown in Table S3. Lead results among the 118 SA-associated regions and 168 insomnia-associated regions are summarized in Table 1. The most heritable region was on chromosome 5 (81,174,678 – 82,414,318 SA h2g p-value = 1.86 × 10−11).
Table 1.
Lead sleep apnea and insomnia local genetic heritability regions.
| Phenotype | Chromosome | Start | End | h2g p-value | Protein-coding Genes in Region |
|---|---|---|---|---|---|
| Sleep apnea | 5 | 81,174,678 | 82,414,318 | 1.86 × 10−11 | ACOT12, ATG10, CKMT2, RASGRF2, RPS23, SSBP2, ZCCHC9 |
| Sleep apnea | 3 | 168,549,584 | 169,880,814 | 2.67 × 10−6 | ACTRT3, LRRC31, LRRC34, LRRIQ4, MECOM, MYNN |
| Sleep apnea | 6 | 27,518,868 | 28,703,605 | 3.58 × 10−5 | GPX5, GPX6, H1-5, H2AC13, H2AC14, H2AC15, H2AC16, H2AC17, H2BC13, H2BC14, H2BC15, H2BC17, H3C10, H3C11, H3C12, H4C11, H4C12, H4C13, NKAPL, OR2B2, OR2B6, OR2B8P, PGBD1, ZBED9, ZKSCAN3, ZKSCAN4, ZKSCAN8, ZKSCAN8P1, ZNF165, ZSCAN12, ZSCAN16, ZSCAN23, ZSCAN26, ZSCAN31, ZSCAN9 |
| Sleep apnea | 8 | 85,490,346 | 86,552,020 | 1.36 × 10−4 | ATP6V0D2, CPNE3, PSKH2, RMDN1, SLC7A13, WWP1 |
| Sleep apnea | 6 | 11,217,931 | 11,775,181 | 2.36 × 10−4 | ADTRP, NEDD9, TMEM170B |
| Sleep apnea | 9 | 104,298,903 | 105,440,109 | 2.76 × 10−4 | ABCA1, NIPSNAP3A, NIPSNAP3B, OR13C2, OR13C3, OR13C4, OR13C5, OR13C8, OR13C9, OR13D1, OR13F1, SLC44A1 |
| Sleep apnea | 8 | 9,979,082 | 10,620,638 | 2.79 × 10−4 | MSRA, PRSS51, PRSS55, RP1L1 |
| Sleep apnea | 8 | 66,215,247 | 67,662,602 | 3.06 × 10−4 | ADHFE1, ARFGEF1, C8orf44-SGK3, COPS5, CPA6, CSPP1, MCMDC2, MYBL1, PPP1R42, RRS1, SGK3, TCF24, VCPIP1, VXN |
| Sleep apnea | 12 | 4,858,524 | 5,775,562 | 4.50 × 10−4 | ANO2, KCNA1, KCNA5, NTF3 |
| Sleep apnea | 1 | 158,930,438 | 159,762,876 | 8.92 × 10−4 | ACKR1, AIM2, APCS, CADM3, CRP, FCER1A, IFI16, OR10J1, OR10J3, OR10J4, OR10J5, PYDC5, PYHIN1 |
| Insomnia | 10 | 102,437,650 | 103,700,556 | 3.71 × 10−5 | ACTR1A, ARL3, AS3MT, ATP5MK, BORCS7, BORCS7-ASMT, C10orf95, CALHM1, CALHM2, CALHM3, CNNM2, CYP17A1, INA, MFSD13A, NEURL1, NT5C2, PCGF6, PDCD11, RPEL1, SFXN2, SH3PXD2A, SUFU, TAF5, TRIM8, WBP1L |
| Insomnia | 9 | 94,710,721 | 95,785,833 | 4.02 × 10−5 | AOPEP, FANCC, PTCH1 |
| Insomnia | 2 | 1,377,422 | 1,991,457 | 4.73 × 10−5 | MYT1L, PXDN, TPO |
| Insomnia | 2 | 51,442,356 | 52,267,286 | 1.26 × 10−4 | |
| Insomnia | 11 | 11,337,460 | 12,128,955 | 1.65 × 10−4 | CSNK2A3, DKK3, GALNT18, MICAL2, USP47 |
| Insomnia | 4 | 148,808,693 | 149,733,447 | 1.88 × 10−4 | IQCM |
| Insomnia | 8 | 18,428,040 | 19,106,735 | 2.02 × 10−4 | PSD3 |
| Insomnia | 14 | 65,832,798 | 66,559,780 | 2.76 × 10−4 | CCDC196, GPHN |
| Insomnia | 7 | 128,263,781 | 129,144,853 | 7.51 × 10−4 | ATP6V1F, ATP6V1FNB, CALU, CCDC136, FLNC, GARIN1A, GARIN1B, HILPDA, IMPDH1, IRF5, KCP, METTL2B, OPN1SW, PRRT4, RBM28, TNPO3, TSPAN33 |
| Insomnia | 1 | 58,073,339 | 58,847,823 | 8.69 × 10−4 | DAB1, JUN, MYSM1, OMA1, TACSTD2 |
All calculations adjusted for age, sex, body mass index, and 10 population principal components. Region borders were calculated by LAVA [22] using the Mass General Brigham data with Build 38 coordinates. Full results for nominally significant regions are provided in Table S3. h2g: single nucleotide variant-based heritability.
We calculated SA-insomnia genetic correlations for regions in which both traits had univariate heritability p-values < 0.05 and performed equivalent SA-hypersomnia calculations. Six regions had overlapping insomnia and SA univariate heritability p-values < 0.05, and 9 regions had overlapping hypersomnia and SA univariate heritability p-values < 0.05 (Table S4). Within these 15 regions, one region had a significant genetic correlation for SA and hypersomnia (chr3:168,549,584 – 169,880,814; r2 = 0.35; p-value = 9.85 × 10−4).
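As a worked check on the scale of this result, the sketch below compares the reported p-value with a Bonferroni threshold across the 15 tested regions. The text does not state which multiple-testing procedure was applied, so the Bonferroni correction and the 0.05 family-wise alpha are assumptions made only for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Counts and p-value as reported in the text.
n_tests = 6 + 9          # SA-insomnia regions plus SA-hypersomnia regions
reported_p = 9.85e-4     # SA-hypersomnia correlation on chromosome 3

# Assumed correction: family-wise alpha of 0.05 across all 15 tests.
threshold = bonferroni_threshold(0.05, n_tests)
is_significant = reported_p < threshold
```

Under this assumed correction the per-test threshold is 0.05 / 15, roughly 3.3 × 10−3, and the reported p-value falls below it.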
Discussion
In this study, we used genetic analyses to begin to parse out the extent to which SA and its symptom-based presentations (COMISA and SA-hypersomnia) represent genetically determined complex phenotypes or the co-occurrence of unique traits. We performed genome-wide local genetic heritability calculations considering clinically diagnosed sleep apnea and insomnia adjusted for obesity and other covariates in a large biobank and asked whether there may be a shared genetic architecture for comorbid insomnia and sleep apnea (COMISA). We also examined local genetic correlations between sleep apnea and hypersomnia, performed global heritability analyses, and examined the potential effects of adjustment for smoking. We identified regions with significant heritability, including regions with overlapping signals between SA and insomnia and between SA and hypersomnia. We identified a single region that was significantly genetically correlated between SA and hypersomnia but did not identify significant local genetic correlations between SA and insomnia. These results indicate that COMISA is likely composed of separate genetic contributions from SA and insomnia pathways, based on our current methods and sample size.
Accurate phenotyping using billing code information must address multiple challenges, including data completeness, biases arising from differential healthcare utilization, rule-out diagnoses, sparsity of individual ICD diagnoses and the ability of ICD codes to fully describe complex phenotypes [6]. We address these issues using healthcare utilization data floors and adjustments [7,17], grouped PheCode diagnoses, and NLP-informed phenotyping algorithms. Our SA phenotyping algorithm demonstrated improved positive predictive value compared to single-ICD case assignments using chart reviews [7] and here demonstrates elevated heritability relative to other biobank estimates [23], despite a smaller sample size, exclusion of some patients with uncertain case/control status, and adjustment for obesity and other covariates. Additional algorithm refinements that consider SA endotypes reflected in clinical note phrases and patterns of multimorbidity are possible and may further improve performance. Insomnia phenotyping would likely improve based on a similar strategy that separately considers the effects of early, middle, and late insomnia that may have heterogeneous genetic architecture and differential effects on comorbid disease risk.
The prevalence of insomnia, SA, and COMISA varies by study [9-11]. Insomnia, SA, and COMISA have reported prevalences of 6-36.5%, 23.4-53.5%, and 3-19.3% respectively. There are a number of design differences to consider when comparing the prevalences of these disorders, including but not limited to community versus hospital ascertainment, use of symptoms and/or AHI > 5 criteria, oversampling of self-reported snorers [24], and changing prevalences across years (e.g. [25]).
We identified multiple regions with local genetic heritability for SA and insomnia (Tables 1 and S3). Additional studies are required to determine whether one or more protein-coding genes or other mechanisms may be associated. Aggregating analyses into units and pathways is a pragmatic means of building power for analyses that will likely require hundreds of thousands of samples to saturate signals [5]. Local genetic correlations provide a means of identifying effects that may be obscured at a global level [22], improve power by reducing comparisons, and highlight regions of potential biological significance. Despite identifying regions where both SA and insomnia have local genetic heritability, we did not identify regions that have significant correlations between the two traits. LAVA has been used to identify local genetic correlations using smaller samples of trait pairs with strong biological plausibility [22], but we cannot rule out identifying shared genetic architecture between SA and insomnia in the future as we refine our methods and begin to meta-analyze results with other biobanks.
Strengths of this study include a large sample of clinically defined patients with SA and insomnia with NLP and healthcare utilization phenotyping refinements. We considered local genetic correlations between SA and insomnia for the first time. Limitations of this study include a focus on European-ancestry patients (our largest sample), and potential limits on study power. We will address these limitations as we refine our methods and begin to perform meta-analyses with data from other biobanks.
Conclusions
We have identified multiple regions of local genetic heritability for SA and insomnia. None of these regions showed significant genetic correlations between SA and insomnia, indicating that the genetic architecture of COMISA as identified in medical records is likely composed of separate SA and insomnia contributions.
Supplementary Material
Public Health Relevance Statement.
Insomnia and sleep apnea are both common, heritable diseases that increase the risk of a range of disorders. Some studies indicate that comorbid insomnia and sleep apnea (COMISA) increases mortality risks relative to the risks associated with either disease alone. This could conceivably be due to shared genetic mechanisms that contribute to a distinct COMISA endotype. To our knowledge, this is the first study to examine local genetic architecture potentially shared between the two disorders. We identified novel genomic regions with clinically diagnosed sleep apnea and insomnia heritability but with minimal shared genetic architecture, supporting genetically distinct COMISA components.
Acknowledgments
The authors wish to thank the MGB Biobank and patients for providing health information data. This work was supported by funding from the National Heart, Lung, and Blood Institute (R01 HL153805, R35 HL135818).
Funding:
Brian Cade is supported by grants from the National Institutes of Health (R01-HL153805). Susan Redline is supported by grants from the National Institutes of Health (R35-HL135818).
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Competing Interest Statement: The authors declare no competing interest.
References
- 1. Peppard PE, Hagen EW. The Last 25 Years of Obstructive Sleep Apnea Epidemiology-and the Next 25? Am J Respir Crit Care Med. 2018;197:310–2. doi: 10.1164/rccm.201708-1614PP
- 2. Sweetman AM, Lack LC, Catcheside PG, et al. Developing a successful treatment for co-morbid insomnia and sleep apnoea. Sleep Med Rev. 2017;33:28–38.
- 3. Patel SR, Larkin EK, Redline S. Shared genetic basis for obstructive sleep apnea and adiposity measures. Int J Obes. 2008;32:795–800.
- 4. Cade BE, Lee J, Sofer T, et al. Whole-genome association analyses of sleep-disordered breathing phenotypes in the NHLBI TOPMed program. Genome Med. 2021;13:136.
- 5. Yengo L, Vedantam S, Marouli E, et al. A saturated map of common genetic variants associated with human height. Nature. 2022;610:704–12.
- 6. Liao KP, Cai T, Savova GK, et al. Development of phenotype algorithms using electronic medical records and incorporating natural language processing. BMJ. 2015;350:h1885.
- 7. Cade BE, Hassan SM, Dashti HS, et al. Sleep apnea phenotyping and relationship to disease in a large clinical biobank. JAMIA Open. 2022;5:ooab117.
- 8. Van Someren EJW. Brain mechanisms of insomnia: new perspectives on causes and consequences. Physiol Rev. 2021;101:995–1046.
- 9. Lechat B, Appleton S, Melaku YA, et al. Comorbid insomnia and sleep apnoea is associated with all-cause mortality. Eur Respir J. 2022;60:2101958.
- 10. Lechat B, Loffler KA, Wallace DM, et al. All-Cause Mortality in People with Co-Occurring Insomnia Symptoms and Sleep Apnea: Analysis of the Wisconsin Sleep Cohort. Nat Sci Sleep. 2022;14:1817–28.
- 11. Sweetman A, Lechat B, Appleton S, et al. Association of co-morbid insomnia and sleep apnoea symptoms with all-cause mortality: Analysis of the NHANES 2005-2008 data. Sleep Epidemiol. 2022;2:100043.
- 12. Goodman MO, Cade BE, Shah NA, et al. Pathway-Specific Polygenic Risk Scores Identify Obstructive Sleep Apnea-Related Pathways Differentially Moderating Genetic Susceptibility to Coronary Artery Disease. Circ Genomic Precis Med. 2022;15:e003535.
- 13. Mazzotti DR, Keenan BT, Lim DC, et al. Symptom Subtypes of Obstructive Sleep Apnea Predict Incidence of Cardiovascular Outcomes. Am J Respir Crit Care Med. 2019;200:493–506. doi: 10.1164/rccm.201808-1509OC
- 14. Labarca G, Dreyse J, Salas C, et al. A Validation Study of Four Different Cluster Analyses of OSA and the Incidence of Cardiovascular Mortality in a Hispanic Population. Chest. 2021;160:2266–74.
- 15. Boutin NT, Schecter SB, Perez EF, et al. The Evolution of a Large Biobank at Mass General Brigham. J Pers Med. 2022;12:1323.
- 16. Wu P, Gifford A, Meng X, et al. Mapping ICD-10 and ICD-10-CM Codes to Phecodes: Workflow Development and Initial Evaluation. JMIR Med Inform. 2019;7:e14325.
- 17. Liao KP, Sun J, Cai TA, et al. High-throughput multimodal automated phenotyping (MAP) with application to PheWAS. J Am Med Inform Assoc JAMIA. 2019;26:1255–62. doi: 10.1093/jamia/ocz066
- 18. Das S, Forer L, Schönherr S, et al. Next-generation genotype imputation service and methods. Nat Genet. 2016;48:1284–7. doi: 10.1038/ng.3656
- 19. Weissbrod O, Flint J, Rosset S. Estimating SNP-Based Heritability and Genetic Correlation in Case-Control Studies Directly and with Summary Statistics. Am J Hum Genet. 2018;103:89–99. doi: 10.1016/j.ajhg.2018.06.002
- 20. Zhang Q, Privé F, Vilhjálmsson B, et al. Improved genetic prediction of complex traits from individual-level data or summary statistics. Nat Commun. 2021;12:4192.
- 21. Zhang D, Dey R, Lee S. Fast and robust ancestry prediction using principal component analysis. Bioinforma Oxf Engl. 2020;36:3439–46.
- 22. Werme J, van der Sluis S, Posthuma D, et al. An integrated framework for local genetic correlation analysis. Nat Genet. 2022;54:274–82.
- 23. Strausz S, Ruotsalainen S, Ollila HM, et al. Genetic analysis of obstructive sleep apnoea discovers a strong association with cardiometabolic health. Eur Respir J. Published Online First: 10 December 2020. doi: 10.1183/13993003.03091-2020
- 24. Quan SF, Howard BV, Iber C, et al. The Sleep Heart Health Study: design, rationale, and methods. Sleep. 1997;20:1077–85.
- 25. Ford ES, Cunningham TJ, Giles WH, et al. Trends in insomnia and excessive daytime sleepiness among U.S. adults from 2002 to 2012. Sleep Med. 2015;16:372–8.
