Sleep. 2011 Nov 1;34(11):1469–1477. doi: 10.5665/sleep.1378

Identification of Causal Genes, Networks, and Transcriptional Regulators of REM Sleep and Wake

Joshua Millstein 1, Christopher J Winrow 2, Andrew Kasarskis 3, Joseph R Owens 4, Lili Zhou 4, Keith C Summa 4, Karrie Fitzpatrick 4, Bin Zhang 1, Martha H Vitaterna 4, Eric E Schadt 3, John J Renger 2, Fred W Turek 4
PMCID: PMC3198032  PMID: 22043117

Abstract

Study Objective:

Sleep-wake traits are well-known to be under substantial genetic control, but the specific genes and gene networks underlying primary sleep-wake traits have largely eluded identification using conventional approaches, especially in mammals. Thus, the aim of this study was to use systems genetics and statistical approaches to uncover the genetic networks underlying 2 primary sleep traits in the mouse: 24-h duration of REM sleep and wake.

Design:

Genome-wide RNA expression data from 3 tissues (anterior cortex, hypothalamus, thalamus/midbrain) were used in conjunction with high-density genotyping to identify candidate causal genes and networks mediating the effects of 2 QTL regulating the 24-h duration of REM sleep and one regulating the 24-h duration of wake.

Setting:

Basic sleep research laboratory.

Patients or Participants:

Male [C57BL/6J × (BALB/cByJ × C57BL/6J*) F1] N2 mice (n = 283).

Interventions:

None.

Measurements and Results:

The genetic variation of a mouse N2 mapping cross was leveraged against sleep-state phenotypic variation as well as quantitative gene expression measurement in key brain regions using integrative genomics approaches to uncover multiple causal sleep-state regulatory genes, including several surprising novel candidates, which interact as components of networks that modulate REM sleep and wake. In particular, it was discovered that a core network module, consisting of 20 genes, involved in the regulation of REM sleep duration is conserved across the cortex, hypothalamus, and thalamus. A novel application of a formal causal inference test was also used to identify those genes directly regulating sleep via control of expression.

Conclusion:

Systems genetics approaches reveal novel candidate genes, complex networks and specific transcriptional regulators of REM sleep and wake duration in mammals.

Citation:

Millstein J; Winrow CJ; Kasarskis A; Owens JR; Zhou L; Summa KC; Fitzpatrick K; Zhang B; Vitaterna MH; Schadt EE; Renger JJ; Turek FW. Identification of causal genes, networks, and transcriptional regulators of REM sleep and wake. SLEEP 2011;34(11):1469-1477.

Keywords: Systems genetics, mouse, REM sleep, gene expression

INTRODUCTION

Sleep is a fundamental, evolutionarily conserved, homeostatically regulated process characterized behaviorally by decreased activity and reduced responsiveness to environmental stimulation. A rapidly growing body of evidence links sleep disturbances with a range of physical and psychiatric disease states, and sleep disorders are common in humans. Although sleep-wake traits are well-known to be under substantial genetic control,1 the specific genes and/or gene networks underlying primary sleep-wake traits have largely eluded identification using conventional approaches, especially in mammals. Studies exploring the genetic basis of sleep in rodents have typically either examined the effect of single genes or documented the role of individual genomic loci without identifying the specific gene(s) responsible for their effects.2 Recently, a systems biology approach was utilized to uncover candidate genes and gene networks responsible for regulating sleep in Drosophila.3 Similar approaches have led to significant advances in our understanding of the circadian timing system4,5; however, this methodology has not been applied to mammalian sleep.

In a previous study, we reported the identification of 52 Quantitative Trait Loci (QTL) for a variety of sleep-wake related traits in mice from a segregating [C57BL/6J × (BALB/cByJ × C57BL/6J*)F1]N2 cross (n = 269; while phenotype data were collected from 283 mice, full genotype data were obtained in 269 mice).6 Here, we combine genome-wide RNA expression data from 3 brain regions relevant for sleep-wake regulation (anterior cortex [referred to as “cortex” hereafter], hypothalamus, and thalamus/midbrain [referred to as “thalamus” hereafter]) collected in a subset of these animals (n = 101) with high-density genotyping to identify candidate causal genes and networks that mediate the effects of three of these QTL regulating primary sleep traits: 2 for the 24-h duration of rapid eye movement (REM) sleep and one for the 24-h duration of wake (Table 1). The overall analysis involved 5 steps: (1) transcripts were identified that were regulated by the three 24-h sleep-state duration QTL; (2) the causal inference test (CIT)7 was used to identify those transcripts from each tissue most likely to mediate the effects of the QTL; (3) co-expression network analysis was conducted to identify modules of co-regulated genes within each tissue; (4) modules significantly enriched for genes from step 2 were identified; and (5) the CIT was used to reconstruct transcriptional regulatory networks around genes from step 2. This integrative systems genetics approach has allowed us to identify candidate genes, gene networks, and transcriptional regulators involved in the primary sleep traits of REM sleep and wake duration in the mouse.

Table 1.

Quantitative trait loci (QTL) for 24-hour REM sleep and wake duration

QTL     Trait  LOD   Peak start (MB)  Peak end (MB)  Peak max (MB)  SNP ID
Q5@49   REM    4.22  4.09             137.03         73.50          rs13478324
Q13@2   Wake   5.14  4.69             83.45          16.72          rs13481708
Q13@23  REM    6.30  4.69             105.55         68.69          rs13481861

Peak boundaries are defined by a LOD score of 1 or less (see Winrow6 for full details).

MATERIALS AND METHODS

Mice and Tissue Samples

The electroencephalogram (EEG) and electromyogram (EMG) were recorded continuously over 48-h in 283 male mice from a segregating [C57BL/6J × (BALB/cByJ × C57BL/6J*)F1] N2 population (referred to as N2 hereafter) as described previously.6 The first 101 mice completing the in vivo study were euthanized, unanesthetized, by conscious decapitation 6-7-h after light onset, and dissected in a protocol that extracted thalamus, hypothalamus, and frontal cortex.8 Brain tissue samples were immediately flash-frozen in liquid nitrogen and stored at -80°C before being shipped to Rosetta Inpharmatics in a single batch. At the Rosetta Gene Expression Laboratory, mouse brain tissues were homogenized, and total RNA was extracted using Trizol reagent (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. All mice were housed and handled according to the Federal Animal Welfare guidelines, and all studies were approved in advance by the Animal Care and Use Committee at Northwestern University.

Sleep-Wake Recordings

At 10-12 weeks of age, mice were implanted with EEG/EMG recording electrodes as described previously.9 A 10-day recovery period was observed after surgery before sleep recording was initiated. Mice were individually housed in cylindrical (25.5 cm diameter) sleep recording cages with ad libitum access to food and water for ≥ 5 days to ensure acclimation. EEG/EMG data were collected continuously for 48-h starting at light onset.9 With the use of a custom software package (SleepReport, Actimetrics, Evanston, IL), EEG and EMG recordings were divided into 10-sec epochs and scored via visual inspection as wake, non-REM (NREM) sleep, or REM sleep.

Genotype Analysis

All DNA samples were genotyped on the Affymetrix MegAllele genotyping mouse 5K SNP panel (www.affymetrix.com/support/technical/datasheets/parallele_mouse5k_datasheet.pdf), which consists of approximately 5,500 SNPs evenly distributed across the genome with approximately 2,310 of them being informative for the C57BL/6J and BALB/cByJ inbred strains. Small tail biopsies were obtained from each mouse for genotyping. Tail tissue was stored frozen until DNA isolation, which was performed using the DNeasy Kit according to the manufacturer's instructions (Qiagen, Valencia, CA). After isolation, DNA was quantified for quality control by fluorometry using PicoGreen (Invitrogen, Carlsbad, CA) and stored at -20°C. It was shipped on dry ice, and the concentration was adjusted according to the manufacturer's instructions prior to genotyping.

Gene Expression Profiling

RNA preparation and array hybridizations were performed at Rosetta Inpharmatics. The custom inkjet microarrays were manufactured by Agilent Technologies (Palo Alto, CA). Each custom array consisted of 39,280 non-control oligonucleotides, constructed from sequence data extracted from the mouse Unigene clusters combined with RefSeq sequences and RIKEN full-length cDNA clones.10 Three micrograms of total RNA were reverse transcribed and labeled with either Cy3 or Cy5 fluorochrome. Labeled complementary RNA (cRNA) from each animal was hybridized against a pool of labeled cRNAs constructed from equal-mass aliquots of RNA from random N2 animals. The hybridizations were performed in fluor reversal for 24-h in a hybridization chamber, washed and scanned using a confocal laser scanner. Arrays were quantified on the basis of spot intensity relative to background, adjusted for experimental variation between arrays using average intensity over multiple channels and fitted to a previously described error model to determine significance (type I error).11 After excluding probes that contained known SNPs and probes that were not considered to be poly-A reliable, the resulting dataset included 28,053 probes.

Statistical Analysis

In the past few years, significant progress has been made in the analysis of large-scale genotype and gene expression datasets in the context of various complex traits in genetically segregating mouse populations. In particular, mathematical procedures and statistical tools have been developed to investigate the relationships between the variations in DNA sequence and mRNA transcript abundance.12 For example, Schadt and colleagues describe a multistep model selection procedure using complex trait, DNA sequence, and gene expression data to identify key drivers of the complex trait(s) of interest.12 The procedure determines if the relationships between DNA variation and transcript abundance statistically support causal, reactive, or independent models of regulation relative to the trait(s) under examination. In the causal model, the transcript is under the control of the DNA locus and acts upon the trait of interest. In the reactive model, the transcript responds to the trait of interest which is directly under the control of the DNA locus. In the independent model, both the transcript and the trait of interest are controlled by the DNA locus, but are done so independently of one another. This approach (see12 for a detailed description and examples), termed causal inference methodology, can be used in an unbiased manner to identify causal genes for complex traits of interest.
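As a compact restatement, the three models can be written as factorizations of the joint distribution of the DNA locus, the transcript abundance, and the trait. The symbols L (locus), G (transcript), and T (trait) and the factorizations below are one reading of the verbal descriptions above; they are illustrative assumptions rather than notation taken from reference 12.

```latex
\begin{align*}
\text{Causal:}      \quad & P(L, G, T) = P(L)\,P(G \mid L)\,P(T \mid G) \\
\text{Reactive:}    \quad & P(L, G, T) = P(L)\,P(T \mid L)\,P(G \mid T) \\
\text{Independent:} \quad & P(L, G, T) = P(L)\,P(G \mid L)\,P(T \mid L)
\end{align*}
```

Under the causal factorization, the trait is conditionally independent of the locus once the transcript is known, whereas under the other two it generally is not. This is the property that distinguishes a mediating transcript.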

In the present study, all causal inference was conducted using the causal inference test (CIT), an omnibus test composed of a set of component hypothesis tests, which implies statistical consistency with causal mediation when all component tests are rejected.7 The most essential component test detects conditional independence between the QTL and the trait given the expression levels of the candidate causal gene. As implemented by Millstein et al., the CIT depends on a novel semi-parametric equivalence testing approach to estimate the non-centrality parameter of an F distribution.7 In the implementation of the CIT reported here, the conditional independence component test was fully nonparametric. That is, rather than estimating the non-centrality parameter, we used the empirical null distribution itself as described by Millstein et al., built from as many as 50,000 data points, to test the observed F statistic.7 The CIT was implemented as an R function that calls C++ routines (source code and a precompiled R library for Linux are freely available for download, http://cran.r-project.org/). All enrichment tests were conducted in R using the Fisher exact test, based on the hypergeometric distribution.
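The idea behind the conditional independence component test can be illustrated with a minimal Python sketch. It is not the R/C++ implementation used in the study. The function names, the simple linear models, the permutation of residuals to build the empirical null, and the default of 200 permutations are all assumptions made for illustration. The sketch compares the observed F statistic for the locus effect on the trait, given the transcript, against F statistics obtained when the transcript retains its association with the locus but carries no further information about the trait.

```python
import random

def residuals(y, x):
    """Residuals of the least-squares fit y ~ intercept + x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

def partial_f(y, x, z):
    """F statistic for adding x to the model y ~ intercept + z."""
    ry, rx = residuals(y, z), residuals(x, z)      # partial z out of both
    rss0 = sum(r * r for r in ry)                  # RSS of y ~ z
    slope = sum(a * b for a, b in zip(rx, ry)) / sum(a * a for a in rx)
    rss1 = sum((b - slope * a) ** 2 for a, b in zip(rx, ry))  # RSS of y ~ z + x
    return (rss0 - rss1) / (rss1 / (len(y) - 3))

def cond_indep_pvalue(L, G, T, n_perm=200, seed=0):
    """Empirical P value for 'T is independent of L given G' (hypothetical helper)."""
    rng = random.Random(seed)
    f_obs = partial_f(T, L, G)
    res_g = residuals(G, L)
    fitted_g = [g - r for g, r in zip(G, res_g)]
    null_f = []
    for _ in range(n_perm):
        shuffled = res_g[:]
        rng.shuffle(shuffled)
        # G* keeps the L -> G association but cannot mediate L -> T
        g_star = [f + r for f, r in zip(fitted_g, shuffled)]
        null_f.append(partial_f(T, L, g_star))
    # A small observed F relative to the null supports conditional independence
    return (1 + sum(f <= f_obs for f in null_f)) / (1 + n_perm)

def simulate_mediated(n=200, seed=1):
    """Toy data in which G fully mediates the effect of L on T."""
    rng = random.Random(seed)
    L = [float(rng.randint(0, 1)) for _ in range(n)]
    G = [2.0 * l + rng.gauss(0, 1) for l in L]
    T = [2.0 * g + rng.gauss(0, 1) for g in G]
    return L, G, T
```

On the simulated mediated data, the returned P value should be small, indicating that the locus adds little to the prediction of the trait once the transcript is accounted for. A full CIT would combine this with the remaining component tests.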

Co-Expression Network Analysis

Significant progress has also been made in the development of analytical tools to investigate patterns of gene expression in different tissues. One such approach, termed weighted network analysis, uses correlations between individual gene pairs to create groups, or modules, of interacting genes in specific tissues. The weighted network analysis begins with a matrix of the Pearson correlations between all gene pairs, then converts the correlation matrix into an adjacency matrix using a power function: f(x) = x^β. The parameter β of the power function is determined in such a way that the resulting adjacency matrix (i.e., the weighted co-expression network) is approximately scale-free. To measure how well a network satisfies a scale-free topology, we use the fitting index 13 (i.e., the model fitting index R^2 of the linear model that regresses log(p(k)) on log(k) where k is connectivity and p(k) is the frequency distribution of connectivity). The fitting index of a perfect scale-free network is 1. For this dataset, we select the smallest β which leads to an approximately scale-free network. The distribution p(k) of the resulting network approximates a power law: p(k) ∼ k^(−γ). To explore the modular structures of the co-expression network, the adjacency matrix is further transformed into a topological overlap matrix. As the topological overlap between 2 genes reflects not only their direct interaction, but also their indirect interactions through all the other genes in the network, previous studies have shown that topological overlap leads to more cohesive and biologically meaningful modules. To identify modules of highly co-regulated genes, we used average linkage hierarchical clustering to group genes based on the topological overlap of their connectivity, followed by a dynamic cut-tree algorithm to cut clustering dendrogram branches into gene modules.
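The first two transformations described above, soft thresholding and topological overlap, can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study. The use of the absolute correlation (an unsigned network), the zero diagonal of the adjacency matrix, and the particular topological overlap formula (shared neighbors plus direct connection, divided by the smaller connectivity plus one minus the direct connection, as commonly given in the weighted network literature) are assumptions, as are all function names.

```python
import math

def pearson(x, y):
    """Pearson correlation between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def adjacency_matrix(expr, beta):
    """Soft-thresholded adjacency |cor|^beta; expr holds one profile per gene."""
    n = len(expr)
    adj = [[0.0] * n for _ in range(n)]          # diagonal left at 0 by convention
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(expr[i], expr[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj

def topological_overlap(adj):
    """Topological overlap matrix computed from a symmetric adjacency matrix."""
    n = len(adj)
    k = [sum(row) for row in adj]                # connectivity of each gene
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Indirect connection strength through every other gene u
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            t = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = t
    return tom
```

Raising correlations to a power β > 1 shrinks weak correlations toward zero much faster than strong ones, which is why larger β values yield sparser, more nearly scale-free networks. Hierarchical clustering would then be run on a dissimilarity such as 1 minus the topological overlap.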

To distinguish between modules, each module is assigned a unique color identifier, with the remaining poorly connected genes colored gray. The hierarchical clustering over the topological overlap matrix (TOM) allows for identification of modules in a graphic format. In this type of map, the rows and the columns represent genes in a symmetric fashion, and the color intensity represents the interaction strength between genes. This connectivity map highlights the property that genes fall into distinct network modules, where genes within a given module are more interconnected with each other (blocks along the diagonal of the matrix) than with genes in other modules. There are several network connectivity measures, but a particularly important one is the within module connectivity (k.in). The k.in of a gene was determined by taking the sum of its connection strengths (co-expression similarity) with all other genes in the module to which the gene belonged.
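The definition of k.in translates directly into code. The following Python sketch assumes a symmetric adjacency matrix and a list of module labels, one per gene; the function name and the toy values are hypothetical.

```python
def within_module_connectivity(adj, modules):
    """k.in for each gene: summed adjacency with the other genes of its own module."""
    n = len(adj)
    return [
        sum(adj[i][j] for j in range(n) if j != i and modules[j] == modules[i])
        for i in range(n)
    ]

# Toy example: genes 0 and 1 share a module, gene 2 is unassigned (gray)
toy_adj = [[0.0, 0.8, 0.1],
           [0.8, 0.0, 0.2],
           [0.1, 0.2, 0.0]]
toy_kin = within_module_connectivity(toy_adj, ["plum", "plum", "gray"])
```

Sorting the genes of a module by k.in in decreasing order gives the kind of within-module ranking that can be used to identify its most central members.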

RESULTS

Sleep-Wake Analysis

EEG/EMG analysis over 48-h revealed considerable variability within the population of 283 mice in the segregating cross (Figure 1A); however, the amount of time spent in a single state was consistent between the first and second 24-h period for individual mice (Figure 1B).

Figure 1.

(A) The mean sleep-state duration at 2-h intervals over a 48-h period in mice maintained on a 14:10 light: dark (LD) cycle for wake (top), NREM sleep (middle) and REM sleep (bottom). Error bars represent the standard deviation, indicating a high degree of inter-individual variability in state length in this N2 population (n = 283). Shaded bars beneath the histogram indicate dark periods. (B) The duration in minutes of wake (top), NREM sleep (middle) and REM sleep (bottom) during the first (x-axis) and second (y-axis) 24-h periods of the 48-h recording for individual animals (represented as dots, n = 283). The range along either axis denotes the extensive variability within the population, while the significant correlation between the first and second day within individuals demonstrates sleep-state consistency over time, together indicating a strong genetic component to sleep-wake state duration.

Candidate Causal Sleep Gene Identification

Each of the 3 tissues was profiled for genome-wide RNA expression using microarrays (28,053 probes) in animals that were also genotyped with the high-density SNP panel (n = 101). Three SNPs were selected to represent the previously identified REM and wake duration QTL (Table 1).6 The 3 SNPs were chosen among all SNPs underlying each QTL, such that the position of those chosen SNPs corresponded to the maximum LOD scores of the respective QTL, thus representing the best estimate for the QTL position. All 3 SNPs were then tested against expression levels of all transcripts in the 3 tissues using the Kruskal-Wallis test and permutation-based significance thresholds with a false discovery rate (FDR) of < 0.01 (a total of 2329 met the significance criteria; see Supplemental Table S1), yielding sets of transcripts associated with the QTL (step 1 of the analysis). The FDR concept is a statistical construct developed to avoid the proliferation of false positive results without being overly stringent when large numbers of tests are conducted. Informally, the FDR is the expected number of false discoveries divided by the total number of discoveries. Thus, with an FDR of 0.01, if M tests were conducted and 100 of those tests were positive, then one of those 100 results would be expected to be a false positive. This approach revealed both cis- and trans-expression QTL (eQTL) (Supplemental Figures S1-S3), and differed from conventional eQTL methodology in that only genomic loci shown to regulate REM sleep and wake duration were examined to determine their role in the quantitative regulation of transcript expression. The CIT7 was then applied to identify which of these transcripts were most likely to mediate the QTL effects (step 2 of the analysis). Due to the conservative nature of this particular test,7 a permissive significance threshold of P < 0.1 was used, thus avoiding over-stringency. Accordingly, 65 genes were identified as statistically consistent with causal mediation (see Supplemental Table S2 for the full list) and are referred to hereafter as “candidate causal sleep genes” (CCSGs).
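Step 1 can be illustrated with a short Python sketch of a Kruskal-Wallis test for a single SNP and transcript, followed by a permutation-based FDR estimate. The exact permutation scheme used in the study is not described here, so the estimator below (mean number of permuted statistics exceeding a threshold, divided by the number of observed statistics exceeding it) is an assumption, as are the function names. The test is written for 2 genotype classes, as expected at each marker in a backcross, and for simplicity assumes no tied expression values.

```python
import math

def kruskal_wallis_two_groups(a, b):
    """H statistic and chi-square (df = 1) P value; assumes no tied values."""
    pooled = sorted((v, g) for g, vals in enumerate((a, b)) for v in vals)
    rank_sums = [0.0, 0.0]
    for rank, (_, g) in enumerate(pooled, start=1):
        rank_sums[g] += rank
    n = len(pooled)
    h = (12.0 / (n * (n + 1))
         * (rank_sums[0] ** 2 / len(a) + rank_sums[1] ** 2 / len(b))
         - 3 * (n + 1))
    h = max(h, 0.0)                      # guard against rounding below zero
    # Survival function of a chi-square variable with 1 degree of freedom
    return h, math.erfc(math.sqrt(h / 2.0))

def permutation_fdr(observed, permuted_sets, threshold):
    """Estimated FDR: mean null hits per permutation / observed hits."""
    discoveries = sum(s >= threshold for s in observed)
    if discoveries == 0:
        return 0.0
    null_hits = sum(sum(s >= threshold for s in perm) for perm in permuted_sets)
    return (null_hits / len(permuted_sets)) / discoveries

# Expression of one transcript split by genotype class at one SNP (toy values)
h_stat, p_value = kruskal_wallis_two_groups([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

In practice the threshold would be lowered until the estimated FDR reaches the target (here, 0.01), and the statistics passing that threshold would define the set of QTL-associated transcripts.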

Weighted Gene Co-Expression Network Analysis and Testing for Enrichment in Candidate Causal Sleep Genes

Independently of the CIT to uncover CCSGs, Weighted Gene Co-Expression Network Analysis (WGCNA)13 was utilized to identify modules of highly correlated transcripts (described below) within each brain region in an unsupervised manner, thus revealing sets of similarly expressed genes (Figure 2), which, by implication, may be under the same regulatory controls. Spearman correlation estimates were generated to describe the magnitude of dependencies in gene expression across the tissues. In general, there was evidence of co-regulation of genes across tissues (Supplemental Figure S4), but interestingly, in the 3 distributions of tissue/tissue correlation estimates, a small mode occurred at approximately 0.75, indicating a group of genes that were highly co-regulated across the tissues. To investigate the functional relevance of these modules of co-expressed genes, tests for enrichments in CCSGs by tissue and trait were conducted using Fisher tests and a significance threshold of FDR < 0.01.14 For example, CCSGs for REM in the cortex were tested as a set for enrichment in cortex modules only. Four modules across the 3 tissues were significantly enriched in CCSGs for REM sleep and wake duration (Figure 2, Supplemental Table S3): in anterior cortex, the plum module was enriched for REM CCSGs (P = 1.3e-17); in hypothalamus, the deeppink module was enriched for REM CCSGs (P = 2.3e-6) and the maroon module was enriched for wake CCSGs (P = 1.8e-11); and in thalamus, the deeppink (coincidentally named) module was enriched in REM CCSGs (P = 2.6e-6). For clarity we will refer to the CCSG-enriched REM modules as CREMM-ctx, CREMM-hypo, and CREMM-thal, for cortex, hypothalamus, and thalamus, respectively.
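The enrichment test amounts to asking how likely it is, under random sampling, for a module to contain at least the observed number of CCSGs. A minimal Python sketch of the one-sided Fisher exact test, computed as an upper-tail hypergeometric probability, is given below. The function name, argument names, and the numbers in the example are hypothetical and are not taken from the study.

```python
from math import comb

def enrichment_pvalue(universe, module_size, n_ccsg, overlap):
    """P(X >= overlap) when n_ccsg genes are drawn at random from the universe
    and X counts how many of them fall inside a module of module_size genes."""
    total = comb(universe, n_ccsg)
    upper = min(module_size, n_ccsg)
    tail = sum(
        comb(module_size, k) * comb(universe - module_size, n_ccsg - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Toy example: 20 genes in total, a module of 5, and 4 CCSGs of which 3 are in the module
toy_p = enrichment_pvalue(universe=20, module_size=5, n_ccsg=4, overlap=3)
```

The same calculation applies to the overlap between two modules, with one module playing the role of the CCSG set. Because one test is run per module, the resulting P values still require a multiple-testing correction such as the FDR threshold described above.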

Figure 2.

Figure 2

Weighted Gene Co-expression Network Analysis (WGCNA) for cortex (A), hypothalamus (B), and thalamus (C). Along the x- and y-axes are transcripts ordered according to hierarchical clustering using an adjacency metric. Axis color bars represent assignment of transcripts to modules of co-expressed genes, where the color name is used as the module identifier. The circled modules were found to be significantly enriched for candidate causal sleep genes (CCSGs) and are expanded in the blown-up squares designated as plum (A), deeppink and maroon (B), and deeppink (C). For this analysis, genes were filtered such that the top 50% of most varying transcripts were included.

Though there was no overlap between CCSG sets identified for REM in each of the 3 tissues, these sets nevertheless mapped to modules of co-expressed genes that themselves overlapped extensively between the tissues. That is, though the intersection between the co-expression modules in the 3 tissues was large, the individual genes that were members of that intersecting set were not CCSGs. The overlap between REM modules was highly significant (CREMM-ctx:CREMM-hypo, P = 6.5e-51; CREMM-ctx:CREMM-thal, P = 4.4e-78; CREMM-hypo:CREMM-thal, P = 1.4e-66), with the intersection between all CCSG-enriched REM modules totaling 20 genes (Figure 3), the majority of which were cis-regulated by the chromosome 5 locus (Supplemental Table S1). Further evidence linking these modules to REM sleep is provided by associations between the first principal components of the modules and REM sleep (P = {0.01, 0.014, 0.001} for the CREMM-ctx, CREMM-hypo, and CREMM-thal modules, respectively; Supplementary Figure S5).

Figure 3.

Figure 3

Venn diagram of the CCSG-enriched REM modules (CREMM-ctx, CREMM-hypo, and CREMM-thal). The table lists the 20 genes common to hypothalamus, thalamus, and cortex modules. The full list of genes for each module is provided in Supplementary Table S5.

Transcriptional Regulatory Networks involving Candidate Causal Sleep Genes

Given that certain critical genes may affect sleep-wake traits through mechanisms involving transcriptional regulation, CCSGs were next investigated as transcriptional regulators by using the CIT to build local transcriptional regulatory networks around the CCSGs. Within each tissue, all transcripts controlled by a QTL with a LOD score ≥ 3 that overlapped a CCSG QTL with peak-to-peak distance of 15 cM or less were identified. The CIT was then applied (using a nominal P < 0.05 significance threshold) to the QTL and transcript pairs within each tissue to test whether either of the transcripts regulates the other. In this manner, transcriptional networks were reconstructed around CCSGs for each brain tissue (Figure 4). In addition, the number of genes directly regulated by each CCSG, which provides a simple metric in ranking the relative importance of individual CCSGs as regulators of the transcription of downstream genes, was determined (Table 2).
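The ranking metric in Table 2 is a simple tally over the directed edges of the reconstructed network. The Python sketch below assumes that the CIT results have already been reduced to a list of directed (regulator, target) pairs; how each pair is assigned a direction is not specified here and is left outside the sketch. The function name and the gene identifiers in the example are hypothetical.

```python
from collections import Counter

def rank_regulators(ccsgs, edges):
    """Count, for each CCSG, the transcripts it regulates ("causal") and the
    transcripts that regulate it ("reactive"), then sort by causal count."""
    causal, reactive = Counter(), Counter()
    for regulator, target in edges:
        if regulator in ccsgs:
            causal[regulator] += 1
        if target in ccsgs:
            reactive[target] += 1
    table = [(gene, causal[gene], reactive[gene]) for gene in ccsgs]
    # Most downstream targets first; ties broken alphabetically
    return sorted(table, key=lambda row: (-row[1], row[0]))

# Hypothetical network with 2 CCSGs and 3 other transcripts
toy_edges = [("CcsgX", "geneA"), ("CcsgX", "geneB"),
             ("geneC", "CcsgX"), ("CcsgY", "geneA")]
toy_ranking = rank_regulators({"CcsgX", "CcsgY"}, toy_edges)
```

Each row of the result corresponds to one line of a table like Table 2: the gene, its number of causal targets, and its number of reactive links.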

Figure 4.

Figure 4

Transcriptional regulatory networks around candidate causal sleep genes (CCSGs) for REM (A) and wake (B), in cortex (square), hypothalamus (triangle), and thalamus (diamond). The causal inference test (CIT) was used with a significance threshold of 0.01 to identify transcriptional regulation, indicated by edge direction. Light blue node color indicates a CCSG whereas pink denotes a transcript with a regulatory link to a CCSG (either upstream or downstream). Genes are displayed as separate nodes in each tissue where they were significant. Blue and red lines represent reactive and causal CIT results, respectively.

Table 2.

Top 10 most connected causal transcriptional regulators

Gene Symbol    Gene Name                                     Trait     Tissue                 Causal*  Reactive*
Ncor2          nuclear receptor co-repressor 2               REM       Thalamus               36       4
Acad10         acyl-CoA dehydrogenase family, member 10      REM       Thalamus               27       2
Zfp759         zinc finger protein 759                       REM       Cortex                 26       1
Amph           Amphiphysin                                   Wake      Hypothalamus           19       0
Tbc1d7         TBC1 domain family, member 7                  REM       Hypothalamus           19       3
D5Ertd236e     DNA segment, Chr 5, ERATO Doi 236, expressed  REM       Cortex                 14       0
5830416I19Rik  RIKEN cDNA 5830416I19 gene                    REM       Cortex                 13       0
Zfp738         zinc finger protein 738                       Wake/REM  Hypothalamus/Thalamus  12       2
Cnga2          cyclic nucleotide gated channel alpha 2       Wake      Cortex                 10       2
Pebp1          phosphatidylethanolamine binding protein 1    REM       Cortex                 10       0

* “Causal” denotes regulation by the CCSG, whereas “reactive” denotes regulation of the CCSG. This table arbitrarily excludes causal transcriptional regulators with < 10 gene targets (full results available upon request).

DISCUSSION

Consistent with previous studies in model organisms,2,3 as well as longitudinal observations and twin studies in humans,1 we find evidence supporting a role for genetic variability in regulating primary sleep-wake traits, in particular REM sleep and wake duration. In a large number of mice from a genetically segregating cross that underwent 48-h of continuous EEG/EMG recording, we found a high degree of inter-individual variability in state duration, which occurred in the presence of significant stability within each individual over time (Figure 1). Taken together, this implies that genetic variability underlies sleep state duration. Furthermore, it is consistent with observations in humans that nearly all sleep variables exhibit stable and robust inter-individual differences.15 While it has long been appreciated that genetic factors exert a significant contribution to sleep-wake traits, the specific genes and networks underlying these traits remain largely unknown. Therefore, we utilized a systems genetics approach, combining detailed gene expression, genotype, and phenotype data in order to identify candidate causal genes, networks, and transcriptional regulators of REM sleep and wake duration. Importantly, by concentrating our analysis on specific genomic loci shown to regulate these traits of interest,6 we were able to precisely interrogate transcriptional patterns controlled by identified sleep-regulating genomic loci, providing a link between the genome, the transcriptome, and the phenotype of interest, in this case, 24-h REM sleep and wake duration.

Causal inference methodology was used to identify CCSGs (Supplemental Table S2), while in parallel, and independent of the CIT, the WGCNA revealed modules of highly correlated transcripts in each of the brain regions examined. Testing these modules for enrichment in the CCSGs led to the identification of specific modules enriched for causal genes associated with the regulation of REM sleep and wake duration in the cortex, hypothalamus, and thalamus (Figure 2). Surprisingly, we observed that 20 genes were located in the REM modules across each of these tissues (Figure 3), suggesting the presence of a conserved transcriptional module involved in the regulation of REM sleep duration.

Examination of individual expressed RNA transcripts in this conserved REM module yields novel insight into the genes that may underlie and maintain genetic variation for REM sleep. For example, γ-aminobutyric acid (GABA) A receptor, subunit alpha 2 (Gabra2), represented by two probes on the Agilent array, is present in this conserved REM module (Figure 3). In fact, both probes were represented in all three CREMMs. The within-module connectivity score13 can be used as a rough indicator of the predictive power of a particular gene over transcription levels of other genes in the module. After ordering module genes by this score, we found that the ranks of both Gabra2 probes were high in all three modules (CREMM-ctx = {1,9} of 45; CREMM-hypo = {2,5} of 40; CREMM-thal = {8,11} of 47). Though there are 16 GABA subunits represented on the Agilent array, only Gabra2 probes were found in the three REM modules. These findings support the well-established role of the GABA receptor in sleep,16 and strengthen the hypothesis that Gabra2 is involved in the regulation of REM sleep duration, which interestingly was first suggested 14 years ago in the first QTL study published in sleep research.17 Furthermore, the identification of Gabra2 as a key gene in the conserved REM module is intriguing given the recent associations of Gabra2 haplotypes with alcoholism18–20 and the observation of increased REM sleep in alcoholics.21

Our analysis also uncovered candidate transcriptional regulators of REM sleep and wake duration (Table 2, Supplemental Table S5). Nuclear receptor co-repressor 2 (Ncor2), a CCSG for REM sleep in the thalamus, was identified as the top transcriptional regulator of REM duration, affecting 36 downstream transcripts. The next most important regulator, also a CCSG for REM sleep in the thalamus, was acyl-coenzyme A dehydrogenase 10 (Acad10), with 27 regulated transcripts. These genes have both been implicated in sleep processes: ACAD influences theta oscillations during REM,22 while NCORs interact with peroxisome proliferator-activated receptors (PPARs),23 nuclear receptors that bind ligands potentially involved in the homeostatic regulation of sleep.24 In addition, the graphical representation of the transcriptional regulatory network shows Ncor2 and Acad10 to be highly interconnected (Figure 4). Intriguingly, their common role in metabolism suggests an integral role for metabolic processes in the regulation of REM sleep duration. Also, the critical upstream role proposed for Ncor2, a molecule known to influence gene expression through interactions with nuclear receptors as well as histone deacetylases that modify chromatin state,25 supports the hypothesis that the dynamic regulation of chromatin, working in unison with transcriptional control, underlies the regulation of REM sleep duration. This also links the REM network to the regulation of the molecular circadian clock, which has recently been shown to be heavily influenced by chromatin state.26 Indeed, REM sleep is known to be tightly regulated by the circadian clock.27

Pebp1 (also known as hippocampal cholinergic neurostimulating peptide, Hcnp) is a central gene in the cortical REM network, a member of all three REM modules, and one of the top CCSG transcriptional regulators. PEBP1 has been reported to be involved in the regulation of acetylcholine synthesis in the medial septal nucleus,28 a region implicated in the regulation of sleep, including the duration of REM bouts, on the basis of lesion studies in rats.29 Other central nodes of REM networks in the cortex and hypothalamus are genes that are not well-characterized (for example, D5Ertd236e, Zfp759 and Tbc1d7). Taken together, these results implicate a number of genes, many of unknown function, in the control of REM sleep, suggesting that multiple factors, including acetylcholine synthesis, metabolic processes, and chromatin modification, as well as important but poorly understood transcriptional regulators, together contribute to the regulation of REM sleep.

In the network for wake duration in cortex (Figure 4), the central CCSG regulator is cyclic nucleotide gated channel alpha 2 (Cnga2), which has been shown to mediate epinephrine-induced calcium influx in vascular endothelial cells30 as well as adenosine signaling,31 the latter being linked to the homeostatic regulation of sleep.32 The central CCSG regulator in the wake duration network for the hypothalamus (Figure 4) is amphiphysin (Amph), which has a neuronal isoform believed to play a role, via its interactions with clathrin and dynamin, in regulating the re-uptake of synaptic vesicles after neurotransmitter release,33 and has been linked to GABA-signaling.34 Membrane functions, including vesicular recycling, adenosine signaling, and calcium homeostasis, are thus heavily implicated in the regulation of wake duration.

In summary, the present report represents the first attempt to utilize genome-wide genotype and gene expression variation to identify causal genes regulating the primary sleep-wake traits of REM sleep and wake duration. By leveraging sleep-state phenotypic variation against the genetic variation of an N2 mapping cross, as well as quantitative gene expression measurement in key brain regions, we have detected multiple causal sleep-state regulatory genes, including several surprising novel candidates, which have implicated a range of physiological and cellular processes in the modulation of sleep and wake. We report a core network module involved in the regulation of REM sleep duration that is conserved across the cortex, hypothalamus, and thalamus.

The dramatic influence of sleep on vast aspects of CNS and peripheral physiology, as well as the multitude of factors known to affect sleep-wake characteristics, indicates that hundreds, if not thousands, of genes will be involved in regulating sleep-wake phenotypes. While single genes will indeed be important in controlling specific sleep-wake phenotypes, the present results indicate that large numbers of genes interacting as complex networks underlie core sleep-wake traits and are involved in maintaining genetic variation at the population level. The networks reported here present a set of genes implicated in the regulation of REM sleep and wake duration that are immediately useful as candidates for further investigation by targeted gene disruption or other experimental interventions.

Indeed, we are presently using the information gained from this analysis to further investigate the effects of the genes identified here using conventional reverse genetics approaches, such as single-gene knock-out models. The causal genes generated from this approach serve as a resource with predictive value for future experiments in a wide variety of genetic models in organisms harboring specific disruptions of genes implicated here. In particular, we anticipate that tissue-specific knock-outs of causal genes in key CNS structures and/or cell types will be especially informative in deciphering the exact function of individual genes. Furthermore, small molecule libraries can be examined for pharmacological agents that specifically target the networks and pathways described here. Taken together, these genetic and pharmacologic strategies offer ample opportunity to verify and validate the role of the genes identified in our network analysis. In making such a large dataset publicly available, we anticipate that other investigators will be able to identify genes and pathways of particular interest and perform follow-up studies to characterize more precisely the role of these genes in sleep-wake regulation.

DISCLOSURE STATEMENT

This study was supported in part by Merck & Co., Inc. (USA). Dr. Zhang has received grant support from Merck & Company and Pfizer Pharmaceuticals. Dr. Turek acknowledges receiving research support from Merck & Co., Inc. and DARPA. He has made presentations for Servier Pharmaceutical and has received consultant fees from Ingram Barge Company. Drs. Winrow and Renger are employees of Merck & Co., Inc. (USA) and potentially own stock and/or stock options in the company. Dr. Kasarskis is an employee of Pacific Biosciences and owns stock in the company. Dr. Schadt is the Chief Scientific Officer of Pacific Biosciences and owns stock in the company. The other authors have indicated no financial conflicts of interest.

ACKNOWLEDGMENTS

The authors thank the following individuals for technical assistance: Susan Losee-Olson, Janna Arbuzova, Norman Atkins Jr., Daniel Radzicki, Deanna Williams, and He Yang. The authors also thank Elena Nikonova and Anthony Gotter for helpful comments on the manuscript. This work was supported by the Defense Advanced Research Projects Agency (DARPA) and the Army Research Office (ARO), award number DAAD 19-02-1-0038, as well as by Merck & Co., Inc. (USA). Institutions where work was performed: Northwestern University (animal procedures, sleep recordings and analysis, tissue collections); Merck Research Laboratories (gene expression measurements); Sage Bionetworks and Pacific Biosciences (statistical analysis).

Footnotes

A commentary on this article appears in this issue on page 1453.

Figure S1

Frontal Cortex. Substantial numbers of cis effects, in which the gene is physically proximal to the QTL and on the same chromosome, are present.

aasm.34.11.1469s1.tif (433.3KB, tif)
Figure S2

Hypothalamus. Substantial numbers of cis effects, in which the gene is physically proximal to the QTL and on the same chromosome, are present.

aasm.34.11.1469s2.tif (437.2KB, tif)
Figure S3

Thalamus. Substantial numbers of cis effects, in which the gene is physically proximal to the QTL and on the same chromosome, are present.

aasm.34.11.1469s3.tif (437.9KB, tif)
Figure S4

Density plots of Spearman correlations relating all probes across the 3 brain regions in pairwise assessments. For comparison, data were randomly permuted and the analysis was repeated to generate the distribution of the correlations under the null.

aasm.34.11.1469s4.tif (286.6KB, tif)
Figure S5

Scatter plots of sleep traits against CCSG-enriched module eigengenes.

aasm.34.11.1469s5.tif (778.4KB, tif)

Table S1.

Kruskal-Wallis tests (-log10[P-value] reported) of SNPs vs. transcripts meeting a permutation-based FDR < 0.01 significance level

aasm.34.11.1469ts1.pdf (468.6KB, pdf)

Table S2.

Genes statistically consistent with causal mediation according to the CIT at a nominal P < 0.1 level

aasm.34.11.1469ts2.pdf (100.9KB, pdf)

Table S3.

CCSG Enriched Modules

aasm.34.11.1469ts3-1.tif (112.4KB, tif)

REFERENCES

  • 1. Andretic R, Franken P, Tafti M. Genetics of sleep. Annu Rev Genet. 2008;42:361–88. doi: 10.1146/annurev.genet.42.110807.091541.
  • 2. O'Hara BF, Turek FW, Franken P. Genetic basis of sleep in rodents. In: Kryger MH, Roth T, Dement WC, editors. Principles and Practice of Sleep Medicine. St. Louis, MO: Elsevier Saunders; 2010.
  • 3. Harbison ST, Carbone MA, Ayroles JF, Stone EA, Lyman RF, Mackay TF. Co-regulated transcriptional networks contribute to natural genetic variation in Drosophila sleep. Nat Genet. 2009;41:371–5. doi: 10.1038/ng.330.
  • 4. Baggs JE, Hogenesch JB. Genomics and systems approaches in the mammalian circadian clock. Curr Opin Genet Dev. 2010;20:581–7. doi: 10.1016/j.gde.2010.08.009.
  • 5. Zhang EE, Kay SA. Clocks not winding down: unravelling circadian networks. Nat Rev Mol Cell Biol. 2010;11:764–76. doi: 10.1038/nrm2995.
  • 6. Winrow CJ, Williams DL, Kasarskis A, et al. Uncovering the genetic landscape for multiple sleep-wake traits. PLoS One. 2009;4:e5161. doi: 10.1371/journal.pone.0005161.
  • 7. Millstein J, Zhang B, Zhu J, Schadt EE. Disentangling molecular relationships with a causal inference test. BMC Genet. 2009;10:23. doi: 10.1186/1471-2156-10-23.
  • 8. Zapala MA, Hovatta I, Ellison JA, et al. Adult mouse brain gene expression patterns bear an embryologic imprint. Proc Natl Acad Sci U S A. 2005;102:10357–62. doi: 10.1073/pnas.0503357102.
  • 9. Laposky AD, Shelton J, Bass J, Dugovic C, Perrino N, Turek FW. Altered sleep regulation in leptin-deficient mice. Am J Physiol Regul Integr Comp Physiol. 2006;290:R894–903. doi: 10.1152/ajpregu.00304.2005.
  • 10. Su WL, Sieberts SK, Kleinhanz RR, et al. Assessing the prospects of genome-wide association studies performed in inbred mice. Mamm Genome. 2010;21:143–52. doi: 10.1007/s00335-010-9249-7.
  • 11. He YD, Dai H, Schadt EE, et al. Microarray standard data set and figures of merit for comparing data processing methods and experiment designs. Bioinformatics. 2003;19:956–65. doi: 10.1093/bioinformatics/btg126.
  • 12. Schadt EE, Lamb J, Yang X, et al. An integrative genomics approach to infer causal associations between gene expression and disease. Nat Genet. 2005;37:710–7. doi: 10.1038/ng1589.
  • 13. Zhang B, Horvath S. A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol. 2005;4:Article17. doi: 10.2202/1544-6115.1128.
  • 14. Storey JD, Tibshirani R. Statistical methods for identifying differentially expressed genes in DNA microarrays. Methods Mol Biol. 2003;224:149–57. doi: 10.1385/1-59259-364-X:149.
  • 15. Tucker AM, Dinges DF, Van Dongen HP. Trait interindividual differences in the sleep physiology of healthy young adults. J Sleep Res. 2007;16:170–80. doi: 10.1111/j.1365-2869.2007.00594.x.
  • 16. Saper CB, Scammell TE, Lu J. Hypothalamic regulation of sleep and circadian rhythms. Nature. 2005;437:1257–63. doi: 10.1038/nature04284.
  • 17. Tafti M, Franken P, Kitahama K, Malafosse A, Jouvet M, Valatx JL. Localization of candidate genomic regions influencing paradoxical sleep in mice. Neuroreport. 1997;8:3755–8. doi: 10.1097/00001756-199712010-00019.
  • 18. Edenberg HJ, Dick DM, Xuei X, et al. Variations in GABRA2, encoding the alpha 2 subunit of the GABA(A) receptor, are associated with alcohol dependence and with brain oscillations. Am J Hum Genet. 2004;74:705–14. doi: 10.1086/383283.
  • 19. Enoch MA, Hodgkinson CA, Yuan Q, Albaugh B, Virkkunen M, Goldman D. GABRG1 and GABRA2 as independent predictors for alcoholism in two populations. Neuropsychopharmacology. 2009;34:1245–54. doi: 10.1038/npp.2008.171.
  • 20. Enoch MA, Schwartz L, Albaugh B, Virkkunen M, Goldman D. Dimensional anxiety mediates linkage of GABRA2 haplotypes with alcoholism. Am J Med Genet B Neuropsychiatr Genet. 2006;141B:599–607. doi: 10.1002/ajmg.b.30336.
  • 21. Colrain IM, Turlington S, Baker FC. Impact of alcoholism on sleep architecture and EEG power spectra in men and women. Sleep. 2009;32:1341–52. doi: 10.1093/sleep/32.10.1341.
  • 22. Tafti M, Petit B, Chollet D, et al. Deficiency in short-chain fatty acid beta-oxidation affects theta oscillations during sleep. Nat Genet. 2003;34:320–5. doi: 10.1038/ng1174.
  • 23. Yu C, Markan K, Temple KA, Deplewski D, Brady MJ, Cohen RN. The nuclear receptor corepressors NCoR and SMRT decrease peroxisome proliferator-activated receptor gamma transcriptional activity and repress 3T3-L1 adipogenesis. J Biol Chem. 2005;280:13600–5. doi: 10.1074/jbc.M409468200.
  • 24. Koethe D, Schreiber D, Giuffrida A, et al. Sleep deprivation increases oleoylethanolamide in human cerebrospinal fluid. J Neural Transm. 2009;116:301–5. doi: 10.1007/s00702-008-0169-6.
  • 25. Kao HY, Downes M, Ordentlich P, Evans RM. Isolation of a novel histone deacetylase reveals that class I and class II deacetylases promote SMRT-mediated repression. Genes Dev. 2000;14:55–66.
  • 26. Alenghat T, Meyers K, Mullican SE, et al. Nuclear receptor corepressor and histone deacetylase 3 govern circadian metabolic physiology. Nature. 2008;456:997–1000. doi: 10.1038/nature07541.
  • 27. Czeisler CA, Buxton OM. The human circadian timing system and sleep-wake regulation. In: Kryger MH, Roth T, Dement WC, editors. Principles and Practice of Sleep Medicine. 5th ed. St. Louis, MO: Elsevier Saunders; 2011. pp. 402–19.
  • 28. Uematsu N, Matsukawa N, Kanamori T, et al. Overexpression of hippocampal cholinergic neurostimulating peptide in heterozygous transgenic mice increases the amount of ChAT in the medial septal nucleus. Brain Res. 2009;1305:150–7. doi: 10.1016/j.brainres.2009.09.112.
  • 29. Srividya R, Mallick HN, Kumar VM. Sleep changes produced by destruction of medial septal neurons in rats. Neuroreport. 2004;15:1831–5. doi: 10.1097/01.wnr.0000135698.68152.86.
  • 30. Shen B, Cheng KT, Leung YK, et al. Epinephrine-induced Ca2+ influx in vascular endothelial cells is mediated by CNGA2 channels. J Mol Cell Cardiol. 2008;45:437–45. doi: 10.1016/j.yjmcc.2008.06.005.
  • 31. Cheng KT, Leung YK, Shen B, et al. CNGA2 channels mediate adenosine-induced Ca2+ influx in vascular endothelial cells. Arterioscler Thromb Vasc Biol. 2008;28:913–8. doi: 10.1161/ATVBAHA.107.148338.
  • 32. Porkka-Heiskanen T, Alanko L, Kalinchuk A, Stenberg D. Adenosine and sleep. Sleep Med Rev. 2002;6:321–32. doi: 10.1053/smrv.2001.0201.
  • 33. Takei K, Slepnev VI, Haucke V, De Camilli P. Functional partnership between amphiphysin and dynamin in clathrin-mediated endocytosis. Nat Cell Biol. 1999;1:33–9. doi: 10.1038/9004.
  • 34. Geis C, Beck M, Jablonka S, et al. Stiff person syndrome associated anti-amphiphysin antibodies reduce GABA associated [Ca(2+)]i rise in embryonic motoneurons. Neurobiol Dis. 2009;36:191–9. doi: 10.1016/j.nbd.2009.07.011.
