PLOS Genetics. 2020 Jan 13;16(1):e1008577. doi: 10.1371/journal.pgen.1008577

High-throughput discovery of genetic determinants of circadian misalignment

Tao Zhang 1,#, Pancheng Xie 1,#, Yingying Dong 1,#, Zhiwei Liu 1, Fei Zhou 1, Dejing Pan 1, Zhengyun Huang 1, Qiaocheng Zhai 1, Yue Gu 1, Qingyu Wu 2,3, Nobuhiko Tanaka 4, Yuichi Obata 4, Allan Bradley 5, Christopher J Lelliott 5; Sanger Institute Mouse Genetics Project, Lauryl M J Nutter 6, Colin McKerlie 6, Ann M Flenniken 6, Marie-France Champy 7, Tania Sorg 7, Yann Herault 7, Martin Hrabe De Angelis 8,9, Valerie Gailus Durner 8, Ann-Marie Mallon 10, Steve D M Brown 10, Terry Meehan 11, Helen E Parkinson 11, Damian Smedley 12, K C Kent Lloyd 13, Jun Yan 14, Xiang Gao 14, Je Kyung Seong 15, Chi-Kuang Leo Wang 16, Radislav Sedlacek 9, Yi Liu 17, Jan Rozman 8,9,18,*, Ling Yang 1,*, Ying Xu 1,3,*
Editor: Achim Kramer 19
PMCID: PMC6980734  PMID: 31929527

Abstract

Circadian systems provide a fitness advantage to organisms by allowing them to adapt to daily changes of environmental cues, such as light/dark cycles. The molecular mechanism underlying the circadian clock has been well characterized. However, how internal circadian clocks are entrained with regular daily light/dark cycles remains unclear. By collecting and analyzing indirect calorimetry (IC) data from more than 2000 wild-type mice available from the International Mouse Phenotyping Consortium (IMPC), we show that the onset time and peak phase of activity and food intake rhythms are reliable parameters for screening defects of circadian misalignment. We developed a machine learning algorithm to quantify these two parameters in our misalignment screen (SyncScreener) with existing datasets and used it to screen 750 mutant mouse lines from five IMPC phenotyping centres. Mutants of five genes (Slc7a11, Rhbdl1, Spop, Ctc1 and Oxtr) were found to be associated with altered patterns of activity or food intake. By further studying the Slc7a11tm1a/tm1a mice, we confirmed its advanced activity phase phenotype in response to a simulated jetlag and skeleton photoperiod stimuli. Disruption of Slc7a11 affected the intercellular communication in the suprachiasmatic nucleus, suggesting a defect in synchronization of clock neurons. Our study has established a systematic phenotype analysis approach that can be used to uncover the mechanism of circadian entrainment in mice.

Author summary

Synchronization to environmental changes such as day and night cycles and seasonal cycles is critical for survival. Organisms have therefore evolved a specialized circadian system to anticipate and adapt to daily changes in the environment. Loss of synchrony between the internal circadian clock and environmental day and night changes is responsible for jet lag, and may also promote sleep disorders, metabolic disorders and many other diseases. The availability of large amounts of mouse data from the International Mouse Phenotyping Consortium provides new opportunities to identify novel genetic components of mouse behaviour and metabolism. In this study, we performed a high-throughput identification of genetic components of circadian misalignment by developing a machine learning-based algorithm. By analyzing the indirect calorimetry parameters from more than 2000 C57BL/6N mice and mice from 750 mutant lines, we identified 5 genes involved in circadian misalignment of activity and feeding behaviour. By further analyzing genetic knock-out mice for one of these genes, we were able to validate our screening method through functional studies. Our systematic analysis thus paves the way for searching for the genetic determinants of circadian misalignment.

Introduction

The circadian clock is one of the best-characterized mechanisms that can mediate the influence of environmental cues on molecular, physiological and behavioural activities in almost all organisms. The suprachiasmatic nucleus (SCN) is the central circadian pacemaker in mammals. It receives photic information via the retina, integrates time-related information from tissues and organs, and then transmits timing information to cells and tissues to regulate physiology and behaviour, thereby entraining animals to the daily changes of environmental cues [1,2]. Chronic misalignment between the circadian clock and the environment has been implicated in many pathological processes such as sleep disorders, cardiovascular diseases, metabolic disorders, and cancer [3,4]. Mice with a defective light input pathway (lacking rods, cones, and melanopsin (Opn4-/- Gnat1-/- Cnga3-/-)) cannot be entrained to light/dark cycles [5,6]. On the other hand, arrhythmicity at the tissue or behavioural level and attenuated light-induced phase shifts can result from impaired expression of coupling peptides in the SCN, for example after genetic ablation of vasoactive intestinal peptide (VIP), gastrin-releasing peptide (GRP) or the VIP receptors (VPAC) in mice [7–9]. In humans, dysfunction or misalignment of the circadian clock with environmental cues alters the timing of the sleep-wake cycle [10–12]. Mice with mutations orthologous to the human mutations (PER2S662G, CK1δT44A) recapitulate human phase-advanced behavioural rhythms, and transgenic mice carrying the PER1S714G mutation show advanced feeding behaviour [13–16], indicating that mice are a good model for human circadian functions. In addition, activity, feeding, temperature and glucocorticoid signals can also affect the circadian phase [17–21]. These are all zeitgebers of the circadian clock and impart phase information on their target tissues [22]. Circadian misalignment is a consequence of conflicting signals from these zeitgebers.

The International Mouse Phenotyping Consortium (IMPC) project is generating a genome-wide annotation of gene functions by systematically generating and phenotyping a collection of standardized single-gene knockout mice [23–25]. Indirect calorimetry (IC) datasets are collected by the IMPC pipelines with standardized protocols (https://www.mousephenotype.org/impress/protocol/86). Activity parameters are monitored using a metabolic chamber equipped with infrared beams rather than a running wheel, to avoid artificially enhanced or weakened activity. A food intake monitoring system is also integrated to investigate diurnal patterns of feeding rhythms and behaviours. The availability of the IC datasets from mice generated and phenotyped by the IMPC offered a unique opportunity to perform a large-scale screen for mutants with circadian misalignment. By analyzing the IC data from more than 2,000 wild-type mice available from the IMPC, we identified two reliable parameters of circadian misalignment. We developed a machine learning algorithm for circadian parameter recognition (SyncScreener). Using this algorithm, we screened mice from 750 mutant lines from five IMPC centres and identified five novel genes involved in circadian misalignment. Among these genes, the function of Slc7a11 in circadian entrainment was confirmed by creating a knockout mouse, demonstrating that our approach is effective in uncovering mechanistic insights into circadian entrainment.

Results

Analyzing mouse indirect calorimetry datasets to identify reliable parameters for circadian misalignment

Misalignment between different components of the circadian system and sleep-wake cycles or food intake behaviour, whether caused by genetic, environmental or behavioural factors, might be an important contributor to diseases and may inform how we treat and prevent them. Mouse mutants with circadian misalignment would be expected to show phase changes in locomotor activity and/or food intake behaviour under the light/dark cycle. We set out to provide a systematic approach to identify genes underlying misalignment between circadian systems and behavioural factors. We therefore collected available IC data from five IMPC centres: the Riken BioResource Center (RBRC), The Centre for Phenogenomics (TCP), the Institut Clinique de la Souris (ICS), the Wellcome Trust Sanger Institute (WTSI) and the Helmholtz Zentrum Munich (HMGU), and designed a workflow for systematic and unbiased analysis of circadian misalignment for more than 2,000 wild-type C57BL/6N mice and mice from 750 mutant mouse lines (Fig 1).

Fig 1. Screening strategy diagram.

IC data sets were collected from five centres (WTSI, ICS, TCP, RBRC and HMGU). Baseline data from >2000 wild-type C57BL/6N mice were used to determine parameters for identification of circadian misalignment by visual assessment and by developing a machine learning algorithm. Discovery of phenodeviance was achieved by screening datasets from 750 mutant lines against primary criteria (mean data) and secondary criteria (at least 50% similar phenotypes, effect size > 1.2 and p-value < 0.001). Validation was conducted by further light entrainment experiments using mutant mice. WTSI: Wellcome Trust Sanger Institute; ICS: Institut Clinique de la Souris; TCP: The Centre for Phenogenomics; RBRC: Riken Bio-Resource Center; HMGU: German Mouse Clinic (Helmholtz Zentrum München). The light schedules in the procedure room are labelled as indicated.

Baseline data from wild-type C57BL/6N were used to compare the reliability of data between centres and/or within centres to choose robust parameters of circadian misalignment. We first analyzed the activity/food intake cycle data for wild-type mice from the five IMPC centres. The total activity and food intake for each mouse was calculated and expressed as 1-hour averages over at least 21 hours according to the IMPC protocol. As expected, both activity and food intake rhythms displayed two clear peaks: a strong early evening (E) peak and a weak early morning (M) peak [26,27] (Fig 2A and 2B). The onset time at which mouse activity or food intake behaviour transitioned from the resting state to the active state could be clearly seen. These observations suggest that the onset time and E peak phase are potentially reliable parameters for circadian behaviour despite the diversity of data from different centres.

Fig 2. The distribution of the onset time and peak phase of activity and food intake from IC data from five IMPC centres.

(A, B) Heat map showing the consistent trend of the onset, peak phase and amplitude of activity (A) and food intake (B) recorded by IC of more than 2000 C57BL/6N mice across the five IMPC centres. Mice were ordered according to their evening peak phases. 1 (red) represents the strongest; 0 (blue) represents the weakest. Zeitgeber time (ZT), ZT 0: light on, ZT12: light off. (C, D) The distribution of the onset and peak phase of activity (C) and food intake (D) from the five centres identified by SyncScreener (pink: onset; red: peak phase). Also see S1 Fig with visual assessment. (E-H) The varied range of onset and peak phase of activity and food intake from the five centres identified by SyncScreener.

By optimizing conditions to draw scatter plot maps, we further assessed the activity and food intake parameters using 2201 activity/rest scatter plots and 2160 food intake scatter plots at 1-hour intervals over 21 hours (ZT0: light on, ZT12: light off, plots deposited in the Cam-Su GRC database, http://gofile.me/2F1pE/RP2URKxV2) (S1 and S2 Files). The onset times and peak phases for activity and food intake were manually evaluated by observing the curve of activity or food intake. We found that the onset time points of activity and food intake were reliably detected with the lowest variance as compared to other parameters from the five different IMPC centres (S1 Fig, pink column, S1 Table). E peak phases of activity/food intake showed a broader distribution than that of the activity onset (S1 Fig, red column, S2 Table). Thus, the onset time and E peak phase are reliable parameters for comparison of circadian misalignment. In addition, because light schedules in the procedure rooms are different in the five centres (light cycle, WTSI: 7:30–19:30; TCP: 7:00–19:00; ICS: 7:00–19:00; RBRC 11:00–23:00; HMGU: 6:00–18:00), our data analyses were restricted to within-centre comparisons instead of between-centre comparisons. We also limited IC data analysis to male mice because the variability of these parameters in females was great, likely due to the estrous cycles.

Empirical visual assessment for onset time and peak phase is time consuming, technically demanding and subjective. To resolve these issues, we employed machine learning to develop predictive models to identify the onset time and peak phase from IC datasets. In our algorithm, a convolutional neural network (CNN) was used to learn from synthetic rhythmic data sets to generate predictive models to investigate the utility of IMPC resources for large-scale screening. Our algorithm needs to learn from labelled training data to build predictive models. Since the IC data are unlabelled and not very diverse, we generated synthetic training data by simulating the pattern of raw data separately in five different centres with labelled onset times and peak phases. The onset time was defined as a transition from rest to active state, like the interim of a piecewise function, while the peak phase was the end of the transition (S2 Fig). Then, we simulated the data before the peak phase by an Ordinary Differential Equation (ODE) with a piecewise function and fitted the remaining data by Gaussian functions. Next, we produced various rhythmic patterns by random disturbance of the onset and peak phase. In addition, measurement noise was modelled as white noise that was added to synthetic training data. Predictive models were generated by learning these large synthetic training data sets. Furthermore, we applied predictive models to the IC datasets from 2201 mice for activity/rest and 2160 mice for food intake in five different centres. The results for the onset time and peak phase of activity and food intake for each centre were determined by SyncScreener (Fig 2C–2H, S3 and S4 Tables).
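The synthetic-data step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SyncScreener implementation: the function names `synthetic_trace` and `make_training_set`, every default value, the constant rise rate used for the piecewise ODE, and the single Gaussian decay after the peak are all hypothetical choices made only to show the idea of a labelled trace with a rest segment, a rise from onset to peak, a Gaussian tail, random shifts of onset and peak, and added white noise.

```python
import math
import random


def synthetic_trace(onset, peak, n_hours=21, dt=0.1, baseline=0.1,
                    amplitude=1.0, decay_sd=3.0, noise_sd=0.05, rng=None):
    """Return (hourly samples, (onset, peak)) for one labelled synthetic trace.

    All names and defaults are illustrative assumptions.
    Times are hours since the start of the recording.
    """
    if rng is None:
        rng = random.Random()
    steps_per_hour = round(1 / dt)
    onset_step = round(onset / dt)
    peak_step = max(round(peak / dt), onset_step + 1)
    # Constant rise rate so that x climbs from baseline to amplitude
    # between onset and peak.
    rate = (amplitude - baseline) / ((peak_step - onset_step) * dt)

    x = baseline
    samples = []
    for step in range(n_hours * steps_per_hour):
        if step % steps_per_hour == 0:  # keep one sample per hour
            if step <= peak_step:
                clean = x  # ODE segment: rest, then rise
            else:
                # Gaussian decay after the peak
                hours_past_peak = (step - peak_step) * dt
                clean = baseline + (amplitude - baseline) * math.exp(
                    -hours_past_peak ** 2 / (2 * decay_sd ** 2))
            samples.append(clean + rng.gauss(0.0, noise_sd))  # white noise
        # Euler step of dx/dt = rate on [onset, peak), 0 elsewhere
        if onset_step <= step < peak_step:
            x += rate * dt
    return samples, (onset, peak)


def make_training_set(n, base_onset, base_peak, jitter=1.5, seed=0):
    """Generate n labelled traces with randomly disturbed onset and peak."""
    rng = random.Random(seed)
    rise = base_peak - base_onset
    data = []
    for _ in range(n):
        onset = base_onset + rng.uniform(-jitter, jitter)
        peak = onset + max(0.5, rise + rng.uniform(-jitter, jitter))
        data.append(synthetic_trace(onset, peak, rng=rng))
    return data
```

Each element returned by `make_training_set` pairs a 21-point trace with its known onset and peak, which is the kind of labelled example a supervised model such as a CNN could be trained on.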

Next, we compared the results from visual assessment and machine learning algorithm (SyncScreener, S3 File) using Bland–Altman plots to determine whether the automated parameter determination was accurate compared to visual assessment [28]. Bland–Altman plots showed that the differences between the two methods were acceptable within a 95% limit of agreement with more acceptable results for the onset time than the peak phase (S3 Fig and S4 Fig).
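For readers unfamiliar with Bland–Altman analysis, the comparison above amounts to computing the mean difference (bias) between paired measurements and the 95% limits of agreement around it. The sketch below assumes paired onset estimates in hours; the function name `bland_altman` and the example values are invented for illustration and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(manual, automated):
    """Bias and 95% limits of agreement for paired measurements."""
    diffs = [a - m for m, a in zip(manual, automated)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return {"bias": bias, "lower": lower, "upper": upper,
            "fraction_inside": inside}


# Hypothetical onset times (hours) from visual assessment and an automated method
visual = [11.0, 11.5, 12.0, 11.0, 12.5]
automated = [11.2, 11.4, 12.3, 10.9, 12.6]
result = bland_altman(visual, automated)
```

Agreement is usually judged by whether the limits are narrow enough for the intended use, not only by how many points fall inside them.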

Finally, since we did not find known clock regulators with activity phenotypes among the screened mutants, we used several known circadian mutant lines to validate our predictive models and parameters: hPER2S662G mice (advanced onset of activity), Fbxl3-/- mice (delayed onset of activity), Nestin-Cre;Zbtb20-/- mice (delayed peak phase) and hPER1S714G mice (advanced onset of food intake) [15,16,29,30]. These clock mutant mice were placed in calorimetry cages to generate IC data plotted at 1-hour intervals over 24 hours (Fig 3A–3J). The onset and peak phase were estimated by SyncScreener (Fig 3K) and by visual assessment for comparison (S5 Table). The standardized effect size (d) was used to estimate the phenodeviance, where the absolute difference between the mutant and wild-type control was scaled in units of the phenotypic standard deviation (a statistical power analysis according to IMPC protocol) [31]. The p value was used to measure the false-positive risk. As shown by the SyncScreener results in Fig 3K, these known circadian mutants showed detectable d values (effect size: 1.2–6.64) and clear statistical significance (p < 0.05–0.001) for the onset time and peak phase, consistent with their known phenotypes. These results suggest that determination of these two parameters by our machine learning algorithm from IC data is reliable for identification of mouse mutants with circadian misalignment behaviour.

Fig 3. Established positive control for circadian misalignment of activity and food intake behaviour.

(A-J) Recording of activity (A-E) and food intake (F-J) in known mutant lines and analyses of onset and peak phase by CLAMS. Data represent the mean ± SEM measured by IC under light-dark cycles from 8–14 mutant male mice for the indicated genotype. (K) The onset time (Ton) and peak phase (Tph) identified by SyncScreener. d: detectable value (effect size) for the onset (don) and peak phase (dph); p: statistical significance by t-test for the onset (pon) and peak phase (pph). Also see S5 Table for visual assessment results.

Identification of mouse mutants with impaired circadian misalignment behaviour

Primary criteria: We applied the machine learning algorithm to detect the onset and peak phase of activity/food intake in mutant mouse lines. By comparing against the parameters of the wild-type control mice from each centre, we screened 440 homozygous and 310 heterozygous mutant strains, representing loss-of-function of 726 unique genes (S6 and S7 Tables). We obtained a mean scatter dot curve generated from 7–8 male mice for each genotype available in the IMPC database to evaluate the onset times and peaks by SyncScreener. The distributions of phase deviations of onset and peak phases between the wild-type and mutant mouse lines at each centre are shown in Fig 4A–4E and S8 Table. Onset times or peak phases that fell in the tails beyond approximately ±2 s.d. from the mean (approximately 5%) were designated outliers, where s.d. represents the standard deviation of differences between the mutants and wild-type control in each centre. Among the 750 mutant lines, 12 existed at two or more centres and they all exhibited the same phenotypes as the wild-type (S9 Table). 88 (11.7%) of the 750 mutant mouse lines fell in the tails beyond approximately ±2 s.d. from the mean for the onset and/or peak phase of activity and food intake. Those mutant lines were then evaluated against the secondary criteria (S10 Table, 88 genes).
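The primary criterion can be illustrated with a short sketch that flags lines whose phase deviation lies more than 2 s.d. from the mean of all deviations within one centre. The function name `flag_outliers`, the line labels and the deviation values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def flag_outliers(deviations, k=2.0):
    """Return names of lines whose deviation is more than k s.d. from the mean.

    deviations maps a mutant line name to its phase difference
    (mutant minus wild-type, in hours) within a single centre.
    """
    values = list(deviations.values())
    mu = mean(values)
    sd = stdev(values)
    return sorted(name for name, v in deviations.items()
                  if abs(v - mu) > k * sd)


# Hypothetical onset deviations (hours) for ten mutant lines at one centre
onset_shift = {
    "line_a": 0.1, "line_b": -0.2, "line_c": 0.0, "line_d": 0.3,
    "line_e": -0.1, "line_f": 0.2, "line_g": -0.3, "line_h": 0.1,
    "line_i": 0.0, "line_j": -3.0,
}
candidates = flag_outliers(onset_shift)
```

Because the threshold is computed per centre, a line is only ever compared with controls recorded under the same light schedule and equipment.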

Fig 4. Systematic identification of the onset and peak phase phenotypes.

(A-E) Distribution of the number of mutant lines with onset or peak phase deviating from wild-type mice at ±0, ±σ, and ±2σ. IC data from RBRC (A), ICS (B), TCP (C), WTSI (D) and HMGU (E) are presented separately. Mutants were defined as outliers when the parameters deviated from the mean by more than 2σ. Five candidates meeting the secondary criteria are labelled by arrows.

Secondary criteria: The candidate mouse lines were further examined by the following criteria: (1) at least 50% of individual mice within a line displayed similar phenotypes relative to the phenotyping baseline, as described previously [24,32,33]; (2) standardized effect size (Cohen's d), where $d = |\bar{x}_{m} - \bar{x}_{wt}| / s$ is the absolute difference between mutant and baseline means scaled in units of phenotypic standard deviation. A larger effect size indicates a stronger phenodeviance [24,32]. Effect size (d) > 1.2 suggested that the group difference between the mutant mice and the wild-type mice is large; and (3) the two-tailed t-test was used to examine the statistical significance (p value), where p < 0.001 indicated statistical significance between the mutant line and phenotyping baseline. Five mutant lines (Slc7a11tm1b/tm1b, Rhbdl1+/tm1.1, Spop+/tm1b, Oxtrtm1.1/tm1.1, Ctc1+/tm1b) met the above three criteria (50% mice with similar phenotype, d > 1.2, and p < 0.001) (Fig 5 and S11 Table). For the Slc7a11tm1b/tm1b mice, the time of activity onset is advanced compared to the wild-type at ICS (from ZT8 to ZT11) and the phase of food intake is indistinguishable between mutant and wild-type mice (Fig 5A and 5B). Rhbdl1+/tm1.1 mice displayed a delayed onset of activity compared to wild-type mice at TCP (Fig 5C and 5D). Onset of activity and food intake in Spop+/tm1b mice were delayed (Fig 5E and 5F). The Oxtrtm1.1/tm1.1 mice exhibited a trend towards more daytime activity than wild-type mice at the RBRC (Fig 5G and 5H, S5 Fig). The Ctc1+/tm1b mice showed a delayed onset of both activity and food intake (Fig 5I and 5J). The effect sizes and p values from all candidate mutants fall within the positive ranges (S11 Table, compared to Fig 3K for positive controls) and these mutant mice are labelled as outliers in Fig 4. These results suggest that these five genes are potentially involved in circadian misalignment. The Spop, Ctc1 and Rhbdl1 homozygous mice were preweaning or embryonic lethal, so the phenotypes of the homozygous knockouts are not known. In addition, we found that the Slc7a11tm1b/tm1b and Rhbdl1+/tm1.1 mutant mice showed altered glucose tolerance (IMPC data, S12 Table), suggesting that the deletion or haploinsufficiency of these genes impaired metabolism.
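As a worked illustration of the three secondary criteria, the sketch below computes Cohen's d and combines it with the other two thresholds. Several points are assumptions: the text does not specify how the phenotypic standard deviation s is estimated, so a pooled standard deviation is used here; the p value and the fraction of mice showing the phenotype are taken as inputs computed elsewhere; and the names `cohens_d` and `passes_secondary` are hypothetical.

```python
import math
from statistics import mean, stdev


def cohens_d(mutant, wild_type):
    """Absolute mean difference scaled by the pooled standard deviation."""
    n_m, n_w = len(mutant), len(wild_type)
    pooled_var = ((n_m - 1) * stdev(mutant) ** 2 +
                  (n_w - 1) * stdev(wild_type) ** 2) / (n_m + n_w - 2)
    return abs(mean(mutant) - mean(wild_type)) / math.sqrt(pooled_var)


def passes_secondary(mutant, wild_type, p_value, shifted_fraction,
                     min_fraction=0.5, min_d=1.2, max_p=0.001):
    """Apply the three thresholds quoted in the text.

    p_value: two-tailed t-test result computed elsewhere.
    shifted_fraction: share of mutant mice showing the phenotype.
    """
    return (shifted_fraction >= min_fraction
            and cohens_d(mutant, wild_type) > min_d
            and p_value < max_p)


# Hypothetical onset times (hours) for a mutant line and its baseline
mutant_onset = [9.0, 9.5, 9.0, 8.5, 9.0, 9.5, 9.0]
baseline_onset = [11.0, 11.5, 11.0, 10.5, 11.0, 11.5, 11.0]
```

Requiring all three conditions together guards against lines that show a large mean shift driven by only one or two animals.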

Fig 5. Analysis of onset and peak phase phenotypes of mice from identified mutant lines.

Profiles of oscillating activity (A, C, E, G, I) and food intake over time (B, D, F, H, J) for Slc7a11 (A, B), Rhbdl1 (C, D), Spop (E, F), Oxtr (G, H) and Ctc1 (I, J) mutant mice. Blue and red lines represent the wild-type and mutant mice in the same centre, respectively. Data are normalized and presented as the mean value ± SEM (n = 7–8 for each genotype). The time of day is indicated in hours, and the dark period is indicated by shading.

Confirmation of the role of Slc7a11 in circadian behaviour

To confirm our screening results from the IMPC IC datasets, we generated Slc7a11tm1a/tm1a mice using targeted C57BL/6N ES cells at the Cam-Su Genomic Resource Center [34] (S6 Fig). Consistent with the IMPC results showing advanced onset of activity in Fig 5A, Slc7a11tm1a/tm1a mice exhibited advanced onset of activity on the first and third day under LD cycles when measured with the Comprehensive Lab Animal Monitoring System (CLAMS, IC) (Fig 6A and 6B). Although the onset was comparable between the Slc7a11tm1a/tm1a and wild-type littermates on the second day, the mutant mice showed a phase advance of the declining activity phase before the next dawn (Fig 6A, red arrow). Consistent with the results of SyncScreener, food intake was comparable between mutant and wild-type mice, as were VO2 and VCO2 (S7 Fig). We next analyzed voluntary wheel-running activity to evaluate the onset time and the free-running period of the circadian clock for the Slc7a11tm1a/tm1a mice. Under LD cycles, the activity onset times were significantly advanced and unstable, with a larger variance, in the Slc7a11tm1a/tm1a mice compared with wild-type littermates (Fig 6C and 6D). This result is consistent with the above observations, suggesting that the mutant mice have reduced sensitivity to photic entrainment. The mice were subsequently released to constant darkness (DD) to determine the circadian period. Of note, there was no significant difference in the circadian period (Fig 6E–6G). To fully examine the role of Slc7a11 in circadian entrainment, we subjected Slc7a11tm1a/tm1a mice and wild-type littermates to a simulated jet-lag environment. In response to a 6 h advance shift of the LD cycle, Slc7a11tm1a/tm1a mice immediately showed a phase advance (S8 Fig). Furthermore, to minimize the masking effect of light, Slc7a11tm1a/tm1a mice were subjected to a skeleton photoperiod with 15-min light arms from clock time 07:45 to 08:00 and from clock time 20:00 to 20:15 (ZT0, clock time 08:00) with darkness at all other times.
We found that the activity onset was significantly advanced in Slc7a11tm1a/tm1a mice compared with wild-type littermates (Fig 6H–6J). Furthermore, mice were exposed to an alternating cycle of 3.5 h light and 3.5 h dark for T-cycle experiments. Slc7a11tm1a/tm1a mice confined their activity mostly to the dark phase, as did their littermates (S9 Fig), suggesting that disruption of Slc7a11 did not affect masking behaviour. Altogether, these data suggest that disruption of Slc7a11 changes sensitivity to circadian entrainment. Since the suprachiasmatic nucleus (SCN) is responsible for entraining mouse activity rhythms to light/dark cycles [35], we examined the expression of Slc7a11 in the SCN by in situ hybridization at ZT6 (day time) and ZT18 (night time). We found that Slc7a11 hybridization signals were higher at ZT18 than at ZT6, suggesting cyclic expression of Slc7a11 mRNA under the LD cycle (Fig 7A and 7B). In addition, we examined the expression profiles of Slc7a11 in wild-type mice under constant darkness by RT-PCR [36,37]. The profiles of Slc7a11 expression showed rhythmicity in the wild-type SCN by Jonckheere-Terpstra-Kendall (JTK) cycle analysis (Fig 7C, p<0.05). In Slc7a11tm1a/tm1a mice, we found that the mRNA levels of most core clock genes were comparable to those of wild-type mice in the SCN and liver tissues (Fig 7D and S10 Fig). This is consistent with the unaltered circadian period. However, the expression phases of high-amplitude genes such as Per2 and Dbp were advanced in the SCN of Slc7a11tm1a/tm1a mice compared with wild-type littermates (Fig 7D), consistent with the advanced activity onset, suggesting that disruption of Slc7a11 affects circadian entrainment. These phase advances were not found in the Slc7a11tm1a/tm1a liver tissues (S10 Fig). In addition, the SCN mRNA levels of Grp, Grpr, Vip, Pk2 and Pkr2 were significantly altered at some time points in DD in Slc7a11 knockout mice (Fig 7E).
The profiles of GRP and GRPR are rhythmic under LD with a peak at ZT12 in the SCN, and they regulate the circadian phase [38,39]. The misalignment between the expression of Grp and Grpr observed in the SCN of Slc7a11 knockout mice may cause the optimal binding window to be missed and result in a defect in entrainment. In addition, VIP participates in the synchrony of mammalian clock neurons and mediates the entrainment of circadian rhythms [8,40]. The plateau of Vip expression in the SCN of Slc7a11tm1a/tm1a mice, in place of the sharp peak of Vip expression at ZT12, as well as the phase shift of Avp, appears to weaken the optimal onset of activity (Fig 7E). Prokineticin 2 (Pk2) and its receptor Pkr2 have been shown to be under the dual regulation of both light and the circadian clock and to affect circadian entrainment [41,42]. The altered expression of these genes involved in SCN intercellular communication suggests that SCN neuron synchronization might be affected in the mutant mice [43]. Together, our results suggest that although Slc7a11 is not required for core clock function, it is involved in mediating circadian entrainment of behaviour.

Fig 6. Validation of the role of Slc7a11 in circadian entrainment of activity onset.

(A) The activity profiles of Slc7a11tm1a/tm1a and wild-type littermates under a 12 hr light/12 hr dark cycle using CLAMS (IC data). Rhythms were plotted over a 24 hr time frame as the mean ± SEM for three days (n = 10 for each genotype). (B) The onset of activity is shown as mean ± SEM. The Student t-test was used to determine the significance, *p < 0.05, **p < 0.01. (C) Representative actograms for analyzing the onset times under LD cycles in wild-type mice (left) and Slc7a11tm1a/tm1a mice (right). (D) The onset times of activity were measured by Clocklab analysis. Data are shown as means with SEM (n = 8 for WT, n = 6 for Slc7a11tm1a/tm1a) for 8 continuous days. The pink and blue shading indicates the range of onset times for the Slc7a11tm1a/tm1a mice and their wild-type littermates. (E) Representative actograms of wheel-running activity for period determination. The mice were first entrained to an LD cycle for 14 days and then released in DD for approximately 3 weeks. Black shading indicates the time when lights were off, and the white box indicates the time when lights were on. (F-G) Period was determined by line fitting of activity onset (F) and chi-square periodogram (G) from day 11 to day 21 in DD. (H-I) Representative actograms (H) and activity profiles (I) of wheel-running activity for wild-type mice and Slc7a11tm1a/tm1a mice under skeleton photoperiod with 15-min entraining arms for a 20-d period (n = 6 for each genotype). Each row represents a single day. The mice were first entrained to an LD cycle for 10 days and then released to the skeleton photoperiod with 15-min light arms from clock time 07:45 to 08:00 and again from clock time 20:00 to 20:15 (ZT0, clock time 08:00). The black and yellow bars in I (below) represent periods of darkness and light, respectively. (J) The onset times of activity under skeleton photoperiod were measured by Clocklab analysis. Blue: WT; Red: Slc7a11tm1a/tm1a. Data are shown as means with SEM. Two-way ANOVA was employed to analyze statistical significance.

Fig 7. Altered expression of the coupling genes in the Slc7a11tm1a/tm1a mice.

(A) Expression of Slc7a11 in mouse SCN detected by in situ hybridization at ZT6 and ZT18. ZT, Zeitgeber Time. Coronal brain sections containing the SCN were hybridized with the cRNA sense (upper) or antisense probe (middle and lower) of Slc7a11 at ZT6 and ZT18. (B) Quantification of in situ hybridization signal of Slc7a11 by Image J from 3–4 coronal brain sections. *: p < 0.05. (C) Real-time PCR analysis of the expression of Slc7a11 in SCN of wild-type mice. Error bars represent the s.d. for each time point from three biological independent replicates. The rhythmicity of gene expression was determined based on the JTK algorithm (pJTK < 0.05). (D) Expression profiles of the core clock genes in the SCN from control and Slc7a11tm1a/tm1a mice. Also see the expression in the liver (S10 Fig). Error bars represent the s.d. for each time point from three independent replicates. (E) Expression profiles of the coupling factors in the SCN from control and Slc7a11tm1a/tm1a mice. Error bars represent the s.d. for each time point from three independent replicates. Two-way ANOVA was employed to test the statistical significance. *: P < 0.05; **: P < 0.01; ***: P <0.001.

Discussion

In this study, we demonstrated the feasibility of large-scale characterization of mouse mutants with impaired circadian alignment under light/dark cycles as part of an IMPC collaborative effort to generate a genome-wide catalogue of gene function. By analyzing IC datasets of 2201 C57BL/6N mice for activity/rest and 2160 C57BL/6N mice for food intake from five IMPC centres, we identified two robust parameters of circadian misalignment of behaviours. Our machine learning-based SyncScreener enables fast, objective and large-scale behavioural screening of mutant mouse lines. We identified five genes (approximately 0.67%) among 750 mutant lines that are potentially involved in circadian misalignment of activity and food intake behaviour. Previous studies have mainly focused on measuring circadian period, not misalignment, as the target phenotype. Because IMPC plans to generate ~20,000 mutant lines, many for genes thus far uncharacterized, our results have laid the foundation for a future comprehensive screen of circadian behaviour mutants under light/dark cycles.

Anticipating and adapting to light/dark cycles is a major function of circadian clocks. Recent discoveries have highlighted how the internal coincidence of the circadian clock can change phases to synchronize with external environmental cycles [44], as has been shown by mutations of PER2 and CSNK1D in familial advanced sleep phase syndrome [14,15], mutations or SNPs of ARNTL, DEC1 and RORB in bipolar disorder [45], and mutations of CRY1 in familial delayed sleep phase disorder [12]. The identification of these genes provides important insights into how molecular clocks affect human health and behaviour. In this study, we discovered that Slc7a11, Spop, Rhbdl1, Oxtr and Ctc1 are potential candidate genes involved in the precision and adaptability of circadian behaviours under light-dark cycles. Validation of one of these candidate genes, by generating Slc7a11 mutant mice at the Cam-Su Genomic Resource Center, revealed that loss of Slc7a11 interferes with intercellular coupling factors in the SCN. Further studies of these candidate genes will uncover new insights into the mechanism of circadian misalignment of behaviours. A critical next step will be to determine how these genes, which may be involved in distinct pathways, can influence the phase of behaviour and physiology in response to light/dark cycles. Ultimately, the availability of more than 20,000 mouse lines and the screening method established here should allow a comprehensive identification of genes involved in misalignment of mouse behaviour under light/dark cycles.

Materials and methods

Ethics statement

Mouse studies were approved by the Animal Care and Use Committee of the CAM-SU Genomic Resource Center (CAM-SU-AP#: YX-2017-1), The Centre for Phenogenomics (TCP) (Approval committee: Animal Care Committee (ACC) of The Centre for Phenogenomics. Approval License: Animal Use Protocol (AUP) 0275 and 0279H), GMC Helmholtz Zentrum München (HMGU) (Approval License: 144–10), ICS Mouse Clinical Institute (ICS) (Approval Committee: Com'Eth N°17 and French Ministry for Superior Education and Research (MESR). Approval licenses: MESR: APAFIS#4789–2016040511578546), RBRC RIKEN Tsukuba Institute, BioResource Center (RBRC) (Approval License: Exp11-002, 12–002, 13–002, 14–002, 15–002, 16–002 Collection, maintenance, storage, breeding and distribution of the mouse resources Exp11-011, 12–011, 13–011, 14–009, 14–017, 15–009, 16–008 Phenotyping analyses and related studies in mice), WTSI Wellcome Trust Sanger Institute (WTSI) (Approval License: PPL 80/2076 Valid 27th Nov 2006 - 3rd Jan 2012; PPL 80/2485 valid 3rd Jan 2012 - 5th Dec 2016). Every effort was made to minimize the number of animals used, and their suffering.

Animals

Mice were housed in specific pathogen-free animal facilities. Slc7a11tm1a/tm1a mice were generated according to the standard protocol at the CAM-SU Genomic Resource Center. Fbxl3, Zbtb20 and hPER1S730G mutant mice were generated as described previously, and hPER2S662G was generated by Dr. Xu in the Fu & Ptacek lab at UCSF. IC data from these known mutant mice were generated by the Xu lab at the CAM-SU Genomic Resource Center. All mutant mice used in these studies have been described previously [15,16,29,30].

Phenotype data acquisition

Raw indirect calorimetry data were collected from five centres (WTSI, TCP, ICS, RBRC and HMGU). All five centres follow the IMPC pipeline. IC equipment: WTSI, ICS and HMGU: TSE PhenoMaster/Labmaster CaloSys; TCP: Columbus Oxymax/CLAMS; RBRC: O’hara FWI-3002 & IA-16M; CAM-SU: Oxymax/CLAMS. Light schedules are WTSI: 7:30–19:30; TCP: 7:00–19:00; ICS: 7:00–19:00; RBRC: 11:00–23:00; HMGU: 6:00–18:00. Recording in each centre started and ended at different times. In the indirect calorimetry module, standard measurements begin five hours before lights-off (lights-off = T0) and finish at T16, i.e. four hours after lights-on the next morning. Data analyses were restricted to within-centre comparisons between controls and mutants.
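Because the light schedules differ between centres, clock times have to be expressed relative to each centre's own light cycle before they can be compared. The following is a minimal sketch of that conversion, not the consortium's code: the function names are hypothetical, and it assumes ZT0 is lights-on and T0 is lights-off (ZT12), as stated above.

```python
# Lights-on clock times (in hours) per centre, taken from the light schedules
# quoted in the text; lights-off is 12 h later in every case.
LIGHTS_ON_H = {
    "WTSI": 7.5,   # 7:30-19:30
    "TCP": 7.0,    # 7:00-19:00
    "ICS": 7.0,    # 7:00-19:00
    "RBRC": 11.0,  # 11:00-23:00
    "HMGU": 6.0,   # 6:00-18:00
}


def clock_to_zt(centre, hour, minute=0):
    """Zeitgeber time (ZT0 = lights-on) for a local clock time, in [0, 24)."""
    clock = hour + minute / 60.0
    return (clock - LIGHTS_ON_H[centre]) % 24.0


def hours_from_lights_off(centre, hour, minute=0):
    """Time relative to lights-off (T0 = ZT12), in [-12, 12)."""
    return clock_to_zt(centre, hour, minute) - 12.0
```

For example, a recording that starts at 14:00 at TCP corresponds to five hours before lights-off, which matches the standard start of the IC module.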

Data pre-processing

We improved the accuracy of the automatic identification of the peak phase and onset time by detecting and removing bad data points (measurement errors and environmental interference) in the raw activity and food intake data before running any analyses. First, we excluded out-of-range values that exceeded a realistic scope, such as a food intake of more than 0.65 g and an activity higher than 4000 counts per hour during the daytime. Second, we carefully assessed the data obtained at the lights-off time point ZT12 (denoted as $y_{12}$), particularly the data obtained by ICS, because abnormal data generated at ZT12 as a result of an unstable environment, such as unstable light intensity, may lead to an inaccurate assessment of the onset time and peak phase. If $y_{12}$ satisfied the following conditions: 1) it was a local maximum, 2) it was higher than four-fifths of the peak value (denoted as $y_{max}$), and 3) there was no other local maximum between ZT12 and ZT17, we considered it a bad data point and removed it from the dataset. Finally, we deleted pulse (breakup) data points that were far from their neighbouring data points. In brief, a data point is regarded as a breakup data point when the difference $y_{diff,\tau} = \min\{|y_\tau - y_{\tau-1}|, |y_\tau - y_{\tau+1}|\}$ reaches the threshold $y_{th,p} = \alpha_1 y_{max}$, where $y_\tau$ represents the data value at ZT$\tau$ and the parameter $\alpha_1 = 0.57$ is chosen empirically.
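The ZT12 check and the pulse rule described above can be sketched as follows. This is an illustrative reading rather than the SyncScreener implementation: the function names are hypothetical, "between ZT12 and ZT17" is assumed to mean ZT13 to ZT17 inclusive, a local maximum is assumed to be a point strictly higher than its immediate neighbours, and the peak value is assumed to be the maximum of the whole trace.

```python
ALPHA_1 = 0.57  # empirical pulse threshold factor quoted in the text


def is_local_max(y, i):
    """True if y[i] is strictly greater than each immediate neighbour that exists."""
    left_ok = i == 0 or y[i] > y[i - 1]
    right_ok = i == len(y) - 1 or y[i] > y[i + 1]
    return left_ok and right_ok


def zt12_is_bad(zt, y):
    """Apply the three conditions for flagging the ZT12 value as an artefact.

    zt: list of integer zeitgeber times, y: list of values at those times.
    """
    if 12 not in zt:
        return False
    i12 = zt.index(12)
    y_max = max(y)                      # assumption: peak value of the whole trace
    if not is_local_max(y, i12):        # condition 1
        return False
    if y[i12] <= 0.8 * y_max:           # condition 2: higher than 4/5 of the peak
        return False
    later = [i for i, t in enumerate(zt) if 12 < t <= 17]
    return not any(is_local_max(y, i) for i in later)  # condition 3


def pulse_indices(y):
    """Indices whose smaller jump to either neighbour reaches ALPHA_1 * max(y)."""
    threshold = ALPHA_1 * max(y)
    return [
        i
        for i in range(1, len(y) - 1)
        if min(abs(y[i] - y[i - 1]), abs(y[i] - y[i + 1])) >= threshold
    ]
```

Taking the minimum of the two neighbour differences means that a point is only flagged when it is isolated on both sides, so a genuine steep rise at activity onset is not removed.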

Onset time and peak phase recognition by deep learning (CNNs)

We applied a machine learning algorithm to identify the onset time and peak phase of daily cyclic data. In our algorithm, a CNN trained on synthetic rhythmic data sets was used to predict the two biological parameters.

Training data set

The inputs to a machine learning algorithm are thousands of labelled measurements of samples of the same type as the measurements to be predicted. Here, however, the diversity of chronotypes in our data was poor: the rhythm of most tested mice was similar to that of wild-type mice, and the rhythm of most volunteers was normal. Furthermore, the aim of our work was to label the daily rhythmic mouse activity/food intake data with onset times and peak phases. Therefore, we generated synthetic training data sets based on the raw activity and food intake data from the five centres. We were interested in the onset times and peak phases of mouse behaviour. The onset time is the first transition point at which mouse behaviour switches from the rest state to the active state, and the transition process ends at the peak phase. As the transition process is kinetic, we simulated the data before the first peak with an ODE and fitted the remaining data with Gaussian functions. Finally, we employed white noise to simulate measurement errors. To produce proper training data, we performed the following steps:

First, we calculated the average values (yave,i, i = 1,2,3…25) of all measurements at 25 zeitgeber times.

Second, we simulated the state transition process (average data before the first peak) via an ODE with a piecewise function. The ODE can be described as follows:

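$$\frac{dx}{dt} = \varepsilon\,(F - x)$$

$$F(t) = \begin{cases} \dfrac{u_2 h_2 - u_1 h_1}{t_2 - t_1}\,t + \dfrac{u_1 h_1 t_2 - u_2 h_2 t_1}{t_2 - t_1}, & t \le t_2 \\[2ex] u_3 h_3, & t_2 < t < t_3 \end{cases}$$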

F is a piecewise function representing the endogenous switch, where t1, t2 and t3 stand for the zeitgeber times of the first average data point, the data point one hour before the onset time, and the peak phase, respectively. h2 and h3 are the data values at t2 and t3. h1 is set empirically. u1, u2 and u3 are all set to 1.1. x is the measurement of mouse behaviour, which follows the biological switch F in this model.
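To make the switch model concrete, the sketch below integrates dx/dt = ε(F − x) with a simple forward Euler scheme. It is an illustration under stated assumptions, not the authors' solver: the function names, the step size and the example values are hypothetical, and F is read as a straight line from u1·h1 at t1 to u2·h2 at t2 followed by the constant u3·h3 up to t3.

```python
def switch_F(t, t1, t2, t3, h1, h2, h3, u1=1.1, u2=1.1, u3=1.1):
    """Piecewise switch: linear ramp up to t2, then a constant plateau until t3."""
    if t <= t2:
        slope = (u2 * h2 - u1 * h1) / (t2 - t1)
        intercept = (u1 * h1 * t2 - u2 * h2 * t1) / (t2 - t1)
        return slope * t + intercept
    return u3 * h3


def simulate_transition(eps, x0, t1, t2, t3, h1, h2, h3, steps_per_hour=100):
    """Forward-Euler solution of dx/dt = eps * (F - x), sampled once per hour.

    Returns the values at t1, t1 + 1, ..., t3 (t1 and t3 assumed to be whole hours).
    """
    dt = 1.0 / steps_per_hour
    x = x0
    hourly = [x]
    for hour in range(int(round(t3 - t1))):
        for k in range(steps_per_hour):
            t = t1 + hour + k * dt
            x += dt * eps * (switch_F(t, t1, t2, t3, h1, h2, h3) - x)
        hourly.append(x)
    return hourly
```

In this reading, x relaxes towards F at rate ε, so the jump of F at t2 produces the steep rise between onset and the first peak.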

Mouse behaviour is bimodal in the day/night cycle. However, in some cases, there are three peaks. Thus, we employed three Gaussian functions for fitting the remaining average data. Gaussian functions f1(a1,t), f2(a2,t) and f3(a3,t) can be described as follows:

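$$f_1(a,t) = a_1\, e^{-\frac{(t - a_3)^2}{2 a_2^2}} + a_4$$

$$f_2(a,t) = a_5\, e^{-\frac{(t - a_7)^2}{2 a_6^2}} + a_8$$

$$f_3(a,t) = a_9\, e^{-\frac{(t - a_{11})^2}{2 a_{10}^2}} + a_{12}$$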

where a3, a7 and a11 represent the peak phases of the three peaks; a1, a5 and a9 stand for the amplitudes of the three peaks; a2, a6 and a10 describe the width of each peak; and a4, a8 and a12 are the minima of the three fitting curves. We then derived the initial values and parameter ranges for Gaussian fitting from the raw data. We detected the local maxima that were higher than four neighbouring data points. The corresponding ZT values of the local maxima (ZTpeak1, ZTpeak2 and ZTpeak3) were used as initial values for a3, a7 and a11. We took the initial values of a2, a6 and a10 as 2 empirically. The initial a5 and a9 are m2−min(yave) and m3−min(yave), where m2 and m3 are the measured peak values of the second and third peaks from the raw data, and the initial a1 is max(xonset)−min(xonset). The initial values of a4, a8 and a12 are 0.05 for food intake and 750 for activity. Then, we set the lower and upper bounds for the above parameters. The ranges of a7 and a11 are [ZTpeak2−3, ZTpeak2+3] and [ZTpeak3−3, ZTpeak3+3], and the range of a3 is [ZTpeak1, ZTpeak1], so a3 is held fixed. The ranges of a5 and a9 are [0, 10(m2−min(yave))] and [0, 10(m3−min(yave))], and the range of a1 is [max(xonset)−min(xonset), max(xonset)−min(xonset)], so a1 is also held fixed. The ranges of a2, a6 and a10 are all [0, 4]. The ranges of a4, a8 and a12 are [0, 0.15] for food intake. If there are only two peaks, m3 is set to zero. Finally, the raw average data after the first peak were fitted with multiple Gaussian functions via MATLAB’s lsqcurvefit function with the above initial values and parameter ranges. Accordingly, we obtained the standard parameters p* that best fit the ODE and Gaussian functions to the raw average data.
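As an illustration of the two building blocks used to initialise the fit, the sketch below evaluates a single Gaussian peak of the form described above and detects candidate peaks in the averaged data. It is not the authors' MATLAB code: the function names are hypothetical, and "higher than four neighbouring data points" is assumed to mean the two points on each side.

```python
import math


def gaussian_peak(t, amplitude, width, centre, baseline):
    """One Gaussian term: amplitude * exp(-(t - centre)^2 / (2 * width^2)) + baseline."""
    return amplitude * math.exp(-((t - centre) ** 2) / (2.0 * width ** 2)) + baseline


def find_local_maxima(y, reach=2):
    """Indices whose value exceeds the `reach` points on each side (assumed: 2 + 2 = 4)."""
    peaks = []
    for i in range(reach, len(y) - reach):
        neighbours = y[i - reach:i] + y[i + 1:i + 1 + reach]
        if all(y[i] > v for v in neighbours):
            peaks.append(i)
    return peaks
```

The ZT values at the detected indices would serve as the starting guesses for the peak-phase parameters, with a width of 2 as the starting guess for each Gaussian.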

Next, we generated various parameter sets p to mimic different chronotypes. The parameters for the ODE can be described as follows:

  1. t1 is the zeitgeber time of the first data point. t1 is set to 6 for mice and 0 for humans;

  2. t2 is the zeitgeber time of one data point before onset. t2 is uniformly distributed between 9 and 13 for mice, and

  3. t3 is the zeitgeber time of the peak phase. t3 = t2 + Δt, where Δt is uniformly distributed between 0 and 3.

  4. h3 is the peak height of the first peak. h3 = ζh3·h3*, where h3* represents the peak height of the first peak of the standard curve and ζh3 is a random variable following the uniform distribution U(0.5, 1.5).

  5. h2 is the height of the data point at t2. h2 = ζh2·h3, where ζh2 is a random variable following the uniform distribution U(0.2, 0.9).

  6. h1 is the height of the data point at t1. h1 = h2/ζh1, where ζh1 is a random variable following the uniform distribution U(1, 1.2).

  7. u1 = u2 = u3 = 1.1.

  8. ε = ζε·ε*, where ζε is a random variable following the uniform distribution U(0.8, 1.2).

  9. The solution of ODE with the above parameters is a time series y1, which represents the pattern before the first peak.
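The sampling scheme for the ODE parameters listed above can be sketched as follows, for the mouse case only. This is a minimal illustration, not the published code: the function name and dictionary keys are hypothetical, item 3 is read as t3 = t2 + Δt, t2 is assumed to be drawn from a continuous uniform distribution, and h3* and ε* are assumed to come from the fit to the averaged data.

```python
import random


def sample_ode_params(h3_star, eps_star, rng):
    """Draw one synthetic parameter set for the onset-transition ODE (mice)."""
    t1 = 6                                   # ZT of the first data point for mice
    t2 = rng.uniform(9.0, 13.0)              # one data point before onset
    t3 = t2 + rng.uniform(0.0, 3.0)          # assumed reading: t3 = t2 + delta_t
    h3 = rng.uniform(0.5, 1.5) * h3_star     # height of the first peak
    h2 = rng.uniform(0.2, 0.9) * h3          # height at t2
    h1 = h2 / rng.uniform(1.0, 1.2)          # height at t1
    eps = rng.uniform(0.8, 1.2) * eps_star   # relaxation rate
    return {"t1": t1, "t2": t2, "t3": t3, "h1": h1, "h2": h2, "h3": h3,
            "u": 1.1, "eps": eps}


# Example: a seeded generator makes the synthetic set reproducible.
params = sample_ode_params(h3_star=1000.0, eps_star=1.0, rng=random.Random(0))
```

Because h2 is drawn as a fraction of h3 and h1 as a fraction of h2, every sampled curve keeps the ordering h1 ≤ h2 < h3, so the simulated transition always rises towards the first peak.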

Parameters for three Gaussian functions can be described as follows:

  1. a1 is the amplitude of the first peak, a1 = max(y1)−y1(1).

  2. a2 is the width of the first peak. a2 = max{twidth1, twidth2}, where twidth2 is uniformly distributed between 2 and 3, twidth1 = ζwidth·twidth0, and ζwidth is uniformly distributed between 0 and 3. twidth0 = t3 − tmid, where tmid is the zeitgeber time of ymid and ymid = (h2 + h3)/2.

  3. a3 is the zeitgeber time of the first peak phase, a3 = t3.

  4. a4 is the minima of the curves, a4 = y1(1).

  5. a5 is the amplitude of the second peak, a5 = ζa5·a1, where ζa5 is a random variable following the uniform distribution U(0.8, 1.2).

  6. a6 is the width of the second peak, a6 = ζa6·a6*, where a6* represents the width of the second peak of the standard curve and ζa6 is a random variable following the uniform distribution U(0.8, 1.2).

  7. a7 is the zeitgeber time of the second peak phase, a7 = ζa7 + a7*, where a7* is the zeitgeber time of the second peak phase of the standard curve and ζa7 is a random variable following the uniform distribution U(−1, 2).

  8. a8 is the minimum of the curve, a8 = a4. The third Gaussian function is $f_3(a,t) = a_9\, e^{-\frac{(t - a_{11})^2}{2 a_{10}^2}} + a_{12}$.

  9. a9 is the amplitude of the third peak, a9 = ζa9(a4+a1)−a8. ζa9 is a random variable following uniform distribution U (0.5, 1.5).

  10. a10 is the width of the third peak, a10 = ζa10·a10*, where a10* represents the width of the third peak of the standard curve and ζa10 is a random variable following the uniform distribution U(0.8, 1.2).

  11. a11 is the zeitgeber time of the third peak phase, a11 = t1 + ζa11, where ζa11 is a random variable following the uniform distribution U(0, 1).

  12. a12 is the minimum of the curve, a12 = a8.
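The width rule for the first peak (item 2 above) is the least obvious of these parameters, so a small sketch may help. It rests on several assumptions: the function names are hypothetical, tmid is found by linear interpolation on the simulated rise y1, the rule is read as a2 = max{twidth1, twidth2} with twidth0 = t3 − tmid, and ymid is read as (h2 + h3)/2.

```python
import random


def time_at_level(times, values, level):
    """First time at which a rising series crosses `level` (linear interpolation)."""
    for i in range(len(times) - 1):
        y0, y1 = values[i], values[i + 1]
        if y1 > y0 and y0 <= level <= y1:
            return times[i] + (level - y0) * (times[i + 1] - times[i]) / (y1 - y0)
    raise ValueError("level is not crossed by the series")


def sample_first_peak_width(times, y1, h2, h3, t3, rng):
    """Draw a2 = max(t_width1, t_width2) for the first Gaussian peak."""
    y_mid = (h2 + h3) / 2.0                        # half-way height of the rise
    t_mid = time_at_level(times, y1, y_mid)        # ZT at which y1 reaches y_mid
    t_width0 = t3 - t_mid                          # rise time from half-height to peak
    t_width1 = rng.uniform(0.0, 3.0) * t_width0    # scaled rise time
    t_width2 = rng.uniform(2.0, 3.0)               # floor on the width
    return max(t_width1, t_width2)
```

Under this reading the width of the first peak is never below 2 h, while a slow rise can widen the peak beyond that floor.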

Then, we created curves via the ODE and Gaussian functions with the above parameters, p, and extracted 25 data points, one at each ZT. We generated 100,000 (10^5) training curves; therefore, the training sets can be described as (Yi,j, i = 1,2,3…25, j = 1,2,3…10^5). Additionally, a standard normal random perturbation was used to simulate experimental noise. Thus, the final training sets include both the circadian patterns (parameter variation) and measurement errors (white noise).
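The final step, pairing each noisy 25-point curve with its known label, could look like the sketch below. The function names are hypothetical, and `noise_scale` is an assumed parameter: the text states only that a standard normal perturbation was used, not how it was scaled relative to the signal.

```python
import random

N_POINTS = 25  # one value per zeitgeber time


def add_white_noise(curve, rng, noise_scale):
    """Add independent N(0, 1) noise, multiplied by an assumed scale, to each point."""
    return [y + noise_scale * rng.gauss(0.0, 1.0) for y in curve]


def build_training_set(clean_curves, labels, rng, noise_scale):
    """Return (noisy_curve, (onset_time, peak_phase)) pairs for CNN training."""
    return [
        (add_white_noise(curve, rng, noise_scale), label)
        for curve, label in zip(clean_curves, labels)
    ]
```

Because the curves are synthetic, the labels (onset time and peak phase) are known exactly from the sampled parameters, which is what makes supervised training possible without manual annotation.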

Convolutional neural network architecture

The CNN in our deep learning model has three convolution layers, two pooling layers and two fully connected layers. Each fully connected layer consists of 1024 neurons. Dropout was applied to the first fully connected layer to avoid overfitting, with the keep probability (1 − drop probability) set to 0.5. The parameters of each layer are specified in Table 1 and Table 2. The loss function is

$$\mathrm{loss} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{ii} - y_{pi}\right)$$

Table 1. Convolutional neural network structure.

| Layer | Neurons | Filters | Filter size | Strides | Padding | Activation function |
| --- | --- | --- | --- | --- | --- | --- |
| Convolution layer 1 | (25−5)×16 | 16 | 6×1 | 1 | SAME | ReLU |
| Convolution layer 2 | (20−3)×32 | 32 | 4×1×16 | 1 | SAME | ReLU |
| Convolution layer 3 | (17−3)×64 | 64 | 4×1×32 | 1 | SAME | ReLU |

Table 2. Convolutional neural network parameters.

| Layer | Pooling function | Pooling size | Strides |
| --- | --- | --- | --- |
| Pooling layer 1 | max_pool | 1×2 | 2 |
| Pooling layer 2 | max_pool | 1×2 | 2 |

In the loss function, n is the number of training data, and yii and ypi represent the i-th input data and the i-th predicted result, respectively. The optimizer was Adam, and the learning rate was set to 0.0001.
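For readers who want to check the layer sizes, the sketch below implements the standard output-length arithmetic for a one-dimensional convolution. It is generic arithmetic rather than the authors' network code, and the function name is hypothetical. With stride 1 and filter lengths 6, 4 and 4, the unpadded ("VALID") formula gives lengths 20, 17 and 14, which coincide with the (25−5), (20−3) and (17−3) entries in the "neurons" column of Table 1; with "SAME" padding at stride 1 the length would stay unchanged.

```python
import math


def conv1d_length(n_in, kernel, stride=1, padding="VALID"):
    """Output length of a 1-D convolution under the usual SAME/VALID conventions."""
    if padding == "SAME":
        return math.ceil(n_in / stride)
    if padding == "VALID":
        return (n_in - kernel) // stride + 1
    raise ValueError("padding must be 'SAME' or 'VALID'")


# Chain the three filter lengths from Table 1, starting from 25 input points.
length = 25
valid_lengths = []
for kernel in (6, 4, 4):
    length = conv1d_length(length, kernel, stride=1, padding="VALID")
    valid_lengths.append(length)
```

The positions of the two pooling layers relative to the convolution layers are not stated in the text, so they are deliberately left out of this chain.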

Effect size

To quantify the strength of phenodeviance in mutant mice, we used the standardized effect size (Cohen's d),

$$d = \frac{\bar{x}_m - \bar{x}_{wt}}{s},$$

to measure the difference between mutant and wild-type mice. Here $\bar{x}_m$ and $\bar{x}_{wt}$ are the mean values of the peak phases or onset times of mutant and wild-type mice in the corresponding centre, and s is the pooled standard deviation,

$$s = \sqrt{\frac{(n_m - 1)\,s_m^2 + (n_{wt} - 1)\,s_{wt}^2}{n_m + n_{wt} - 2}},$$

where $n_m$ and $n_{wt}$ are the numbers of mutant and wild-type mice in the corresponding centre, and $s_m$ and $s_{wt}$ are the standard deviations of the peak phases or onset times of mutant and wild-type mice in the corresponding centre. A larger absolute effect size indicates stronger phenodeviance.
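The effect-size calculation can be written in a few lines. This is a minimal sketch using only the Python standard library, with hypothetical function names; it assumes sample (n − 1) standard deviations, which is what the pooled formula above implies.

```python
import math
import statistics


def pooled_sd(mutant, wild_type):
    """Pooled standard deviation of two groups, using sample variances."""
    n_m, n_wt = len(mutant), len(wild_type)
    var_m = statistics.variance(mutant)
    var_wt = statistics.variance(wild_type)
    return math.sqrt(((n_m - 1) * var_m + (n_wt - 1) * var_wt) / (n_m + n_wt - 2))


def cohens_d(mutant, wild_type):
    """Cohen's d: difference in group means divided by the pooled SD."""
    diff = statistics.mean(mutant) - statistics.mean(wild_type)
    return diff / pooled_sd(mutant, wild_type)
```

With onset times expressed in ZT hours, a negative d then corresponds to an earlier (advanced) onset in the mutant group and a positive d to a later (delayed) one, so the sign carries the direction and the magnitude carries the strength.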

Metabolic rhythm measure and analysis

Mice were housed in individual metabolic cages in a temperature-controlled animal facility for an adaptation period of 3 days and were continuously recorded for another 3 days in 20 min time bins. The activity and food intake rhythmicities were calculated as described in our previous studies [16,29].

Locomotor activity analysis

For the wheel-running activity assay, as previously described [15], six to ten four-month-old mice were individually housed in cages equipped with running wheels. They were initially entrained to an LD cycle for at least 7 days, followed by constant darkness for several weeks. To examine light entrainment in mice, a skeleton photoperiod with two light pulses, from clock time 07:45 to 08:00 and from clock time 20:00 to 20:15 (ZT0, clock time 08:00), was used. Mice were initially entrained to LD 12:12 for at least 14 d and then released to the skeleton photoperiod for 21 d [46]. For the jetlag experiments, mice were entrained to an LD 12:12 cycle for 10 days, and the LD cycle was then advanced by 6 hr for 20 days before returning to the original setting [9]. To assess masking in mice, an LD 3.5:3.5 cycle was used. Mice were entrained to an LD 12:12 cycle for 10 days and then released to the LD 3.5:3.5 cycle for 7 days [47]. Wheel rotation was recorded using ClockLab software (Actimetrics, RRID:SCR_014309).

RNA isolation, RT-PCR and mRNA expression analyses

RNA isolation and RT-PCR (including primers for mRNA profiling) were carried out as previously described [48]. The relative levels of each RNA were normalized to the corresponding Actin levels. Each value used for these calculations was the mean of at least three replicates of the same reaction. Relative RNA levels are expressed as the percentage of the maximal value obtained for each experiment. Each mean ± s.d. was obtained from three biologically independent experiments.

In situ hybridization of the SCN

Mice were euthanized by cervical dislocation at the indicated time points. Coronal sections containing the SCN were processed for in situ hybridization with cRNA sense or antisense probes from nucleotides 581–1412 (NM_011990.1) for Slc7a11. Hybridization steps were performed as in our previous study [48].

Supporting information

S1 File. Activity/rest scatter plots.

(RAR)

S2 File. Food intake scatter plots.

(RAR)

S3 File. SyncScreener.

(RAR)

S1 Data. The numerical data underlying graphs and summary statistics.

(XLSX)

S1 Table. Onset times of wild-type mice from visual assessment.

(DOCX)

S2 Table. Peak phases of wild-type mice from visual assessment.

(DOCX)

S3 Table. Onset times of wild-type mice from machine learning algorithm.

(DOCX)

S4 Table. Peak phases of wild-type mice from machine learning algorithm.

(DOCX)

S5 Table. Effect size and p value of known mutants (visual assessment).

(DOCX)

S6 Table. Number of mutant lines in each center.

(DOCX)

S7 Table. Mutant lines.

(DOCX)

S8 Table. Onset times and peak phases of mutant lines.

(DOCX)

S9 Table. Mutant lines existing in at least two centers.

(DOCX)

S10 Table. Mutant lines for the secondary criteria.

(DOCX)

S11 Table. Hits from secondary criteria.

(DOCX)

S12 Table. Phenotype associated assay.

(DOCX)

S1 Fig. Distribution of the onset time and peak phase in five IMPC centres identified by visual assessment.

Histogram of the onset and peak phase results obtained from five IMPC centres (ICS, WTSI, RBRC, TCP and HMGU) under 12-hour light and 12-hour dark cycles. n = 2201 C57BL/6N mice for activity and n = 2160 C57BL/6N mice for food intake measured by indirect calorimetry over time. Pink column: onset time, red column: peak phase.

(TIF)

S2 Fig. Onset time and peak phase identification in synthetic training sets.

(A) Mean food intake data for wild-type mice from HMGU as an example. Red dots are averaged raw data and the multicoloured curve is the fitted curve. The data before the peak phase (between the two red dashed lines) are fitted by the piecewise function in (B). The blue curve is the fitted curve. Onset is identified as the first data point entering the transition (green arrow), while the peak phase is at the end of the transition (blue arrow). Other data in (A) are fitted by Gaussian functions (green curves). (B) Piecewise function for fitting data points before the E peak phase. The dividing points of the two stages are indicated by two black arrows in (A) and (B).

(TIF)

S3 Fig. Bland–Altman plots of the bias between the two methods, the visual and SyncScreener for activity.

The 95% limits of agreement (1.96 s.d.) were calculated to determine whether the SyncScreener could replace visual assessment. Activity onset data and peak phase activity data obtained by the five centres (ICS, WTSI, RBRC, TCP and HMGU).

(TIF)

S4 Fig. Bland–Altman plots of the bias between the two methods, the visual and SyncScreener of food intake.

Food intake onset data and peak phase data obtained by the five centres (ICS, WTSI, RBRC, TCP and HMGU).

(TIF)

S5 Fig. Daytime activity and food intake of Oxtrtm1.1/tm1.1 mice.

(A and B) Ratios of daytime activity (A) and food intake (B) to that at night were calculated using data from Oxtrtm1.1/tm1.1 and wild-type mice data from RBRC. Two-way ANOVA was employed to test the statistical significance. ****: P < 0.0001.

(TIF)

S6 Fig. Generation of Slc7a11tm1a/tm1a mice.

(A) Schematic of knockout strategy for Slc7a11 based on knockout-first design. (B) Forward and reverse primers for genotyping. (C) PCR analysis of tail genomic DNA for wild-type and Slc7a11tm1a/tm1a alleles in wild-type, heterozygous and homozygous knockout mice.

(TIF)

S7 Fig. Profiles of energy expenditure parameters using a CLAMS.

(A) food intake; (B) the volume of CO2; (C) the volume of O2; (D) the heat. Rhythms of food intake, VCO2, VO2 and the heat were plotted over a 72 hr time frame as the mean ± SEM (n = 10). Two-way ANOVA was used to determine the statistical significance, *p < 0.05.

(TIF)

S8 Fig. Reentrainment of Slc7a11tm1a/tm1a mice to a new light-dark cycle.

(A) Representative actograms of wheel-running activity of wild-type and Slc7a11tm1a/tm1a mice subjected to a 6-hr phase advance and delay in LD cycle. At day 22, the recording was disrupted for about 24 hours. (B and C) Re-entrainment traces of phase advance (B) and delay (C) of wild-type (Blue) and Slc7a11tm1a/tm1a (Red) mice. n = 9 for wild-type mice, n = 5 for Slc7a11tm1a/tm1a mice. Two-way ANOVA was employed to test the statistical significance. *: P < 0.05.

(TIF)

S9 Fig. Masking of wild-type and Slc7a11tm1a/tm1a mice during LD 3.5:3.5.

(A and B) Representative actograms of daily wheel-running activity of wild-type (A) and Slc7a11tm1a/tm1a mice (B). Light phases are indicated in yellow to show the structure of the LD 3.5:3.5 cycle as well as to help visualize the occurrence of wheel-running activity under this schedule. (C) Masking ratios of wild-type and Slc7a11tm1a/tm1a mice during LD 3.5:3.5, which are calculated by dividing total activity during light phases with that during dark phases. n = 7 for each genotype. Two-way ANOVA was employed to test the statistical significance. n.s.: P >0.05.

(TIF)

S10 Fig. Expression profiles of the core clock genes in the liver tissues.

Error bars represent the s.d. for each time point from three biologically independent replicates. Two-way ANOVA was employed to test the statistical significance. n.s.: P > 0.05.

(TIF)

Acknowledgments

The authors thank all IMPC members and partners for their contribution to the consortium effort, members of the Cambridge-Suda GRC for their assistance in the animal facility, and members of the Xu laboratory for discussion. Sanger Institute Mouse Genetics Project Members are as follows: David Lafont, Valerie E. Vancollie, Robbie S.B. McLaren, Emma Sanderson, Christine Rowley, Mark Griffiths, Brendan Doe, Nicola Cockle, Joanna Bottomley, Edward Ryder, Diane Gleeson, Ramiro Ramirez-Solis, Hannah Wardle-Jones, David J. Adams, Graham Duddy.

Data Availability

The data underlying the results presented in this study are available from the IMPC consortium (https://www.mousephenotype.org/help/api-access/) or the Cambridge-Suda Genomic Resource Center (http://gofile.me/2F1pE/RP2URKxV2). Numerical data underlying graphs or summary statistics are provided in spreadsheet form as Supporting Information.

Funding Statement

This work was supported by grants from the Ministry of Science and Technology (2018YFA0801100 to Y.X.) and the National Natural Science Foundation of China (31630091 to Y.X., 31871185 to Y.D., 31600958 to Z.L., 11671417 to L.Y.). We also thank the Priority Academic Program Development of the Jiangsu Higher Education Institutes (PAPD) and National Center for International Research (2017B01012). AMM, TFM, DS, HP, PF and the IMPC Data Coordination Centre are supported by the NIH Common Fund grant (UM1HG006370). Infrafrontier grant 01KX1012, support by the German Center for Diabetes Research (DZD), EU Horizon2020: IPAD-MD funding 653961 (MHA). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Takahashi JS (2017) Transcriptional architecture of the mammalian circadian clock. Nat Rev Genet 18: 164–179. 10.1038/nrg.2016.150
  • 2.Welsh DK, Takahashi JS, Kay SA (2010) Suprachiasmatic nucleus: cell autonomy and network properties. Annu Rev Physiol 72: 551–577. 10.1146/annurev-physiol-021909-135919
  • 3.Roenneberg T, Merrow M (2016) The Circadian Clock and Human Health. Curr Biol 26: R432–443. 10.1016/j.cub.2016.04.011
  • 4.Morris CJ, Purvis TE, Hu K, Scheer FA (2016) Circadian misalignment increases cardiovascular disease risk factors in humans. Proc Natl Acad Sci U S A 113: E1402–1411. 10.1073/pnas.1516953113
  • 5.Hattar S, Lucas RJ, Mrosovsky N, Thompson S, Douglas RH, et al. (2003) Melanopsin and rod-cone photoreceptive systems account for all major accessory visual functions in mice. Nature 424: 76–81. 10.1038/nature01761
  • 6.Panda S, Provencio I, Tu DC, Pires SS, Rollag MD, et al. (2003) Melanopsin is required for non-image-forming photic responses in blind mice. Science 301: 525–527. 10.1126/science.1086179
  • 7.Harmar AJ, Marston HM, Shen S, Spratt C, West KM, et al. (2002) The VPAC(2) receptor is essential for circadian function in the mouse suprachiasmatic nuclei. Cell 109: 497–508. 10.1016/s0092-8674(02)00736-5
  • 8.Aton SJ, Colwell CS, Harmar AJ, Waschek J, Herzog ED (2005) Vasoactive intestinal polypeptide mediates circadian rhythmicity and synchrony in mammalian clock neurons. Nat Neurosci 8: 476–483. 10.1038/nn1419
  • 9.Yamaguchi Y, Suzuki T, Mizoro Y, Kori H, Okada K, et al. (2013) Mice genetically deficient in vasopressin V1a and V1b receptors are resistant to jet lag. Science 342: 85–90. 10.1126/science.1238599
  • 10.Jones CR, Campbell SS, Zone SE, Cooper F, DeSano A, et al. (1999) Familial advanced sleep-phase syndrome: A short-period circadian rhythm variant in humans. Nat Med 5: 1062–1065. 10.1038/12502
  • 11.Toh KL, Jones CR, He Y, Eide EJ, Hinz WA, et al. (2001) An hPer2 phosphorylation site mutation in familial advanced sleep phase syndrome. Science 291: 1040–1043. 10.1126/science.1057499
  • 12.Patke A, Murphy PJ, Onat OE, Krieger AC, Ozcelik T, et al. (2017) Mutation of the Human Circadian Clock Gene CRY1 in Familial Delayed Sleep Phase Disorder. Cell 169: 203–215 e213. 10.1016/j.cell.2017.03.027
  • 13.Hirano A, Shi G, Jones CR, Lipzen A, Pennacchio LA, et al. (2016) A Cryptochrome 2 mutation yields advanced sleep phase in humans. Elife 5.
  • 14.Xu Y, Padiath QS, Shapiro RE, Jones CR, Wu SC, et al. (2005) Functional consequences of a CKIdelta mutation causing familial advanced sleep phase syndrome. Nature 434: 640–644. 10.1038/nature03453
  • 15.Xu Y, Toh KL, Jones CR, Shin JY, Fu YH, et al. (2007) Modeling of a human circadian mutation yields insights into clock regulation by PER2. Cell 128: 59–70. 10.1016/j.cell.2006.11.043
  • 16.Liu Z, Huang M, Wu X, Shi G, Xing L, et al. (2014) PER1 phosphorylation specifies feeding rhythm in mice. Cell Rep 7: 1509–1520. 10.1016/j.celrep.2014.04.032
  • 17.Crosby P, Hamnett R, Putker M, Hoyle NP, Reed M, et al. (2019) Insulin/IGF-1 Drives PERIOD Synthesis to Entrain Circadian Rhythms with Feeding Time. Cell 177: 896–909 e820. 10.1016/j.cell.2019.02.017
  • 18.Balsalobre A, Brown SA, Marcacci L, Tronche F, Kellendonk C, et al. (2000) Resetting of circadian time in peripheral tissues by glucocorticoid signaling. Science 289: 2344–2347. 10.1126/science.289.5488.2344
  • 19.Saini C, Morf J, Stratmann M, Gos P, Schibler U (2012) Simulated body temperature rhythms reveal the phase-shifting behavior and plasticity of mammalian circadian oscillators. Genes Dev 26: 567–580. 10.1101/gad.183251.111
  • 20.Buhr ED, Yoo SH, Takahashi JS (2010) Temperature as a universal resetting cue for mammalian circadian oscillators. Science 330: 379–385. 10.1126/science.1195262
  • 21.Stokkan KA, Yamazaki S, Tei H, Sakaki Y, Menaker M (2001) Entrainment of the circadian clock in the liver by feeding. Science 291: 490–493. 10.1126/science.291.5503.490
  • 22.Golombek DA, Rosenstein RE (2010) Physiology of circadian entrainment. Physiol Rev 90: 1063–1102. 10.1152/physrev.00009.2009
  • 23.de Angelis MH, Nicholson G, Selloum M, White JK, Morgan H, et al. (2015) Analysis of mammalian gene function through broad-based phenotypic screens across a consortium of mouse clinics (vol 47, pg 969, 2015). Nature Genetics 47.
  • 24.Dickinson ME, Flenniken AM, Ji X, Teboul L, Wong MD, et al. (2016) High-throughput discovery of novel developmental phenotypes. Nature 537: 508–514. 10.1038/nature19356
  • 25.Beckers J, Wurst W, de Angelis MH (2009) Towards better mouse models: enhanced genotypes, systemic phenotyping and envirotype modelling. Nat Rev Genet 10: 371–380. 10.1038/nrg2578
  • 26.Helfrich-Forster C (2009) Does the morning and evening oscillator model fit better for flies or mice? J Biol Rhythms 24: 259–270. 10.1177/0748730409339614
  • 27.Inagaki N, Honma S, Ono D, Tanahashi Y, Honma K (2007) Separate oscillating cell groups in mouse suprachiasmatic nucleus couple photoperiodically to the onset and end of daily activity. Proc Natl Acad Sci U S A 104: 7664–7669. 10.1073/pnas.0607713104
  • 28.Myles PS, Cui J (2007) Using the Bland-Altman method to measure agreement with repeated measures. Br J Anaesth 99: 309–311. 10.1093/bja/aem214
  • 29.Qu Z, Zhang H, Huang M, Shi G, Liu Z, et al. (2016) Loss of ZBTB20 impairs circadian output and leads to unimodal behavioral rhythms. Elife 5.
  • 30.Shi G, Xing L, Liu Z, Qu Z, Wu X, et al. (2013) Dual roles of FBXL3 in the mammalian circadian feedback loops are important for period determination and robustness of the clock. Proc Natl Acad Sci U S A 110: 4750–4755. 10.1073/pnas.1302560110
  • 31.Sawilowsky SS (2009) New effect size rules of thumb. Journal of Modern Applied Statistical Methods 8: 467–474.
  • 32.de Angelis MH, Nicholson G, Selloum M, White J, Morgan H, et al. (2015) Analysis of mammalian gene function through broad-based phenotypic screens across a consortium of mouse clinics. Nat Genet 47: 969–978. 10.1038/ng.3360
  • 33.Potter PK, Bowl MR, Jeyarajan P, Wisby L, Blease A, et al. (2016) Novel gene function revealed by mouse mutagenesis screens for models of age-related disease. Nat Commun 7: 12444 10.1038/ncomms12444
  • 34.Skarnes WC, Rosen B, West AP, Koutsourakis M, Bushell W, et al. (2011) A conditional knockout resource for the genome-wide study of mouse gene function. Nature 474: 337–342. 10.1038/nature10163
  • 35.Herzog ED, Hermanstyne T, Smyllie NJ, Hastings MH (2017) Regulating the Suprachiasmatic Nucleus (SCN) Circadian Clockwork: Interplay between Cell-Autonomous and Circuit-Level Mechanisms. Cold Spring Harb Perspect Biol 9.
  • 36.Bae K, Jin X, Maywood ES, Hastings MH, Reppert SM, et al. (2001) Differential functions of mPer1, mPer2, and mPer3 in the SCN circadian clock. Neuron 30: 525–536. 10.1016/s0896-6273(01)00302-6
  • 37.Bunger MK, Wilsbacher LD, Moran SM, Clendenin C, Radcliffe LA, et al. (2000) Mop3 is an essential component of the master circadian pacemaker in mammals. Cell 103: 1009–1017. 10.1016/s0092-8674(00)00205-1
  • 38.Karatsoreos IN, Romeo RD, Mcewen BS, Rae S (2010) Diurnal regulation of the gastrin-releasing peptide receptor in the mouse circadian clock. European Journal of Neuroscience 23: 1047–1053.
  • 39.Aida R, Moriya T, Araki M, Akiyama M, Wada K, et al. (2002) Gastrin-Releasing Peptide Mediates Photic Entrainable Signals to Dorsal Subsets of Suprachiasmatic Nucleus via Induction ofPeriod Gene in Mice. Molecular pharmacology 61: 26–34. 10.1124/mol.61.1.26
  • 40.Mazuski C, Abel JH, Chen SP, Hermanstyne TO, Jones JR, et al. (2018) Entrainment of Circadian Rhythms Depends on Firing Rates and Neuropeptide Release of VIP SCN Neurons. Neuron 99: 555–563 e555. 10.1016/j.neuron.2018.06.029
  • 41.Cheng MY, Bullock CM, Li C, Lee AG, Bermak JC, et al. (2002) Prokineticin 2 transmits the behavioural circadian rhythm of the suprachiasmatic nucleus. Nature 417: 405–410. 10.1038/417405a
  • 42.Zhou QY, Cheng MY (2005) Prokineticin 2 and circadian clock output. FEBS J 272: 5703–5709. 10.1111/j.1742-4658.2005.04984.x
  • 43.Hatori M, Gill S, Mure LS, Goulding M, O'Leary DD, et al. (2014) Lhx1 maintains synchrony among circadian oscillator neurons of the SCN. Elife 3: e03357 10.7554/eLife.03357
  • 44.Herzog ED (2007) Neurons and networks in daily rhythms. Nat Rev Neurosci 8: 790–802. 10.1038/nrn2215
  • 45.McCarthy MJ, Nievergelt CM, Kelsoe JR, Welsh DK (2012) A survey of genomic studies supports association of circadian clock genes with bipolar disorder spectrum illnesses and lithium response. PLoS One 7: e32091 10.1371/journal.pone.0032091
  • 46.Kurien P, Hsu PK, Leon J, Wu D, McMahon T, et al. (2019) TIMELESS mutation alters phase responsiveness and causes advanced sleep phase. Proc Natl Acad Sci U S A 116: 12045–12053. 10.1073/pnas.1819110116
  • 47.Izumo M, Pejchal M, Schook AC, Lange RP, Walisser JA, et al. (2014) Differential effects of light and feeding on circadian organization of peripheral clocks in a forebrain Bmal1 mutant. 3: 320–324.
  • 48.Wang X, Tang J, Xing L, Shi G, Ruan H, et al. (2010) Interaction of MAGED1 with nuclear receptors affects circadian clock function. EMBO J 29: 1389–1400. 10.1038/emboj.2010.34

Decision Letter 0

Gregory S Barsh, Achim Kramer

10 Sep 2019

Dear Dr Xu,

Thank you very much for submitting your Research Article entitled 'High-throughput discovery of genetic determinants of circadian entrainment' to PLOS Genetics. Your manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the current manuscript. Based on the reviews, we will not be able to accept this version of the manuscript, but we would be willing to review again a much-revised version. We cannot, of course, promise publication at that time.

Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by each reviewer. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments are included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see our guidelines.

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool.  PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, use the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.

Yours sincerely,

Achim Kramer

Associate Editor

PLOS Genetics

Gregory Barsh

Editor-in-Chief

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Zhang et al. developed a high-throughput screening approach to identify genes involved in circadian entrainment in mice. They use WT activity and feeding data from five different IMPC centers as reference data set and test this against data from 750 mutant lines. They identify 5 new mutants of which one is followed up. Slc7a11 mutant mice show advanced activity onsets and some changes in the expression of cell coupling genes in the SCN.

The paper describes an interesting and very promising approach - and a clever use of existing consortium data. I have two major points of criticism:

1. Maybe I misunderstood this, but to me it is somewhat surprising that from the 750 screened lines no known clock regulators with activity phenotypes have emerged.

2. While the proclaimed target of the screen is circadian entrainment only LD12:12 activity and gene expression data are shown, and no entrainment-targeted experiments were performed.

Further points:

1. It is surprising that the activity phase advance for the Slc7a11 mutants is rather mild and not very robust. Was this one of the best mutants? How often were strong onset advances or delays (+/- 3 hrs or more) observed?

2. Line 103 – would suggest inserting “but may also promote” before “sleep disorders” as chronodisruption is not the only cause for the following disorders.

3. Line 147 – make sure to clearly distinguish between “circadian” and “diurnal”. Since you are studying LD conditions in most cases, “daily” or “diurnal” would be correct here.

4. Page 9 – why were activity data binned at 1-h intervals? For the machine learning algorithm this should not make much of a difference and you are simplifying the data, thus potentially masking interesting phenotypes.

5. Line 193 – you argue that you excluded M peaks from analysis because they are less robust, but what would that mean for entrainment? Entrainment phase should be reflected in both peaks. I guess there is a bit of a misconception between “entrainment” and “activity pattern/profile” throughout the paper. Along this line it does not surprise that SCN clock gene expression is not consistently altered in the Slc7a11 mutant since effects could be downstream of the SCN.

6. Line 265 – what do you mean by “similar phenotypes”?

7. Page 14 – to assess entrainment I would expect T-cycle experiments or experiments under different light intensity conditions. Could the advances of activity onsets in the Slc7a11 mutants be explained by altered light masking? Do the mice entrain faster/slower under shifted LD cycle conditions?

8. Figs. 3 & 5 – relabel “ZT30” to “ZT6”

9. Fig. 7C – why was gene expression not tested in the SCN? How were CTs determined for arrhythmic Per1/2 and Bmal1 mutants? It is very counterintuitive that Slc7a11 gains rhythmicity in the two clock-less mutants compared to WT animals.

10. Fig. 7D – clock gene expression appears very much dampened already in the WT animals. Why were genes with very little rhythmicity in the SCN (Clock, Cry2) tested, but not high-amplitude genes such as Dbp or Nr1d1?

11. Fig. 7E – this figure is confusing to me. Several of the tested genes are not rhythmic in the WT SCN. Does this mean the effect of Slc7a11 is independent of the clock? If so, can one then speak of “circadian entrainment” (see also above)?

Reviewer #2: Zhang et al. analyze indirect calorimetry data from IMPC centers to extract circadian entrainment data. Overall, this paper carries out a good analysis of existing data to extract new phenotypes. I certainly think it deserves publication in PLOS Genetics. I have several recommendations that will improve the clarity of the paper.

Major – sex is not addressed in the paper. This is a major factor that affects many diurnal/circadian parameters, particularly during estrus (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4288375/). The authors should model sex in their analysis.

The method by which mutants were identified is not clearly explained (p-value, effect size, and variance are combined to find 5 affected mutants). The process is arbitrary.

For the machine-learning-based method developed here (SyncScreener), it is not stated how and where it will be distributed.

CAM-SU Genomic Resource Center is cited as receiving S1 File and S2 File. There is no information on how the reader can access this. If this is not a maintained data repository please deposit the data at other well curated databases.

Will the analysis performed here be part of the mousephenotype.org (impc) website? Given these are IMPC strains, it would be good to have this data available on this site for the community.

Specific comments –

Figure 1 – I didn’t find this figure useful. It outlines the flow of the paper but does not add any information. It also uses terms that are confusing and misleading.

For instance – “Discovery of genetic determinants” is genotype association. The terms primary and secondary screen makes it seem as if there were two levels of screening. This should state something like primary criteria and secondary criteria. “Validation” was done by regenerating mutant lines for wheel running or longer IC studies. This is not conveyed in the figure. Similarly, the figure legend lacks description.

In general, a figure outlining the process of IC at the various centers will be more useful. Emphasize the differences between the centers. Even though the protocols are available online, the paper should concisely state the detailed protocol at the centers and emphasize the differences (light cycle etc.) between the centers. If not in this figure, it should be a supplementary figure. This will help orient the reader on the challenges in analyzing this data.

Minor – typo in figure. “Cellecting” should probably be “collecting”.

Figure 2 –

A, B - The data is only shown for 18 hours (x-axis). The text does not clearly explain why. I’m assuming that this is the general length of the IC protocol. Most readers will expect a 24 hr LD bar. HMGU is padded on the right with 0; please explain this.

The data is sorted on the y axis (mice seem to be organized by peak phase), however, nothing is stated about this in the figure legend.

Page 9, line 180 states “moreover, these two parameters exhibited a stable phase relationship…” I’m not sure what is meant by a stable phase relationship. Do you mean the phase relationship is consistent across centers, within a center, or both? They certainly vary across centers – the difference between activity peak phase and onset varies from 1 hr to 3 hr. Please clarify these for the reader.

Line 197-199 describe LD conditions and are not in the methods. See earlier point about describing experimental differences between centers.

Figure 3 – why are the food intake patterns not shown for Fbxl3 and Zbtb20?

At which center was the validation data generated?

Figure 4 – this is a key figure that describes finding the mutants using the methods developed above. I find that a clear figure showing the distribution of the 5 final mutants is missing. I would like to see the phenotype(s) of these plotted against controls or other mutants. The “secondary screen” is actually just a secondary set of criteria that the original mutants are put through. Please explain the rationale for these – why are all lines that have a greater than 2 SD effect size selected? Why not use the p-value from the onset regardless of the effect size? The rationale for comparison is not clear. Effect size is used in the primary criteria, and then a combination of effect size, p-value (strict at 0.001), and variance (50% must have a significant phenotype) is used in the secondary criteria. This seems as if conditions were tweaked until an acceptable number of hits arose. This should be clarified.

Just to make sure that the baby is not thrown out with the bathwater - do the 88 genes that survive primary analysis show enrichment in certain pathways or gene ontologies? Do they show significant enrichment of human GWAS hits for chronotype phenotypes? There has been a slew of these papers.

Visualization point - The 3D barplots are very hard to interpret. I found myself drawing lines in 3D space to compare the right bars (https://guides.library.duke.edu/datavis/topten). Showing the data in multiple 2D histograms will be more interpretable. Small multiple plots are generally easier to interpret (https://www.r-bloggers.com/why-you-should-master-the-small-multiple-chart/ or https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003833 (Rule 8)).

Figure 5 – describes the phenotypes of the 5 selected mutants. Please provide a table similar to Figure 3G table for these mutants that clearly describes the effect size and pvalue (table S11 does not provide both stats).

How do these mutants compare in effect size and p-value to the validation mutants for the same phenotypes? The Slc7a11 phenotype seems to have a very low effect. It would be helpful to have E and M peaks clearly marked for control and mutant on these plots.

Figure 6 – The validation of Slc7a11 confirms that this is a very subtle mutant. The onset of activity is slightly disorganized, perhaps advanced. The wheel running DD data has no effect, when mutations with phase advance usually have shorter period. The wheel running actogram that is shown shows that there is slightly more daytime activity in the mutants. This lack of activity consolidation should lead to lowered circadian amplitude. Please check this.

Figure 7 – no comments.

Reviewer #3: This work takes data collected by the IMPC and analyses it for the timing of behavior. This is a wonderful idea, taking advantage of a resource that has many treasures still to be discovered and extending the usefulness of the massive amount of work that has gone into building this program. The validation of the candidate gene, Slc7a11, is also very nicely handled. I do have several comments which I think will make the article more useful to a general readership and boost the scholarship as well.

“However, how internal circadian clocks are entrained to changes in photoperiod remains unclear. “

This sentence implies that changes in photoperiod will be addressed. They are not. The authors rather use entrained phase as a phenotypic marker in their screen. Another level of complexity would be to screen on entrainment to different photoperiod.

“In addition, activity, feeding, temperature and glucocorticoid signals can also affect the circadian phase of the circadian clock [17-21]. These studies indicate that circadian entrainment is influenced at multiple regulatory levels.”

I think it would be better to say that these are all zeitgebers of the circadian clock and thus by definition, they will impart phase information on their target tissues. Not all of the papers cited actually discuss or systematically probe phase (though they do show phase shifting at least).

“We hypothesized that mouse mutants with impaired circadian entrainment would result in phase changes in locomotor activity and/or food intake behaviour under the light/dark cycle “

Why? The authors should put this hypothesis into a better context. The story of food entrainment – the one before the molecular era – is inspiring to all scientists! It is an opportunity to help non-chronobiologists understand that chronobiology is not just a time point in the dark and a time point in the light, that it is a robust machinery that regulates our behaviour and physiology systematically. Further, PLoS Genetics is a non-clocks journal and the concept (peripheral oscillators etc.) would benefit from a graphical explanation as well as a few words here.

“The onset time was defined as a transition from rest to steep activity/food intake, while the peak phase was the end of the transition.”

This is not clear. Which point on the transition? For dim light melatonin onset, many groups empirically decide which point of the onset (e.g. 25% level of the upslope of onset) to use.

„The results for the onset time and peak phase of activity and food intake for each centre were determined by SyncScreener (Figs 2C and 2D, S3 and S4 Tabl “

I think it would be worth graphing the peak and onset values for activity and food intake for the 5 centers. The data are shown here and we get an impression of the differences but we should see this graphed rather than these figures and the table in the suppl materials. Graphical representation would be helpful to appreciate the variance.

“We reasoned that the onset time was easily recognized by visual assessment due to an obvious steep ascension from an inactive state, whereas peak phase sometimes displayed as a plateau, which may lead to variability in peak phase identification.”

Therefore the onsets are used? It is not clear what the conclusion is here and if a decision was taken. If the peak is unstable (I have seen this also and also prefer onsets) a measure called Center of Gravity can also be used.

“Finally, to validate our predictive models and parameters, we used several known circadian mutant lines to evaluate the circadian parameters”

Two things here: Before going into this, we would like to know about the variance (beyond the nice heat maps) in the controls. How does this look in comparison to wheel running data in the same strain? I think it would be important to not just show the pooled data from the sites but to show the variation of all individuals. What is the onset with SD? Before looking at the mutants, we would like to know this.

How does this data on clock mutants compare to what is published?

“delayed onset and peak phase of both food intake and activity compared to wild-type mice at TCP (Figs 5C-5F), suggesting that circadian robustness might be impaired.“

I do not agree with this statement/conclusion. Can the authors back it up?

„Rhbdl1+/tm1a mutant mice also displayed vision defects“

Very nice observation showing the power of this screen to reveal entrainment mutants. (In Neurospora there have been mutants reported with no difference in period but rather in entrainment.)

“The Oxtrtm1a/tm1a mice exhibited a trend towards more daytime activity and food intake tendency”

Is this backed up somewhere with data?

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: No: the supplementary data 1 and 2 need to be deposited to a curated database. the code and neural network weights should be shared on github or similar repository.

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Decision Letter 1

Gregory S Barsh, Achim Kramer

19 Dec 2019

Dear Dr Xu,

We are pleased to inform you that your manuscript entitled "High-throughput discovery of genetic determinants of circadian misalignment" can be principally accepted for publication in PLOS Genetics pending that you deal with the one remaining minor concern of Reviewer #3.

In addition, before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional accept, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field.  This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about one way to make your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Achim Kramer

Associate Editor

PLOS Genetics

Gregory Barsh

Editor-in-Chief

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: I thank the authors for addressing my suggestions and revising the paper. It has much improved.

Reviewer #3: Thank you for the extensive revisions in response to the reviewer feedback. I have one single remaining objection, namely to the sentence " However, how internal circadian clocks are entrained to changes in photoperiod remains unclear." The paper does not address this issue. The addition of a skeleton photoperiod is a method to probe entrainment but it does not address alternative photoperiods. My impression is that entrainment was used to classify mutant phenotypes and I would suggest using this more encompassing term instead.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #3: No

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-19-01158R1

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Acceptance letter

Gregory S Barsh, Achim Kramer

3 Jan 2020

PGENETICS-D-19-01158R1

High-throughput discovery of genetic determinants of circadian misalignment

Dear Dr Xu,

We are pleased to inform you that your manuscript entitled "High-throughput discovery of genetic determinants of circadian misalignment" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Matt Lyles

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 File. Activity/rest scatter plots.

    (RAR)

    S2 File. Food intake scatter plots.

    (RAR)

    S3 File. SyncScreener.

    (RAR)

    S1 Data. The numerical data underlying graphs and summary statistics.

    (XLSX)

    S1 Table. Onset times of wild-type mice from visual assessment.

    (DOCX)

    S2 Table. Peak phases of wild-type mice from visual assessment.

    (DOCX)

    S3 Table. Onset times of wild-type mice from machine learning algorithm.

    (DOCX)

    S4 Table. Peak phases of wild-type mice from machine learning algorithm.

    (DOCX)

    S5 Table. Effect size and p value of known mutants (visual assessment).

    (DOCX)

    S6 Table. Number of mutant lines in each center.

    (DOCX)

    S7 Table. Mutant lines.

    (DOCX)

    S8 Table. Onset times and peak phases of mutant lines.

    (DOCX)

    S9 Table. Mutant lines existing in at least two centers.

    (DOCX)

    S10 Table. Mutant lines for the secondary criteria.

    (DOCX)

    S11 Table. Hits from secondary criteria.

    (DOCX)

    S12 Table. Phenotype associated assay.

    (DOCX)

    S1 Fig. Distribution of the onset time and peak phase in five IMPC centres identified by visual assessment.

    Histogram of the onset and peak phase results obtained from five IMPC centres (ICS, WTSI, RBRC, TCP and HMGU) under 12-hour light and 12-hour dark cycles. n = 2201 C57BL/6N mice for activity and n = 2160 C57BL/6N mice for food intake measured by indirect calorimetry over time. Pink column: onset time, red column: peak phase.

    (TIF)

    S2 Fig. Onset time and peak phase identification in synthetic training sets.

    (A) Mean data of food intake for wild-type mice from HMGU as an example. Red dots are averaged raw data and the multicoloured curve is the fitted curve. The portion of the data before the peak phase (data between the two red dashed lines) is fitted by the piecewise function in (B). The blue curve is the fitted curve. Onset is identified as the first data point entering the transition (green arrow), while the peak phase is at the end of the transition (blue arrow). Other data in (A) are fitted by Gaussian functions (green curves). (B) Piecewise function for fitting data points before the E peak phase. Dividing points of the two stages are indicated by two black arrows in (A) and (B).

    (TIF)
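The onset and peak definitions in the S2 Fig caption can be illustrated with a small sketch. The caption does not give the functional form of the piecewise function, so the code below assumes a continuous three-stage profile (flat rest level, straight-line transition, flat plateau) fitted by brute-force least squares. The function name `fit_rest_ramp_plateau` and the example values are hypothetical; this is not the SyncScreener implementation.

```python
from statistics import mean

def fit_rest_ramp_plateau(t, y):
    """Brute-force least-squares fit of an assumed rest -> ramp -> plateau profile.

    Returns (onset_index, peak_index): onset is the first sample inside the
    ramp, peak is the first sample on the plateau (end of the transition).
    """
    n = len(y)
    if n < 3 or len(t) != n:
        raise ValueError("need at least three paired samples")
    best = None
    for a in range(1, n - 1):          # a: first sample of the ramp
        rest = mean(y[:a])             # level of the flat rest stage
        for b in range(a + 1, n):      # b: first sample of the plateau
            top = mean(y[b:])          # level of the plateau stage
            sse = 0.0
            for k in range(n):
                if k < a:
                    pred = rest
                elif k >= b:
                    pred = top
                else:
                    # straight line from the last rest sample to the plateau
                    pred = rest + (top - rest) * (t[k] - t[a - 1]) / (t[b] - t[a - 1])
                sse += (y[k] - pred) ** 2
            if best is None or sse < best[0]:
                best = (sse, a, b)
    return best[1], best[2]

# Hypothetical hourly bins (ZT hours) and mean food intake, for illustration only.
zt = [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
intake = [0, 0, 0, 0, 2, 4, 6, 6, 6, 6]
onset_idx, peak_idx = fit_rest_ramp_plateau(zt, intake)
onset_zt, peak_zt = zt[onset_idx], zt[peak_idx]
```

The exhaustive search over both dividing points is quadratic in the number of bins, which is acceptable for hourly data covering a single day.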

    S3 Fig. Bland–Altman plots of the bias between the two methods, visual assessment and SyncScreener, for activity.

    The 95% limits of agreement (1.96 s.d.) were calculated to determine whether the SyncScreener could replace visual assessment. Activity onset data and peak phase activity data obtained by the five centres (ICS, WTSI, RBRC, TCP and HMGU).

    (TIF)

    S4 Fig. Bland–Altman plots of the bias between the two methods, visual assessment and SyncScreener, for food intake.

    Food intake onset data and peak phase data obtained by the five centres (ICS, WTSI, RBRC, TCP and HMGU).

    (TIF)
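The Bland–Altman comparison described for S3 and S4 Figs reduces to a mean difference (bias) and 95% limits of agreement at 1.96 s.d. of the paired differences. A minimal sketch follows, assuming paired per-mouse estimates from the two methods. The function name `bland_altman` and the numbers are illustrative assumptions, not values from the study.

```python
from statistics import mean, stdev

def bland_altman(visual, automated):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(visual) != len(automated):
        raise ValueError("paired measurements must have equal length")
    # Difference between the two methods for each mouse.
    diffs = [a - v for v, a in zip(visual, automated)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical onset times (ZT hours) for five mice, for illustration only.
visual_onset = [11.5, 12.0, 11.0, 12.5, 11.5]
screener_onset = [11.0, 12.0, 11.5, 12.0, 12.0]
bias, lower, upper = bland_altman(visual_onset, screener_onset)
```

If most paired differences fall inside the interval from `lower` to `upper` and the bias is close to zero, the two methods can be treated as interchangeable for that parameter.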

    S5 Fig. Daytime activity and food intake of Oxtrtm1.1/tm1.1 mice.

    (A and B) Ratios of daytime activity (A) and food intake (B) to that at night were calculated using data from Oxtrtm1.1/tm1.1 and wild-type mice data from RBRC. Two-way ANOVA was employed to test the statistical significance. ****: P < 0.0001.

    (TIF)

    S6 Fig. Generation of Slc7a11tm1a/tm1a mice.

    (A) Schematic of knockout strategy for Slc7a11 based on knockout-first design. (B) Forward and reverse primers for genotyping. (C) PCR analysis of tail genomic DNA for wild-type and Slc7a11tm1a/tm1a alleles in wild-type, heterozygous and homozygous knockout mice.

    (TIF)

    S7 Fig. Profiles of energy expenditure parameters using a CLAMS.

    (A) Food intake; (B) the volume of CO2 (VCO2); (C) the volume of O2 (VO2); (D) heat. Rhythms of food intake, VCO2, VO2 and heat were plotted over a 72-hr time frame as the mean ± SEM (n = 10). Two-way ANOVA was used to determine the statistical significance. *: P < 0.05.

    (TIF)

    S8 Fig. Reentrainment of Slc7a11tm1a/tm1a mice to a new light-dark cycle.

    (A) Representative actograms of wheel-running activity of wild-type and Slc7a11tm1a/tm1a mice subjected to a 6-hr phase advance and a 6-hr phase delay of the LD cycle. At day 22, the recording was disrupted for about 24 hours. (B and C) Re-entrainment traces of phase advance (B) and delay (C) of wild-type (blue) and Slc7a11tm1a/tm1a (red) mice. n = 9 for wild-type mice, n = 5 for Slc7a11tm1a/tm1a mice. Two-way ANOVA was employed to test the statistical significance. *: P < 0.05.

    (TIF)

    S9 Fig. Masking of wild-type and Slc7a11tm1a/tm1a mice during LD 3.5:3.5.

    (A and B) Representative actograms of daily wheel-running activity of wild-type (A) and Slc7a11tm1a/tm1a mice (B). Light phases are indicated in yellow to show the structure of the LD 3.5:3.5 cycle and to help visualize the occurrence of wheel-running activity under this schedule. (C) Masking ratios of wild-type and Slc7a11tm1a/tm1a mice during LD 3.5:3.5, calculated by dividing total activity during light phases by that during dark phases. n = 7 for each genotype. Two-way ANOVA was employed to test the statistical significance. n.s.: P > 0.05.

    (TIF)
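
    The masking ratio in S9 Fig (C) is total activity during light phases divided by total activity during dark phases. The following is a minimal sketch of that ratio for binned activity counts, not the authors' code. It assumes that time is measured in hours from the first lights-on, that the schedule starts with a light phase, and that each bin is assigned to a phase by its start time; the names `is_light` and `masking_ratio` and the example counts are hypothetical.

```python
def is_light(t_hours, light_h=3.5, dark_h=3.5):
    """True if time t (hours since the first lights-on) falls in a light phase."""
    return (t_hours % (light_h + dark_h)) < light_h


def masking_ratio(times_h, counts, light_h=3.5, dark_h=3.5):
    """Total activity in light phases divided by total activity in dark phases."""
    light_total = 0
    dark_total = 0
    for t, c in zip(times_h, counts):
        if is_light(t, light_h, dark_h):
            light_total += c
        else:
            dark_total += c
    if dark_total == 0:
        raise ValueError("no activity recorded in dark phases")
    return light_total / dark_total


# Hypothetical example: 30-min bins over two full LD 3.5:3.5 cycles (14 h),
# with 10 wheel revolutions per bin in the light and 40 per bin in the dark.
times = [i * 0.5 for i in range(28)]
counts = [10 if is_light(t) else 40 for t in times]

print(masking_ratio(times, counts))
```

    Under this definition, a ratio well below 1 indicates that activity is suppressed during the light phases of the cycle.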

    S10 Fig. Expression profiles of the core clock genes in the liver tissues.

    Error bars represent the s.d. for each time point from three independent biological replicates. Two-way ANOVA was employed to test the statistical significance. n.s.: P > 0.05.

    (TIF)

    Attachment

    Submitted filename: Responses 20191108FF.docx

    Data Availability Statement

    The data underlying the results presented in this study are available from the IMPC consortium (https://www.mousephenotype.org/help/api-access/) or the Cambridge-suda genomic resource center (http://gofile.me/2F1pE/RP2URKxV2). Numerical data underlying graphs or summary statistics are provided in spreadsheet form as Supporting Information.


    Articles from PLoS Genetics are provided here courtesy of PLOS
