JMIR Formative Research. 2025 Nov 14;9:e81107. doi: 10.2196/81107

Sleep and Activity Patterns as Transdiagnostic Behavioral Biomarkers in Psychiatry: Longitudinal Observational Study From the DeeP-DD Study

Dylan Hamitouche 1,2, Tihare Zamorano 1,3, Youcef Barkat 1,4, Deven Parekh 1, Lena Palaniyappan 1,5,6,7, Sara Jalali 1, David Benrimoh 1,5
Editors: Alicia Stone, Amaryllis Mavragani
PMCID: PMC12617961  PMID: 41237332

Abstract

Background

Despite widespread use of symptom rating scales in psychiatry, these tools are limited by reliance on self-report, infrequent administration, and lack of predictive power. This constrains clinicians’ ability to monitor illness trajectories or anticipate adverse outcomes like relapse. Actigraphy, a passive wearable-based method for measuring sleep and physical activity, offers objective, high-resolution behavioral data that may better reflect symptom fluctuations. Prior research has shown associations between actigraphy features and mood or psychosis symptoms, but most studies have focused on narrow diagnostic groups or fixed time windows, limiting clinical translation.

Objective

This study aims to examine whether actigraphy-derived sleep and activity features correlate with psychiatric symptom severity in a transdiagnostic psychiatric sample, and to identify which features are most clinically relevant across multiple temporal resolutions.

Methods

We present a feasibility case series study analyzing preliminary data from 8 outpatients (ages 18‐30 years) enrolled in the Deep Phenotyping and Digitalization at Douglas (DeeP-DD) study, a prospective transdiagnostic study of digital phenotyping. Participants wore wrist-based actigraphy devices (GENEActiv) for up to 5 months. Symptom severity was measured using a variety of self- and clinician-rated scales. We performed intraindividual Spearman correlations and interindividual repeated measures correlations across daily, weekly, monthly, and full-duration averages.

Results

Intraindividual analyses revealed that later rise times were significantly associated with higher weekly 9-item Patient Health Questionnaire (PHQ-9) scores in participant 7 (ρ=0.74, P<.001) and participant 4 (ρ=0.78, P=.02), as well as higher weekly 7-item General Anxiety Disorder (GAD-7) scores in participant 7 (ρ=0.59, P=.03). While similar trends were observed at daily and monthly timescales, the weekly resolution yielded the most robust significance. Interindividual analyses showed that weeks with later average rise time correlated with higher PHQ-9 (r=0.48, P<.001) and GAD-7 scores (r=0.38, P=.03), with the PHQ-9 association remaining significant after Bonferroni correction (Bonferroni-corrected P=.02). Increased light physical activity was linked to lower PHQ-9 scores weekly (r=−0.44, P=.001) and monthly (r=−0.53, P=.01). Over the whole duration of the study, increased levels of sedentary activity were associated with higher GAD-7 scores (ρ=1.00; P<.001).

Conclusions

Our findings highlight actigraphy-derived sleep and activity features, particularly rise time and physical activity, as promising transdiagnostic markers of psychiatric symptom burden. Their consistent associations across temporal scales and diagnostic groups underscore their potential utility for scalable, real-world clinical monitoring. Future work should validate these findings in larger cohorts and explore advanced analytical methods to capture circadian rhythmicity and symptom dynamics more precisely.

Introduction

Despite decades of research into the neurobiology of various mental illnesses, there remains a lack of clinically reliable tools for identifying, monitoring, and staging most psychiatric conditions [1-4]. In major depression, best practices involve standardized rating scales as part of measurement-based care, which can help patients achieve remission more quickly [5-8]. These scales are clinically useful for tracking symptom severity and guiding treatment decisions. However, there are no widely accepted protocols for monitoring patients post-remission, and in severe mental illnesses such as schizophrenia, patient self-report is less reliable, while rating scales add to clinicians’ workload [9-12]. Moreover, rating scales may fail to capture what matters most to patients [13]; for instance, the commonly used 9-item Patient Health Questionnaire (PHQ-9) [14,15] does not distinguish between hypersomnia and insomnia. While rating scales are helpful for assessing illness status or guiding treatment adjustments, they are seldom used to predict events such as rehospitalization. This may be due in part to their limited predictive utility, as well as their infrequent and inconsistent use in clinical settings. This gap in our ability to anticipate and prevent adverse outcomes highlights the need for more reliable, actionable predictors in clinical care.

In response, and in parallel with the rise of smartphones and wearables, interest has grown in “digital biomarkers” [16,17]—sensor-derived data collected passively by patient devices. Among those, digital monitoring and predictive biomarkers often use both smartphones and wearables, capturing data such as app usage, communication, location, light, movement, sleep, activity, and heart rate [18,19]. These tools have been used to infer mood states or behavior by combining multiple data streams—for example, call logs and GPS for sociability and mobility [19]. Integrating passive (sensor) and active (questionnaire) digital tools with clinical, historical, and biological data is one approach to “deep phenotyping”—a detailed characterization of patients. Among these tools, actigraphy, a wearable-based method for measuring rest-activity cycles, has emerged as particularly relevant. It provides objective, continuous data on sleep and activity, which are critical in conditions like bipolar disorder [20,21] and are often poorly captured by self-report [22].

Actigraphy has shown clinical relevance as a monitoring and predictive tool by detecting changes linked to symptom trajectories across psychiatric disorders. In schizophrenia, sleep disturbances captured by actigraphy may worsen symptoms via mood and attention-related pathways [23-25]. Studies show daily fluctuations in sleep and activity track symptoms, with physical activity linked to improved same-day mood but sometimes worse next-day symptoms [26]. In mood disorders, machine learning models using actigraphy can predict next-day depressive or manic episodes [27]. Depressed patients show reduced daytime activity, with age-specific sleep difficulties suggesting circadian disruption as a core mechanism [28-30]. Finally, in bipolar disorder, sleep and activity disturbances persist even during euthymia and are tied to episode onset, severity, and long-term outcomes [31-33].

Despite their promise, existing studies of actigraphy-derived sleep and activity features have often examined narrow diagnostic groups or short, fixed observation windows, limiting generalizability [34,35]. This limitation is significant given the multitude of available actigraphy measures, making it essential to identify which features demonstrate reliability and clinical relevance across diagnostic categories to optimize data collection and enhance clinical interpretation. Furthermore, since actigraphy data can be examined over multiple temporal scales, from days to months, understanding which measures maintain their utility across these timescales is critical. For instance, measures that are only informative over extended periods restrict their applicability for early or real-time clinical inference, as a substantial accumulation of data would be necessary before meaningful conclusions can be drawn. Addressing these limitations will improve the clinical utility and efficiency of actigraphy-based monitoring in psychiatric research and clinical practice.

This gap in the literature motivates the exploration of actigraphy-derived sleep and physical activity features as candidate monitoring and predictive biomarkers that transcend diagnostic categories and temporal scales. As part of the Deep Phenotyping and Digitalization at Douglas (DeeP-DD) project, a prospective transdiagnostic study aiming to identify clinically useful, scalable, and interpretable digital markers of mental illness, we present a case series study aiming to demonstrate the feasibility of actigraphy-based monitoring across diagnostic categories and temporal scales. We report data completeness rates, highlight the relevance of actigraphy for tracking subjective mood within and between individuals, and explore early signals of clinical relevance across multiple timescales in a real-world, diagnostically heterogeneous sample. Identifying robust, objective markers of illness could support clinical psychiatry by reducing reliance on disorder-specific tools and providing utility even when diagnostic clarity is limited, such as in youth mental health populations.

Methods

Overview

DeeP-DD is an ongoing feasibility study at the Douglas Mental Health University Institute in Montreal, Canada. Its objective is to test the feasibility and acceptability of multimodal digital phenotyping in a realistic transdiagnostic clinical population, including individuals with early psychosis, schizophrenia or schizoaffective disorder, bipolar disorder, anxiety, unipolar depression, and other conditions such as personality or substance use disorders. The study evaluates a wide range of digital tools, including actigraphy, smartphone sensors, ecological momentary assessments, and self-report questionnaires, and aims to develop interpretable “return of results” clinical reports to support personalized care.

DeeP-DD is designed as a feasibility and demonstration study rather than to test specific quantitative hypotheses. The planned target sample size is approximately 50 participants per diagnostic group (psychosis, bipolar disorder, depression, and personality or addiction disorder), for a total of about 200 participants. This sample size was chosen to enable assessment of transdiagnostic feasibility and data completeness across multiple modalities. As data collection progresses, analyses across all DeeP-DD components will inform the required sample size for a future interventional trial evaluating the impact of a combined digital phenotyping intervention and personalized feedback report. In this study, we focus on preliminary actigraphy and questionnaire data from the initial pilot phase, as these modalities were available for most participants.

Ethical Considerations

The study was approved by the research ethics board of the West-Central Montreal health authority (Project No. 2023‐816) and conducted in line with the Declaration of Helsinki and the Tri-Council Policy Statement. Participants met with researchers who explained the study, and all participants provided written informed consent prior to enrollment. Those who agreed to provide actigraphy data were given a GENEActiv wristband (Activinsights, Cambridge, UK) and instructions on its use. Participant compensation was explicitly tied to engagement with study procedures to encourage adherence and sustained participation: participants received CAD $5 (US $3.55) per week for consistently wearing the actigraphy watch and CAD $7 (US $4.96) per week for completing the questionnaires. Although modest and fairly standard in research designs, this compensation would not be available in clinical settings, which the reader should keep in mind when interpreting our results, especially with respect to data contribution rates. All data were deidentified and stored on secure servers, accessible only to authorized study personnel. No clinical care decisions were influenced by participation.

Participant Recruitment

Participants are outpatients recruited from the Clinical High-Risk for Psychosis Clinic, First-Episode Psychosis Clinic, and the Bipolar Disorder Clinic; 1 patient with a primary personality disorder diagnosis was recruited from the general Neuropsychiatry Clinic. The study was advertised by clinicians directly to their patients during routine clinical visits and through posters displayed in the clinics. Participants are referred to the study by their clinicians or self-referred to the study after seeing study advertisements at their clinic.

Data Collection

Patient- and clinician-rated questionnaire data were collected using the REDCap platform [36]. For every participant, a medical chart review was performed at the Douglas Mental Health University Institute assessing overall symptomatology, medication use, and other clinically relevant information. Actigraphy data were extracted from the GENEActiv wristbands using GENEActiv PC Software (version 3.3; ActivInsights) [37], a validated approach to monitor sleep in adults [31-33]. The REDCap questionnaires included in the analyses were the PHQ-9 [14], which assesses depressive symptoms; 7-item General Anxiety Disorder (GAD-7) [38], which evaluates anxiety severity; Clinical Global Impression - Severity (CGI-S) [39], which measures overall illness severity; Scale for the Assessment of Positive Symptoms (SAPS) [40], which examines psychotic symptoms; and Scale for the Assessment of Negative Symptoms (SANS) [41], which assesses deficits in normal emotional and behavioral functioning. The SAPS, SANS, and CGI-S were clinician-rated, while the PHQ-9 and GAD-7 were self-reported. The PHQ-9 and GAD-7 were administered weekly, the SAPS and SANS monthly, and the CGI-S at each clinical visit. Actigraphy data were inaccessible to participants and clinicians during the monitoring month but were made available to both after each monthly monitoring period concluded. As this was a real-world clinical feasibility study, both the timing of clinician-administered questionnaires and return of results depended on clinical follow-up schedules and as such were subject to variation in timing.

Actigraphy Data Processing

Sleep and physical activity features were extracted from the raw movement data using GENEActiv default R markdown analysis tools [42]. Participants wore the actigraphy device for 1 month at a time (the length of a single charge), after which it was replaced with a newly charged device. The sleep features extracted and included in the analyses were as follows: total sleep time, sleep efficiency (ie, time spent asleep divided by time spent in bed), number of active periods per night, median length of those active periods, sleep onset time, rise time, day-to-day sleep onset time variability, and day-to-day rise time variability. The GENEActiv PC Software provided the daily step count and activity modes classified as “sedentary,” “light physical activity,” “moderate physical activity,” and “vigorous physical activity” [37]. We excluded vigorous activity from our analyses, as participants spent very little time in this mode (often less than 5 min per day), and these brief episodes were likely due to artifacts or misclassification rather than true vigorous exertion. We also included the daily duration of actigraphy nonwear in our analyses, as these periods may reflect participant behavior or symptom fluctuations and could provide meaningful insights into health status [43].

No filtering was performed on these features to maintain a straightforward preprocessing pipeline and preserve all available data, reflecting the exploratory nature of our analyses and the intent to apply these data in routine clinical practice using existing analysis tools. Due to the limited sample size, missing data were not imputed to avoid introducing bias. Actigraphy features and questionnaire scores were averaged by time unit (daily, weekly, monthly, and overall duration of study) to enable consistent comparisons between behavioral data and psychiatric outcomes across multiple time scales. Days with no data output despite the participant wearing the watch were excluded, as these were considered to result from sensor malfunction. To simplify preprocessing, enhance translatability, and given high participant compliance, we considered a day valid if any actigraphy data were recorded, and a week or month valid if it contained at least 1 such day. The results from sensitivity analyses using stricter thresholds (≥3 or ≥4 valid days for a valid week, and ≥10 or ≥15 valid days for a valid month) are available in Multimedia Appendix 1.
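The aggregation and validity rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's pipeline: the function name `weekly_means`, the use of study weeks counted from day 1, and the convention of marking missing days as `None` are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def weekly_means(daily, min_valid_days=1):
    """Average a daily actigraphy feature by study week.

    daily: dict mapping study day (1-based int) to a feature value, or None
           when no data were recorded that day (invalid day, never imputed).
    min_valid_days: hypothetical threshold; 1 mirrors the lenient rule in the
           text, 3 or 4 mirrors the stricter sensitivity analyses.
    Returns a dict mapping study week (1-based) to the mean over valid days.
    """
    by_week = defaultdict(list)
    for day, value in daily.items():
        if value is None:  # skip invalid days instead of imputing them
            continue
        by_week[(day - 1) // 7 + 1].append(value)
    return {
        week: mean(values)
        for week, values in sorted(by_week.items())
        if len(values) >= min_valid_days
    }


# Illustrative rise times in hours after midnight (made-up values).
rise_time = {1: 8.0, 2: 9.0, 3: None, 8: 10.0}
lenient = weekly_means(rise_time)                    # {1: 8.5, 2: 10.0}
strict = weekly_means(rise_time, min_valid_days=3)   # {}
```

With the lenient rule both weeks are kept; with a 3-day threshold neither week qualifies, which shows how stricter thresholds reduce the number of data points available for correlation.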

Statistical Analyses

We conducted analyses at the intraparticipant level to capture how actigraphy features relate to psychiatric symptoms within each individual over time. For each participant, we computed Spearman rank correlations between actigraphy features and questionnaire scores at daily, weekly, and monthly resolutions, but only if at least 5 data points were available to ensure sufficient data for a stable and interpretable correlation estimate. This nonparametric method was chosen due to its robustness to nonlinear associations and ordinal or skewed data distributions. Correlations were computed independently for each participant and time scale. Bonferroni correction was applied across all tests to control for multiple comparisons.
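As a rough sketch of this procedure, the code below computes a Spearman coefficient for one participant only when at least 5 paired observations exist, and applies a Bonferroni adjustment to a list of P values. It uses only the Python standard library so that it is self-contained; in practice a function such as `scipy.stats.spearmanr` would also supply the P value. The helper names and the choice to cap adjusted P values at 1 are assumptions of the example, not details reported by the study.

```python
import math
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y, min_points=5):
    """Spearman rho on paired data; None if too few pairs or no variance."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if len(pairs) < min_points:
        return None
    rx = average_ranks([a for a, _ in pairs])
    ry = average_ranks([b for _, b in pairs])
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

For example, `spearman_rho([8.0, 8.5, 9.0, 9.5, 10.0], [4, 6, 7, 11, 15])` returns 1.0 for a perfectly monotonic week-by-week relationship, whereas the same call with only 4 pairs returns `None` and the test is skipped.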

Then, we performed analyses at the interparticipant level to identify consistent associations across the sample, accounting for repeated measures within individuals [44]. To do so, we used repeated measures correlation (Pingouin package, Python) [45] to assess associations between daily, weekly, and monthly actigraphy features and questionnaire scores across individuals, while accounting for nonindependence of within-subject data. Additionally, we computed Spearman correlations across participants using data aggregated over the full study period to capture stable between-subject associations and overall group-level trends not dependent on repeated measurements. To ensure reliable estimation, only correlations with at least 10 data points for daily analyses and 4 for weekly and full-study summaries were included, as interindividual analyses provided more data than intraindividual ones. These thresholds were chosen to balance data availability and reliability.
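To make the idea behind repeated measures correlation concrete, the sketch below removes each participant's own mean from both variables and then correlates the centered values, which yields the same coefficient as the common-slope model underlying the method. The study itself used Pingouin's `rm_corr`; this standard-library version, its function name `rm_corr_sketch`, and the input layout are assumptions chosen only to keep the example self-contained, and it omits the P value and confidence interval.

```python
import math
from collections import defaultdict


def rm_corr_sketch(rows):
    """Repeated measures correlation from (subject, x, y) rows.

    Each subject's mean is subtracted from that subject's x and y values,
    so between-subject differences in baseline do not drive the estimate.
    Returns (r, degrees_of_freedom) with dof = observations - subjects - 1.
    """
    groups = defaultdict(list)
    for subject, x, y in rows:
        groups[subject].append((x, y))

    cx, cy = [], []
    for obs in groups.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        cx.extend(x - mean_x for x, _ in obs)
        cy.extend(y - mean_y for _, y in obs)

    sxy = sum(a * b for a, b in zip(cx, cy))
    sxx = sum(a * a for a in cx)
    syy = sum(b * b for b in cy)
    r = sxy / math.sqrt(sxx * syy)
    dof = len(cx) - len(groups) - 1
    return r, dof


# Made-up data: two participants with very different baselines but the
# same within-person relationship between weekly rise time and PHQ-9.
rows = [
    ("A", 1, 2), ("A", 2, 3), ("A", 3, 4),
    ("B", 10, 1), ("B", 11, 2), ("B", 12, 3),
]
r, dof = rm_corr_sketch(rows)
```

In this toy data set the within-person association is perfect even though participant B has later rise times and lower scores overall, which is the situation in which pooling all observations into a single ordinary correlation would be misleading.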

Results

Description

Data were collected from 8 participants, yielding a combined total of 33 months of actigraphy recordings. Participants were aged 18-30 years, included both males and females, and represented 3 primary psychiatric diagnoses with various comorbidities (Table S1 in Multimedia Appendix 1). Most participants met criteria for a first episode of psychosis, except for participant 5, who had a primary personality disorder, and participant 14, who had a primary bipolar disorder. Primary diagnoses included schizophrenia spectrum disorders (eg, schizophrenia, schizoaffective disorder), bipolar disorder, and personality disorder. Comorbid conditions commonly included depression, anxiety, attention-deficit/hyperactivity disorder, posttraumatic stress disorder, and personality disorders. Participants differed in clinical severity and care utilization: participants 4 and 8 had histories of hospitalization and multiple emergency room visits, while others (eg, participants 1, 5, and 13) had no acute care history. Medication use ranged from none (participant 1) to complex regimens involving antipsychotics, mood stabilizers, antidepressants, and stimulants (eg, participants 5 and 8) (Table S1 in Multimedia Appendix 1). Participants 1 and 6 did not consent to complete self-reported questionnaires.

The mean number of days of actigraphy data collected per participant was 110 (SD 43), ranging from 58 to 156 days. A total of 57 days were missing due to sensor malfunction, distributed across 3 participants (5, 7, and 14). Any additional missing days were due to the device being rotated among participants. Compliance with wearing the GENEActiv sensor was very high (97.27% of days showed no nonwear time; mean 1.58 min, median 0 min, SD 22.4 min). Across all recorded days of actigraphy (n=843), 820 showed no nonwear time, 23 had some nonwear time, and only 5 exceeded 1 hour of nonwear (maximum 9 h). On average, a valid week included 6.16 (median 7.00, SD 1.65, range 1‐7) days of data, while a valid month included 19.65 (median 22.00, SD 9.50, range 3‐31) days. Data missingness is reported in Figure 1 and Table 1.

Figure 1. Visualization of actigraphy compliance and questionnaire completion across participants. Each green box represents a day with completed data. In the actigraphy panel, a color gradient indicates compliance (with green representing perfect wear and red indicating high nonwear of the GENEActiv device), while gray areas indicate days without data, which may result from sensor malfunction, participants returning the device, or the device being unavailable because of maintenance or reassignment to another participant given the limited number of devices. Day 1 marks the start of wristband use for each participant, and the final day corresponds to the last day of actigraphy recording across all participants. Participant numbers on the y-axis are not sequential, as they represent individual patient IDs. CGI-S: Clinical Global Impression–Severity; GAD-7: 7-item General Anxiety Disorder; PHQ-9: 9-item Patient Health Questionnaire; SANS: Scale for the Assessment of Negative Symptoms; SAPS: Scale for the Assessment of Positive Symptoms.

Table 1. Summary of actigraphy adherence and questionnaire completion.

| Measure | Mean (SD) | Min | Max |
| --- | --- | --- | --- |
| Days of actigraphy | 110.1 (40.4) | 58 | 156 |
| Missing days of actigraphy | 7.1 (11.5) | 0 | 35 |
| Completed actigraphy (%) | 95.4 (7.5) | 78 | 100 |
| PHQ-9^a completed | 9.8 (11.0) | 0 | 29 |
| GAD-7^b completed | 7.5 (9.5) | 0 | 25 |
| CGI-S^c completed | 2.8 (1.2) | 0 | 4 |
| SAPS^d or SANS^e completed | 2.1 (0.8) | 1 | 3 |

a PHQ-9: 9-item Patient Health Questionnaire.
b GAD-7: 7-item General Anxiety Disorder.
c CGI-S: Clinical Global Impression–Severity.
d SAPS: Scale for the Assessment of Positive Symptoms.
e SANS: Scale for the Assessment of Negative Symptoms.

Intraparticipant Patterns

We analyzed same-day associations between questionnaire scores and actigraphy features within participants using Spearman correlations. In participant 7, higher GAD-7 scores were associated with a later sleep onset the night before (ρ=0.54, P=.04) and later same-day rise time (ρ=0.53, P=.04). However, these daily associations did not remain significant after Bonferroni correction. Figure 2 summarizes all significant associations established in this study.

Figure 2. Heatmap of significant associations between actigraphy features and clinical questionnaires across multiple time scales at both intra- and interparticipant levels. “+” and “–” indicate the direction of the correlation. Cells with bold borders represent features with significant associations at multiple time scales (or across different participants for intraindividual analyses). Each cell displays only the most significant time scale. CGI-S: Clinical Global Impression–Severity; GAD-7: 7-item General Anxiety Disorder; PHQ-9: 9-item Patient Health Questionnaire; SANS: Scale for the Assessment of Negative Symptoms; SAPS: Scale for the Assessment of Positive Symptoms.

On a weekly time scale, we found that for participant 4 (patient with a schizoaffective disorder and a personality disorder), a later rise time was associated with higher PHQ-9 scores, although the association did not survive Bonferroni correction (ρ=0.8; uncorrected P=.02; Bonferroni-corrected P=.3). For participant 7 (patient with bipolar disorder, personality disorder, and a first-episode psychosis), a later rise time was associated with both higher PHQ-9 (ρ=0.74; P<.001) and GAD-7 scores (ρ=0.59; P=.03), although only the association with PHQ-9 remained significant after correction (P=.007). For participant 5 (patient with a primary personality disorder), more time spent in moderate physical activity was associated with higher PHQ-9 scores (ρ=1.0; P<.001); see the Discussion. This association is clinically plausible, as the participant’s clinical chart noted that they tended to go on long walks in the evening when feeling depressed. The very high level of significance reflects the perfect correlation over a limited number of overlapping data points for this participant (5 wk), a situation in which correlation coefficients can be inflated.
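The caution about perfect correlations over very few points can be illustrated with an exact permutation calculation. The sketch below assumes untied data and enumerates every possible ordering of n ranks, counting how many give a Spearman coefficient at least as extreme as the observed one. The function name is hypothetical, and this is not the method reported by the study; statistical software often relies on large-sample approximations, which can return much smaller P values when ρ equals 1.

```python
from itertools import permutations


def exact_spearman_p(n, observed_rho):
    """Two-sided exact permutation P value for Spearman rho, n untied points."""
    base = range(1, n + 1)
    denom = n * (n * n - 1)
    hits = total = 0
    for perm in permutations(base):
        d2 = sum((a - b) ** 2 for a, b in zip(base, perm))
        rho = 1 - 6 * d2 / denom  # rank-difference formula, valid without ties
        total += 1
        if abs(rho) >= abs(observed_rho) - 1e-12:
            hits += 1
    return hits / total


# With 5 weekly observations, only 2 of the 120 orderings reach |rho| = 1.
p_five_weeks = exact_spearman_p(5, 1.0)
```

With 5 points, the smallest two-sided exact P value attainable is 2/120 (about .017), so a perfect rank correlation over 5 weeks carries far less evidential weight than the coefficient alone suggests.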

Participant 7 showed that later average monthly rise time was associated with higher monthly average GAD-7 score (ρ=0.81; P=.05), while later average monthly sleep onset was associated with higher monthly average PHQ-9 scores (ρ=0.77; P=.04). None of these correlations remained significant after Bonferroni correction.

In summary, several associations emerged between mood scores and sleep timing or physical activity. After correction, only weekly associations involving rise time and PHQ-9, and moderate activity and PHQ-9 (which was most likely idiosyncratic to participant 5), remained significant—highlighting rise time and activity patterns as potential behavioral markers of mood (Figure 3).

Figure 3. Significant intraindividual associations between actigraphy features and questionnaire scores across time scales. Asterisk (*) indicates fewer than 10 data points; Spearman correlation may be less reliable [46]. PHQ-9: 9-item Patient Health Questionnaire; GAD-7: 7-item General Anxiety Disorder.

Interparticipant Trends

Same-day associations at the interindividual level showed that longer total sleep duration the night before was associated with higher same-day symptom severity (r=0.30; P=.03). Increased rise-time variability, defined as the absolute difference in rise time between consecutive days, was correlated with higher GAD-7 scores (r=0.39, P=.02). However, none of the same-day associations survived correction for multiple comparisons.
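The rise-time variability definition given here translates directly into code. The following is a minimal sketch assuming rise times are stored as minutes after midnight and keyed by study day; the function name and the decision to leave variability undefined when the previous day has no data are assumptions of the example.

```python
def rise_time_variability(rise_minutes):
    """Absolute change in rise time from the previous day, in minutes.

    rise_minutes: dict mapping study day (int) to rise time in minutes
                  after midnight. A day is included in the result only if
                  the immediately preceding day also has a rise time.
    """
    return {
        day: abs(minutes - rise_minutes[day - 1])
        for day, minutes in sorted(rise_minutes.items())
        if day - 1 in rise_minutes
    }


# Made-up example: 08:00, 08:30, 07:45, then a gap before day 5 (08:20).
variability = rise_time_variability({1: 480, 2: 510, 3: 465, 5: 500})
```

Here days 2 and 3 receive values of 30 and 45 minutes, while day 5 is skipped because day 4 has no recording.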

Interestingly, the intraparticipant associations between rise time, sleep onset time, and questionnaire scores remained consistent in our interparticipant repeated measures correlation analyses conducted at a weekly time scale, such that weeks with a later average rise time were associated with a higher GAD-7 (r=0.38; P=.03) and PHQ-9 score (r=0.49; P<.001). Furthermore, weeks of increased time spent doing light physical activity were associated with lower PHQ-9 (r=−0.44; P=.001). Only the association between PHQ-9 scores and rise time remained significant after Bonferroni correction (P=.02).

On a monthly time scale, increased light physical activity was associated with lower PHQ-9 scores (r=−0.53; P=.01). Later average monthly rise time was associated with both lower average CGI-S (r=−0.68; P=.05) and higher total SAPS (r=0.96; P=.04). The association with lower CGI-S appears contradictory to the rest of the study’s findings linking later rise time with worse outcomes, but this is likely driven by participant clustering, specifically participant 4, who had consistently later rise times but low CGI-S scores. However, none of the monthly associations remained significant after Bonferroni correction.

When comparing participants’ average questionnaire scores and actigraphy features collected over the study period (ranging from 1 to several months, depending on each participant’s enrollment date), Spearman correlation analyses showed that participants with a later average sleep onset had higher average PHQ-9 scores (ρ=0.90; P=.04). Participants with more average daily time spent in sedentary activity had higher average PHQ-9 (ρ=0.90; P=.04) and GAD-7 scores (ρ=1.00; P<.001); however, after Bonferroni correction, only the GAD-7 association remained significant (P<.001), though this unusually strong correlation should be interpreted with caution given the small sample size (Figure 4).

Figure 4. Significant interindividual associations between actigraphy features and questionnaire scores across time scales. Repeated measures correlations were conducted at the daily, weekly, and monthly time scales; Spearman correlations were used for overall averages across the full study duration. CGI-S: Clinical Global Impression–Severity; GAD-7: 7-item General Anxiety Disorder; PHQ-9: 9-item Patient Health Questionnaire; SANS: Scale for the Assessment of Negative Symptoms; SAPS: Scale for the Assessment of Positive Symptoms.

Sensitivity analyses using stricter validity thresholds (≥3 or ≥4 valid days per week, and ≥10 or ≥15 valid days per month) were conducted to assess the robustness of our findings and are available in Multimedia Appendix 1. These analyses showed similar patterns of associations, though with reduced statistical significance, as expected with decreased data availability and statistical power.

In summary, findings from interparticipant analyses aligned with intraparticipant results, showing consistent trends where symptom improvement was linked to earlier rise times, longer sleep duration, and increased time spent doing light physical activity, while symptom worsening was associated with increased sedentary behavior and delayed sleep onset. These associations were stronger at the weekly level, with some remaining significant after correction for multiple comparisons.

Discussion

This case series, focused on the feasibility of the use of actigraphy in a realistic clinical setting, explored the relationship between sleep and activity patterns and psychiatric symptom severity across multiple diagnostic categories and time scales. Despite our small and heterogeneous sample, several meaningful intra- and interindividual associations emerged. Notably, sleep timing (particularly rise time) showed consistent associations with mood scores at the individual level, but also at the group level and at different time scales. While these results align with prior evidence linking delayed sleep-wake cycles to mood disorders and psychosis, our study is novel in showing these associations within a realistic transdiagnostic population and highlighting their persistence across different temporal scales [47,48]. This holds clinical significance, as it underscores the potential of sleep timing as a modifiable biomarker and intervention target for improving mood symptoms across a range of psychiatric disorders.

At the individual level, higher depressive and anxious symptoms were consistently associated with delayed circadian rhythm, observed at both daily and weekly time scales. For participant 5, whom we know clinically to become more agitated when depressed, the presence of worse depressive symptoms during weeks of increased moderate physical activity likely reflects this pattern. This highlights the importance of accounting for individual clinical context when interpreting behavioral data. Interindividual analyses echoed these findings, showing that weeks with more severe depressive and anxious symptoms were also characterized by later rise times. The persistence of these associations after correction in this small, heterogeneous sample strengthens the case for delayed sleep-wake phase (ie, a regular sleep schedule that is considerably later than the conventional or desired time) as a transdiagnostic marker of psychiatric symptom burden. This characteristic has been previously linked with worse depressive symptoms in young adults [49] and worse outcomes in patients at clinical high risk of psychosis, as well as in those with early psychosis or schizophrenia [48,50], but we provide new evidence that these significant changes may occur across temporal scales and diagnostic groups [51,52]. We also found that during weeks and months of increased physical activity, participants reported less severe depressive and anxiety symptoms, consistent with findings from previous studies [53].

Same-day analyses, although not surviving correction for multiple comparisons, showed that shorter total sleep duration the night before, earlier rise time, and less day-to-day variability in rise time were each associated with lower same-day symptom severity. The unexpected association between shorter total sleep duration the preceding night and lower same-day symptom severity should be interpreted with caution, as the relationship was weak and largely driven by a small number of participants (notably participants 5 and 7). The sleep timing findings further support prior evidence that circadian phase is a key predictor of same-day mood episodes [26]. These trends point toward possible short-term responsiveness of mood and anxiety symptoms to daily behavioral patterns.

Additionally, when comparing participants’ data across the entire study period (ie, averaged across all months), those with later average sleep onset and more sedentary behavior tended to report worse symptoms, while those with higher average daily step counts reported fewer negative psychotic symptoms. Because compliance with wearing the GENEActiv sensor was very high (97.27% of recorded days of actigraphy showed no nonwear time), we did not find any significant associations between noncompliance and questionnaire scores.

Despite the promising findings described above, several limitations must be acknowledged. The small sample, as well as missing data due to sensor malfunctions and variable adherence to actigraphy and questionnaire protocols, may have limited statistical power, introduced bias, and reduced sensitivity to detect certain effects. Furthermore, inferring temporal causality from these observations remains challenging; incorporating ecological momentary assessment data could provide finer temporal resolution, thereby improving the ability to disentangle the directionality of the relationship between sleep timing and symptom trajectory [54]. In addition, GENEActiv devices require physical docking to download data, meaning that data are not available in real time. As such, real-time monitoring of adverse events or symptom burden (eg, hospitalizations) was not possible in this study. Alternative wearable devices with real-time data transmission capabilities may be better suited for future studies aiming to evaluate actigraphy as a biomarker for timely clinical intervention. However, the actigraphy data could be processed and included in return-of-results reports to clinicians and patients on a monthly basis, which was preliminarily found to be clinically useful; the process of developing these reports will be discussed elsewhere. Nonetheless, the consistency of our results across time scales and methods highlights sleep timing and physical activity as promising behavioral biomarkers in real-world psychiatric populations.

The preliminary findings from the DeeP-DD study reveal complex associations between sleep features and clinical symptoms across various diagnostic groups, time scales, and data collection methods within a realistic clinical population. Earlier sleep and wake times, along with higher physical activity levels, were consistently associated with better clinical outcomes, including lower anxiety and depressive symptoms, both within and between individuals and across multiple time scales. The weekly time scale was particularly notable, as multiple associations involving both sleep timing and physical activity were significant across participants. Although some associations did not remain significant after correction for multiple comparisons, the consistent patterns observed at both the inter- and intraindividual levels align with prior literature and offer novel insights into which actigraphy metrics, over different durations, may serve as digital biomarkers for monitoring and predicting psychiatric symptom trajectories in a transdiagnostic population. However, given the small sample size, these conclusions should be interpreted with caution. The next steps in our project involve further exploring these relationships using larger datasets to assess their consistency and investigate the longitudinal dynamics between sleep and symptom changes. Additionally, we aim to incorporate more sophisticated approaches such as Fourier transformations at an intraindividual level to capture rhythmicity and periodic patterns in sleep and activity data, which may reveal subtle disruptions in circadian cycles linked to symptom fluctuations [55]. These methods could help identify individual-specific signatures, ultimately informing personalized interventions.
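
To illustrate what a Fourier-based rhythmicity analysis might look like, the sketch below applies a plain discrete Fourier transform to an hourly activity series and reports the period carrying the most power. This is only one possible reading of "Fourier transformations": it is not the penalized approach of reference [55], and the function name, sampling interval, and synthetic signal are all assumptions made for illustration.

```python
import cmath
import math


def dominant_period_hours(signal, sample_interval_hours=1.0):
    """Return the period (in hours) of the strongest non-constant DFT component."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]  # remove the constant (0 Hz) term

    best_k, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):  # frequencies up to the Nyquist limit
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power

    if best_k == 0:
        raise ValueError("signal is constant; no periodic component found")
    return n * sample_interval_hours / best_k


# Synthetic hourly activity over 7 days (NOT study data): a 24-hour rhythm
# peaking at 15:00 plus a weaker 12-hour harmonic.
activity = [
    100
    + 60 * math.cos(2 * math.pi * (h - 15) / 24)
    + 15 * math.cos(2 * math.pi * h / 12)
    for h in range(7 * 24)
]
period = dominant_period_hours(activity)
print(period)
```

In real recordings, a dominant period that drifts away from 24 hours, or a loss of power at that period, would be candidate markers of circadian disruption, although nonwear gaps and noise would need to be handled before such a transform is meaningful.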

Supplementary material

Multimedia Appendix 1. Clinical and sociodemographic characteristics, and figures showing significant associations between actigraphy features and clinical questionnaire scores across multiple time scales, at both the intra- and interindividual level.
DOI: 10.2196/81107

Acknowledgments

The authors acknowledge financial support from the Mach-Gaensslen Foundation of Canada, the Dr Clarke K McLeod Memorial Scholarship, the Saputo Foundation, the Emerging Challenges Modelling Project (an initiative led by the Centre de Recherches Mathématiques in partnership with GERAD and UNIQUE, and funded by the Quebec Research Fund), the Strategia Program at the Centre de Recherches Mathématiques, the McGill Computational and Data Systems Initiative, the Douglas Research Center, the Fonds de recherche du Québec – Nature et technologies (STRATÉGIA grant), and the Fonds de Recherche du Québec - Santé (FRQS; Junior 1 Grant).

DB is supported by a NARSAD Young Investigator Award. LP’s research is supported by the Monique H. Bourgeois Chair in Developmental Disorders, the Graham Boeckh Foundation, the Mirella and Lino Saputo Foundation, and a Wellcome Trust Discretionary Grant (226168/Z/22/Z to Dr Iris Sommer and LP). He receives a salary award from the Fonds de recherche du Québec-Santé (FRQS: 366934).

Abbreviations

CGI-S: Clinical Global Impression–Severity
DeeP-DD: Deep Phenotyping and Digitalization at Douglas
GAD-7: 7-item Generalized Anxiety Disorder
PHQ-9: 9-item Patient Health Questionnaire
SANS: Scale for the Assessment of Negative Symptoms
SAPS: Scale for the Assessment of Positive Symptoms

We occasionally used the generative artificial intelligence tool ChatGPT (OpenAI) to assist with grammar and phrasing during manuscript preparation.

Footnotes

Funding: This research protocol was funded by the Douglas Research Center, FRQS Junior 1 Grant, McGill Computational and Data Systems Initiative, Strategia FRQ Program at the Centre de Recherches Mathématiques, Healthy Brains, Healthy Lives grant from McGill, Saputo Foundation, and Brain and Behavior Research Foundation Young Investigator Grant.

Data Availability: Data available on reasonable request from the authors and subject to review board approval.

Conflicts of Interest: DB is a founder and shareholder of Aifred Health, a digital mental health company. Aifred Health was not involved in this research. LP reports personal fees for serving as chief editor from the Canadian Medical Association Journal; speaker honoraria from Janssen Canada, Otsuka Canada, and SPMM Course Limited, UK; book royalties from Oxford University Press; and investigator-initiated educational grants from Otsuka Canada outside the submitted work in the last 5 years.

References

  • 1. Fava GA, Kellner R. Staging: a neglected dimension in psychiatric classification. Acta Psychiatr Scand. 1993 Apr;87(4):225–230. doi: 10.1111/j.1600-0447.1993.tb03362.x.
  • 2. Muneer A. Staging models in bipolar disorder: a systematic review of the literature. Clin Psychopharmacol Neurosci. 2016 May 31;14(2):117–130. doi: 10.9758/cpn.2016.14.2.117.
  • 3. Compton MT, Bernardini F, Attademo L, et al. Risk prediction models in psychiatry: toward a new frontier for the prevention of mental illnesses. Psychiatrist.com. URL: https://www.psychiatrist.com/jcp/risk-prediction-models-in-psychiatry/ [Accessed 11-05-2025]
  • 4. Horowitz MA, Macaulay A, Taylor D. Limitations in research on maintenance treatment for individuals with schizophrenia. JAMA Psychiatry. 2022 Jan 1;79(1):83–85. doi: 10.1001/jamapsychiatry.2021.3400.
  • 5. Coley RY, Boggs JM, Beck A, Hartzler AL, Simon GE. Defining success in measurement-based care for depression: a comparison of common metrics. Psychiatr Serv. 2020 Apr 1;71(4):312–318. doi: 10.1176/appi.ps.201900295.
  • 6. Trivedi MH, Rush AJ, Wisniewski SR, et al. Evaluation of outcomes with citalopram for depression using measurement-based care in STAR*D: implications for clinical practice. Am J Psychiatry. 2006 Jan;163(1):28–40. doi: 10.1176/appi.ajp.163.1.28.
  • 7. Guo T, Xiang YT, Xiao L, et al. Measurement-based care versus standard care for major depression: a randomized controlled trial with blind raters. Am J Psychiatry. 2015 Oct;172(10):1004–1013. doi: 10.1176/appi.ajp.2015.14050652.
  • 8. Benrimoh D, Armstrong C, Mehltretter J, et al. Development and validation of a deep-learning model for differential treatment benefit prediction for adults with major depressive disorder deployed in the artificial intelligence in depression medication enhancement (AIDME) study. arXiv. Preprint posted online on 2024 Jun 7. doi: 10.48550/arXiv.2406.04993.
  • 9. Elliott CS, Fiszdon JM. Comparison of self-report and performance-based measures of everyday functioning in individuals with schizophrenia: implications for measure selection. Cogn Neuropsychiatry. 2014;19(6):485–494. doi: 10.1080/13546805.2014.922062.
  • 10. Mote J, Fulford D. Ecological momentary assessment of everyday social experiences of people with schizophrenia: a systematic review. Schizophr Res. 2020 Feb;216:56–68. doi: 10.1016/j.schres.2019.10.021.
  • 11. Abel DB, Minor KS. Social functioning in schizophrenia: comparing laboratory-based assessment with real-world measures. J Psychiatr Res. 2021 Jun;138:500–506. doi: 10.1016/j.jpsychires.2021.04.039.
  • 12. Xiao L, Qi H, Zheng W, et al. The effectiveness of enhanced evidence-based care for depressive disorders: a meta-analysis of randomized controlled trials. Transl Psychiatry. 2021 Oct 16;11(1):531. doi: 10.1038/s41398-021-01638-7.
  • 13. Zimmerman M, Martinez J, Attiullah N, Friedman M, Toba C, Boerescu DA. Symptom differences between depressed outpatients who are in remission according to the Hamilton Depression Rating Scale who do and do not consider themselves to be in remission. J Affect Disord. 2012 Dec 15;142(1-3):77–81. doi: 10.1016/j.jad.2012.03.044.
  • 14. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001 Sep;16(9):606–613. doi: 10.1046/j.1525-1497.2001.016009606.x.
  • 15. Levis B, Benedetti A, Thombs BD, DEPRESsion Screening Data (DEPRESSD) Collaboration. Accuracy of Patient Health Questionnaire-9 (PHQ-9) for screening to detect major depression: individual participant data meta-analysis. BMJ. 2019 Apr 9;365:l1476. doi: 10.1136/bmj.l1476.
  • 16. Barron DS. Decision models and technology in psychiatry. Biol Psychiatry. 2021 Aug 15;90(4):208–211. doi: 10.1016/j.biopsych.2021.06.012.
  • 17. Benoit J, Onyeaka H, Keshavan M, Torous J. Systematic review of digital phenotyping and machine learning in psychosis spectrum illnesses. Harv Rev Psychiatry. 2020;28(5):296–304. doi: 10.1097/HRP.0000000000000268.
  • 18. Spinazze P, Rykov Y, Bottle A, Car J. Digital phenotyping for assessment and prediction of mental health outcomes: a scoping review protocol. BMJ Open. 2019 Dec;9(12):e032255. doi: 10.1136/bmjopen-2019-032255.
  • 19. Mendes JPM, Moura IR, Van de Ven P, et al. Sensing apps and public data sets for digital phenotyping of mental health: systematic review. J Med Internet Res. 2022 Feb 17;24(2):e28735. doi: 10.2196/28735.
  • 20. Dunster GP, Swendsen J, Merikangas KR. Real-time mobile monitoring of bipolar disorder: a review of evidence and future directions. Neuropsychopharmacology. 2021 Jan;46(1):197–208. doi: 10.1038/s41386-020-00830-5.
  • 21. Plante DT, Winkelman JW. Sleep disturbance in bipolar disorder: therapeutic implications. Am J Psychiatry. 2008 Jul;165(7):830–843. doi: 10.1176/appi.ajp.2008.08010077.
  • 22. Girschik J, Fritschi L, Heyworth J, Waters F. Validation of self-reported sleep against actigraphy. J Epidemiol. 2012;22(5):462–468. doi: 10.2188/jea.je20120012.
  • 23. Ferrarelli F. Sleep disturbance in schizophrenia spectrum disorders: more than just a symptom? Int Clin Psychopharmacol. 2023 May 1;38(3):187–188. doi: 10.1097/YIC.0000000000000467.
  • 24. Nordholm D, Jensen MA, Glenthøj LB, et al. Sleep disturbances and the association with attenuated psychotic symptoms in individuals at ultra high-risk of psychosis. J Psychiatr Res. 2023 Feb;158:143–149. doi: 10.1016/j.jpsychires.2022.12.041.
  • 25. Reich N, Delavari F, Schneider M, Thillainathan N, Eliez S, Sandini C. Multivariate patterns of disrupted sleep longitudinally predict affective vulnerability to psychosis in 22q11.2 Deletion Syndrome. Psychiatry Res. 2023 Jul;325:115230. doi: 10.1016/j.psychres.2023.115230.
  • 26. Pieters LE, Deenik J, Hoogendoorn AW, van Someren EJW, van Harten PN. Sleep and physical activity patterns in relation to daily-life symptoms in psychosis: an actigraphy and experience sampling study. Psychiatry Res. 2025 Feb;344:116320. doi: 10.1016/j.psychres.2024.116320.
  • 27. Lim D, Jeong J, Song YM, et al. Accurately predicting mood episodes in mood disorder patients using wearable sleep and circadian rhythm features. NPJ Digit Med. 2024 Nov 18;7(1):1–13. doi: 10.1038/s41746-024-01333-z.
  • 28. Burton C, McKinstry B, Szentagotai Tătar A, Serrano-Blanco A, Pagliari C, Wolters M. Activity monitoring in patients with depression: a systematic review. J Affect Disord. 2013 Feb 15;145(1):21–28. doi: 10.1016/j.jad.2012.07.001.
  • 29. Abad VC, Guilleminault C. Sleep and psychiatry. Dialogues Clin Neurosci. 2005 Dec 31;7(4):291–303. doi: 10.31887/DCNS.2005.7.4/vabad.
  • 30. de Leeuw M, Verhoeve SI, van der Wee NJA, van Hemert AM, Vreugdenhil E, Coomans CP. The role of the circadian system in the etiology of depression. Neurosci Biobehav Rev. 2023 Oct;153:105383. doi: 10.1016/j.neubiorev.2023.105383.
  • 31. Jonasdottir SS, Minor K, Lehmann S. Gender differences in nighttime sleep patterns and variability across the adult lifespan: a global-scale wearables study. Sleep. 2021 Feb 12;44(2):zsaa169. doi: 10.1093/sleep/zsaa169.
  • 32. van Hees VT, Sabia S, Jones SE, et al. Estimating sleep parameters using an accelerometer without sleep diary. Sci Rep. 2018 Aug 28;8(1):12975. doi: 10.1038/s41598-018-31266-z.
  • 33. te Lindert BHW, Van Someren EJW. Sleep estimates using microelectromechanical systems (MEMS). Sleep. 2013 May 1;36(5):781–789. doi: 10.5665/sleep.2648.
  • 34. Krane-Gartiser K, Asheim A, Fasmer OB, Morken G, Vaaler AE, Scott J. Actigraphy as an objective intra-individual marker of activity patterns in acute-phase bipolar disorder: a case series. Int J Bipolar Disord. 2018 Mar 7;6(1):8. doi: 10.1186/s40345-017-0115-3.
  • 35. Tazawa Y, Wada M, Mitsukura Y, et al. Actigraphy for evaluation of mood disorders: a systematic review and meta-analysis. J Affect Disord. 2019 Jun 15;253:257–269. doi: 10.1016/j.jad.2019.04.087.
  • 36. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)—a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2009 Apr;42(2):377–381. doi: 10.1016/j.jbi.2008.08.010.
  • 37. Dodge L. GENEActiv support 11. Activinsights. URL: https://activinsights.com/resources/geneactiv-support-1-1/ [Accessed 22-06-2025]
  • 38. Spitzer RL, Kroenke K, Williams JBW, Löwe B. A brief measure for assessing generalized anxiety disorder. Arch Intern Med. 2006 May 22;166(10):1092. doi: 10.1001/archinte.166.10.1092.
  • 39. Busner J, Targum SD. The clinical global impressions scale: applying a research tool in clinical practice. Psychiatry (Edgmont). 2007 Jul;4(7).
  • 40. Andreasen NC. Scale for the assessment of positive symptoms. APA PsychTests. 2018. URL: https://psycnet.apa.org/doiLanding?doi=10.1037%2Ft48377-000 [Accessed 12-11-2025]
  • 41. Andreasen NC. Scale for the assessment of negative symptoms. APA PsychTests. 2014. URL: https://psycnet.apa.org/doiLanding?doi=10.1037%2Ft12696-000 [Accessed 12-11-2025]
  • 42. Selbie HR. Markdown training. Activinsights. URL: https://activinsights.com/research-study-services/r-markdown-training/ [Accessed 22-06-2025]
  • 43. Di J, Demanuele C, Kettermann A, et al. Considerations to address missing data when deriving clinical trial endpoints from digital health technologies. Contemp Clin Trials. 2022 Feb;113:106661. doi: 10.1016/j.cct.2021.106661.
  • 44. Bakdash JZ, Marusich LR. Repeated measures correlation. Front Psychol. 2017;8:456. doi: 10.3389/fpsyg.2017.00456.
  • 45. Vallat R. Pingouin: statistics in Python. JOSS. 2018 Nov 19;3(31):1026. doi: 10.21105/joss.01026.
  • 46. Spearman’s rank correlation coefficient Rs and probability (p) value calculator. GeographyFieldwork.com. URL: https://geographyfieldwork.com/SpearmansRankCalculator.html [Accessed 06-07-2025]
  • 47. Dollish HK, Tsyglakova M, McClung CA. Circadian rhythms and mood disorders: time to see the light. Neuron. 2024 Jan 3;112(1):25–40. doi: 10.1016/j.neuron.2023.09.023.
  • 48. Meyer N, Lok R, Schmidt C, et al. The sleep–circadian interface: a window into mental disorders. Proc Natl Acad Sci USA. 2024 Feb 27;121(9):e2214756121. doi: 10.1073/pnas.2214756121.
  • 49. Crouse JJ, Carpenter JS, Song YJC, et al. Circadian rhythm sleep-wake disturbances and depression in young people: implications for prevention and early intervention. Lancet Psychiatry. 2021 Sep;8(9):813–823. doi: 10.1016/S2215-0366(21)00034-1.
  • 50. Zanini M, Castro J, Coelho FM, et al. Do sleep abnormalities and misaligned sleep/circadian rhythm patterns represent early clinical characteristics for developing psychosis in high risk populations? Neurosci Biobehav Rev. 2013 Dec;37(10 Pt 2):2631–2637. doi: 10.1016/j.neubiorev.2013.08.012.
  • 51. Lunsford-Avery JR, Gonçalves B da S, Brietzke E, et al. Adolescents at clinical-high risk for psychosis: circadian rhythm disturbances predict worsened prognosis at 1-year follow-up. Schizophr Res. 2017 Nov;189:37–42. doi: 10.1016/j.schres.2017.01.051.
  • 52. Wong TR, Hickie IB, Carpenter JS, et al. Dynamic modelling of chronotype and hypo/manic and depressive symptoms in young people with emerging mental disorders. Chronobiol Int. 2023 Jun 3;40(6):699–709. doi: 10.1080/07420528.2023.2203241.
  • 53. Peluso MAM, Guerra de Andrade LHS. Physical activity and mental health: the association between exercise and mood. Clinics (Sao Paulo). 2005 Feb;60(1):61–70. doi: 10.1590/s1807-59322005000100012.
  • 54. De la Barrera U, Arrigoni F, Monserrat C, Montoya-Castilla I, Gil-Gómez JA. Using ecological momentary assessment and machine learning techniques to predict depressive symptoms in emerging adults. Psychiatry Res. 2024 Feb;332:115710. doi: 10.1016/j.psychres.2023.115710.
  • 55. Li X, Kane M, Zhang Y, et al. Circadian rhythm analysis using wearable device data: novel penalized machine learning approach. J Med Internet Res. 2021 Oct 14;23(10):e18403. doi: 10.2196/18403.

Articles from JMIR Formative Research are provided here courtesy of JMIR Publications Inc.