Summary
Actigraphy, a tool known for investigating sleep–wake patterns at home, lacks scientific validation in hypersomnolent subjects. We aim to validate an actigraphy‐based sleep–wake prediction algorithm against 32‐h continuous polysomnography in patients with suspected idiopathic hypersomnia, and to compare its performance in predicting polysomnography‐assessed sleep–wake parameters with that of a commercially available algorithm. Two hundred and six hypersomnolent subjects were included prospectively in a Reference Centre for Hypersomnias, and underwent a 32‐h bedrest protocol while wearing a wrist actigraph, to diagnose idiopathic hypersomnia. Among them, 126 patients (91 females, 30.6 ± 15.5 years, 101 idiopathic hypersomnia, 25 non‐specified hypersomnia) with synchronised actigraphy and polysomnography were analysed. Age, sex, and Epworth Sleepiness Scale scores were collected. We trained various supervised algorithms and selected a recurrent neural network (a sequence‐to‐sequence long short‐term memory network, S2S) for comparison with Actiwatch Software (AS) on sleep–wake variables and prediction errors during daytime and nighttime. S2S outperformed AS across all relevant metrics, and Bland–Altman analysis showed disagreement between the two algorithms. S2S had a lower absolute error than AS. AS mainly overestimated sleep, an overestimation that was substantially reduced with S2S, overall as well as during day and night. Performance was not correlated with age, sex, or subjective sleepiness, but objective sleepiness and longer sleep time during the bedrest were associated with sleep underestimation. Our deep learning S2S algorithm predicted sleep–wake parameters better than AS and other commonly used algorithms. The next objective is to leverage this algorithm to study sleep–wake patterns in patients with hypersomnolence at home.
Keywords: actigraphy, deep learning, extended polysomnography, hypersomnolence, idiopathic hypersomnia
1. INTRODUCTION
Idiopathic hypersomnia (IH) is a rare neurological disorder characterised by excessive daytime sleepiness (EDS), prolonged and unrefreshing daytime naps, sleep inertia, and prolonged nighttime sleep. The current diagnostic criteria for IH, according to the International Classification of Sleep Disorders (ICSD‐3‐TR) (American Academy of Sleep Medicine, 2023), include EDS persisting for ≥3 months, the absence of cataplexy, no more than one sleep onset rapid eye movement (REM) period (SOREMP) during the multiple sleep latency test (MSLT) and preceding polysomnography (PSG). To objectify hypersomnolence, either a mean sleep latency (MSL) on MSLT ≤ 8 min is required, or a total sleep time ≥ 660 min on a 24‐h continuous PSG or assessed via wrist actigraphy (ACT) and sleep logs, for ≥7 days of unrestricted sleep. Although continuous PSG (24‐h or 32‐h recordings) (Evangelista et al., 2018; Pizza et al., 2013; Vernet & Arnulf, 2009) is the gold standard to objectify prolonged nighttime and daytime sleep, it remains costly and still lacks standardisation. On the other hand, ACT is cheaper and a potential alternative to assess sleep–wake cycles over a long period of time in ecological conditions. However, ACT relies on the indirect estimation of sleep and wake through rest and activity assessment (Acebo et al., 1999; Ancoli‐Israel et al., 1997; Ancoli‐Israel et al., 2003; Smith et al., 2018; Walia & Mehra, 2019). Currently, the ACT diagnosis criterion lacks scientific validation in idiopathic hypersomnia, although it can be used in clinical practice to make this diagnosis, according to the ICSD‐3‐TR.
One study investigating the reliability of ACT in IH proposed an optimal configuration of Actiwatch 2 settings to estimate total sleep time (TST) in individuals suspected of IH (Cook et al., 2019). The authors emphasised the importance of device‐setting parameters to accurately estimate sleep parameters, and noted that the ICSD‐3 does not provide specific guidelines in that regard (Smith et al., 2018). Consistent with previous studies (Meltzer et al., 2015; Sadeh, 2011), their findings highlighted the substantial impact of parameter settings on nighttime sleep estimation, which remained questionable, with a high sleep sensitivity (0.92–0.99) but a very poor wake sensitivity (0.21–0.42) (Cook et al., 2019). Additionally, the results were derived from a single‐night PSG evaluation and may not be applicable to daytime sleep. Filardi et al. (Filardi et al., 2015) investigated home‐ACT in patients with narcolepsy and patients with idiopathic hypersomnia versus controls, assessing sleep/wake patterns using Cole's algorithm (Cole et al., 1992), and found that sleep variables derived from home‐ACT discriminated subjects with narcolepsy type 1 (NT1) from the other groups on daytime and nighttime variables, while patients with idiopathic hypersomnia could be distinguished from controls on daytime parameters only. A recent study evaluated the reliability of actigraphy in subjects with narcolepsy and a small sample of IH subjects during in‐lab continuous PSG (Biscarini et al., 2024), showing that all three tested algorithms significantly overestimated 24‐h TST, and in particular daytime TST, in both diseases. Another group compared sleep parameters derived from nocturnal PSG and actigraphy (Actiwatch, Philips) in narcoleptic and ‘hypersomnolent’ subjects, and found a correlation between the two measures, with differences in a clinically acceptable range except for sleep latency and time in bed (Torstensen et al., 2021).
In contrast to the few studies in disorders of hypersomnolence, the use and validation of ACT in insomnia is well documented in the literature. However, epoch‐to‐epoch performances show overall a high sensitivity and low specificity to sleep (Taibi et al., 2013; te Lindert et al., 2020). Additionally, sleep parameters derived from ACT (sleep latency, wake after sleep onset (WASO), TST, and sleep efficiency) often correlate with PSG (Lichstein et al., 2006; Williams et al., 2020), with some studies suggesting an overestimation of sleep (Hauri & Wisbey, 1992; Jean‐Louis et al., 1999; Verbeek et al., 1994). Thus, ACT is validated in healthy individuals and in some sleep disorders such as insomnia, but there is a gap in the literature concerning its validation in hypersomnia; bouts of daytime sleep require a specific ACT algorithm with reliable performances in such a condition. In addition, most ACT studies have relied on single‐night PSG (American Sleep Disorders Association, 1995; Ancoli‐Israel et al., 2003; Blackwell et al., 2008; Cole et al., 1992; Cook et al., 2019; de Souza et al., 2003; Jean‐Louis et al., 1997; Lichstein et al., 2006; Sadeh et al., 1994; Taibi et al., 2013; Williams et al., 2020), and reported algorithms lacked specificity to wake, raising significant concerns about the reliability of the tool in capturing daytime sleepiness, naps and prolonged nighttime sleep in individuals with idiopathic hypersomnia. A recent study benchmarked several algorithms on data from both nighttime and daytime, and showed a more balanced performance, with a better tradeoff between sensitivity and specificity using machine learning approaches than traditional algorithms (Palotti et al., 2019).
The aims of this study were to select and validate, against the gold‐standard PSG, the best ACT algorithm for the correct prediction of sleep–wake patterns in a cohort of patients with suspected IH, during a standardised 32‐h continuous PSG recording.
2. METHODS
2.1. Population
This study included consecutive subjects referred to the National Reference Center for Narcolepsy and Rare Hypersomnias at the University Hospital of Montpellier, France, between September 2014 and April 2023 for suspected idiopathic hypersomnia. No participant had a night or shift work schedule, sleep deprivation, or significant medical or psychiatric disorders, and none reported cataplexy. None had taken any drug or substance with an effect on sleep or vigilance in the 2 weeks before recording. Age, sex, and Epworth sleepiness scale (ESS) scores were collected.
2.2. Bedrest protocol
All patients underwent a bedrest protocol as described previously (Evangelista et al., 2018). A first PSG was followed by a modified MSLT, defined as five 20‐min nap opportunities throughout the day, with interruption of the test after 1 min of sleep. Patients were then continuously recorded for 32 h, from 11 p.m. to 7 a.m. after the second night, under standardised conditions. To ensure maximum spontaneous sleep, external circadian synchronisers were suppressed: patients had no access to television, newspapers, phones, watches, or visits from friends or family, and interaction with hospital staff was limited to emergencies and meal delivery. Patients were instructed to sleep ad libitum, with the environment maintained at a constant dim light level (10 lux). We reported that a cut‐off of TST ≥19 h had a very good sensitivity and specificity for diagnosing idiopathic hypersomnia in these conditions (Evangelista et al., 2018).
For the analyses, the bedrest was artificially divided into night 1 (11 p.m.–7 a.m.), daytime (7 a.m.–11 p.m.), and night 2 (11 p.m.–7 a.m.). PSG data acquired during the bedrest were scored by trained sleep experts according to the AASM criteria (Iber et al., 2007). Figure 1 provides an overview of the participants' visit.
FIGURE 1.

Workflow of a participant's involvement in the study. Participants undergo a standard PSG (11 p.m.–7 a.m.), followed the next day by modified MSLT. The following night they start the bedrest protocol at 11 p.m., a 32‐h continuous PSG recording with abolition of all circadian synchronisers (dim light, meals on demand, no social interaction, calm activities) (Evangelista et al., 2018). Wrist actigraphy is simultaneously recorded during the whole bedrest. Periods of sleep and wake here are shown as examples. mMSLT, modified multiple sleep latency test (the test is stopped when sleep is reached); PSG, polysomnography.
2.3. Actigraphy
During the whole bedrest, patients wore the Actiwatch 2 (Philips) on the non‐dominant wrist, and data were recorded in 30‐s epochs. This device measures movements via a tri‐axial piezoelectric accelerometer with raw data acquired at a sampling rate of 32 Hz, before being converted to activity counts via a proprietary algorithm developed by Philips. Activity counts are computed each second, and 30‐s window averages are then output. Additionally, the Actiwatch Software (AS) provides an unsupervised algorithm for epoch‐by‐epoch prediction of wake/sleep periods. As ACT and PSG may have independent internal clocks, and do not always share a common signal, ACT data were manually synchronised with the PSG. More specifically, individual increases in ACT counts confined to a single epoch were aligned with the corresponding single‐epoch arousals during deep sleep, and synchronisation was considered to be reached when these events matched. When synchronisation could not be achieved with absolute certainty for a given subject, all data from that subject (i.e. actigraphy and PSG) were excluded from the analysis.
2.4. Sleep–wake algorithms
We investigated different algorithms for ACT sleep–wake predictions and compared them with the internal algorithm AS. We assessed two unsupervised algorithms traditionally used in the literature, those designed by Cole et al. and Sadeh et al. (Cole et al., 1992; Sadeh et al., 1994). Additionally, we investigated machine learning (ML) and deep learning (DL) methods, which involve constructing mathematical models with optimised hyperparameters based on a training set and evaluating their performances on a separate test set. The algorithms included logistic regression, support vector machine, random forest, a simple multilayer perceptron, two different architectures of convolutional neural networks, and two different architectures of recurrent neural networks. To account for interindividual and inter‐device variabilities in signal amplitude and movement sensitivity, we first applied a normalisation step: for each subject, the activity counts were divided by the standard deviation of the 100 lowest values greater than 5 over the first 24 h. The data fed into the algorithms comprised, for each 30‐s epoch, a set of 133 features derived from these normalised activity counts and from time (Supplementary Table S1), which are commonly used in actigraphy‐based supervised approaches (Cho et al., 2019; Palotti et al., 2019; Tilmanne et al., 2009). When needed, initial hyperparameter selection was conducted on the data of 75% of the participants, randomly selected. Subsequently, a 300‐sample bootstrapping procedure was implemented, randomly selecting data with replacement within each bootstrap for training (75% of the participants) and testing (the remaining 25% of participants). Missing data were replaced with the average across all observations for the corresponding feature. Within each bootstrap, predictions from the algorithms on the test set were compared with gold‐standard PSG to compute various performance metrics.
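The per‐subject normalisation described above can be sketched as follows. This is a minimal illustration rather than the study's implementation (the analyses were run in Matlab): the function name, the use of the sample standard deviation (`ddof=1`), and the handling of degenerate recordings are assumptions that the text does not specify.

```python
import numpy as np

EPOCHS_PER_HOUR = 120  # 30-s epochs


def normalise_counts(counts, n_lowest=100, floor=5.0, hours=24):
    """Scale one subject's activity counts by the SD of the `n_lowest`
    values strictly above `floor` found in the first `hours` of recording."""
    counts = np.asarray(counts, dtype=float)
    first_day = counts[: hours * EPOCHS_PER_HOUR]
    # Keep only values above the floor, then take the smallest n_lowest of them.
    reference = np.sort(first_day[first_day > floor])[:n_lowest]
    if reference.size < 2:
        raise ValueError("not enough epochs above the floor to estimate a scale")
    scale = reference.std(ddof=1)  # assumption: sample SD
    if scale == 0:
        raise ValueError("reference values are constant; scale is undefined")
    return counts / scale
```

Dividing by a subject‐specific scale estimated from low‐amplitude activity is intended to make counts comparable across wearers and devices before feature extraction.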
We reported for each algorithm the median, 1st quartile, and 3rd quartile of the performance metrics to compare the algorithms' efficacy. Bootstrapping was also applied to unsupervised algorithms and AS without the training step for standardisation. The use of a bootstrapping procedure ensures that results are robust, minimising dependence on the randomly assigned test sets and the high interindividual and device‐to‐device variability in ACT data (Supplementary Figure S1).
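One possible reading of the bootstrapping procedure is sketched below: participants are split 75%/25% at random in each iteration, and each partition is then resampled with replacement. The exact resampling scheme is not fully specified in the text, so the split logic, the function names, and the seed are assumptions made for illustration only.

```python
import numpy as np


def bootstrap_splits(participant_ids, n_boot=300, train_frac=0.75, seed=0):
    """Yield (train_ids, test_ids) pairs with no participant shared
    between the training and test sets of a given iteration."""
    rng = np.random.default_rng(seed)
    ids = np.asarray(participant_ids)
    n_train = int(round(train_frac * ids.size))
    for _ in range(n_boot):
        shuffled = rng.permutation(ids)
        train_pool, test_pool = shuffled[:n_train], shuffled[n_train:]
        # Resample with replacement inside each partition (assumption).
        train = rng.choice(train_pool, size=train_pool.size, replace=True)
        test = rng.choice(test_pool, size=test_pool.size, replace=True)
        yield train, test


def summarise(values):
    """Return (median, 1st quartile, 3rd quartile) of a metric across bootstraps."""
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    return med, q1, q3
```

Splitting at the participant level, rather than at the epoch level, keeps all epochs of a given subject on one side of the split, so test performance is not inflated by within‐subject similarity.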
Performance metrics included overall accuracy, sleep sensitivity (SEN), sleep specificity (SPE), predictive value for sleep (PVS), predictive value for wake (PVW), F1‐score (harmonic mean of PVS and SEN) and Cohen's kappa. Daytime (7 a.m.–11 p.m.) and nighttime (11 p.m.–7 a.m. on both nights) metrics were differentiated for SEN, SPE, PVS, and PVW. A detailed characterisation of these metrics can be found in the supplement (Supplementary Table S2). The algorithm demonstrating superior performance across multiple metrics was selected.
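For reference, the metrics listed above can be computed from epoch‐by‐epoch labels as in the sketch below, which treats sleep as the positive class (1 = sleep, 0 = wake). The function name is hypothetical, and the sketch assumes that both classes occur in the PSG scoring and in the predictions (otherwise a division by zero is raised).

```python
import numpy as np


def sleep_wake_metrics(psg, pred):
    """Epoch-by-epoch agreement metrics with sleep as the positive class."""
    psg = np.asarray(psg, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    tp = int(np.sum(psg & pred))    # sleep scored, sleep predicted
    tn = int(np.sum(~psg & ~pred))  # wake scored, wake predicted
    fp = int(np.sum(~psg & pred))   # wake scored, sleep predicted
    fn = int(np.sum(psg & ~pred))   # sleep scored, wake predicted
    n = tp + tn + fp + fn
    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    pvs = tp / (tp + fp)
    pvw = tn / (tn + fn)
    accuracy = (tp + tn) / n
    # Chance agreement for Cohen's kappa, from the marginal totals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return {
        "accuracy": accuracy,
        "SEN": sen,
        "SPE": spe,
        "PVS": pvs,
        "PVW": pvw,
        "F1": 2 * pvs * sen / (pvs + sen),
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }
```

Daytime and nighttime values follow by calling the same function on the epochs that fall within each time window.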
In a post hoc analysis, we evaluated feature importance using the permutation method. This involved randomly shuffling the values of a single input feature, predicting class labels, calculating the root mean square error (RMSE), and comparing it to the RMSE obtained without shuffling by computing an error ratio. We performed this process for individual features, and ranked feature importance based on the extent to which shuffling each feature increased the RMSE.
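The permutation procedure can be illustrated with the following sketch, in which `predict` stands for any fitted model's prediction function. The function names, the single shuffle per feature, and the seed are assumptions; the text does not state how many permutations were drawn.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between labels and predictions."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff**2)))


def permutation_error_ratios(predict, X, y, seed=0):
    """Error ratio per feature: RMSE after shuffling that feature
    divided by the RMSE obtained on the intact data."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    baseline = rmse(y, predict(X))
    if baseline == 0:
        raise ValueError("baseline RMSE is zero; the ratio is undefined")
    ratios = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        X_perm = X.copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the link with y
        ratios[j] = rmse(y, predict(X_perm)) / baseline
    return ratios
```

A ratio close to 1 indicates that the model barely relies on the feature, whereas larger ratios indicate that shuffling it degrades the predictions.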
2.5. Algorithms' assessment on sleep data assessed by PSG
To further assess the performances of the selected algorithm versus AS, we compared sleep parameters derived from the gold‐standard PSG, from the predictions of our best algorithm, and from the Actiwatch Software (AS) configured with the optimal parameters identified by Cook et al. (Cook et al., 2019). During the bootstrap procedure, test set predictions and subject identifiers were saved; predictions for each participant were then averaged and binarised at a 0.5 threshold to yield a single prediction per participant. We first evaluated sleep and wake metrics, namely TST over different time windows (whole recording, night 1 [11 p.m.–7 a.m.], night 2 [11 p.m.–7 a.m.], daytime 1 [7 a.m.–3 p.m.], daytime 2 [3 p.m.–11 p.m.]), WASO (night 1 and night 2), sleep efficiency (night 1 and night 2), and sleep latency (night 1). Note that the concept of ‘night 1’ and ‘night 2’ can be difficult to distinguish in a free‐sleeping setup; WASO, sleep efficiency and sleep latency were therefore computed on nights defined as [11 p.m.–7 a.m.], in line with previous work (Evangelista et al., 2018; Evangelista et al., 2021), although subjects' sleep may extend beyond these time windows.
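The aggregation of bootstrap predictions and the derivation of window‐level sleep parameters can be sketched as below. The operational definitions are assumptions made for illustration: the threshold is applied as "greater than or equal to 0.5", sleep latency is taken as the time from the start of the window to the first sleep epoch, WASO as the wake time between sleep onset and the end of the window, and sleep efficiency as TST divided by the window duration.

```python
import numpy as np

EPOCH_MIN = 0.5  # one 30-s epoch, in minutes


def consensus_hypnogram(prob_by_bootstrap, threshold=0.5):
    """Average sleep probabilities over bootstraps (rows) and binarise.
    Returns 1 = sleep, 0 = wake for each epoch (columns)."""
    mean_prob = np.mean(np.asarray(prob_by_bootstrap, dtype=float), axis=0)
    return (mean_prob >= threshold).astype(int)


def sleep_parameters(hypnogram):
    """TST, sleep latency (SL), WASO in minutes and sleep efficiency (SE)
    in % for one analysis window, e.g. 11 p.m.-7 a.m."""
    h = np.asarray(hypnogram, dtype=int)
    window_min = h.size * EPOCH_MIN
    tst = float(h.sum()) * EPOCH_MIN
    if tst == 0:
        # No sleep in the window: latency is capped at the window length.
        return {"TST": 0.0, "SL": window_min, "SE": 0.0, "WASO": 0.0}
    onset = int(np.argmax(h == 1))  # index of the first sleep epoch
    waso = float(np.sum(h[onset:] == 0)) * EPOCH_MIN
    return {
        "TST": tst,
        "SL": onset * EPOCH_MIN,
        "SE": 100.0 * tst / window_min,
        "WASO": waso,
    }
```

Applying `sleep_parameters` to the PSG hypnogram and to each algorithm's output over the same window gives the paired values that are compared in the agreement analysis.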
These metrics do not account for the type of error the algorithm might make (false‐positive or false‐negative). We thus additionally assessed the type 1 error (false‐positive, or overestimation of sleep, i.e. predicting sleep when the true label is wake), the type 2 error (false‐negative, or underestimation of sleep, i.e. predicting wake when the true label is sleep) and the absolute error (type 1 and type 2 together), during the whole recording and across the different time periods (night 1, daytime, and night 2). Data were processed regardless of the results of the neurophysiological tests, to allow subsequent application of the algorithm to populations with a complaint of hypersomnolence, whether or not IH is confirmed. In a sensitivity analysis, we compared performances between IH and non‐IH subjects; between subjects with (A) TST ≥19 h on the bedrest and those below this cut‐off; between subjects with (B) MSL ≤8 min on the MSLT and those above; and between subjects with both (A) and (B) and the remaining subjects.
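The three error measures defined above can be expressed in minutes as in the following sketch (1 = sleep, 0 = wake, 30‐s epochs). The function names and the fixed 8‐h windowing helper are illustrative assumptions.

```python
import numpy as np

EPOCH_MIN = 0.5  # one 30-s epoch, in minutes


def error_minutes(psg, pred):
    """Type 1 (sleep predicted, wake scored), type 2 (wake predicted,
    sleep scored) and absolute error, all in minutes."""
    psg = np.asarray(psg, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    type1 = float(np.sum(pred & ~psg)) * EPOCH_MIN
    type2 = float(np.sum(~pred & psg)) * EPOCH_MIN
    return {"type1": type1, "type2": type2, "absolute": type1 + type2}


def errors_by_window(psg, pred, epochs_per_window=960):
    """Errors over consecutive windows; 960 epochs correspond to 8 h."""
    psg = np.asarray(psg)
    pred = np.asarray(pred)
    return [
        error_minutes(psg[s : s + epochs_per_window], pred[s : s + epochs_per_window])
        for s in range(0, psg.size, epochs_per_window)
    ]
```

Because type 1 and type 2 errors can offset each other in a net TST difference, reporting them separately shows whether a small bias reflects accurate scoring or compensating misclassifications.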
2.6. Statistical analysis
The different sleep and wake parameters were computed for each subject for AS and the sequence‐to‐sequence long short‐term memory network (hereafter S2S). The agreement between values from PSG and those derived from AS and S2S was investigated with Bland–Altman plots. Biases were compared with zero using one‐sample Student's t‐tests, and biases from AS and S2S were compared using paired Student's t‐tests. The Kolmogorov–Smirnov test was used to assess normality of the distribution of each of the parameters. Pearson correlation (or Spearman correlation in the absence of a normal distribution) was used to evaluate the correlation between the parameters predicted by the algorithms and those derived from the PSG. Normality of type 1, type 2, and absolute errors was assessed with the Kolmogorov–Smirnov test, and errors between both algorithms were compared with a paired Student's t‐test (or Wilcoxon signed‐rank test). The influence of age, ESS score, and TST (each categorised into terciles of the population) and of sex on the performances of S2S was studied using Kruskal–Wallis tests. All data analysis, statistical modelling, and plots were done using MATLAB R2021a. The significance level was set at p ≤ 0.05.
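As an illustration of the agreement analysis, the sketch below computes the Bland–Altman bias, the limits of agreement, and the one‐sample t statistic of the bias against zero. It assumes the difference is taken as PSG minus actigraphy, which appears consistent with the Figure 2 caption, and it uses the conventional ±1.96 SD limits; the function name is hypothetical, and the p‐value (which requires the t distribution) is left out.

```python
import numpy as np


def bland_altman(psg_values, act_values):
    """Bias, limits of agreement and t statistic for paired measurements
    (one value per subject), with differences taken as PSG - actigraphy."""
    psg_values = np.asarray(psg_values, dtype=float)
    act_values = np.asarray(act_values, dtype=float)
    diff = psg_values - act_values
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))
    return {
        "mean_of_methods": (psg_values + act_values) / 2,  # x-axis of the plot
        "difference": diff,                                # y-axis of the plot
        "bias": bias,
        "loa": (bias - 1.96 * sd, bias + 1.96 * sd),
        "t_stat": bias / (sd / np.sqrt(diff.size)),        # H0: bias = 0
    }
```

With this sign convention, a negative bias for TST means that actigraphy returns more sleep than PSG, i.e. an overestimation of sleep.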
3. RESULTS
3.1. Demographic data
Two hundred and six consecutive subjects were prospectively included and underwent the 32‐h PSG bedrest recording together with ACT recording over the same period. Due to non‐functional ACT, 36 (17.5%) recordings could not be analysed, and 44 (21.4%) others were rejected due to a lack of optimal synchronisation of the ACT data with PSG. Altogether, 126 recordings with accurately synchronised ACT and PSG were finally analysed over the 32‐h period. The main characteristics of these participants (91 females, mean age: 30.6 ± 15.5, 101 patients with confirmed IH and 25 with non‐specified hypersomnia) are provided in Table 1.
TABLE 1.
Characteristics of the study population.
| Variable | Number of subjects: 126 |
|---|---|
| Female, n (%) | 91 (70) |
| Age, years, mean (SD) | 30.6 (15.5) |
| BMI, kg/m2, mean (SD) | 23.0 (3.9) |
| ESS total score, mean (SD) | 14.7 (4.7) |
| Total sleep time on the 32‐h bedrest recording, hours, mean (SD) | 20.2 (2.6) |
| (A) Patients with total sleep time on bedrest ≥ 19 h, n (%) | 89 (71) |
| (B) Patients with mean sleep latency on MSLT ≤ 8 min, n (%) | 60 (48) |
| (A) + (B), n (%) | 48 (38) |
| IH diagnosis, n (%) | 101 (80) |
Abbreviations: BMI, body mass index; ESS, Epworth sleepiness scale; IH, idiopathic hypersomnia; MSLT, multiple sleep latency test (modified); SD, standard deviation.
3.2. ACT algorithms performances to predict sleep–wake parameters
ML algorithms and neural networks performed better than AS, Sadeh's and Cole‐Kripke's algorithms, which supports the need for a data‐specific trained model (Table 2). Although SEN is high during daytime and nighttime, SEN alone is not sufficient to measure performance, as high SEN may lead to an overestimation of sleep. The results revealed a frequent association between high SEN and poor SPE, particularly pronounced during nighttime, when wake is sparse and therefore more difficult to predict accurately. We also observed a reduced PVS during daytime, indicating that epochs predicted as sleep during the day are more prone to misclassification, highlighting the tendency of algorithms to overestimate sleep. Conversely, PVW is lower during nighttime compared with daytime, indicating a higher likelihood of misclassification of epochs predicted as wake at night. Our findings underscore the importance of considering a balanced tradeoff between false‐positive and false‐negative rates when evaluating the algorithms. Therefore, comprehensive assessment metrics, including SEN, SPE, PVS, PVW, the F1‐score (which also reflects this tradeoff), and Cohen's kappa, should collectively inform the assessment of algorithms' performance. Across the different metrics, the best‐performing algorithm was S2S, a type of recurrent network, which showed the highest Cohen's kappa, F1‐score, and SPE, with a small reduction in SEN, and the best PVS with a small reduction in PVW (Table 2). The detailed architecture of S2S can be found in supplementary data (Figure S1).
TABLE 2.
Actigraphy algorithm performances to predict sleep–wake parameters determined by polysomnography.
(a) Whole recording

| Algorithm | Overall accuracy | Sensitivity | Specificity | PVS | PVW | F1‐score | Cohen's kappa |
|---|---|---|---|---|---|---|---|
| AS | 76.0 (79.6–94.1) | 95.5 (92.8–97.3) | 55.5 (42.9–65.3) | 78.4 (72.4–84.2) | 80.9 (86.8–91.7) | 85.3 (82.3–89.2) | 51.9 (42.2–60.6) |
| Sadeh | 80.7 (75.1–86.1) | 98.2 (94.7–99.6) | 51.0 (35.5–65.7) | 78.5 (71.1–85.4) | 94.1 (86.5–98.1) | 85.6 (81.5–90.5) | 53.3 (38.3–64.3) |
| Cole‐Kripke | 80.4 (73.8–85.4) | 99.3 (97.1–99.8) | 45.6 (31.7–60.8) | 76.5 (68.1–84.3) | 97.4 (91.1–99.1) | 86.0 (80.1–90.3) | 50.8 (35.7–63.1) |
| Logistic Regression | 81.2 (79.0–82.3) | 93.0 (87.1–96.6) | 64.5 (53.1–73.2) | 82.1 (77.8–85.0) | 84.2 (76.9–90.1) | 86.0 (84.4–86.9) | 69.0 (48.4–78.1) |
| Random Forest | 87.0 (84.0–90.0) | 94.0 (89.0–96.0) | 80.0 (69.0–87.0) | 89.0 (84.0–93.0) | 87.0 (77.0–92.0) | 90.0 (87.0–92.0) | 71.0 (62.0–77.0) |
| SVM | 76.9 (71.3–79.7) | 86.5 (71.1–94.8) | 65.4 (48.1–81.5) | 81.5 (75.6–87.1) | 74.0 (62.0–84.7) | 82.6 (77.5–84.9) | 50.0 (33.2–66.9) |
| MLP | 86.8 (83.5–89.8) | 92.7 (87.9–95.9) | 80.6 (70.1–87.4) | 89.6 (84.3–93.2) | 86.5 (77.1–92.2) | 89.6 (87.1–92.2) | 70.3 (62.4–76.8) |
| CNN | 86.6 (82.7–89.4) | 94.9 (89.9–97.2) | 74.3 (63.8–86.3) | 87.7 (81.8–92.1) | 88.4 (81.1–93.6) | 89.8 (86.4–92.0) | 68.6 (58.2–75.7) |
| CNN‐MLP | 89.4 (85.6–91.8) | 95.7 (91.0–98.2) | 82.6 (70.9–90.8) | 90.9 (85.9–94.6) | 90.7 (83.6–96.1) | 91.2 (88.6–93.9) | 75.3 (65.1–81.2) |
| biLSTM | 86.6 (82.3–90.0) | 96.3 (90.5–98.8) | 75.4 (62.0–85.5) | 87.8 (81.6–92.0) | 91.7 (82.2–97.1) | 89.9 (86.4–92.5) | 69.3 (58.5–76.7) |
| Seq2seq LSTM | 89.9 (85.8–92.3) | 93.9 (88.8–96.9) | 86.6 (77.6–91.9) | 92.8 (87.9–95.4) | 88.9 (80.3–94.1) | 91.9 (88.8–94.0) | 77.3 (67.8–82.9) |

(b) Daytime (DT) and nighttime (NT)

| Algorithm | Sensitivity, DT | Sensitivity, NT | Specificity, DT | Specificity, NT | PVS, DT | PVS, NT | PVW, DT | PVW, NT |
|---|---|---|---|---|---|---|---|---|
| AS | 95.2 (92.6–97.8) | 95.4 (92.8–97.4) | 57.0 (44.1–65.9) | 50.6 (33.0–63.0) | 64.9 (51.8–73.7) | 90.1 (84.5–94.0) | 94.1 (88.5–97.4) | 68.9 (57.4–78.6) |
| Sadeh | 97.5 (93.9–99.6) | 98.9 (95.4–99.8) | 55.3 (41.5–72.3) | 32.8 (14.1–49.8) | 67.1 (49.9–78.4) | 88.8 (82.8–93.1) | 96.8 (91.2–99.3) | 79.1 (63.5–93.4) |
| Cole‐Kripke | 99.3 (96.5–99.9) | 99.4 (97.6–99.9) | 50.9 (36.7–64.9) | 30.1 (15.2–47.3) | 64.4 (48.4–76.6) | 88.3 (82.2–93.2) | 98.6 (95.0–99.8) | 88.5 (75.8–95.6) |
| Logistic Regression | 91.1 (84.6–95.8) | 93.9 (88.4–97.0) | 69.2 (58.4–77.2) | 51.1 (36.5–60.7) | 70.9 (64.6–75.4) | 89.6 (87.6–91.2) | 90.5 (86.2–94.5) | 64.2 (53.1–72.9) |
| Random Forest | 89.0 (51.0–93.0) | 96.0 (93.0–98.0) | 86.0 (77.0–92.0) | 57.0 (40.0–71.0) | 84.0 (72.0–90.0) | 93.0 (88.0–96.0) | 90.0 (83.0–93.0) | 74.0 (58.0–84.0) |
| SVM | 84.4 (66.8–93.3) | 87.6 (73.0–95.4) | 70.0 (53.4–84.4) | 50.9 (29.0–72.5) | 70.0 (61.6–77.7) | 89.2 (86.1–92.9) | 84.5 (75.5–90.7) | 48.3 (35.3–61.8) |
| MLP | 85.9 (76.4–91.5) | 96.8 (93.9–98.6) | 88.3 (79.6–93.2) | 55.1 (35.7–69.6) | 85.4 (74.1–91.5) | 92.6 (87.4–95.4) | 88.6 (79.8–93.8) | 76.3 (62.3–86.6) |
| CNN | 92.9 (87.1–96.1) | 96.2 (91.9–98.1) | 80.3 (67.8–89.4) | 56.7 (38.9–74.3) | 78.3 (67.1–87.8) | 93.5 (88.6–96.1) | 92.9 (86.9–96.2) | 67.4 (53.2–81.4) |
| CNN‐MLP | 94.1 (87.8–98.1) | 96.4 (91.9–98.5) | 86.0 (75.4–94.0) | 64.9 (46.4–82.6) | 85.0 (74.0–91.9) | 95.2 (91.3–96.8) | 95.1 (89.9–98.2) | 75.5 (60.8–85.7) |
| biLSTM | 95.8 (89.2–98.8) | 96.9 (91.3–99.1) | 80.3 (66.9–89.8) | 52.5 (29.5–73.4) | 80.4 (66.5–88.0) | 93.1 (88.5–95.3) | 96.5 (89.5–98.8) | 73.3 (56.1–88.4) |
| Seq2seq LSTM | 89.9 (80.5–95.7) | 96.8 (93.3–98.6) | 92.2 (83.9–96.4) | 67.4 (44.5–81.6) | 90.7 (81.9–95.1) | 94.5 (90.0–96.6) | 92.1 (83.7–96.6) | 78.4 (63.4–88.8) |
Note: The algorithm that performs best for each metric is highlighted in bold. Results are reported as median (1st quartile – 3rd quartile).
Abbreviations: AS, actigraphy software algorithm; biLSTM, bidirectional long‐short term memory network; CNN, convolutional neural network; CNN‐MLP, convolutional neural network combined with a multilayer perceptron; DT, daytime (7 a.m.–11 p.m.) during the bedrest; MLP, multilayer perceptron; NT, nighttime (11 p.m.–7 a.m., night 1 + night 2) during the bedrest; PVS, predictive value for sleep; PVW, predictive value for wake; Seq2seq LSTM, sequence to sequence long short‐term memory network; SVM, support vector machine.
Using a permutation approach, we evaluated the importance of individual features by measuring the increase in prediction error caused by shuffling each parameter. This was quantified as an error ratio, representing the relative increase in error. Sorted error ratios, as well as corresponding features, for error ratios above 1.3, can be found in supplementary data (Figure S2, Table S3). Approximately 50% of the input features (67 in total) showed minimal impact on prediction error (ratio ≤1.1). In contrast, 43 features significantly influenced prediction error when shuffled (ratio ≥1.3). Of these, 14 corresponded to the maximum activity within windows of various sizes (5 to 31 epochs) centred on the epoch of interest, 11 reflected the median activity, and 9 represented the standard deviation of activity within these windows.
3.3. Bland–Altman analysis of sleep wake parameters assessed by ACT and PSG
During the first night, TST (and consequently sleep efficiency) was accurately predicted by AS and slightly overestimated by S2S, sleep latency was underestimated by AS and accurately predicted by S2S, while WASO was significantly overestimated by AS and underestimated by S2S. Despite these discrepancies with PSG, biases remained within clinically acceptable ranges (Table 3, Figure 2). During daytime, TST was clinically and significantly overestimated by AS, whereas biases were close to zero with S2S. Similarly, for night 2, biases for all parameters were closer to zero for S2S than for AS. Altogether, AS showed a greater divergence from the PSG on overall TST, although its performance during the first night was comparable to that of S2S. AS performed well in cases of undisrupted nocturnal sleep with rare awakenings (i.e. ‘night 1’ of the bedrest with preserved sleep homeostasis, as reflected by the high sleep efficiency of 91.6 ± 6.5%), while its performance was lower when sleep was fragmented (i.e. ‘night 2’ of the bedrest with variable TST due to differences in the amount of sleep the day before, as reflected by the low sleep efficiency of 72.2 ± 14.1%). In contrast, our S2S algorithm performed better and closer to the PSG, especially during daytime and night 2 (Table 3, Figure 2). Finally, the biases of the AS and S2S estimates differed significantly for all sleep and wake parameters, confirming the clear difference between these two algorithms in the estimation of sleep parameters observed in the Bland–Altman plots (Figure 2).
TABLE 3.
Bland–Altman analysis of the sleep parameters scored on polysomnography and on actigraphy by the Actiwatch software and the selected algorithm S2S.
| Parameter | AS: bias (95% CI) | AS: p‐value | S2S: bias (95% CI) | S2S: p‐value | AS versus S2S: bias (95% CI) | AS versus S2S: p‐value |
|---|---|---|---|---|---|---|
| Total sleep time bedrest (min) | −255.0 (−289.3, −220.7) | <0.01 | −27.9 (−55.9, 0.2) | 0.07 | −227.2 (−252.3, −202.0) | <0.01 |
| Night 1 (11 p.m.–7 a.m.) | | | | | | |
| Total sleep time (min) | 1.0 (−4.1, 6.2) | 0.69 | −9.7 (−15.1, −4.3) | <0.01 | 10.7 (7.3, 14.1) | <0.01 |
| Sleep latency (min) | 7.2 (5.0, 9.3) | <0.01 | 2.2 (−0.6, 4.9) | 0.13 | 5.0 (2.6, 7.5) | <0.01 |
| Sleep efficiency (%) | 0.22 (−0.9, 1.3) | 0.69 | −2.1 (−3.4, −0.9) | <0.01 | 2.3 (1.6, 3.0) | <0.01 |
| WASO (min) | −13.2 (−22.7, −3.6) | <0.01 | 15.2 (5.9, 24.6) | <0.01 | −28.4 (−34.8, −22.0) | <0.01 |
| Daytime 1 (7 a.m.–3 p.m.) | | | | | | |
| Total sleep time (min) | −68.7 (−80.4, −57.0) | <0.01 | −8.8 (−17.8, 0.2) | 0.07 | −59.9 (−69.1, −50.8) | <0.01 |
| Daytime 2 (3 p.m.–11 p.m.) | | | | | | |
| Total sleep time (min) | −144.0 (−157.2, −130.7) | <0.01 | −0.6 (−10.7, 9.5) | 0.91 | −143.3 (−154.1, −132.6) | <0.01 |
| Night 2 (11 p.m.–7 a.m.) | | | | | | |
| Total sleep time (min) | −50.2 (−60.1, −40.3) | <0.01 | −18.5 (−28.3, −8.7) | <0.01 | −31.7 (−41.6, −21.7) | <0.01 |
| Sleep efficiency (%) | −10.5 (−12.6, −8.4) | <0.01 | −3.9 (−5.9, −1.8) | <0.01 | −6.6 (−8.7, −4.5) | <0.01 |
| WASO (min) | 93.0 (73.4, 112.7) | <0.01 | 34.2 (15.5, 52.8) | <0.01 | 58.9 (39.0, 78.7) | <0.01 |
Note: Bold indicates significant values.
Abbreviations: AS, Actiwatch Software algorithm; S2S, sequence to sequence LSTM (selected algorithm), WASO, wake after sleep onset.
FIGURE 2.

Bland–Altman plots of agreement between actigraphy and polysomnography for total sleep time during the whole recording, nighttime (night 1, night 2) and daytime; wake after sleep onset (night 1, night 2) and sleep latency (night 1). For each parameter, X‐axis represents the mean of PSG and actigraphy values, and Y‐axis represents the difference between PSG and actigraphy values. Horizontal solid lines denote the average mean difference and dotted lines the limits of agreement (95% CI). AS, Actiwatch Software algorithm, BR, bedrest; CI, confidence interval; S2S, sequence to sequence LSTM (selected algorithm); TST, total sleep time; WASO, wake after sleep onset.
3.4. Error types in the algorithm during the bedrest between PSG and ACT algorithms
Type 1 (i.e. overestimation of sleep), type 2 (i.e. underestimation of sleep) and absolute errors were computed for both AS and S2S, during the whole bedrest and across four consecutive 8‐h time periods (night 1, daytime 1, daytime 2, and night 2) (Figure 3; see Figure S3 for a graphical comparative example). The absolute error was greatly reduced with S2S compared with AS during the whole recording (median error AS vs S2S: 385 min vs 169 min, p < 0.001), as well as during night 1, both daytime periods, and night 2. AS showed its greatest absolute error during daytime ([7 a.m.–3 p.m.] 93 min vs 45 min, p < 0.001; [3 p.m.–11 p.m.] 154 min vs 39 min, p < 0.001) (Figure 3a). Analysis of the type of error revealed that AS mainly made type 1 errors, while its type 2 error was minimal. In contrast, S2S made significantly fewer type 1 errors, both overall (median error AS vs S2S: 321 min vs 80 min, p < 0.001) and across time ([7 a.m.–3 p.m.] 74 min vs 20 min, p < 0.001; [3 p.m.–11 p.m.] 148 min vs 12 min, p < 0.00001; [11 p.m.–7 a.m.] 60 min vs 25 min, p < 0.001). S2S made significantly more type 2 errors both overall (54 min vs 57 min, p = 0.05) and during daytime ([7 a.m.–3 p.m.] 11.5 min vs 13.5 min, p < 0.001; [3 p.m.–11 p.m.] 4.7 min vs 15.2 min, p < 0.00001), but the amount remained very low (Figure 3b).
FIGURE 3.

Comparison of error types to score sleep and wake parameters of both actigraphy algorithms based on scoring from polysomnography across time during the 32‐h bedrest recording. (a) Absolute error made by AS and S2S, during the whole recording and across time during the bedrest; (b) distinction between type 1 (over‐estimation of sleep) and type 2 (under‐estimation of sleep) errors. *p ≤ 0.05, **p ≤ 0.01. AS, Actiwatch Software algorithm; S2S, sequence to sequence LSTM (selected algorithm).
3.5. Error types in the algorithm and characteristics of the subjects
We evaluated the impact of age, sex, ESS, and TST on the whole‐recording absolute, type 1 and type 2 errors of S2S. We found no effect of sex or ESS on any error type. Age had no effect on the absolute and type 1 errors, but younger subjects had more type 2 error (p = 0.04, post‐hoc: tercile 1 (T1) [age < 22.6 years] > tercile 3 (T3) [age ≥ 30.2 years]). TST was not associated with absolute error, but it was associated with type 1 and type 2 errors: shorter sleepers had more type 1 errors (p = 0.02, post‐hoc: T1 [TST < 19.2 h] > T3), while longer sleepers had more type 2 errors (p = 0.01, post‐hoc: T3 [TST ≥ 21.6 h] > T1).
In a sensitivity analysis, we investigated differences in absolute, type 1 and type 2 errors between IH subjects and non‐IH patients. We found no differences in absolute and type 1 errors, but we found more type 2 error in IH subjects (p = 0.01). Additionally, we compared performances in subjects with prolonged sleep time ((A) TST ≥ 19 h on the bedrest) versus others, and found the latter to have more type 1 error (p = 0.04), while the former had more type 2 error (p = 0.01), but no difference in absolute errors. We found that participants with objective sleepiness ((B) MSL ≤ 8 min on the MSLT) had fewer type 1 (p = 0.04) but more type 2 (p = 0.01) errors than those without. Similarly, subjects with both (A) and (B) had fewer type 1 (p = 0.01) and more type 2 (p = 0.01) errors compared with those without.
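The tercile comparisons reported in this section can be illustrated with a short descriptive sketch: subjects are grouped into terciles of a covariate (for example TST) and the median error is summarised per tercile. The cut‐points are computed from the toy data below, which are hypothetical and are not study data; the statistical test behind the reported p values is not reproduced here.

```python
from statistics import median, quantiles

def tercile_medians(covariate, error):
    """Group subjects into terciles of `covariate` and return median error per tercile."""
    low_cut, high_cut = quantiles(covariate, n=3)  # two cut-points
    groups = {"T1": [], "T2": [], "T3": []}
    for x, e in zip(covariate, error):
        # T1: below the first cut-point; T3: at or above the second
        key = "T1" if x < low_cut else ("T3" if x >= high_cut else "T2")
        groups[key].append(e)
    return {k: median(v) for k, v in groups.items() if v}

# Hypothetical values (not study data): TST in hours, type 2 error in minutes
tst_hours = [17.0, 18.0, 18.5, 19.5, 20.0, 21.0, 22.0, 22.5, 23.0]
type2_min = [20, 25, 30, 40, 45, 50, 70, 80, 90]
medians = tercile_medians(tst_hours, type2_min)
```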
4. DISCUSSION
We developed and tested several ACT algorithms using PSG as the reference standard in a larger sample of patients with suspected idiopathic hypersomnia than any prior work, during a 32‐h bedrest protocol, and we compared their performances with the automatic sleep/wake scoring integrated in the Actiwatch software and two other widely used scores (Sadeh's and Cole's algorithms). Our best algorithm performed much better than AS and than Sadeh's and Cole's algorithms on the relevant performance metrics. It also outperformed AS on sleep and wake variables and on absolute and type 1 errors, at the cost of a small increase in type 2 error, and exhibited better performance than reported in the existing literature.
To our knowledge, this study is the first to validate an ACT‐based sleep–wake prediction algorithm in both nighttime and daytime settings in a sample of subjects with suspected idiopathic hypersomnia. As part of a unique approach, ACT and PSG were manually aligned, a crucial step often overlooked when evaluating ACT‐derived parameters rather than focussing on epoch‐to‐epoch performance, and an extensive process was applied to train both ML and DL models. Our results are consistent with those of a recent study that compared several traditional, ML, and DL‐based algorithms using a similar methodology on daytime and nighttime concomitant actimetry and PSG recordings, in a population of subjects without hypersomnia (Palotti et al., 2019). That study found that DL algorithms outperformed other algorithms, particularly a long short‐term memory network algorithm, which uses a network architecture similar to ours (Palotti et al., 2019). Biscarini et al. (2024) recently conducted a study with a similar objective, in IH and in narcolepsy, though with a different design (24‐h continuous PSG, but without instructions to remain in bed) and in a smaller sample. Their findings aligned with ours, showing that traditional prediction algorithms for actigraphy tend to overestimate total sleep time when daytime periods are included, and highlighting the need for caution when using actigraphy for diagnostic purposes.
When examining differences in TST, our algorithm consistently showed comparable or lower differences from the PSG‐based scoring than AS. Specifically, on the first night, the median absolute error was 21.5 min, similar to performance levels reported in the existing literature in other sleep disorders. Alakuijala et al. (2021) showed a mean (non‐absolute) difference in TST over one night of −24.3 min in subjects with narcolepsy type 1 (NT1) (underestimation of sleep), and of 12.8 min in patients suffering from obstructive sleep apnea. Choi et al. (2017) found that ACT overestimated the average TST by 29.3 min in one‐night PSG in patients with sleep‐disordered breathing, and by 31.9 min in patients with chronic insomnia. Similar to AS, our algorithm S2S exhibited poorer performances in more fragmented sleep/wake patterns. This was observed during the daytime and the second night of the bedrest, and was linked to the large variability in wake and sleep duration between patients during the last 24 h of the recording; however, these errors remained smaller than those observed with AS. Unlike most algorithms from the literature, our algorithm was trained on balanced data and managed to perform well both during the night (where detecting wakefulness during sleep can be challenging) and during the day (where detecting sleep during waketime can be difficult, and the distinction between sleep and immobility is crucial).
We found an association between S2S performances and TST. The longer the subjects slept, the more likely the algorithm was to make type 2 errors (underestimation of sleep), and conversely. This is not surprising: in the extreme theoretical case of a subject sleeping all the time, the algorithm can only perfectly predict or underestimate sleep. Participants with prolonged sleep in the bedrest generally experienced more sleep bouts, leading to more initiations of sleep and more transitional periods of lighter sleep, which the algorithm might misclassify as wakefulness. This may also be due to the use of time as a parameter in training the model, as long sleepers could sleep at times when most subjects are awake (e.g. the forbidden sleep zone around 6 p.m.–8 p.m.), which would make the algorithm more likely to score those epochs as wake. This should be less problematic for outpatient ACT, as patients are more likely to remain awake at this time because of their social routine. This also explains the increase in type 2 errors in subjects with longer TST on the bedrest recording and in those with objective sleepiness, two groups that overlap with patients with idiopathic hypersomnia.
It is crucial to be able to discriminate patients with idiopathic hypersomnia from subjects with clinophilia (i.e. lying in bed, remaining awake, often observed in major depressive episodes) and long sleepers (i.e. need for longer sleep duration, without daytime impairment when sleep duration is adequately obtained) (Dauvilliers et al., 2022). This task becomes particularly challenging in real‐world conditions, when the reference PSG is unavailable and only ACT data are accessible. By training our algorithm during the bedrest, where the instructions are to stay calmly in bed and try to sleep ad libitum and the reference PSG is available, we sought to ensure that it can separate lying awake from sleep. Furthermore, ACT could prove useful in differentiating the IH phenotypes with and without long sleep time. Indeed, naps are scheduled and limited in time during the MSLT, as are regular PSG recordings, and the bedrest protocol takes place in conditions that are rarely reproduced at home.
There is a need to further our understanding of idiopathic hypersomnia outside the laboratory environment, without relying on PSG; home‐based assessments could improve diagnosis and management of the disorder. Actigraphy is listed as a tool in the ICSD‐3‐TR criteria to diagnose IH, as a non‐invasive and cost‐effective solution, but surprisingly lacks sufficient validation and reliability in this condition. In addition, it is important to highlight the poor performances of AS, which displays clear overestimation of sleep during daytime, and larger divergence from PSG on sleep/wake parameters during daytime and the second night. A graphical comparative example of the three sleep and wake scoring methods is provided in Figure S3 to illustrate the limits of using commercial algorithms when evaluating populations with hypersomnolence. The biases of the AS and S2S estimates in the Bland–Altman analysis differed significantly for all sleep and wake parameters, confirming the clear difference between these two algorithms in the estimation of sleep parameters. Thus, AS scoring cannot be considered reliable for scoring sleep–wake phases in continuous monitoring (during both daytime and nighttime periods) when the reference PSG is not available. There is a clear need for a more reliable algorithm that produces predictions more closely aligned with PSG results. Alternatively, one should at least develop a post‐processing step, such as removal of aberrant and transient sleep bouts in the middle of clear activity periods, which tend to be scored by AS as sleep because its prediction is threshold‐based. A recent paper highlighted that the prevalence of IH in the general population might be greater than previously thought (Plante et al., 2024), further supporting the need for reliable non‐PSG diagnostic tools, such as ACT, which to date is neither sufficiently validated nor sufficiently reliable.
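The post‐processing idea mentioned above, removing transient sleep bouts that sit in the middle of wake, can be sketched as follows. This is a simplified illustration rather than a validated procedure: it works on predicted labels only (0 for wake, 1 for sleep) instead of raw activity counts, and the function name and the `min_epochs` threshold are hypothetical and would need to be tuned against PSG.

```python
def remove_short_sleep_bouts(labels, min_epochs=10):
    """Relabel as wake any sleep bout shorter than `min_epochs` that is flanked by wake.

    labels: sequence of 0 (wake) / 1 (sleep).
    min_epochs: hypothetical threshold for this sketch, to be tuned against PSG.
    """
    cleaned = list(labels)
    n = len(cleaned)
    i = 0
    while i < n:
        if cleaned[i] == 1:
            j = i
            while j < n and cleaned[j] == 1:
                j += 1  # j ends one past the last epoch of this sleep bout
            # Only bouts with wake on both sides are candidates; bouts touching
            # the start or end of the recording are left untouched.
            flanked = i > 0 and j < n
            if flanked and (j - i) < min_epochs:
                cleaned[i:j] = [0] * (j - i)
            i = j
        else:
            i += 1
    return cleaned

# Toy example: a 2-epoch bout is removed, a 4-epoch bout is kept (threshold 3)
smoothed = remove_short_sleep_bouts([0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0], min_epochs=3)
```

A rule of this kind trades type 1 error for type 2 error, so its threshold should be chosen with both error types in view.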
There are several strengths of this study. We evaluated the performance of wrist ACT in a larger sample of individuals with suspected IH than any prior work. We considered measures across a 32‐h bedrest recording, collecting in‐lab PSG and MSLT data. Analyses of the performance of ACT vs. PSG considered both night‐time and day‐time periods, which is unique. We hand‐aligned ACT and PSG epochs, a critical step before evaluating performance. Finally, we followed a rigorous process to test multiple ML and DL models using training and test sets.
We acknowledge, however, several limitations. First, we did not incorporate micro‐arousals in our algorithms, events that may be associated with movements recorded during the sleep periods. Second, the absence of a perfect synchronisation method between hypnograms and ACT is a major drawback. No automated method has been developed to date for this synchronisation (Palotti et al., 2019). This is a critical and delicate step that we had to perform visually; it is hence subject to human error, and it also led to discarding many data. Because of this pre‐processing method, exporting our algorithm to other sleep laboratories, potentially using different ACT devices in different populations, might not yield similar performances, and the algorithm should be tested before being used. The present results may indeed not generalise to all patients suffering from hypersomnolence in other sleep laboratories. We therefore recommend a pre‐synchronisation step when initiating concurrent PSG and actimetry recordings, using a timed arm‐shake that generates a marked increase in ACT counts to help synchronise PSG and actigraphy. In addition, during the training phase of our algorithms, we optimised our hyperparameters based on a random split of our data into training and test sets. We then used these hyperparameters in the bootstrapping procedure, which involves a different training/test randomisation in each iteration. Thus, we may have data leakage, as data from the training set used for hyperparameter selection could later be used in the prediction phase of a bootstrap iteration. To limit further data leakage, we always performed the training/test split by subject, so that data from a given subject were either in the training set or in the test set.
We did not perform feature selection. Although the features were designed based on those commonly used in the literature (Cho et al., 2019; Cole et al., 1992; Palotti et al., 2019; Sadeh et al., 1994) and recurrent neural networks inherently perform some level of feature selection, the use of bootstrapping may have amplified irrelevant noise and contributed to overfitting. These limitations highlight the need for replication and validation of our findings with novel datasets to ensure the generalisability of the model. Third, the study population primarily comprised individuals with a complaint of hypersomnolence and a high clinical suspicion of IH; however, some of these patients ultimately did not meet criteria for IH. Fourth, we split the bedrest into two nights and a daytime period, in order to differentiate the performances of the algorithms in conditions where sleep is prevalent and detecting wake is challenging, and conversely. While participants' sleep patterns may not align precisely with these divisions, this approach was chosen to maintain standardisation, and was consistent with prior research on the bedrest protocol (Evangelista et al., 2018; Evangelista et al., 2021). Nonetheless, it is important to note that this division may not accurately reflect the participants' actual nightly sleep patterns. Our sample also lacked normal sleepers. Because recruiting patients with IH, a rare disorder, for a 32‐h bedrest protocol took a long time, we used a single actigraph model throughout, the Actiwatch‐2, a device with tri‐axial piezoelectric accelerometer technology that is still widely used in many sleep laboratories. The exportability of our algorithm to other piezoelectric and micro‐electro‐mechanical system (MEMS) accelerometers remains to be defined. The Actiwatch‐2 has been validated for motions above 0.35 Hz, which encompasses most spontaneous human motions, and MEMS accelerometers are well known to be more suitable for human movement.
However, both technologies measure essentially the same underlying signal, and MEMS‐actigraphy data would need to be rescaled to match the range of the Actiwatch‐2. Future research could formally confirm signal similarity through simultaneous recording from an Actiwatch‐2 and a MEMS accelerometer. Note that the ICSD‐3‐TR (American Academy of Sleep Medicine, 2023), which includes an actigraphy‐based criterion for the diagnosis of IH, does not specify device requirements in terms of accelerometer technology. Here, we assessed the performance of the best algorithm during a 32‐h in‐laboratory bedrest recording that allowed us to distinguish precisely between sleep, remaining awake but inactive in bed, and being fully awake in the bed. This is a crucial distinction to allow its subsequent application at home in patients suffering from hypersomnolence. However, we acknowledge that this procedure is not representative of ecological conditions; future studies are needed to test this algorithm for studying sleep–wake patterns without PSG in the home of patients with hypersomnia. Novel actimetry technologies now often incorporate other biosignals, such as heart rate, skin temperature, and blood oxygen saturation. These additional signals will be valuable in improving the accuracy of actigraphy‐based sleep–wake prediction in the future, based on machine learning algorithms.
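The subject‐wise training/test split described among the limitations above can be sketched as follows. This is a minimal illustration, not the study's code: the function name, the test fraction, and the random seed are hypothetical. The point it shows is that the split is drawn over subjects rather than over samples, so that no subject contributes data to both sets.

```python
import random

def split_by_subject(subject_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that each subject falls in exactly one set.

    subject_ids: one identifier per sample (e.g. per epoch or per sequence).
    test_fraction and seed are illustrative values, not the study's settings.
    """
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(subjects)  # randomise at the subject level, not the sample level
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train_idx, test_idx

# Toy example: five hypothetical subjects with two samples each
ids = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4", "s5", "s5"]
train_idx, test_idx = split_by_subject(ids)
```

To avoid the leakage discussed above, hyperparameters would be tuned on subjects that are never reused for testing in any later bootstrap iteration, for instance by holding out a fixed group of subjects for tuning only.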
5. CONCLUSION
This study is the first to validate an algorithm to derive sleep/wake patterns from ACT in a population of patients with suspected IH, during a controlled standardised 32‐h bedrest protocol, with PSG recording as the reference. This algorithm, based on a machine learning approach, performed much better than the automated scoring from the Actiwatch software and than the widely used prediction algorithms of Sadeh and of Cole and Kripke. This highlights the need for more advanced algorithms to evaluate actigraphy data in IH populations; the selected recurrent neural network model (S2S, for sequence‐to‐sequence long short‐term memory network) may be the best option for further research, although more in‐depth validation studies are needed. One of our future goals is to use this algorithm to study sleep–wake patterns without PSG in patients with hypersomnia, under ecological conditions in daily life for 2 weeks in drug‐free condition, and then to re‐evaluate them in the same environment after they have been medicated.
AUTHOR CONTRIBUTIONS
Tugdual Adam: Writing – original draft; software; formal analysis; data curation; methodology; validation. Jérôme Tanty: Writing – review and editing; data curation; investigation. Lucie Barateau: Investigation; data curation; writing – review and editing. Yves Dauvilliers: Conceptualization; writing – review and editing; validation; methodology; visualization; supervision; funding acquisition; investigation; resources; project administration; software.
FUNDING INFORMATION
We declare no financial arrangements or connections related to this article.
CONFLICT OF INTEREST STATEMENT
L Barateau received funds for travelling to conferences from Idorsia and Bioprojet, and for board engagements from Jazz, Takeda, Idorsia, and Bioprojet. Y Dauvilliers received funds for seminars, board engagements and travel to conferences from UCB Pharma, Jazz, Theranexus, Idorsia, Takeda, Avadel, and Bioprojet. T Adam and J Tanty report no disclosures.
Supporting information
Data S1 Supporting Information.
ACKNOWLEDGEMENTS
We thank all study participants, and the French Association of Narcoleptic patients (ANC, Association Française de Narcolepsie Cataplexie et d'Hypersomnies rares). We thank all collaborators at the National Reference Center for Narcolepsy in Montpellier.
Adam, T., Tanty, J., Barateau, L., & Dauvilliers, Y. (2025). Actigraphy against 32‐hour polysomnography in patients with suspected idiopathic hypersomnia. Journal of Sleep Research, 34(5), e70007. 10.1111/jsr.70007
Lucie Barateau and Yves Dauvilliers share last authorship.
DATA AVAILABILITY STATEMENT
The data and codes that support the findings of this study are publicly available at https://github.com/TukDwal/Actigraphy_Idiopathic_Hypersomnia.
REFERENCES
- Acebo, C., Sadeh, A., Seifer, R., Wolfson, A. R., Hafer, A., & Carskadon, M. A. (1999). Estimating sleep patterns with activity monitoring in children and adolescents: how many nights are necessary for reliable measures? Sleep, 22(1), 95–103. https://academic.oup.com/sleep/article/22/1/95/2731704
- Alakuijala, A., Sarkanen, T., Jokela, T., & Partinen, M. (2021). Accuracy of Actigraphy compared to concomitant ambulatory polysomnography in narcolepsy and other sleep disorders. Frontiers in Neurology, 12, 629709. 10.3389/FNEUR.2021.629709
- American Academy of Sleep Medicine. (2023). International classification of sleep disorders‐third edition, text revision (ICSD‐3‐TR). American Academy of Sleep Medicine.
- American Sleep Disorders Association. (1995). Practice parameters for the use of actigraphy in the clinical assessment of sleep disorders. Sleep, 18(4), 285–287. 10.1093/SLEEP/18.4.285
- Ancoli‐Israel, S., Clopton, P., Klauber, M. R., Fell, R., & Mason, W. (1997). Use of wrist activity for monitoring sleep/wake in demented nursing‐home patients. Sleep, 20(1), 24–27. 10.1093/SLEEP/20.1.24
- Ancoli‐Israel, S., Cole, R., Alessi, C., Chambers, M., Moorcroft, W., & Pollak, C. P. (2003). The role of actigraphy in the study of sleep and circadian rhythms. Sleep, 26(3), 342–392. 10.1093/SLEEP/26.3.342
- Biscarini, F., Vandi, S., Riccio, C., Raggini, L., Neccia, G., Plazzi, G., & Pizza, F. (2024). The actigraphic evaluation of daytime sleep in central disorders of hypersomnolence: Comparison with polysomnography. Sleep, 47(12), zsae189. 10.1093/SLEEP/ZSAE189
- Blackwell, T., Redline, S., Ancoli‐Israel, S., Schneider, J. L., Surovec, S., Johnson, N. L., Cauley, J. A., Stone, K. L., & Study of Osteoporotic Fractures Research Group. (2008). Comparison of sleep parameters from actigraphy and polysomnography in older women: The SOF study. Sleep, 31(2), 283–291. 10.1093/SLEEP/31.2.283
- Cho, T., Sunarya, U., Yeo, M., Hwang, B., Koo, Y. S., & Park, C. (2019). Deep‐ACTINet: End‐to‐end deep learning architecture for automatic sleep–wake detection using wrist actigraphy. Electronics, 8(12), 1461. 10.3390/electronics8121461
- Choi, S. J., Kang, M., Sung, M. J., & Joo, E. Y. (2017). Discordant sleep parameters among actigraphy, polysomnography, and perceived sleep in patients with sleep‐disordered breathing in comparison with patients with chronic insomnia disorder. Sleep & Breathing, 21(4), 837–843. 10.1007/S11325-017-1514-5
- Cole, R. J., Kripke, D. F., Gruen, W., Mullaney, D. J., & Gillin, J. C. (1992). Automatic sleep/wake identification from wrist activity. Sleep, 15(5), 461–469. 10.1093/SLEEP/15.5.461
- Cook, J. D., Eftekari, S. C., Leavitt, L. A., Prairie, M. L., & Plante, D. T. (2019). Optimizing actigraphic estimation of sleep duration in suspected idiopathic hypersomnia. Journal of Clinical Sleep Medicine, 15(4), 597–602. 10.5664/jcsm.7722
- Dauvilliers, Y., Bogan, R. K., Arnulf, I., Scammell, T. E., St Louis, E. K., & Thorpy, M. J. (2022). Clinical considerations for the diagnosis of idiopathic hypersomnia. Sleep Medicine Reviews, 66, 101709. 10.1016/J.SMRV.2022.101709
- de Souza, L., Benedito‐Silva, A. A., Pires, M. L. N., Poyares, D., Tufik, S., & Calil, H. M. (2003). Further validation of actigraphy for sleep studies. Sleep, 26(1), 81–85. 10.1093/SLEEP/26.1.81
- Evangelista, E., Lopez, R., Barateau, L., Chenini, S., Bosco, A., Jaussent, I., & Dauvilliers, Y. (2018). Alternative diagnostic criteria for idiopathic hypersomnia: A 32‐hour protocol. Annals of Neurology, 83(2), 235–247. 10.1002/ANA.25141
- Evangelista, E., Rassu, A. L., Barateau, L., Lopez, R., Chenini, S., Jaussent, I., & Dauvilliers, Y. (2021). Characteristics associated with hypersomnia and excessive daytime sleepiness identified by extended polysomnography recording. Sleep, 44(5), zsaa264. 10.1093/SLEEP/ZSAA264
- Filardi, M., Pizza, F., Martoni, M., Vandi, S., Plazzi, G., & Natale, V. (2015). Actigraphic assessment of sleep/wake behavior in central disorders of hypersomnolence. Sleep Medicine, 16(1), 126–130. 10.1016/J.SLEEP.2014.08.017
- Hauri, P. J., & Wisbey, J. (1992). Wrist actigraphy in insomnia. Sleep, 15(4), 293–301. 10.1093/SLEEP/15.4.293
- Iber, C., Ancoli‐Israel, S., Chesson, A., Quan, S., & for the American Academy of Sleep Medicine. (2007). The AASM Manual for the Scoring of Sleep and Associated Events.
- Jean‐Louis, G., von Gizycki, H., Zizi, F., Hauri, P., Spielman, A., & Taub, H. (1997). The Actigraph data analysis software: I. A novel approach to scoring and interpreting sleep–wake activity. Perceptual and Motor Skills, 85(1), 207–216. 10.2466/PMS.1997.85.1.207
- Jean‐Louis, G., Zizi, F., Von Gizycki, H., & Hauri, P. (1999). Actigraphic assessment of sleep in insomnia: Application of the Actigraph data analysis software (ADAS). Physiology & Behavior, 65(4–5), 659–663. 10.1016/S0031-9384(98)00213-3
- Lichstein, K. L., Stone, K. C., Donaldson, J., Nau, S. D., Soeffing, J. P., Murray, D., Lester, K. W., & Aguillard, R. N. (2006). Actigraphy validation with insomnia. Sleep, 29(2), 232–239. 10.1093/SLEEP/29.2.232
- Meltzer, L. J., Hiruma, L. S., Avis, K., Montgomery‐Downs, H., & Valentin, J. (2015). Comparison of a commercial accelerometer with polysomnography and actigraphy in children and adolescents. Sleep, 38(8), 1323–1330. 10.5665/sleep.4918
- Palotti, J., Mall, R., Aupetit, M., Rueschman, M., Singh, M., Sathyanarayana, A., Taheri, S., & Fernandez‐Luque, L. (2019). Benchmark on a large cohort for sleep–wake classification with machine learning techniques. NPJ Digital Medicine, 2(1), 50. 10.1038/S41746-019-0126-9
- Pizza, F., Moghadam, K. K., Vandi, S., Detto, S., Poli, F., Mignot, E., Ferri, R., & Plazzi, G. (2013). Daytime continuous polysomnography predicts MSLT results in hypersomnias of central origin. Journal of Sleep Research, 22(1), 32–40. 10.1111/J.1365-2869.2012.01032.X
- Plante, D. T., Hagen, E. W., Barnet, J. H., Mignot, E., & Peppard, P. E. (2024). Prevalence and course of idiopathic hypersomnia in the Wisconsin sleep cohort study. Neurology, 102(2), e207994. 10.1212/WNL.0000000000207994
- Sadeh, A. (2011). The role and validity of actigraphy in sleep medicine: An update. Sleep Medicine Reviews, 15(4), 259–267. 10.1016/j.smrv.2010.10.001
- Sadeh, A., Sharkey, K. M., & Carskadon, M. A. (1994). Activity‐based sleep–wake identification: An empirical test of methodological issues. Sleep, 17(3), 201–207. 10.1093/SLEEP/17.3.201
- Smith, M. T., McCrae, C. S., Cheung, J., Martin, J. L., Harrod, C. G., Heald, J. L., & Carden, K. A. (2018). Use of Actigraphy for the evaluation of sleep disorders and circadian rhythm sleep–wake disorders: An American Academy of sleep medicine systematic review, meta‐analysis, and GRADE assessment. Journal of Clinical Sleep Medicine, 14(7), 1209–1230. 10.5664/JCSM.7228
- Taibi, D. M., Landis, C. A., & Vitiello, M. V. (2013). Concordance of polysomnographic and Actigraphic measurement of sleep and wake in older women with insomnia. Journal of Clinical Sleep Medicine, 9(3), 217–225. 10.5664/JCSM.2482
- te Lindert, B. H. W., van der Meijden, W. P., Wassing, R., Lakbila‐Kamal, O., Wei, Y., van Someren, E. J. W., & Ramautar, J. R. (2020). Optimizing actigraphic estimates of polysomnographic sleep features in insomnia disorder. Sleep, 43(11), zsaa090. 10.1093/SLEEP/ZSAA090
- Tilmanne, J., Urbain, J., Kothare, M. V., Wouwer, A. V., & Kothare, S. V. (2009). Algorithms for sleep–wake identification using actigraphy: A comparative study and new results. Journal of Sleep Research, 18(1), 85–98. 10.1111/J.1365-2869.2008.00706.X
- Torstensen, E. W., Pickering, L., Korum, B. R., Wanscher, B., Baandrup, L., & Jennum, P. J. (2021). Diagnostic value of actigraphy in hypersomnolence disorders. Sleep Medicine, 85, 1–7. 10.1016/J.SLEEP.2021.06.033
- Verbeek, I., Arends, J., Declerck, G., & Beecher, L. (1994). Wrist actigraphy in comparison with polysomnography and subjective evaluation in insomnia. Sleep–Wake Research in the Netherlands, 5, 163–169.
- Vernet, C., & Arnulf, I. (2009). Idiopathic hypersomnia with and without long sleep time: A controlled series of 75 patients. Sleep, 32(6), 753–759. 10.1093/SLEEP/32.6.753
- Walia, H. K., & Mehra, R. (2019). Practical aspects of actigraphy and approaches in clinical and research domains. Handbook of Clinical Neurology, 160, 371–379. 10.1016/B978-0-444-64032-1.00024-2
- Williams, J. M., Taylor, D. J., Slavish, D. C., Gardner, C. E., Zimmerman, M. R., Patel, K., Reichenberger, D. A., Francetich, J. M., Dietch, J. R., & Estevez, R. (2020). Validity of Actigraphy in young adults with insomnia. Behavioral Sleep Medicine, 18(1), 91–106. 10.1080/15402002.2018.1545653