American Journal of Respiratory and Critical Care Medicine
Letter. 2021 Mar 30;204(10):1227–1231. doi: 10.1164/rccm.202103-0680LE

Machine Learning–based Sleep Staging in Patients with Sleep Apnea Using a Single Mandibular Movement Signal

Nhat-Nam Le-Dong 1,*, Jean-Benoit Martinot 2,3,*,, Nathalie Coumans 2, Valérie Cuthbert 2, Renaud Tamisier 4,5, Sébastien Bailly 4,5, Jean-Louis Pépin 4,5
PMCID: PMC8759305  PMID: 34297641

To the Editor:

We all sleep, and sleep patterns and architecture influence our health and wellbeing. At present, the gold standard method for recording detailed sleep patterns to detect and monitor sleep disorders is in-laboratory overnight polysomnography (PSG), which requires specialized equipment and trained staff. This is no longer feasible given the size of the population with suspected sleep disorders, especially in the coronavirus disease (COVID-19) era (1).

Mandibular movements reveal the changes in trigeminal motor nucleus activity driven by brainstem centers involved in sleep and wake transitions (2, 3). The activity of upper airway muscles anchored on the mandible is the net result of the activation of brainstem respiratory and sleep centers and their respective interactions. This produces specific mandibular movement patterns reflecting the interactions between sleep stages and respiratory control. We previously demonstrated that sleep mandibular movements represent a powerful tool for characterizing respiratory disturbances in obstructive sleep apnea (OSA) (4–6).

Figure 1 shows examples of the typical mandibular movement signal patterns associated with each sleep stage.

Figure 1.

The mandibular movements (MM) signal processed by machine learning to provide sleep staging. Typical example of two of the six channels (upper and lower trace) of the MM signal recorded by a single sensor during the four sleep stages in a single individual. Each trace represents a 210-second (3.5-min) time span of MM recordings by the Sunrise system (inertial measurement with six channels) during wakefulness (top), REM sleep, light sleep, and deep sleep (bottom). Thirty-second epochs were used for sleep stage classification. Sleep is detected when MM occur at the breathing frequency. During light sleep (N2), the amplitude of MM reaches several tenths of a millimeter and varies slightly. The movements during quiet respiration and light sleep are repeated at a frequency ranging between 0.15 and 0.60 Hz depending on central drive output. Deepening of sleep (N3) increases the upper airway’s resistance, and this is reflected by an increase in the amplitude of movement, which is also more stable than during N2. REM sleep is easily identified by irregular frequencies and changing amplitudes in MM that are on average smaller than non-REM sleep amplitudes. Cartoon images adapted from Freepik.com.

Recordings of mandibular movements throughout the night provide hundreds of temporal–spatial signals for modeling and identifying the different sleep stages. Our objective was to develop, train, and then validate an artificial intelligence algorithm to stage sleep using a single sensor detecting mandibular movements.

This prospective study included 1,026 adults with suspected OSA referred for overnight in-laboratory PSG and simultaneous recordings of mandibular movements using the Sunrise system (IRB 00004890; number B707201523388).

The PSG data (Somnoscreen Plus, Somnomedics) were manually scored by two experienced sleep technicians (interobserver agreement, 92.1% [95% confidence interval (CI), 89–94%]; P < 0.001) in accordance with criteria of the American Academy of Sleep Medicine (7).

The Sunrise system is composed of a coin-sized sensor attached by the sleep technician to the chin of the patient (Figure 1). The embedded inertial measurement device senses mandibular movements and is externally controlled by a smartphone application via Bluetooth, automatically transferring nightly data to a cloud-based infrastructure (2).

Using the Extreme Gradient Boosting (XGBoost) classifier as the core algorithm, we developed and progressively trained a machine learning sleep staging algorithm (8) using the overnight PSG and mandibular movement recordings from 800 of the patients. The algorithm automatically classified each 30-second epoch of mandibular movement patterns as wake, light non-REM (NREM; N1 + N2), deep NREM (N3), or REM sleep (Figure 1). N1 and N2 were combined in the automated scoring as the best compromise between clinical relevance and model performance. The extracted features combined the raw signals along the three axes of the accelerometer and gyroscope, processing modes (filters with different frequency bands, moving average), and statistical functions. The statistics applied to these features were measures of central tendency (mean, median), extreme values (minimum, maximum), quartiles, and SD, as well as the standardized versions of all of the above features. The programming language was Python.
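
As an illustration only, the sketch below outlines the kind of pipeline described above: per-epoch statistical feature extraction followed by multiclass XGBoost training. It is not the authors' implementation; the synthetic data, feature set, and hyperparameters are hypothetical placeholders.

```python
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, cohen_kappa_score

def summarize_epoch(epoch: np.ndarray) -> np.ndarray:
    """Per-channel statistics (mean, median, min, max, quartiles, SD)
    for one 30-second epoch of shape (samples, 6 channels)."""
    return np.concatenate([
        epoch.mean(axis=0), np.median(epoch, axis=0),
        epoch.min(axis=0), epoch.max(axis=0),
        np.percentile(epoch, 25, axis=0), np.percentile(epoch, 75, axis=0),
        epoch.std(axis=0),
    ])

# Synthetic stand-in for real recordings: 2,000 thirty-second epochs across
# the six inertial channels, with labels 0 = wake, 1 = light NREM,
# 2 = deep NREM, 3 = REM (placeholder data, not the study data set).
rng = np.random.default_rng(0)
epochs = rng.normal(size=(2000, 300, 6))
y = rng.integers(0, 4, size=2000)

X = np.stack([summarize_epoch(e) for e in epochs])
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.2, stratify=y)

clf = xgb.XGBClassifier(objective="multi:softprob",
                        n_estimators=400, max_depth=6, learning_rate=0.05)
clf.fit(X_tr, y_tr)
pred = clf.predict(X_va)
print("accuracy:", accuracy_score(y_va, pred),
      "kappa:", cohen_kappa_score(y_va, pred))
```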

Patients in the machine learning training set (n = 800 [451 males]) had a median (interquartile range [IQR]) age of 48.4 (16.7) years, body mass index (BMI) of 29.1 (10.2) kg/m², and neck circumference of 40.0 (5.0) cm. PSG showed median (IQR) apnea–hypopnea, respiratory disturbance, and microarousal indexes of 17.1 (27.5), 23.9 (28.5), and 24.2 (20.2) events/hour, respectively, with a median (IQR) total sleep time of 372 (122.7) minutes, sleep efficiency of 85.1% (13.7), and wake time of 12.2% (16.5).

Patients in a separate validation set (n = 226 [116 males]) had similar characteristics: median (IQR) age 46.5 (17.5) years, BMI 32.3 (11.5) kg/m², and neck circumference 40.0 (5.0) cm; similar median (IQR) apnea–hypopnea, respiratory disturbance, and microarousal indexes (20.3 [23.5], 27.0 [23.6], and 25.0 [20.3] events/hour, respectively); and similar sleep parameters: median (IQR) total sleep time 397 (95.7) minutes, sleep efficiency 87.1% (11.8), and wake time 11.5% (12.2).

In the validation set, quantitative agreement between the machine learning and human scorings was estimated using a linear mixed model with a two-way intraclass correlation coefficient (ICC[A,1]; 95% CI) for total sleep time, wake time, light NREM, deep NREM, and REM sleep: 0.94 (0.93–0.96), 0.90 (0.88–0.92), 0.70 (0.63–0.76), 0.66 (0.58–0.73), and 0.65 (0.56–0.72), respectively. The mean (95% CI) measurement biases for total sleep time and the four sleep stages (in the same order) were −13.0 minutes (−52.9 to +19.0), +3.8% (−6.8 to +16.8), −14.9% (−31.1 to +1.8), +6.0% (−6.0 to +21.2), and +8.4% (−21.3 to +2.4).
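
For readers who wish to reproduce this type of agreement analysis, the following is a minimal sketch assuming the pandas and pingouin packages; the per-patient values are illustrative placeholders, not study data.

```python
import numpy as np
import pandas as pd
import pingouin as pg

# Placeholder per-patient total sleep time (minutes) from manual scoring and
# from the algorithm; these values are illustrative only.
tst_manual = np.array([372.0, 401.5, 355.0, 418.0, 389.5, 340.0, 410.5, 366.0])
tst_algo   = np.array([362.0, 395.0, 349.5, 410.0, 380.0, 335.5, 402.0, 358.5])

long = pd.DataFrame({
    "patient": np.tile(np.arange(len(tst_manual)), 2),
    "rater":   ["manual"] * len(tst_manual) + ["algorithm"] * len(tst_algo),
    "tst":     np.concatenate([tst_manual, tst_algo]),
})

# Two-way, absolute-agreement, single-rater ICC (pingouin's "ICC2", i.e. ICC[A,1])
icc = pg.intraclass_corr(data=long, targets="patient", raters="rater", ratings="tst")
print(icc.loc[icc["Type"] == "ICC2", ["ICC", "CI95%"]])

# Bland-Altman-style mean bias (algorithm minus manual)
print("mean bias:", np.mean(tst_algo - tst_manual), "min")
```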

The algorithm classified sleep epochs with substantial qualitative agreement with manual PSG scorers, which improved as the size of the learning set was progressively increased (κ = 0.71 and accuracy = 78.3% using the full machine learning data set of 800 patients). As shown in Figure 2, a stagewise receiver operating characteristic (ROC) curve analysis confirmed well-balanced performance for each target sleep stage.

Figure 2.

Stagewise receiver operating characteristic (ROC) curve analysis. This consisted of extracting prediction scores for each target stage (wake, light sleep, deep sleep, and REM sleep) and for each patient, then estimating the false- and true-positive rates of a binary one-versus-rest classification rule to establish the ROC curve. The 95% CIs of the area under the curve (AUC) and smoothing effect were obtained from empirical data (without using any resampling). The diagonal dashed line serves as a reference and shows the performance if sleep staging had been performed randomly. The algorithm performed well in detecting REM sleep with an ROC–AUC of 0.96 (0.90–0.99) and non-REM deep sleep with an ROC–AUC of 0.97 (0.91–0.99). Only light non-REM sleep was slightly less well detected, with an ROC–AUC of 0.86 (0.77–0.94). CI = confidence interval; DS = deep sleep; LS = light sleep; R = REM sleep; W = wake.
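
Such a one-versus-rest ROC analysis can be sketched as follows; this is not the published analysis code and reuses the hypothetical classifier and validation arrays from the training sketch above.

```python
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.preprocessing import label_binarize

stage_names = ["wake", "light NREM", "deep NREM", "REM"]
proba = clf.predict_proba(X_va)                     # (n_epochs, 4) class scores
y_bin = label_binarize(y_va, classes=[0, 1, 2, 3])  # one-vs-rest targets

for k, name in enumerate(stage_names):
    fpr, tpr, _ = roc_curve(y_bin[:, k], proba[:, k])  # points of the ROC curve
    auc = roc_auc_score(y_bin[:, k], proba[:, k])
    print(f"{name}: ROC-AUC = {auc:.2f}")
```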

Wakefulness was clearly discriminated from sleep states with a sensitivity of 88% (95% CI, 71–99%) and a specificity of 94% (85–98%). Moreover, the algorithm performed well in detecting REM sleep (sensitivity 83% [64–97%], specificity 89% [76–97%]) and deep sleep (sensitivity 84% [59–100%], specificity 90% [79–98%]). Light NREM sleep was slightly less well detected (sensitivity 60% [36–82%], specificity 88% [79–96%]).
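
Per-stage sensitivity and specificity of this kind can be derived from one-versus-rest confusion matrices, as in the following sketch (again reusing the hypothetical variables from the training sketch above).

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def stage_sens_spec(y_true, y_pred, stage):
    """Sensitivity and specificity treating `stage` as positive and all
    other stages as negative (one-versus-rest)."""
    tn, fp, fn, tp = confusion_matrix(
        np.asarray(y_true) == stage, np.asarray(y_pred) == stage,
        labels=[False, True],
    ).ravel()
    return tp / (tp + fn), tn / (tn + fp)

for k, name in enumerate(["wake", "light NREM", "deep NREM", "REM"]):
    sens, spec = stage_sens_spec(y_va, pred, k)
    print(f"{name}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```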

These findings indicate that machine learning analysis of mandibular movements identifies sleep stages with good agreement with manual scoring of PSG data.

A strength of this work is that it was conducted in a real-life cohort, randomly split into training and validation sets, that included both subjects in whom PSG detected no OSA and patients with a broad spectrum of OSA severity.

A clear advantage of our approach is that it relies on a high-performing sleep staging algorithm processing the signal from a single mandibular movement sensor, which simplifies the complex process of signal treatment and improves the reproducibility of sleep staging.

Our study was designed to avoid limitations common to other studies. First, PSG sleep staging was performed by two experienced technicians. Second, data from an independent set of patients were used to validate the algorithm. The input data were balanced using a resampling technique (SMOTE, Synthetic Minority Oversampling Technique) to minimize the effect of class imbalance. A conventional algorithmic framework involving manual feature extraction and a structured data-driven algorithm was adopted for better control and understanding of the input data. Furthermore, the XGBoost algorithm offers several advantages over classical methods, including computational and resource efficiency, allowing fast training and execution.
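
As an illustration of the balancing step, a minimal sketch using SMOTE from the imbalanced-learn package is shown below; the variable names follow the earlier hypothetical training sketch, and this is not the authors' code.

```python
from imblearn.over_sampling import SMOTE

# Oversample the minority sleep stages in the training epochs before fitting,
# so that each class contributes comparably to the model.
X_tr_bal, y_tr_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)
clf.fit(X_tr_bal, y_tr_bal)
```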

In conclusion, the mandibular movement signal acquired from a compact inertial measurement device is suitable for automated sleep staging in adults presenting a broad spectrum of OSA severity. The proposed algorithm performs well enough for clinical applications and could represent a major step toward unobtrusive, reliable, and cost-effective home-based sleep assessment and value-based care (9).

Acknowledgments

The authors thank Ravzat Ashurlaeva (Respisom, Erpent, Belgium), who kindly spent innumerable hours providing secretarial assistance, and Alison Foote, Ph.D. (Grenoble Alpes University Hospital, France), for critical reading and substantial editing of the letter.

Footnotes

Supported by the French National Research Agency in the framework of the “Investissements d’avenir” program (ANR-15-IDEX-02) and the “e-health and integrated care and trajectories medicine and MIAI artificial intelligence” chairs of excellence from the Grenoble Alpes University Foundation (J.-L.P., R.T., and S.B.). This work has also been partially supported by Multidisciplinary Institute in Artificial Intelligence @ Grenoble Alpes (ANR-19-P3IA-0003). The devices used in the study were provided by Sunrise, Namur, Belgium.

Author Contributions: N.-N.L.-D. conceived and designed the project, analyzed the data, drafted the initial manuscript, and reviewed and revised the manuscript; J.-B.M. conceived and designed the study, performed the research, analyzed the data, drafted the initial manuscript, and reviewed and revised the manuscript; N.C. and V.C. performed the research and participated in data acquisition; R.T. reviewed and revised the manuscript; S.B. analyzed the data and reviewed and revised the manuscript; J.-L.P. conceived and designed the study, analyzed the data, and reviewed and revised the manuscript. All authors helped revise the manuscript and approved it for submission.

Data sharing statement: The deidentified data used in this study are not publicly available at present. Parties interested in data access should contact N.-N.L.-D. (nam@hellosunrise.com) for queries related to the Extreme Gradient Boosting (XGB) classifier and J.-B.M. (martinot.j@respisom.be) for queries related to the sleep laboratory data set. The data sets generated and/or analyzed during the current study are available from the corresponding author on reasonable request. Applications will need to undergo ethical and legal approvals by the respective institutions. Those interested in research collaborations should contact J.-B.M. (martinot.j@respisom.be).

Originally Published in Press as DOI: 10.1164/rccm.202103-0680LE on July 23, 2021

Author disclosures are available with the text of this letter at www.atsjournals.org.

References

1. Benjafield AV, Ayas NT, Eastwood PR, Heinzer R, Ip MSM, Morrell MJ, et al. Estimation of the global prevalence and burden of obstructive sleep apnoea: a literature-based analysis. Lancet Respir Med. 2019;7:687–698. doi: 10.1016/S2213-2600(19)30198-5.
2. Kubin L. Neural control of the upper airway: respiratory and state-dependent mechanisms. Compr Physiol. 2016;6:1801–1850. doi: 10.1002/cphy.c160002.
3. Moore JD, Kleinfeld D, Wang F. How the brainstem controls orofacial behaviors comprised of rhythmic actions. Trends Neurosci. 2014;37:370–380. doi: 10.1016/j.tins.2014.05.001.
4. Pépin JL, Letesson C, Le-Dong NN, Dedave A, Denison S, Cuthbert V, et al. Assessment of mandibular movement monitoring with machine learning analysis for the diagnosis of obstructive sleep apnea. JAMA Netw Open. 2020;3:e1919657. doi: 10.1001/jamanetworkopen.2019.19657.
5. Martinot JB, Borel JC, Cuthbert V, Guénard HJP, Denison S, Silkoff PE, et al. Mandibular position and movements: suitability for diagnosis of sleep apnoea. Respirology. 2017;22:567–574. doi: 10.1111/resp.12929.
6. Martinot JB, Le-Dong NN, Cuthbert V, Denison S, Silkoff PE, Guénard H, et al. Mandibular movements as accurate reporters of respiratory effort during sleep: validation against diaphragmatic electromyography. Front Neurol. 2017;8:353. doi: 10.3389/fneur.2017.00353.
7. Berry RB, Brooks R, Gamaldo C, Harding SM, Lloyd RM, Quan SF, et al. AASM Scoring Manual updates for 2017 (version 2.4). J Clin Sleep Med. 2017;13:665–666. doi: 10.5664/jcsm.6576.
8. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining; 2016:785–794.
9. Pépin JL, Baillieul S, Tamisier R. Reshaping sleep apnea care: time for value-based strategies. Ann Am Thorac Soc. 2019;16:1501–1503. doi: 10.1513/AnnalsATS.201909-670ED.
