Journal of Clinical Sleep Medicine : JCSM : Official Publication of the American Academy of Sleep Medicine
. 2021 Sep 1;17(9):1777–1784. doi: 10.5664/jcsm.9292

Real-time prediction of upcoming respiratory events via machine learning using snoring sound signal

Bochun Wang 1,2,3,*, Xuanyu Yi 4,*, Jiandong Gao 4,5, Yanru Li 1,2,3, Wen Xu 1,2,3, Ji Wu 4,5,, Demin Han 1,2,3,
PMCID: PMC8636355  PMID: 33843580

Abstract

Study Objectives:

The aim of the study was to inspect the acoustic properties and sleep characteristics of a preapneic snoring sound. The feasibility of forecasting upcoming respiratory events by snoring sound was also investigated.

Methods:

Participants with habitual snoring or a heavy breathing sound during sleep were recruited consecutively. Polysomnography was conducted, and snoring-related breathing sound was recorded simultaneously. Acoustic features and sleep features were extracted from 30-second samples, and a machine learning algorithm was used to establish 2 prediction models.

Results:

A total of 74 eligible participants were included. Model 1, tested by 5-fold cross-validation, achieved an accuracy of 0.92 and an area under the curve of 0.94 for respiratory event prediction. Model 2, with acoustic features and sleep information tested by Leave-One-Out cross-validation, had an accuracy of 0.78 and an area under the curve of 0.80. Sleep position was found to be the most important among all sleep features contributing to the performance of the 2 models.

Conclusions:

Preapneic sound presented unique acoustic characteristics, and snoring-related breathing sound could be deployed as a real-time apneic event predictor. The models, combined with sleep information, serve as a promising tool for an early warning system to forecast apneic events.

Citation:

Wang B, Yi X, Gao J, et al. Real-time prediction of upcoming respiratory events via machine learning using snoring sound signal. J Clin Sleep Med. 2021;17(9):1777–1784.

Keywords: obstructive sleep apnea, snoring-related breathing sound, real-time prediction, acoustic features, early warning system


BRIEF SUMMARY

Current Knowledge/Study Rationale: A snoring sound signal contains essential information about the mechanisms of obstructive sleep apnea and sites of upper airway obstruction, serving as a promising tool for apneic detection. However, few studies have achieved the goal of event-by-event prediction for apnea.

Study Impact: The present study shows that snoring-related breathing sound before an apneic event is unique and can provide sufficient information to forecast sleep apnea in a real-time manner.

INTRODUCTION

Snoring is the commonest and earliest manifestation of obstructive sleep apnea (OSA), occurring in 70%–95% of patients,1 whereas in the general population habitual snoring affects 15.58%–44% of middle-aged men and 8.4%–28% of middle-aged women.2–4 Many recent studies, together with the availability of alternative devices, have raised the possibility of shifting from gold-standard full-night polysomnography (PSG) studies to ambulatory home studies, aiming to increase the number of people who can be analyzed and to simplify the study process. Being low-cost and easy to record, snoring sound signals contain essential information about the mechanisms and sites of upper airway obstruction,1 serving as a promising tool for apneic event detection.

Earlier studies investigated several features of snoring/breathing sounds and utilized statistical models to predict the presence of OSA. Narayan et al5 deployed smartphone-recorded ambient sound to separate normal breath sounds, apnea, and loud obstructive sounds by spectral analysis. Ben-Israel et al6 differentiated patients with and without OSA at apnea-hypopnea index (AHI) thresholds of 10 and 20 events/h based on 5 intra- and intersnore acoustic features extracted from snoring signals. Kim et al7 extracted an acoustic biomarker consisting of 500 audio features to establish binary classifiers for OSA prediction. However, these methods extracted a set of human-engineered acoustic features, and performance was validated by comparing the overall estimated AHI with the AHI from PSG, neglecting the detection of each individual respiratory event. Few studies have achieved the goal of event-by-event prediction for apnea and hypopnea by fully exploiting the characteristics of snoring sounds before respiratory events and exploring the acoustic dynamics within an individual.

The major goal of this article is to inspect the acoustic properties and sleep characteristics of preapneic snoring sounds. In addition, we investigated the feasibility of forecasting upcoming apneic events based on snoring sounds that had already occurred, thus establishing a real-time prediction system for sleep apnea event detection.

METHODS

A study flow chart of our method is presented in Figure 1. Once the PSG and audio data for each patient were collected, the corresponding audio data were annotated according to the PSG adjudicated respiratory events. Next, high-risk snoring-related breathing sound (SRBS) and low-risk SRBS categories were created, and acoustic features together with other auxiliary features were extracted as the input of the machine learning models.

Figure 1. Study flow chart of the proposed technique.

Figure 1

Once the PSG and audio data were collected, the audio data were annotated according to the PSG-adjudicated respiratory events. High-risk SRBS and low-risk SRBS categories were created, and acoustic features together with other auxiliary features were extracted as the input of 2 machine learning models. ABO = abdominal respiratory inductive plethysmography, EMG = chin electromyography, LEOG = left electrooculography, NPT = nasal pressure transducer, OT = oronasal thermistor, PSG = polysomnography, REOG = right electrooculography, SRBS = snoring-related breathing sound, THO = thoracic respiratory inductive plethysmography, VAD = voice activity detection.

Participants

We recruited 74 individuals aged ≥ 18 years consecutively and prospectively. Participants were referred for PSG because of a medical history suggestive of habitual snoring or heavy breathing sound during sleep and had at least 1 of the following symptoms: restless sleep, pauses in breathing during sleep, morning headaches, excessive daytime sleepiness, cognitive impairment, or depression with a suspicion of association with sleep apnea. No exclusion criteria were imposed. Attended, in-laboratory, full-night PSG and audio recording were conducted at the sleep center of Beijing Tongren Hospital (Beijing, China). The Epworth Sleepiness Scale was used to evaluate daytime sleepiness.8 This study was approved by the institutional review board of Beijing Tongren Hospital (TRECKY2017–032).

Data collection

PSG (Alice 6; Philips Respironics, Murrysville, Pennsylvania) consisted of electroencephalography, 2-channel electrooculography, bilateral anterior tibial and chin electromyography, electrocardiography, nasal pressure transducer, oronasal thermistor, thoracic and abdominal respiratory inductive plethysmography, and pulse oximetry. American Academy of Sleep Medicine 2016 scoring criteria were used for sleep staging and respiratory analyses (The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications, Version 2.3).9 PSG data were adjudicated by 3 experienced sleep technicians using those criteria. An apnea was defined as a 90% flow reduction for 10 seconds or longer; a hypopnea was required to satisfy either of the following rules: the recommended rule required a 30% flow reduction for 10 seconds associated with a 3% oxygen desaturation or an arousal; the alternative rule required a 30% flow reduction associated with a 4% oxygen desaturation. The AHI was calculated as the number of apneas and hypopneas per hour of sleep. The labels for sleep-related respiratory events (hypopnea, OSA, central apnea, and mixed apnea) from PSG data were checked by the author (B.W.).

The overnight ambient sound, referred to hereafter as the SRBS, was recorded simultaneously with PSG using a noncontact digital voice recorder (PCM-D10; Sony, Kōnan Minato, Japan) at a sampling frequency of 44,100 Hz with 16-bit quantizing precision. The voice recorder was placed 1 m from the head of the participant. The SRBS recording was converted to a mono track and down-sampled to 16,000 Hz.

Sample annotation and creation

Because the PSG data and audio recording of each person were captured simultaneously, we were able to annotate 4 types of respiratory events in the corresponding audio recording along the time axis, referencing the gold standard of concurrent PSG respiratory event labels. We thereby obtained each individual’s SRBS database with accurate labels for their sleep-related respiratory events. Because the aim of the study was to forecast upcoming respiratory events from SRBS in an event-by-event manner, the samples were created and dichotomized into 2 categories: high-risk SRBS (samples followed by respiratory events) and low-risk SRBS (samples followed by regular breaths without any respiratory events). Samples were created according to the following rules (Figure 2). For each respiratory event, we extracted a 1-minute fragment before the event, split the fragment into 30-second epochs with a stride of 20 seconds between them, and selected the first two 30-second epochs as positive samples (high-risk SRBS), excluding those containing respiratory events. A negative sample (low-risk SRBS) was defined as an epoch satisfying either of the following 2 conditions: a 30-second epoch (with a stride of 20 seconds between epochs) around which 10 minutes were free of any respiratory event, termed a continuous negative sample; or one of the middle two 30-second epochs (with a stride of 20 seconds) in a fragment longer than 5 minutes located between 2 respiratory events, termed a discrete negative sample. We extracted low-risk samples in these 2 different ways to better simulate a real clinical setting and to keep the number of positive and negative samples balanced.
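For concreteness, the positive-sample rule can be sketched in a few lines. The function, its name, and the representation of events as (start, end) times in seconds are our illustrative assumptions, not the authors' code:

```python
def positive_epochs(event_start, events, frag=60, win=30, stride=20):
    """Select the first two 30-s epochs of the 1-min fragment preceding
    an event (offsets 0 s and 20 s into the fragment), skipping any
    epoch that overlaps another scored respiratory event."""
    frag_start = event_start - frag
    epochs = []
    for k in range(2):  # first two epochs of the fragment
        s = frag_start + k * stride
        e = s + win
        if e > event_start:
            break
        # exclude epochs containing (overlapping) respiratory events
        if any(s < ev_end and e > ev_start for ev_start, ev_end in events):
            continue
        epochs.append((s, e))
    return epochs
```

For an event scored at t = 100 s, this yields the epochs (40, 70) and (60, 90); an earlier event inside the fragment removes the overlapping epoch.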

Figure 2. Creation of positive and negative samples.

Figure 2

Diagram illustrates sample creation: (A) positive sample and discrete negative sample; (B) continuous negative sample. Yellow box represents apneic event, red box represents positive sample, and green box represents negative sample. au = arbitrary unit.

Preprocessing and feature extraction

Voice activity detection based on the double-threshold algorithm,10 using the zero-crossing rate and short-term energy with an adaptive threshold (empirically 1–1.3 times the average of the 30-second epoch), was used to clip silent segments from each sample. Because a discrete snore usually lasts at least 0.5 seconds, segments shorter than this duration were eliminated. On average, 3–4 nonsilent segments remained in each sample, and any sample left with no nonsilent segment was discarded, because a sample free of any detected sound precluded further evaluation.
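A minimal sketch of this silence-clipping step follows, assuming frame-based short-term energy with the adaptive threshold described (here 1.2 times the epoch mean, within the paper's 1–1.3 range) plus a zero-crossing-rate cap; the 25-ms frame length and the 0.5 ZCR cap are our own illustrative choices:

```python
import numpy as np

def vad_segments(x, sr, frame=0.025, energy_k=1.2, min_dur=0.5):
    """Double-threshold-style VAD sketch: a frame is active when its
    short-term energy exceeds energy_k times the epoch mean and its
    zero-crossing rate stays below a cap; active runs shorter than
    min_dur seconds are dropped (a discrete snore lasts >= 0.5 s)."""
    n = int(frame * sr)
    nframes = len(x) // n
    frames = x[:nframes * n].reshape(nframes, n)
    energy = (frames ** 2).sum(axis=1)
    zcr = (np.abs(np.diff(np.sign(frames), axis=1)) > 0).sum(axis=1) / n
    active = (energy > energy_k * energy.mean()) & (zcr < 0.5)
    # collect runs of active frames and convert to seconds
    segs, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i
        elif not a and start is not None:
            segs.append((start * frame, i * frame))
            start = None
    if start is not None:
        segs.append((start * frame, nframes * frame))
    return [(s, e) for s, e in segs if e - s >= min_dur]
```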

The original ambient audio recording inevitably captured unwanted background noises, such as the sounds of beds, duvets, speech, body movements, and doors, so the recordings were further processed with a multifilter to refine the SRBS. The multifilter comprised 3 parts: a sleep-stage filter selecting the middle 6 hours of stable data within total sleep time, a bandpass filter of 80–8,000 Hz for noise elimination, and a pre-emphasis filter to emphasize high-frequency signal components. In addition, a total of 59 features were extracted from each sample for model training (Table 1). We derived 5 categories of acoustic features to characterize SRBS: formants (F1–F3), log energy, linear predictive coding (LPC) coefficients, linear predictive cepstral coefficients, and mel-frequency cepstral coefficients (MFCCs) with their second derivatives, using singular-value decomposition for feature dimension reduction.11 The acoustic features we applied are commonly used in speech signal processing and were adopted without modification (see Supplemental Methods in the supplemental material). We also augmented the acoustic features with auxiliary features: age, sex, and several anthropometric parameters, such as body mass index (BMI) and neck circumference, which were 1-time measurements readily available at no extra cost. Sleep information, such as sleep stage (N1, N2, N3, and rapid eye movement sleep) and body position (supine and nonsupine), was included in the feature set as well. Acoustic analyses were performed using MATLAB (R2018a; The MathWorks, Inc., Natick, MA).
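The filtering stage might be sketched as follows. Note that the stated 8,000 Hz upper band edge coincides with the Nyquist frequency after down-sampling to 16,000 Hz, so this sketch applies only the 80 Hz high-pass edge; the filter order and the 0.97 pre-emphasis coefficient are conventional values we have assumed, not values stated in the paper:

```python
import numpy as np
from scipy.signal import butter, sosfilt

def preprocess(x, sr=16000, coef=0.97):
    """Sketch of the noise-elimination and pre-emphasis filters:
    a 4th-order Butterworth high-pass at 80 Hz (the band's upper edge
    equals Nyquist at 16 kHz, so it is omitted), followed by the
    standard first-difference pre-emphasis y[n] = x[n] - coef*x[n-1]."""
    sos = butter(4, 80, btype="highpass", fs=sr, output="sos")
    x = sosfilt(sos, x)
    return np.append(x[0], x[1:] - coef * x[:-1])  # pre-emphasis
```

Applied to a mixture of a 50 Hz and a 1,000 Hz tone, the 50 Hz component is strongly attenuated while the 1,000 Hz component survives, which is the intended effect of the noise-elimination stage.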

Table 1.

Feature set derived for sample evaluation.

Feature Category Number of Features
Acoustic features 53
 MFCC 24
 LPCC 12
 LPC 13
 Formant 3
 Log energy 1
Sleep features 2
 Stage 1
 Position 1
Patient features 4
 Age 1
 Sex 1
 BMI 1
 NC 1

BMI = body mass index, LPC = linear predictive coding, LPCC = linear predictive cepstral coefficient, MFCC = mel-frequency cepstral coefficient, NC = neck circumference.

Machine learning protocol and data analysis

The XGBoost model12 was adapted for the classification of high-risk and low-risk SRBS (see Supplemental Methods). We constructed the 2 models using each of the 5 categories of acoustic features and evaluated the models with combined features to achieve the best performance. Feature fusion was performed through feature selection and importance comparison to explore the interaction between different features. A grid search was conducted to find the best parameters for each feature set. To evaluate the generalization performance of our algorithm, we first established a model with all available samples and tested it using 5-fold cross-validation (model 1). However, earlier research noted that samples with similar acoustic properties during a given period tend to cluster within an individual. Hence, we developed model 2, in which each individual was regarded as a separate entity and all the samples of each participant were assessed together, using Leave-One-Out cross-validation. To evaluate the prediction performance, we calculated sensitivity, precision, area under the curve (AUC) for the receiver operating characteristic, and diagnostic accuracy, using the PSG data as the comparator. The Kruskal-Wallis H test was used to compare sleep data among categories. One-way analysis of variance was used to compare demographic findings among categories. The Pearson correlation coefficient was used to characterize the association between accuracy and AHI. Data are presented as mean ± standard deviation or median (interquartile range) where appropriate.
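The two validation protocols can be illustrated with synthetic data. The paper used XGBoost; a scikit-learn GradientBoostingClassifier is substituted here purely for portability, and the feature matrix, group labels, and accuracy values are illustrative, not the study's:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import (cross_val_score, LeaveOneGroupOut,
                                     StratifiedKFold)

# Synthetic stand-in data: rows are 30-s samples; `groups` marks which
# participant each sample came from.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)
groups = np.repeat(np.arange(10), 20)

clf = GradientBoostingClassifier(random_state=0)
# Model 1 protocol: pooled 5-fold CV (samples from one participant can
# appear in both the training and test folds).
acc1 = cross_val_score(clf, X, y, cv=StratifiedKFold(5)).mean()
# Model 2 protocol: leave-one-participant-out CV, the stricter test of
# generalization to an unseen individual.
acc2 = cross_val_score(clf, X, y, cv=LeaveOneGroupOut(), groups=groups).mean()
```

The group-wise split is the essential design choice: it prevents the within-participant similarity of neighboring epochs from leaking between training and test sets.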

RESULTS

Participant characteristics

A total of 74 individuals (57 men and 17 women) with a mean age of 42.5 years were included in the study. Their characteristics, according to OSA severity, are summarized in Table 2. As expected, significant differences were found in BMI, AHI, Epworth Sleepiness Scale score, and neck circumference among the groups (P < .05). All groups had similar ages and sleep efficiency. The database consisted of 13,255 samples from the 74 participants.

Table 2.

Participant characteristics.

Variable AHI < 5 events/h (n = 3) AHI 5 to < 15 events/h (n = 21) AHI 15 to < 30 events/h (n = 13) AHI ≥ 30 events/h (n = 37) P
Age (y) 40.3 ± 13.5 42.5 ± 10.2 44.2 ± 8.9 42.2 ± 11.1 .92
Sex (M:F) 3:0 16:5 9:4 29:8
BMI (kg/m2) 21.8 ± 2.1 24.7 ± 2.5 26.9 ± 2.4 27.1 ± 2.6 < .001
NC (cm) 35.0 ± 1.0 38.3 ± 2.7 38.3 ± 2.6 40.1 ± 3.7 .023
ESS (score) 6.6 ± 2.08 4.3 ± 3.5 6.4 ± 3.2 7.3 ± 4.0 .045
AHI (events/h) 3.0 (1.5–4.6) 10.5 (8.1–13.2) 21.2 (16.1–26.4) 59.8 (43.2–76.4) < .001
Sleep efficiency (%) 93.8 ± 1.5 84.7 ± 12.6 85.3 ± 10.2 87.7 ± 9.5 .44

AHI = apnea-hypopnea index, BMI = body mass index, ESS = Epworth Sleepiness Scale, F = female, M = male, NC = neck circumference.

Comparison of different acoustic features based on model 1

Model 1 evaluated the prediction performance of each acoustic feature on all extracted SRBS samples and was tested using 5-fold cross-validation (Table 3). LPC was the single feature with the best prediction performance, achieving an accuracy of 0.92 and an AUC of 0.94. When we integrated sleep information, such as stage and position, into the combined MFCC and LPC acoustic features, the accuracy and AUC increased to 0.93 and 0.96, respectively. Nonetheless, we noticed that both the training and test sets may have included samples from the same participant. We therefore randomly selected several participants, extracted the MFCC features of each participant’s samples as an entity, and examined the model in depth. The model was inspected visually using t-SNE,13 a widely used algorithm for dimension reduction and visualization, with the samples projected onto a 2-dimensional plane. As illustrated in Figure 3A, when extracted from an individual, the high-risk SRBS (positive) samples separated clearly from the low-risk SRBS (negative) samples and tended to clump together in clusters. This agglomeration effect over the course of the night implies that the predictive power of model 1 may have been overestimated, because neighboring samples within an individual harbored similar acoustic characteristics. Figure 3B shows that respiratory events were not equally distributed throughout the night. Kernel density estimation was conducted based on the histogram of the intervals between respiratory events.14
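The visualization step can be sketched with hypothetical MFCC-like vectors for one participant; the cluster means, dimensionality, and perplexity below are our illustrative choices, not values from the study:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(1)
# Hypothetical per-epoch 24-dimensional MFCC vectors: high-risk samples
# shifted relative to low-risk ones, mimicking the separation in Fig. 3A.
low = rng.normal(0, 1, size=(60, 24))
high = rng.normal(2, 1, size=(60, 24))
X = np.vstack([low, high])
# Project onto a 2-dimensional plane for visual inspection.
emb = TSNE(n_components=2, perplexity=20, random_state=0).fit_transform(X)
```

When the two groups truly differ acoustically, their embedded centroids end up well apart, which is the clustering pattern the t-SNE inspection was looking for.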

Table 3.

Performance of model 1 with each acoustic feature.

Feature Accuracy AUC Sensitivity Precision
MFCC 0.81 0.84 0.78 0.70
LPC 0.92 0.94 0.91 0.90
LPCC 0.77 0.72 0.73 0.75
Formant 0.69 0.63 0.70 0.60
Log energy 0.58 0.55 0.49 0.52

AUC = area under the curve, LPC = linear predictive coding, LPCC = linear predictive cepstral coefficient, MFCC = mel-frequency cepstral coefficient.

Figure 3. Agglomeration effect.

Figure 3

(A) With the help of the t-SNE algorithm, the scatterplot shows that a representative acoustic feature, the MFCC of positive and negative samples within an individual across the night, clumps in clusters. (B) The histogram shows that the intervals between respiratory events extracted from all participants are concentrated within 2 minutes. The curve illustrates kernel density estimation. MFCC = mel-frequency cepstral coefficient.

Comparison of different acoustic features based on model 2

In light of the results shown above, model 2 was established and tested using Leave-One-Out cross-validation: the model was trained on samples from all participants except the one being tested. As Table 4 shows, LPC remained the best single classification feature, with an accuracy of 0.72 and an AUC of 0.73, although the overall performance of all features decreased significantly. When we integrated sleep stage and position into the combined acoustic features (MFCC and LPC), the accuracy and AUC improved to 0.78 and 0.80, respectively. For most participants, model 2 with the optimal feature set had good predictive power (Figure 5).

Table 4.

Performance of model 2 with each acoustic feature.

Feature Accuracy AUC Sensitivity Precision
MFCC 0.67 0.63 0.69 0.72
LPC 0.72 0.73 0.70 0.68
LPCC 0.68 0.63 0.55 0.72
Formant 0.60 0.56 0.61 0.49
Log energy 0.52 0.49 0.34 0.46

AUC = area under the curve, LPC = linear predictive coding, LPCC = linear predictive cepstral coefficient, MFCC = mel-frequency cepstral coefficient.

Figure 4. Feature importance ranking of model predictive power.

Figure 4

The bar plot of feature importance ranking sorted by absolute Shapley Additive Explanations (SHAP) value. LPC_7 denotes the seventh order of LPC. LPC = linear predictive coding, MFCC = mel-frequency cepstral coefficient.

Interpretation of feature importance in model 2

For better model understanding, further evaluation of model 2 was conducted with the parameter set that achieved the best overall performance (a combination of MFCC and LPC with sleep information). A feature attribution approach named Shapley Additive Explanations (SHAP) was adopted,15 which helped us estimate how much each feature contributed to performance. The summary plot in Figure 4 sorts features in descending order by the sum of the absolute SHAP values over all samples, showing the distribution of the impact each feature had on the model output. Overall, LPC showed higher predictive power than MFCC. In the importance ranking of all 39 features, sleep position ranked in the top 6 for 69% of the samples and remained in the top 5 for 61% of the samples. The AHI and the accuracy of model 2 were correlated, with a medium strength of correlation (Figure 6). The strength of correlation between accuracy and age, BMI, and neck circumference was very low (Figure 7). We found that stage N2 sleep showed the best predictive power among all sleep stages (stage N3 sleep is not shown because of a scarcity of samples; Table 5).
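The paper's attribution method is SHAP;15 as a portable stand-in, a similar importance ranking can be sketched with permutation importance (a different but related attribution approach) on synthetic features. The feature indices standing in for an LPC coefficient and for sleep position are hypothetical:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))
# Make feature 0 (standing in for an LPC coefficient) and feature 4
# (standing in for sleep position) informative; features 1-3 are noise.
y = ((X[:, 0] + 0.8 * X[:, 4] + 0.3 * rng.normal(size=300)) > 0).astype(int)

clf = GradientBoostingClassifier(random_state=0).fit(X, y)
# Mean drop in accuracy when each column is shuffled: a simple,
# model-agnostic proxy for the SHAP-based ranking in Figure 4.
imp = permutation_importance(clf, X, y, n_repeats=10, random_state=0)
ranking = np.argsort(imp.importances_mean)[::-1]
```

Under this setup the two informative columns dominate the ranking, the same qualitative picture the SHAP summary plot conveys for LPC and sleep position.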

Figure 5. Overall performance of model 2.

Figure 5

The histograms show the performance of model 2 in terms of accuracy (A) and AUC (B). The curves exhibit kernel density estimation. AUC = area under the curve.

Figure 6. Association between accuracy and AHI in model 2.

Figure 6

Scatterplot of accuracy vs AHI in model 2. Pearson correlation coefficient: r = .49, P < .0001. AHI = apnea-hypopnea index.

Figure 7. Correlation map between 5 variables in model 2.

Figure 7

Map shows the strength of the Pearson correlation coefficient between the 5 variables. Blue illustrates r = –1 and red illustrates r = 1. AHI = apnea-hypopnea index, BMI = body mass index, NC = neck circumference.

Table 5.

Performance of model 2 among sleep stages.

Stage Number Accuracy Specificity Sensitivity PPV NPV
W 93 0.64 0.60 0.80 0.73 0.63
REM sleep 2,467 0.73 0.71 0.78 0.83 0.69
N1 sleep 4,032 0.76 0.81 0.70 0.68 0.84
N2 sleep 6,733 0.81 0.83 0.73 0.80 0.81

NPV = negative predictive value, PPV = positive predictive value, REM = rapid eye movement.

DISCUSSION

This is the first study to explore the properties of the snoring sounds that precede respiratory events and to show the feasibility of forecasting upcoming respiratory events based on snoring that has already occurred. Moreover, the application of our model can be extended to OSA severity prediction and respiratory event management. The model incorporating sleep information into combined acoustic features (MFCC and LPC) showed good prediction performance, achieving an AUC of 0.80 and an accuracy of 0.78.

The present study has shown that objective snoring is associated with OSA severity, indicating that snoring is a surrogate for marked elevations in airway collapsibility during sleep.16 Some studies have focused on the postapneic snoring sound, which is closely related to the collision vibration of the soft tissues caused by explosive airflow disturbance after reopening of the collapsed upper airway.17–20 Unlike in common practice, model 1 investigated the acoustic properties of preapneic SRBS and was able to predict respiratory events with a high degree of accuracy using acoustic features alone. The overall performance was further enhanced with sleep information, achieving an accuracy of 0.93 and an AUC of 0.96.

However, from the analysis of sample distribution, we noticed that high- and low-risk SRBS were not equally distributed within individuals throughout the night. This phenomenon can be partly explained by the fact that respiratory events over the course of the night are likely to occur in clusters, because the sleep stage and position of an individual may remain unchanged for a certain period of time, resulting in a relatively stable state of the upper airway. The agglomeration effect suggested that the performance of model 1 may have been overestimated, because the temporal continuity of an individual’s samples was a confounding factor for classification. For this reason, we sought to determine whether the model would still be effective after excluding this confounding factor. To this end, we developed model 2 to simulate a real-world situation in which respiratory events must be predicted for a previously unseen patient. The model proved plausible: the classification results assessed using Leave-One-Out cross-validation exhibited good overall performance even when a single feature was deployed, and the optimum feature subset reached an accuracy and AUC of 0.78 and 0.80, respectively.

Our model showed good generalization ability in that the correlations between the model’s accuracy and BMI, neck circumference, and age were very low. Although the correlation between the AHI and the predictive power of model 2 was of medium strength (Figure 6 and Figure 7), the reason remains to be explored. Lee et al21 showed that the soft palate alone is the most common obstructed structure in mild OSA, and the combination of the soft palate and the tongue base is more frequent in severe OSA. Further research has indicated that different excitation locations of snore sounds exhibit different acoustic features.22 These facts imply that the acoustic characteristics of severe OSA differ from those of mild OSA, as expected, resulting in the disparity in predictive power. Further studies are warranted to stratify samples into groups according to the anatomical mechanisms of their snore sounds.

Several prior studies have investigated the feasibility of OSA detection using snoring sounds.6,23–26 However, these approaches neglected the breathing dynamics and independence of each respiratory event and attempted to utilize the entire set of snoring sounds to extract averaged features/parameters across the night. In contrast, our approach is unique in that we have considered the variability between respiratory events, thereby enabling a potential fully automated real-time early warning system that forecasts respiratory events in an event-by-event manner. By using audio recording devices alone, such as mobile phones, home-based assessments can facilitate a respiratory event detection strategy in the community at large without extra cost or a painstaking process. In addition, such a system is useful and promising in that it can be adopted in OSA management when integrated with several nonmedical and device therapies. For example, to avert upcoming apneic events and further oxygen desaturation, positional therapy and hypoglossal nerve stimulation may be applied after respiratory events are predicted. Supine-predominant OSA is variably estimated to occur in 50%–60% of patients presenting to sleep clinics for overnight PSG,27–29 and supine sleep avoidance is efficacious in some of these patients,30 consistent with the abovementioned feature importance ranking. This high prevalence indicates that our fully automated real-time warning system could be widely used in the community. For patients with supine-predominant OSA, an alert for lateral trunk rotation may be triggered after an apneic event is forecast.

Limitations

A few limitations should be noted when interpreting our results. First, our approach was based on SRBS, which indicates that patients who did not produce any sound through the night could not be analyzed, despite the fact that most patients with OSA do snore or produce heavy breathing sounds during sleep.31 Second, we acknowledge that night-to-night variability in SRBS may introduce some inaccuracies in our model in a single night. Further research is required to consolidate our study by recruiting people with SRBS data that are evaluated for more than 1 night and analyzing differences in subgroups in terms of age, BMI, and snore sound excitation locations.

CONCLUSIONS

Prerespiratory sounds harbor unique acoustic characteristics, and SRBS can be deployed as a real-time respiratory event predictor. Acoustic features alone are useful in respiratory prediction, and the models combined with sleep information serve as a promising tool for an early warning system to forecast respiratory events.

DISCLOSURE STATEMENT

All authors have seen and approved this manuscript. This study was funded by the National Key Research and Development Program of China (2018YFC0116800), the National Natural Science Foundation of China (81970866), the Beijing Municipal Administration of Hospitals’ Youth Programme (QMS20190202), and the Beijing Municipal Natural Science Foundation (L192026). The authors report no conflicts of interest.

ACKNOWLEDGMENTS

The authors thank the study participants, otolaryngologists, and technologists at the Department of Otolaryngology Head and Neck Surgery, Beijing Tongren Hospital.

ABBREVIATIONS

AHI, apnea-hypopnea index
AUC, area under the curve
BMI, body mass index
LPC, linear predictive coding
MFCC, mel-frequency cepstral coefficient
OSA, obstructive sleep apnea
PSG, polysomnography
SRBS, snoring-related breathing sound

REFERENCES

  • 1. Lee LA, Lo YL, Yu JF, et al. Snoring sounds predict obstruction sites and surgical response in patients with obstructive sleep apnea hypopnea syndrome. Sci Rep. 2016;6(1):30629.
  • 2. Park CG, Shin C. Prevalence and association of snoring, anthropometry and hypertension in Korea. Blood Press. 2005;14(4):210–216.
  • 3. Teculescu D, Benamghar L, Hannhart B, Michaely J-P. Habitual loud snoring: a study of prevalence and associations in 850 middle-aged French males. Respiration. 2006;73(1):68–72.
  • 4. Nakano H, Hirayama K, Sadamitsu Y, et al. Monitoring sound to quantify snoring and sleep apnea severity using a smartphone: proof of concept. J Clin Sleep Med. 2014;10(1):73–78.
  • 5. Narayan S, Shivdare P, Niranjan T, Williams K, Freudman J, Sehra R. Noncontact identification of sleep-disturbed breathing from smartphone-recorded sounds validated by polysomnography. Sleep Breath. 2019;23(1):269–279.
  • 6. Ben-Israel N, Tarasiuk A, Zigel Y. Obstructive apnea hypopnea index estimation by analysis of nocturnal snoring signals in adults. Sleep. 2012;35(9):1299–1305C.
  • 7. Kim JW, Kim T, Shin J, et al. Prediction of obstructive sleep apnea based on respiratory sounds recorded between sleep onset and sleep offset. Clin Exp Otorhinolaryngol. 2019;12(1):72–78.
  • 8. Johns MW. A new method for measuring daytime sleepiness: the Epworth Sleepiness Scale. Sleep. 1991;14(6):540–545.
  • 9. Berry RB, Brooks R, Gamaldo CE, et al.; for the American Academy of Sleep Medicine. The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications. Version 2.3. Darien, IL: American Academy of Sleep Medicine; 2016.
  • 10. Kinsler LE, Frey AR, Coppens AB, Sanders JV. Fundamentals of Acoustics. 4th ed. Hoboken, NJ: Wiley; 1999.
  • 11. Deller JR, Proakis JG, Hansen JH. Discrete-Time Processing of Speech Signals. Hoboken, NJ: Wiley/IEEE Press; 1999.
  • 12. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; August 13–17, 2016; San Francisco, CA.
  • 13. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH. The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter. November 2009. Accessed May 20, 2021.
  • 14. Jones MC, Marron JS, Sheather SJ. A brief survey of bandwidth selection for density estimation. J Am Stat Assoc. 1996;91(433):401–407.
  • 15. Lundberg SM, Erion GG, Lee S-I. Consistent individualized feature attribution for tree ensembles. arXiv preprint arXiv:1802.03888. Posted online March 7, 2019.
  • 16. Gleadhill IC, Schwartz AR, Schubert N, Wise RA, Permutt S, Smith PL. Upper airway collapsibility in snorers and in patients with obstructive hypopnea and apnea. Am Rev Respir Dis. 1991;143(6):1300–1303.
  • 17. Kim T, Kim JW, Lee K. Detection of sleep disordered breathing severity using acoustic biomarker and machine learning techniques. Biomed Eng Online. 2018;17(1):16.
  • 18. Ng AK, Koh TS, Baey E, Lee TH, Abeyratne UR, Puvanendran K. Could formant frequencies of snore signals be an alternative means for the diagnosis of obstructive sleep apnea? Sleep Med. 2008;9(8):894–898.
  • 19. Sola-Soler J, Jané R, Fiz JA, Morera J. Spectral envelope analysis in snoring signals from simple snorers and patients with obstructive sleep apnea. In: Proceedings of the 25th Annual International Conference of the IEEE Engineering in Medicine and Biology Society; September 17–21, 2003; Cancun, Mexico.
  • 20. Fiz JA, Abad J, Jané R, et al. Acoustic analysis of snoring sound in patients with simple snoring and obstructive sleep apnoea. Eur Respir J. 1996;9(11):2365–2370.
  • 21. Lee CH, Hong SL, Rhee CS, Kim SW, Kim JW. Analysis of upper airway obstruction by sleep videofluoroscopy in obstructive sleep apnea: a large population-based study. Laryngoscope. 2012;122(1):237–241.
  • 22. Qian K, Janott C, Pandit V, et al. Classification of the excitation location of snore sounds in the upper airway by acoustic multifeature analysis. IEEE Trans Biomed Eng. 2017;64(8):1731–1741.
  • 23. Karunajeewa AS , Abeyratne UR , Hukins C . Multi-feature snore sound analysis in obstructive sleep apnea-hypopnea syndrome . Physiol Meas. 2011. ; 32 ( 1 ): 83 – 97 . [DOI] [PubMed] [Google Scholar]
  • 24. Azarbarzin A , Moussavi Z . Snoring sounds variability as a signature of obstructive sleep apnea . Med Eng Phys. 2013. ; 35 ( 4 ): 479 – 485 . [DOI] [PubMed] [Google Scholar]
  • 25. Alencar AM , Silva DGVD , Oliveira CB , Vieira AP , Moriya HT , Lorenzi-Filho G . Dynamics of snoring sounds and its connection with obstructive sleep apnea . Physica A: Stat Mech App. 2013. ; 392 ( 1 ): 271 – 277 . [Google Scholar]
  • 26. Sowho M , Sgambati F , Guzman M , Schneider H , Schwartz A . Snoring: a source of noise pollution and sleep apnea predictor . Sleep. 2020. ; 43 ( 6 ): zsz305 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Joosten SA , O’Driscoll DM , Berger PJ , Hamilton GS . Supine position related obstructive sleep apnea in adults: pathogenesis and treatment . Sleep Med Rev. 2014. ; 18 ( 1 ): 7 – 17 . [DOI] [PubMed] [Google Scholar]
  • 28. Oksenberg A , Silverberg DS , Arons E , Radwan H . Positional vs nonpositional obstructive sleep apnea patients: anthropomorphic, nocturnal polysomnographic, and multiple sleep latency test data . Chest. 1997. ; 112 ( 3 ): 629 – 639 . [DOI] [PubMed] [Google Scholar]
  • 29. Joosten SA , Hamza K , Sands S , Turton A , Berger P , Hamilton G . Phenotypes of patients with mild to moderate obstructive sleep apnoea as confirmed by cluster analysis . Respirology. 2012. ; 17 ( 1 ): 99 – 107 . [DOI] [PubMed] [Google Scholar]
  • 30. Ravesloot MJL , van Maanen JP , Dun L , de Vries N . The undervalued potential of positional therapy in position-dependent snoring and obstructive sleep apnea-a review of the literature . Sleep Breath. 2013. ; 17 : 39 – 49 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Hoffstein V , Mateika S , Anderson D . Snoring: is it in the ear of the beholder? Sleep. 1994. ; 17 ( 6 ): 522 – 526 . [DOI] [PubMed] [Google Scholar]

Articles from Journal of Clinical Sleep Medicine : JCSM : Official Publication of the American Academy of Sleep Medicine are provided here courtesy of American Academy of Sleep Medicine