Abstract
Study Objectives:
Type 3 home sleep apnea tests may underestimate the apnea-hypopnea index (AHI) due to overestimation of total sleep time (TST). We aimed to evaluate the effect of manual editing of the total recording time (TRT) on the TST and AHI.
Methods:
Thirty 15-channel in-home polysomnography studies (AHI 0 to 30 events/h) scored using American Academy of Sleep Medicine criteria were rescored by two blinded polysomnologists after data from electroencephalogram, electrooculogram, and electromyogram were masked. In method 1, periods of probable wakefulness and artifact were manually edited and removed from analysis. Method 2 identified TST as the TRT without manual editing. Paired t-tests were used to compare the TST and AHI between these methods. Sensitivity and specificity of each method were calculated for gold standard AHI cutoffs of ≥ 5 and ≥ 15 events/h.
Results:
TST (mean [standard deviation, SD]) by polysomnography, method 1, and method 2 was 366.0 (70.1), 447.1 (59.0), and 542 (61.9) min, respectively. The corresponding AHI was 12.5 (8.2), 10.8 (7.0), and 9.1 (6.1) events/h. Compared to polysomnography, both alternative methods overestimated the TST (method 1: mean difference [SD] 81.1 [56.1] min, method 2: 176.0 [89.7] min; both p < 0.001) and underestimated the AHI (method 1: mean difference [SD] −1.6 [3.3] events/h, p = 0.034; method 2: −3.3 [3.9] events/h, p < 0.001). The sensitivity was 100% and 70.0% for method 1, and 91.3% and 40.0% for method 2 for identifying sleep-disordered breathing using AHI cutoffs of ≥ 5 and ≥ 15 events/h, respectively.
Conclusions:
Manual editing of TRT reduces the overestimation of TST and improves the sensitivity for identifying studies with sleep-disordered breathing.
Commentary:
A commentary on this article appears in this issue on page 9.
Citation:
Zhao YY, Weng J, Mobley DR, Wang R, Kwon Y, Zee PC, Lutsey PL, Redline S. Effect of manual editing of total recording time: implications for home sleep apnea testing. J Clin Sleep Med. 2017;13(1):121–126.
Keywords: apnea-hypopnea index, home sleep apnea tests, monitoring time, sleep-disordered breathing, total recording time, total sleep time
INTRODUCTION
Sleep-disordered breathing (SDB) is estimated to affect 26% of the general population and is associated with increased cardiovascular morbidity and mortality.1,2 Despite this, SDB remains significantly underdiagnosed and undertreated.3,4 The gold standard diagnostic test for SDB is overnight in-laboratory polysomnography (PSG). However, the cost, inconvenience, limited access, and long wait times associated with in-laboratory PSG often hinder its widespread use. With an increased focus on cost reduction, there has been growing adoption of unattended home sleep apnea testing (HSAT) to diagnose SDB. In a 2013 survey of sleep centers, 50% offered HSAT for Medicare patients and 64% offered HSAT for privately insured patients.5 When used in the appropriate setting, HSAT may be a cost-effective alternative to PSG and may result in reduced wait times and improved recognition and treatment of SDB. Several studies have evaluated the clinical outcomes of patients managed with HSAT compared to in-laboratory PSG and found no differences in clinical outcomes between the two management pathways.6–8 In the HomePAP study, acceptance of continuous positive airway pressure (CPAP) therapy, titration pressures, effective titrations, time to treatment, and subjective sleepiness were similar between patients with a diagnosis of SDB who were managed using HSAT and autotitrating CPAP and those managed using a traditional pathway of in-laboratory PSG and CPAP titration.8 However, although the home-based management pathway incurred fewer costs than the laboratory-based pathway for payers, costs were comparable if not higher for providers.9
BRIEF SUMMARY
Current Knowledge/Study Rationale: Home sleep apnea tests and automated scoring platforms are frequently used for the diagnosis of sleep-disordered breathing. How the apnea-hypopnea index on automatically scored sleep studies compares to that on manually edited studies or gold standard polysomnography is unknown.
Study Impact: Our study shows that compared to polysomnography, sleep studies scored without manual editing of total recording time overestimate the total sleep time and consequently underestimate the apnea-hypopnea index. Manual editing of total recording time can improve the accuracy of the apnea-hypopnea index and the classification of sleep-disordered breathing.
HSAT is approved by the Centers for Medicare & Medicaid Services for the diagnosis of SDB.10 The American Academy of Sleep Medicine (AASM) recently provided scoring rules for HSAT in The AASM Manual for the Scoring of Sleep and Associated Events.11 The AASM defines monitoring time as the total recording time (TRT) minus periods of artifact and time the patient was awake as determined by actigraphy, body position, respiratory pattern, or patient diary.11 The respiratory event index (REI), defined as the total number of respiratory events scored times 60 divided by the monitoring time in minutes, may be used as a surrogate for the apnea-hypopnea index (AHI), which can only be reported if actual sleep is recorded.11 The AASM Manual for the Scoring of Sleep and Associated Events recommends the reporting of either the REI or the AHI.11 The AASM also recommends that “portable monitoring devices must allow for the display of raw data for manual scoring or editing of automated scoring by a trained and qualified sleep technician/technologist” and that “evaluation of portable monitor data must include review of the raw data by a board certified sleep specialist or a board eligible individual.”12 Despite these recommendations, many laboratories and commercial vendors providing HSAT services use the TRT, which is automatically computed, instead of the guideline-defined monitoring time when calculating the REI. Arguments for using the TRT include the additional time (and cost) required to manually edit the recording period and the concern that such editing might introduce subjectivity into the analysis without appreciably improving AHI estimation. Other potential reasons for using TRT instead of monitoring time include a lack of experience in manual editing and the use of HSAT devices that do not allow for manual editing of automated scoring.
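The REI definition above can be illustrated with a minimal sketch. The function name and the example recording (60 events, 480 min of TRT, 120 min of wake or artifact) are hypothetical and chosen only to show how the choice of denominator changes the index.

```python
def respiratory_event_index(n_events: int, denominator_min: float) -> float:
    """Events per hour: number of events * 60 / denominator in minutes."""
    if denominator_min <= 0:
        raise ValueError("denominator must be a positive number of minutes")
    return n_events * 60.0 / denominator_min


# Hypothetical recording (not study data).
n_events = 60
trt_min = 480.0                 # total recording time
wake_and_artifact_min = 120.0   # time removed by manual editing
monitoring_min = trt_min - wake_and_artifact_min

rei_using_trt = respiratory_event_index(n_events, trt_min)
rei_using_monitoring_time = respiratory_event_index(n_events, monitoring_min)

print(f"REI with TRT as denominator:             {rei_using_trt:.1f} events/h")
print(f"REI with monitoring time as denominator: {rei_using_monitoring_time:.1f} events/h")
```

In this made-up case the same 60 events yield 7.5 events/h over the TRT but 10.0 events/h over the monitoring time.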
The current study was designed to better understand how the TRT compares to the monitoring time and to the gold standard total sleep time (TST: scored using electroencephalography [EEG], electrooculograms [EOG], and submental chin electromyogram [EMG]) from a PSG, and to quantify the effect of differences in scoring on REI estimation. We hypothesized that manual editing of the TRT would reduce misclassification of TST and result in an REI that more closely approximated AHI values generated by full PSG.
METHODS
Thirty 15-channel in-home PSG studies with AHI ranging from 0 to 30 events/h (10 studies with AHI less than 10, 10 studies with AHI between 10 and 15, and 10 studies with AHI between 15 and 30 events/h) were randomly selected from the Multi-Ethnic Study of Atherosclerosis (MESA) Sleep Study sample, an ancillary study of MESA designed to investigate the effect of sleep disorders on prevalent and incident subclinical cardiovascular disease and its progression.13,14 We excluded studies with an AHI > 30 events/h given our interest in classification in the mild to moderate-severe range. The PSG studies were conducted using a 15-channel monitor (Compumedics Somte System; Compumedics Ltd., Abbotsville, Australia). The recording montage included central, occipital, and frontal EEGs, bilateral EOGs, a chin EMG, bipolar electrocardiography, thoracic and abdominal respiratory inductance plethysmography, airflow measured by thermistor and nasal pressure cannula, finger pulse oximetry, and bilateral limb movements (piezoelectric sensors). The studies were scored by trained research polysomnologists using AASM criteria (gold standard).15 Apneas were defined as a ≥ 90% reduction in respiratory amplitude for ≥ 10 sec. Hypopneas were defined as a ≥ 30% reduction in respiratory amplitude for ≥ 10 sec associated with a ≥ 3% oxygen desaturation. The AHI was defined as the number of apneas and hypopneas per hour of TST. Only studies with at least one respiratory channel (airflow or either band), oximetry, and one EEG channel with ≥ 5 h of scorable data and for ≥ 50% of the sleep time were included in the current analysis.
For the current analysis, the PSG studies were rescored in random order by two blinded polysomnologists after data from EEG, EOG, and EMG were masked. In method 1, monitoring time, determined by removing periods of probable wakefulness and artifact from the TRT, as recommended by Berry et al.,11 served as a proxy for TST and was used to calculate the REI. Specifically, sleep onset was determined based on a qualitative assessment that included evidence of reduction in artifact across channels, reduction in heart rate, and assumption of rhythmic breathing, whereas sleep offset was marked by the appearance of sustained movement artifact and/or increased heart rate. Wake periods were scored if there were consecutive epochs with consistent activity (no minimum duration) or when the oximetry signal or all respiratory signals were unreliable for a duration of at least 20 min. Periods with a loss of reliable signal from all respiratory channels and from the oximetry channel were scored as wake (no minimum duration). In method 2, the TST was defined as the TRT without manual editing, with sleep onset taken as lights off and sleep offset as lights on, and was used to calculate the REI.
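The difference between the two denominators can be sketched as follows. This is a simplified illustration, not the scoring software used in the study: the 30-sec epoch length, the `Epoch` structure, and the example night are assumptions, and the qualitative judgments the scorers made (heart rate, breathing regularity, movement artifact) are reduced here to two pre-assigned flags per epoch.

```python
from dataclasses import dataclass

EPOCH_SEC = 30  # assumed epoch length


@dataclass(frozen=True)
class Epoch:
    probable_wake: bool = False  # flagged by the scorer as probable wakefulness
    artifact: bool = False       # flagged by the scorer as unreliable signal


def total_recording_time_min(epochs):
    """Method 2 denominator: every epoch between lights off and lights on."""
    return len(epochs) * EPOCH_SEC / 60.0


def monitoring_time_min(epochs):
    """Method 1 denominator: TRT minus probable wake and artifact epochs."""
    kept = sum(1 for e in epochs if not (e.probable_wake or e.artifact))
    return kept * EPOCH_SEC / 60.0


# Hypothetical 8-h recording: wake at the start and end, artifact mid-night.
epochs = (
    [Epoch(probable_wake=True)] * 100
    + [Epoch()] * 400
    + [Epoch(artifact=True)] * 60
    + [Epoch()] * 320
    + [Epoch(probable_wake=True)] * 80
)

trt = total_recording_time_min(epochs)
monitoring = monitoring_time_min(epochs)
print(f"TRT: {trt:.0f} min, monitoring time: {monitoring:.0f} min")
```

Here 240 of the 960 epochs are flagged, so the 480-min TRT shrinks to a 360-min monitoring time.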
Statistical Analysis
The indices averaged for both scorers were used for all analyses. Paired t-tests were used to compare the TST and REI between methods. Two-sided values of p < 0.05 adjusted for three pairwise comparisons using Bonferroni correction were considered statistically significant. Agreement between PSG and the alternative methods was depicted graphically using Bland-Altman plots.16 Sensitivity and specificity of each method were calculated for gold standard AHI cutoffs of ≥ 5 and ≥ 15 events/h. SDB was classified using AHI thresholds of < 5, 5–14.9, and ≥ 15 events/h, representing normal, mild, and moderate SDB, respectively. Intraclass correlation coefficients were used to assess agreement between scorers for TST and AHI for both alternative methods. Analyses were performed using SAS version 9.4 statistical software (SAS Institute, Inc., Cary, NC).
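The quantities behind these comparisons can be sketched in a few lines of Python. The study itself used SAS; the function names and the four paired TST values below are hypothetical. The sketch computes the paired t statistic (the p value would then come from a t distribution with n − 1 degrees of freedom), a Bonferroni adjustment, and the Bland-Altman bias with 95% limits of agreement taken as bias ± 1.96 SD of the differences.

```python
import math
import statistics


def paired_differences(reference, alternative):
    """Per-study difference: alternative method minus reference (PSG)."""
    return [alt - ref for ref, alt in zip(reference, alternative)]


def paired_t_statistic(reference, alternative):
    """Return (t, degrees of freedom) for a paired t-test."""
    d = paired_differences(reference, alternative)
    n = len(d)
    t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1


def bonferroni(p_value, n_comparisons):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_comparisons)


def bland_altman(reference, alternative):
    """Return (bias, lower limit, upper limit) of agreement."""
    d = paired_differences(reference, alternative)
    bias = statistics.mean(d)
    sd = statistics.stdev(d)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical TST values in minutes (not study data).
tst_psg = [360.0, 400.0, 330.0, 380.0]
tst_alt = [440.0, 470.0, 420.0, 450.0]

t_stat, dof = paired_t_statistic(tst_psg, tst_alt)
bias, lower, upper = bland_altman(tst_psg, tst_alt)
print(f"t = {t_stat:.2f} on {dof} df; bias = {bias:.1f} min "
      f"(limits of agreement {lower:.1f} to {upper:.1f})")
```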
RESULTS
The 30 participants had a mean age of 68.2 y (standard deviation [SD] 9.1) and a mean body mass index of 27.5 kg/m2 (SD 4.4). The participants were 46.7% male, 36.7% white, 30.0% Hispanic, 20% Chinese, and 20.0% black. A comparison of the TST and AHI as measured by PSG versus method 1 and method 2 is shown in Table 1. The mean (SD) TST using PSG, method 1, and method 2 was 366.0 (70.1), 447.1 (59.0), and 542 (61.9) min, respectively. The mean (SD) AHI using PSG, method 1, and method 2 was 12.5 (8.2), 10.8 (7.0), and 9.1 (6.1) events/h, respectively. Bland-Altman plots depicting the mean bias comparing the TST and AHI between PSG and the two alternative methods are shown in Figures 1 and 2. Compared to PSG, both alternative methods overestimated the TST (method 1: mean difference [SD] 81.1 [56.1] min, method 2: 176.0 [89.7] min; both p < 0.001) and underestimated the AHI (method 1: mean difference [SD] −1.6 [3.3] events/h, p = 0.034; method 2: −3.3 [3.9] events/h, p < 0.001). Compared to method 1, method 2 overestimated the TST (mean difference [SD] 94.9 [66.2] min, p < 0.001) and underestimated the AHI (mean difference [SD] 1.7 [1.2] events/h, p < 0.001).
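A back-of-envelope calculation shows how a longer denominator alone dilutes the index. It uses only the mean values reported above and assumes, purely for illustration, that the number of events is unchanged and that the means can be treated as a single representative study; neither assumption is part of the study's analysis.

```python
# Mean values reported in the Results (minutes and events/h).
TST_PSG_MIN = 366.0
TST_METHOD_1_MIN = 447.1
TST_METHOD_2_MIN = 542.0
AHI_PSG = 12.5


def diluted_index(index_ref, tst_ref_min, tst_alt_min):
    """Index obtained when the same event count is spread over a longer time."""
    return index_ref * tst_ref_min / tst_alt_min


crude_method_1 = diluted_index(AHI_PSG, TST_PSG_MIN, TST_METHOD_1_MIN)
crude_method_2 = diluted_index(AHI_PSG, TST_PSG_MIN, TST_METHOD_2_MIN)

print(f"Crude estimate, method 1: {crude_method_1:.1f} events/h")
print(f"Crude estimate, method 2: {crude_method_2:.1f} events/h")
```

The crude estimates (about 10.2 and 8.4 events/h) fall in the same direction as the reported means of 10.8 and 9.1 events/h. They need not match exactly, because a mean of per-study ratios is not the ratio of the means and because the events counted over a longer scored period may differ.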
Table 1.
Using AHI cutoffs of ≥ 5 and ≥ 15 events/h, the sensitivity was 100% and 70.0% for method 1, and 91.3% and 40.0% for method 2 for identifying studies with SDB, respectively. The specificity was 100% for both methods using both AHI cutoffs (Table 2). Using AHI cutoffs of < 5, 5–14.9, and ≥ 15 events/h on PSG, method 1 resulted in an overall misclassification of 10% and method 2 resulted in an overall misclassification of 26.7% of studies. Misclassification was greater for moderate SDB (AHI ≥ 15 events/h) than for mild SDB (Table 3).
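The classification metrics reported here can be computed as in the following sketch. The severity thresholds follow the Statistical Analysis section; the function names and the five paired values are hypothetical and are not the study data.

```python
def severity(index):
    """Classify an index using thresholds of < 5, 5 to 14.9, and >= 15 events/h."""
    if index < 5:
        return "normal"
    if index < 15:
        return "mild"
    return "moderate"


def sensitivity_specificity(reference, test, cutoff):
    """Sensitivity and specificity of `test` against `reference` at a cutoff."""
    tp = fn = tn = fp = 0
    for ref, val in zip(reference, test):
        if ref >= cutoff:
            if val >= cutoff:
                tp += 1
            else:
                fn += 1
        elif val >= cutoff:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def misclassification_rate(reference, test):
    """Fraction of studies placed in a different severity category."""
    wrong = sum(severity(r) != severity(t) for r, t in zip(reference, test))
    return wrong / len(reference)


# Hypothetical paired indices in events/h (not study data).
ahi_psg = [3.0, 6.0, 12.0, 16.0, 22.0]
rei_alt = [2.5, 4.5, 10.0, 13.0, 18.0]

sens5, spec5 = sensitivity_specificity(ahi_psg, rei_alt, 5)
sens15, spec15 = sensitivity_specificity(ahi_psg, rei_alt, 15)
rate = misclassification_rate(ahi_psg, rei_alt)
print(f">= 5: sensitivity {sens5:.2f}, specificity {spec5:.2f}")
print(f">= 15: sensitivity {sens15:.2f}, specificity {spec15:.2f}")
print(f"Overall misclassification: {rate:.0%}")
```

Because an underestimated index can only move a study to a lower category, false negatives accumulate while specificity stays at 100%, which is the pattern seen in the study.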
Table 2.
Table 3.
The intraclass correlation coefficients for TST and AHI for method 1 were 0.772 and 0.980, respectively, and for method 2 were 0.967 and 0.981, respectively.
DISCUSSION
Our study demonstrates that compared to a full PSG (gold standard), sleep studies without EEG and scored automatically using TRT overestimated the TST, which led to an underestimation of the AHI. This underestimation of AHI resulted in a misclassification of SDB severity in participants. Manual editing of TRT for periods of artifact and probable wakefulness reduced the overestimation of TST and provided an AHI that more closely approximated that obtained from the PSG. As a result, manual editing of TRT improved the sensitivity for identifying studies with SDB and reduced the misclassification of SDB severity compared to studies where AHI was derived from TRT.
It has been debated that use of HSAT may improve access to sleep services and may be associated with reduced cost.17 However, a recent economic analysis of HSAT and related home-based sleep apnea pathways identified that although cost savings are achieved for payers, HSAT is associated with comparable or even greater costs for providers compared to PSG-based pathways.9 A prior economic simulation study also concluded that full-night PSG, not portable monitoring, was the preferred diagnostic strategy in patients suspected to have moderate to severe obstructive sleep apnea.18 In addition, the AASM recommends that patients with high pretest probabilities of moderate to severe obstructive sleep apnea with negative or technically inadequate portable monitoring tests be evaluated with in-laboratory PSG to exclude the possibility of a false negative study.12 Such an approach would entail additional use of resources, thereby reducing the cost-effectiveness of the HSAT strategy as a whole.
Given the reduced reimbursement for HSAT, there are incentives for reducing actual costs and time with the use of automated scoring software. Although the AASM has published guidelines for scoring HSAT, there is variability among sleep centers regarding whether the TRT or monitoring time is used to calculate the REI. As demonstrated in our study, the use of TRT without manual editing of periods of wakefulness and artifact can grossly overestimate the TST and result in significant underestimation of the REI. Given that prior studies have shown that HSAT presents more artifacts and has higher data-loss rates than in-laboratory studies, manual editing may be particularly valuable in reducing the overestimation of the TST.19,20 The overestimation may be especially evident in subjects with long sleep latency and poor sleep efficiency. In our study, use of TRT to define the REI led to a decreased sensitivity in identifying studies with SDB and may misclassify sleep apnea severity. Thus, manual editing of the recording period may improve the sensitivity of the REI without negatively influencing specificity, reducing the likelihood of missing a diagnosis of SDB and/or underappreciating the severity of SDB, either of which may lead to missed opportunities for treatment. It is also important to recognize that there is no standardized definition for TRT, which may account for differences in AHI reported by PSG compared to HSAT. In the current study, TRT was defined as the period between lights off and lights on. However, some providers may allow the patient to manually turn the HSAT device on and off when getting into and out of bed, whereas others program the HSAT device to turn on and off at prespecified times irrespective of when the patient gets in and out of bed. With either approach, manual editing of TRT for artifacts and periods of wakefulness may increase the sensitivity in identifying studies with SDB and reduce misclassification of sleep apnea severity.
Some sleep centers use HSAT devices that calculate the REI using a surrogate ‘monitoring time’ that is calculated using a TRT edited by a proprietary algorithm. Although the surrogate ‘monitoring time’ may better approximate TST than the TRT does, the algorithms are not usually transparent to users, and REI values derived using such proprietary algorithms are device specific (and may even be version specific), making comparison of results and generalization difficult.
To our knowledge, our study is the first to compare TST and AHI from a full PSG to those derived using the TRT and to evaluate the effect of manual editing of TRT on TST and REI. In interpreting our results, we acknowledge several limitations to our study. Sleep studies were scored by only two highly trained polysomnologists familiar with HSAT; it is possible that less trained scorers would perform differently. We also did not include any studies with an AHI > 30 events/h because of our interest in classification in the mild to moderate-severe range. However, it is unlikely that any misclassification of studies with an AHI > 30 events/h will result in a normal sleep study and thus affect treatment decisions or have implications for CPAP coverage. Our analyses were based on studies with a minimum of 5 h of scorable data. Studies with fewer data may be less reliable both because there is less information and because they may not reflect the individual's “typical” sleep. It is possible that greater levels of discrepancy between the scoring methods would have been observed in shorter studies or studies with greater amounts of artifact. A prior review of portable monitors reported data loss of 3% to 18% for type 3 monitors.19 Furthermore, some HSAT devices may not measure airflow using both pressure and thermal sensors, whereas we used both thermistor and nasal pressure cannula to measure airflow. Use of only a single airflow sensor may influence the total number of events identified, or classification of events as apneas versus hypopneas. Last, we also analyzed data from only one type 1 monitor; use of alternative monitors may result in differences.
CONCLUSIONS
Sleep studies scored without EEG channels and without manual adjudication of TRT overestimate the TST and underestimate the AHI. Manual editing of TRT reduces the overestimation of TST and results in an REI that more closely approximates the AHI from PSG. Manual editing also improves the sensitivity for identifying studies with abnormal AHI and reduces the misclassification of SDB severity.
DISCLOSURE STATEMENT
This was not an industry supported study. Funding was provided by NIH 5T32HL007901, 1R01HL083075, R01HL098433, R01 HL098433-02S1, 1U34HL105277-01, 1R01HL110068-01A1, and 1R01HL113338-01. Dr. Redline has received research support from Jazz Pharmaceuticals Inc. Dr. Zee has received research support from Merck & Co., Philips Respironics, Jazz Pharmaceuticals Inc, and Eisai Inc. The other authors have indicated no financial conflicts of interest.
ACKNOWLEDGMENTS
The authors thank all study participants, Stephanie Marvin and Michelle Nicholson (polysomnologists), and Michael Rueschman (data manager) for their contribution to the study. Author contributions: Drs. Zhao and Redline had full access to all of the data in the study and are responsible for the integrity of the data and the accuracy of the data analysis; Drs. Zhao, Redline and Daniel Mobley contributed to the study design and data collection. Drs. Zhao and Weng contributed to the data analysis. All authors contributed to the interpretation of results and the preparation of the manuscript.
ABBREVIATIONS
- AASM: American Academy of Sleep Medicine
- AHI: apnea-hypopnea index
- EEG: electroencephalogram
- EMG: electromyogram
- EOG: electrooculogram
- HSAT: home sleep apnea testing
- MESA: Multi-Ethnic Study of Atherosclerosis
- PSG: polysomnography
- REI: respiratory event index
- SD: standard deviation
- SDB: sleep-disordered breathing
- TST: total sleep time
- TRT: total recording time
REFERENCES
- 1. Peppard PE, Young T, Barnet JH, Palta M, Hagen EW, Hla KM. Increased prevalence of sleep-disordered breathing in adults. Am J Epidemiol. 2013;177(9):1006–1014. doi: 10.1093/aje/kws342.
- 2. Somers VK, White DP, Amin R, et al. American Heart Association Council for High Blood Pressure Research Professional Education Committee; Council on Clinical Cardiology: American Heart Association Stroke Council; American Heart Association Council on Cardiovascular Nursing; American College of Cardiology Foundation. Circulation. 2008;118(10):1080–1111. doi: 10.1161/CIRCULATIONAHA.107.189375.
- 3. Javaheri S, Caref EB, Chen E, Tong KB, Abraham WT. Sleep apnea testing and outcomes in a large cohort of Medicare beneficiaries with newly diagnosed heart failure. Am J Respir Crit Care Med. 2011;183(4):539–546. doi: 10.1164/rccm.201003-0406OC.
- 4. Gibson GJ. Obstructive sleep apnoea syndrome: underestimated and undertreated. Br Med Bull. 2005;72:49–65. doi: 10.1093/bmb/ldh044.
- 5. Holman FA. Home testing surges as bed growth declines. Sleep Review Web site. [Accessed April 28, 2016]. http://www.sleepreviewmag.com/2012/09/home-testing-continues-growth/. Published September 30, 2012.
- 6. Berry RB, Hill G, Thompson L, McLaurin V. Portable monitoring and autotitration versus polysomnography for the diagnosis and treatment of sleep apnea. Sleep. 2008;31(10):1423–1431.
- 7. Mulgrew AT, Fox N, Ayas NT, Ryan CF. Diagnosis and initial management of obstructive sleep apnea without polysomnography: a randomized validation study. Ann Intern Med. 2007;146(3):157–166. doi: 10.7326/0003-4819-146-3-200702060-00004.
- 8. Rosen CL, Auckley D, Benca R, et al. A multisite randomized trial of portable sleep studies and positive airway pressure autotitration versus laboratory-based polysomnography for the diagnosis and treatment of obstructive sleep apnea: the HomePAP Study. Sleep. 2012;35(6):757–767. doi: 10.5665/sleep.1870.
- 9. Kim RD, Kapur VK, Redline-Bruch J, et al. An economic evaluation of home versus laboratory-based diagnosis of obstructive sleep apnea. Sleep. 2015;38(7):1027–1037. doi: 10.5665/sleep.4804.
- 10. Decision memo for sleep testing for obstructive sleep apnea (OSA) (CAG-00405N). Centers for Medicare & Medicaid Services Web site. [Accessed February, 2016]. https://www.cms.gov/Regulations-and-Guidance/Guidance/Transmittals/downloads/R86NCD.pdf.
- 11. Berry RB, Brooks R, Gamaldo CE, et al. for the American Academy of Sleep Medicine. The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications. Darien, IL: American Academy of Sleep Medicine; 2015. Version 2.2.
- 12. Collop NA, Anderson WM, Boehlecke B, et al. Clinical guidelines for the use of unattended portable monitors in the diagnosis of obstructive sleep apnea in adult patients. Portable Monitoring Task Force of the American Academy of Sleep Medicine. J Clin Sleep Med. 2007;3(7):737–747.
- 13. Bild DE, Bluemke DA, Burke GL, et al. Multi-ethnic study of atherosclerosis: objectives and design. Am J Epidemiol. 2002;156(9):871–881. doi: 10.1093/aje/kwf113.
- 14. Chen X, Wang R, Zee P, et al. Racial/ethnic differences in sleep disturbances: the Multi-Ethnic Study of Atherosclerosis (MESA). Sleep. 2015;38(6):877–888. doi: 10.5665/sleep.4732.
- 15. Redline S, Budhiraja R, Kapur V, et al. The scoring of respiratory events in sleep: reliability and validity. J Clin Sleep Med. 2007;3(2):169–200.
- 16. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307–310.
- 17. Kirsch DB. PRO: sliding into home: portable sleep testing is effective for diagnosis of obstructive sleep apnea. J Clin Sleep Med. 2013;9(1):5–7. doi: 10.5664/jcsm.2324.
- 18. Pietzsch JB, Garner A, Cipriano LE, Linehan JH. An integrated health-economic analysis of diagnostic and therapeutic strategies in the treatment of moderate-to-severe obstructive sleep apnea. Sleep. 2011;34(6):695–709. doi: 10.5665/SLEEP.1030.
- 19. Flemons WW, Littner MR, Rowley JA, et al. Home diagnosis of sleep apnea: a systematic review of the literature. An evidence review cosponsored by the American Academy of Sleep Medicine, the American College of Chest Physicians, and the American Thoracic Society. Chest. 2003;124(4):1543–1579. doi: 10.1378/chest.124.4.1543.
- 20. Tonelli de Oliveira AC, Martinez D, Vasconcelos LF, et al. Diagnosis of obstructive sleep apnea syndrome and its outcomes with home portable monitoring. Chest. 2009;135(2):330–336. doi: 10.1378/chest.08-1859.