Journal of Clinical Sleep Medicine : JCSM : Official Publication of the American Academy of Sleep Medicine
2024 May 1;20(5):709–717. doi: 10.5664/jcsm.10982

The impact of study type and sleep measurement on oxygen desaturation index calculation

Carley B Whenn 1,2, Danielle L Wilson 1,2, Warren R Ruehland 1,2, Thomas J Churchward 1,2, Christopher Worsnop 1,2, Julie Tolson 1,2,3
PMCID: PMC11063702  PMID: 38169424

Abstract

Study Objectives:

The oxygen desaturation index (ODI) is an important measure of sleep-disordered breathing during polysomnography (PSG); however, the AASM Manual (V3) does not specify whether to include oxygen desaturations occurring during wake epochs. Additionally, an ODI obtained from PSG can differ from an ODI using home sleep apnea tests (HSATs) that do not measure sleep, hampering diagnostic and treatment decision reliability. This study aimed to (1) compare an ODI that included all desaturations with an ODI that excluded desaturations occurring during wake epochs in PSG and (2) compare ODIs obtained from PSG with HSAT.

Methods:

100 consecutive PSGs for investigation of obstructive sleep apnea were compared. ODIs were calculated including all desaturations (ODIall) and by excluding desaturations entirely during wake epochs (ODIsleep). Additionally, we compared ODIall with an ODI calculated using monitoring time as the denominator (ODIHSAT).

Results:

The median (interquartile range) 3% ODI for ODIall was 22.8 (13.1, 44.1) events/h and ODIsleep was 17.6 (11.5, 35.2) events/h (median difference: –3.9 events/h [–8.2, –0.9]; 21.0% [8.7%, 33.2%]). This discrepancy was larger with increasing ODI and decreasing sleep efficiency. The ODIHSAT was 17.4 (11.3, 35.2) events/h and the median reduction in ODIHSAT vs ODIall was –4.5 (–10.9, –2.0) events/h (21.6% [11.1%, 33.8%]).

Conclusions:

ODI was significantly reduced when desaturations in wake epochs were excluded, and when ODI was based on monitoring time rather than sleep time, with the potential for underestimation of disease severity. Results suggest that ODI can differ substantially depending on the calculation and study type used, and that there is a need for standardization to ensure consistent diagnosis and treatment outcomes.

Citation:

Whenn CB, Wilson DL, Ruehland WR, Churchward TJ, Worsnop C, Tolson J. The impact of study type and sleep measurement on oxygen desaturation index calculation. J Clin Sleep Med. 2024;20(5):709–717.

Keywords: desaturation, epoch, calculation, oxygen, sleep


BRIEF SUMMARY

Current Knowledge/Study Rationale: The oxygen desaturation index (ODI) is an important measure of sleep-disordered breathing. However, there is no accepted standard for its calculation during polysomnography, nor guidelines on how results from polysomnography can be compared with limited channel recordings. Since the clinical significance of underestimation of ODI using different study types and calculation methods is currently unknown, this study aimed to compare different calculation methods using current guidelines and to investigate any differences.

Study Impact: Results quantify the degree to which different calculation methods and different sleep study types can produce different ODIs. Results inform clinical and research practice and emphasize the need for improved standardization, transparency, and translatability of ODI measurements acquired across different testing modalities, to ensure appropriate diagnosis and treatment of sleep-disordered breathing.

INTRODUCTION

Polysomnography (PSG) is routinely performed for the investigation of obstructive sleep apnea (OSA). OSA is characterized by repeated episodes of upper airway obstruction, which can result in arterial oxygen desaturations, measured using pulse oximetry. The oxygen desaturation index (ODI) is a measure of the number of times per hour that blood oxygen saturation decreases by a predetermined amount (typically 3%, ODI3, or 4%, ODI4) from baseline.

The purpose of the ODI is to quantify the bearing of nocturnal respiratory events on oxygen saturation. However, there is no clearly defined standard for the calculation of ODI within PSG, hampering efforts to compare ODI across sleep centers and research studies.1 Version 3 of The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications2 (AASM Scoring Manual) provides definitions to assist in the calculation of ODI; however, these definitions are open to interpretation. The AASM Scoring Manual2 defines ODI when calculated using PSG as “the number of oxygen desaturations × 60/total sleep time.” The definition of an oxygen desaturation is only described in the scoring of hypopneas (in adults) as a desaturation of ≥ 3% or ≥ 4% (depending on the hypopnea rule set selected) from pre-event baseline.2
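As a minimal sketch of the quoted definition, the index is an event count scaled to an hourly rate. The function name `odi_per_hour` and its arguments are illustrative and are not taken from the AASM Scoring Manual; the example values are hypothetical.

```python
def odi_per_hour(n_desaturations: int, total_sleep_time_min: float) -> float:
    """ODI = number of oxygen desaturations x 60 / total sleep time (minutes)."""
    if total_sleep_time_min <= 0:
        raise ValueError("total sleep time must be positive")
    return n_desaturations * 60.0 / total_sleep_time_min


# Hypothetical example: 120 desaturations scored over 360 minutes of sleep.
example_odi = odi_per_hour(120, 360.0)  # 20.0 events/h
```

The definition fixes the denominator but, as discussed next, leaves open which desaturations belong in the numerator.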

PSG uses 30-second epochs for the visual identification of sleep stages. The AASM Scoring Manual2 does not specify whether oxygen desaturations occurring during wake epochs should be included in the numerator of the calculation of ODI during PSG. When measured using AASM rules, if a respiratory event is followed by a cortical arousal leading to wakefulness, due to the physiological lag time between event and oximetry measurement (typically 25–30 seconds after the event), the corresponding desaturation may occur in an epoch scored as wake3 (Figure 1). If oxygen desaturations occurring in wake epochs are excluded from the numerator of the calculation, this may substantially underestimate ODI, particularly when sleep is highly fragmented.

Figure 1. Example PSG fragment demonstrating a respiratory event beginning in a sleep epoch, with a corresponding oxygen desaturation occurring in a wake epoch.

Epochs are labeled as W = wake, 1 = stage 1, or 2 = stage 2 sleep. Desat = desaturation, Abdo = abdominal RIP signal, Flow = nasal flow signal, Pleth = oximeter plethysmography, PSG = polysomnography, SpO2% = % oxygen saturation, Therm = thermistor signal, Thor = thoracic respiratory inductance plethysmography (RIP) signal.

Although home sleep apnea tests (HSATs) that record electroencephalograms (EEGs) are used in some settings, the ODI can also be calculated using type 3 or 4 limited channel HSATs where EEG is not used to quantify sleep. Limited channel HSATs are being utilized frequently4 due to their reduced waiting times and expense in comparison to attended in-laboratory PSG. The AASM Scoring Manual recommends using monitoring time (MT) as the denominator when calculating ODI during HSAT when sleep is not recorded. MT is defined as “total recording time minus periods of artifact and time when the patient was awake as determined by actigraphy, body position sensor, respiratory pattern or patient diary.”2 Thus, MT is a surrogate for total sleep time (TST) when sleep is not recorded. The numerator in such studies is the total number of oxygen desaturations within MT, as epoch scoring is not used in the absence of sleep variables.
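The MT definition quoted above can be sketched as a simple subtraction, followed by the same hourly scaling used for PSG. This is an illustration only: the function names, the use of minutes as the unit, and the example values are assumptions rather than part of the AASM Scoring Manual.

```python
def monitoring_time_min(total_recording_min: float,
                        artifact_min: float,
                        probable_wake_min: float) -> float:
    """MT = total recording time minus artifact and minus time judged awake."""
    mt = total_recording_min - artifact_min - probable_wake_min
    if mt <= 0:
        raise ValueError("monitoring time must be positive")
    return mt


def odi_hsat(n_desaturations: int, mt_min: float) -> float:
    """ODI for a study without sleep staging: events per hour of MT."""
    return n_desaturations * 60.0 / mt_min


# Hypothetical recording: 480 min in total, 10 min of artifact,
# 50 min of probable wakefulness, 140 desaturations within MT.
mt = monitoring_time_min(480.0, 10.0, 50.0)  # 420.0 min
example_odi_hsat = odi_hsat(140, mt)         # 20.0 events/h
```

Because wakefulness is only estimated from non-EEG signals, MT will usually be longer than TST for the same night, which lowers the index for the same event count.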

Given that ODI can be calculated using PSG or HSAT it is important to assess the impact of the common calculation methods used in these study types. An understanding of how ODI can differ across and within study types may allow for a consistent approach to diagnosis and appropriate treatment of OSA regardless of the study type used.

The first aim of this study was to investigate and quantify the impact of including vs excluding oxygen desaturations occurring during wake epochs, when calculating ODI within PSG. The second aim was to compare ODI calculation in PSG vs simulated HSATs, where the denominator was calculated using MT.

METHODS

This was an observational study in which consecutive type 2 home diagnostic PSGs using Compumedics Somté System V2 (Compumedics Ltd, Abbottsford, Australia) performed between November 2019 and March 2020 from the clinical sleep laboratory at Austin Health, Australia, were retrospectively reviewed. Signals available for analysis were EEG, electrooculogram, chin and leg electromyogram, electrocardiogram, thermistor, nasal pressure, thoracic and abdominal respiratory inductance plethysmography, SpO2 plethysmography, position, snore, and SpO2. The monitoring frequency of the inbuilt oximeter was 32 samples per second. To ensure focus on the variables of interest, strict exclusion criteria were utilized. Studies were excluded if > 5% of a study had SpO2 signal loss or poor signal quality during dark time (“lights off” to “lights on”). To obtain 100 PSGs in the analysis, excluded PSGs were replaced with the next consecutive study. PSGs were de-identified and demographic information was collected. Each PSG had a score set created with original clinical sleep scoring completed by experienced sleep scorers who participate regularly in an inter- and intra-laboratory scoring concordance program,4,5 ensuring that all oxygen desaturations ≥ 3% were scored in sleep and wake epochs during dark time. Dark time (or time available for sleep) was calculated by the original sleep scorer using the patient’s reported lights-out and lights-on times. Although clinical and research literature utilizes both 3% ODI and 4% ODI, we focused on 3% ODI in this study, as current definitions favor using a 3% ODI,2,3 based on the recommended hypopnea definition that also incorporates a 3% oxygen desaturation criterion. Calculations for 4% ODI can be found in Tables S2–S7 and Figures S1 and S2 in the supplemental material.

Three different ODIs were calculated from each PSG: (1) all oxygen desaturation events/TST (ODIall), (2) oxygen desaturation events spanning sleep epochs only/TST (ODIsleep), and (3) all oxygen desaturation events/MT (ODIHSAT). Additionally, a fourth definition included oxygen desaturation events beginning in sleep epochs only/TST (ODIstart), as we were aware of some clinical laboratories using this method.
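The three PSG-based numerators can be illustrated with a short sketch that assigns each desaturation to the 30-second epochs it overlaps. This is not the custom script used in the study; the data structures, the names `Desaturation`, `epochs_touched`, and `compute_odis`, and the example hypnogram are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence

EPOCH_S = 30  # PSG epoch length in seconds


@dataclass
class Desaturation:
    start_s: float  # onset, seconds from lights off
    end_s: float    # end, seconds from lights off


def epochs_touched(event: Desaturation, n_epochs: int) -> range:
    """Indices of every 30-second epoch the event overlaps."""
    first = int(event.start_s // EPOCH_S)
    # Treat the end time as exclusive so that an event ending exactly on an
    # epoch boundary does not count the following epoch.
    last = max(first, math.ceil(event.end_s / EPOCH_S) - 1)
    return range(first, min(last, n_epochs - 1) + 1)


def compute_odis(events: List[Desaturation],
                 hypnogram: Sequence[str]) -> Dict[str, float]:
    """hypnogram holds one stage label per epoch; "W" marks wake."""
    is_sleep = [stage != "W" for stage in hypnogram]
    tst_h = sum(is_sleep) * EPOCH_S / 3600.0
    n_all = len(events)
    # ODIsleep: keep events that overlap at least one sleep epoch.
    n_sleep = sum(
        any(is_sleep[i] for i in epochs_touched(e, len(hypnogram)))
        for e in events
    )
    # ODIstart: keep events whose onset falls in a sleep epoch.
    n_start = sum(is_sleep[int(e.start_s // EPOCH_S)] for e in events)
    return {
        "ODIall": n_all / tst_h,
        "ODIsleep": n_sleep / tst_h,
        "ODIstart": n_start / tst_h,
    }


# Hypothetical night: 60 sleep epochs, 4 wake epochs, 60 sleep epochs (TST = 1 h).
hypnogram = ["2"] * 60 + ["W"] * 4 + ["2"] * 60
events = [
    Desaturation(100, 130),    # entirely in sleep
    Desaturation(1790, 1830),  # starts in sleep, runs into wake
    Desaturation(1810, 1840),  # entirely in wake
    Desaturation(1900, 1940),  # starts in wake, runs into sleep
]
result = compute_odis(events, hypnogram)
```

In this toy example the same four events give 4.0, 3.0, and 2.0 events/h for ODIall, ODIsleep, and ODIstart, respectively, showing how the numerator rule alone changes the index.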

To obtain ODIHSAT, signals used to score sleep (EEG, electrooculogram, and electromyogram) were deleted, leaving only channels typically used in type 3 studies (nasal pressure, thoracic and abdominal respiratory inductance plethysmography, position, snore, and SpO2). After excluding artifact, MT was determined by marking periods of probable wakefulness as “lights on” during total dark time, by a single experienced sleep scorer who participates regularly in an inter- and intra-laboratory scoring concordance program.6 In line with Zhao et al,7 probable wakefulness was scored where there were consecutive epochs with consistent and sustained movement artifact, particularly when that movement was coincident with a position change (Figure 2).

Figure 2. Example PSG epochs demonstrating probable wakefulness used for calculation of monitoring time (A) and when sleep would be continued (B).

Abdo = abdominal RIP signal, Flow = nasal flow signal, Pleth = oximeter plethysmography, Pos = body position, PSG = polysomnography, Snore = snore signal, SpO2% = % oxygen saturation, Therm = thermistor signal, Thor = thoracic RIP signal.

Statistical analysis

All statistical analyses were performed with Stata 17.0 (StataCorp LP, College Station, TX, USA). As most variables were skewed, values are reported as medians and interquartile ranges. A P value < .05 was considered to indicate statistical significance.

Comparison between ODI calculation methods

For PSGs, a Bland-Altman plot8 was produced to display the difference between ODIall and ODIsleep vs the mean of the 2 ODIs. To investigate whether the discrepancy in ODIs increased as the mean ODI increased, linear regression modeling was performed with both variables log-transformed due to skewness. To estimate the clinical implications of using ODI based on HSAT rather than PSG in the determination of OSA severity, a cross-tabulation illustrating ODI severity misclassification was compiled based on established apnea-hypopnea index (AHI) cutoffs.9 OSA severity was classified as ODI < 5 events/h (normal/no OSA diagnosis), ODI ≥ 5 to < 15 events/h (mild OSA), ODI ≥ 15 to < 30 events/h (moderate OSA), and ODI ≥ 30 events/h (severe OSA).
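The severity cutoffs listed above, and the cross-tabulation built from them, can be sketched as follows. The function names and the short category labels are illustrative; only the thresholds come from the text.

```python
from collections import Counter
from typing import Sequence


def odi_severity(odi: float) -> str:
    """Map an ODI (events/h) to a severity category using the AHI cutoffs."""
    if odi < 5:
        return "normal"
    if odi < 15:
        return "mild"
    if odi < 30:
        return "moderate"
    return "severe"


def cross_tabulate(odi_hsat: Sequence[float], odi_psg: Sequence[float]) -> Counter:
    """Count studies by (HSAT category, PSG category)."""
    return Counter(
        (odi_severity(h), odi_severity(p)) for h, p in zip(odi_hsat, odi_psg)
    )


# Hypothetical pair of studies: the first drops from mild to normal on HSAT,
# the second stays moderate under both calculations.
table = cross_tabulate([4.0, 17.4], [6.0, 22.8])
```

Any cell in which the two categories differ represents a study whose severity label depends on the calculation method.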

Demographic and PSG variables contributing to ODI differences

Linear regression was used to assess the univariate relationships between the difference in ODIall and ODIsleep with demographic and sleep variables. To meet analysis assumptions, data were transformed prior to analysis as follows: due to positive skewness, AHI, sleep latency, and wake after sleep onset were square-root transformed, whereas sleep efficiency, SpO2 nadir, and %TST SpO2 below 95% were reflected and then square-root transformed due to negative skewness. The %TST SpO2 below 90% was log-transformed, and SpO2 baseline during sleep was reflected and then log-transformed. Given that all difference values for ODIsleep – ODIall were negative, they were both converted to absolute values before log-transformation. Backward stepwise regression modeling was then performed, with differences in ODIsleep and ODIall as the dependent variables, and all explanatory variables entered into each model with manual removal of the least significant, until the models were an optimal fit for the data.
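For readers unfamiliar with reflected transformations, the sketch below shows one common convention, in which each value is subtracted from the sample maximum plus 1 before the square root or logarithm is taken. The constant actually used in the analysis is not stated in the text, so this convention, the function names, and the example values are assumptions.

```python
import math
from typing import List, Sequence


def reflect(values: Sequence[float]) -> List[float]:
    """Reflect a negatively skewed variable so that it becomes positively skewed.

    Assumed convention: reflected = (max + 1) - value, which keeps every
    reflected value at or above 1.
    """
    k = max(values) + 1
    return [k - v for v in values]


def reflect_sqrt(values: Sequence[float]) -> List[float]:
    return [math.sqrt(v) for v in reflect(values)]


def reflect_log(values: Sequence[float]) -> List[float]:
    return [math.log(v) for v in reflect(values)]


# Hypothetical SpO2 nadir values (%): 94, 91, 86 reflect to 1, 4, 9.
transformed = reflect_sqrt([94.0, 91.0, 86.0])  # [1.0, 2.0, 3.0]
```

Reflection reverses the ordering of the variable, so a positive coefficient on a reflected predictor corresponds to an association with decreasing values of the original variable.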

To specifically investigate the impact of sleep efficiency on ODI when including vs excluding oxygen desaturations scored in wake epochs, Kruskal-Wallis H tests were calculated across sleep efficiency categories based on a quartile split.
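The quartile split and the Kruskal-Wallis H statistic can be sketched in plain Python as below. The analysis itself was run in Stata; this version is only an illustration, the quartile cut points depend on the percentile method (here the default of `statistics.quantiles`, which may differ from Stata's), and all names and example values are hypothetical.

```python
from statistics import quantiles
from typing import List, Sequence


def quartile_groups(sleep_eff: Sequence[float],
                    diffs: Sequence[float]) -> List[List[float]]:
    """Split ODI differences into 4 groups by sleep efficiency quartile."""
    q1, q2, q3 = quantiles(sleep_eff, n=4)
    groups: List[List[float]] = [[], [], [], []]
    for se, d in zip(sleep_eff, diffs):
        groups[(se > q1) + (se > q2) + (se > q3)].append(d)
    return groups


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """Kruskal-Wallis H with average ranks and the standard tie correction."""
    groups = [g for g in groups if len(g) > 0]
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                             # size of this tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    rank_sums = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sums - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))


# Hypothetical check with three fully separated groups.
h_example = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

H is referred to a chi-square distribution with (number of groups − 1) degrees of freedom, which is 3 for a quartile split.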

RESULTS

One hundred and fifty-nine PSGs were assessed, with 59 studies excluded due to > 5% total dark time with poor SpO2 signal quality. The demographic and sleep variables for the 100 included PSG studies are detailed in Table 1. The patients from the excluded PSGs were similar in age (P = .16), sex (P = 1.0), and body mass index (P = .50) to those included in the study.

Table 1.

Demographics and sleep parameters.

Parameter Min Max Value
Age, y 18.0 86.0 52.5 (41.5, 65.0)
Sex, % male 63
BMI, kg/m2 17.8 55.0 31.5 (27.0, 36.9)
TST, min 148.0 513.5 348.5 (303.3, 409.8)
Sleep efficiency, % 35.5 94.8 75.3 (65.1, 84.5)
Sleep latency, min 0 104.5 17.0 (7.3, 33.0)
WASO, min 14.0 255.5 79.0 (48.5, 140.3)
NREM sleep, min 126.5 418.0 283.5 (249.8, 320.3)
REM sleep, min 0.0 171.0 66.0 (47.0, 87.3)
AHI, events/h 0.0 88.7 20.5 (12.0, 38.0)
SpO2 baseline wake, % 90.0 98.0 95.0 (94.0, 96.0)
SpO2 baseline sleep, % 87.0 97.0 94.5 (93.0, 96.0)
TST SpO2 < 95%, % 0.6 100.0 74.7 (34.5, 91.2)
TST SpO2 < 90%, % 0.0 77.3 1.3 (0.2, 7.6)
TST SpO2 < 88%, % 0.0 50.2 0.2 (0.0, 3.2)
SpO2 nadir, % 49.0 94.0 85.0 (79.0, 88.0)

n = 100. Values given as median (IQR) or n (%). AHI = apnea-hypopnea index, BMI = body mass index, Max = maximum, Min = minimum, NREM = non–rapid eye movement, REM = rapid eye movement, TST = total sleep time, WASO = wake after sleep onset.

Comparison of ODI in sleep only vs sleep and wake

The median (interquartile range [IQR]) 3% ODIall was 22.8 events/h (13.1, 44.1) compared to ODIsleep, which was 17.6 events/h (11.5, 35.2). The median (IQR) difference (ODIsleep – ODIall) was –3.9 events/h (–8.2, –1.9), equating to a decrease of 21.0% (8.7%, 33.2%) from ODIall to ODIsleep.

As Figure 3 illustrates, the discrepancy between the 2 ODI indices was larger with increasing average ODI [F(1,98) = 42.62, P < .001, R2 = 30.31%; with log-transformations]. For every 10% increase in the average ODI, the discrepancy between the ODIall and ODIsleep increased by 7.6%.
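The "per 10% increase" statement follows from how a log-log regression slope is interpreted: if log(discrepancy) = a + b × log(average ODI), then multiplying the average ODI by 1.10 multiplies the discrepancy by 1.10^b. The slope b is not reported in the text; the sketch below back-calculates the value implied by the reported 7.6%, and the function names are illustrative.

```python
import math


def pct_change_in_outcome(slope: float, pct_increase_in_predictor: float) -> float:
    """Percentage change in the outcome for a given percentage increase
    in the predictor, under a log-log linear model."""
    return ((1 + pct_increase_in_predictor / 100) ** slope - 1) * 100


def implied_slope(pct_change_outcome: float, pct_increase_predictor: float) -> float:
    """Log-log slope implied by a reported pair of percentage changes."""
    return math.log(1 + pct_change_outcome / 100) / math.log(
        1 + pct_increase_predictor / 100
    )


# A 7.6% increase in discrepancy per 10% increase in average ODI implies
# a slope of roughly 0.77 (back-calculated, not reported by the authors).
b = implied_slope(7.6, 10.0)
```

A slope below 1 means the absolute discrepancy grows with the average ODI, but slightly less than proportionally.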

Figure 3. Bland-Altman plot illustrating agreement between ODIs (3%) including (ODIall) vs excluding oxygen desaturations entirely in wake epochs (ODIsleep), showing median, 5th, and 95th percentiles.

The data demonstrate that, as the average of ODIall and ODIsleep increases, the discrepancy between these oxygen desaturation indices becomes larger with a wider spread. ODI = oxygen desaturation index, ODIall = oxygen desaturation index calculated including all oxygen desaturations with total sleep time as the denominator, ODIsleep = oxygen desaturation index calculated with desaturations entirely in wake epochs excluded, with total sleep time as the denominator.
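The quantities plotted in a Bland-Altman figure of this kind can be sketched as follows: one (mean, difference) point per study, plus the median, 5th, and 95th percentiles of the differences. The percentile method used for the published figure is not stated, so the "inclusive" method below is an assumption, and the names and example values are hypothetical.

```python
from statistics import median, quantiles
from typing import Dict, List, Sequence, Tuple


def bland_altman_points(odi_all: Sequence[float],
                        odi_sleep: Sequence[float]) -> List[Tuple[float, float]]:
    """One (mean of the 2 ODIs, ODIsleep - ODIall) pair per study."""
    return [((a + s) / 2, s - a) for a, s in zip(odi_all, odi_sleep)]


def summary_lines(diffs: Sequence[float]) -> Dict[str, float]:
    """Median, 5th, and 95th percentiles of the differences."""
    cuts = quantiles(diffs, n=20, method="inclusive")  # 5%, 10%, ..., 95%
    return {"p5": cuts[0], "median": median(diffs), "p95": cuts[-1]}


# Hypothetical studies: the gap widens as the average ODI rises.
points = bland_altman_points([10.0, 20.0, 40.0], [9.0, 17.0, 30.0])
lines = summary_lines([d for _, d in points])
```

Percentile-based limits are used here instead of the mean ± 1.96 standard deviations because the differences are skewed.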

When the ODI calculation was limited to oxygen desaturations beginning in sleep epochs only, the median (IQR) for ODIstart was 16.3 events/h (9.8, 33.1), which was significantly less (–1.0 [–0.4, –2.2] events/h) than ODIsleep (P < .001). Figure 4 shows a comparison of the 3 ODI indices generated from PSG.

Figure 4. Comparison between ODI indices generated during polysomnography.

ODIall includes all oxygen desaturations, ODIsleep includes oxygen desaturations spanning sleep epochs, and ODIstart includes only oxygen desaturations that begin in sleep epochs. Total sleep time was the denominator for all indices. All pairwise comparisons were significant (P < .001). ODI = oxygen desaturation index.

Univariate linear regression revealed that several demographic and sleep-related variables were significantly related to the magnitude of the difference between ODIall and ODIsleep, the strongest of which were sleep efficiency, SpO2 baseline during sleep, SpO2 baseline during wake, and minutes of non–rapid eye movement sleep (see Table S1 in the supplemental material). However, regression modeling revealed that the best-fitting model contained only 3 predictors, with decreasing sleep efficiency, SpO2 baseline during wake, and SpO2 nadir explaining 52.7% of the variance in the difference between ODIall and ODIsleep (Table 2).

Table 2.

Final model for prediction of difference between 3% oxygen desaturation indices including (ODIall) vs excluding (ODIsleep) oxygen desaturations solely in wake epochs.

| Variable | Coefficient | 95% CI | Beta | t | P value |
|---|---|---|---|---|---|
| Sleep efficiency (square root) | 3.70 | 2.68, 4.73 | .51 | 7.16 | <.001 |
| SpO2 baseline (wake) | −1.92 | −3.04, −.79 | −.28 | −3.39 | .001 |
| SpO2 nadir (square root) | 2.04 | .59, 3.48 | .22 | 2.80 | .006 |
| Constant | 166.72 | 56.85, 276.58 | | 3.01 | .003 |

| Final model | Value |
|---|---|
| R2 | .5270 |
| Adjusted R2 | .5123 |
| Standard error | 7.58 |
| F ratio (3, 96) | 35.66 |
| Significance | <.001 |

n = 100. The best-fitting model contained 3 predictors, with decreasing sleep efficiency, decreasing SpO2 baseline during wake and decreasing SpO2 nadir explaining 52.7% of the variance in the difference between ODIall and ODIsleep log-transformed. Sleep efficiency and SpO2 nadir reflected before square root transformation. CI = confidence interval, ODI = oxygen desaturation index, ODIall = oxygen desaturation index calculated including all oxygen desaturations with total sleep time as the denominator, ODIsleep = oxygen desaturation index calculated with desaturations entirely in wake epochs excluded, with total sleep time as the denominator.
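The model-fit statistics in Table 2 are linked by standard formulas, which the sketch below uses as a consistency check. The inputs (R2 = .5270, n = 100, 3 predictors) are taken from the table; the function names are illustrative, and small differences from the published values are expected because R2 is rounded.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R2 for a model with k predictors fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_ratio(r2: float, n: int, k: int) -> float:
    """Overall F statistic with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


adj = adjusted_r2(0.5270, n=100, k=3)  # about 0.512 (table: .5123)
f = f_ratio(0.5270, n=100, k=3)        # about 35.65 (table: 35.66)
```

Both recomputed values agree with the table to within rounding.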

Considering that sleep efficiency was the strongest predictor of the magnitude of the difference between ODIall and ODIsleep, Figure 5 demonstrates the difference in ODI at different sleep efficiencies. The difference in ODIall and ODIsleep was significantly greater with decreasing sleep efficiency [χ2(3) = 59.44, P < .001] (Figure 5), with all pairwise comparisons between sleep efficiency quartiles being significant (P < .025). Sleep efficiency was also the strongest predictor of the magnitude of the difference between ODIall and ODIsleep for 4% ODI (see Table S6 and Table S7 in the supplemental material).

Figure 5. The median (IQR) decrease in ODI across sleep efficiency quartiles when desaturations entirely in wake are included (ODIall) compared with when they are excluded (ODIsleep) from calculations.

All pairwise comparisons were significant (P < .025). IQR = interquartile range, ODI = oxygen desaturation index, ODIall = oxygen desaturation index calculated including all oxygen desaturations with total sleep time as the denominator, ODIsleep = oxygen desaturation index calculated with desaturations entirely in wake epochs excluded, with total sleep time as the denominator.

Comparison of ODI calculation based on HSAT vs PSG

The median (IQR) 3% ODIall was 22.8 events/h (13.1, 44.1) and ODIHSAT was 17.4 events/h (11.3, 35.2). The median (IQR) difference (ODIHSAT – ODIall) was –4.5 events/h (–10.9, –2.0), equating to a decrease in ODI of 21.6% (11.1%, 33.8%) when using MT rather than TST as the denominator (Figure 6). The median increase in MT compared to TST across the 100 PSGs was 98.8 (51.8, 154.5) minutes. As the Bland-Altman plot in Figure 6 illustrates, the discrepancy between ODIall and ODIHSAT was larger with increasing average ODI [F(1, 98) = 261.00, P < .001, R2 = 72.70%].
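Because the event count is unchanged, switching the denominator from TST to MT rescales the index by the ratio TST/MT. The sketch below applies this to the reported medians purely as a rough illustration: group medians do not combine exactly, so the result only approximates the reported median ODIHSAT and median percentage decrease. The function name is illustrative.

```python
def odi_with_new_denominator(odi_tst: float, tst_min: float, mt_min: float) -> float:
    """Rescale an ODI from a TST denominator to an MT denominator."""
    return odi_tst * tst_min / mt_min


# Reported medians: TST 348.5 min (Table 1), MT about 98.8 min longer,
# ODIall 22.8 events/h.
tst = 348.5
mt = tst + 98.8                                       # 447.3 min
approx_odi_hsat = odi_with_new_denominator(22.8, tst, mt)
pct_reduction = (1 - tst / mt) * 100                  # roughly 22%
```

The roughly 22% reduction obtained from the medians is close to the reported per-study median decrease of 21.6%, consistent with the difference arising from the denominator alone.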

Figure 6. Bland-Altman plot illustrating agreement between ODIs based on PSG vs HSAT, showing median, 5th, and 95th percentiles.

The data demonstrate that, as the average of ODIall and ODIHSAT increases, the discrepancy between these oxygen desaturation indices becomes larger with a wider spread. HSAT = home sleep apnea test, ODI = oxygen desaturation index, ODIall = oxygen desaturation index calculated with total sleep time as the denominator, ODIHSAT = oxygen desaturation index calculated with monitoring time as the denominator, PSG = polysomnography.

Classifying the severity of ODI based on HSAT rather than PSG would lead to 21% of patients being misclassified into a less severe ODI category based on established AHI cutoffs (Table 3). No patients were “false positives” on HSAT. Each misclassified patient was only classified to the next lowest severity category.

Table 3.

Classification of ODI severity based on PSG vs HSAT.

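| HSAT (rows) vs PSG (columns) | PSG: ODI < 5 events/h | PSG: ODI ≥ 5 to < 15 events/h | PSG: ODI ≥ 15 to < 30 events/h | PSG: ODI ≥ 30 events/h |
|---|---|---|---|---|
| HSAT: ODI < 5 events/h | 8 | 2 | 0 | 0 |
| HSAT: ODI ≥ 5 to < 15 events/h | 0 | 15 | 12 | 0 |
| HSAT: ODI ≥ 15 to < 30 events/h | 0 | 0 | 27 | 7 |
| HSAT: ODI ≥ 30 events/h | 0 | 0 | 0 | 29 |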

n = 100. PSG calculation based on ODIall. HSAT = home sleep apnea test, ODI = oxygen desaturation index, ODIall = oxygen desaturation index calculated including all oxygen desaturations with total sleep time as the denominator, PSG = polysomnography.
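The misclassification figure can be recovered directly from the counts in Table 3. The sketch below reads rows as HSAT categories and columns as PSG categories, the orientation consistent with the statement that no patient was a "false positive" on HSAT; the variable names are illustrative.

```python
# Counts from Table 3; rows = HSAT category, columns = PSG category,
# both ordered from ODI < 5 to ODI >= 30 events/h.
TABLE_3 = [
    [8, 2, 0, 0],
    [0, 15, 12, 0],
    [0, 0, 27, 7],
    [0, 0, 0, 29],
]

total = sum(sum(row) for row in TABLE_3)
agree = sum(TABLE_3[i][i] for i in range(4))
# HSAT category lower than PSG category: row index < column index.
less_severe_on_hsat = sum(
    TABLE_3[r][c] for r in range(4) for c in range(4) if r < c
)
more_severe_on_hsat = sum(
    TABLE_3[r][c] for r in range(4) for c in range(4) if r > c
)
# Largest number of categories by which any study moved.
max_shift = max(
    abs(r - c) for r in range(4) for c in range(4) if TABLE_3[r][c] > 0
)
```

With 100 studies, the 21 off-diagonal counts correspond to 21% of patients, all shifted by exactly one category toward lower severity.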

Results for 4% ODI were similar to 3% ODI in all comparisons and can be found in Tables S2–S7 and Figures S1 and S2 in the supplemental material.

DISCUSSION

The present study aimed to investigate and quantify the impact of using different methods to calculate ODI. Within PSG, we found an overall decrease in ODI of 21.0% by excluding oxygen desaturations entirely in wake epochs. As average ODI increased, the discrepancy between the indices increased. Using accepted AHI severity cutoffs, this led to 21% of patients being misclassified into the next-lowest severity category. These results demonstrate that, within our dataset of patients being investigated for OSA, oxygen desaturations can frequently occur in epochs scored as wake, resulting in an underestimation of ODI if they are excluded from the calculation. Following from this, we found that sleep efficiency was the strongest predictor of the magnitude of the difference between ODIall and ODIsleep. The impetus to perform the present study was the observation that, in severe OSA when respiratory events result in frequent cortical arousals, many oxygen desaturations occurred in epochs scored as wake. This occurs due to the normal physiological lag time between event and oximetry measurement, and recent Australasian commentary on the AASM Scoring Manual therefore recommends that oxygen desaturations in epochs scored as wake should be included in the numerator of ODI,3 thus ensuring sleep efficiency does not impact on the ODI outcome.

We are faced with 2 imperfect choices when calculating the numerator of ODI within PSG: to exclude oxygen desaturations occurring in wake epochs, knowing that some may be related to valid respiratory events, or include all oxygen desaturations knowing that some may not be due to sleep-related events. In these latter cases, the denominator does not include wake time, and so may overestimate the rate per hour of sleep. This discord in ODI calculation within PSG occurs because epoch scoring guidelines were intended to efficiently calculate sleep, not to be of use in respiratory scoring. Norman et al10 and Wilson et al11 found that epoch scoring had a similar impact on AHI and arousal index calculations, respectively, and Norman et al discussed the finding that the use of such epoch scoring has been found to be unreliable when sleep is fragmented.12 Since our results indicate that those individuals with an elevated average ODI would be more likely to receive variable outcomes based on the calculation method and study type, it is arguably preferable to potentially overestimate ODI (and hence, OSA severity) than to underestimate, in terms of risks to individual patient health and the effect of disease in the community.

The methods of scoring sleep and wake in discrete epochs are a simplification. Participants may be asleep for part of an epoch scored as awake and vice versa. Even where continuous (non–epoch-based) analysis is used, due to physiological lag of SpO2 measurements, desaturations in wakefulness may be sleep-related, and so would be of interest. This suggests that including all oxygen desaturations in ODI calculations would be the most appropriate method. Additionally, including all oxygen desaturations occurring during PSG allows for comparison of the event count from HSATs.

We also found that a calculation using MT as the denominator (reflecting HSAT) produced an average ODI 21.6% lower than when the denominator was TST (reflecting PSG), which could potentially lead to an underestimation of deoxygenation severity. Similarly, in a report from the European Sleep Apnea Database, Escourrou et al13 found that a calculation using total analyzed time (time in bed between lights out and lights on) as the denominator resulted in a significantly lower ODI than when the denominator was TST. It is well established that AHI is underestimated in HSAT, when compared with PSG.7,13–15 This is due to the different denominators and apnea-hypopnea definitions used. Scoring rules for HSAT without sleep monitoring do not allow for using TST in the denominator, or for the scoring of hypopneas that are associated with arousal in the absence of an associated oxygen desaturation.2 In contrast, ODI calculation between PSG and HSAT (provided all desaturations are included) is not impacted by differing rules as it is simply a count of the number of desaturations, so ODI differences can be assumed to be due only to the denominator used. When MT is used for quantification of respiratory events during HSAT the term respiratory event index is recommended instead of AHI. An alternative term for ODI during HSAT would allow for better interpretation and comparison.

Oximetry is a key PSG measurement. In relation to health outcomes, 4% ODI is independently associated with arterial hypertension16 and in the prediction of oxidative stress, which plays an important role in the development of cardiovascular disease.17,18 When there is a high clinical probability of OSA, ODI 4% is valid to “rule in” moderate to severe OSA (AHI ≥ 15 events/h),19–22 and ODI is used in some countries as a metric to diagnose OSA.1 Furthermore, there is a wealth of data that can be obtained from the pulse oximeter signal, which can have clinical utility in assessing sleep-disordered breathing.23 The current study demonstrates that standardization of ODI indices within PSG could decrease heterogeneity between future studies. If ODI calculations are standardized, clearer relationships between ODI and health outcomes, such as cardiovascular disease, could potentially emerge.

HSATs are increasingly used due to their decreased waiting time24 and expense in comparison to in-laboratory studies, and their potential to allow for longer sleep time. This could conceivably increase reliance on ODI as a metric for the diagnosis of OSA. Thus, a model of how ODI compares across recording types is important for physicians, as it can guide their decision making in respect to treatment, and it is essential to allow pooling of results across studies. OSA severity underestimation could potentially lead to inappropriate or inadequate treatment pathways. Potential solutions to mitigate this underestimation risk may include the use of methods to more accurately quantify MT using non-EEG signals, such as using heart-rate variability or peripheral arterial tonometry to quantify sleep.25–27 However, until such methods are in widespread clinical use, clinicians should be aware of the risks of ODI underestimation in HSATs.

This study highlights the need for further standardization in the calculation of ODI. Standardization of ODI calculation across study types could conceivably also allow for more robust validation of both medical and consumer wearable devices and their integration into clinical pathways, by guiding device manufacturers in the transparency of settings and the design of algorithms for ODI calculation.25 Device parameters such as signal averaging time have the potential to markedly influence desaturation indices.28 Measurement artifact is a substantial issue in oximetry, and so standards around the acceptability of data, perhaps based on the plethysmography waveform, are warranted, as well as standard definitions for baseline and for determining the end of a desaturation event. Although multiple barriers exist, such as a lack of transparency into algorithms used and the use of “black box” algorithms and machine learning, if guidelines are clear within the AASM Scoring Manual2 this may encourage device manufacturers to meet the medical standards for pulse oximeters, allowing for clinically relevant data. Some pulse oximeters integrated into smart watches are already approved by regulatory bodies, which may or may not reflect accuracy performance, depending on the level of regulatory approval attained by each device.29 Given the wide and increasing availability of wearable devices with oximeters, their potential use for this purpose warrants further consideration.

The current findings demonstrate the potential impact of a lack of a clearly defined standard for the calculation of ODI across study types. Clinical sleep services use a variety of ODI calculation and reporting methods within PSG, ranging from only incorporating desaturation events starting in sleep to including all desaturation events in the time available for sleep. Our review of the current literature found no studies that reported whether desaturations occurring in wake epochs were included in the numerator of ODI calculation within PSG, highlighting the need for transparent reporting of methods. Additionally, there are no clear guidelines to assist clinicians in interpreting ODI results obtained from PSG compared with those obtained from HSATs. Although including all oxygen desaturations in the numerator of the calculation in PSG would allow for more accurate comparison to HSATs in terms of event count, our results show that the time dilution effect in HSATs means that physicians must use caution when interpreting results.

Our study has some limitations, as follows:

  1. Our analysis included PSGs performed for the investigation of OSA from a single sleep center, so results may differ with another patient sample.

  2. Sleep staging was completed by a range of scientists, so our results may be influenced by interscorer variability; however, all scorers were trained, experienced, and participated in an external interscorer reliability program.6

  3. We included only oxygen desaturations in the calculation of ODIsleep that overlapped a sleep epoch, which required a custom software script that may not be available in commercial software. We considered this to be a separate definition to desaturations that start in a sleep epoch. We thought it pertinent to test this secondary case, as it represented the worst-case scenario in the discrepancy between calculations.

  4. The estimation of MT was completed visually. During the present study, probable wakefulness was scored by identifying periods of consistent and sustained movement artifact in the available signals, thereby visually approximating actigraphy. Continued advancements in the accuracy and reliability of actigraphs used in a clinical setting may provide a more accurate MT than used in the present study; in 2018 the AASM clinical practice guideline supported the utility of actigraphy to assist in the diagnosis of OSA.30,31

  5. Our study did not address the effect of skin color on the accuracy of ODI. Evidence suggesting that the accuracy of pulse oximeters is decreased in individuals with darker skin indicates that ODI may be inaccurate in such individuals.32,33 Future research should include data on skin color in relation to ODI calculation.

  6. Our study focused on the calculation of ODI but did not quantify the influence of calculation on clinical decision making, and so further study in this area may be useful. Nevertheless, our study provides a comprehensive analysis of the impact of using various methods to calculate ODI.
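The distinction drawn in limitation 3, between desaturations that overlap a sleep epoch and desaturations that start in a sleep epoch, can be sketched as follows. This is not the custom script used in the study, which is not described here. It is a minimal illustration that assumes 30-second epochs, a stage label of "W" for wake, and events given as (start, end) times in seconds from the start of the recording. All names and values are hypothetical.

```python
import math

EPOCH_S = 30  # assumed epoch length in seconds


def epochs_spanned(start_s: float, end_s: float, n_epochs: int) -> range:
    """Indices of the epochs touched by an event; the end time is exclusive."""
    first = int(start_s // EPOCH_S)
    last = max(math.ceil(end_s / EPOCH_S) - 1, first)
    return range(max(first, 0), min(last, n_epochs - 1) + 1)


def overlaps_sleep(start_s: float, end_s: float, stages: list[str]) -> bool:
    """True if any epoch touched by the event is scored as sleep."""
    return any(stages[i] != "W" for i in epochs_spanned(start_s, end_s, len(stages)))


def starts_in_sleep(start_s: float, stages: list[str]) -> bool:
    """True if the epoch containing the event onset is scored as sleep."""
    i = int(start_s // EPOCH_S)
    return 0 <= i < len(stages) and stages[i] != "W"


# Hypothetical hypnogram (six epochs) and desaturation events.
stages = ["W", "W", "N2", "N2", "W", "W"]
events = [(10, 40), (50, 70), (65, 100), (130, 170)]

n_all = len(events)
n_overlap = sum(overlaps_sleep(s, e, stages) for s, e in events)
n_start = sum(starts_in_sleep(s, stages) for s, _ in events)
print(n_all, n_overlap, n_start)
```

In this toy example the event at 50 to 70 seconds begins in a wake epoch but extends into sleep, so it is retained under the overlap definition and dropped under the start-in-sleep definition. The overlap definition therefore never counts fewer events than the start-in-sleep definition.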

CONCLUSIONS

The current study has demonstrated how the ODI is reduced when desaturations occurring in wake epochs are excluded from the index, and more so when sleep efficiency is reduced. Additionally, ODI obtained from a method used in HSATs was lower than that obtained from a method used in PSG. This study provides insight to researchers and clinicians regarding the potential differences in results obtained from different ODI calculations and emphasizes the need for improved standardization and transparency of ODI calculation.

DISCLOSURE STATEMENT

All authors have seen and approved this manuscript. Work for this study was performed at the sleep laboratory within the Department of Respiratory and Sleep Medicine at Austin Health. The authors report no conflicts of interest.

ACKNOWLEDGMENTS

The authors thank the staff at the Austin Health Sleep Laboratory for their expertise in clinical sleep analysis.

ABBREVIATIONS

AASM, American Academy of Sleep Medicine
AHI, apnea-hypopnea index
EEG, electroencephalography
HSAT, home sleep apnea test
IQR, interquartile range
MT, monitoring time
ODI, oxygen desaturation index
OSA, obstructive sleep apnea
PSG, polysomnography
TST, total sleep time

REFERENCES

  • 1. Ng Y, Joosten SA, Edwards BA, et al. Oxygen desaturation index differs significantly between types of sleep software. J Clin Sleep Med. 2017;13(4):599–605.
  • 2. Troester MM, Quan SF, Berry RB, et al; for the American Academy of Sleep Medicine. The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications. Version 3. Darien, IL: American Academy of Sleep Medicine; 2023.
  • 3. Jorgensen G, Downey C, Goldin J, Melehan K, Rochford PD, Ruehland WR. An Australasian commentary on the AASM Manual for the Scoring of Sleep and Associated Events. Sleep Biol Rhythms. 2020;18(3):163–185.
  • 4. Collop NA. Home sleep apnea testing for obstructive sleep apnea in adults. Waltham, MA: UpToDate, Inc. 2017. http://www.uptodate.com. Accessed June 19, 2022.
  • 5. Ruehland WR, Rochford PD, Pierce RJ, Singh P, Thornton AT. External proficiency testing improves inter-scorer reliability of polysomnography scoring. Sleep Breath. 2023;27(3):923–932.
  • 6. QSleep. QSleep Quality Assurance Program. 2021. qsleep.com.au. Accessed October 8, 2022.
  • 7. Zhao YY, Weng J, Mobley DR, et al. Effect of manual editing of total recording time: implications for home sleep apnea testing. J Clin Sleep Med. 2017;13(1):121–126.
  • 8. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res. 1999;8(2):135–160.
  • 9. American Academy of Sleep Medicine Task Force. Sleep-related breathing disorders in adults: recommendations for syndrome definition and measurement techniques in clinical research. The Report of an American Academy of Sleep Medicine Task Force. Sleep. 1999;22(5):667–689.
  • 10. Norman MB, Middleton S, Sullivan CE. The use of epochs to stage sleep results in incorrect computer-generated AHI values. Sleep Breath. 2011;15(3):385–392.
  • 11. Wilson DL, Tolson J, Churchward TJ, Melehan K, O’Donoghue FJ, Ruehland WR. Exclusion of EEG-based arousals in wake epochs of polysomnography leads to underestimation of the arousal index. J Clin Sleep Med. 2022;18(5):1385–1393.
  • 12. Norman RG, Pal I, Stewart C, Walsleben JA, Rapoport DM. Interobserver agreement among sleep scorers from different centers in a large dataset. Sleep. 2000;23(7):901–908.
  • 13. Escourrou P, Grote L, Penzel T, et al; ESADA Study Group. The diagnostic method has a strong influence on classification of obstructive sleep apnea. J Sleep Res. 2015;24(6):730–738.
  • 14. Tan HL, Gozal D, Ramirez HM, Bandla HPR, Kheirandish-Gozal L. Overnight polysomnography versus respiratory polygraphy in the diagnosis of pediatric obstructive sleep apnea. Sleep. 2014;37(2):255–260.
  • 15. Bianchi MT, Goparaju B. Potential underestimation of sleep apnea severity by at-home kits: rescoring in-laboratory polysomnography without sleep staging. J Clin Sleep Med. 2017;13(4):551–555.
  • 16. Tkacova R, McNicholas WT, Javorsky M, et al; European Sleep Apnoea Database Study Collaborators. Nocturnal intermittent hypoxia predicts prevalent hypertension in the European Sleep Apnoea Database cohort study. Eur Respir J. 2014;44(4):931–941.
  • 17. Yamauchi M, Nakano H, Maekawa J, et al. Oxidative stress in obstructive sleep apnea. Chest. 2005;127(5):1674–1679.
  • 18. Punjabi NM, Newman AB, Young TB, Resnick HE, Sanders MH. Sleep-disordered breathing and cardiovascular disease: an outcome-based definition of hypopneas. Am J Respir Crit Care Med. 2008;177(10):1150–1155.
  • 19. Hang LW, Wang HL, Chen JH, et al. Validation of overnight oximetry to diagnose patients with moderate to severe obstructive sleep apnea. BMC Pulm Med. 2015;15(1):24.
  • 20. Oeverland B, Skatvedt O, Kvaerner KJ, Akre H. Pulseoximetry: sufficient to diagnose severe sleep apnea. Sleep Med. 2002;3(2):133–138.
  • 21. Rashid NH, Zaghi S, Scapuccin M, Camacho M, Certal V, Capasso R. The value of oxygen desaturation index for diagnosing obstructive sleep apnea: a systematic review. Laryngoscope. 2021;131(2):440–447.
  • 22. Chung F, Liao P, Elsaid H, Islam S, Shapiro CM, Sun Y. Oxygen desaturation index from nocturnal oximetry: a sensitive and specific tool to detect sleep-disordered breathing in surgical patients. Anesth Analg. 2012;114(5):993–1000.
  • 23. Terrill PI. A review of approaches for analysing obstructive sleep apnoea-related patterns in pulse oximetry data. Respirology. 2020;25(5):475–485.
  • 24. Flemons WW, Douglas NJ, Kuna ST, Rodenstein DO, Wheatley J. Access to diagnosis and treatment of patients with suspected sleep apnea. Am J Respir Crit Care Med. 2004;169(6):668–672.
  • 25. Casal R, Di Persia LE, Schlotthauer G. Sleep-wake stages classification using heart rate signals from pulse oximetry. Heliyon. 2019;5(10):e02529.
  • 26. Baumert M, Cowie MR, Redline S, et al. Sleep characterization with smart wearable devices: a call for standardization and consensus recommendations. Sleep. 2022;45(12):zsac183.
  • 27. Schnall RP, Sheffy JK, Penzel T. Peripheral arterial tonometry—PAT technology. Sleep Med Rev. 2022;61:101566.
  • 28. Pretto JJ, Roebuck T, Beckert L, Hamilton G. Clinical use of pulse oximetry: official guidelines from the Thoracic Society of Australia and New Zealand. Respirology. 2014;19(1):38–46.
  • 29. Kirszenblat R, Edouard P. Validation of the Withings ScanWatch as a wrist-worn reflective pulse oximeter: prospective interventional clinical study. J Med Internet Res. 2021;23(4):e27503.
  • 30. Smith MT, McCrae CS, Cheung J, et al. Use of actigraphy for the evaluation of sleep disorders and circadian rhythm sleep-wake disorders: an American Academy of Sleep Medicine clinical practice guideline. J Clin Sleep Med. 2018;14(7):1231–1237.
  • 31. Acker JG, Becker-Carus C, Büttner-Teleaga A, et al. The role of actigraphy in sleep medicine. Somnologie (Berl). 2021;25(2):89–98.
  • 32. Sjoding MW, Dickson RP, Iwashyna TJ, Gay SE, Valley TS. Racial bias in pulse oximetry measurement. N Engl J Med. 2020;383(25):2477–2478.
  • 33. Al-Halawani R, Charlton PH, Qassem M, Kyriacou PA. A review of the effect of skin pigmentation on pulse oximeter accuracy. Physiol Meas. 2023;44(5):05TR01.

