Author manuscript; available in PMC: 2014 Apr 1.
Published in final edited form as: J Clin Hypertens (Greenwich). 2013 Feb 12;15(4):279–288. doi: 10.1111/jch.12073

Screening for Severe Obstructive Sleep Apnea Syndrome in Hypertensive Outpatients

Indira Gurubhagavatula 1,2,3, Barry G Fields 2,3, Christian R Morales 2, Sharon Hurley 2, Grace W Pien 2,3, Lindsay C Wick 2, Bethany A Staley 2, Raymond R Townsend 4, Greg Maislin 2
PMCID: PMC3621016  NIHMSID: NIHMS432825  PMID: 23551728

Abstract

We attempted to validate a two-stage strategy to screen for severe obstructive sleep apnea syndrome (s-OSAS) among hypertensive outpatients, with polysomnography (PSG) as the gold standard. Using a prospective design, we recruited outpatients with hypertension from medical outpatient clinics. Interventions included: 1) assessment of clinical data; 2) home sleep testing (HST); and 3) 12-channel, in-laboratory PSG. We developed models using clinical or HST data alone (single-stage models) or clinical data in tandem with HST (two-stage models) to predict s-OSAS. For each model, we computed area-under-receiver-operating-characteristic curves (AUC), sensitivity, specificity, negative likelihood ratio, and negative post-test probability (NPTP). Models were then rank-ordered based upon AUC values and NPTP. HST used alone had limited accuracy (AUC=0.727, NPTP=2.9%). However, models that used clinical data in tandem with HST were more accurate in identifying s-OSAS, with lower NPTP: 1) facial morphometrics (AUC=0.816, NPTP=0.6%); 2) neck circumference (AUC=0.803, NPTP=1.7%); and 3) Multivariable Apnea Prediction Score (AUC=0.799, NPTP=1.5%), where sensitivity, specificity and NPTP were evaluated at optimal thresholds. Therefore, HST combined with clinical data can be useful in identifying s-OSAS in hypertensive outpatients, without incurring the greater cost and patient burden associated with in-laboratory PSG. These models were less useful in identifying OSAS of any severity.

Keywords: Obstructive Sleep Apnea Syndrome, Hypertension, Polysomnography, Home Sleep Testing

Introduction

Obstructive sleep apnea (OSA) affects approximately one third of individuals with secondary hypertension and is one of its major identifiable causes.1 Large-scale studies that associated2 and implicated OSA in the development of incident hypertension3 support this designation, even after controlling for obesity, a major risk factor for OSA. Some randomized trials have shown that treating OSA with positive airway pressure (PAP) reduces blood pressure,4–7 particularly if OSA is severe and participants are sleepy.4, 5, 8–11 These results, along with the high prevalence of OSA among sufferers of hypertension, support screening patients with hypertension for severe obstructive sleep apnea syndrome (s-OSAS; severe OSA associated with sleepiness).12

In-laboratory polysomnogram (PSG) is unsuitable for screening13 due to complexity, expense, and inaccessibility. Signs and symptoms may identify persons at risk for OSA,14, 15 using questionnaires.16 While particularly useful in lean subjects,16 symptoms are often nonspecific or under-reported. Facial morphometrics17 have not added predictive value to body mass index (BMI, a proxy for obesity), which has been used for risk assessment. In 2005, the American Academy of Sleep Medicine (AASM) deemed current clinical models insufficient for predicting apnea severity.18

We previously validated a two-stage screening tool for OSA.19, 20 We first applied a risk score that combined data from symptoms, age, gender, and BMI to everyone. In the second stage, we conducted overnight oximetry in a subset at intermediate risk. Portable sleep monitors21 that assess respiratory effort and airflow in addition to oximetry22 have since become available, allowing for unattended home sleep testing (HST).

In the current study, we used HST to validate our two-stage model in hypertensive outpatients. We screened for s-OSAS since the greatest benefits from PAP, including reduction in BP 4, 9, 10 and cardiovascular event-rates, 23 occur in this group. The two-stage screening tool was also applied to hypertensive patients with any OSA associated with sleepiness, regardless of OSA severity.

Methods

The Institutional Review Boards of the University of Pennsylvania and Philadelphia VA Medical Centers approved this protocol. All subjects provided informed consent.

Subject selection

We recruited consecutive outpatients with hypertension aged 30–65 years from internal medicine practices at the Philadelphia VA Medical Center and the Hypertension Clinic at the University of Pennsylvania. We defined hypertension as systolic BP ≥140 mm Hg, diastolic BP ≥90 mm Hg, or the use of any antihypertensive medication.2 We excluded those who had prior PSGs, or could not participate because of self-reported illness, pain, or circadian sleep disturbances (Figure 1).

Figure 1. Flow Diagram.


Abbreviations: PSG = polysomnogram; HST = home sleep test; MVAP = multivariable apnea prediction score

Interventions offered for all subjects

1. Demographics, Apnea Symptoms, Physical Examination

Demographics, symptoms, tobacco and alcohol use, and the Epworth Sleepiness Scale (ESS)24 were self-reported. We measured BMI and neck circumference (NC).

2. Facial Morphometrics Score combined BMI, NC, degree of overjet of the maxilla over the mandible, palatal height, and intermolar width. Higher scores signified greater apnea risk.17

3. Home Sleep Testing (HST) (AutoSet PDS, ResMed Corp., San Diego, CA) consisted of unattended pulse oximetry, chest and abdominal movement, and airflow by nasal pressure for one night. Following technician-led instruction, subjects self-applied the sensors at home, wore them for one night, and returned them in person. We applied the AutoSet’s automated scoring algorithm25 to the downloaded data; it used a 50% drop in airflow from baseline to define a hypopnea and a 75% drop in airflow for ≥10 seconds to define an apnea. Desaturation was not required to score apneas or hypopneas.25 Therefore, while oximetry was worn according to the manufacturer’s protocol, it was not used in automated scoring. We chose in-home monitoring and automated scoring to reflect typical clinical practice.

4. In-laboratory PSG recorded electro-encephalograms, eye, chin and pre-tibial muscle activity, electrocardiography, oximetry, chest and abdominal respiratory effort, and airflow by nasal cannula and oral thermistor (formerly Sandman System, now Embla® Systems Inc, Broomfield, CO). Technicians performed PSGs prospectively, after questionnaires and unattended sleep studies, while blind to questionnaire data, apnea risk scores, facial morphometrics, and unattended sleep study data. They scored PSGs26 and computed the apnea-hypopnea index (AHI) as ([apneas+hypopneas]/hours of sleep time). An apnea was ≥10 seconds of airflow cessation. A hypopnea required ≥10 seconds of reduction in airflow of either 1) ≥50%, or 2) ≥30% with a ≥4% fall in SaO2 or an arousal.

Missing Data

We conducted multiple imputation for missing age, BMI, NC, ESS, symptoms, PSG AHI and unattended HST AHI (uAHI) values using a well-validated method,27 using PROC MI (SAS Systems, Cary, NC).

Case definition

Following prior work,8 we defined a case of s-OSAS as AHI ≥30 events/hour plus ESS >10. Alternatively, we considered AHI ≥5 events/hour plus ESS >10 as a case of any OSAS.
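The case definitions can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's SAS code; the function names `apnea_hypopnea_index` and `classify_case` are hypothetical, and the AHI formula is the one given for PSG scoring in the Methods ([apneas + hypopneas]/hours of sleep time).

```python
def apnea_hypopnea_index(apneas: int, hypopneas: int, sleep_hours: float) -> float:
    """AHI = (apneas + hypopneas) / hours of sleep time."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return (apneas + hypopneas) / sleep_hours


def classify_case(ahi: float, ess: int) -> dict:
    """Apply the two case definitions: both require sleepiness (ESS > 10)."""
    sleepy = ess > 10
    return {
        "s_osas": sleepy and ahi >= 30,   # severe OSAS
        "any_osas": sleepy and ahi >= 5,  # OSAS of any severity
    }


# Hypothetical patient: 120 apneas and 90 hypopneas over 6 hours of sleep, ESS = 12
ahi = apnea_hypopnea_index(120, 90, 6.0)
print(ahi, classify_case(ahi, 12))
```

Note that a patient with severe OSA on PSG but ESS ≤10 is not a case under either definition.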

Risk assessment

For all subjects with available data, we assessed risk via single- and two-stage strategies:

1. Single-stage strategies (risk scores): Our base model, the multivariable apnea prediction (MVAP) score, combined symptoms, BMI, age, and gender to compute s-OSAS risk.16 To elucidate symptoms, subjects self-rated their frequency of snoring, choking, and witnessed apneas on a Likert scale, range=0–4. We combined this score with BMI, age, and gender using a previously validated16 multiple logistic regression (SAS v9.2, Cary, NC) and obtained a risk score (range=0–1, 1=high risk, 0=absent risk for s-OSAS). Individual analyses for BMI, NC, age, facial morphology, and symptoms were also completed.

2. Two-Stage Strategies

First stage: We categorized subjects into high, intermediate or low risk groups based on their first-stage test score (e.g. the MVAP). “Upper bound” separated high from intermediate risk groups. “Lower bound” separated intermediate from low risk groups (FIGURE 2; see “Model development and validation” below). We predicted that those with risk score>upper bound would have s-OSAS, and those with risk score<lower bound would not.

Figure 2. Study Design.


Abbreviations: HST = home sleep test; AHI = apnea-hypopnea index; s-OSAS = severe OSA associated with sleepiness

Second stage: Those with intermediate scores (between upper and lower bound) were predicted to have s-OSAS if the uAHI was >“uAHIthreshold.” If uAHI <uAHIthreshold, they were predicted to be free of s-OSAS.19 The analysis was repeated for any patient with OSAS regardless of severity.
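The two-stage decision rule described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the function name `two_stage_prediction` is hypothetical. The text does not say how scores exactly equal to a bound, or a uAHI exactly equal to the threshold, were handled; here a score equal to a bound is treated as intermediate and a uAHI equal to the threshold as negative, which is an assumption.

```python
from typing import Optional


def two_stage_prediction(risk_score: float,
                         uahi: Optional[float],
                         lower_bound: float,
                         upper_bound: float,
                         uahi_threshold: float) -> bool:
    """Return True if the patient is predicted to have s-OSAS."""
    # Stage 1: the first-stage score alone decides the high and low risk groups.
    if risk_score > upper_bound:
        return True
    if risk_score < lower_bound:
        return False
    # Stage 2: intermediate-risk patients are decided by the home sleep test.
    if uahi is None:
        raise ValueError("intermediate-risk patients need a uAHI value")
    return uahi > uahi_threshold  # assumption: equality counts as negative


# Example with the MVAP-based parameters reported in Table III
# (lower bound 0.41, upper bound 0.79, HST cut-point 18 events/hour)
print(two_stage_prediction(0.60, 25.0, 0.41, 0.79, 18))  # intermediate, uAHI above cut-point
```

Only the intermediate group needs a home sleep test, which is what makes the strategy cheaper than testing everyone.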

Model development and validation

While single-stage strategies had a single parameter (i.e., cut-point), two-stage strategies had three parameters: the upper and lower bounds and the uAHIthreshold. The “optimal” parameter set minimized misclassification rate and maximized specificity; we identified it using an exhaustive enumeration algorithm with SAS v9.2 (Cary, NC) (see below). Using this “optimal” parameter set, we computed sensitivity, specificity, negative likelihood ratio (LRneg),28 and negative post-test probability in a randomly-selected 70% estimation sample and in the remaining 30% validation sample. We also computed the AUC29 in both samples. Using bootstrap resampling, we generated 95% confidence intervals (CI’s) for AUC, sensitivity, specificity, and LRneg.

Determination of the optimal cut-points for the single-stage strategies

We first selected 9 candidate cut-points for each model that divided the sample into 10 groups of equal number. For each candidate cut-point, we calculated the error rate as (FP + 1.2 × FN), where FP and FN are the numbers of false positives and false negatives. We weighted false negatives (missed cases) at 1.2 times the false positives, because we presumed that missing a case of severe OSAS was a more serious error than falsely labeling a normal patient with severe OSAS. In sensitivity analyses, we considered weights from 1.0–2.0 in increments of 0.1. We chose 1.2 because the sensitivity analysis showed that higher weights did not improve the LRneg. The optimum cut-point was the one that minimized this error rate. In cases of ties, the solution that maximized specificity was chosen. We have outlined this rationale elsewhere.30, 31
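The single-stage search can be sketched as below. This is an illustrative Python sketch, not the study's SAS code. It assumes that a score at or above the cut-point counts as a positive screen, and it uses `statistics.quantiles` (exclusive method) to obtain the 9 decile cut-points; the exact decile convention used in the study is not stated. All function names are hypothetical.

```python
from statistics import quantiles


def weighted_error(scores, labels, cut, fn_weight=1.2):
    """Error rate FP + fn_weight * FN for the rule 'score >= cut is positive'."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= cut and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < cut and y)
    return fp + fn_weight * fn


def specificity(scores, labels, cut):
    """Proportion of non-cases that screen negative at this cut-point."""
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not negatives:
        return 0.0
    return sum(1 for s in negatives if s < cut) / len(negatives)


def best_cut_point(scores, labels, fn_weight=1.2):
    """Pick the decile cut-point with the lowest weighted error.

    Ties are broken in favour of higher specificity.
    """
    candidates = quantiles(scores, n=10)  # 9 cut-points, 10 equal-sized groups
    return min(
        candidates,
        key=lambda c: (weighted_error(scores, labels, c, fn_weight),
                       -specificity(scores, labels, c)),
    )


# Hypothetical data: 20 patients, cases are those with scores of 15 or more
scores = list(range(1, 21))
labels = [s >= 15 for s in scores]
print(best_cut_point(scores, labels))
```

Changing `fn_weight` between 1.0 and 2.0 reproduces the kind of sensitivity analysis described in the text.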

Determination of the optimal cut-points for the two-stage strategies

The two-stage strategy had three parameters: the upper and lower bounds of the first stage test and uAHIthreshold. We considered six uAHI-thresholds: 5, 10, 15, 20, 25, and 30/hour. For the first stage test, we considered candidate cut-points (9 upper and 9 lower bound values) as described for the single-stage strategies. Thus, we considered a total of 6 × 9 × 9 = 486 possible parameter sets and computed (FP + 1.2 × FN) for each parameter set. We selected the optimum parameter set as the one that minimized this error rate and maximized specificity.
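The exhaustive enumeration can be sketched as a simple grid search. This is a hedged illustration, not the authors' SAS program. The loop visits all 6 × 9 × 9 = 486 parameter sets, but it only evaluates sets whose lower bound does not exceed the upper bound; that filter is an assumption, since the text does not say how inverted bound pairs were treated. The function names and the synthetic records are hypothetical.

```python
from itertools import product

UAHI_THRESHOLDS = (5, 10, 15, 20, 25, 30)  # events/hour, as listed in the text


def predict(score, uahi, lower, upper, uahi_threshold):
    """Two-stage rule: score decides unless it is intermediate, then uAHI decides."""
    if score > upper:
        return True
    if score < lower:
        return False
    return uahi > uahi_threshold


def evaluate(records, lower, upper, uahi_threshold, fn_weight=1.2):
    """Return (FP + fn_weight * FN, specificity) for one parameter set."""
    fp = fn = tn = 0
    for score, uahi, is_case in records:
        predicted = predict(score, uahi, lower, upper, uahi_threshold)
        if predicted and not is_case:
            fp += 1
        elif not predicted and is_case:
            fn += 1
        elif not predicted and not is_case:
            tn += 1
    negatives = tn + fp
    spec = tn / negatives if negatives else 0.0
    return fp + fn_weight * fn, spec


def grid_search(records, candidate_bounds, thresholds=UAHI_THRESHOLDS):
    """Enumerate every parameter set; keep the lowest error, then highest specificity."""
    best_key, best_params, n_sets = None, None, 0
    for thr, lower, upper in product(thresholds, candidate_bounds, candidate_bounds):
        n_sets += 1
        if lower > upper:  # assumption: inverted bounds are not evaluated
            continue
        error, spec = evaluate(records, lower, upper, thr)
        key = (error, -spec)
        if best_key is None or key < best_key:
            best_key, best_params = key, (lower, upper, thr)
    return best_params, n_sets


# Synthetic records: (first-stage score, uAHI, is_case)
records = [(0.05, 2, False), (0.15, 40, False), (0.50, 8, False), (0.50, 22, True),
           (0.95, 3, True), (0.55, 35, True), (0.45, 12, False)]
bounds = [i / 10 for i in range(1, 10)]  # 9 candidate bounds
print(grid_search(records, bounds))
```

With only 486 combinations, brute-force enumeration is cheap and avoids any dependence on an optimizer's starting values.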

Quantification of the discriminatory power of each model

Using the optimum cut-points of each of the models, we computed sensitivity, specificity and LRneg. We computed LRneg as (1 − sensitivity)/specificity. We computed the negative post-test probability associated with LRneg by applying a Bayesian formula to the prevalence of s-OSAS in the 70% estimation sample (6.9%). We also computed a prediction discrimination index as the area under the receiver-operating-characteristic curve (AUC) constructed from the logistic model predicted values. For the 30% validation sample, this was done using the formula AUC = (D+1)/2, where D represents the Somers’ D statistic reported by PROC LOGISTIC. We rank-ordered the discriminatory power of the models by sorting them based on the value of AUC, with larger values of AUC denoting the best-performing models. The AUC for each model is an estimate of the probability that a randomly selected case has a larger predicted value than a randomly selected control.
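These quantities can be sketched in Python as follows. The code is illustrative and assumes the standard odds form of Bayes' rule (post-test odds = pre-test odds × LRneg), which the text refers to only as "a Bayesian formula"; results from this sketch need not match the published tables exactly, since the full computational detail is not given. The pairwise AUC counts ties as one half, which is the convention under which AUC = (D+1)/2 holds. Function names are hypothetical.

```python
def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LRneg = (1 - sensitivity) / specificity."""
    return (1 - sensitivity) / specificity


def negative_post_test_probability(lr_neg: float, prevalence: float) -> float:
    """Probability of disease after a negative screen (odds form of Bayes' rule)."""
    pre_test_odds = prevalence / (1 - prevalence)
    post_test_odds = pre_test_odds * lr_neg
    return post_test_odds / (1 + post_test_odds)


def auc_pairwise(case_scores, control_scores) -> float:
    """Probability that a random case outscores a random control (ties count 1/2)."""
    wins = ties = 0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1
            elif c == k:
                ties += 1
    return (wins + 0.5 * ties) / (len(case_scores) * len(control_scores))


def auc_from_somers_d(d: float) -> float:
    """AUC = (D + 1) / 2, where D is Somers' D."""
    return (d + 1) / 2


# Prevalence of 6.9% is taken from the text; the LRneg value is illustrative
print(negative_post_test_probability(0.357, 0.069))
print(auc_pairwise([3, 4], [1, 2, 3]), auc_from_somers_d(0.5))
```

A low LRneg matters most for screening: it determines how far a negative result pulls the probability of disease below the pre-test prevalence.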

Generation of non-parametric 95% bootstrap confidence limits

Using the estimation cohort, we generated 1000 bootstrap re-samples with replacement via SAS programming. For each re-sample, we computed 1) AUC and 2) the optimum cut-point and its associated sensitivity, specificity and LRneg. We selected from these distributions the 2.5th and 97.5th percentile values of AUC, sensitivity, specificity, and LRneg to define the 95% non-parametric confidence limits.
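A percentile bootstrap of this kind can be sketched as below. This is a generic illustration, not the study's SAS programming: `statistic` stands in for whatever is recomputed on each re-sample (in the study, the AUC and the re-optimized cut-point with its sensitivity, specificity and LRneg). The function name, the fixed seed, and the use of `statistics.quantiles` with the inclusive method to read off the 2.5th and 97.5th percentiles are all assumptions.

```python
import random
from statistics import mean, quantiles


def bootstrap_ci(sample, statistic, n_resamples=1000, seed=0):
    """Non-parametric 95% percentile confidence limits for `statistic`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    # With n=40 the first and last cut-points are the 2.5th and 97.5th percentiles
    cuts = quantiles(estimates, n=40, method="inclusive")
    return cuts[0], cuts[-1]


# Hypothetical binary outcomes: 30 positives among 100 subjects
data = [1] * 30 + [0] * 70
low, high = bootstrap_ci(data, mean)
print(low, high)
```

Because the limits come straight from the empirical distribution of re-sampled estimates, no normality assumption is needed for bounded quantities such as AUC or sensitivity.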

Model validation

We computed percent difference as (AUC in validation sample – AUC in estimation sample)/(AUC in estimation sample). We categorized models as “robust” if this percent difference in AUC between validation and estimation samples was ≥ −5%, and “not robust” if it was < −5%. The optimal model was the one that was robust and had the highest AUC.
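The robustness check reduces to a one-line formula, sketched here in Python. The function names and the example AUC values are hypothetical; only the formula and the −5% criterion come from the text.

```python
def percent_difference(auc_validation: float, auc_estimation: float) -> float:
    """Percent change in AUC from the estimation to the validation sample."""
    return 100.0 * (auc_validation - auc_estimation) / auc_estimation


def is_robust(auc_validation: float, auc_estimation: float,
              tolerance_pct: float = -5.0) -> bool:
    """A model is 'robust' if its AUC drops by no more than 5% on validation."""
    return percent_difference(auc_validation, auc_estimation) >= tolerance_pct


# Hypothetical AUCs: a 10% drop fails the criterion, a small gain passes it
print(is_robust(0.72, 0.80), is_robust(0.82, 0.80))
```

The criterion is one-sided: a model whose AUC is higher in the validation sample than in the estimation sample is still classed as robust.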

Results

Subject characteristics

We enrolled 250 patients (Table I) after excluding those with self-reported pain (N=7), medical illness (N=7), jet lag (N=2), or night shift work (N=19). Sample, estimation, and validation subsets were similar, except for a larger percentage of Caucasians in the estimation subset. The average (SD) age, NC, and BMI were: 52.6 (7.7) years, 42.2 (4.5) cm, and 32.1 (7.4) kg/m2 respectively. S-OSAS and any OSAS frequency distribution by BMI category is shown (FIGURE 3). In the subjects with and without s-OSAS, the average (SD) blood pressures were 145/87 (12/11) and 139/82 (15/10), respectively.

Table I.

Subject Characteristics

| | Participants (N = 250)* | Estimation Sample (N = 146) | Validation Sample (N = 52) | p-value |
|---|---|---|---|---|
| Age, Gender, Race | | | | |
| Mean Age (SD), years | 52.6 (7.70) | 52.6 (7.49) | 53.2 (8.37) | 0.95 |
| Men (%) | 200 (80.0%) | 119 (81.5%) | 43 (82.7%) | 0.48 |
| Caucasian (%) | 101 (40.4%) | 51 (34.9%) | 15 (28.9%) | 0.04 |
| African-American (%) | 147 (58.8%) | 95 (63.7%) | 37 (71.2%) | 0.06 |
| Apnea Risk, Mean (SD) | | | | |
| BMI, kg/m2 | 32.1 (7.36) | 32.6 (7.35) | 31.8 (6.37) | 0.23 |
| NC, cm | 42.2 (4.53) | 42.5 (4.38) | 42.0 (4.39) | 0.23 |
| MVAP | 0.53 (0.25) | 0.56 (0.23) | 0.49 (0.31) | 0.06 |
| Morphometrics | 68.5 (24.7) | 69.7 (24.5) | 70.9 (24.9) | 0.45 |
| Smokers (%) | | | | |
| Current smokers | 72 (28.8%) | 38 (26.0%) | 22 (42.3%) | 0.25 |
| Ever smokers | 83 (33.2%) | 43 (29.4%) | 24 (46.2%) | 0.13 |
| Blood Pressure, Mean (SD) | | | | |
| Systolic | 139.0 (14.9) | 138.3 (15.0) | 141.0 (14.5) | 0.26 |
| Diastolic | 82.2 (9.2) | 82.5 (9.6) | 81.4 (8.3) | 0.48 |
| Alcohol consumers (%) | | | | |
| At least one drink/day | 155 (61.8%) | 86 (58.9%) | 33 (63.4%) | 0.25 |
| At least 12 oz beer per week | 102 (40.6%) | 56 (39.7%) | 21 (40.4%) | 0.74 |
| At least 4 oz spirit per week | 68 (27.3%) | 36 (25.0%) | 17 (32.7%) | 0.32 |
| At least 6 oz wine per week | 71 (28.4%) | 41 (28.4%) | 16 (30.8%) | 0.99 |
| ESS score >10 | 62 (25.6%) | 45 (31.0%) | 17 (32.7%) | 0.21 |
| OSAS by PSG (%) | | | | |
| Mild (AHI ≥5 & ESS>10) | 49 (24.7%) | 33 (22.8%) | 16 (30.8%) | 0.25 |
| Mod (AHI≥15 & ESS>10) | 29 (14.7%) | 20 (13.8%) | 9 (17.3%) | 0.54 |
| Severe (AHI≥30 & ESS>10) | 15 (7.6%) | 10 (6.9%) | 5 (9.6%) | 0.53 |

Abbreviations: SD = Standard deviation; BMI = body mass index; NC = neck circumference; MVAP = multivariable apnea prediction score; OSAS = obstructive sleep apnea syndrome; AHI = apnea-hypopnea index; ESS = Epworth sleepiness score; PSG = polysomnogram.

* 198 participants (146 in Estimation Sample + 52 in Validation Sample) underwent in-lab PSG.

Figure 3. Frequency distribution of OSAS by BMI category.


Abbreviations: BMI = Body Mass Index; s-OSAS =severe obstructive sleep apnea syndrome

Apnea risk

The mean (SD) for MVAP scores (n=224) was 0.53 (0.25). The mean (SD) morphometric score (n=160) was 68.5 (24.7); the score could not be computed for remaining participants due to missing molars. Whereas a score of ≥70 carried 98% sensitivity and 100% specificity in a prior study,17 in our study, the optimal cut-point of 80 yielded sensitivity of 66.9%, specificity of 70.1% and negative post-test probability of 0.6% for s-OSAS.

In-lab sleep studies

Of 198/250 (79.2%) who agreed to in-lab PSG, 159 (80%) had OSA: 67 (34%) mild (AHI 5–14.9/hour); 43 (22%) moderate (AHI 15–29.9/hour); 49 (25%) severe (AHI≥30/hour). Sleepiness (ESS>10) occurred in 62/242 (25.6%). A total of 49 patients (24.7%) had at least mild OSAS (AHI≥5/hour plus ESS>10), and 15 patients (7.6%) had s-OSAS (AHI ≥30/hour plus ESS>10) (Table I). The cohort’s mean (SD) AHI was 22.5 (22.9)/hour.

Unattended sleep studies

Of 208 subjects who agreed to unattended HST, 192 had adequate recordings. Of these, 76.9% had usable data after one attempt, and an additional 20.2% after a second attempt. The mean (SD) uAHI was 15 (13.8)/hour. A total of 25.7% had uAHI ≥5/hour with ESS >10, and 6% had uAHI ≥30/hour with ESS >10. Unattended HST underestimated in-laboratory AHI for values between 33 – 45/hour (Figure 4).

Figure 4. Bland-Altman Analysis.


Unattended sleep studies tended to underestimate in-laboratory AHI for an approximate value of 3.5 – 3.8 log units/hour, which corresponds to an average of portable AHI + PSG AHI of 33 – 45/hour. We used log transformation of the AHIs in our prediction models to reduce the undue influence of large values.

Abbreviations: AHI = Apnea-Hypopnea Index; uAHI = AHI from unattended home sleep test; PSG = polysomnogram

AUCs and percent difference: s-OSAS

AUCs in the estimation and validation subsets are listed for single-stage and two-stage models, along with percent difference between those subsets (Table II). All single-stage models were robust, with percent difference ranging from −0.1% to 13.3%, and all (except age) had similar discriminatory power, with AUCs ranging from 0.663 for symptoms to 0.689 for NC (most useful). Age was not a useful single-stage model, with an AUC of 0.463.

Table II.

Relative Discriminatory Power and Robustness of Models Used in Single- and Two-Stage Algorithms (N=250)

| Model | Single-stage models: AUC, Estimation Sample (95% CI) | Single-stage models: Difference** (%) | Two-stage models*: AUC, Estimation Sample (95% CI) | Two-stage models*: Difference** (%) |
|---|---|---|---|---|
| NC | 0.689 (0.674 – 0.704) | 5.1 | 0.803 (0.786 – 0.820) | −0.1 |
| Morphometrics | 0.685 (0.663 – 0.707) | 4.5 | 0.816 (0.801 – 0.830) | 8.2 |
| MVAP | 0.684 (0.668 – 0.700) | 1.4 | 0.799 (0.777 – 0.822) | 6.6 |
| BMI | 0.665 (0.646 – 0.685) | 7.2 | 0.768 (0.744 – 0.793) | 7.8 |
| Symptoms | 0.663 (0.640 – 0.685) | 6.3 | 0.735 (0.710 – 0.760) | 3.7 |
| Age | 0.463 (0.444 – 0.481) | 13.3 | 0.718 (0.700 – 0.737) | 6.7 |

Abbreviations: NC = neck circumference; BMI = body mass index; MVAP = Multivariable Apnea Prediction Score; AUC = area under the receiver operating characteristic curve; CI = confidence interval.

* Model applied as first stage, followed by HST for those in intermediate risk group

** Difference = (AUC in validation sample – AUC in estimation sample)/(AUC in estimation sample)

The two-stage models (Table II) using HST in the second stage were more accurate than one-stage models, with higher AUC values ranging from 0.718 (age) to 0.816 (morphometrics). The most useful one-stage model, NC, performed better when combined with unattended sleep studies, yielding an AUC of 0.803. The top 3 performing two-stage models (AUC’s near or above 0.800) were NC, morphometrics, and MVAP.

Accuracy of models: sensitivity, specificity, negative post-test probability: s-OSAS

We report the accuracy of these single- and two-stage models (Table III, upper and lower panels, respectively). In the single-stage model, use of MVAP ≥0.483 had the greatest sensitivity for detecting s-OSAS (91.5%), and the second-lowest NPTP for s-OSAS (1.5%). Use of NC ≥42.6 cm had similar sensitivity (85.6%) and NPTP (1.7%). Facial morphometrics score ≥80.0 offered the highest specificity (70.1%) but the second-lowest sensitivity (66.9%) for detecting s-OSAS, with the lowest NPTP of s-OSAS among single-stage models (0.6%).

Table III.

Discriminatory Power of Models for Predicting Severe Obstructive Sleep Apnea Syndrome*

| One-stage model | Cut-point | SENS | SPEC | Neg LR | NPTP |
|---|---|---|---|---|---|
| NC (cm) | 42.550 | 0.856 | 0.510 | 0.244 | 0.017 |
| Morphometrics | 79.998 | 0.669 | 0.701 | 0.473 | 0.006 |
| MVAP | 0.483 | 0.915 | 0.439 | 0.190 | 0.015 |
| BMI (kg/m2) | 30.937 | 0.819 | 0.505 | 0.334 | 0.018 |
| Symptoms | 1.000 | 0.774 | 0.551 | 0.394 | 0.035 |
| Age (years) | 48.275 | 0.629 | 0.299 | 1.641 | 0.029 |

| Two-stage model | LB | UB | HST cut-point | SENS | SPEC | Neg LR | NPTP |
|---|---|---|---|---|---|---|---|
| NC (cm) | 40.65 | 46.55 | 23.5 | 0.836 | 0.770 | 0.212 | 0.020 |
| Morphometrics | 40 | 95 | 23.25 | 0.942 | 0.689 | 0.076 | 0.037 |
| MVAP | 0.41 | 0.79 | 18 | 0.882 | 0.716 | 0.162 | 0.015 |
| BMI (kg/m2) | 25 | 41.25 | 23.75 | 0.829 | 0.708 | 0.219 | 0.027 |
| Symptoms | 1.2 | 3.1 | 20 | 0.629 | 0.840 | 0.442 | 0.031 |
| Age (years) | 32.5 | 65 | 15.25 | 0.743 | 0.694 | 0.368 | 0.119 |

Abbreviations: NC = neck circumference; BMI = body mass index; SENS = sensitivity; MVAP = Multivariable Apnea Prediction Score; SPEC = specificity; Neg LR = negative likelihood ratio; NPTP = Negative post-test probability; LB = lower bound; UB = upper bound; uAHI = Apnea-Hypopnea Index from unattended home sleep test; HST = Home Sleep Test

* Severe obstructive sleep apnea syndrome = AHI ≥ 30 events/hour + ESS > 10

For unattended sleep studies used alone (Table IV), the optimal cut-point for detecting s-OSAS was 16 events/hour, with sensitivity=74.7%, specificity=70.6%, LRneg=0.357 and NPTP=2.9%.

Table IV.

Discriminatory Power of Home Sleep Testing for Any or Severe Obstructive Sleep Apnea Syndrome

| | HST Cutpoint (uAHI) | AUC | SENS | SPEC | Neg LR | NPTP |
|---|---|---|---|---|---|---|
| Severe OSAS | 16.0 | 0.727 | 0.747 | 0.706 | 0.357 | 0.029 |
| Any OSAS | 8.9 | 0.591 | 0.718 | 0.478 | 0.573 | 0.159 |

Abbreviations: AUC = area under the receiver operating characteristic curve; SENS = sensitivity; SPEC = specificity; NegLR = negative likelihood ratio; NPTP = Negative post-test probability; HST = Home Sleep Test; uAHI = Apnea-Hypopnea Index from unattended HST; s-OSAS = severe obstructive sleep apnea syndrome (AHI ≥ 30 events/hour + ESS > 10); any OSAS = (AHI ≥ 5 events/hour + ESS > 10)

We evaluated HST when it was used in tandem with one-stage models (MVAP, NC, facial morphometry). The optimal upper and lower bounds for these first-stage tests, and the optimal cut-point for HST, are shown in Table III, lower panel. We found that: 1) MVAP followed by HST had 88.2% sensitivity, 71.6% specificity, and an NPTP of 1.5%; 2) NC followed by HST had sensitivity of 83.6% and specificity of 77.0%, with an NPTP of 2.0%; and 3) facial morphometry followed by HST identified s-OSAS with 94.2% sensitivity, 68.9% specificity, and an NPTP of 3.7%, but morphometry was not feasible in 64/224 (28.6%) because of the absence of teeth.

Accuracy of Models: Any OSAS

AUCs in the estimation and validation subsets using criteria for any OSAS (AHI ≥ 5 events/hour; ESS>10) are listed for single-stage and two-stage models (Table V). This definition yielded a higher prevalence of 22.9%. All models were less useful for predicting any OSAS than for predicting s-OSAS.

Table V.

Discriminatory Power of Models for predicting any obstructive sleep apnea syndrome*

| One-Stage Model | AUC | Cut-point | SENS | SPEC | Neg LR | NPTP |
|---|---|---|---|---|---|---|
| NC (cm) | 0.612 | 42.650 | 0.849 | 0.520 | 0.254 | 0.078 |
| Morphometrics | 0.579 | 50.976 | 0.872 | 0.299 | 0.428 | 0.124 |
| MVAP | 0.614 | 0.559 | 0.694 | 0.565 | 0.524 | 0.148 |
| BMI (kg/m2) | 0.609 | 29.168 | 0.794 | 0.444 | 0.460 | 0.132 |
| Symptoms | 0.630 | 0.916 | 0.790 | 0.537 | 0.380 | 0.112 |
| Age (years) | 0.507 | 45.825 | 0.780 | 0.236 | 1.013 | 0.251 |

| Two-Stage Model | AUC | LB | UB | HST cut-point | SENS | SPEC | Neg LR | NPTP |
|---|---|---|---|---|---|---|---|---|
| NC (cm) | 0.658 | 39.1 | 43.4 | 21 | 0.733 | 0.584 | 0.436 | 0.126 |
| Morphometrics | 0.658 | 31 | 96 | 11.25 | 0.867 | 0.449 | 0.279 | 0.085 |
| MVAP | 0.672 | 0.255 | 0.69 | 13.5 | 0.805 | 0.540 | 0.349 | 0.104 |
| BMI (kg/m2) | 0.649 | 24 | 36.25 | 13.5 | 0.789 | 0.509 | 0.394 | 0.115 |
| Symptoms | 0.652 | 1.175 | 3.175 | 20 | 0.580 | 0.723 | 0.578 | 0.161 |
| Age (years) | 0.606 | 30.75 | 59.25 | 11 | 0.772 | 0.441 | 0.436 | 0.126 |

Abbreviations: NC = neck circumference; BMI = body mass index; MVAP = multivariable apnea prediction score; AUC = area under the receiver operating characteristic curve; SENS = sensitivity; SPEC = specificity; Neg LR = negative likelihood ratio; NPTP = Negative post-test probability; LB = lower bound; UB = upper bound; HST = home sleep testing; uAHI = Apnea-Hypopnea Index from unattended HST

* Any obstructive sleep apnea syndrome = AHI ≥ 5 events/hour + ESS > 10

Unattended HST was not useful in case-finding any OSAS (AUC=0.591; Table IV). Overall, two-stage models that screen for any OSAS (Table V, lower panel) were significantly less useful than two-stage models that screen for s-OSAS (AUC range = 0.606–0.672 vs. 0.718–0.816, respectively; Table III, lower panel).

Missing data

Multiple imputation did not introduce significant bias. Percent difference in AUC between pre-imputation and post-imputation data was minimal (0.2% for NC) to absent (all other single-stage models).

Discussion

Among hypertensive outpatients, between 30 and 40% have OSA.32 Our case definition, severe OSAS with sleepiness, was selected based on data from the Sleep Heart Health Study (SHHS),11 which suggested that self-reported sleepiness may indicate susceptibility to cardiovascular sequelae of OSA and mark patients who should receive treatment priority. Blood pressure reduction may also be greater in OSA patients who receive PAP treatment and report sleepiness.4, 9, 10 Even with this added criterion, the proportion of our cohort with s-OSAS was nearly twice that of other middle-aged populations.33 This finding is not surprising, given the high prevalence of risk factors for OSA among veterans, including hypertension, obesity, male gender, African-American race, and habitual alcohol consumption (Table I).34

Single-stage models with the best discriminatory power in screening for s-OSAS were NC, facial morphometrics, and MVAP. MVAP and NC proved particularly good screening tools, with sensitivities of 91.5% and 85.6% respectively at optimal cut points of MVAP = 0.483 and NC = 42.6 cm. Morphometry was less feasible than MVAP or NC because 28.6% of the sample lacked first, second or third molars, so intermolar distance could not be measured to compute the facial morphometry score. Given the relative ease of measuring NC in routine clinical practice, our findings support its utility for initial patient evaluation for s-OSAS.

Despite its mathematical complexity,16 the MVAP can be computed using mobile, desktop, or web-based applications using easily-obtainable clinical information, which makes the MVAP a viable option in mass s-OSAS screening. MVAP calculation also lends itself well to emerging technologies such as sleep telemedicine, where BMI, gender, age, and symptoms can be obtained remotely and used to obtain the risk score.

Each of the three top-performing single-stage models demonstrated improved discriminatory power when used in tandem with HST. Facial morphometrics followed by HST showed the best power (AUC = 0.816), although NC and MVAP were almost as powerful in the two-stage models (AUCs = 0.803 and 0.799 respectively). In all models, combining easily-obtainable clinical information with HST (cut point AHI = 16 events/hour) increased discriminatory power, yielding higher AUC values.

We evaluated the impact two-stage algorithms would have on the volume of in-laboratory PSGs. The two-stage models using NC, MVAP or BMI were negative for s-OSAS 66–67% of the time. Given a missed case rate of only 0.5–1% in this group, one could argue that confirmatory PSGs are probably not justified on an economic basis, unless the cost of missing a case proved to be inordinately high. A reduction in in-laboratory PSGs by 66–67% may not only lower diagnostic costs, but also reduce patient burden, while increasing availability and accessibility of in-laboratory studies for those who are not candidates for unattended studies. Given the potential for adverse health effects of missing cases, the low missed case rate is a desirable feature of this program.

All models had less discriminatory power when we considered OSAS cases of any severity. HST alone was marginally useful, with an AUC of only 0.591 for identifying any OSAS. Adding clinical data to these studies improved accuracy in finding any OSAS only modestly; AUCs range 0.606–0.672 for two-stage models.

Our hypopnea definition did not require desaturation; this liberal definition is appropriate because it reduces the likelihood of missing cases during screening. Definitions that require a desaturation,12 which are used for Medicare coverage of PAP, would lower disease prevalence and therefore improve the negative predictive value of our screening models. Additionally, our study reveals a modest decline in HST sensitivity for detecting any OSA (AHI ≥5 events/hour) when pulse oximetry is not considered. HST devices incorporating pulse oximetry have sensitivity of at least 82.5% for detecting any OSA,35 while our AutoSet-based scoring had 74.7% sensitivity.

Feasibility

We conducted unattended sleep studies in 208/250 (83.2%) subjects. We obtained usable data in 192/208 (92.3%). Only 48/208 (23.1%) required repeat testing. Similar failure rates were found in another study.36 Future monitors may be simpler to assemble, yet more accurate and successful. While we obtained BMI, NC, and symptom scores in nearly everyone, absent molars precluded us from obtaining morphometric scores in nearly one-third.

Comparison with prior studies and limitations in design

Symptoms were useful in our cohort, contrasting with other sleep-center cohorts.16 In our non-referral, more obese group, symptoms added robustness to the results, yielding comparable AUC’s in the validation and estimation samples. Thus, symptoms may be useful in general populations outside of sleep centers who have a high prevalence of OSA-related risk factors. Symptom data may be less useful, however, in occupational screening, where reporting may be inaccurate.19

BMI was a powerful predictor of OSA in prior studies,17, 19 with AUCs of 0.938 and 0.802, respectively. BMI was less useful in our program, with AUC of 0.665. We calculated the coefficient of variation (CV=SD/mean*100%) to assess whether BMI was less useful in our group due to lower variance. However, CV =22.9% in our study was comparable to CV=23.1% elsewhere.17 We surmise that unmeasured risk factors for OSA among the non-obese patients in our group may have reduced the utility of BMI in this study. Additionally, exclusion of non-hypertensive patients may have degraded BMI predictive value.
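The coefficient of variation quoted above can be checked directly from the BMI mean and SD reported for all participants in Table I (32.1 and 7.36 kg/m2). A minimal Python sketch, with a hypothetical function name:

```python
def coefficient_of_variation(mean_value: float, sd: float) -> float:
    """CV = SD / mean * 100%."""
    return sd / mean_value * 100.0


# BMI mean and SD for all participants, from Table I
print(round(coefficient_of_variation(32.1, 7.36), 1))  # 22.9
```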

HST tended to underestimate in-laboratory AHI for the 33–45 event/hour range. These findings are similar to those from previous work comparing HST and in-laboratory PSG.37 AHI on PSG is calculated as the number of respiratory events per hour of sleep, while HST monitors report this index based on hours of use. Since HST cannot differentiate sleep from wakefulness, its respiratory indices may be lower than on PSG.

Additionally, our AutoSet device utilized only nasal cannula pressure changes to identify apneas and hypopneas, as opposed to full PSG which utilizes nasal pressure and an oronasal thermistor in respiratory event detection. When using nasal pressure alone (without a thermistor) to detect respiratory events, less-stringent criteria to score respiratory events may reduce missed cases without raising false positive rates.38 Monitors that use more stringent criteria for scoring events than those used by the AutoSet may miss more cases, and reduce the usefulness of unattended HST.

Strengths and Limitations

Ours is the first investigation of screening for OSAS among a general population of hypertensive outpatients, rather than sleep center referrals,22 with unattended recordings of airflow and respiratory effort at home (HST).21 The AASM has published guidelines22 and the Centers for Medicare and Medicaid Services has approved the use of this type of monitor for diagnosing OSA and initiating positive airway pressure (PAP) therapy.39 The strengths of this investigation include: 1) prospective application of HST and full PSGs, after administration of first-stage screening tools; 2) blinding of PSG scorers to all other clinical data; 3) the use of a general medical rather than subspecialty-based population; and 4) use of a definition of hypertension that is consistent with that used in prior investigations, such as the SHHS.2

Given that our sample was composed largely of Caucasian and African-American men, additional studies are needed to refine these models for other ethnic groups in whom s-OSAS may occur in thin individuals with intraoral or craniofacial risk factors,40–42 and in whom symptoms may be more important.16 The limited proportion of women (20%) in this largely veteran population reduces our ability to generalize these results to them. Similarly, while age was more useful in other settings, single-stage models that contained age in this cohort had AUC<0.7, perhaps because of the restriction of our sample to ages 30–65 years. Future evaluations of OSA screening strategies should include other ethnic, gender, and age groups.

Conclusions

Two-stage models can be used to screen for s-OSAS to reduce the requirement for in-laboratory PSGs in hypertensive outpatients, a population with high disease prevalence.2 Models utilizing a facial morphometric score, neck circumference, and the MVAP at optimal cut points as the first stage in these models demonstrated greatest utility. They are less helpful in screening patients for OSAS of any severity. Unattended HST works best when used in tandem with clinical data rather than used alone, particularly in screening for any OSAS, because of the high likelihood of missed cases.
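The concern about missed cases can be made concrete with the standard relationship between pre-test probability, the negative likelihood ratio, and the negative post-test probability (NPTP). The sketch below is an illustration under stated assumptions rather than the computation used in this study: the function names are hypothetical, and the prevalence, sensitivity, and specificity values are invented for the example.

```python
def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity <= 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1]")
    return (1.0 - sensitivity) / specificity


def negative_post_test_probability(prevalence, sensitivity, specificity):
    """Probability of disease after a negative screen, via the odds form of Bayes' rule."""
    if not 0.0 <= prevalence < 1.0:
        raise ValueError("prevalence must be in [0, 1)")
    pre_test_odds = prevalence / (1.0 - prevalence)
    post_test_odds = pre_test_odds * negative_likelihood_ratio(sensitivity, specificity)
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical inputs, for illustration only (not study results).
nptp = negative_post_test_probability(prevalence=0.20, sensitivity=0.90, specificity=0.80)
print(f"NPTP = {nptp:.1%}")
```

A lower NPTP means that fewer patients with a negative screen actually have the disease, which is why a screening strategy with higher sensitivity at its chosen cut point leaves fewer missed cases.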

Acknowledgments

Competing Interests and Funding

This work was supported by NIH grants K23 RR16068, RO1-OH009149, and T32 HL07713, and by the Veterans Integrated Services Network (VISN) 4 Competitive Pilot Project Fund. ResMed, Inc. provided an unrestricted loan for use of portable diagnostic sleep study equipment (AutoSet™) and had no role in protocol development, data collection, storage, analysis, or manuscript preparation.

References

  • 1. Chobanian AV, Bakris GL, Black HR, et al. The seventh report of the joint national committee on prevention, detection, evaluation, and treatment of high blood pressure: The JNC 7 report. JAMA. 2003;289:2560–2572. doi: 10.1001/jama.289.19.2560.
  • 2. Nieto FJ, Young TB, Lind BK, et al. Association of sleep-disordered breathing, sleep apnea, and hypertension in a large community-based study. Sleep Heart Health Study. JAMA. 2000;283:1829–1836. doi: 10.1001/jama.283.14.1829.
  • 3. Peppard PE, Young T, Palta M, et al. Prospective study of the association between sleep-disordered breathing and hypertension. N Engl J Med. 2000;342:1378–1384. doi: 10.1056/NEJM200005113421901.
  • 4. Becker HF, Jerrentrup A, Ploch T, et al. Effect of nasal continuous positive airway pressure treatment on blood pressure in patients with obstructive sleep apnea. Circulation. 2003;107:68–73. doi: 10.1161/01.cir.0000042706.47107.7a.
  • 5. Pepperell JC, Ramdassingh-Dow S, Crosthwaite N, et al. Ambulatory blood pressure after therapeutic and subtherapeutic nasal continuous positive airway pressure for obstructive sleep apnoea: A randomised parallel trial. Lancet. 2002;359:204–210. doi: 10.1016/S0140-6736(02)07445-7.
  • 6. Duran-Cantolla J, Aizpuru F, Montserrat JM, et al. Continuous positive airway pressure as treatment for systemic hypertension in people with obstructive sleep apnoea: Randomised controlled trial. BMJ. 2010;341:c5991. doi: 10.1136/bmj.c5991.
  • 7. Haentjens P, Van Meerhaeghe A, Moscariello A, et al. The impact of continuous positive airway pressure on blood pressure in patients with obstructive sleep apnea syndrome: Evidence from a meta-analysis of placebo-controlled randomized trials. Arch Intern Med. 2007;167:757–764. doi: 10.1001/archinte.167.8.757.
  • 8. Flemons WW, Buysse D, Redline S, et al. Sleep-related breathing disorders in adults: Recommendations for syndrome definition and measurement techniques in clinical research. The report of an American Academy of Sleep Medicine task force. Sleep. 1999;22:667–689.
  • 9. Barbe F, Mayoralas LR, Duran J, et al. Treatment with continuous positive airway pressure is not effective in patients with sleep apnea but no daytime sleepiness. A randomized, controlled trial. Ann Intern Med. 2001;134:1015–1023. doi: 10.7326/0003-4819-134-11-200106050-00007.
  • 10. Robinson GV, Stradling JR, Davies RJ. Obstructive sleep apnoea/hypopnoea syndrome and hypertension. Thorax. 2004;59:1089–1094. doi: 10.1136/thx.2003.015875.
  • 11. Kapur VK, Resnick HE, Gottlieb DJ, et al. Sleep disordered breathing and hypertension: Does self-reported sleepiness modify the association? Sleep. 2008;31:1127–1132.
  • 12. Baumel MJ, Maislin G, Pack AI. Population and occupational screening for obstructive sleep apnea: Are we there yet? Am J Respir Crit Care Med. 1997;155:9–14. doi: 10.1164/ajrccm.155.1.9001281.
  • 13. Somers VK, White DP, Amin R, et al. Sleep apnea and cardiovascular disease. An American Heart Association/American College of Cardiology Foundation scientific statement from the American Heart Association Council for High Blood Pressure Research Professional Education Committee, Council on Clinical Cardiology, Stroke Council, and Council on Cardiovascular Nursing. In collaboration with the National Heart, Lung, and Blood Institute National Center on Sleep Disorders Research (National Institutes of Health). Circulation. 2008;118:1080–1111. doi: 10.1161/CIRCULATIONAHA.107.189375.
  • 14. Viner S, Szalai JP, Hoffstein V. Are history and physical examination a good screening test for sleep apnea? Ann Intern Med. 1991;115:356–359. doi: 10.7326/0003-4819-115-5-356.
  • 15. Hoffstein V, Szalai JP. Predictive value of clinical features in diagnosing obstructive sleep apnea. Sleep. 1993;16:118–122.
  • 16. Maislin G, Pack AI, Kribbs NB, et al. A survey screen for prediction of apnea. Sleep. 1995;18:158–166. doi: 10.1093/sleep/18.3.158.
  • 17. Kushida CA, Efron B, Guilleminault C. A predictive morphometric model for the obstructive sleep apnea syndrome. Ann Intern Med. 1997;127:581–587. doi: 10.7326/0003-4819-127-8_part_1-199710150-00001.
  • 18. Kushida CA, Littner MR, Morgenthaler T, et al. Practice parameters for the indications for polysomnography and related procedures: An update for 2005. Sleep. 2005;28:499–521. doi: 10.1093/sleep/28.4.499.
  • 19. Gurubhagavatula I, Maislin G, Nkwuo JE, et al. Occupational screening for obstructive sleep apnea in commercial drivers. Am J Respir Crit Care Med. 2004;170:371–376. doi: 10.1164/rccm.200307-968OC.
  • 20. Gurubhagavatula I, Maislin G, Pack AI. An algorithm to stratify sleep apnea risk in a sleep disorders clinic population. Am J Respir Crit Care Med. 2001;164:1904–1909. doi: 10.1164/ajrccm.164.10.2103039.
  • 21. Ng SS, Chan TO, To KW, et al. Validation of Embletta portable diagnostic system for identifying patients with suspected obstructive sleep apnoea syndrome (OSAS). Respirology. 2010;15:336–342. doi: 10.1111/j.1440-1843.2009.01697.x.
  • 22. Collop NA, Anderson WM, Boehlecke B, et al. Clinical guidelines for the use of unattended portable monitors in the diagnosis of obstructive sleep apnea in adult patients. Portable monitoring task force of the American Academy of Sleep Medicine. J Clin Sleep Med. 2007;3:737–747.
  • 23. Marin JM, Carrizo SJ, Eugenio V, Agusti AGN. Long-term cardiovascular outcomes in men with obstructive sleep apnoea-hypopnoea with or without treatment with continuous positive airway pressure: an observational study. Lancet. 2005;365:1046–1053. doi: 10.1016/S0140-6736(05)71141-7.
  • 24. Johns MW. A new method for measuring daytime sleepiness: The Epworth sleepiness scale. Sleep. 1991;14:540–545. doi: 10.1093/sleep/14.6.540.
  • 25. Kiely JL, Delahunty C, Matthews S, et al. Comparison of a limited computerized diagnostic system (rescare autoset) with polysomnography in the diagnosis of obstructive sleep apnoea syndrome. Eur Respir J. 1996;9:2360–2364. doi: 10.1183/09031936.96.09112360.
  • 26. Rechtschaffen A, Kales A. A manual of standardized terminology, techniques and scoring system for sleep stages of human subjects. Washington, DC: U.S. Government Printing Office; 1968.
  • 27. Wayman JC. Multiple imputation for missing data: What is it and how can I use it? Presented at the Annual Meeting of the American Educational Research Association; Chicago, IL. Center for Social Organization of Schools, Johns Hopkins University; 2003.
  • 28. Simel DL, Samsa GP, Matchar DB. Likelihood ratios with confidence: Sample size estimation for diagnostic test studies. J Clin Epidemiol. 1991;44:763–770. doi: 10.1016/0895-4356(91)90128-v.
  • 29. Gonen M. Analyzing receiver operating characteristic curves with SAS. Cary, NC: SAS Institute, Inc; 2007.
  • 30. Gurubhagavatula I, Maislin G, Nkwuo JE, Pack AI. Occupational screening for obstructive sleep apnea in commercial drivers. Am J Respir Crit Care Med. 2004;170:371–376. doi: 10.1164/rccm.200307-968OC.
  • 31. Morales CR, Hurley S, Wick LC, Staley B, Pack FM, Gooneratne NS, Maislin G, Pack A, Gurubhagavatula I. In-Home, Self-Assembled Sleep Studies are useful in Diagnosing Sleep Apnea in the Elderly. Sleep. 2012;35(11):1491–1501. doi: 10.5665/sleep.2196.
  • 32. Worsnop CJ, Naughton MT, Barter CE, Morgan TO, Anderson AI, Pierce RJ. The prevalence of obstructive sleep apnea in hypertensives. Am J Respir Crit Care Med. 1998;157(1):111–115. doi: 10.1164/ajrccm.157.1.9609063.
  • 33. Young T, Palta M, Dempsey J, et al. The occurrence of sleep-disordered breathing among middle-aged adults. N Engl J Med. 1993;328:1230–1235. doi: 10.1056/NEJM199304293281704.
  • 34. Caples SM, Gami AS, Somers VK. Obstructive sleep apnea. Ann Intern Med. 2005;142:187–197. doi: 10.7326/0003-4819-142-3-200502010-00010.
  • 35. Collop NA, Tracy SL, Kapur V, et al. Obstructive sleep apnea devices for out-of-center (OOC) testing: technology evaluation. J Clin Sleep Med. 2011;7(5):531–548. doi: 10.5664/JCSM.1328.
  • 36. Kuna ST, Gurubhagavatula I, Maislin G, et al. Noninferiority of functional outcome in ambulatory management of obstructive sleep apnea. Am J Respir Crit Care Med. 2011;183:1238–1244. doi: 10.1164/rccm.201011-1770OC.
  • 37. Ayappa I, Norman RG, Suryadevara M, Rapoport DM. Comparison of limited monitoring using a nasal-cannula flow signal to full polysomnography in sleep-disordered breathing. Sleep. 2004;27:1171–1179. doi: 10.1093/sleep/27.6.1171.
  • 38. Thornton AT, Singh P, Ruehland WR, Rochford PD. AASM criteria for scoring respiratory events: interaction between apnea sensor and hypopnea definition. Sleep. 2012;35(3):425–432. doi: 10.5665/sleep.1710.
  • 39. Department of Health and Human Services, Center for Medicare and Medicaid Services. Decision memo for continuous positive airway pressure (CPAP) therapy for obstructive sleep apnea (OSA). 2008 Mar 13. www.cms.hhs.gov/mcd/viewdecisionmemo.asp?id=204. Accessed 23 December 2010.
  • 40. Kim J, In K, Kim J, et al. Prevalence of sleep-disordered breathing in middle-aged Korean men and women. Am J Respir Crit Care Med. 2004;170:1108–1113. doi: 10.1164/rccm.200404-519OC.
  • 41. Udwadia ZF, Doshi AV, Lonkar SG, et al. Prevalence of sleep-disordered breathing and sleep apnea in middle-aged urban Indian men. Am J Respir Crit Care Med. 2004;169:168–173. doi: 10.1164/rccm.200302-265OC.
  • 42. Lee RWW, Vasudavan S, Hui DS, et al. Differences in craniofacial structures and obesity in Caucasian and Chinese patients with obstructive sleep apnea. Sleep. 2010;33:1075–1080. doi: 10.1093/sleep/33.8.1075.
