Abstract
Objective:
To aid in the identification of patients with moderate-to-severe sleep-disordered breathing (SDB), we developed and validated a simple screening tool applicable to both clinical and community settings.
Methods:
Logistic regression analysis was used to develop an integer-based risk scoring system. The participants in this derivation study included 132 patients visiting one of 2 hospitals in Japan, and 175 residents of a rural town. The participants in the present validation study included 308 employees of a company in Japan who were undergoing a health check.
Results:
The screening tool consisted of only 4 variables: sex, blood pressure level, body mass index, and self-reported snoring. This tool (screening score) gave an area under the receiver operating characteristic curve (ROC) of 0.90, sensitivity of 0.93, and specificity of 0.66, using a cutoff point of 11. Predicted and observed prevalence proportions in the validation dataset were in close agreement across the entire spectrum of risk scores. In the validation dataset, the area under the ROC for moderate-to-severe SDB and severe SDB were 0.78 and 0.85, respectively. The diagnostic performance of this tool did not significantly differ from that of previous, more complex tools.
Conclusion:
These findings suggest that our screening scoring system is a valid tool for the identification and assessment of moderate-to-severe SDB. With knowledge of only 4 easily ascertainable variables, which are routinely checked during daily clinical practice or mass health screening, moderate-to-severe SDB can be easily detected in clinical and public health settings.
Citation:
Takegami M; Hayashino Y; Chin K; Sokejima S; Kadotani H; Akashiba T; Kimura H; Ohi M; Fukuhara S. Simple four-variable screening tool for identification of patients with sleep-disordered breathing. SLEEP 2009;32(7):939-948.
Keywords: Sleep-disordered breathing, screening, sensitivity, specificity, validation
Sleep-disordered breathing (SDB), including obstructive sleep apnea, was initially considered a rare disorder; however, recent epidemiologic studies have revealed that it is fairly prevalent in the general adult population.1,2 Apnea and hypopnea during sleep increase the risk of cardiovascular disease, including hypertension, arrhythmia, and myocardial infarction, as well as cerebrovascular disease.3 Moreover, because SDB may lead to motor vehicle and public transportation accidents, it is now also considered a serious social concern.4,5 SDB is therefore considered a problem requiring attention from both clinical and public health perspectives.
Because SDB is rarely recognized as potentially fatal, however, and given the difficulty affected patients have in recognizing their condition, only a small proportion of those with moderate-to-severe SDB receive appropriate therapy,6 notwithstanding the availability of several highly effective treatments.7
Regarding the diagnosis of SDB, polysomnography (PSG) has been used as a gold standard, and cardiorespiratory monitoring may be used for diagnosis.8 These machines require overnight sleep testing and are thus time-consuming and burdensome, and neither is suitable for community-based screening. We therefore considered that a user-friendly screening tool may improve the identification of patients with moderate-to-severe SDB. To our knowledge, several questionnaires and prediction rules have been used for mass screening9–12; however, one includes numerous variables, and the others are not appropriate in occupational and community settings. Moreover, a comprehensive comparison of these questionnaires has yet to be conducted.
Here, we sought to develop and validate a simple, user-friendly, integer-based, prediction rule with a relatively small number of variables to screen subjects for moderate-to-severe SDB. We also wanted to compare the predictive performance of this model with those previously developed.
METHODS
Subjects and Data Collection
The derivation dataset used to derive the screening tool and the validation dataset used to test the external validity of this tool were collected separately. To ensure the generalizability of the screening tool, derivation data were gathered in 2 settings (university hospital and community settings). First, we included consecutive patients undergoing PSG testing in 2 medical university hospitals in Japan between July 1999 and December 2002. These patients underwent pulse oximetry as part of PSG testing, and, when diagnosed with SDB, completed a self-administered questionnaire. The physician who ordered the PSG also collected information on patient characteristics and clinical history. Second, we included a sample of subjects from a previous population-based survey. Of the 5,107 residents who had participated in the previous survey, we included those who consented to undergo pulse oximetry in the current study. This survey, originally conducted to clarify the impact of factors related to the subjects' social and physical environment on health-related quality of life and/or sleep quality, has been described elsewhere.13 Briefly, the cohort consisted of all residents 20 years old or older living in Naie, Hokkaido Prefecture, a rural community in Japan. Participants in the original survey were invited to voluntarily undergo pulse oximetry for our study, and those who agreed were invited to participate in a subsequent overnight study. Public health nurses acquired the history of each participant.
For the validation dataset, a cross-sectional survey was conducted among all employees of a wholesale company in Osaka, Japan between January 2004 and December 2005.14 The survey included a self-administered questionnaire and medical examination, along with sleep tests using a cardiorespiratory monitor (type 3 portable monitor) and an actigraph to diagnose SDB.
Subjects with a history of treatment for SDB or other sleep disorders, such as narcolepsy, were excluded, as were those who were unable to complete pulse oximetry or cardiorespiratory testing because the testing device was operated incorrectly or malfunctioned.
The population-based study in Naie was approved by the Institutional Review Board of the Public Health Research Foundation, and the hospital-based and occupational studies were authorized by the Institutional Review Board of the Kyoto University Graduate School of Medicine.
SDB Measure
During the derivation process, we defined SDB in accordance with a guideline endorsed by the British Thoracic Society, namely: ≥ 10 respiratory events per hour and arterial oxygen desaturation ≥ 4% from the baseline saturation value obtained during the sleep study.15 It has been reported that, as the oxygen desaturation index (ODI; measured by overnight pulse oximetry) increases, the risk of cardiovascular disease increases, as it does with an increase in the apnea-hypopnea index (AHI; measured by PSG).15 The guideline uses an ODI of ≥ 10 to diagnose sleep apnea, without confirmation by PSG16,17 and recommends treatment at this level.15
For validation, the respiratory disturbance index (RDI: number of apneas and hypopneas per hour of sleep) was calculated from data obtained from both the actigraph and the cardiorespiratory monitor (type 3 portable monitor).14 To improve the accuracy of RDI measurement, the validation study used a type 3 portable monitor rather than a pulse oximeter. Type 3 monitors are defined as devices with a minimum of 4 monitored channels,18 including ventilation or airflow (with at least 2 channels of respiratory movement, or respiratory movement and airflow), heart rate or electrocardiogram, and oxygen saturation. In our validation study, a type 3 monitor recorded chest and abdominal respiratory movements, nasal pressure, oxygen saturation, heart rate, and body position. Type 3 monitors are considered standard in sleep studies and an alternative to PSG in diagnosing SDB.19 Given that type 3 monitors are sometimes unable to measure sleep duration, however, this variable was measured using an actigraph on the basis of the similarity in findings between measurement of sleep duration in bed using an actigraph and PSG.20 Sleep duration at night was measured by wrist actigraph tracing21 in conjunction with a sleep diary, which documented when the participant went to bed at night and arose in the morning. Sleep onset was estimated by noting the sustained cessation of movement of the wrist on the actigraphy tracing, and rousing was noted by the appearance of wrist movements on the tracing. With regard to type 3 monitor analysis, sleep duration was defined as the estimated time duration between sleep onset and final rousing. SDB analyses incorporated the main segments of comprehensible type 3 monitor recordings within the estimated sleep spans. Apneas (cessation of breathing ≥ 10 sec) and hypopneas (≥ 50% reduction in respiratory effort with ≥ 3% oxygen desaturation for ≥ 10 sec) were visually scored by at least 2 medical doctors specializing in respiratory medicine.
Subjects with an RDI of ≥ 15 were considered to have moderate-to-severe SDB, and those with an RDI of ≥ 30 were considered to have severe SDB.
Measurement and Definition of Independent Variables
Potential candidate variables for predictors were identified based on a review of the literature, clinical relevance, and routine availability; the selection of variables for abstraction was further guided by physicians specializing in sleep medicine. Potential candidate variables were classified as demographic characteristics, comorbid conditions, and clinical features (including symptoms). Comorbid conditions included diabetes mellitus, cardiovascular disease, and cerebrovascular disease. Clinical features included snoring, body mass index (BMI), and blood pressure. Laboratory features, such as blood tests and spirometry test data, were excluded because these are unsuitable for screening the general population. Symptoms were assessed using a self-administered questionnaire which asked about morning headaches, daytime sleepiness, vitality related to daytime sleepiness, and psychological well-being. Subjective daytime sleepiness was assessed using the Epworth Sleepiness Scale (ESS), a valid and reliable self-administered questionnaire instrument.22,23 Using a cutoff point of 10, the ESS score can help clinicians to separate patients who are clinically normal from those who are excessively sleepy and may have sleep disorders. Vitality and psychological well-being were assessed using the Vitality (VT) and Mental Health (MH) subscales of the Medical Outcome Study Short-Form 36-Item Health Survey (SF-36). The SF-36 is a valid and reliable instrument for measuring health-related quality of life.24,25
Statistical Analysis
The association between SDB and candidate variables was determined using the t-test for continuous variables and Pearson's χ2 test for categorical variables. BMI values were grouped into 6 categories, with the cutoff points modified using reported research data applicable to Japanese.26 Blood pressure values were categorized using the cutoff points described in the World Health Organization Hypertension guideline.27 We examined the strength and shape of the relationships of continuous variables with the log odds of SDB using cubic spline plots.28 These functions were then used to develop and refine the multivariable regression models reported previously.29 Candidate variables with P-values < 0.10 were included as covariates in a multiple logistic regression model,30 and those with P-values < 0.05 were retained in the final multivariable logistic regression model.31 Because our aim was to develop a simple and practical tool for screening, we excluded predictors determined on discussion with clinicians to be clinically unimportant. Model discrimination was assessed by the area under the receiver operating characteristic (ROC) curve,32 and calibration was assessed using the Hosmer-Lemeshow χ2 statistic.30
An integer score-based prediction rule for the prevalence of SDB was developed from the logistic regression model using a regression coefficient-based scoring method.33,34 To generate a simple integer-based point score for each predictor variable, each β-coefficient was divided by a criterion value, defined as the sum of the smallest and second-smallest coefficients in the model multiplied by 0.4, and the quotient was rounded to the nearest integer. This denominator was selected so that the variable with the smallest coefficient received a score of 1. Missing predictor variables were replaced with a zero. The overall risk score was calculated by summing all component scores for each participant.35,36
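The coefficient-based scoring step can be illustrated with a minimal Python sketch using the β-coefficients reported in Table 3. The function name `integer_scores` and the dictionary keys are hypothetical and serve only for illustration. Rounding to the nearest integer is assumed here because it reproduces the scores assigned in Table 3; this is not the authors' original code.

```python
from typing import Dict


def integer_scores(betas: Dict[str, float], fraction: float = 0.4) -> Dict[str, int]:
    """Convert regression coefficients to integer points (illustrative sketch)."""
    ordered = sorted(betas.values())
    # Criterion value: 0.4 x (smallest + second-smallest coefficient).
    denominator = fraction * (ordered[0] + ordered[1])
    # Assumption: quotients are rounded to the nearest integer.
    return {name: round(beta / denominator) for name, beta in betas.items()}


# Coefficients as reported in Table 3 (key names are hypothetical labels).
table3_betas = {
    "men": 2.33,
    "bmi_per_category": 0.68,
    "bp_per_category": 0.73,
    "snoring": 2.02,
}
scores = integer_scores(table3_betas)
```

With these inputs the criterion value is 0.4 × (0.68 + 0.73) = 0.564, and the quotients 4.13, 1.21, 1.29, and 3.58 round to 4, 1, 1, and 4.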
To assess the performance of the screening tool in the derivation dataset, we calculated the sensitivity, specificity, and likelihood ratios for positive and negative test results, post-test probability of positive and negative results, and area under the ROC curve by varying the positive threshold of our screening score (total risk score ≥ 9, ≥ 11, or ≥ 14). Screening scores were also assigned to different risk classes by quartile, and the prevalence of SDB observed in each was compared using the χ2 test for trend. We validated the screening tool internally using the leave-one-out cross-validation method, in which all cases but one were used to train the screening tool. The rule was then applied to the single excluded case.37–39 This procedure was repeated for every case, until each had been left out once.40
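As a hedged illustration of how these indices relate to one another, the sketch below computes them from a 2 × 2 table. The function name `diagnostic_metrics` is hypothetical, and the example counts are not taken from the article; they were back-calculated here from the derivation-set figures reported in Table 4 for a cutoff of ≥ 11 (193 positive results, 149 subjects with SDB, 158 without), and should be read as an assumption.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Screening-test indices from 2 x 2 counts (illustrative sketch).

    Assumes specificity is strictly between 0 and 1, so that both
    likelihood ratios are defined.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
        # Post-test probability of disease after a positive / negative result.
        "ptp_pos": tp / (tp + fp),
        "ptp_neg": fn / (fn + tn),
    }


# Back-calculated counts (an assumption), derivation dataset, cutoff >= 11.
m = diagnostic_metrics(tp=139, fp=54, fn=10, tn=104)
```

Under these assumed counts, the computed values agree to three decimal places with the derivation column of Table 4 at this cutoff.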
We externally validated the screening score by separately assessing model performance in the validation dataset in the same manner as in the derivation dataset for outcomes of moderate-to-severe SDB, or severe SDB. We also tested model performance for outcome of the definition of SDB using 4% desaturation with pulse oximetry. In addition, we used the hypothetical screening strategies proposed by Gurubhagavatula et al.41 in the validation dataset by combining our screening score with the pulse oximetry results to determine if the subject undergoes PSG or continuous positive airway pressure (CPAP) titration, thereby dividing the participants into 3 groups according to risk score. We evaluated 2 different strategies by varying the risk score threshold. For strategy 1, subjects in the high-risk group (scores ≥ 14 [upper threshold]) underwent PSG or CPAP titration, whereas subjects in the low-risk group (scores < 9 [lower threshold]) underwent no further testing. The remaining participants (intermediate group) were subjected to pulse oximetry, and, if their ODI was ≥ 10, they also underwent PSG or CPAP titration; otherwise, no further testing was done. In strategy 2, the upper and lower thresholds were set at 14 and 11, respectively. Test performance for these strategies (sensitivity, specificity, area under the ROC, likelihood ratio) was evaluated by defining each as positive if subjects were in the high or intermediate risk groups with ODI ≥ 10, using moderate-to-severe SDB or severe SDB as the reference.
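The two-threshold strategy described above can be sketched as a simple decision function. The name `triage`, its parameters, and the returned labels are hypothetical; the default thresholds correspond to strategy 1 (lower 9, upper 14), and strategy 2 is obtained by passing `lower=11`.

```python
from typing import Optional


def triage(score: int, odi: Optional[float] = None,
           lower: int = 9, upper: int = 14, odi_cutoff: float = 10.0) -> str:
    """Assign the next step for one subject (illustrative sketch)."""
    if score >= upper:
        # High-risk group: PSG or CPAP titration.
        return "refer"
    if score < lower:
        # Low-risk group: no further testing.
        return "no further testing"
    if odi is None:
        # Intermediate group without an oximetry result yet.
        return "pulse oximetry"
    # Intermediate group: decide on the oxygen desaturation index.
    return "refer" if odi >= odi_cutoff else "no further testing"


example_strategy1 = triage(10, odi=12.0)            # intermediate, ODI >= 10
example_strategy2 = triage(10, odi=12.0, lower=11)  # below the lower threshold
```

The same subject (score 10, ODI 12) is referred under strategy 1 but receives no further testing under strategy 2, which illustrates how the lower threshold trades sensitivity for a smaller oximetry workload.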
We also compared the area under the ROC curves of our screening score to those of other reported diagnosis tools, such as the Multivariable Apnea Risk Index (MAP) and Sleep Apnea Clinical Score (SACS),9,10 using the algorithm described by DeLong et al.42 For this analysis, statistical significance was defined at the P < 0.05 level.
All analyses were performed using SAS statistical software version 8.02 (SAS Institute Inc., Cary, NC, USA) and STATA statistical software version 9.2 (Stata Corp. LP, College Station, TX, USA).
RESULTS
Subject Characteristics in the Derivation and Validation Datasets
The derivation dataset contained 307 subjects (132 subjects visiting hospitals and 175 local residents). Subject participation in the derivation study is illustrated by a flow diagram (Figure 1). The validation dataset contained 308 workers; 16 were excluded from the current analysis because of incomplete cardiorespiratory monitor testing. Table 1 shows the demographic and clinical characteristics of the subjects in the 2 data sets. The percentage of women in the derivation and validation data sets was 33.9% and 1.0%, respectively. In the derivation dataset, 149 of 307 subjects were diagnosed as having SDB (ODI ≥ 10): 128 (97.0%) of the 132 subjects who visited the hospital and 21 (12.0%) of the 175 volunteers residing in the rural area. The prevalence of moderate-to-severe (RDI ≥ 15) and severe (RDI ≥ 30) SDB in the external validation dataset was 22.4% (n = 69) and 6.8% (n = 21), respectively.
Table 1.
Characteristic | Derivation Dataset (n = 307) | Validation Dataset (n = 308) | P value |
---|---|---|---|
Age, mean (SD) | 49.9 (14.4) | 43.8 (8.3) | < 0.001 |
Men, % | 66.1 | 99.0 | < 0.001 |
Body mass index (kg/m2), mean (SD) | 25.5 (4.2) | 23.7 (3.3) | < 0.001 |
Smoking history | | | < 0.001‡ |
Present smoker, % | 31.4 | 55.4 | |
Never smoker, % | 43.1 | 19.7 | |
Snoring, % | 80.5 | 48.9 | < 0.001 |
Excessive daytime sleepiness | |||
Epworth Sleepiness Scale, mean (SD) | 7.1 (4.3) | 6.6 (3.7) | 0.165 |
Hypertension, % | 48.9 | 16.1 | < 0.001 |
Systolic blood pressure (mm Hg), mean (SD) | 128.2 (19.3) | 129.0 (14.3) | 0.562 |
Diastolic blood pressure (mm Hg), mean (SD) | 82.5 (12.5) | 80.9 (10.8) | 0.081 |
Diabetes, % | 4.6 | 9.3 | 0.037 |
Cerebrovascular disease, % | 3.6 | 1.8 | 0.127 |
Cardiovascular disease, % | 5.5 | 3.6 | 0.337 |
Oxygen desaturation index*, mean (SD) | 22.1 (28.3) | 7.9 (10.1) | < 0.001 |
Respiratory disturbance index†, mean (SD) | NA | 10.7 (11.6) | NA |
*Oxygen desaturation index is defined as the number of falls of greater than 4% in arterial oxygen saturation per hour during the sleep study.
†Respiratory disturbance index is the number of apneas and hypopneas per hour of sleep study.
‡P value for trend, χ2 test.
Predictors of SDB
The results of univariate analysis for all potential predictors are shown in Table 2. Multivariable logistic regression analysis revealed that sex, BMI, blood pressure, snoring, and vitality were significant variables in the identification of SDB. Because no significant association between vitality and SDB could be determined when analysis was restricted to the local residents only (P = 0.792), vitality was not included in the final multiple logistic regression model. Although daytime sleepiness was associated with SDB in univariate analysis, the association was not significant after adjustment for other factors in the multivariable model (P = 0.216). Sex, BMI, blood pressure, and snoring were independently associated with a final diagnosis of SDB and remained in the final multivariable model for predicting SDB (Table 3).
Table 2.
Characteristic | Non-SDB (n = 158) | SDB (n = 149) | P value |
---|---|---|---|
Age, mean (SD) | 49.6 (15.2) | 50.3 (13.6) | 0.666 |
Men, % | 41.1 | 92.6 | < 0.001 |
Body mass index (kg/m2), % | | | < 0.001‡ |
< 21.0 | 22.8 | 2.0 | |
21.0-22.9 | 26.0 | 6.1 | |
23.0-24.9 | 24.7 | 12.8 | |
25.0-26.9 | 12.6 | 18.2 | |
27.0-29.9 | 10.8 | 32.4 | |
≥ 30.0 | 3.2 | 28.4 | |
Blood pressure (mm Hg), % | | | < 0.001‡ |
SBP < 140 and DBP < 90 | 71.5 | 39.9 | |
140 ≤ SBP < 160 or 90 ≤ DBP < 100 | 21.5 | 37.2 | |
160 ≤ SBP < 180 or 100 ≤ DBP < 110 | 6.3 | 18.2 | |
SBP ≥ 180 or DBP ≥ 110 | 0.6 | 4.7 | |
Snoring, % | 65.2 | 96.6 | < 0.001 |
Current smoker, % | 28.1 | 34.9 | 0.204 |
Comorbid condition | |||
Diabetes, % | 5.1 | 4.0 | 0.664 |
Cerebrovascular disease, % | 4.4 | 2.7 | 0.411 |
Cardiovascular disease, % | 7.0 | 4.0 | 0.261 |
Symptom | |||
Excessive daytime sleepiness (ESS > 11), % | 19.9 | 44.4 | < 0.001 |
Vitality† (SF-36), mean (SD) | 50.6 (9.5) | 46.2 (10.6) | < 0.001 |
Mental Health† (SF-36), mean (SD) | 50.9 (9.0) | 48.5 (9.7) | 0.029 |
*SDB = sleep-disordered breathing; SBP = systolic blood pressure; DBP = diastolic blood pressure; ESS = Epworth Sleepiness Scale; SF-36 = Medical outcome study short-form 36-item health survey.
†Vitality and mental health are norm-based scores of the SF-36 domains.
‡P value for trend, χ2 test.
Table 3.
Characteristic | β Regression Coefficient (95% CI) | Screening Score Assigned |
---|---|---|
Sex | ||
women | ref | 0 |
men | 2.33 (1.52–3.14) | +4 |
Body mass index, kg/m2† | 0.68 (0.45–0.91) | +1 |
Blood pressure, mm Hg‡ | 0.73 (0.27–1.20) | +1 |
Snoring§ | ||
no or unknown | ref | 0 |
yes | 2.02 (0.84–3.21) | +4 |
Screening score was assigned by dividing the β-coefficient by two-fifths of the sum of the smallest and second-smallest coefficients in the model and rounding to the nearest integer.
†Body mass index categories: < 21.0, 21.0–22.9, 23.0–24.9, 25.0–26.9, 27.0–29.9, ≥ 30.
‡Blood pressure categories: systolic blood pressure (SBP) < 140 and diastolic blood pressure (DBP) < 90, SBP 140–159 or DBP 90–99, SBP 160–179 or DBP 100–109, SBP ≥ 180 or DBP ≥ 110.
§Snoring: snoring almost every day or often is considered “yes,” and snoring sometimes or almost never is considered “no.”
Hosmer-Lemeshow statistics, 3.76 (degree of freedom = 7, P = 0.81).
Screening Score
An integer-based score was assigned to each variable according to the β-coefficient estimated in the final model (Table 3), after which the risk score for each subject was calculated. With regard to variables, sex was set at 1 for males and 0 for females; body mass index (< 21.0, 21.0–22.9, 23.0–24.9, 25.0–26.9, 27.0–29.9, ≥ 30) was assigned a value between 1 and 6; blood pressure (systolic blood pressure [SBP] < 140 and diastolic blood pressure [DBP] < 90, SBP 140–159 or DBP 90–99, SBP 160–179 or DBP 100–109, SBP ≥ 180 or DBP ≥ 110) was assigned a value between 1 and 4; and snoring was assigned 1 for a response of snoring almost every day or often, and 0 for snoring sometimes, almost never, or unknown. Overall risk for each participant was calculated by multiplying each of these values by the score assigned to the variable (Table 3) and adding the resulting component scores, with missing predictor variables replaced by a zero. The range of the overall risk score was from 2 to 18. The mean risk score in the derivation dataset was 11.1 (SD = 4.2). In the derivation dataset, a positive linear trend was observed between risk score quartile and prevalence of SDB; the prevalence of moderate-to-severe SDB was 2.7% for quartile 1 (with a corresponding score of ≤ 7), 31.3% for quartile 2 (score: 8 to 11), 63.2% for quartile 3 (score: 12 to 14), and 91.0% for quartile 4 (score: ≥ 15) (χ2 test for trend, P < 0.001) (Figure 2). We observed similar trends for risk of both moderate-to-severe SDB and severe SDB in the validation dataset (χ2 test for trend, P < 0.001 and P < 0.001, respectively).
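For readers who wish to apply the rule, the calculation can be sketched in Python as follows. The function and parameter names are hypothetical. The points per variable follow Table 3 (4 for male sex, 4 for frequent snoring, 1 per BMI category, 1 per blood pressure category), and the lowest blood pressure category is taken to require both SBP < 140 and DBP < 90, as in Table 2. Missing values contribute zero, as stated above.

```python
from typing import Optional


def bmi_category(bmi: Optional[float]) -> int:
    """Return 1-6 for the six BMI bands, or 0 if BMI is missing."""
    if bmi is None:
        return 0
    lower_bounds = [21.0, 23.0, 25.0, 27.0, 30.0]
    return 1 + sum(bmi >= bound for bound in lower_bounds)


def bp_category(sbp: Optional[float], dbp: Optional[float]) -> int:
    """Return 1-4 for the four blood pressure bands, or 0 if missing."""
    if sbp is None or dbp is None:
        return 0
    if sbp >= 180 or dbp >= 110:
        return 4
    if sbp >= 160 or dbp >= 100:
        return 3
    if sbp >= 140 or dbp >= 90:
        return 2
    return 1


def screening_score(male: Optional[bool], bmi: Optional[float],
                    sbp: Optional[float], dbp: Optional[float],
                    snores_often: Optional[bool]) -> int:
    """Sum the component scores; unknown sex or snoring counts as 0."""
    sex_points = 4 if male else 0
    snoring_points = 4 if snores_often else 0
    return sex_points + bmi_category(bmi) + bp_category(sbp, dbp) + snoring_points


# Hypothetical subject: man, BMI 27.5, BP 150/85 mm Hg, snores often.
example = screening_score(True, 27.5, 150, 85, True)  # 4 + 5 + 2 + 4
```

The minimum and maximum values returned for complete data are 2 and 18, matching the range of the overall risk score reported above.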
Test Performance According to Derivation Data
Table 4 presents the model performance indices. In the derivation set, the area under the ROC curve was 0.90 (95% confidence interval [95% CI]: 0.87 to 0.94) for the logistic regression model (n = 305), and 0.90 (95% CI: 0.87 to 0.93) for the screening score (n = 307; the value of missing data was zero). In the derivation dataset, the area under the ROC curve was higher for females (0.90 [n = 104]) than males (0.82 [n = 203]), and the value for community residents in the same dataset was 0.83 (n = 175). The area under the ROC curve of BMI was 0.82. BMI alone showed a sensitivity of 0.79 and a specificity of 0.73 when a cutoff point of 25 was used. A large percentage of positive results were attained at this cutoff point (52.1%), and 31 subjects (10.1%) were underdiagnosed.
Table 4.
Index, estimate (95% CI) | Derivation indices† (n = 307): SDB (n = 149) | Validation indices‡ (n = 308): moderate-to-severe SDB (n = 69) | Validation indices‡ (n = 308): severe SDB (n = 21) |
---|---|---|---|
Area under ROC curve | 0.900 (0.867–0.933) | 0.781 (0.718–0.844) | 0.845 (0.773–0.916) |
Cutoff, ≥ 9 | n = 219 (71.3%) | n = 206 (66.9%) | |
Sensitivity | 0.987 (0.952–0.998) | 0.913 (0.820–0.967) | 1.000 (0.839–1.000) |
Specificity | 0.544 (0.463–0.624) | 0.402 (0.339–0.467) | 0.355 (0.300–0.414) |
LR+ / LR− | 2.165 / 0.025 | 1.526 / 0.217 | 1.551 / 0.000 |
PTP+ / PTP− | 0.671 / 0.023 | 0.306 / 0.059 | 0.102 / 0.000 |
Cutoff, ≥ 11 | n = 193 (62.9%) | n = 132 (42.9%) | |
Sensitivity | 0.933 (0.880–0.967) | 0.739 (0.619–0.837) | 0.952 (0.762–0.999) |
Specificity | 0.658 (0.579–0.732) | 0.661 (0.597–0.721) | 0.610 (0.551–0.667) |
LR+ / LR− | 2.730 / 0.102 | 2.181 / 0.395 | 2.441 / 0.078 |
PTP+ / PTP− | 0.720 / 0.088 | 0.386 / 0.102 | 0.152 / 0.006 |
Cutoff, ≥ 14 | n = 109 (35.5%) | n = 37 (12.0%) | |
Sensitivity | 0.651 (0.569–0.727) | 0.333 (0.224–0.457) | 0.571 (0.340–0.782) |
Specificity | 0.924 (0.871–0.960) | 0.941 (0.904–0.968) | 0.913 (0.874–0.943) |
LR+ / LR− | 8.572 / 0.378 | 5.691 / 0.708 | 6.560 / 0.470 |
PTP+ / PTP− | 0.890 / 0.263 | 0.622 / 0.170 | 0.324 / 0.033 |
Strategy 1** | |||
Sensitivity | NA | 0.826 (0.716–0.907) | 1.000 (0.839–1.000) |
Specificity | NA | 0.916 (0.874–0.948) | 0.805 (0.754–0.849) |
LR+ / LR− | NA | 9.872 / 0.190 | 5.125 / 0.000 |
PTP+ / PTP− | NA | 0.740 / 0.052 | 0.273 / 0.000 |
Strategy 2** | |||
Sensitivity | NA | 0.710 (0.588–0.813) | 0.952 (0.762–0.999) |
Specificity | NA | 0.925 (0.884–0.955) | 0.836 (0.788–0.877) |
LR+ / LR− | NA | 9.429 / 0.314 | 5.816 / 0.057 |
PTP+ / PTP− | NA | 0.731 / 0.083 | 0.299 / 0.004 |
*SDB = sleep-disordered breathing; ROC = receiver operating characteristic; LR = likelihood ratio; PTP + = post-test probability of positive result; PTP − = post-test probability of negative result.
†In derivation, SDB was defined as an oxygen desaturation index (ODI) ≥ 10.
‡Moderate-to-severe and severe SDB were defined as a respiratory disturbance index ≥ 15 and ≥ 30, respectively.
**Strategy 1 was defined with an upper cutoff of 14 and lower cutoff of 9. Strategy 2 was defined with an upper cutoff of 14 and lower cutoff of 11. Subjects underwent polysomnography (PSG) or continuous positive airway pressure (CPAP) titration if the score was at or above the upper cutoff, and SDB was ruled out if the score was below the lower cutoff. The intermediate group underwent pulse oximetry, with PSG or CPAP titration if ODI ≥ 10, and no further testing if ODI < 10.
Model Validation
In the internal validation exercise, the discriminative ability of the model was maintained, with an area under the ROC curve of 0.88 for the logistic model and 0.88 for the screening score. In the external validation set (n = 308), the area under the ROC curve was 0.78 (95% CI: 0.72 to 0.84) for moderate-to-severe SDB and 0.85 (95% CI: 0.77 to 0.92) for severe SDB. When SDB was defined as in model development, the area under the ROC curve (0.76 [95% CI: 0.69 to 0.82]) was similar to that for moderate-to-severe SDB. The sensitivity and specificity values at a cutoff point of 11 were 0.73 and 0.67, respectively. The predicted and observed prevalence proportions in the validation dataset were in close agreement across the entire spectrum of screening scores (Figure 2).
In the workplace sample of the validation dataset for moderate-to-severe SDB, the respective post-test probabilities of positive and negative results were 0.39 and 0.10 using a cutoff point of 11; 0.31 and 0.06 using a cutoff point of 9; and 0.62 and 0.17 using a cutoff point of 14. The sensitivity and specificity values for severe SDB were similar to those for moderate-to-severe SDB (Table 4).
When 3 criteria with 2 cutoff points (9 and 14) were applied, the strategy showed a sensitivity of 0.83 (95% CI: 0.72 to 0.91), specificity of 0.92 (95% CI: 0.87 to 0.95), and positive likelihood ratio of 9.87 for moderate-to-severe SDB. Application of the 3 criteria with the cutoff points of 11 and 14 produced a similar specificity but lower sensitivity. For severe SDB, both strategies showed high sensitivity and specificity (Table 4).
The post-test probabilities were affected by the prevalence of SDB in the target population, which is equivalent to the pre-test probability. Figure 3 shows the change in the post-test probability of the positive and the negative tests, based on pre-test probability.
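The dependence of post-test probability on pre-test probability follows from the odds form of Bayes' theorem: post-test odds equal pre-test odds multiplied by the likelihood ratio. The sketch below is a minimal illustration of that standard relationship; the function name `post_test_probability` is hypothetical, and the example inputs are the prevalence of moderate-to-severe SDB in the validation dataset (22.4%) and the likelihood ratios reported in Table 4 for a cutoff of ≥ 11.

```python
def post_test_probability(pre_test: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability.

    Assumes 0 <= pre_test < 1 and a non-negative likelihood ratio.
    """
    pre_odds = pre_test / (1 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Validation dataset, moderate-to-severe SDB, cutoff >= 11 (Table 4).
ptp_positive = post_test_probability(0.224, 2.181)
ptp_negative = post_test_probability(0.224, 0.395)
```

Both values agree, to within rounding, with the post-test probabilities listed in Table 4 for this cutoff. Substituting a different prevalence for 0.224 shows how the same screening score would perform in a population with a higher or lower burden of SDB.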
Comparison of Our Screening Score with Other Tools
For SACS, the areas under the ROC curve for moderate-to-severe SDB (0.82 [95% CI: 0.75 to 0.89]) and severe SDB (0.86 [95% CI: 0.79 to 0.93]) were not significantly different from those of the screening score (n = 209; P = 0.786 and P = 0.833, respectively). For moderate-to-severe SDB, SACS had a sensitivity of 0.81, a specificity of 0.72, and a positive likelihood ratio of 2.91 when its cutoff point was 7. For severe SDB, SACS had a sensitivity of 0.88, specificity of 0.65, and positive likelihood ratio of 2.49 at the same cutoff point.
For MAP, the areas under the ROC curve for moderate-to-severe SDB (0.79 [95% CI: 0.73 to 0.85]) and severe SDB (0.86 [95% CI: 0.79 to 0.93]) were not significantly different from those of the screening score (n = 302; P = 0.600 and P = 0.571, respectively). For severe SDB, MAP had a sensitivity of 0.80, specificity of 0.72, and positive likelihood ratio of 2.86 when its cutoff point was 0.30. When its cutoff point was 0.55, MAP had a sensitivity of 0.35, specificity of 0.96, and positive likelihood ratio of 8.97.
DISCUSSION
In this study, we developed and validated a markedly simple screening tool for SDB consisting of only four variables: sex, BMI, blood pressure, and self-reported snoring. Three of these four variables (snoring excluded) are routinely checked during mass screenings or during daily clinical practice. Because it requires only the simple addition of a question on snoring to the regular subject record during periodic health checkups, this screening score is thus suitable for use in mass screening in occupational and community settings, and also in primary care settings. The screening score performed well during the derivation process, as well as in internal and external validation, and may be used to identify patients with asymptomatic and undiagnosed SDB.
Screening Tool Variables
This screening tool included the clinically important variables sex, BMI, blood pressure, and snoring. Correlations between these variables and SDB have been reported previously,43 and the variables are commonly used for clinical diagnosis.15 BMI proved to be the most powerful indicator, with an area under the ROC curve value of 0.82. This value is similar to that reported by Gurubhagavatula et al., who reported an area under the ROC curve for BMI of 0.80. Although the BMI cutoff point of 32.7 indicated high sensitivity and specificity in their study population, 25.0 was the appropriate cutoff point in our derivation dataset. These results suggest that the suitable cutoff point differs between Asians and Caucasians.
Although a model using BMI alone presents a simple, attractive alternative to current models, we chose a model combining BMI with other variables in multivariable prediction over a screening model using BMI alone for two reasons. First, screening using BMI alone fails to efficiently reduce the number of subjects requiring subsequent examination. Screening by BMI alone at a cutoff point of 25 retained a large percentage of positive results (52.1%) versus the strategy using two cutoff points of the screening score; yet 31 subjects (10.1%) were still underdiagnosed. Second, screening using BMI alone presents the possibility of underdiagnosis in non-obese SDB patients. In our validation study, 22 (31.9%) of the moderate-to-severe SDB patients were not obese. Additional variables are most important in the screening of non-obese subjects,41 and we expect our screening score will have incremental value in populations with a lower prevalence of obesity.
Daytime sleepiness is a common symptom of SDB. The ESS for subjective daytime sleepiness assessment did not correlate with SDB severity in our validation study population.14 We speculate that SDB subjects who do not consult a doctor have fewer subjective symptoms. For this reason, subjective symptoms may be unsuitable predictors of SDB in public health settings.
Application of Screening Score
The most important goal of screening is to effectively and efficiently identify subjects requiring subsequent examination, such as pulse oximetry, PSG, and possibly CPAP titration. Although a single positive criterion of this screening score is not sufficient, effective screening may be provided by the use of three criteria with two cutoff points.
When a positive SDB criterion of ≥ 11 was applied to our workplace sample in the validation data set, positive test results indicating the need for subsequent screening were obtained for 42.9% of employees. Generally, a cutoff point of 14 was found useful; the rate of positive results at this cutoff was low (12.0%), yet underdiagnosis was only 14.9%. In contrast, a cutoff of 9 was useful for ruling out SDB, but positive test results reached as high as 66.9%. Among the disadvantages inherent to screening programs is the inability of a screening score to reduce the number of subjects requiring subsequent examination. Here also, we were unable to diagnose SDB with a single score alone, and therefore propose the use of three criteria with two cutoff points. When cutoff points of 11 and 14 were used, only 30.8% of all target subjects were identified for pulse oximetry testing, of whom 12.0% would undergo PSG. Underdiagnosis occurred in 6.5% of subjects, but only one of these had severe SDB. Moreover, while overdiagnosis occurred in 5.8% of subjects, 72.2% of these had mild SDB.
Users will be able to optimize or customize this screening score to match the purpose of the study, either to maximize specificity or to prioritize efficiency, according to resource allocations in the target community. Here, we present two representative strategies using two different cutoff points tailored to these two objectives. A cutoff point of 9 was useful in ruling out SDB, and thus was established as the lower of two cutoff points with three criteria. In contrast, a cutoff point of 11 was potentially useful in preventing too many subjects from requiring further testing. An upper cutoff point between 14 and 15 was appropriate, because the high-risk group (those subjects above the upper cutoff point) would then undergo PSG or CPAP titration. Increasing the upper limit for the high-risk group is inefficient and unnecessary. When cutoff points of 8 and 14 were used, the model performance indices were almost the same as those achieved at cutoff points of 9 and 14 (data not shown). Further, when cutoff points of 10 and 14 were used, the model performance indices were almost the same as those achieved at 11 and 14 (data not shown), but the number of subjects requiring pulse oximetry increased. Moreover, when the upper limit was set at 15, the model performance indices were almost the same as those achieved at an upper limit of 14 (data not shown).
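As an illustration only, the three-criteria strategy with two cutoff points can be sketched in Python as follows. The function and variable names are hypothetical, the example scores are invented rather than taken from the study data, and treating a score equal to the upper cutoff as high risk is an assumption, since the text specifies ≥ only for the cutoff of 11.

```python
from collections import Counter


def triage(score, lower=11, upper=14):
    """Map a screening score to one of three follow-up categories.

    Assumption: both cutoffs are applied as "score >= cutoff".
    """
    if lower >= upper:
        raise ValueError("lower cutoff must be below upper cutoff")
    if score >= upper:
        return "high"          # candidate for PSG
    if score >= lower:
        return "intermediate"  # candidate for pulse oximetry
    return "low"               # no further testing


def triage_summary(scores, lower=11, upper=14):
    """Return the fraction of subjects falling in each category."""
    counts = Counter(triage(s, lower, upper) for s in scores)
    n = len(scores)
    return {k: counts.get(k, 0) / n for k in ("low", "intermediate", "high")}


# Hypothetical scores, for illustration only (not study data).
example_scores = [5, 9, 10, 11, 12, 13, 14, 16]

print(triage_summary(example_scores))            # cutoffs 11 and 14
print(triage_summary(example_scores, lower=9))   # cutoffs 9 and 14
```

Lowering the lower cutoff from 11 to 9 moves subjects from the low-risk group into the intermediate group, which mirrors the trade-off described above between ruling out SDB and limiting the number of subjects sent for pulse oximetry.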
Pulse oximetry is commonly used to screen for SDB. In the present study, oximetry testing in all subjects using an ODI cutoff point of 10 would result in 24.0% undergoing subsequent PSG testing, 3.6% being overdiagnosed, and only 1.9% of subjects being underdiagnosed. Combination screening using our screening score method and pulse oximetry testing may help decrease unnecessary testing over time.
Previous Studies and Comparison of Other Tools
To our knowledge, our study is the first to evaluate the test performance of different screening tools in an Asian population. The discriminatory power of our screening tool was not significantly different from that of MAP or SACS, and thus these two tools, which were developed without regard to ethnicity, may be applicable to Asian populations with lower BMIs. Nevertheless, we believe that our screening score may be more suitable because it consists of only four variables, all of which can be easily measured in both clinical and public health settings. For example, MAP asks questions about severe events, such as experiencing the cessation of breathing or struggling for breath during sleep, to which subjects may not return accurate responses, especially in an occupational setting. These drawbacks may also apply to subjective sleepiness.41 SACS requires information from the subject's partner, but this can be difficult to obtain. Indeed, we could use the data from only 209 subjects (67.9%) in our validation population because of incomplete information from the subjects' partners. SACS also requires neck circumference, making it subject to measurement error.44 Our screening score system may be more adaptable than SACS in community and/or occupational settings, whereas a primary care physician may easily obtain information from a patient's partner.
Our results suggest that different cutoff points should be used when MAP is applied to Asian populations. Originally, a MAP score of ≥ 0.55 was used as the threshold, yielding a sensitivity of 0.81 and specificity of 0.73.41 When this cutoff point was used in our study, however, sensitivity was less than 0.35; it improved when the cutoff point was changed to 0.30. Although MAP can be applied to Asian populations, the cutoff point needs to be reconsidered.
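To make the effect of the threshold concrete, the following Python sketch computes sensitivity and specificity at two cutoff points. The scores and SDB labels are invented for illustration and are not study data, and the function name is hypothetical. The sketch shows only the general pattern that lowering the cutoff raises sensitivity at the cost of specificity.

```python
def sensitivity_specificity(scores, has_sdb, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' counts as positive."""
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_sdb):
        positive = score >= cutoff
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased subject")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical prediction scores and reference-standard labels (not study data).
map_scores = [0.10, 0.20, 0.25, 0.35, 0.60, 0.32, 0.40, 0.45, 0.58, 0.70]
has_sdb = [False, False, False, False, False, True, True, True, True, True]

for cutoff in (0.55, 0.30):
    sens, spec = sensitivity_specificity(map_scores, has_sdb, cutoff)
    print(f"cutoff {cutoff:.2f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```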
The STOP questionnaire was developed as an obstructive sleep apnea screening tool for surgical patients.11 The Berlin questionnaire, which was developed for use in primary care settings, offers high sensitivity and specificity for identifying patients with an RDI ≥ 5.12 In the interest of public health, we decided to identify subjects with moderate-to-severe SDB, given the importance of preventing secondary disease and traffic accidents caused by SDB. Although many SDB subjects do not suffer from symptoms of concomitant diseases like diabetes, moderate-to-severe SDB, especially severe SDB, increases the risk of cardiovascular events and motor vehicle accidents.43 Thus our screening score system may be more efficient in mass screening than other tools in occupational and community settings.
This study has several limitations. First, we could not perform PSG on subjects in this study. Second, the source population included two different groups, the hospital visit group and the local resident group, and thus the model may not reflect the background information of SDB subjects who are less symptomatic and have not sought medical attention. Finally, participants of the validation study were employees of a particular company, which may further narrow the generalizability of our findings. Although the screening score was not externally validated for females and community residents, internal validation showed good model discrimination. Moreover, analyses limited to female subjects and to community residents showed no decrease in discrimination for the screening score in the derivation dataset, suggesting that the score may be adapted to females and community residents. Further studies are needed to externally validate the screening score in community-based surveys. Additionally, further studies are needed to determine the most appropriate cutoff point for this score, to establish a screening strategy that includes pulse oximetry and PSG, and to decide whether to enact a screening program in a given community.
In conclusion, we developed and validated a simple tool for identifying moderate-to-severe SDB. This tool consists of only four variables, all of which are easily ascertained (sex, blood pressure, BMI, and self-reported snoring). Its diagnostic performance was similar to that of tools that are more complex and less reliable. With knowledge of these four variables only, moderate-to-severe SDB can be easily detected in clinical and public health settings.
DISCLOSURE STATEMENT
This was not an industry supported study. The authors have indicated no financial conflicts of interest.
ACKNOWLEDGMENTS
This work was supported by grants-in-aid from the Outcomes Research of Intractable Disease Group, the Respiratory Failure Research Group, the Japanese Ministry of Health, Labor and Welfare, Special Coordination Funds for Promoting Science and Technology, research grants from PRESTO JST, the Suzuken Memorial Foundation, Takeda Science Foundation, Mitsui Life Social Welfare Foundation, Chiyoda Kenko Kaihatsu Jigyodan Foundation, and Health Science Center Foundation. We are grateful to the participants, their families, and their employers.
We are grateful to Dr. Itsunari Minami, Ms. Yuriko Nakayama-Ashida, Dr. Kensuke Sumi, Dr. Takaya Nakamura, Dr. Tomoko Wakamura, Ms. Sachiko Horita, and Dr. Yasunori Oka for their help in this study. We also wish to thank Dr. Shin Yamazaki and Dr. Satoshi Morita for their suggestions on our analysis.
Funding: This study was supported by grants from the Ministry of Health, Labor and Welfare, Japan.
REFERENCES
- 1. Kim J, In K, Kim J, et al. Prevalence of sleep-disordered breathing in middle-aged Korean men and women. Am J Respir Crit Care Med. 2004;170:1108–13. doi: 10.1164/rccm.200404-519OC.
- 2. Young T, Palta M, Dempsey J, Skatrud J, Weber S, Badr S. The occurrence of sleep-disordered breathing among middle-aged adults. N Engl J Med. 1993;328:1230–5. doi: 10.1056/NEJM199304293281704.
- 3. Shamsuzzaman AS, Gersh BJ, Somers VK. Obstructive sleep apnea: implications for cardiac and vascular disease. JAMA. 2003;290:1906–14. doi: 10.1001/jama.290.14.1906.
- 4. Cooperative Group Burgos-Santander. Teran-Santos J, Jimenez-Gomez A, Cordero-Guevara J. The association between sleep apnea and the risk of traffic accidents. N Engl J Med. 1999;340:847–51. doi: 10.1056/NEJM199903183401104.
- 5. Ulfberg J, Carter N, Edling C. Sleep-disordered breathing and occupational accidents. Scand J Work Environ Health. 2000;26:237–42. doi: 10.5271/sjweh.537.
- 6. Young T, Evans L, Finn L, Palta M. Estimation of the clinically diagnosed proportion of sleep apnea syndrome in middle-aged men and women. Sleep. 1997;20:705–6. doi: 10.1093/sleep/20.9.705.
- 7. Jenkinson C, Davies RJ, Mullins R, Stradling JR. Comparison of therapeutic and subtherapeutic nasal continuous positive airway pressure for obstructive sleep apnoea: a randomised prospective parallel trial. Lancet. 1999;353:2100–5. doi: 10.1016/S0140-6736(98)10532-9.
- 8. An evidence review cosponsored by the American Academy of Sleep Medicine, the American College of Chest Physicians, and the American Thoracic Society. Flemons WW, Littner MR, Rowley JA, et al. Home diagnosis of sleep apnea: a systematic review of the literature. Chest. 2003;124:1543–79. doi: 10.1378/chest.124.4.1543.
- 9. Flemons WW, Whitelaw WA, Brant R, Remmers JE. Likelihood ratios for a sleep apnea clinical prediction rule. Am J Respir Crit Care Med. 1994;150:1279–85. doi: 10.1164/ajrccm.150.5.7952553.
- 10. Maislin G, Pack AI, Kribbs NB, et al. A survey screen for prediction of apnea. Sleep. 1995;18:158–66. doi: 10.1093/sleep/18.3.158.
- 11. Chung F, Yegneswaran B, Liao P, et al. STOP questionnaire: a tool to screen patients for obstructive sleep apnea. Anesthesiology. 2008;108:812–21. doi: 10.1097/ALN.0b013e31816d83e4.
- 12. Netzer NC, Stoohs RA, Netzer CM, Clark K, Strohl KP. Using the Berlin Questionnaire to identify patients at risk for the sleep apnea syndrome. Ann Intern Med. 1999;131:485–91. doi: 10.7326/0003-4819-131-7-199910050-00002.
- 13. Yamazaki S, Sokejima S, Nitta H, Nakayama T, Fukuhara S. Living close to automobile traffic and quality of life in Japan: a population-based survey. Int J Environ Health Res. 2005;15:1–9. doi: 10.1080/09603120400018709.
- 14. Nakayama-Ashida Y, Takegami M, Chin K, et al. Sleep-disordered breathing in the usual lifestyle setting as detected with home monitoring in a Japanese male working population. Sleep. 2008;31:419–25. doi: 10.1093/sleep/31.3.419.
- 15. Scottish Intercollegiate Guidelines Network. A national clinical guideline. Edinburgh: Scottish Intercollegiate Guidelines Network; 2003. Management of obstructive sleep apnoea/hypopnoea syndrome in adults.
- 16. Chiner E, Signes-Costa J, Arriero JM, Marco J, Fuentes I, Sergado A. Nocturnal oximetry for the diagnosis of the sleep apnoea hypopnoea syndrome: a method to reduce the number of polysomnographies? Thorax. 1999;54:968–71. doi: 10.1136/thx.54.11.968.
- 17. Series F, Marc I, Cormier Y, La Forge J. Utility of nocturnal home oximetry for case finding in patients with suspected sleep apnea hypopnea syndrome. Ann Intern Med. 1993;119:449–53. doi: 10.7326/0003-4819-119-6-199309150-00001.
- 18. Chesson AL Jr, Berry RB, Pack A. Practice parameters for the use of portable monitoring devices in the investigation of suspected obstructive sleep apnea in adults. Sleep. 2003;26:907–13. doi: 10.1093/sleep/26.7.907.
- 19. Kushida CA, Littner MR, Morgenthaler T, et al. Practice parameters for the indications for polysomnography and related procedures: an update for 2005. Sleep. 2005;28:499–521. doi: 10.1093/sleep/28.4.499.
- 20. Pack AI, Maislin G, Staley B, et al. Impaired performance in commercial drivers: role of sleep apnea and short sleep duration. Am J Respir Crit Care Med. 2006;174:446–54. doi: 10.1164/rccm.200408-1146OC.
- 21. Littner M, Kushida CA, Anderson WM, et al. Practice parameters for the role of actigraphy in the study of sleep and circadian rhythms: an update for 2002. Sleep. 2003;26:337–41. doi: 10.1093/sleep/26.3.337.
- 22. Johns MW. A new method for measuring daytime sleepiness: the Epworth sleepiness scale. Sleep. 1991;14:540–5. doi: 10.1093/sleep/14.6.540.
- 23. Johns MW. Reliability and factor analysis of the Epworth Sleepiness Scale. Sleep. 1992;15:376–81. doi: 10.1093/sleep/15.4.376.
- 24. Ware JE Jr, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care. 1992;30:473–83.
- 25. Fukuhara S, Bito S, Green J, Hsiao A, Kurokawa K. Translation, adaptation, and validation of the SF-36 Health Survey for use in Japan. J Clin Epidemiol. 1998;51:1037–44. doi: 10.1016/s0895-4356(98)00095-x.
- 26. Hu FB, Willett WC, Li T, Stampfer MJ, Colditz GA, Manson JE. Adiposity as compared with physical activity in predicting mortality among women. N Engl J Med. 2004;351:2694–703. doi: 10.1056/NEJMoa042135.
- 27. Whitworth JA. 2003 World Health Organization (WHO)/International Society of Hypertension (ISH) statement on management of hypertension. J Hypertens. 2003;21:1983–92. doi: 10.1097/00004872-200311000-00002.
- 28. Harrell FE Jr, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15:361–87. doi: 10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4.
- 29. Lee KL, Woodlief LH, Topol EJ, et al. Predictors of 30-day mortality in the era of reperfusion for acute myocardial infarction. Results from an international trial of 41,021 patients. GUSTO-I Investigators. Circulation. 1995;91:1659–68. doi: 10.1161/01.cir.91.6.1659.
- 30. Hosmer DW, Lemeshow S. Applied Logistic Regression (Second Edition). New York: John Wiley & Sons; 2000.
- 31. Steyerberg EW, Eijkemans MJ, Harrell FE Jr, Habbema JD. Prognostic modelling with logistic regression analysis: a comparison of selection and estimation methods in small data sets. Stat Med. 2000;19:1059–79. doi: 10.1002/(sici)1097-0258(20000430)19:8<1059::aid-sim412>3.0.co;2-0.
- 32. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29–36. doi: 10.1148/radiology.143.1.7063747.
- 33. Moons KG, Harrell FE, Steyerberg EW. Should scoring rules be based on odds ratios or regression coefficients? J Clin Epidemiol. 2002;55:1054–5. doi: 10.1016/s0895-4356(02)00453-5.
- 34. Tu JV, Naylor CD. Clinical prediction rules. J Clin Epidemiol. 1997;50:743–4. doi: 10.1016/s0895-4356(97)89028-2.
- 35. Sullivan LM, Massaro JM, D'Agostino RB Sr. Presentation of multivariate data for clinical use: The Framingham Study risk score functions. Stat Med. 2004;23:1631–60. doi: 10.1002/sim.1742.
- 36. Wilson PW, D'Agostino RB, Levy D, Belanger AM, Silbershatz H, Kannel WB. Prediction of coronary heart disease using risk factor categories. Circulation. 1998;97:1837–47. doi: 10.1161/01.cir.97.18.1837.
- 37. Arana E, Delicado P, Marti-Bonmati L. Validation procedures in radiologic diagnostic models. Neural network and logistic regression. Invest Radiol. 1999;34:636–42. doi: 10.1097/00004424-199910000-00005.
- 38. Scott JA. Pulmonary perfusion patterns and pulmonary arterial pressure. Radiology. 2002;224:513–8. doi: 10.1148/radiol.2242011353.
- 39. Wasson JH, Sox HC, Neff RK, Goldman L. Clinical prediction rules. Applications and methodological standards. N Engl J Med. 1985;313:793–9. doi: 10.1056/NEJM198509263131306.
- 40. Stone M. Cross-validatory choice and assessment of statistical predictions. J R Stat Soc B. 1974;36:111–47.
- 41. Gurubhagavatula I, Maislin G, Nkwuo JE, Pack AI. Occupational screening for obstructive sleep apnea in commercial drivers. Am J Respir Crit Care Med. 2004;170:371–6. doi: 10.1164/rccm.200307-968OC.
- 42. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–45.
- 43. Young T, Peppard PE, Gottlieb DJ. Epidemiology of obstructive sleep apnea: a population health perspective. Am J Respir Crit Care Med. 2002;165:1217–39. doi: 10.1164/rccm.2109080.
- 44. Ulijaszek SJ, Kerr DA. Anthropometric measurement error and the assessment of nutritional status. Br J Nutr. 1999;82:165–77. doi: 10.1017/s0007114599001348.