We appreciate the cautious optimism offered in the two letters responding to our editorial. Our criticism of self-reported sleep duration in epidemiology is not new (references 2–5 in our editorial [1]). In fact, the recent AASM consensus statement [2] regarding healthy sleep duration (NB: consensus is the lowest form of evidence) notes the limitations of self-reporting in detail, and specifically anticipates “opportunities for accurate, reliable, scalable objective sleep duration assessment in large epidemiological studies.”
While we concede that epidemiology publications may not intend to inform individualized medical decisions, the reality is that causal inferences about sleep duration reach the public and the clinic. The AASM consensus statement uses causal language (to “promote” health) and describes the process of inferring causality for short (but not long) sleep duration leading to adverse health outcomes, while also recognizing that causality remains unproven, stating that future studies of sleep extension should “address whether sleep plays a causal role in health …”. In parallel, both public and academic sources repeat the claim that modern society suffers from an increasing and epidemic loss of sleep, a claim that has been refuted [3,4]. In the Dysfunctional Beliefs and Attitudes About Sleep scale, the #1 belief to be addressed for insomnia patients is “Need 8 h of sleep”, a sobering reminder of how information can become misinterpreted to the point of perpetuating insomnia. We can imagine a reverse risk, which has not yet been studied: adults with no symptoms who are sleeping “a healthy amount” might conclude they need not investigate sleep any further, even if they are in a high-risk group for sleep apnea, for instance. The same kind of risk-benefit balance we apply to diagnostic or therapeutic decisions should extend to research information that could be misunderstood, particularly in light of how quickly new claims travel from academia to mainstream media.
Both response letters interpreted our editorial as critical of all self-reporting. Although this was neither the letter nor the spirit of our piece, we have previously written about the inferential limitations of the STOP-Bang and Epworth [5,6], despite both being described as “validated”, because any test interpretation always requires three ingredients (sensitivity, specificity, and prior probability) [7]. The cited review of the STOP-Bang tool reports higher post-test risk after a negative screen in at-risk populations than the baseline general population risk [8], a reminder of paradoxes and challenges facing sleep apnea screening, which are independent of the self-reporting components. We have also published that symptoms contributed no additional information to predictive models of sleep apnea risk beyond what can be extracted from the medical record [9]. We do not consider this finding a criticism of symptoms; instead, it highlights the potential for automated clinical decision support.
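The three-ingredient point can be made concrete with a short calculation. The following is a minimal sketch of Bayes' theorem applied to a screening result; the sensitivity, specificity, and prior probabilities are hypothetical values chosen only for illustration and are not taken from the cited STOP-Bang review [8].

```python
def post_test_probability(prior, sensitivity, specificity, positive):
    """Probability of disease after a screening result, via Bayes' theorem."""
    if positive:
        p_result_if_disease = sensitivity          # true-positive rate
        p_result_if_healthy = 1.0 - specificity    # false-positive rate
    else:
        p_result_if_disease = 1.0 - sensitivity    # false-negative rate
        p_result_if_healthy = specificity          # true-negative rate
    numerator = prior * p_result_if_disease
    return numerator / (numerator + (1.0 - prior) * p_result_if_healthy)


# Hypothetical inputs (assumptions for illustration, not published estimates).
SENSITIVITY = 0.90
SPECIFICITY = 0.40
GENERAL_PRIOR = 0.10    # assumed baseline risk in a general population
HIGH_RISK_PRIOR = 0.60  # assumed baseline risk in an at-risk clinic population

# Residual risk after a NEGATIVE screen in the at-risk population.
residual_risk = post_test_probability(
    HIGH_RISK_PRIOR, SENSITIVITY, SPECIFICITY, positive=False
)

print(f"Risk after negative screen (at-risk group): {residual_risk:.3f}")
print(f"Baseline risk (general population):        {GENERAL_PRIOR:.3f}")
```

Under these assumed values, a negative screen in the at-risk group still leaves a risk above the general-population baseline, which is the kind of paradox described above. Different inputs would give different results; the point is only that the same questionnaire result cannot be interpreted without the prior probability.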
Regarding the Sleep Heart Health Study (SHHS), Silva et al. [10] summarized the many confounds of self-reported duration in their publication, and reported a minuscule relationship between subjective and objective sleep duration (correlation of 0.16), a reminder that subjective duration is not simply a systematic deviation from objective duration. Leaving aside that other work failed to find an association with hypertension [11], another analysis of the SHHS showed R-values <0.05 between PSG features and systolic blood pressure [12], a reminder of the challenges of correlation inference even with objective metrics such as sleep apnea severity [13]. Regardless, even if we assume short self-reported duration is in fact capturing a causal and actionable predictor of hypertension, can we imagine an interventional trial to improve (objectively measured) blood pressure by extending self-reported sleep alone, without confirming the sleep changes objectively?
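To give a sense of scale for the correlations quoted above, the sketch below converts a correlation coefficient into the share of variance it accounts for under a simple linear model (the square of the coefficient). It assumes the reported values are Pearson-type correlations; the function name is ours and purely illustrative.

```python
def variance_explained(r):
    """Fraction of variance shared between two variables with correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


# Correlation between self-reported and measured sleep duration, as quoted above [10].
subjective_vs_objective = variance_explained(0.16)

# Upper bound implied by R-values below 0.05 for PSG features vs blood pressure [12].
psg_vs_blood_pressure_bound = variance_explained(0.05)

print(f"Shared variance at r = 0.16: {subjective_vs_objective:.2%}")
print(f"Shared variance at r = 0.05: {psg_vs_blood_pressure_bound:.2%}")
```

On this reading, a correlation of 0.16 corresponds to under 3% of shared variance, and correlations below 0.05 to well under 1%.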
We take little solace in the example of smoking and lung cancer, in part because correlational findings in cross-sectional and epidemiological studies more commonly do not turn out to be causal, a reminder of the basic post hoc ergo propter hoc fallacy. Even in this smoking example, recent work showed that objective measures of smoking exposure (epigenetics) predicted morbidity and mortality even after adjustment for self-reported smoking [14].
We agree that at-home objective monitors require improvement, and have described the inferential and validation risks regarding consumer devices specifically [15]. We can articulate the steps required to overcome these limitations at hardware and software levels. However, we have difficulty imagining an approach to improve self-reported duration queries that would not benefit from concurrent objective measurement. Notably, the same authors cited [16] as evidence for concurrence between self-report and wrist actigraphy (though arguably wide variation: ~30 ± 78 min in n = 100 adults) themselves more recently emphasized both the advantages and the real-world implementation of objective monitoring in large active prospective cohorts [17].
Objective measurement is obligatory if we are claiming to assess an aspect of health that we already agree is objectively measurable, especially when self-report carries multiform uncertainties and when the causal inferences from self-report carry potential risk. No such claims exist for nausea or headache, which is why it would make no sense to argue for (non-existent) objective data. Again, we did not criticize self-report in general. An example was offered that quality of life scales predict outcomes in other settings – if that is clinically useful information, then so much the better. However, when the measurement (here, subjective sleep duration) presents a natural (but unproven) causal interpretation to the public and to physicians alike, and when forming beliefs and taking actions based on that interpretation carries potential risk, we should insist on a higher bar. For the example of mortality risk, even if we ignore the variable findings of self-reported sleep duration (mixture of no association, inverse relation, or U-shaped relation [18]), we should not underestimate the dual narrative errors of conflating subjective with objective sleep duration and conflating correlation with causation.
From a purely biological standpoint, when sleep is abnormal (presumably of interest to the epidemiologist, clinician, and patient alike), increased variability is likely for night-to-night (and also longer time scale) sleep duration, as well as other dimensions of sleep. We agree that a single night objective measure would fail to capture variability by definition. However, a single subjective estimate represents a kind of low-pass filter, which obscures inferences in non-systematic ways. Temporal variability in objective measures, as well as in perceptions of sleep, is of phenotypic and clinical interest. Moreover, the subjective–objective discordance, which may have state and trait properties, is itself of phenotypic and clinical interest [19].
The argument that asking about duration is more feasible from a cost perspective strikes us as a concerning form of compromise. We do not apply such reasoning in other settings. Asking about sleep latency of naps, or about witnessed apneas in sleep, would also be quick, affordable, and simple. Would replacing the MSLT (as implied by the cited article [20]) and respiration monitoring with questionnaires represent an advancement in care or science?
In summary, we submit that for any hypothesis regarding sleep duration, the scientific and clinical impact will necessarily be greater if objective measures are used. After decades of self-reported sleep duration epidemiology, we propose that the main novelty in this arena came from the addition of objective data [21], and we look with hope toward further exciting developments. We should promote and advance concurrent objective sleep measures, not in criticism of self-report per se, but rather because we can and should do better for the field and for our patients.
Footnotes
Conflict of interest
The ICMJE uniform disclosure form for potential conflicts of interest associated with this article can be viewed by clicking on the following link: http://dx.doi.org/10.1016/j.sleep.2017.07.017.
Contributor Information
Matt T. Bianchi, Neurology Department, Massachusetts General Hospital, Wang 7, 55 Fruit Street, Boston, MA 02114, USA; Division of Sleep Medicine, Harvard Medical School, Boston, MA 02115, USA.
Robert J. Thomas, Division of Sleep Medicine, Harvard Medical School, Boston, MA 02115, USA; Division of Pulmonary, Critical Care & Sleep, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA.
M. Brandon Westover, Neurology Department, Massachusetts General Hospital, Wang 7, 55 Fruit Street, Boston, MA 02114, USA.
References
- [1] Bianchi MT, Thomas RJ, Westover MB. An open request to epidemiologists: please stop querying self-reported sleep duration. Sleep Med Jul 2017;35:92–3.
- [2] Consensus Conference Panel, Watson NF, Badr MS, et al. Joint Consensus Statement of the American Academy of Sleep Medicine and Sleep Research Society on the recommended amount of sleep for a healthy adult: methodology and discussion. J Clin Sleep Med Off Publ Am Acad Sleep Med Aug 15, 2015;11(8):931–52.
- [3] Youngstedt SD, Goff EE, Reynolds AM, et al. Has adult sleep duration declined over the last 50+ years? Sleep Med Rev Aug 2016;28:69–85.
- [4] Horne J. Sleeplessness: assessing sleep need in society today. Palgrave Macmillan; 2016.
- [5] Eiseman NA, Westover MB, Mietus JE, et al. Classification algorithms for predicting sleepiness and sleep apnea severity. J Sleep Res Feb 2012;21(1):101–12.
- [6] Bianchi MT. Screening for obstructive sleep apnea: Bayes weighs in. Open Sleep J 2009;2:56–9.
- [7] Bianchi MT. Bayes' theorem and the rule of 100: a commentary on ‘Validity of administrative data for identification of obstructive sleep apnea’. J Sleep Res 2017;26(3):401.
- [8] Nagappa M, Liao P, Wong J, et al. Validation of the STOP-Bang questionnaire as a screening tool for obstructive sleep apnea among different populations: a systematic review and meta-analysis. PLoS ONE 2015;10(12):e0143697.
- [9] Ustun B, Westover MB, Rudin C, et al. Clinical prediction models for sleep apnea: the importance of medical history over symptoms. J Clin Sleep Med Off Publ Am Acad Sleep Med Feb 2016;12(2):161–8.
- [10] Silva GE, Goodwin JL, Sherrill DL, et al. Relationship between reported and measured sleep times: the sleep heart health study (SHHS). J Clin Sleep Med Off Publ Am Acad Sleep Med Oct 15, 2007;3(6):622–30.
- [11] Lima-Costa MF, Peixoto SV, Rocha FL. Usual sleep duration is not associated with hypertension in Brazilian elderly: the Bambui Health Aging Study (BHAS). Sleep Med Oct 2008;9(7):806–7.
- [12] Redline S, Min NI, Shahar E, et al. Polysomnographic predictors of blood pressure and hypertension: is one index best? Sleep Sep 2005;28(9):1122–30.
- [13] Peppard PE. Is obstructive sleep apnea a risk factor for hypertension? – differences between the Wisconsin Sleep Cohort and the Sleep Heart Health Study. J Clin Sleep Med Off Publ Am Acad Sleep Med Oct 15, 2009;5(5):404–5.
- [14] Bojesen SE, Timpson N, Relton C, et al. AHRR (cg05575921) hypomethylation marks smoking behaviour, morbidity and mortality. Thorax Jul 2017;72(7):646–53.
- [15] Russo K, Goparaju B, Bianchi MT. Consumer sleep monitors: is there a baby in the bathwater? Nat Sci Sleep 2015;7:147–57.
- [16] Zinkhan M, Berger K, Hense S, et al. Agreement of different methods for assessing sleep characteristics: a comparison of two actigraphs, wrist and hip placement, and self-report with polysomnography. Sleep Med Sep 2014;15(9):1107–14.
- [17] Zinkhan M, Kantelhardt JW. Sleep assessment in large cohort studies with high-resolution accelerometers. Sleep Med Clin Dec 2016;11(4):469–88.
- [18] Kurina LM, McClintock MK, Chen JH, et al. Sleep duration and all-cause mortality: a critical review of measurement and associations. Ann Epidemiol Jun 2013;23(6):361–70.
- [19] Harvey AG, Tang NK. (Mis)perception of sleep in insomnia: a puzzle and a resolution. Psychol Bull Oct 3, 2012;138(1):77–101.
- [20] Johns MW. Sensitivity and specificity of the multiple sleep latency test (MSLT), the maintenance of wakefulness test and the Epworth sleepiness scale: failure of the MSLT as a gold standard. J Sleep Res Mar 2000;9(1):5–11.
- [21] Vgontzas AN, Fernandez-Mendoza J, Liao D, et al. Insomnia with objective short sleep duration: the most biologically severe phenotype of the disorder. Sleep Med Rev Feb 15, 2013;17(4):241–54.