American Journal of Epidemiology
Letter. 2014 Jul 25;180(6):656–658. doi: 10.1093/aje/kwu184

Re: “Validation of a Method for Reconstructing Historical Rates of Smoking Prevalence”

Dean R Lillard 1,2,3, Rebekka Christopoulou 1, Ana Gil Lacruz 4
PMCID: PMC4157701  PMID: 25063816

In their recent paper, Bilal et al. (1) incorrectly assert that no researchers have assessed the validity of historical smoking prevalence derived from retrospectively reported data. A large body of literature validates this type of data in general (2–4) and smoking data in particular (5–12). In fact, at least 2 previously published studies applied an improved method (8, 9) with better data.

Bilal et al. correct retrospectively derived smoking prevalence rates for differential mortality rates of smokers and nonsmokers and validate their estimates against contemporaneously reported data. However, they validate prevalence rates in just 5 prior years using paired surveys that differ by, at most, 20 years. The earlier studies validate prevalence rates in 11 (8) and 27 (9) prior years using paired surveys that differ by up to 40 years. More importantly, Bilal et al. correct for differential mortality bias with a method developed in 1983 (13), when age, sex, and cause-specific mortality data were unavailable. They ignore the updated method of Christopoulou et al. (9), which exploits these now widely available vital statistics. The new data allowed Christopoulou et al. to relax the assumption that differential mortality rates are constant within broad age categories. Following Peto et al. (14), these authors calculate the number of smoking-attributable deaths for each smoking-related disease by sex, year, and 5-year age category and apply the method of Harris (13) using these data. Thus, they reconstruct historical smoking prevalence for disaggregated demographic groups (by sex and birth cohort) and formally test when, and for which groups, reconstructed prevalence rates deviate from “true” rates.
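The logic of a differential-mortality correction can be sketched in a few lines. This is a minimal illustration, not the procedure of Harris (13) or of Christopoulou et al. (9): it assumes that, for one cohort and calendar year, we know the smoking prevalence observed among survey respondents and the probabilities that a smoker and a nonsmoker alive in that year survive to the interview. The function name and all input values are hypothetical.

```python
def adjust_prevalence(observed_prev, surv_smoker, surv_nonsmoker):
    """Reweight survivor-based prevalence by inverse survival probabilities.

    observed_prev  : share of smokers among survey respondents (0 to 1)
    surv_smoker    : P(survive to interview | smoker), assumed known
    surv_nonsmoker : P(survive to interview | nonsmoker), assumed known
    """
    if not 0.0 <= observed_prev <= 1.0:
        raise ValueError("observed_prev must lie in [0, 1]")
    if not (0.0 < surv_smoker <= 1.0 and 0.0 < surv_nonsmoker <= 1.0):
        raise ValueError("survival probabilities must lie in (0, 1]")
    # Each surviving respondent stands in for 1/survival members
    # of the original cohort in the year being reconstructed.
    smokers = observed_prev / surv_smoker
    nonsmokers = (1.0 - observed_prev) / surv_nonsmoker
    return smokers / (smokers + nonsmokers)


# Hypothetical inputs, not taken from any survey.
young = adjust_prevalence(0.60, surv_smoker=0.95, surv_nonsmoker=0.97)
old = adjust_prevalence(0.60, surv_smoker=0.40, surv_nonsmoker=0.60)
print(f"younger cohort: {young:.3f}, older cohort: {old:.3f}")
```

When the two survival probabilities are equal, the observed prevalence is returned unchanged; the adjustment grows as the survival gap between smokers and nonsmokers widens.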

The more detailed data and method lead to substantively different research and policy implications. Bilal et al. (1) find that the mortality correction affects only the reconstructed smoking prevalence rates of men in the early decades of their study. For all other cases, they find no statistically significant differences. These results and their perception of the correction method's complexity lead Bilal et al. to advise against correcting for differential mortality rates for recent historical periods. Christopoulou et al. (9) showed that whether, when, and how the differential mortality adjustment affects smoking rates depends mostly on how old respondents were when interviewed, not on the time period of study.

To illustrate, we apply the method of Christopoulou et al. (9) to the Spanish National Health Survey data used by Bilal et al. (for details, see the Web material available at http://aje.oxfordjournals.org/). In Figure 1, we plot the adjusted (dashed line) and unadjusted (solid line) smoking prevalence rates for Spanish men and women who were aged A) 60–69 years, B) 70–79 years, or C) 80 or more years in 2007. In Spain, as in other countries treated in the article by Christopoulou et al. (9), the 2 series overlap almost completely for the group of men aged 60–69 years and for all groups of women (whose smoking prevalence rates never exceed 14%). Figure 2A (for men) and Figure 2B (for women) plot, for each year of each cohort's life, the χ² statistic derived from the standard Pearson test of independence for binary variables, which tests whether adjusted and unadjusted prevalence rates differ statistically. As in the study by Christopoulou et al. (9), rates differ statistically only for men who, when surveyed, were 70 or more years of age, and they differ in years quite close to the survey year. Table 1 summarizes the results.
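For readers who want to see the mechanics of such a test, the following is a minimal sketch of a Pearson test of independence on a 2 × 2 table (series by smoker/nonsmoker). It is not the code behind Figure 2: the function names are hypothetical, and the sketch assumes that the adjusted and unadjusted series can each be treated as a sample of the same size n. The inputs are the peak prevalence rates and mean sample sizes reported in Table 1, used only for illustration.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the
    table [[a, b], [c, d]]; the statistic has 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def compare_prevalence(prev_unadjusted, prev_adjusted, n):
    """Build the 2x2 table from two prevalence rates and return the statistic.

    Assumption: both series are treated as samples of n persons each.
    """
    s_un = round(prev_unadjusted * n)
    s_ad = round(prev_adjusted * n)
    return pearson_chi2_2x2(s_un, n - s_un, s_ad, n - s_ad)


CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 df, 5% level

# Peak prevalence (as proportions) and mean sample size from Table 1.
men_80plus = compare_prevalence(0.6527, 0.7421, 3637)
women_80plus = compare_prevalence(0.0232, 0.0232, 3706)
print(men_80plus > CRITICAL_5PCT_DF1, women_80plus > CRITICAL_5PCT_DF1)
```

With these inputs the statistic for men aged 80 or more years exceeds the critical value, whereas for women of the same age it is zero because the two rates coincide. The actual analysis repeats the test for every year of each cohort's life rather than only at the peak.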

Figure 1.

Correction of smoking prevalence for differential mortality rates of men (2 top lines) and women (2 bottom overlapping lines) at ages A) 60–69 years, B) 70–79 years, and C) 80 or more years in 2007. Retrospective smoking data are from the Spanish National Health Survey in 1995, 1997, 2001, 2003, and 2006.

Figure 2.

χ² test of differences between adjusted and unadjusted smoking prevalence rates for A) men and B) women at ages 60–69 years (solid line), 70–79 years (dotted line), and 80 or more years (dashed line) in 2007. The horizontal gray line marks the critical value at the 5% level of statistical significance.

Table 1.

Summary of Indicators of Unadjusted and Adjusted Peak Smoking Prevalence Rates for 3 Cohorts of Spanish Men and Women

| Age in 2007, years, by Sex | Mean Sample Size^a | Peak Unadjusted Prevalence, % | Peak Adjusted Prevalence, % | Difference at Peak Rate | Years With Statistically Significant Differences |
|---|---|---|---|---|---|
| Men | | | | | |
| 60–69 | 2,235 | 67.26 | 68.53 | 1.27 | |
| 70–79 | 3,856 | 67.03 | 70.07 | 3.04 | 1944–1995 |
| ≥80 | 3,637 | 65.27 | 74.21 | 8.94 | 1924–1997 |
| Women | | | | | |
| 60–69 | 4,989 | 13.99 | 14.01 | 0.02 | |
| 70–79 | 5,858 | 5.15 | 5.15 | 0 | |
| ≥80 | 3,706 | 2.32 | 2.32 | 0 | |

^a Mean sample size is the average number of individuals over all available years.

The findings are particularly pertinent to users of the growing number of surveys that interview individuals aged 50 years or more. Aging studies in the US Health and Retirement Study (15), the Survey of Health, Ageing, and Retirement in Europe (16), the English Longitudinal Study of Ageing (17), the Korean Longitudinal Study of Ageing (18), the China Health and Retirement Longitudinal Study (19), and the Longitudinal Aging Study of India (20) ask respondents to retrospectively report their lifetime smoking behaviors. Especially because these samples include many older respondents, researchers need to account for differential smoking-related mortality rates.

More broadly, the findings highlight that the population smoking prevalence rate, which the cigarette epidemic model (21) uses to identify the degree of smoking diffusion in a particular country and year, masks important heterogeneity across cohorts, sexes, and countries. This point, recently acknowledged in an updated “epidemic” model (22), was already made by Christopoulou et al. (9), not only with respect to sex, but also with respect to age.

An interesting contribution of Bilal et al. (1) is that they follow the cigarette epidemic model (21, 22) and estimate the correlation between reconstructed smoking prevalence and future lung cancer mortality rates (as a proxy for smoking-attributable mortality rate). Of course, the underlying structural relationships between smoking and disease are not captured in this reduced-form model, which weakens the policy relevance of the results. However, the more immediate point is that the exercise does not test their article's central claim—whether reconstructed smoking rates are valid.

Bilal et al. (1) also fail to acknowledge a broader body of literature that addresses other types of bias that are likely present in retrospective smoking data. For example, Kenkel et al. (23) treated an indicator of smoking initiation constructed from retrospective data as a standard binary variable measured with error and used an established technique to correct it (24). They showed that some of the bias occurs because people who smoke few cigarettes in early life do not identify themselves as smokers later in life (i.e., light-smoker bias). Further, Bar and Lillard (25) documented the bias that arises when people “heap” their retrospective reports of smoking cessation on units evenly divisible by 5 (i.e., heaping bias) and proposed a method to mitigate it. Of course, more work remains. The correction method described by Bar and Lillard (25) can be extended and enriched to control for factors that systematically predict recall errors. One might also refine the method of Christopoulou et al. (9) by relaxing the assumption that smokers who survive to answer a survey start and stop smoking at the same rate as smokers who die before they are surveyed. These and other issues are at the forefront of research that attempts to validate historical smoking rates generated with retrospective data.
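Heaping of the kind described above can be screened for with a simple ratio before any formal correction is attempted. The sketch below is only a descriptive diagnostic, not the mixture-model approach of Bar and Lillard (25); the function name and the sample reports are hypothetical. It compares the observed share of reports divisible by 5 with the share of one fifth that would be expected if reports were spread evenly.

```python
def heaping_ratio(reports, base=5):
    """Observed share of reports divisible by `base`, divided by the
    1/base share expected if reports were spread evenly.

    A ratio near 1 suggests little heaping; larger values suggest
    that respondents round their answers to multiples of `base`.
    """
    if not reports:
        raise ValueError("reports must not be empty")
    heaped = sum(1 for r in reports if r % base == 0)
    return (heaped / len(reports)) * base


# Hypothetical retrospective reports of years since quitting smoking.
reports = [5, 10, 10, 3, 15, 20, 7, 10, 5, 12]
print(f"heaping ratio: {heaping_ratio(reports):.2f}")
```

A high ratio only signals that heaping may be present; it says nothing about which individual reports are rounded, which is the problem a model-based correction is meant to address.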

Supplementary Material

Web Material

Acknowledgments

This project received funding from the National Institute on Aging (grant 1 R01 AG030379-03).

We thank Dr. M.J. Thun for generously providing the Cancer Prevention Study II (1982–1988) cause-specific mortality data.

Conflict of interest: None declared.

References

  1. Bilal U, Fernández E, Beltran P, et al. Validation of a method for reconstructing historical rates of smoking prevalence. Am J Epidemiol. 2014;179(1):15–19. doi: 10.1093/aje/kwt224.
  2. Berney LR, Blane DB. Collecting retrospective data: accuracy of recall after 50 years judged against historical records. Soc Sci Med. 1997;45(10):1519–1525. doi: 10.1016/s0277-9536(97)00088-9.
  3. Simpura J, Poikolainen K. Accuracy of retrospective measurement of individual alcohol consumption in men: a reinterview after 18 years. J Stud Alcohol. 1983;44(5):911–917. doi: 10.15288/jsa.1983.44.911.
  4. Koenig LB, Jacob T, Haber JR. Validity of the lifetime drinking history: a comparison of retrospective and prospective quantity-frequency measures. J Stud Alcohol Drugs. 2009;70(2):296–303. doi: 10.15288/jsad.2009.70.296.
  5. Shipton D, Tappin DM, Vadiveloo T, et al. Reliability of self reported smoking status by pregnant women for estimating smoking prevalence: a retrospective, cross sectional study. BMJ. 2009;339:b4347. doi: 10.1136/bmj.b4347.
  6. Brigham J, Lessov-Schlaggar CN, Javitz HS, et al. Reliability of adult retrospective recall of lifetime tobacco use. Nicotine Tob Res. 2008;10(2):287–299. doi: 10.1080/14622200701825718.
  7. Brigham J, Lessov-Schlaggar CN, Javitz HS, et al. Validity of recall of tobacco use in two prospective cohorts. Am J Epidemiol. 2010;172(7):828–835. doi: 10.1093/aje/kwq179.
  8. Kenkel D, Lillard DR, Mathios A. Smoke or fire? Are retrospective smoking data valid? Addiction. 2003;98(9):1307–1313. doi: 10.1046/j.1360-0443.2003.00445.x.
  9. Christopoulou R, Han J, Jaber A, et al. Dying for a smoke: How much does differential mortality of smokers affect estimated life-course smoking prevalence? Prev Med. 2011;52(1):66–70. doi: 10.1016/j.ypmed.2010.11.011.
  10. Krall EA, Valadian I, Dwyer JT, et al. Accuracy of recalled smoking data. Am J Public Health. 1989;79(2):200–202. doi: 10.2105/ajph.79.2.200.
  11. Pershagen G, Axelson O. A validation of questionnaire information on occupational exposure and smoking. Scand J Work Environ Health. 1982;8(1):24–28. doi: 10.5271/sjweh.2500.
  12. Connor Gorber S, Schofield-Hurwitz S, Hardt J, et al. The accuracy of self-reported smoking: a systematic review of the relationship between self-reported and cotinine-assessed smoking status. Nicotine Tob Res. 2009;11(1):12–24. doi: 10.1093/ntr/ntn010.
  13. Harris JE. Cigarette smoking among successive birth cohorts of men and women in the United States during 1900–80. J Natl Cancer Inst. 1983;71(3):473–479.
  14. Peto R, Lopez AD, Boreham J, et al. Mortality from tobacco in developed countries: indirect estimation from national vital statistics. Lancet. 1992;339(8804):1268–1278. doi: 10.1016/0140-6736(92)91600-d.
  15. The Regents of the University of Michigan. Health and Retirement Study. http://hrsonline.isr.umich.edu/index.php?p=dbook. Accessed July 7, 2014.
  16. Börsch-Supan A, Brugiavini A, Jürges H, et al., editors. First Results From the Survey of Health, Ageing and Retirement in Europe (2004–2007). Mannheim, Germany: Mannheim Research Institute for the Economics of Aging; 2008.
  17. Banks J, Breeze E, Lessof C, et al., editors. Living in the 21st Century: Older People in England ELSA 2006 (Wave 3). London, United Kingdom: Institute for Fiscal Studies; 2011.
  18. Korean Longitudinal Study of Ageing. About KLoSA. http://www.kli.re.kr/klosa/en/about/introduce.jsp. Accessed July 14, 2014.
  19. Peking University Institute of Social Science Survey. China Health and Retirement Longitudinal Study. http://charls.ccer.edu.cn/en. Accessed July 7, 2014.
  20. Harvard Center for Population and Development Studies. LASI. http://www.hsph.harvard.edu/pgda/LASI/about.html. Accessed July 7, 2014.
  21. Lopez AD, Collishaw NE, Piha T. A descriptive model of the cigarette epidemic in developed countries. Tob Control. 1994;3(3):242–247.
  22. Thun M, Peto R, Boreham J, et al. Stages of the cigarette epidemic on entering its second century. Tob Control. 2012;21(2):96–101. doi: 10.1136/tobaccocontrol-2011-050294.
  23. Kenkel DS, Lillard DR, Mathios AD. Accounting for misclassification error in retrospective smoking data. Health Econ. 2004;13(10):1031–1044. doi: 10.1002/hec.934.
  24. Hausman JA, Abrevaya J, Scott-Morton FM. Misclassification of the dependent variable in a discrete-response setting. J Econom. 1998;87(2):239–269.
  25. Bar HY, Lillard DR. Accounting for heaping in retrospectively reported event data—a mixture-model approach. Stat Med. 2012;31(27):3347–3365. doi: 10.1002/sim.5419.

Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press
