Abstract
Epidemiologists are well aware of the negative consequences of measurement error in exposure and outcome variables for their ability to detect putative causal associations. However, empirical proof that remedying the misclassification problem improves estimates of epidemiologic effect is seldom examined in detail. Of all areas in cancer epidemiology, perhaps the best example of the consequences of misclassification and of the steps taken to circumvent them was the pursuit, beginning in the mid-1980s, of the human papillomavirus (HPV) infection–cervical cancer association. The stakes were high: Had the wrong conclusions been reached, epidemiologists would have been led astray in the search for competing hypotheses for the sexually transmissible agent causing cervical cancer or in ascribing to HPV infection a mere ancillary role among many lifestyle, hormonal, and environmental factors. The article by Castle et al. in this issue of the Journal (Am J Epidemiol. 2010;171(2):155–163) provides a detailed account of the joint influences of improved HPV and cervical precancer measurements in gradually unveiling the strong magnitude of the underlying association between viral exposure and cervical lesion risk. In this commentary, the authors extend the findings of Castle et al. by providing additional empirical evidence in support of their arguments.
Keywords: cytology, measurement error, misclassification, papillomavirus infections, uterine cervical neoplasms, vaginal smears
Far better an approximate answer to the right question … than an exact answer to the wrong question.
John Tukey. Ann Math Stat 1962;33(1):13.
It is better to be vaguely right than exactly wrong.
Carveth Read, Logic, Deductive and Inductive. London, England: Grant Richards; 1898:272.
The 2008 Nobel Prize in Physiology or Medicine was awarded to Harald zur Hausen for his pioneer role in establishing the causal link between sexually transmissible human papillomavirus (HPV) infection and cervical cancer. His discovery paved the way for 2 promising disease prevention fronts: vaccination to prevent infection with the HPV genotypes that are responsible for the largest etiologic fraction of cervical cancer (HPV-16 and HPV-18) and the inclusion of HPV DNA tests in cervical cancer screening programs to augment traditional Papanicolaou cytology. Dr. zur Hausen's work nicely complemented earlier epidemiologic observations that cervical cancer was associated with sexual activity and, beginning in the early 1980s, it inspired a succession of molecular epidemiologic investigations that attempted to demonstrate that HPV infection was the true intermediate endpoint in the causal pathway from sexual activity to cervical cancer and its precursor lesions (1).
As shown in Figure 1, the epidemiology toolbox could be used to verify the connecting relations (the causal arrows in the diagram) between sexual activity and HPV infection (e.g., cross-sectional surveys of risk factor information and HPV prevalence) and between HPV infection and cervical cancer and its precursors (e.g., case-control and cohort studies). Demonstrating the remote distal connection between sexual activity and cervical cancer was no longer a pressing need in the early 1980s. However, a third and important objective was to demonstrate that this relation was mediated by HPV infection (2, 3). Accomplishing these objectives was a major stepping stone toward the aforementioned preventive strategies. The stakes were high: Had the wrong conclusions been reached, the entire field would have been led astray in the pursuit of competing hypotheses for the putative sexually transmissible agent causing cervical cancer (e.g., herpes simplex virus) or in ascribing to HPV infection a merely ancillary role among many lifestyle, hormonal, and environmental factors. A delay in adopting the right strategies, or the implementation of the wrong ones, would have had negative public health consequences. Fast forward to the middle of this decade and we see that the epidemiologic aims have been fulfilled. The right questions were asked, but the evolution in molecular methods for HPV detection and in our understanding of the boundaries of disease definition led to an interesting process of discovery, the elements of which are just as useful as the final knowledge that served as evidence for the new prevention initiatives.
Figure 1.
Simplified etiologic model for cervical cancer showing the role of distal (sexual activity) and intermediate (HPV infection) variables and the ancillary role of cofactors. HPV, human papillomavirus.
EVIDENCE THAT IMPROVED MEASUREMENTS UNVEIL AN EPIDEMIOLOGIC ASSOCIATION
In this issue of the Journal, Castle et al. (4) provide empirical evidence for the epidemiologic tenet that random, nondifferential measurement error in the exposure or in the outcome leads to a bias toward the null. Although in the case of HPV this is easy to demonstrate via modeling (3, 5), one rarely has the opportunity to observe that the reverse is also true; that is, using more accurate methods of exposure and disease ascertainment makes a previously tenuous association increase in magnitude, thus providing credibility for the underlying causal mechanism. The HPV–cervical cancer story is a magnificent example of the dangers that lurk behind misclassification, worth being told in textbooks of intermediate or advanced epidemiology. It underscores the premise that epidemiologists will do well if they incorporate in their studies the best available measurement methods. This will be particularly rewarding in a multidisciplinary environment that fosters mutual understanding of the limits of such methods in a process that continuously attempts to improve the accuracy and relevance of measurement tools vis-à-vis the underlying causal mechanisms. Indeed, the article byline (4) shows seasoned epidemiologists and microbiologists who epitomize the concept of relentless multidisciplinary work toward an improved understanding of HPV and cervical cancer biology and diagnosis. The fact that they are seasoned is the likely reason why they decided to conduct their meticulous analysis of the joint impact of improved ascertainment of HPV and cervical lesions in unveiling the true magnitude of the association. The odds ratio estimates for the least misclassified combinations are what will ultimately matter for clinicians and policymakers; a competent novice team would have glossed over the nuances and gone straight to the computation using the least misclassified variables.
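The direction and size of this bias can be illustrated with a short calculation. The sketch below is not taken from Castle et al. (4) or from the modeling in references 3 and 5; it assumes a single binary exposure measured with the same sensitivity and specificity in cases and controls, and the function name and every input value are hypothetical.

```python
def observed_odds_ratio(true_or, prev_controls, sensitivity, specificity):
    """Odds ratio expected after nondifferential exposure misclassification.

    true_or       -- assumed true exposure-disease odds ratio
    prev_controls -- assumed true exposure prevalence among controls
    sensitivity, specificity -- assumed accuracy of the exposure assay,
                                identical in cases and controls
    """
    # True exposure prevalence among cases implied by the true odds ratio.
    odds_controls = prev_controls / (1.0 - prev_controls)
    odds_cases = true_or * odds_controls
    prev_cases = odds_cases / (1.0 + odds_cases)

    def apparent(prev):
        # Test positives are true positives plus false positives.
        return sensitivity * prev + (1.0 - specificity) * (1.0 - prev)

    p_cases = apparent(prev_cases)
    p_controls = apparent(prev_controls)
    return (p_cases / (1.0 - p_cases)) / (p_controls / (1.0 - p_controls))


# Hypothetical inputs: true OR of 50, 10% true prevalence in controls.
imperfect = observed_odds_ratio(50.0, 0.10, sensitivity=0.70, specificity=0.90)
perfect = observed_odds_ratio(50.0, 0.10, sensitivity=1.0, specificity=1.0)
print(f"imperfect assay: {imperfect:.1f}; perfect assay: {perfect:.1f}")
```

With these illustrative inputs, a true odds ratio of 50 is attenuated to roughly 8, and restoring perfect sensitivity and specificity recovers the true value, which is the "reverse" pattern described above.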
The study that served as the source of data for Castle et al. (4), the so-called Atypical Squamous Cells of Undetermined Significance/Low-Grade Squamous Intraepithelial Lesions Triage Study (ALTS), is a prominent investigation whose findings (6, 7) have revolutionized clinical guidelines for cervical cancer screening and management in the United States and internationally. Conceived in the mid-1990s and based on technologies of that time, ALTS continuously revisited HPV and disease measurements as new technologies emerged and the intermediate endpoints in the natural history of HPV infection and cervical neoplasia became more clearly understood (8). The fact that it was an investigation that originally intended to solve the clinical dilemma of managing patients with equivocal and low-grade cervical cytologic abnormalities in Papanicolaou tests is prudently given as a caveat by Castle et al. (4). For the purposes of their article, however, the ALTS investigation is probably the ideal source of data because of the high cumulative risk of cervical cancer precursors seen in women with these cervical smear abnormalities, which afforded the authors the statistical precision needed to reveal credible dose-response–like trends in the associations as the putative misclassification decreased in both exposure and disease variables.
In brief, their study focused on the second connecting causal arrow in Figure 1, showing unequivocally that the magnitude of the HPV–cervical lesion association increased in response to joint improvements in the credibility of exposure to oncogenic HPVs and in the stringency in grade and consensus requirement for the inherently subjective lesion diagnoses. Strictly speaking, Castle et al. (4) did not examine obsolete molecular methods of HPV DNA detection. Except for the inclusion of the prototype assay (line blot assay) that led to an improved version now commercially available (linear array assay), the HPV detection methods being compared can be considered state-of-the-art and differ only with respect to core technology (signal or target amplification) and intended purpose, that is, cervical cancer screening and clinical triage (Hybrid Capture 2 (Qiagen Corporation, Gaithersburg, Maryland) and AMPLICOR HPV test (Roche Molecular Systems, Inc., Pleasanton, California)) or highly sensitive detection and genotyping of cellular specimens in epidemiologic studies (line blot and linear array assays).
Unfortunately, Castle et al. (4) made only a passing remark about the impact of improved classification of the intermediate variable (HPV) on the first connecting causal arrow in Figure 1. They observed that the number of positive HPV tests was strongly associated with the lifetime number of sexual partners reported by the women, consistent with what was expected. The latter variable is an accepted marker of sexual activity that is consistently predictive of cervical cancer risk in epidemiologic studies (9).
OTHER EMPIRICAL EVIDENCE FOR THE ROLE OF IMPROVED CLASSIFICATION OF HPV AND CERVICAL LESIONS
To extend the observations of Castle et al. (4), we examined in an ecologic analysis-type scenario whether there was any evidence that improvements in HPV detection technology over time led to increases in the magnitude of the association with cervical cancer. Figure 2 shows a plot of odds ratios and 95% confidence intervals for this relation derived mostly from case-control studies (10–21). Studies are ordered by year of publication, which underscores the transition from nonamplified hybridization techniques to detect HPV DNA, prevailing in the 1980s, to the new era of amplified target detection via polymerase chain reaction (PCR) protocols. The graph shows that the magnitude of the association increased substantially, from 2- to 5-fold risk increases in the early studies to triple digits in the most recent investigations. Interestingly, even in the PCR era, there were lessons to be learned (Figure 2); the first PCR protocols were not as accurate as today's proven methods (such as the linear array assay in Castle et al. (4)). Moreover, studies based on early PCR methods were not as mindful of the required specimen handling and laboratory precautions to avoid intersample contamination. The latter problem was responsible in the early 1990s for erroneous findings of high HPV prevalence among asymptomatic women (22).
Figure 2.
Odds ratios and 95% confidence intervals for the association between human papillomavirus (HPV) infection (via HPV DNA detection) and invasive cervical cancer risk in successive molecular epidemiologic studies (mostly case-control) (from top to bottom, references 10–21). CI, confidence interval; NAH, nonamplified hybridization; PCR, polymerase chain reaction.
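For readers who wish to see how each point in a plot of this kind is obtained, the following minimal sketch computes a crude odds ratio and its 95% confidence interval from a 2 × 2 case-control table by using the Woolf (log-based) method. It is an assumption that this simple method is adequate for illustration; the original studies (10–21) may have used adjusted or exact estimates, and the function name and cell counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    a -- HPV-positive cases      b -- HPV-negative cases
    c -- HPV-positive controls   d -- HPV-negative controls
    z -- normal quantile (1.96 for a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        # The log-based standard error is undefined with an empty cell.
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from any of the cited studies.
estimate, lower, upper = odds_ratio_ci(a=90, b=10, c=15, d=85)
print(f"OR = {estimate:.1f} (95% CI: {lower:.1f}, {upper:.1f})")
```

Because the interval is symmetric on the log scale, very large odds ratios based on few HPV-negative cases carry wide upper limits, which is one reason such estimates are usually plotted on a logarithmic axis.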
The above analysis focused on HPV misclassification but not on disease ascertainment error. Because the studies in Figure 2 refer to invasive cervical cancer and cervical lesion-free controls, it is plausible to assume that cancer cases would not have been misclassified as controls (the opposite is not necessarily true but each of the studies used eligibility criteria to minimize this possibility). Invasive cervical carcinoma is an irreversible condition with clear histopathologic definition. As such, it is unlikely to suffer from the vagaries of subjective interpretation and transience of the distinct grades of precancerous cervical lesion diagnoses that were so meticulously examined in the ALTS data by Castle et al. (4).
To examine the effect of lesion misclassification, we used preliminary data from the Ludwig-McGill cohort study, a repeated-measurement molecular investigation of the natural history of HPV infection and cervical neoplasia in a low-income neighborhood in São Paulo, Brazil (23, 24). Cervical specimens collected every 4–6 months were tested for HPV DNA by using a PCR assay, and cervical smears were initially read locally at a community cytology laboratory and then shipped to Montreal for blind reading at a Canadian-accredited cytopathology laboratory. Figure 3 shows the Kaplan-Meier cumulative plots of any-grade, incident squamous intraepithelial lesions (SILs) in the cohort as a function of HPV positivity at enrollment. Data on cytologic outcomes were based on smears read until August 1997, and the analysis included only women who were free from SILs at entry and had been tested for HPV (n = 887). The top graph shows the analysis with the local cytology reading, and the bottom graph shows the reanalysis based on the results from the Montreal laboratory. The curves are largely overlapping using the local cytology result, but a clear prognostic effect for HPV positivity appeared when we used the improved lesion classification observed with the Montreal laboratory readings. The Cox model hazard ratios for any-grade SILs given HPV positivity at entry were 1.4 (95% confidence interval: 0.6, 3.3) and 5.8 (95% confidence interval: 3.0, 11.1) for the above 2 outcome classifications, respectively (23). In view of these results, we made arrangements for retraining the cytotechnicians responsible for the local cervical smear readings to correct the problem.
Figure 3.
Kaplan-Meier plots of the cumulative incidence of any-grade squamous intraepithelial lesions (SILs), based on Papanicolaou cytology results received up to August 1997 (n = 887), according to human papillomavirus (HPV) DNA positivity at enrollment in the Ludwig-McGill cohort study. Solid line, HPV negative; broken line, HPV positive. Top graph: analysis based on Papanicolaou diagnoses from the local cytology laboratory. Bottom graph: analysis based on review Papanicolaou diagnoses. Adapted with permission from the Pan American Health Organization (PAHO) (23).
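The contrast between the 2 cytology readings is what one would expect from nondifferential outcome misclassification. The sketch below is a simplified illustration only: it uses cumulative risks rather than the time-to-event analysis reported above, it assumes that cytology errors do not depend on HPV status, and the risks, sensitivity, and specificity are hypothetical values that were not estimated from the Ludwig-McGill data.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Apparent lesion risk when the outcome is read with error."""
    # Detected true lesions plus false-positive readings among the lesion-free.
    return sensitivity * true_risk + (1.0 - specificity) * (1.0 - true_risk)


def observed_risk_ratio(risk_exposed, risk_unexposed, sensitivity, specificity):
    """Risk ratio expected under nondifferential outcome misclassification."""
    return (observed_risk(risk_exposed, sensitivity, specificity)
            / observed_risk(risk_unexposed, sensitivity, specificity))


# Hypothetical cumulative SIL risks: 20% if HPV positive, 4% if HPV negative.
true_rr = observed_risk_ratio(0.20, 0.04, sensitivity=1.0, specificity=1.0)
poor_rr = observed_risk_ratio(0.20, 0.04, sensitivity=0.60, specificity=0.85)
print(f"accurate reading: {true_rr:.1f}; error-prone reading: {poor_rr:.1f}")
```

Under these assumed values, most of the attenuation comes from imperfect specificity: when lesions are uncommon, false-positive readings among the many lesion-free women dilute both risk groups toward the same apparent risk.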
Although the above example emphasizes the importance of minimizing disease misclassification in cervical cancer studies, it must be interpreted in light of the caveat that the cytologic diagnosis is a poor surrogate for the actual lesion status in the uterine cervix. One needs only to examine the contrast in the ALTS data between HPV and cytology results as per their trend of association with increasingly “harder” lesion endpoints, such as endpoint 4 in Table 4 of Castle et al. (4), to appreciate this fact. Yet, compared with our Brazilian data, the cytologic diagnoses in the ALTS are as reliable as one can have in North America, akin in quality to the ones based on reading our Brazilian subjects’ smears in Montreal. However, it is the level of cytology quality seen in the local community cytopathology laboratory in Brazil that is representative of the wider picture of cervical cancer prevention in developing countries. Therefore, efforts to increase cytology screening coverage of such high-risk populations are a costly and frivolous undertaking if cytology quality cannot improve dramatically. In light of recent findings using new screening technologies in developing countries, this state of affairs can be substantially remedied by shifting the screening paradigm from cytology to oncogenic HPV detection using affordable methods (25).
EMBARRASSMENT OF RICHES
Can the empirical illustration of the advantages of improved classification in HPV and disease variables in Castle et al. (4) be extended to other areas of cancer etiology? To be sure, there have been major advances in biomarker measurements for molecular epidemiologic investigations of precancerous lesions for other types of cancer. What cannot be easily reproduced is the knowledge derived from studies such as ALTS, in which the cumulative, repeated-sampling design helped the investigators to minimize to the extent possible the lesion outcome misclassification that would have occurred had they done the same exercise with a case-control or cross-sectional study. Moreover, epidemiologists facing the challenges of studying the natural history of precancers or early invasive cancers of the breast, prostate, ovary, and lung do not have the luxury of dealing with an organ site as accessible for sampling as the uterine cervix. Blind biopsies, needle aspiration, and imaging techniques are no match for the exquisite level of detail in disease ascertainment that cervical cancer epidemiologists are accustomed to having and that Castle et al. (4) took to a new extreme. At present, cancer epidemiologists in other areas will find comfort in the realization that the HPV–cervical lesion example is the best justification for their pursuit of continuously improving the measurement of their study variables.
Acknowledgments
Author affiliations: Departments of Epidemiology and Biostatistics and of Oncology, McGill University, Montreal, Quebec, Canada (Eduardo L. Franco, Joseph Tota).
Funding for the Ludwig-McGill cohort study was provided by grants from the US National Institutes of Health (CA70269) and the Canadian Institutes of Health Research (MA-13647, MOP-4936).
Conflict of interest: none declared.
Glossary
Abbreviations
- ALTS: Atypical Squamous Cells of Undetermined Significance/Low-Grade Squamous Intraepithelial Lesions Triage Study
- HPV: human papillomavirus
- PCR: polymerase chain reaction
- SIL: squamous intraepithelial lesion
References
- 1. Franco EL, Olsen J, Saracci R, et al. Epidemiology's contributions to a Nobel Prize recognition. Epidemiology. 2009;20(5):632–634. doi: 10.1097/EDE.0b013e3181afede2.
- 2. Schiffman MH, Bauer HM, Hoover RN, et al. Epidemiologic evidence showing that human papillomavirus infection causes most cervical intraepithelial neoplasia. J Natl Cancer Inst. 1993;85(12):958–964. doi: 10.1093/jnci/85.12.958.
- 3. Franco EL. The sexually transmitted disease model for cervical cancer: incoherent epidemiologic findings and the role of misclassification of human papillomavirus infection. Epidemiology. 1991;2(2):98–106.
- 4. Castle PE, Schiffman M, Wheeler CM, et al. Impact of improved classification on the association of human papillomavirus with cervical precancer. Am J Epidemiol. 2010;171(2):155–163. doi: 10.1093/aje/kwp390.
- 5. Franco EL. Statistical issues in human papillomavirus testing and screening. Clin Lab Med. 2000;20(2):345–367.
- 6. Human papillomavirus testing for triage of women with cytologic evidence of low-grade squamous intraepithelial lesions: baseline data from a randomized trial. The Atypical Squamous Cells of Undetermined Significance/Low-Grade Squamous Intraepithelial Lesions Triage Study (ALTS) Group. J Natl Cancer Inst. 2000;92(5):397–402. doi: 10.1093/jnci/92.5.397.
- 7. Solomon D, Schiffman M, Tarone R, et al. Comparison of three management strategies for patients with atypical squamous cells of undetermined significance: baseline results from a randomized trial. J Natl Cancer Inst. 2001;93(4):293–299. doi: 10.1093/jnci/93.4.293.
- 8. Wright TC, Jr, Schiffman M. Adding a test for human papillomavirus DNA to cervical-cancer screening. N Engl J Med. 2003;348(6):489–490. doi: 10.1056/NEJMp020178.
- 9. International Collaboration of Epidemiological Studies of Cervical Cancer. Cervical carcinoma and sexual behavior: collaborative reanalysis of individual data on 15,461 women with cervical carcinoma and 29,164 women without cervical carcinoma from 21 epidemiological studies. Cancer Epidemiol Biomarkers Prev. 2009;18(4):1060–1069. doi: 10.1158/1055-9965.EPI-08-1186.
- 10. Meanwell CA, Cox MF, Blackledge G, et al. HPV 16 DNA in normal and malignant cervical epithelium: implications for the aetiology and behaviour of cervical neoplasia. Lancet. 1987;1(8535):703–707. doi: 10.1016/s0140-6736(87)90353-9.
- 11. Reeves WC, Brinton LA, García M, et al. Human papillomavirus infection and cervical cancer in Latin America. N Engl J Med. 1989;320(22):1437–1441. doi: 10.1056/NEJM198906013202201.
- 12. Donnan SP, Wong FW, Ho SC, et al. Reproductive and sexual risk factors and human papilloma virus infection in cervical cancer among Hong Kong Chinese. Int J Epidemiol. 1989;18(1):32–36. doi: 10.1093/ije/18.1.32.
- 13. Peng HQ, Liu SL, Mann V, et al. Human papillomavirus types 16 and 33, herpes simplex virus type 2 and other risk factors for cervical cancer in Sichuan Province, China. Int J Cancer. 1991;47(5):711–716. doi: 10.1002/ijc.2910470515.
- 14. Kanetsky PA, Mandelblatt J, Richart R, et al. Risk factors for cervical cancer in a black elderly population: preliminary findings. Ethn Dis. 1992;2(4):337–342.
- 15. Muñoz N, Bosch FX, de Sanjosé S, et al. The causal link between human papillomavirus and invasive cervical cancer: a population-based case-control study in Colombia and Spain. Int J Cancer. 1992;52(5):743–749. doi: 10.1002/ijc.2910520513.
- 16. Shen CY, Ho MS, Chang SF, et al. High rate of concurrent genital infections with human cytomegalovirus and human papillomaviruses in cervical cancer patients. J Infect Dis. 1993;168(2):449–452. doi: 10.1093/infdis/168.2.449.
- 17. Eluf-Neto J, Booth M, Muñoz N, et al. Human papillomavirus and invasive cervical cancer in Brazil. Br J Cancer. 1994;69(1):114–119. doi: 10.1038/bjc.1994.18.
- 18. Asato T, Nakajima Y, Nagamine M, et al. Correlation between the progression of cervical dysplasia and the prevalence of human papillomavirus. J Infect Dis. 1994;169(4):940–941. doi: 10.1093/infdis/169.4.940.
- 19. Herrero R, Hildesheim A, Bratti C, et al. Population-based study of human papillomavirus infection and cervical neoplasia in rural Costa Rica. J Natl Cancer Inst. 2000;92(6):464–474. doi: 10.1093/jnci/92.6.464.
- 20. Thomas DB, Qin Q, Kuypers J, et al. Human papillomaviruses and cervical cancer in Bangkok. II. Risk factors for in situ and invasive squamous cell cervical carcinomas. Am J Epidemiol. 2001;153(8):732–739. doi: 10.1093/aje/153.8.732.
- 21. Muñoz N, Bosch FX, de Sanjosé S, et al. Epidemiologic classification of human papillomavirus types associated with cervical cancer. N Engl J Med. 2003;348(6):518–527. doi: 10.1056/NEJMoa021641.
- 22. Tidy JA, Parry GC, Ward P, et al. High rate of human papillomavirus type 16 infection in cytologically normal cervices [letter]. Lancet. 1989;1(8635):434. doi: 10.1016/s0140-6736(89)90023-8.
- 23. Franco E, Villa L, Rohan T, et al. Design and methods of the Ludwig-McGill longitudinal study of the natural history of human papillomavirus infection and cervical neoplasia in Brazil. Ludwig-McGill Study Group. Rev Panam Salud Publica. 1999;6(4):223–233. doi: 10.1590/s1020-49891999000900001.
- 24. Schlecht NF, Kulaga S, Robitaille J, et al. Persistent human papillomavirus infection as a predictor of cervical intraepithelial neoplasia. JAMA. 2001;286(24):3106–3114. doi: 10.1001/jama.286.24.3106.
- 25. Sankaranarayanan R, Nene BM, Shastri SS, et al. HPV screening for cervical cancer in rural India. N Engl J Med. 2009;360(14):1385–1394. doi: 10.1056/NEJMoa0808516.



