Abstract
High-risk human papillomavirus (HR-HPV) testing is increasingly important. We therefore examined the impact on accuracy of repeated versus one-time testing, type-specific versus pooled detection, and assay analytic sensitivity. By using a nested case-control design from the ASCUS-LSIL Triage Study, we selected women with incident cervical intraepithelial neoplasia grade 2 or grade 3 (CIN2/3; n = 325) and a random sample of women with <CIN2 as controls (n = 401). HPV DNA status was assessed using hybrid capture 2 (HC2), a pooled test for 13 HR-HPV types, and the linear array (LA) and the line blot assay (LBA), two PCR-based HPV genotyping assays, at enrollment and the 6-month follow-up visit. The relative sensitivity and specificity for different permutations of multiple measurements were compared to a single measurement using marginal regression models. We found that repeat detection of any HR-HPV (by HC2, LA, or LBA) and of type-specific persistence (by LA or LBA) were significantly more specific but less sensitive than use of a single time point measurement of any HR-HPV. Sensitivity decreased and specificity increased further when testing intervals were increased from 12 to 24 months. Including detection of borderline carcinogenic/noncarcinogenic HPV types with HR-HPV types increased sensitivity but decreased specificity for repeat measures of HPV. Similar patterns were observed when we used a CIN3 end point. We conclude that assay performance for detecting incident CIN2/3 was affected by which types were included, the analytic sensitivity of the assay, and the testing interval. These trade-offs need to be considered when assessing the potential overall clinical utility of repeated testing for HR-HPV DNA to identify women at risk for CIN2/3.
INTRODUCTION
Cervical infections by any of approximately 13 anogenital high-risk human papillomavirus (HR-HPV) types are generally recognized as the cause of cervical cancer (14, 23). Although the recent introduction of a highly effective prophylactic HPV vaccine has great promise for the prevention of persistent infections and precancerous lesions, cervical cancer screening will still be required because the current vaccines do not protect against all carcinogenic HPV types and do not treat preexisting HPV infections and related disease.
While anogenital HPVs are the most common sexually transmitted infections worldwide, only a small fraction of these infections lead to the development of invasive cancer and its precursor, cervical intraepithelial neoplasia 3 (CIN3) (17). Most HR-HPV infections clear within 1 to 2 years after initial detection; however, persistent detection of HR-HPV for a year or more is a major risk factor for progression to precancer and cancer (3, 10, 21).
HR-HPV DNA detection is more sensitive than Pap smears for detection of CIN2, CIN3, or cancer, and it has a higher negative predictive value (1, 15, 19). HR-HPV DNA testing is currently recommended for triage of cytological diagnoses of atypical squamous cells of undetermined significance (ASCUS), as a cotest with the Pap smear in the general screening of women ≥30 years of age, and for follow-up of women after colposcopy and treatment (25). The high negative predictive value of HR-HPV DNA testing also allows for the safe extension of screening intervals from 1 to 3 years or more (5, 22).
The primary limitation of HR-HPV DNA testing as a primary screening tool is its relatively low specificity, which results from its inability to distinguish between benign, transient HPV infections and those that will persist and possibly develop into cervical precancer. As a result, HPV testing will label more women as screen positive, which can lead to unnecessary and sometimes invasive follow-up procedures. Given the demonstrated role of viral persistence in the relative and absolute risk of CIN3 (3, 11), repeat measurements of HPV DNA over a given interval may improve the specificity of DNA detection by distinguishing persistent infections, which predict a significantly higher risk of CIN3, from transient infections. The goal of this analysis was to evaluate different methods/strategies for the detection of HPV persistence and their effects on identifying women with cervical precancerous lesions in the atypical squamous cells of undetermined significance (ASCUS) and low-grade squamous intraepithelial lesions (LSIL) triage study (ALTS).
MATERIALS AND METHODS
Study design and population.
The ALTS (1997 to 2001) was a multisite, randomized clinical trial that compared three management strategies (immediate colposcopy [IC], HPV triage, and conservative management [CM]) for women referred for ASCUS (n = 3,488) or LSIL (n = 1,572) by conventional cytology (1, 16). (Note that ASCUS under the 1991 Bethesda system was slightly more inclusive, particularly of probable reactive changes and ASC-H [atypical squamous cells for which a high-grade intraepithelial lesion cannot be ruled out], than the ASCUS category of the 2001 Bethesda system.) The National Cancer Institute and local institutional review boards approved the study, and all participants provided written, informed consent.
At enrollment and follow-up visits over the 2-year duration, all women underwent a pelvic examination with collection of two cervical specimens: the first in PreservCyt medium (Hologic, Bedford, MA) for ThinPrep cytology and for HPV testing by Hybrid Capture 2 (HC2; Qiagen, Gaithersburg, MD), and the second in specimen transport medium (STM; Qiagen). Women in all three arms of the study were reevaluated by cytology every 6 months during the 2 years and were sent to colposcopy if cytology showed a high-grade squamous intraepithelial lesion (HSIL). An exit examination with colposcopy was scheduled for all women, regardless of study arm or prior procedures, at the completion of the follow-up. We refer readers to other references for details on randomization, examination procedures, patient management, and laboratory and pathology methods (1, 16).
This nested case-control study included women diagnosed with CIN2/3 by clinical center pathology or CIN3+ by a quality control (QC) pathology group during follow-up (defined as cases), as well as a random sample of women who did not develop CIN2/3 (defined as controls). Women were censored at the visit of CIN2+ diagnosis starting at the 12-month follow-up or at 24 months if they were disease free. Women diagnosed with CIN2/3 at enrollment and the 6 month follow-up were not included in this analysis. HPV DNA data generated by HC2, the linear array (LA), and the line blot assay (LBA) testing methods were utilized from all available study visits for both cases and controls that had at least 2 test results for all three testing methods. A total of 726 women met these criteria and were included in this analysis. While there were no differences in demographic factors among women with CIN2/3 included in this analysis compared to the entire ALTS cohort, women with <CIN2 in the current study were more likely to be ≥25 years of age (60.6% versus 52.4%; P = 0.004) and have a Pap referral diagnosis of ASCUS compared to those women with <CIN2 not included in our analysis (29.2% versus 19.7%; P < 0.001). However, adjustment by these factors in regression models did not significantly alter the relative differences in specificity by testing modality or interval.
HPV DNA testing.
HC2, a DNA test for a pool of 13 carcinogenic HPV types (HPV16, -18, -31, -33, -35, -39, -45, -51, -52, -56, -58, -59, and -68), was performed on residual PreservCyt specimens as previously described (20). STM specimens were tested for 27 HPV types (HPV6, -11, -16, -18, -26, -31, -33, -35, -39, -40, -42, -45, -51 to -59, -66, -68, -73, and -82 to -84) or the 27 types plus an additional 11 types (HPV61, -62, -64, -67, -69, -70 to -72, -81, -82v, and -89) by a PCR assay, LBA (Roche Molecular Systems, Pleasanton, CA), as previously described (2). LA (Roche), which detects 37 types (36 separate genotypes and 1 variant [type 82v]) of the 38 HPV types (excluding HPV57) detected by LBA, was compared to LBA and was found to be more analytically sensitive than LBA, primarily due to the increased amounts of DNA used in LA compared to LBA (2, 7). LA does not directly detect HPV52 but uses a set of probes that detects HPV33, -35, -52, and -58 combined (HPVmix). Specimens that test negative for HPV33, -35, and -58 individually and are positive for the HPVmix are considered HPV52 positive. Specimens that test positive for the HPVmix and for HPV33, -35, and/or -58 have an uncertain HPV52 status. Therefore, all specimens that were positive for the HPVmix were tested further for HPV52 DNA by using a highly sensitive and specific TaqMan real-time PCR assay to (i) confirm detection of HPV52 DNA and (ii) detect possible coinfection of HPV52 with HPV33, -35, and/or -58 (13).
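The HPVmix interpretation rule described above can be written as a small decision function. The following is a minimal sketch only: the function name, argument names, and the use of `None` for an uncertain status are illustrative assumptions, as is treating an HPVmix-negative specimen as HPV52 negative (the text does not state this case explicitly).

```python
from typing import Optional, Set

# Types covered by the HPVmix probe that also have their own individual LA probes.
HPVMIX_INDIVIDUAL_TYPES = frozenset({"33", "35", "58"})


def infer_hpv52_status(hpvmix_positive: bool, individual_positive: Set[str]) -> Optional[bool]:
    """Interpret LA results for HPV52 (hypothetical helper).

    Returns True (considered HPV52 positive), False (assumed negative),
    or None (uncertain; a confirmatory HPV52-specific PCR would be needed).
    """
    if not hpvmix_positive:
        # Assumption: no HPVmix signal means HPV52 was not detected.
        return False
    if individual_positive & HPVMIX_INDIVIDUAL_TYPES:
        # HPVmix signal could be explained by HPV33, -35, and/or -58.
        return None
    # HPVmix positive with HPV33, -35, and -58 all negative.
    return True


# Hypothetical specimens
print(infer_hpv52_status(True, {"16"}))        # HPVmix+, no 33/35/58
print(infer_hpv52_status(True, {"33", "16"}))  # HPVmix+, HPV33+
print(infer_hpv52_status(False, set()))        # HPVmix-
```

In the study, every HPVmix-positive specimen was retested by the HPV52-specific real-time PCR, so the uncertain category was resolved experimentally rather than by rule.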
HPV types 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, and 68 were considered carcinogenic types (18). In addition, a separate category of borderline carcinogenic/noncarcinogenic types was created that included HPV types 53, 66, 67, 70, 73, 82, and 82v, types untargeted but detected by HC2 as a result of cross-reactivity (4).
Pathology and treatment.
Clinical management was based on the clinical center pathologists' cytological and histological diagnoses. In addition, all referral smears, ThinPrep, and histology slides were sent to the Pathology Quality Control Group (QC Pathology) based at the Johns Hopkins Hospital for independent review and diagnosis. CIN2 or worse (CIN2/3) diagnosis based on the clinical center pathology or CIN3+ diagnosis based on the QC pathology review triggered treatment by loop electrosurgical excision procedure (LEEP). In addition, women with persistent LSIL or carcinogenic HPV-positive ASCUS at the time of the exit from the study were offered LEEP.
Statistical analysis.
A previous analysis conducted in the ALTS compared two testing strategies for detecting CIN3+ diagnosed in the second year of ALTS: 12-month repeat detection of HR-HPV DNA initially detected at enrollment as measured by pooled-probe HC2, compared with 12-month type-specific persistence as measured by LBA (6). We previously observed that HC2, a pooled test for 13 carcinogenic HPV genotypes, was more sensitive but less specific for CIN3+ than a positive LBA test for any of 13 carcinogenic HPV genotypes. We also observed that LA was more analytically sensitive than LBA, most likely the result of increased DNA input in the assay. As a consequence of the greater analytical sensitivity of LA (versus LBA), LA was similarly sensitive but less specific for CIN3 compared to HC2 (7). Lastly, 277 women (38.1%) were genotyped for 27 types of HPV using the LBA test at the enrollment visit only. The types initially tested included the HR-HPV types measured by linear array.
Here, we expanded on the previous studies to examine the impacts of different detection methods on the clinical performance of repeat measures of HPV DNA for identification of women with incident CIN2/3. Specifically, we considered the following parameters: (i) single versus repeat testing (baseline detection of pooled HR-HPV by HC2 or LA versus repeat detection at 12 or 24 months); (ii) repeatedly testing positive for any carcinogenic HPV type versus detection of type-specific HPV persistence; (iii) inclusion (versus not) of borderline carcinogenic HPV genotypes at baseline; (iv) 12-month versus 24-month interval; and (v) impact of analytic sensitivity (LA versus LBA).
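To make the distinction between these definitions of positivity concrete, the following minimal sketch classifies one woman's genotyping results at two visits. The function names and the example genotype sets are hypothetical; the type lists are those given in the HPV DNA testing section.

```python
# Carcinogenic (HR-HPV) and borderline types as listed in the Methods.
HR_TYPES = frozenset({"16", "18", "31", "33", "35", "39", "45",
                      "51", "52", "56", "58", "59", "68"})
BORDERLINE_TYPES = frozenset({"53", "66", "67", "70", "73", "82", "82v"})


def single_detection(baseline, targets=HR_TYPES):
    """Positive if any target type is detected at baseline."""
    return bool(set(baseline) & targets)


def repeat_detection(baseline, follow_up, targets=HR_TYPES):
    """Positive if any target type is detected at both visits (types may differ)."""
    return bool(set(baseline) & targets) and bool(set(follow_up) & targets)


def type_specific_persistence(baseline, follow_up, targets=HR_TYPES):
    """Positive only if at least one common target type is detected at both visits."""
    return bool(set(baseline) & set(follow_up) & targets)


# Hypothetical results for one woman at enrollment and at a later visit.
baseline = {"16", "53"}
follow_up = {"31", "53"}
expanded = HR_TYPES | BORDERLINE_TYPES

print(single_detection(baseline))                               # True
print(repeat_detection(baseline, follow_up))                    # True
print(type_specific_persistence(baseline, follow_up))           # False
print(type_specific_persistence(baseline, follow_up, expanded)) # True
```

In this example the woman is repeatedly HR-HPV positive (HPV16, then HPV31) without type-specific persistence of an HR-HPV type, and she becomes type-specific persistent only when the borderline type HPV53 is included in the definition.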
The goal of this analysis was to estimate the sensitivity and specificity of a colposcopic referral threshold of different measures of HPV persistence for the detection of incident CIN2/3 at 12, 18, and 24 months; secondarily, we used CIN3 as our diagnostic end point, excluding CIN2 from the analysis. We compared performance estimates for repeat testing versus the detection of HR-HPV DNA at a single time point. Because of the case-control design of this analysis, changes in performance parameters by referral threshold and by HPV DNA testing methodology were summarized using sensitivity and specificity ratios. A marginal regression model was used with a robust variance to take into account the correlation of repeated testing occurring within the same individual across time (12). This methodology utilizes Wald's test to assess the statistical significance of the difference between performance measures and directly corresponds to McNemar's test, but it allows for simultaneous comparison across multiple groups as well as the generation of a relative measure of the difference in sensitivity and specificity. A P value of <0.05 was considered statistically significant. All analyses were performed using Stata version 11.1 (Stata, College Station, TX).
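As an illustration of what a sensitivity (or specificity) ratio with a confidence interval looks like for paired measurements, the sketch below uses the standard large-sample interval for the ratio of two correlated proportions on the log scale. This closed-form calculation is a simpler stand-in, not the marginal regression model fitted in Stata for this study, and the counts are hypothetical.

```python
import math


def paired_ratio_ci(both_pos, only_a, only_b, z=1.96):
    """Ratio of positivity (test B / test A) measured in the same subjects.

    both_pos: subjects positive by both tests
    only_a:   subjects positive by test A only
    only_b:   subjects positive by test B only
    Among cases this is a sensitivity ratio. For a specificity ratio,
    pass counts of negative results among controls instead.
    """
    pos_a = both_pos + only_a
    pos_b = both_pos + only_b
    ratio = pos_b / pos_a
    # Delta-method variance of log(ratio) for paired binary data.
    se_log = math.sqrt((only_a + only_b) / (pos_a * pos_b))
    lower = ratio * math.exp(-z * se_log)
    upper = ratio * math.exp(z * se_log)
    return ratio, lower, upper


# Hypothetical counts among 100 cases: test A = single baseline measurement,
# test B = repeat detection (which requires baseline positivity, so only_b = 0).
ratio, lower, upper = paired_ratio_ci(both_pos=90, only_a=8, only_b=0)
print(f"sensitivity ratio = {ratio:.3f} (95% CI {lower:.3f}, {upper:.3f})")
```

Because repeat detection requires a positive baseline result, its sensitivity can never exceed that of the single measurement, so a ratio below 1 is built into the definition; the interval quantifies how far below.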
RESULTS
Overall, 325 women were diagnosed with CIN2/3 (241 with CIN3) during follow-up in this study. Women with CIN2/3 were more likely to be under 30 years of age, have a referral Pap of LSIL (P < 0.001) rather than ASCUS, be part of the conservative management study trial arm (P < 0.001), be from clinical centers 1 and 4, report former use of birth control at enrollment (P = 0.05), and be a current or former smoker (P < 0.001) than controls at enrollment (Table 1).
Table 1.
Distribution of demographic and behavioral factors across study subject samples and by case/control status
| Demographic or behavioral factor | No. (%) of controls (<CIN2; N = 401) in category | No. (%) of subjects with CIN2/3 (N = 325) | P value, control vs CIN2/3^a | No. (%) of subjects with CIN3 (N = 241) | P value, control vs CIN3^a |
|---|---|---|---|---|---|
| Age at enrollment | | | | | |
| <30 yrs | 253 (63.1) | 268 (82.5) | | 198 (82.2) | |
| ≥30 yrs | 148 (36.9) | 57 (17.5) | 0.001 | 43 (17.8) | 0.001 |
| Referral Pap interpretation | | | | | |
| ASCUS | 322 (80.3) | 189 (58.2) | | 135 (56.0) | |
| LSIL | 79 (19.7) | 136 (41.9) | 0.001 | 106 (43.9) | 0.001 |
| Study trial arm | | | | | |
| CM | 138 (34.4) | 161 (49.5) | | 116 (48.1) | |
| HPV | 118 (29.4) | 67 (20.6) | | 48 (19.9) | |
| IC | 145 (36.2) | 97 (29.9) | 0.001 | 77 (31.9) | 0.001 |
| Clinical center no. | | | | | |
| 1 | 117 (29.2) | 105 (32.3) | | 86 (35.7) | |
| 2 | 54 (13.5) | 62 (19.1) | | 38 (15.8) | |
| 3 | 106 (26.4) | 53 (16.3) | | 42 (17.4) | |
| 4 | 124 (30.9) | 105 (32.3) | 0.006 | 75 (31.1) | 0.05 |
| Parity prior to enrollment | | | | | |
| Never pregnant | 112 (27.9) | 93 (28.6) | | 67 (27.8) | |
| 0 | 49 (12.2) | 38 (11.7) | | 28 (11.6) | |
| 1 | 102 (25.4) | 79 (24.3) | | 57 (23.7) | |
| >1 | 138 (34.4) | 115 (35.4) | 1.0 | 89 (36.9) | 0.9 |
| Birth control use at enrollment | | | | | |
| Never | 13 (3.3) | 2 (0.6) | | 2 (0.8) | |
| Former | 222 (55.5) | 182 (56.5) | | 136 (56.0) | |
| Current | 165 (41.3) | 138 (42.9) | 0.05 | 103 (42.3) | 0.2 |
| Smoking status at enrollment | | | | | |
| Never | 229 (57.3) | 132 (40.6) | | 103 (42.7) | |
| Former | 53 (13.3) | 49 (15.1) | | 32 (13.3) | |
| Current | 118 (29.5) | 144 (44.3) | 0.001 | 106 (43.9) | 0.001 |
| Age of sexual debut | | | | | |
| <16 yrs | 134 (33.4) | 124 (38.2) | | 86 (35.3) | |
| ≥16 yrs | 267 (66.6) | 201 (61.9) | 0.2 | 155 (64.6) | 0.6 |
^a A chi-square test was used to compare distributions of demographic variables among cases versus controls.
Single versus repeat HR-HPV DNA detection.
In this population, single time point detection of one or more (any) HR-HPV type by either HC2 or LA had very high (relative) sensitivity (>95%) for incident CIN2/3 or CIN3 detected by the 24-month exit visit (Table 2). However, the (relative) specificity was poor (∼30%). Adding a second measurement of any HR-HPV at 12 months decreased sensitivity by approximately 8% (P < 0.001 for both assays) but approximately doubled the specificity (P < 0.001 for both assays). Using a 24-month interval between HR-HPV measurements further decreased sensitivity by approximately 20% compared to the single measurement (P < 0.001 for both assays) and by 10% compared to 12-month repeat detection (P < 0.001 for both assays), but it increased specificity by an additional 20 to 30% (P < 0.001 for both assays). There were no appreciable differences in the sensitivity and specificity between HC2 and LA, with one exception: the 12-month repeat detection of HR-HPV by HC2 was more specific than that by LA (P < 0.001).
Table 2.
Relative sensitivities and specificities for incident CIN2/3 or CIN3 for a single baseline detection versus repeat detection at 12- or 24-month intervals of any HR-HPV genotype by HC2 or LA^a
| Test and timing of HR-HPV measurement | CIN2/3: % sensitivity | CIN2/3: performance ratio (95% CI)^b | CIN2/3: performance ratio (95% CI)^c | CIN3: % sensitivity | CIN3: performance ratio (95% CI)^b | CIN3: performance ratio (95% CI)^c | <CIN2: % specificity | <CIN2: performance ratio (95% CI)^b | <CIN2: performance ratio (95% CI)^c |
|---|---|---|---|---|---|---|---|---|---|
| HC2 | | | | | | | | | |
| Baseline | 97.5 | REF^d | | 98.3 | REF | | 30.4 | REF | |
| Repeat, 12 mos | 90.1 | 0.921 (0.886, 0.956) | REF | 91.2 | 0.922 (0.881, 0.964) | REF | 66.7 | 2.19 (1.92, 2.50) | REF |
| Repeat, 24 mos | 77.0 | 0.803 (0.747, 0.862) | 0.870 (0.811, 0.934) | 77.0 | 0.789 (0.722, 0.864) | 0.854 (0.781, 0.934) | 80.1 | 2.64 (2.29, 3.03) | 1.20 (1.13, 1.28) |
| LA | | | | | | | | | |
| Baseline | 96.6 | REF | | 97.5 | REF | | 32.9 | REF | |
| Repeat, 12 mos | 89.5 | 0.924 (0.886, 0.964) | REF | 91.5 | 0.931 (0.889, 0.974) | REF | 59.4§ | 1.80 (1.61, 2.02) | REF |
| Repeat, 24 mos | 78.3 | 0.821 (0.769, 0.877) | 0.891 (0.837, 0.948) | 80.5 | 0.834 (0.775, 0.897) | 0.899 (0.843, 0.959) | 76.5 | 2.33 (2.05, 2.65) | 1.29 (1.20, 1.38) |
^a Performance ratios with 95% confidence intervals are shown for comparisons between different measurements of HPV. In all cases, the differences between the baseline measure and 12-month repeat measures and between the 12-month and 24-month repeat measures were highly significant (P < 0.001). There was no statistically significant difference in performance between assays, except that LA was less specific than HC2 for the 12-month repeat performance (§, P < 0.001).
^b Performance ratio for 12-month or 24-month repeat HR-HPV versus single baseline HR-HPV measurement.
^c Performance ratio for 24-month versus 12-month repeat HR-HPV measurement.
^d REF, reference.
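As a reading aid for the performance ratios, the short sketch below recomputes crude specificity ratios directly from the rounded percentages in Table 2 and compares them with the published model-based ratios. The variable and function names are illustrative. Small differences are expected because the published ratios come from the marginal regression model and the inputs here are rounded; the sensitivity ratios in the table differ somewhat more from crude quotients and are not reproduced here.

```python
# % specificity among women with <CIN2, copied from Table 2.
specificity = {
    "HC2": {"baseline": 30.4, "repeat_12": 66.7, "repeat_24": 80.1},
    "LA": {"baseline": 32.9, "repeat_12": 59.4, "repeat_24": 76.5},
}

# Published performance ratios (point estimates) from Table 2.
published = {
    "HC2": {"12_vs_baseline": 2.19, "24_vs_baseline": 2.64, "24_vs_12": 1.20},
    "LA": {"12_vs_baseline": 1.80, "24_vs_baseline": 2.33, "24_vs_12": 1.29},
}


def crude_ratios(spec):
    """Crude ratios of specificity between measurement strategies."""
    return {
        "12_vs_baseline": spec["repeat_12"] / spec["baseline"],
        "24_vs_baseline": spec["repeat_24"] / spec["baseline"],
        "24_vs_12": spec["repeat_24"] / spec["repeat_12"],
    }


for assay, spec in specificity.items():
    for name, value in crude_ratios(spec).items():
        print(f"{assay} {name}: crude {value:.2f}, published {published[assay][name]:.2f}")
```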
Type-specific versus pooled HPV DNA detection.
Because LA detected individual HPV genotypes, we were able to compare detection of type-specific HR-HPV persistence versus simulated pooling of HR-HPV (as with HC2) (Table 3). Requiring type-specific HR-HPV persistence significantly decreased the sensitivity for both end points by approximately 5% for a 12-month interval and by more than 20% for a 24-month interval while increasing the specificity by about 10% for both intervals.
Table 3.
Relative sensitivity and specificity for incident CIN2/3 or CIN3 for repeat detection at 12- or 24-month intervals of any HR-HPV types versus repeat detection of at least one common HR-HPV type at both time points (TSP) by LA^a
| HPV repeat measurement interval | CIN2/3: % sensitivity, LA^b | CIN2/3: % sensitivity, LA (TSP) | CIN2/3: performance ratio (95% CI) | CIN2/3: P | CIN3: % sensitivity, LA^b | CIN3: % sensitivity, LA (TSP) | CIN3: performance ratio (95% CI) | CIN3: P | <CIN2: % specificity, LA^b | <CIN2: % specificity, LA (TSP) | <CIN2: performance ratio (95% CI) | <CIN2: P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12 mos | 89.5 | 84.7 | 0.947 (0.914, 0.981) | 0.003 | 91.5 | 86.5 | 0.946 (0.907, 0.986) | 0.008 | 59.4 | 66.6 | 1.12 (1.07, 1.17) | 0.001 |
| 24 mos | 78.3 | 60.0 | 0.766 (0.699, 0.839) | 0.001 | 80.5 | 62.5 | 0.777 (0.700, 0.862) | 0.001 | 76.5 | 84.3 | 1.10 (1.06, 1.14) | 0.001 |
^a Performance ratios with 95% confidence intervals are shown for comparisons between different measurements of HPV. TSP, type-specific persistence.
^b Reference.
Impact of HPV genotype specificity on relative test performance.
Expanding the definition of the HPV types beyond HR-HPV to include borderline carcinogenic/noncarcinogenic HPV types HPV53, -66, -67, -70, -73, -82, and -82v, the types that are often detected by HC2 (4), did not appreciably increase the sensitivity of a single baseline measurement by LA for CIN2/3 or CIN3 but reduced the specificity by 15% (P < 0.001) (Table 4). For repeat measures at 12-month and 24-month intervals, the additional HPV types increased sensitivity for both end points and decreased specificity by a similar magnitude.
Table 4.
Relative sensitivity and specificity for incident CIN2/3 or CIN3 for repeat detection at a 12- or 24-month interval of any HR-HPV types at both time points versus any HR-HPV types or HPV types (53, 66, 67, 70, 73, 82, and 82v)^a
| HPV measurement interval | CIN2/3: % sensitivity, LA^b | CIN2/3: % sensitivity, LA* | CIN2/3: performance ratio (95% CI) | CIN2/3: P | CIN3: % sensitivity, LA^b | CIN3: % sensitivity, LA* | CIN3: performance ratio (95% CI) | CIN3: P | <CIN2: % specificity, LA^b | <CIN2: % specificity, LA* | <CIN2: performance ratio (95% CI) | <CIN2: P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | 96.6 | 97.5 | 1.01 (0.99, 1.02) | 0.08 | 97.5 | 98.3 | 1.01 (0.99, 1.02) | 0.2 | 32.9 | 27.9 | 0.848 (0.789, 0.912) | 0.001 |
| Repeat, 12 mos | 89.5 | 93.2 | 1.04 (1.01, 1.07) | 0.008 | 91.5 | 95.7 | 1.05 (1.01, 1.08) | 0.02 | 59.4 | 53.9 | 0.908 (0.871, 0.945) | 0.001 |
| Repeat, 24 mos | 78.3 | 85.0 | 1.09 (1.04, 1.14) | 0.001 | 80.5 | 85.2 | 1.06 (1.01, 1.11) | 0.02 | 76.5 | 70.5 | 0.922 (0.892, 0.952) | 0.001 |
^a CIN2 was excluded from the analysis. LA* denotes HR-HPV or related types. Performance ratios with 95% confidence intervals are shown for comparison between different measurements of HPV.
^b Reference.
Impact of HPV-type analytic sensitivity on relative test performance.
We previously demonstrated that LA had a greater analytic sensitivity for most HR-HPV types because of greater DNA input in the LA assay compared to LBA (2, 7). For the single baseline measurement of HR-HPV, there were no significant differences in sensitivity for CIN2/3 or CIN3 between the two assays (Table 5). For multiple measurements, the sensitivity for CIN2/3 and CIN3 was greater for LA than LBA, and there was a tendency for these differences to be greater for the 24-month interval compared to the 12-month interval. For single and multiple measures of HR-HPV, LBA was significantly more specific than LA (P = 0.001 for all).
Table 5.
Relative sensitivity and specificity for incident CIN2/3 or CIN3 by LA versus LBA for a single baseline measurement, repeat measurements for any HR-HPV, or detection of at least one common type at both 12 and 24 months^a
| HPV measurement and interval | CIN2/3: % sensitivity, LA^b | CIN2/3: % sensitivity, LBA | CIN2/3: performance ratio (95% CI) | CIN2/3: P | CIN3: % sensitivity, LA^b | CIN3: % sensitivity, LBA | CIN3: performance ratio (95% CI) | CIN3: P | <CIN2: % specificity, LA^b | <CIN2: % specificity, LBA | <CIN2: performance ratio (95% CI) | <CIN2: P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline HR-HPV | 96.6 | 96.0 | 0.994 (0.978, 1.01) | 0.4 | 97.5 | 96.7 | 0.991 (0.979, 1.00) | 0.2 | 32.9 | 39.4 | 1.19 (1.08, 1.32) | 0.001 |
| Repeat HR-HPV, 12 mos | 89.5 | 84.6 | 0.950 (0.913, 0.990) | 0.01 | 91.5 | 87.2 | 0.958 (0.920, 0.998) | 0.04 | 59.4 | 67.9 | 1.14 (1.09, 1.20) | 0.001 |
| Repeat HR-HPV, 24 mos | 78.3 | 66.9 | 0.855 (0.791, 0.924) | 0.001 | 80.5 | 70.6 | 0.880 (0.806, 0.961) | 0.004 | 76.5 | 81.7 | 1.07 (1.02, 1.11) | 0.001 |
| 12-mo type-specific persistence | 84.7 | 77.5 | 0.917 (0.869, 0.968) | 0.002 | 86.5 | 79.9 | 0.930 (0.878, 0.985) | 0.01 | 66.6 | 73.0 | 1.09 (1.05, 1.14) | 0.001 |
| 24-mo type-specific persistence | 60.0 | 50.0 | 0.826 (0.746, 0.914) | 0.001 | 62.5 | 54 | 0.853 (0.761, 0.957) | 0.007 | 84.3 | 88.9 | 1.05 (1.02, 1.08) | 0.001 |
^a Performance ratios with 95% confidence intervals are shown for comparisons between different measurements of HPV.
^b Reference.
DISCUSSION
We estimated the relative impact of different methods for detecting HPV persistence on the accuracy of identifying women at risk of CIN2/3. All the parameters we investigated influenced the sensitivity for detection of incident CIN2/3 or CIN3 with a concomitant trade-off in specificity, creating a receiver operating characteristic-like curve, as illustrated in Fig. 1.
Fig 1.
(a) Sensitivity for CIN2/3 and false positivity rate (1 − specificity) for the following measurements: single baseline, 12-month repeat detection, or 24-month detection of HR-HPV genotypes by HC2; single baseline, 12-month repeat detection, or 24-month detection of HR-HPV genotypes by LA; 12-month or 24-month detection of type-specific persistent HR-HPV by LA; single baseline, 12-month repeat detection, or 24-month detection of HR-HPV genotypes or HPV types (53, 66, 67, 70, 73, 82, and 82v) by LA (LA∗); and 12-month or 24-month detection of type-specific persistent HR-HPV or HPV types (53, 66, 67, 70, 73, 82, and 82v) by LA (LA∗). (b) Sensitivity for CIN3 and false positivity rate (1 − specificity) for the following measurements: single baseline, 12-month repeat detection, or 24-month detection of HR-HPV genotypes by HC2; single baseline, 12-month repeat detection, or 24-month detection of HR-HPV genotypes by LA; 12-month or 24-month detection of type-specific persistent HR-HPV by LA; single baseline, 12-month repeat detection, or 24-month detection of HR-HPV genotypes or HPV types (53, 66, 67, 70, 73, 82, and 82v) by LA (LA∗); 12-month or 24-month detection of type-specific persistent HR-HPV or HPV types (53, 66, 67, 70, 73, 82, and 82v) by LA (LA∗).
Single versus repeat HR-HPV DNA detection.
The biggest impact on performance was observed when we compared single baseline measurement to repeat measures of HR-HPV. Repeat detection of any HR-HPV or type-specific HR-HPV across either a 12- or 24-month interval increased specificity 2-fold but decreased sensitivity by ≤20% for detection of CIN2/3 or CIN3 compared with a single positive HR-HPV DNA test. Since most women will not have disease, the gains in specificity may have a much more profound impact on the general population undergoing routine screening. Increasing the testing interval from 12 to 24 months resulted in a further decline of sensitivity, suggesting that a fraction of the CIN2/3 identified was not preceded by long-term persistent HR-HPV detection and was caused by incident HPV infections acquired after enrollment. An increasing specificity from the single baseline measure, to repeat measures at a 12-month interval, to repeat measures at 24 months reinforces the notion that a majority of HR-HPV infections clear within 1 to 2 years (9).
Type-specific versus pooled HPV DNA detection.
The requirement for HPV type-specific persistence to define women who are persistently HR-HPV positive only slightly decreased sensitivity and increased specificity, suggesting that pooling of HPV genotypes is a very good proxy for measuring type-specific persistence, as previously noted (6). Furthermore, the changes in performance due to measurement of type-specific persistence suggest that some CIN2/3 lesions develop relatively rapidly after acquisition of the causal infection. College-aged women with incident HPV16 and -18 infections demonstrated a relatively elevated rate of incident CIN 2/3 within the first 36 months after infection (24). These rapidly developing lesions and their causal type-specific HPV infection exist, particularly in younger women, within the context of other multiple noncausal, yet oncogenic, HPV types. Thus, multiple, concurrent noncausal types may on the whole be more readily detected than the single causal type, leading to more false negatives at the type-specific level and thereby decreasing the sensitivity. Pooled detection of multiple oncogenic HPV genotypes can help minimize these errors, because the test is positive whether or not due to the causal HPV type but will result in decreases in specificity, potentially leading to an increased frequency of unnecessary follow-up screening.
Impact of non-HR-HPV cross-reactive types on relative test performance.
Including the non-HR-HPV types that HC2 is most likely to detect through cross-reactivity in the definition of HPV positivity increased sensitivity for repeat measures, perhaps through the same phenomena described above, i.e., more liberal definitions of HPV positivity avoid some of the uncommon errors in failing to detect a specific type(s). The cost again was a loss in specificity.
Impact of HPV-type analytic sensitivity on relative test performance.
The use of the more analytically sensitive LA for measurement of repeat positivity and type-specific persistence at 12 and 24 months proved to be more clinically sensitive but less specific for detection of CIN2/3 than its predecessor, LBA. Differences in performance between LA and LBA were most likely due to the 2-fold difference in the amount of input DNA utilized in the PCR, as previously discussed (2). The optimization of the linear array protocol for commercial use, such as incorporation of β-globin controls and the use of premixed PCR and reverse hybridization reagents, as well as the differences in DNA extraction methodologies used for each assay (LBA uses a manual centrifugation-based method involving proteinase K digestion followed by ethanol precipitation, whereas LA uses a robotic, nucleic acid-based purification method) may have contributed to the differences in performance between the two assays. Lastly, the lower sensitivity observed when using the LBA may also be a result of limited genotyping information collected at enrollment for approximately 40% of the women in this study due to initial use of a limited primer set. Sensitivity analyses that restricted the study sample to women with complete genotyping information from both the LA and LBA showed no significant change in the sensitivity or specificity across all measurement metrics (data not shown). This was to be expected, given that the types measured by LBA initially using this limited primer set included all high-risk HPV types measured by LA.
Study limitations.
The use of a nested case-control study that overselected women with CIN2/3 as cases increased the prevalence of disease and thus precluded accurate estimation of predictive values or absolute risks that can be generalized to a larger population. Furthermore, the ALTS, within which the current analysis was conducted, recruited primarily women <30 years of age with mild to low-grade cervical abnormalities. Therefore, the women with CIN2/3 in this case-control study were younger than what is typically observed in the general population of women undergoing routine screening (17). As a result of these limitations, this study focused on sensitivity and specificity as primary measures of performance. In addition, the case-control study was nested within a triage population of women who were enrolled based on referral from an ASCUS or LSIL cytological screening result and therefore were likely to have HPV infections with higher HPV viral loads. The impact of differences in analytic sensitivity on clinical sensitivity might be less pronounced in a population whose HPV viral loads are well above the thresholds of detection. Therefore, the generalizability of the sensitivity and specificity estimates for single and repeat HPV DNA detection to the general screening population is limited. These measurements should be viewed in terms of relative performance. Population-based prospective studies that perform multiple HPV DNA measurements across set screening intervals are needed to provide more unbiased estimates of these performance parameters.
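The dependence of predictive values on disease prevalence can be illustrated with Bayes' rule. The sketch below uses the baseline HC2 sensitivity and specificity from Table 2 and compares the prevalence in this case-control sample (325 of 726) with a purely hypothetical screening-population prevalence of 2%; that 2% figure and the function name are assumptions chosen for illustration, not estimates from this study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Baseline HC2 performance for CIN2/3 from Table 2.
sens, spec = 0.975, 0.304

study_prevalence = 325 / 726       # cases / (cases + controls) in this sample
hypothetical_prevalence = 0.02     # assumption, for illustration only

for label, prev in [("case-control sample", study_prevalence),
                    ("hypothetical population", hypothetical_prevalence)]:
    ppv, npv = predictive_values(sens, spec, prev)
    print(f"{label}: prevalence {prev:.3f}, PPV {ppv:.3f}, NPV {npv:.3f}")
```

With identical sensitivity and specificity, the positive predictive value computed at the artificially high case-control prevalence is many times larger than at the lower hypothetical prevalence, which is why predictive values from this design cannot be generalized.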
Summary.
This study, one of the first to systematically evaluate the potential benefit of repeat HPV DNA testing compared to single testing across a number of different HPV DNA testing methods, found that repeat detection of HR-HPV DNA conferred a significantly higher specificity, with concomitantly small reductions in sensitivity, than did a single HR-HPV test, particularly in women 30 years old and younger (6, 8). Performance varied little between testing methodologies, suggesting that the modest analytic performance differences between currently validated HPV DNA assays have minimal impacts on clinical performance. We did note that the more liberal the definition of a positive result (e.g., more types, fewer measurements, not requiring type-specific persistence, and greater analytic sensitivity), the greater the sensitivity for CIN2/3, but at the cost of lower specificity. In conclusion, the significant benefit observed in this study of enhanced specificity for ruling out women without disease needs to be evaluated in additional studies against the declines in sensitivity, in order to more clearly define the balance between benefit from the high sensitivity and harms due to potential overtreatment when using HR-HPV DNA testing.
ACKNOWLEDGMENTS
We thank Roslyn Howard in the Department of Epidemiology of Johns Hopkins University for performing the linear array assay and Elizabeth Johnson in the Department of Biostatistics for consultation regarding the implementation of the marginal regression models to assess relative sensitivity and specificity.
Patti Gravitt is a member of the Women's Health Scientific Advisory Board for Qiagen. Some of the equipment and supplies used in these studies were donated or provided at reduced cost by Qiagen Corporation (Gaithersburg, MD), Cytyc Corporation (Marlborough, MA), National Testing Laboratories (Fenton, MO), DenVu (Tucson, AZ), TriPath Imaging, Inc. (Burlington, NC), and Roche Molecular Systems Inc. (Alameda, CA).
This work was supported by an Institutional Research Cancer Epidemiology Fellowship funded by the National Cancer Institute (T32 CA0009314). ALTS was supported by National Cancer Institute, National Institutes of Health, Department of Health and Human Services contracts CN-55153, CN-55154, CN-55155, CN-55156, CN-55157, CN-55158, CN-55159, and CN-55105. This research was supported (in part) by the Intramural Research Program of the NIH, National Cancer Institute.
Footnotes
Published ahead of print 7 December 2011
REFERENCES
- 1. ASCUS-LSIL Triage Study (ALTS) Group 2003. Results of a randomized trial on the management of cytology interpretations of atypical squamous cells of undetermined significance. Am. J. Obstet. Gynecol. 188: 1383–1392
- 2. Castle PE, Gravitt PE, Solomon D, Wheeler CM, Schiffman M. 2008. Comparison of linear array and line blot assay for detection of human papillomavirus and diagnosis of cervical precancer and cancer in the atypical squamous cell of undetermined significance and low-grade squamous intraepithelial lesion triage study. J. Clin. Microbiol. 46: 109–117
- 3. Castle PE, et al. 2009. Short term persistence of human papillomavirus and risk of cervical precancer and cancer: population based cohort study. BMJ 339: b2569.
- 4. Castle PE, et al. 2008. Human papillomavirus genotype specificity of hybrid capture 2. J. Clin. Microbiol. 46: 2595–2604
- 5. Dillner J, et al. 2008. Long term predictive values of cytology and human papillomavirus testing in cervical cancer screening: Joint European cohort study. BMJ 337: a1754.
- 6. Gage JC, Schiffman M, Solomon D, Wheeler CM, Castle PE. 2010. Comparison of measurements of human papillomavirus persistence for postcolposcopic surveillance for cervical precancerous lesions. Cancer Epidemiol. Biomarkers Prev. 19: 1668–1674
- 7. Gravitt PE, Schiffman M, Solomon D, Wheeler CM, Castle PE. 2008. A comparison of linear array and hybrid capture 2 for detection of carcinogenic human papillomavirus and cervical precancer in ASCUS-LSIL triage study. Cancer Epidemiol. Biomarkers Prev. 17: 1248–1254
- 8. Gyllensten U, Lindell M, Gustafsson I, Wilander E. 2010. HPV test shows low sensitivity of Pap screen in older women. Lancet Oncol. 11: 509–510 (Letter.)
- 9. Ho GY, Bierman R, Beardsley L, Chang CJ, Burk RD. 1998. Natural history of cervicovaginal papillomavirus infection in young women. N. Engl. J. Med. 338: 423–428
- 10. Ho GY, et al. 1995. Persistent genital human papillomavirus infection as a risk factor for persistent cervical dysplasia. J. Natl. Cancer Inst. 87: 1365–1371
- 11. Kjaer SK, et al. 2002. Type specific persistence of high risk human papillomavirus (HPV) as indicator of high grade cervical squamous intraepithelial lesions in young women: population based prospective follow up study. BMJ 325: 572.
- 12. Leisenring W, Pepe MS, Longton G. 1997. A marginal regression modelling framework for evaluating medical diagnostic tests. Stat Med. 16: 1263–1281
- 13. Marks M, et al. 2009. Confirmation and quantitation of human papillomavirus type 52 by Roche Linear Array using HPV52-specific TaqMan E6/E7 quantitative real-time PCR. J. Virol. Methods 156: 152–156
- 14. Munoz N, Castellsague X, de Gonzalez AB, Gissmann L. 2006. HPV in the etiology of human cancer. Vaccine 24(Suppl. 3): S1–S10
- 15. Ronco G, et al. 2010. Efficacy of human papillomavirus testing for the detection of invasive cervical cancers and cervical intraepithelial neoplasia: a randomised controlled trial. Lancet Oncol. 11: 249–257
- 16. Schiffman M, Adrianza ME. 2000. ASCUS-LSIL Triage Study. Design, methods and characteristics of trial participants. Acta Cytol. 44: 726–742
- 17. Schiffman M, Castle PE, Jeronimo J, Rodriguez AC, Wacholder S. 2007. Human papillomavirus and cervical cancer. Lancet 370: 890–907
- 18. Schiffman M, et al. 2005. The carcinogenicity of human papillomavirus types reflects viral evolution. Virology 337: 76–84
- 19. Schiffman M, Solomon D. 2003. Findings to date from the ASCUS-LSIL Triage Study (ALTS). Arch. Pathol. Lab. Med. 127: 946–949
- 20. Schiffman M, Wheeler CM, Dasgupta A, Solomon D, Castle PE. 2005. A comparison of a prototype PCR assay and hybrid capture 2 for detection of carcinogenic human papillomavirus DNA in women with equivocal or mildly abnormal Papanicolaou smears. Am. J. Clin. Pathol. 124: 722–732
- 21. Schlecht NF, et al. 2001. Persistent human papillomavirus infection as a predictor of cervical intraepithelial neoplasia. JAMA 286: 3106–3114
- 22. Sherman ME, et al. 2003. Baseline cytology, human papillomavirus testing, and risk for cervical neoplasia: a 10-year cohort analysis. J. Natl. Cancer Inst. 95: 46–52
- 23. Walboomers JM, et al. 1999. Human papillomavirus is a necessary cause of invasive cervical cancer worldwide. J. Pathol. 189: 12–19
- 24. Winer RL, et al. 2005. Development and duration of human papillomavirus lesions, after initial infection. J. Infect. Dis. 191: 731–738
- 25. Wright TC, Jr., et al. 2007. 2006 consensus guidelines for the management of women with abnormal cervical screening tests. J. Low Genit. Tract. Dis. 11: 201–222

