Author manuscript; available in PMC: 2022 Apr 19.
Published in final edited form as: Diagn Cytopathol. 2019 Oct 22;48(1):35–42. doi: 10.1002/dc.24325

Cervical cytology reproducibility and associated clinical and demographic factors

Hyunsoo Hwang 1, Michele Follen 2,3, Martial Guillaud 4, Michael Scheurer 5, Calum MacAulay 4, Gregg A Staerkel 6, Dirk van Niekerk 7, Jose-Miguel Yamal 1
PMCID: PMC9017433  NIHMSID: NIHMS1788766  PMID: 31639288

Abstract

Background:

Although the Pap test has been the standard screening method for cervical precancer/cancer detection, it has been criticized for having a relatively low sensitivity and a low reproducibility between pathologists. There is limited knowledge about inter-rater agreement and what clinical and demographic factors are associated with disagreements between pathologists reading the same Pap smear.

Methods:

This study aimed to assess inter- and intra-rater agreement of the Pap smear in 1619 cytologic slides with biopsy confirmation, using kappa statistics. Clinical and demographic factors associated with higher odds of inter-rater agreement were also examined and stratified by histologic diagnosis grade.

Results:

Using a five-grade classification system, the overall kappa statistics for total, inter-rater, and intra-rater samples were 0.62, 0.57, and 0.88 (unweighted) and 0.83, 0.81, and 0.95 (weighted), respectively. In analyses stratified by histologic grade, total kappas ranged from 0.40 (atypia) to 0.64 (human papillomavirus/CIN 1). Referral for an abnormal Pap test (diagnostic vs screening population), recruiting site, and parity were associated with the level of agreement between the two cytologic readings.

Conclusions:

We observed relatively higher levels of agreement than other studies. However, variability was considerable and agreement was generally moderate, suggesting that the accuracy and reproducibility of cervical screening tests need to be improved.

Keywords: cytologic diagnosis, inter-rater reliability, IRR, kappa, Pap

1 |. INTRODUCTION

The Pap test has long been the standard cervical cancer screening method, with a focus on early detection and treatment of cervical precursor lesions. This screening methodology has been quite successful and has led to a 70% decrease in cervical cancer deaths in the developed world in the past 40 years.1 However, the Pap test has also been criticized for its relatively low sensitivity and low inter-rater reproducibility, even between experienced cytologists.2–4 A number of studies have used kappa statistics to estimate the level of inter-rater agreement (or inter-rater reliability, IRR).5–10 However, some studies were limited in cytologic sample size, and many did not include the corresponding histologic diagnosis in the analysis, so the Pap test could not be compared with the biopsy result, the gold standard. In addition, only a few studies have investigated the clinical and demographic factors associated with increased reproducibility of the cytology reading. Estimates of the reproducibility of cytology readings for the various precancerous lesions, and of the risk factors associated with that reproducibility, can serve as a reference for future proposed screening and diagnostic tests or modalities. In the current study, we aimed to assess inter- and intra-rater agreement of the Pap smear in a sample of 1619 biopsy-confirmed specimens assessed by cancer center pathologists. Clinical and demographic factors associated with reproducibility of cervical cytology readings were also examined.

2 |. MATERIALS AND METHODS

2.1 |. Study subjects and review process

This study is part of a larger diagnostic trial of optical spectroscopy and quantitative cytology to detect cervical intraepithelial neoplasia (CIN) 2 or worse in both the screening setting (women who had no history of abnormal Pap smears) and the diagnostic setting (women from colposcopy clinics who were referred with an abnormal Pap smear or previous treatment for CIN). A total of 1850 women who were at least 18 years old and not pregnant were recruited between October 1998 and November 2005; 1000 of them were from a screening group, and the rest were from a diagnostic group. Details about the recruitment process were described previously.11 The study was carried out at three sites: the British Columbia Cancer Agency in Canada, and the University of Texas MD Anderson Cancer Center and Lyndon B. Johnson Hospital in Houston, Texas, United States. All patients provided signed informed consent to participate in the study. The study protocol was reviewed and approved by the Institutional Review Boards at the participating sites, and the study complied with the principles of the Declaration of Helsinki.

Each patient provided a complete medical history and underwent a physical exam and a pelvic examination, which included a Pap test, specimen acquisition for human papillomavirus (HPV) testing, colposcopic examination of the vulva, vagina, and cervix, and spectroscopy examination. A blood sample was taken for estrogen and progesterone measurement. Following the examinations, all patients were interviewed by a research nurse to obtain information on demographics and risk factors. Demographic and clinical information is summarized in Table 1.

TABLE 1.

Demographic and clinical characteristics of patients

Characteristic Diagnostic group, N (proportion) Screening group, N (proportion)
Total 719 900
Age, mean (SD) 36.36 (11.66) 43.85 (12.07)
Race
 White 459 (0.64) 440 (0.49)
 Black 79 (0.11) 138 (0.15)
 Hispanic 90 (0.13) 252 (0.28)
 Native American 8 (0.01) 1 (0.00)
 Asian 57 (0.08) 60 (0.07)
 Other 26 (0.04) 9 (0.01)
Education
 Less than high school 76 (0.11) 62 (0.07)
 High school or equivalent 125 (0.17) 150 (0.17)
 Some college or more 517 (0.72) 687 (0.76)
 Refused/Unknown 1 (0.00) 1 (0.00)
Marital status
 Never Married 207 (0.29) 175 (0.19)
 Married 276 (0.38) 501 (0.56)
 Married-like situation 71 (0.10) 38 (0.04)
 Divorced/separated 151 (0.21) 162 (0.18)
 Widowed 12 (0.02) 24 (0.03)
 Refused/Unknown 2 (0.00) 0 (0.00)
Site
 MD Anderson/LBJ Hospital (United States) 665 (0.92) 767 (0.85)
 BCCRC (Canada) 54 (0.08) 133 (0.15)
Smoking
 Yes 318 (0.44) 305 (0.34)
 No 400 (0.56) 595 (0.66)
 Refused/Unknown 1 (0.00) 0 (0.00)
Parity
 None 306 (0.43) 272 (0.30)
 1~2 284 (0.39) 397 (0.44)
 >2 129 (0.18) 227 (0.25)
 Refused/Unknown 0 (0.00) 4 (0.00)
Employment
 Full/part-time 483 (0.67) 606 (0.67)
 Unemployed/retired 84 (0.12) 143 (0.16)
 Housewife 76 (0.11) 125 (0.14)
 Student 73 (0.10) 23 (0.03)
 Refused/unknown 3 (0.00) 3 (0.00)
HPV Diagnosis
 Negative 346 (0.48) 787 (0.87)
 Low risk 22 (0.03) 15 (0.02)
 High risk 295 (0.41) 79 (0.09)
 Both 53 (0.07) 13 (0.01)
 Unascertainable 3 (0.00) 6 (0.01)
Menopause
 No 570 (0.79) 545 (0.61)
 Yes, I am now going through it 60 (0.08) 82 (0.09)
 Yes, I have gone through it 70 (0.10) 247 (0.27)
 Refused/unknown 19 (0.03) 26 (0.03)

Before the colposcopy and spectroscopy exam, conventional cervical smears were obtained using a wooden spatula and a cervical brush, placed on glass slides, fixed, and stained. The slides were processed at either The University of Texas MD Anderson Cancer Center (MDA) or the British Columbia Cancer Agency, depending on the recruitment site. During the colposcopic examination, biopsies of squamous and columnar epithelium were taken both from an area of normal appearance and from the area with the worst overall colposcopic impression. Each cytologic and pathologic specimen was evaluated twice. All reviewers were blinded to the other readings and to the colposcopic findings. At MDA, the stained Pap smear slides were reviewed by two cytopathologists. In Vancouver, the first review was conducted by a cytotechnologist and the second review was performed by a cytopathologist. For cytology samples, both the 2001 Bethesda System and the World Health Organization (WHO) criteria were used. The 2001 Bethesda System categorizes cytology readings as normal, atypical squamous cells of undetermined significance (ASCUS), low-grade squamous intraepithelial lesions (LSIL), high-grade squamous intraepithelial lesions (HSIL), or cancer. The WHO system, which has seven categories, was employed for biopsy reading. Both classification systems are summarized and compared in Table S1 (tables numbered S# are in the Supporting Information).

2.2 |. Study sample and statistical analysis

A total of 26 reviewers participated in the study. Inter- and intra-rater agreement analyses were performed to assess the degree of agreement between two cytological reviews. Cohen’s kappa12 was used to assess agreement for categorical classifications. Weighted kappa13 with quadratic weights was used to assess inter-rater agreement while giving less penalty to ratings that were close to each other. To compute the confidence intervals (CIs), we employed the bootstrapping technique, which is based on random sampling with replacement. Kappas have a possible range from −1 to 1. Since there are several different approaches to interpreting kappa values within this interval, we chose the more conservative McHugh’s classification14 for this study. The interpretation for each category is summarized in Table S2. Logistic regression was used to assess whether clinical factors (body mass index, estradiol, progesterone, screening group, sexually transmitted disease [STD], HPV infection, smoking status) and demographic factors (age, marital status, education, site) were associated with higher odds of agreement between two raters. All variables were treated as categorical except age. Variable selection was conducted using an L1-regularized logistic regression model with 10-fold cross-validation. A logistic regression model was then fit to the remaining variables to assess the association between variables and inter-rater agreement. Receiver operating characteristic (ROC) analysis was conducted to assess the accuracy of the two readings. ROC curves were compared using DeLong’s test for two correlated ROC curves. All analyses were conducted using R version 3.2.4 (R Foundation for Statistical Computing, Vienna, Austria), and the statistical significance level was set at 0.05.
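The kappa and bootstrap steps described above can be sketched as follows. This is a minimal illustration in Python, not the authors' R code: the function names `cohen_kappa` and `bootstrap_ci`, the percentile method for the interval, the number of resamples, and the toy ratings are all assumptions made for the example.

```python
import random

def cohen_kappa(pairs, categories, weighted=False):
    """Cohen's kappa for paired ratings; quadratic weights if weighted=True.

    Returns None when chance disagreement is zero (kappa undefined).
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(pairs)
    # k x k table of counts: rows are reading 1, columns are reading 2.
    table = [[0] * k for _ in range(k)]
    for first, second in pairs:
        table[index[first]][index[second]] += 1
    row = [sum(r) for r in table]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def penalty(i, j):
        # Squared distance for quadratic weights, 0/1 for unweighted kappa.
        return (i - j) ** 2 if weighted else int(i != j)

    observed = sum(penalty(i, j) * table[i][j]
                   for i in range(k) for j in range(k)) / n
    expected = sum(penalty(i, j) * row[i] * col[j]
                   for i in range(k) for j in range(k)) / n ** 2
    if expected == 0:
        return None
    return 1.0 - observed / expected

def bootstrap_ci(pairs, categories, weighted=False,
                 n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample rated slides with replacement."""
    rng = random.Random(seed)
    n = len(pairs)
    stats = []
    for _ in range(n_boot):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        value = cohen_kappa(sample, categories, weighted)
        if value is not None:  # skip degenerate resamples
            stats.append(value)
    stats.sort()
    low = stats[int((alpha / 2) * len(stats))]
    high = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return low, high

# Hypothetical toy data (not study data): 50 slides, two readings each.
categories = ["normal", "abnormal"]
pairs = ([("normal", "normal")] * 20 + [("normal", "abnormal")] * 5 +
         [("abnormal", "normal")] * 10 + [("abnormal", "abnormal")] * 15)

kappa_hat = cohen_kappa(pairs, categories)
ci_low, ci_high = bootstrap_ci(pairs, categories)
print(f"kappa = {kappa_hat:.2f}, 95% CI ({ci_low:.2f}, {ci_high:.2f})")
```

For the toy data, observed agreement is 0.70 and chance agreement is 0.50, so the point estimate is 0.40; the interval depends on the seed and the number of resamples.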

3 |. RESULTS

Figure 1 illustrates the flow diagram of study patients and sample size in each analysis. Of the total 1850 patients, 107 had either only one or no cytology review, and thus were excluded. An additional 76 patients had no recorded reviewers’ identification and 48 patients did not have biopsy confirmation, leaving 1619. Of these, 1310 patients were assessed by two different reviewers and 309 by the same reviewer twice. Of the 1619 patients, 1379 patients had non-missing clinical and demographic information. If a patient refused to undergo a test, including a biopsy, or provide necessary information, the data were excluded from analyses.

FIGURE 1.

Patient enrollment and flow diagram of final sample size and data analysis

The distributions of cytologic interpretations for all samples, inter-rater review samples, and intra-rater review samples are summarized in Tables 2, S3, and S4, respectively. The two readings agreed in 83.63% of the total specimens (Table 2), 80.68% of the specimens rated by two different reviewers (Table S3), and 96.11% of the specimens rated by the same reviewer (Table S4). Table 3 summarizes the kappa statistics (unweighted and weighted) for all specimens (unweighted kappa = 0.62, weighted = 0.83) and stratified by histologic diagnosis. The highest agreement was observed for patients with a histologic diagnosis of HPV/CIN 1, followed by those with normal histology. Disagreement was generally higher in the worse histologic grades but was highest in the atypia group (κ = 0.40). Kappa statistics for the CIN 2 and CIN 3 or worse groups were 0.52 and 0.46, respectively. Every weighted kappa estimate was at least 0.60, ranging from 0.60 to 0.83.

TABLE 2.

Cross-tabulation of cytologic interpretation by two reviewers in categorizing 1619 specimens

Cytology reading II
Normal ASCUS LSIL HSIL Invasive carcinoma Total
Cytology reading I
 Normal 1120 (97%) 23 (2%) 9 (1%) 3 (0%) 0 (0%) 1155
 ASCUS 83 (52%) 60 (38%) 8 (5%) 8 (5%) 0 (0%) 159
 LSIL 34 (21%) 33 (20%) 65 (40%) 31 (19%) 0 (0%) 163
 HSIL 11 (8%) 5 (4%) 11 (8%) 108 (78%) 3 (2%) 138
 Invasive carcinoma 0 (0%) 0 (0%) 0 (0%) 3 (75%) 1 (25%) 4
 Total 1248 121 93 153 4 1619

Abbreviations: ASCUS, atypical squamous cells of undetermined significance; HSIL, high-grade squamous intraepithelial lesions; LSIL, low-grade squamous intraepithelial lesions.
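As a cross-check, the overall agreement and kappa values reported in the text can be recomputed from the cell counts of Table 2. The sketch below is an illustration in Python rather than the R code used for the study; the function name `kappa_from_table` is introduced here for the example, and quadratic weights are assumed for the weighted kappa, as stated in the Methods.

```python
# Cell counts from Table 2: rows are cytology reading I, columns are
# cytology reading II, ordered Normal, ASCUS, LSIL, HSIL, Invasive carcinoma.
TABLE_2 = [
    [1120, 23, 9, 3, 0],
    [83, 60, 8, 8, 0],
    [34, 33, 65, 31, 0],
    [11, 5, 11, 108, 3],
    [0, 0, 0, 3, 1],
]

def kappa_from_table(table, weighted=False):
    """Cohen's kappa from a square contingency table of counts."""
    k = len(table)
    n = sum(sum(r) for r in table)
    row = [sum(r) for r in table]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def penalty(i, j):
        # Quadratic weights penalize by squared grade distance.
        return (i - j) ** 2 if weighted else int(i != j)

    observed = sum(penalty(i, j) * table[i][j]
                   for i in range(k) for j in range(k)) / n
    expected = sum(penalty(i, j) * row[i] * col[j]
                   for i in range(k) for j in range(k)) / n ** 2
    return 1.0 - observed / expected

n_total = sum(sum(r) for r in TABLE_2)
row_totals = [sum(r) for r in TABLE_2]
col_totals = [sum(TABLE_2[i][j] for i in range(5)) for j in range(5)]
agreement = sum(TABLE_2[i][i] for i in range(5)) / n_total
kappa_unweighted = kappa_from_table(TABLE_2)
kappa_weighted = kappa_from_table(TABLE_2, weighted=True)

print(n_total)                       # 1619
print(f"{agreement:.2%}")            # 83.63%
print(f"{kappa_unweighted:.2f}")     # 0.62
print(f"{kappa_weighted:.2f}")       # 0.83
```

The cell counts reproduce the reported 83.63% agreement and the overall kappas of 0.62 (unweighted) and 0.83 (weighted), and the row percentages shown in the table follow from the computed row totals.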

TABLE 3.

Kappa values for overall and five specific histologic categories in 1619 total specimens

Histologic diagnoses Unweighted kappa (95% CI) Weighted kappa (95% CI) No. of Samples
All samples 0.62 (0.58–0.65) 0.83 (0.80–0.85) 1619
Normal 0.54 (0.45–0.63) 0.74 (0.64–0.82) 764
Atypia 0.40 (0.27–0.52) 0.60 (0.41–0.74) 317
HPV/CIN 1 0.64 (0.56–0.72) 0.79 (0.71–0.85) 287
CIN 2 0.52 (0.41–0.64) 0.75 (0.63–0.83) 111
CIN 3 or worse 0.46 (0.37–0.57) 0.71 (0.59–0.80) 140

Tables S5 and S6 summarize the results for inter- and intra-rater agreement, respectively. Inter- and intra-rater agreement differed most in the histologic atypia group, where the intra-rater kappa was high (κ = 0.92) but the inter-rater kappa was only 0.30, notably lower than in the other histologic groups. For inter-rater agreement, the HPV/CIN 1 histologic group showed the highest kappa (κ = 0.57), followed by the normal histologic group (κ = 0.50). For intra-rater agreement, histologically normal samples showed the lowest kappa (κ = 0.76), followed by the CIN 3 or worse group (κ = 0.82). However, the CIN 2 and CIN 3 or worse groups require cautious interpretation since their sample sizes are very small. Based on McHugh’s guide, only the HPV/CIN 1 group showed a moderate level of agreement in the review of all samples, and no kappa was above the moderate level of agreement in the inter-rater review. However, most groups in the intra-rater review presented strong or “almost perfect” levels of agreement.
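To make the interpretation labels concrete, the sketch below maps a kappa value to a level of agreement. The cutoffs are an assumption taken from McHugh's published guide (0–0.20 none, 0.21–0.39 minimal, 0.40–0.59 weak, 0.60–0.79 moderate, 0.80–0.90 strong, above 0.90 almost perfect); Table S2 is not reproduced here and remains the authoritative version for this study. The function name `mchugh_level` is introduced only for illustration.

```python
def mchugh_level(kappa):
    """Label a kappa value using cutoffs assumed from McHugh's guide."""
    if kappa <= 0.20:
        return "none"
    if kappa < 0.40:
        return "minimal"
    if kappa < 0.60:
        return "weak"
    if kappa < 0.80:
        return "moderate"
    if kappa <= 0.90:
        return "strong"
    return "almost perfect"

# Unweighted kappas for all samples, as reported in Table 3.
table_3 = {
    "All samples": 0.62,
    "Normal": 0.54,
    "Atypia": 0.40,
    "HPV/CIN 1": 0.64,
    "CIN 2": 0.52,
    "CIN 3 or worse": 0.46,
}
levels = {group: mchugh_level(k) for group, k in table_3.items()}
for group, level in levels.items():
    print(f"{group}: {level}")
```

Under these assumed cutoffs, HPV/CIN 1 is the only histologic subgroup labeled moderate, which matches the statement in the text.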

Table 4 summarizes the kappa statistics and associated CIs based on the following seven WHO categories: normal, atypia, HPV-associated changes, CIN 1, CIN 2, CIN 3 and CIS, and carcinoma. With seven categories, the total kappa was 0.57 (95% CI [0.53, 0.60]), the inter-rater kappa was 0.52 (95% CI [0.48, 0.56]), and the intra-rater kappa was 0.84 (95% CI [0.75, 0.90]), all lower than the corresponding five-grade values. ROC curves of cytologic readings 1 and 2 compared to the histologic diagnosis, overall and by site, are presented in Figure 2. Histologic diagnosis was dichotomized to “above or equal to CIN 2” and “below CIN 2.” Overall, there was a statistically significant difference between the two areas under the ROC curves (AUC, P = .027). However, the difference was not present at any individual site: the MD Anderson Cancer Center (P = .628), Lyndon B. Johnson Hospital (P = .433), and the British Columbia Cancer Agency (P = .619). There was also no difference between the two readings when stratified by whether the subject was enrolled as part of the screening trial (P = .850) or the diagnostic trial (P = .259).

TABLE 4.

Kappa values after seven group recategorization

Histologic diagnoses Unweighted kappa (95% CI) Weighted kappa (95% CI) No. of Samples
Total 0.57 (0.53–0.60) 0.83 (0.79–0.85) 1619
Inter-rater 0.52 (0.48–0.56) 0.81 (0.77–0.84) 1310
Intra-rater 0.84 (0.75–0.90) 0.93 (0.81–0.97) 309

FIGURE 2.

ROC curves showing the classification accuracy of Pap smear to detect high-grade histologic diagnosis for all sites combined and by site (MDA, MD Anderson Cancer Center; LBJ, Lyndon B. Johnson Hospital; and BCCA, British Columbia Cancer Agency). ROC, Receiver operating characteristic
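The area under each ROC curve can be illustrated with a short sketch. It computes the AUC of an ordinal cytology grade against the dichotomized histology (CIN 2 or worse vs below CIN 2) using the rank-based (Mann-Whitney) formulation, with ties counted as one half. The patient records are hypothetical, the grade coding 0 to 4 (normal to invasive carcinoma) is an assumption, and DeLong's test for comparing the two correlated curves is not implemented here.

```python
def auc(scores_positive, scores_negative):
    """Probability that a positive case is graded higher than a negative
    case, counting ties as 1/2 (Mann-Whitney estimate of the AUC)."""
    wins = 0.0
    for p in scores_positive:
        for q in scores_negative:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))

# Hypothetical records: (histology is CIN 2 or worse, reading 1, reading 2),
# with cytology grades coded 0 = normal ... 4 = invasive carcinoma.
patients = [
    (True, 3, 3), (True, 3, 2), (True, 2, 2), (True, 1, 1), (True, 4, 4),
    (False, 0, 0), (False, 0, 1), (False, 1, 1), (False, 0, 0),
    (False, 2, 2), (False, 0, 0),
]

def auc_for_reading(column):
    # column 1 selects reading 1, column 2 selects reading 2.
    positive = [rec[column] for rec in patients if rec[0]]
    negative = [rec[column] for rec in patients if not rec[0]]
    return auc(positive, negative)

auc_reading_1 = auc_for_reading(1)
auc_reading_2 = auc_for_reading(2)
print(f"AUC reading 1 = {auc_reading_1:.3f}")
print(f"AUC reading 2 = {auc_reading_2:.3f}")
```

Because both readings are made on the same patients, the two AUCs are correlated, which is why a paired comparison such as DeLong's test is used in the study rather than an independent-samples test.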

We also examined several clinical and demographic variables with the goal of finding which factors are associated with higher reproducibility of cytology readings. In Figure 3, forest plots depict the odds ratios from the multivariable analyses. The Canadian site, HPV-positive review, and diagnostic population with an abnormal Pap referral were associated with lower odds of agreement between the two reviews. Within specific histologic grades, more variables were significantly associated with the odds of agreement. The screening variable was positively associated with agreement in the normal and atypia histologic groups. Higher education level was negatively associated with agreement in the atypia group. Presence of STD and HPV-negative review were associated with higher odds of agreement in the HPV/CIN 1 group. Parity of 1 or 2 births was positively associated with agreement in the CIN 3 or worse histologic group. Age and US examination site were positively associated with agreement in the histologically normal group.

FIGURE 3.

Forest plots of multivariable logistic regression analysis odds ratios for associations between factors and agreement (both inter- and intra-rater) between two Pap smear reviews, overall and by histologic grade. Note: The dependent variable is agreement between two cytologists, coded as 1 if they agreed with each other. Age is per 10-year increase (Age/10). Education is a dichotomous variable, which represents “college education or above.” Age sex is a dichotomous variable, which represents “age at first sex.” Screening is a dichotomous variable, which represents “the patients with or without an abnormal Pap referral.” Marital Status is a dichotomous variable, which represents “currently being married.” Site is a dichotomous variable, which represents “patients are evaluated in Canada.” Menopause is a dichotomous variable, which represents “patients have gone through, or are going through it now.” Race is a dichotomous variable, which represents “Non-Hispanic white.” Estradiol and Progesterone are dichotomous variables whose cutoff points are medians. Parity 1 includes patients who had 1~2 children; parity 2 represents patients who had 3 or more childbirths

4 |. DISCUSSION

A number of studies have investigated the inter-rater agreement of the Pap test using kappa statistics (Table 5). The heterogeneity of agreement in the literature can be partially explained by two factors that influence kappa statistics: (a) the cytologic distribution of atypical findings in samples can affect variability in agreement, and (b) kappa statistics are inversely related to the number of categories assessed. In several studies, the high proportion of abnormal cases likely resulted in lower kappa statistics. The study by Stoler and Schiffman3 included women referred due to a mild cytologic abnormality and found κ = 0.46 between reviewers. Studies by Confortini et al,6,7 based in Italian laboratories, observed kappas of 0.44 and 0.38 in studies of intra- and inter-laboratory reproducibility in 2007 and 2003, respectively. The oversampling of ASCUS slides is known to decrease reproducibility. Sriamporn et al10 studied a mixture of populations from rural Thailand that had either a positive Pap test, a negative Pap test but subsequent cancer during follow-up, or controls for the cancer group, and showed relatively poor agreement (κ = 0.37). The weighted kappa of the Settakorn et al study,9 which used four groups and combined LGSIL, HGSIL, and squamous cell carcinoma into a single group, was modestly high (κ = 0.62). The kappa statistic from the Kato et al study8 was modestly high (κ = 0.59) in a population limited to women with either CIN 3, invasive cancer, or normal control. The absence of patients with a histologic diagnosis of CIN 1, CIN 2, or atypia likely resulted in increased reproducibility.
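The dependence of kappa on the case mix, factor (a) above, can be seen in a small worked example. The two 2×2 tables are hypothetical and are not taken from any of the cited studies; each has 100 slides and the same observed agreement of 80%, but the category distribution differs. The example only shows that kappa changes with the marginal distribution at fixed raw agreement; it does not model any particular study.

```latex
\begin{aligned}
\kappa &= \frac{p_o - p_e}{1 - p_e},
\qquad p_o = \text{observed agreement},\quad p_e = \text{agreement expected by chance} \\[4pt]
\text{Balanced table }
\begin{pmatrix} 40 & 10 \\ 10 & 40 \end{pmatrix}:\quad
& p_o = \tfrac{40 + 40}{100} = 0.80,\quad
p_e = 0.50 \cdot 0.50 + 0.50 \cdot 0.50 = 0.50,\quad
\kappa = \tfrac{0.80 - 0.50}{1 - 0.50} = 0.60 \\[4pt]
\text{Skewed table }
\begin{pmatrix} 75 & 10 \\ 10 & 5 \end{pmatrix}:\quad
& p_o = \tfrac{75 + 5}{100} = 0.80,\quad
p_e = 0.85 \cdot 0.85 + 0.15 \cdot 0.15 = 0.745,\quad
\kappa = \tfrac{0.80 - 0.745}{1 - 0.745} \approx 0.22
\end{aligned}
```

Identical raw agreement therefore yields kappas of 0.60 and about 0.22, which is one reason kappas from studies with different sampling schemes are hard to compare directly.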

TABLE 5.

Pap smear kappa measurements and study designs from previous investigators

Author Year Number of reviewers Reviewer Number of slides Categories Number of grades Interrater kappas, unweighted CI Interrater kappas, weighted CI Screening technique
Confortini et al5 1992 16 16 cytologists 100 WHO 7 0.49 Conventional
Kato et al8 1995 3 3 cytopathologists 1506 4 0.59 (0.56–0.61) 0.69 Conventional
Doornewaard et al15 1999 4 4 cytotechnologists 1996 Bethesda 4 0.69 (0.33–0.72) Conventional
Stoler and Schiffman3 2001 11+ 11+ pathologists 4948 Bethesda 4 0.46 (0.44–0.48) 0.59 (0.44–0.48) Liquid-based
Confortini et al7 2003 89+ 89 laboratories 50 Bethesda 4 0.38 Conventional
Sriamporn et al10 2005 2 2 cytologists 313 7 0.37 -
Confortini et al6 2007 13+ 13 laboratories 30 Bethesda 4 0.44 Liquid-based
Settakorn et al9 2008 2 2 cytopathologists 731 4 0.62 (0.56–0.68) Liquid-based

The reproducibility for ASCUS grade samples has been low in many studies, consistent with the result of this study. In our study, kappa using the Bethesda system was the lowest (κ = 0.30) among all the groups in the inter-rater agreement reviews. However, kappa in ASCUS grade reviews was the highest among all groups in the intra-rater agreement reviews, which was unexpected. The low level of inter-rater agreement in this grade appears to result from more subjective screening criteria that differ between reviewers. Since there was only one intra-rater reviewer, it might be difficult to generalize the findings. Our study showed generally higher kappa statistics compared with other studies, based on both unweighted and weighted kappa. Kappa in our study remained relatively high even with seven categories (Table 4) or when reduced to three categories (Table S7). Working in a dedicated cancer center and having specific cancer expertise may be reasons why we observed higher agreement rates. The fact that both reviewers worked in the same institution might also have contributed to the higher kappa values. The relatively higher proportion of normal specimens among patients might also have led to higher reproducibility of the cytologic reading. Compared to the Canadian site, the US sites generally had higher inter-rater kappas (unweighted kappa 0.59 for the United States and 0.49 for Canada). This could be partially explained by the two different specializations of the reviewers in Canada (one cytotechnologist and one cytopathologist) and the same specialization in the United States (two cytopathologists).

Only a few studies have examined which demographic and clinical factors are associated with higher reproducibility. The study by Kato et al8 investigated the association between these factors and the inter-rater agreement rate. However, their study population was limited to subjects originally diagnosed with CIN 3, which is comparable to our CIN 3 or worse group. While factors such as age, parity, and current sexual status were statistically significant in their study, only parity of 1 or 2 was significantly associated with the agreement level for the CIN 3 or worse group in our study. Parity was significant in both studies, but it was associated with higher agreement in our study, whereas Kato et al found parity to be associated with lower agreement. When the sample was expanded to all the other grades, HPV infection, STD, screening population, recruiting site, age, and parity were significantly associated with the agreement level. However, it is unclear why some of these factors are associated with variability in agreement. The site variable was significant in both the total specimens and the normal histology subset, suggesting that the observed difference in agreement level between Canada and the United States may reflect not only different distributions but also other factors such as healthcare differences.

Although we observed moderately high agreement in most samples, we still observed considerable variability in agreement between two cancer experts. This suggests that there is still a need to improve the accuracy and reproducibility of our screening tests.

Funding information

Cancer Prevention and Research Institute of Texas, Grant/Award Number: RP170668; National Cancer Institute, Grant/Award Number: P01-CA-82710-09

Footnotes

CONFLICT OF INTEREST

All authors have no conflict of interest to report.

SUPPORTING INFORMATION

Additional supporting information may be found online in the Supporting Information section at the end of this article.

REFERENCES

  • 1. Landis SH, Murray T, Bolden S, Wingo PA. Cancer statistics, 1999. CA Cancer J Clin. 1999;49:8–31.
  • 2. Nanda K, McCrory DC, Myers ER, et al. Accuracy of the Papanicolaou test in screening for and follow-up of cervical cytologic abnormalities: a systematic review. Ann Intern Med. 2000;132:810–819.
  • 3. Stoler MH, Schiffman M. Interobserver reproducibility of cervical cytologic and histologic interpretations: realistic estimates from the ASCUS-LSIL triage study. JAMA. 2001;285:1500–1505.
  • 4. Thomas C, Wright J. Cervical cancer screening in the 21st century: is it time to retire the PAP smear? Clin Obstet Gynecol. 2007;50:313–323.
  • 5. Confortini M, Biggeri A, Cariaggi M, et al. Intralaboratory reproducibility in cervical cytology. Results of the application of a 100-slide set. Acta Cytol. 1992;37:49–54.
  • 6. Confortini M, Bondi A, Cariaggi M, et al. Interlaboratory reproducibility of liquid-based equivocal cervical cytology within a randomized controlled trial framework. Diagn Cytopathol. 2007;35:541–544.
  • 7. Confortini M, Carozzi F, Dalla Palma P, et al. Interlaboratory reproducibility of atypical squamous cells of undetermined significance report: a national survey. Cytopathology. 2003;14:263–268.
  • 8. Kato I, Santamaria M, De Ruiz PA, et al. Inter-observer variation in cytological and histological diagnoses of cervical neoplasia and its epidemiologic implication. J Clin Epidemiol. 1995;48:1167–1174.
  • 9. Settakorn J, Rangdaeng S, Preechapornkul N, et al. Interobserver reproducibility with LiquiPrep™ liquid-based cervical cytology screening in a developing country. Asian Pac J Cancer Prev. 2008;9:92–96.
  • 10. Sriamporn S, Kritpetcharat O, Nieminen P, Suwanrungraung K, Kamsa-Ard S, Parkin D. Consistency of cytology diagnosis for cervical cancer between two laboratories. Asian Pac J Cancer Prev. 2005;6:208–212.
  • 11. Cantor SB, Yamal JM, Guillaud M, et al. Accuracy of optical spectroscopy for the detection of cervical intraepithelial neoplasia: testing a device as an adjunct to colposcopy. Int J Cancer. 2011;128:1151–1168.
  • 12. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20:37–46.
  • 13. Cohen J. Weighted kappa: nominal scale agreement provision for scaled disagreement or partial credit. Psychol Bull. 1968;70:213.
  • 14. McHugh ML. Interrater reliability: the kappa statistic. Biochem Med. 2012;22:276–282.
  • 15. Doornewaard H, van der Schouw YT, van der Graaf Y, Bos AB, van den Tweel JG. Observer variation in cytologic grading for cervical dysplasia of Papanicolaou smears with the PAPNET testing system. Cancer Cytopathol. 1999;87:178–183.
