Abstract
Background
Pathologists use diverse terminology when interpreting melanocytic neoplasms, potentially compromising quality of care.
Objective
To evaluate the Melanoma Pathology Assessment and Treatment Hierarchy (MPATH-Dx) scheme, a five-category classification system for melanocytic lesions.
Methods
Participants (n=16) of the 2013 International Melanoma Pathology Study Group Workshop provided independent case-level diagnoses and treatment suggestions for 48 melanocytic lesions. Individual diagnoses (including, when provided, least and most severe diagnoses) were mapped to corresponding MPATH-Dx classes. Inter-rater agreement [95% CI] and correlation between MPATH-Dx categorization and treatment suggestions were evaluated.
Results
Most participants were dermatopathologists (n=12), age ≥50 years (n=12), male (n=9), US-based (n=11), and primary academic faculty (n=14). Overall, participants generated 634 case-level diagnoses with treatment suggestions. Mean weighted kappa coefficients for diagnostic agreement following MPATH-Dx mapping (assuming least and most severe diagnoses, when necessary) were 0.70 [0.68, 0.71] and 0.72 [0.71, 0.73], respectively, while correlation between MPATH-Dx categorization and treatment suggestions was 0.91.
Limitations
Small sample size of experienced pathologists in a testing situation.
Conclusion
Varying diagnostic nomenclature can be consistently classified into a concise hierarchy using the MPATH-Dx scheme. Further research is needed to determine whether this classification system can facilitate diagnostic concordance in general pathology practice and improve patient care.
INTRODUCTION
Pathologists use highly varied terminology when interpreting melanocytic neoplasms. Substantial discordance from nomenclature variability may arise during evaluation of these lesions1,2 consequent to provider-level characteristics (e.g. training and clinical experience) and uncertainty regarding underlying biologic behavior associated with these neoplasms. The latter may reflect intrinsic “complexity in the histologic continuum from benign to unequivocally malignant melanocytic lesions,”1 and subjectivity in application of diagnostic criteria.
A diagnostic label may not expressly convey when additional surgical treatment is needed, and non-dermatologist clinicians performing biopsies may not accurately infer appropriate treatments from pathology reports, impacting patient-centered care.1,3 Moreover, epidemiologic research regarding melanocytic lesions and associated outcomes has been limited by diagnostic inconsistency. Treatment suggestions in pathology reports may help resolve this ambiguity. The Melanoma Pathology Assessment & Treatment Hierarchy (MPATH-Dx) classification system integrates diagnosis and treatment considerations for melanocytic lesions, but it is not known how practicing/experienced pathologists and clinicians receiving their reports might apply these guidelines.1
Accordingly, we evaluated variability in diagnostic terms applied to melanocytic lesions within a group of experienced pathologists. We tested use of the MPATH-Dx classification system1 for categorizing diagnostic terms and hypothesized that (a) numerous diagnostic terms exist for histologically identical melanocytic neoplasms, (b) these terms can be mapped to one of the five MPATH-Dx classes, (c) moderate-to-good inter-rater agreement can be achieved using MPATH-Dx diagnostic classifications, and (d) cases with myriad diagnoses can be consistently assigned to suggested treatment categories, potentially simplifying interpretation of pathology reports and aiding physician decision-making.
METHODS
Study Overview and Participants
Eligible pathologists included those attending the International Melanoma Pathology Study Group Workshop during the Society for Melanoma Research Congress in November 2013. Data collection included (a) an online survey to ascertain participant characteristics and attitudes concerning interpretation of melanocytic lesions and (b) glass slide microscopic review of a test set of melanocytic neoplasms. Participants provided their independent diagnosis and treatment suggestion for each specimen. Our study received institutional review board approval from the University of Washington.
Test Set of Cutaneous Melanocytic Neoplasms
Melanocytic skin lesions biopsied between January 1, 2010 and December 31, 2011 from patients ages ≥20 years were obtained from a private pathology practice (Dermatopathology Northwest, Bellevue, WA). Cases were selected to represent a broad range of melanocytic neoplasms, including benign nevi, atypical/dysplastic nevi, melanoma in situ, and invasive melanoma. Shave, punch, and excisional biopsies were included, while consultative cases and re-excisions were excluded.
This sample of melanocytic lesions covered the full spectrum of the five MPATH-Dx1 classes. Class I lesions, such as common and mildly dysplastic nevi, pose very low risk for adverse outcomes and no further treatment is generally recommended; class II includes moderately dysplastic and spindle cell/epithelioid nevi without atypia which have low, but presently unquantifiable, risk of progression and may merit narrow excision (<5mm margins); class III lesions, such as severely dysplastic nevi and melanoma in situ, have higher risk of progression and may require excision with larger margins (5mm to <1cm); class IV encompasses stage T1a invasive melanomas potentially warranting wide excision (≥1cm margins); and class V includes stage T1b (or greater) invasive melanomas posing more significant risk of metastasis, which may require wide excision (≥1cm margins) and additional diagnostic work-up (sentinel lymph node biopsy) and/or adjuvant therapy (e.g. interferon).
The final test cases included 240 melanocytic neoplasms. Permuted block randomization (whereby cases are assigned to blocks of equal size for all possible permutations followed by random block selection) allocated the 240 cases into five test sets, each comprising 48 cases. One of the five test sets was randomly chosen for use in the present study. Detailed information regarding test set development is provided elsewhere.1
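The allocation step can be illustrated with a minimal sketch. It assumes one common variant of permuted block randomization (blocks of five cases, with a random permutation of the five test-set labels applied within each block); the study's exact procedure is described in the cited reference, and the function name, seeds, and case identifiers here are hypothetical.

```python
import random


def permuted_block_allocation(case_ids, n_sets=5, seed=0):
    """Allocate cases to n_sets test sets using permuted blocks of size n_sets.

    Assumption: each block holds n_sets cases and receives one random
    permutation of the set labels, so every set grows by one case per block.
    """
    cases = list(case_ids)
    if len(cases) % n_sets != 0:
        raise ValueError("number of cases must be a multiple of n_sets")
    rng = random.Random(seed)
    rng.shuffle(cases)  # random case order before forming blocks
    test_sets = {label: [] for label in range(1, n_sets + 1)}
    for start in range(0, len(cases), n_sets):
        block = cases[start:start + n_sets]
        labels = list(range(1, n_sets + 1))
        rng.shuffle(labels)  # one permutation of set labels per block
        for case, label in zip(block, labels):
            test_sets[label].append(case)
    return test_sets


# 240 hypothetical case identifiers split into five sets of 48 cases each
test_sets = permuted_block_allocation(range(1, 241))
# Randomly pick one of the five sets, as was done for the present study
chosen = random.Random(1).choice(sorted(test_sets))
```

Because every block contributes exactly one case to each set, the five sets are guaranteed to be of equal size regardless of the random seed.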
Test Set Interpretation
Participants were first provided an overview of the MPATH-Dx classification scheme, including a description of each MPATH-Dx class and associated treatment considerations. Participants then sequentially evaluated test cases (one hematoxylin and eosin-stained glass slide per case) using a multi-headed microscope with a digital projector, driven by one of the authors (SK). The order of case presentation was randomly assigned before viewing. Participants offered their independent diagnostic assessments (entered into a blank write-in field) and “treatment recommendations” using four pre-specified checkboxes: (1) no further treatment indicated, (2) re-excise <5mm margins (narrow but complete), (3) re-excise ≥5mm margins (but <1cm), and (4) re-excise ≥1cm margins (wide excision). Additionally, participants could choose sentinel lymph node biopsy and/or adjuvant therapy (e.g. interferon).
Participants were informed that these treatment options should be regarded as suggestions for consideration, as they were developed using U.S. guidelines (when available) and may not reflect all practice patterns. Participants were instructed to assume that the biopsy was representative of the entire lesion and that the lesion was present at the margin. Patient age, biopsy type, and anatomic site were also provided.
Data were independently transcribed into an electronic database by two authors (JL and GZ), and ambiguities in data entry due to handwriting were resolved by joint consensus review to ensure data fidelity.
Primary Outcomes
Case-level diagnoses and treatment suggestions constituted the primary outcomes. Write-in diagnoses were mapped to corresponding MPATH-Dx classes, with the following modifications. First, write-in diagnoses indicating “invasive melanoma” (or variants therein) could not be definitively mapped to MPATH-Dx classes IV or V as participants were not asked depth of invasion, mitotic rate, or presence/absence of ulceration, and thus were grouped a priori into a combined MPATH-Dx category (MPATH-Dx class IV/V). Second, write-in diagnoses for which a differential diagnosis was provided by participants (e.g. “moderate versus severe dysplastic nevus”) were classified in two ways for analytic purposes: (a) according to the least severe corresponding MPATH-Dx class and (b) according to the most severe corresponding MPATH-Dx class. This was only performed when diagnoses provided in the histologic differential corresponded to different MPATH-Dx classes. Illegible write-in diagnoses were treated as missing data.
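The mapping rules above can be sketched as follows. This is an illustration only: the lookup table is hypothetical and contains just the example lesions named in the Methods, the combined class IV/V is used for invasive melanoma as described, and any term that cannot be mapped (for example an illegible entry) is treated as missing.

```python
# Ordered from least to most severe; IV and V are combined as in the text.
CLASS_ORDER = ["I", "II", "III", "IV/V"]

# Hypothetical lookup limited to example terms mentioned in the Methods.
TERM_TO_CLASS = {
    "common nevus": "I",
    "mildly dysplastic nevus": "I",
    "moderately dysplastic nevus": "II",
    "severely dysplastic nevus": "III",
    "melanoma in situ": "III",
    "invasive melanoma": "IV/V",
}


def map_write_in(differential):
    """Map a write-in diagnosis to (least_severe, most_severe) classes.

    `differential` is a list of diagnostic terms; a single diagnosis is a
    one-item list. Returns None (missing) if the list is empty or any term
    cannot be mapped.
    """
    if not differential:
        return None
    ranks = []
    for term in differential:
        mapped = TERM_TO_CLASS.get(term.strip().lower())
        if mapped is None:
            return None  # unmapped or illegible: treated as missing data
        ranks.append(CLASS_ORDER.index(mapped))
    return CLASS_ORDER[min(ranks)], CLASS_ORDER[max(ranks)]


# "moderate versus severe dysplastic nevus" spans classes II and III
example = map_write_in(["moderately dysplastic nevus", "severely dysplastic nevus"])
```

When both terms in a differential fall in the same class, the least and most severe classes coincide, which matches the statement that the two-way classification only matters when the differential spans different classes.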
Suggestions of wide re-excisions (≥1cm), sentinel lymph node (SLN) mapping and/or adjuvant therapy were collapsed into one treatment category (participants who chose either SLN or adjuvant therapy alone or in combination were assumed to have recommended wide re-excision, even if not selected). Selections of mutually exclusive treatment options (e.g. boxes checked for both “re-excise <5mm margins (narrow but complete)” and “re-excise ≥5mm margins (but <1cm)”) were considered ineligible responses, as were surgical re-excisions <1cm combined with SLN or adjuvant therapy.
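A minimal sketch of these collapsing rules is shown below. The function name and the numeric coding of the four checkboxes are hypothetical. One case is not specified in the text, namely "no further treatment" checked together with SLN or adjuvant therapy; the sketch treats it as ineligible, which is an assumption.

```python
def collapse_treatment(boxes, sln=False, adjuvant=False):
    """Collapse checkbox responses into one treatment category (1 to 4).

    boxes: set of checked options among {1, 2, 3, 4}, where
      1 = no further treatment, 2 = re-excise <5mm,
      3 = re-excise >=5mm but <1cm, 4 = re-excise >=1cm.
    Returns None for missing or ineligible responses.
    """
    boxes = set(boxes)
    if not boxes <= {1, 2, 3, 4}:
        raise ValueError("unknown checkbox value")
    if len(boxes) > 1:
        return None  # mutually exclusive options both checked: ineligible
    if sln or adjuvant:
        if boxes & {2, 3}:
            return None  # re-excision <1cm combined with SLN/adjuvant: ineligible
        if boxes == {1}:
            return None  # ASSUMPTION: not specified in the text, treated as ineligible
        return 4  # SLN/adjuvant implies wide re-excision, even if box 4 is unchecked
    if not boxes:
        return None  # nothing selected: missing treatment suggestion
    return next(iter(boxes))
```

Returning a single ordinal category per response is what allows the treatment suggestions to be compared with the four-level MPATH-Dx classification in the later analyses.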
Statistical Analysis
Correlation between MPATH-Dx diagnostic classes and treatment considerations was assessed using Spearman’s rank correlation coefficient (rho). We assessed inter-observer variability of MPATH-Dx diagnostic classes by calculating means of unweighted and weighted kappa coefficients for each pairwise combination of participants’ ratings across all 48 cases, again calculated separately presuming the least and most severe diagnosis when needed. Weights for kappa coefficients were calculated using the Cicchetti-Allison method, which assigns a weight of 1 to identical diagnoses, 2/3 to adjacent diagnoses, 1/3 to diagnoses that are 2 categories removed, and 0 to diagnoses that are 3 categories removed (based on a 4-category classification scheme). By assigning greater weights to diagnostic classifications which are “closer” to each other, this weighting scheme gives partial credit for near-miss disagreements and thus reflects the severity of disagreement between raters.
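The weighting scheme can be made concrete with a short sketch. It assumes the standard linear (Cicchetti-Allison) agreement weight, 1 - |i - j| / (k - 1), which yields 2/3 and 1/3 for k = 4 as stated above, and computes Cohen's kappa for each pair of raters over the cases both rated. Function names and the handling of missing ratings (coded as None and dropped pairwise) are assumptions, not the authors' SAS or Stata code.

```python
from itertools import combinations


def linear_weight(i, j, k):
    """Cicchetti-Allison (linear) agreement weight for categories i and j of k."""
    return 1.0 - abs(i - j) / (k - 1)


def kappa(a, b, k=4, weighted=True):
    """Cohen's kappa for two raters; categories are coded 0..k-1, None = missing."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n == 0:
        return None

    def w(i, j):
        return linear_weight(i, j, k) if weighted else float(i == j)

    row = [sum(1 for x, _ in pairs if x == i) / n for i in range(k)]
    col = [sum(1 for _, y in pairs if y == j) / n for j in range(k)]
    p_obs = sum(w(x, y) for x, y in pairs) / n
    p_exp = sum(w(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if p_exp == 1.0:
        return None  # kappa undefined when chance agreement is already perfect
    return (p_obs - p_exp) / (1.0 - p_exp)


def mean_pairwise_kappa(ratings, k=4, weighted=True):
    """Mean kappa over all rater pairs; ratings is a list of per-rater lists."""
    values = [kappa(ratings[r], ratings[s], k, weighted)
              for r, s in combinations(range(len(ratings)), 2)]
    values = [v for v in values if v is not None]
    return sum(values) / len(values)
```

Setting `weighted=False` replaces the linear weights with an identity weight, which gives the unweighted kappa reported alongside the weighted estimates.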
The associated percentile 95% confidence interval [95% CI] for each mean kappa coefficient was estimated using cluster bootstrapping with 2,000 re-samples.4 Likewise, mean kappa coefficients were independently calculated for case-level treatment recommendations. Strength of agreement for kappa coefficients was assessed according to Fleiss’ benchmark scale (poor agreement: <0.40; intermediate to good agreement: 0.40 to <0.75; excellent agreement: ≥0.75).5 Analyses were performed using SAS version 9.4 (SAS Institute Inc, Cary, NC) and Stata SE 13.1 (StataCorp, College Station, TX), and all statistical tests were 2-tailed with alpha equal to 0.05.
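The interval estimation and benchmark labeling can be sketched as below. Several choices here are assumptions rather than details from the study: cases are treated as the resampling clusters, the statistic is passed in as a function, and a simple mean pairwise exact-agreement proportion stands in for the mean kappa so that the example stays short. The toy ratings are invented for illustration.

```python
import math
import random
from itertools import combinations


def fleiss_benchmark(kappa):
    """Label a kappa value using the cut-offs quoted in the text."""
    if kappa < 0.40:
        return "poor"
    if kappa < 0.75:
        return "intermediate to good"
    return "excellent"


def mean_pairwise_agreement(cases):
    """Stand-in statistic: mean proportion of exact agreement over rater pairs."""
    n_raters = len(cases[0])
    props = []
    for r, s in combinations(range(n_raters), 2):
        props.append(sum(case[r] == case[s] for case in cases) / len(cases))
    return sum(props) / len(props)


def cluster_bootstrap_ci(cases, statistic, n_resamples=2000, level=0.95, seed=0):
    """Percentile CI from resampling whole cases (clusters) with replacement."""
    rng = random.Random(seed)
    n = len(cases)
    estimates = []
    for _ in range(n_resamples):
        resample = [cases[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    estimates.sort()
    tail = (1.0 - level) / 2.0
    lower = estimates[int(math.floor(tail * n_resamples))]
    upper_index = int(math.ceil((1.0 - tail) * n_resamples)) - 1
    upper = estimates[min(n_resamples - 1, upper_index)]
    return lower, upper


# Invented ratings: each tuple is one case rated by three raters (classes 0..3)
toy_cases = [(0, 0, 0), (1, 1, 2), (2, 2, 2), (3, 3, 3),
             (0, 1, 0), (2, 3, 3), (1, 1, 1), (3, 3, 2)]
point = mean_pairwise_agreement(toy_cases)
lower, upper = cluster_bootstrap_ci(toy_cases, mean_pairwise_agreement)
```

Resampling whole cases keeps all raters' interpretations of a given slide together, so the dependence among ratings of the same case is preserved in each bootstrap replicate.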
RESULTS
Participants consisted of 16 pathologists, the majority of whom were dermatopathologists (n=12; 75%). Most were ages ≥50 years (n=12; 75%), male (n=9; 56%), practicing within the U.S. (n=11; 69%), and held a primary academic appointment (n=14; 88%). All reported being regarded as melanocytic lesion experts by their colleagues, with 13 pathologists reporting ≥10 years of experience in this area. Three pathologists reported being “extremely confident” in their interpretation of melanocytic skin lesions, while 11 reported being “very confident.” The majority (n=9; 56%) agreed that melanocytic lesion diagnosis was “challenging” or “very challenging,” while none rated diagnosis of these lesions as “very easy,” “easy,” or “somewhat easy.”
Of 768 potential write-in responses (16 pathologists each interpreting 48 cases), 101 (13.2%) were missing a diagnosis, usually because the participant was temporarily absent during case presentation, while 21 diagnoses (2.7%) were illegible and 12 lacked suggested treatments. A total of 634 test case interpretations had completely mapped MPATH-Dx diagnoses with treatment responses and were analyzed.
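The response accounting above can be checked with a few lines of arithmetic. The sketch assumes the three exclusion groups (missing, illegible, and lacking a suggested treatment) do not overlap, which is consistent with the reported total.

```python
# Counts taken from the text
n_pathologists = 16
n_cases = 48
missing = 101       # no diagnosis recorded
illegible = 21      # diagnosis could not be read
no_treatment = 12   # diagnosis present but no suggested treatment

potential = n_pathologists * n_cases
# ASSUMPTION: the three exclusion groups are mutually exclusive
analyzed = potential - missing - illegible - no_treatment

pct_missing = round(100.0 * missing / potential, 1)
pct_illegible = round(100.0 * illegible / potential, 1)
```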
While some cases were uniformly interpreted with the same diagnostic term (e.g. all pathologists diagnosed “invasive melanoma,” with minor qualification), other cases showed substantial variability in diagnostic labeling. For example, Figure 1 shows a slide image with diagnoses ranging from benign (e.g. “lentiginous junctional nevus”) to malignant (e.g. “melanoma in situ”), with 7 distinct diagnostic terms given for this case. Of note, this case would be mapped to MPATH-Dx class III by 7 of the participants. Figure 2 shows a similar spectrum of diagnostic terms applied to another case.
The distributions of diagnoses and treatment considerations are shown in Table 1 and Figure 3. Mean unweighted and weighted kappa coefficients for MPATH-Dx classes (assuming the least severe write-in diagnoses) were 0.57 [0.56, 0.59] and 0.70 [0.68, 0.71], respectively, demonstrating intermediate-to-good agreement. A comparable magnitude of inter-rater agreement was observed when assuming the most severe write-in diagnosis, with mean unweighted and weighted kappa coefficients of 0.62 [0.60, 0.63] and 0.72 [0.71, 0.73], respectively. Mean unweighted inter-rater kappa coefficient for agreement regarding treatment was 0.54 [0.52, 0.56], while the mean weighted estimate was 0.69 [0.67, 0.71].
Table 1. Suggested treatment consideration made by participant for the case, by MPATH-Dx classification based on participants’ write-in diagnosis.

MPATH-Dx classification | I: No further treatment | II: Re-excise <5mm margins | III: Re-excise ≥5mm to <1cm margins | IV/V: Re-excise ≥1cm margins (b) | Total |
---|---|---|---|---|---|
Presuming least severe diagnosis (when needed) (c) | | | | | |
Class I: No apparent risk for continued local proliferation and adverse outcome | 89 (67.9) | 40 (30.5) | 2 (1.5) | 0 (0.0) | 131 |
Class II: Low level risk for local proliferation of remaining cells | 2 (3.8) | 41 (78.8) | 9 (17.3) | 0 (0.0) | 52 |
Class III: Higher likelihood of local tumor progression and greater need for intervention | 0 (0.0) | 30 (21.1) | 96 (67.6) | 16 (11.3) | 142 |
Class IV/V: Invasive melanoma stage T1a or ≥T1b | 0 (0.0) | 3 (1.0) | 29 (9.4) | 277 (89.6) | 309 |
Total | 91 | 114 | 136 | 293 | 634 |
Presuming most severe diagnosis (when needed) (c) | | | | | |
Class I: No apparent risk for continued local proliferation and adverse outcome | 86 (87.8) | 12 (12.2) | 0 (0.0) | 0 (0.0) | 98 |
Class II: Low level risk for local proliferation of remaining cells | 3 (6.4) | 43 (91.5) | 1 (2.1) | 0 (0.0) | 47 |
Class III: Higher likelihood of local tumor progression and greater need for intervention | 2 (1.3) | 55 (34.8) | 96 (60.8) | 5 (3.2) | 158 |
Class IV/V: Invasive melanoma stage T1a or ≥T1b | 0 (0.0) | 4 (1.2) | 39 (11.8) | 288 (87.0) | 331 |
Total | 91 | 114 | 136 | 293 | 634 |

(a) Values expressed as count (row percent).
(b) Includes sentinel lymph node sampling and/or adjuvant therapy (e.g. interferon) as additional treatment responses.
(c) Write-in responses that included differential diagnoses were classified using two scenarios, one presuming the least severe diagnosis and the other presuming the most severe diagnosis.
Increasing severity of diagnoses after mapping into MPATH-Dx classes was associated with increasing intensity of suggested treatment considerations (Table 1). Correlation between MPATH-Dx diagnostic classification (assuming the least severe write-in diagnosis, when necessary) and treatment was 0.91 [0.90, 0.92] (P<0.001). Similarly, correlation between MPATH-Dx classes, assuming the most severe write-in diagnosis when necessary, and treatment was 0.91 [0.90, 0.93] (P<0.001; Table 2).
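As a rough check on the reported association, Spearman's rho can be recomputed from the cell counts in the least-severe panel of Table 1. The sketch below is not the authors' analysis code: it expands the counts into paired ordinal observations, assigns average ranks to ties, and takes the Pearson correlation of the ranks. The result can be compared with the reported value of 0.91.

```python
import math

# Table 1, least severe panel: rows = MPATH-Dx class (I, II, III, IV/V),
# columns = suggested treatment category (I, II, III, IV/V)
TABLE_1_LEAST_SEVERE = [
    [89, 40, 2, 0],
    [2, 41, 9, 0],
    [0, 30, 96, 16],
    [0, 3, 29, 277],
]


def midranks(values):
    """Average ranks for ordinal values, with ties sharing their mean rank."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    rank_of, seen = {}, 0
    for v in sorted(counts):
        rank_of[v] = seen + (counts[v] + 1) / 2.0
        seen += counts[v]
    return [rank_of[v] for v in values]


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_from_table(table):
    """Spearman's rho from a contingency table of two ordinal variables."""
    xs, ys = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            xs.extend([i] * count)
            ys.extend([j] * count)
    return pearson(midranks(xs), midranks(ys))


rho = spearman_from_table(TABLE_1_LEAST_SEVERE)
```

Because both variables have only four levels, ties are extensive, which is why average ranks are used rather than the shortcut formula based on squared rank differences.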
DISCUSSION
Achieving diagnostic agreement for melanocytic skin lesions remains challenging. In the absence of substantial technological advances enabling precise classification of these neoplasms, complete elimination of disagreement is overly ambitious. For example, while improvements in adjunctive molecular testing for melanoma appear promising, ambiguous results still occur, and it remains unclear whether sufficiently “black or white” results are possible in the near future. We believe that the litany of diagnostic terms is and will remain a substantive contributor to diagnostic discordance for melanocytic lesions. To our knowledge, this is the largest study describing the variability in diagnostic terms applied to melanocytic neoplasms.
Our results highlight the diverse diagnostic terminology used by pathologists and illustrate concise mapping of these terms into the MPATH-Dx hierarchy. We found that write-in diagnoses mapped to MPATH-Dx classes were significantly correlated with treatment suggestions, providing further validation of this integrated approach.1 Recognizing, however, that more diverse and complex terminology may provide valuable information for patient care, we do not envisage replacing current “real world” practice. Rather, potential inclusion of MPATH-Dx categories (following future research to optimize this classification scheme) in pathology reports may be useful as an added feature to clarify reporting. This will likely be valuable when reports are circulated outside of pathology practice groups to local networks of referring physicians that may be less familiar with potential nuances of dermatopathology terminology and treatment implications.
Our results should be considered in context of the known complexities in interpretation of dermatopathology reports. A recent study indicated that surgeons may misinterpret pathology reports up to 30% of the time.6 The risk of misinterpretation of dermatopathology reports is likely to be clinically significant given the lack of standardized diagnostic terminology for melanocytic neoplasms.1 In addition, recent regulatory changes to Clinical Laboratory Improvement Amendments (CLIA) of 1988 enable direct patient access to pathology reports from CLIA-certified dermatopathology laboratories.7 While increasing medical record transparency will undoubtedly increase patient engagement in healthcare, many patients may be confused by the diagnostic terms they encounter, raising risks of psychological harm and increased demands for subsequent and potentially unnecessary procedures.8 Communication problems might be mitigated and pathology reporting improved through use of classification schemes such as MPATH-Dx. Our results show that varied histopathologic assessments can be simplified into a manageable number of categories using this scheme.
Comparable efforts have been successfully pursued in the radiographic interpretation of mammograms by the American College of Radiology through the development of the Breast Imaging Reporting and Data System (BI-RADS®).9 Like MPATH-Dx, BI-RADS® categorizes mammographic lesions into categories with associated clinical management recommendations. Not only has BI-RADS® improved quality of care for women with abnormal mammogram findings by reducing both under- and over-utilization of follow-up procedures and services,10 it has also standardized breast cancer research by providing a coherent, universal framework for interpreting mammograms.11
The MPATH-Dx system is promising for dermatopathology and the histopathologic interpretation of melanocytic neoplasms. Simplified classification is both timely and requisite for longer-term efforts to advance care delivery for patients undergoing biopsies of pigmented lesions. By reducing diagnostic confusion and establishing a reference framework for diagnosis and treatment of melanocytic proliferations, use of the MPATH-Dx system may also help reduce exposure to medico-legal liability. Future research is needed to determine whether MPATH-Dx improves concordance of diagnostic interpretation among community pathologists. Broader implementation of this system will likely require additional piloting and revision by professional societies in dermatopathology and dermatology.
This study has limitations. Pathologists interpreted a single slide per case in an artificial setting characterized by time constraints and inability to “drive” the microscope or consult colleagues, and thus these results may differ from actual practice. Our analysis did not consider diagnostic or therapeutic disagreement for invasive melanoma cases that were, by necessity, collapsed into a combined category (MPATH-Dx class IV/V), thereby obscuring potential additional variation in agreement. In addition, our study only included a self-selected group of pathologists skilled in diagnosis of melanocytic neoplasms, and therefore these results may not be generalizable to the broader community of pathologists.
The test set of melanocytic lesions also excluded consultative cases, which may have resulted in inclusion of “easier” cases and inflated estimates of inter-rater agreement. Finally, the MPATH-Dx scheme also assumes that each biopsy represented the entire lesion and that the lesion was present at the margin. The utility of this classification scheme may be limited for biopsies that do not represent the entire lesion or have negative margins. Such scenarios may warrant pursuit of alternative treatments to ensure appropriate management.
We understand that agreement concerning treatment recommendations does not necessarily imply complete diagnostic agreement or consensus regarding the ultimate biologic behavior of melanocytic neoplasms. It is unlikely that uncertainty surrounding some of these lesions can be completely eliminated, no matter what the approach. Nonetheless, our findings highlight that many different diagnostic terms are applied to the same skin biopsies, even by expert dermatopathologists, and underscore the need for development and implementation of novel interventions, such as the MPATH-Dx system, to improve diagnostic agreement and simplify treatments for melanocytic neoplasms.
CAPSULE SUMMARY
- Pathologists use diverse terminology to diagnose melanocytic lesions.
- The Melanoma Pathology Assessment and Treatment Hierarchy (MPATH-Dx), a novel classification system, simplifies diagnosis and yields high inter-rater agreement.
- Implementation of the MPATH-Dx system may aid in diagnostic consistency of melanocytic lesions.
Acknowledgments
The following exceptions: Dr. Lott is an employee of Bayer HealthCare Pharmaceuticals, Inc.; Dr. Barnhill has received honoraria as a consultant for Myriad Genetics; Dr. Elder has received honoraria as a consultant for Myriad Genetics and royalties as a textbook editor for Lippincott Williams & Wilkins, Wolters Kluwer, and for Demos Medical Publishing, and as an author of the Armed Forces Institute of Pathology Fascicles on melanocytic and nonmelanocytic tumor pathology; Dr. Elenitsas has received honoraria as a consultant for Myriad Genetics and royalties as a textbook editor for Lippincott Williams & Wilkins; Dr. Gerami has received honoraria as a consultant for DermTech Inc., Castle Biosciences, and Myriad Genetics; and Dr. Mihm has received honoraria as a textbook co-author from Wiley & Sons, Inc. and as an advisory board member for Caliber ID, and has received consulting fees from Mela Sciences, Alnylam, and Novartis.
Funding/Support: This work was supported by the National Cancer Institute (R01 CA151306 and K05 CA104699).
ABBREVIATIONS
- BI-RADS®: Breast Imaging Reporting and Data System
- CI: confidence interval
- CLIA: Clinical Laboratory Improvement Amendments
- MPATH-Dx: Melanoma Pathology Assessment & Treatment Hierarchy
- SLN: sentinel lymph node
- U.S.: United States
Footnotes
Conflicts of Interest: The authors have no conflicts of interest to declare
IRB Statement: The Institutional Review Board of the University of Washington approved this study.
REFERENCES
- 1. Piepkorn MW, Barnhill RL, Elder DE, et al. The MPATH-Dx reporting schema for melanocytic proliferations and melanoma. J Am Acad Dermatol. 2014;70:131–141. doi: 10.1016/j.jaad.2013.07.027.
- 2. Corona R, Mele A, Amini M, et al. Interobserver variability on the histopathologic diagnosis of cutaneous melanoma and other pigmented skin lesions. J Clin Oncol. 1996;14:1218–1223. doi: 10.1200/JCO.1996.14.4.1218.
- 3. Niebling MG, Haydu LE, Karim RZ, et al. Pathology review significantly affects diagnosis and treatment of melanoma patients: an analysis of 5011 patients treated at a melanoma treatment center. Ann Surg Oncol. 2014;21:2245–2251. doi: 10.1245/s10434-014-3682-x.
- 4. Davison AC, Hinkley DV. Bootstrap Methods and their Application (Cambridge Series in Statistical and Probabilistic Mathematics). 1st ed. Cambridge, UK: Cambridge University Press; 1997.
- 5. Fleiss JL, Levin B, Paik MC. The measurement of interrater agreement. In: Statistical methods for rates and proportions. 3rd ed. Hoboken, NJ: John Wiley & Sons; 2003.
- 6. Powsner SM, Costa J, Homer RJ. Clinicians are from Mars and pathologists are from Venus. Arch Pathol Lab Med. 2000;124:1040–1046. doi: 10.5858/2000-124-1040-CAFMAP.
- 7. Young MJ, Scheinberg E, Bursztajn H. Direct-to-patient laboratory test reporting: balancing access with effective clinical communication. JAMA. 2014;312:127–128. doi: 10.1001/jama.2014.5823.
- 8. Lott JP, Piepkorn MW, Elmore JG. Dermatology in an age of fully transparent electronic medical records. JAMA Dermatol. 2015;151:477–478. doi: 10.1001/jamadermatol.2014.4362.
- 9. American College of Radiology. Breast Imaging Reporting and Data System (BI-RADS). 3rd ed. Reston, VA: American College of Radiology; 1998.
- 10. Eberl MM, Fox CH, Edge SB, et al. BI-RADS classification for management of abnormal mammograms. J Am Board Fam Med. 2006;19:161–164. doi: 10.3122/jabfm.19.2.161.
- 11. Burnside ES, Sickles EA, Bassett LW, et al. The ACR BI-RADS Experience: Learning From History. J Am Coll Radiol. 2009;6:851–860. doi: 10.1016/j.jacr.2009.07.023.